@netcode

in /technology 3 days ago

Designing AI-resistant technical evaluations

Designing AI resistant technical evaluations \ Anthropic

anthropic.com
TLDR

Anthropic uses a take-home test to evaluate performance engineering candidates, and has had to keep revising it as AI capabilities improve. The test, which involves optimizing code for a simulated accelerator, has been redesigned three times because AI models like Claude have increasingly outperformed human candidates. The latest iteration uses puzzles built on a tiny, heavily constrained instruction set to test unconventional programming skills. Anthropic is releasing the original take-home as an open challenge, since human experts still outperform current models when given sufficiently long time horizons.

Score: 6

0 Comments