52 tagged AI Safety
deep learning
AI alignment + LLMs at Anthropic. On leave from NYU. Views not employers'. No relation to @s8mb. Into @givingwhatwecan.
AI policy researcher, @lfschiavo wife guy, fan of cute animals and sci-fi, executive director of AVERI (https://www.averi.org/), Substacker, views ...
AI resilience at OpenAI Foundation. Co-Founder of OpenAI. https://woj.world
research @openai
Alignment team lead at Anthropic
Philosopher & ethicist trying to make AI be good @AnthropicAI. Personal account. All opinions come from my training data.
Associate Professor of Statistics and EECS, UC Berkeley // Co-founder and CEO, @TransluceAI
Google DeepMind • AI safety, alignment, collaboration • post training • associate professor @ UC Berkeley EECS
Raising AI risk awareness at http://evitable.com AI prof at Mila. Formerly Cambridge, DeepMind, UK AISI. http://therealartificialintelligence.sub...
Mechanistic Interpretability lead DeepMind. Formerly @AnthropicAI, independent. In this to reduce AI X-risk. Neural networks can be understood, let...
Associate Prof @MITEECS working on value (mis)alignment in AI systems; Safety & Alignment Advisor at http://Character.AI; @dhadfieldmenell@bsky.soc...
Working towards the safe development of AI for the benefit of all @UMontreal, @LawZero_ & @Mila_Quebec A.M. Turing Award Recipient and most-cited A...
The original AI alignment person. Understanding the reasons it's difficult since 2003. This is my serious low-volume account. Follow @allTheYud ...
Professor in Computer Science at UC Berkeley, co-Director of Berkeley RDI Center; Building safe, secure, decentralized AI; Serial entrepreneur
Runs an AI Safety research group in Berkeley (Truthful AI) + Affiliate at UC Berkeley. Past: Oxford Uni, TruthfulQA, Reversal Curse. Prefer email ...
natural philosopher
machine learning and optimization @PrincetonCS & Google DeepMind Princeton, dad^3
AI researcher at Anthropic
Cofounder and Chief Scientist at Sequent Research. Alignment will be solved, but not necessarily in time. Previously AISI, DeepMind, OpenAI, Google...
working on AGI alignment. prev: GPT-Neo, the Pile, LM evals, RL overoptimization, scaling SAEs to GPT-4, interp via circuit sparsity. EleutherAI co...
full-stack alignment 🥞 @meaningaligned prev: InstructGPT @OpenAI
Research scientist in AI alignment at Google DeepMind. Co-founder of Future of Life Institute @FLI_org. Views are my own and do not represent GDM o...
Computer scientist working on AI safeguards and gov research. Incoming assistant professor @Kennedy_School @Harvard. https://stephencasper.com/
↬🔀🔀🔀🔀🔀🔀🔀🔀🔀🔀🔀→∞ ↬🔁🔁🔁🔁🔁🔁🔁🔁🔁🔁🔁→∞ ↬🔄🔄🔄🔄🦋🔄🔄🔄🔄👁️🔄→∞ ↬🔂🔂🔂🦋🔂🔂🔂🔂🔂🔂🔂→∞ ↬🔀🔀🦋🔀🔀🔀🔀🔀🔀🔀🔀→∞
Helping the world prepare for extremely powerful AI. Risk assessment @METR_evals. Writing at Planned Obsolescence (about AI), Good Bones (about wha...
member of technical staff @stanfordnlp
AGI Safety & Alignment @ Google DeepMind
Trying to make AI go well @AnthropicAI
⊰•-•⦑ latent space steward ❦ prompt incanter 𓃹 hacker of matrices ⊞ breaker of markov chains ☣︎ ai danger researcher ⚔︎ bt6 ⚕︎ architect-healer ⦒•-•⊱
AI Institute Fellow at Schmidt Sciences. Postdoc at Stanford NLP Group. Previously: Anthropic, AI2, Google, Meta, UNC Chapel Hill
AI evals, alignment and safety @Meta.
Philosopher & Research Scientist @GoogleDeepMind | AGI & Society Lead | #TIME100AI | All views are my own
Safety research @openai. Prev @berkeley_ai /w @ancadianadragan & Stuart Russell. CoT oversight / AI manipulation.
Head of the Frontier Red Team @anthropicai. 🌎 Make things radically good.
science @METR_Evals. Formerly: early employee @cohere, made GPQA @nyuniversity
Associate professor at UMass Amherst CICS. Alignment, safety, reinforcement learning, imitation learning, and robotics.
Safety and alignment at Meta Superintelligence. Prev: VP of Research at Scale AI, research at Google DeepMind / Brain (Gemini, LaMDA, RL / TFAgents...
research scientist, meta (fair) opinions are my own 🥺 👉👈