5:42 Adam Kalai - Consensus Sampling for Safer Generative AI [Alignment Workshop] FAR․AI 64 views - 10 hours ago
5:23 Cozmin Ududec - Toy Models for Task-Horizon Scaling [Alignment Workshop] FAR․AI 123 views - 3 days ago
5:08 Chirag Agarwal - Polarity-Aware Probing for Quantifying Latent Alignment in LMs [Alignment Workshop] FAR․AI 83 views - 4 days ago
5:15 Niloofar Mireshghallah - What Does It Mean for Agentic AI to Preserve Privacy? [Alignment Workshop] FAR․AI 149 views - 5 days ago
30:57 Yoshua Bengio - Disentangling Agency & Predictive Power Without Solving ELK [Alignment Workshop] FAR․AI 360 views - 5 days ago
9:30 Anna Gausen - Measuring AI Systems’ Ability to Influence Humans [Alignment Workshop] FAR․AI 224 views - 1 week ago
5:08 Santosh Vempala - Why Language Models Hallucinate [Alignment Workshop] FAR․AI 170 views - 1 week ago
5:19 Natasha Jaques - Multi-agent RL for Provably Robust LLM Safety [Alignment Workshop] FAR․AI 212 views - 1 week ago
10:21 Marius Hobbhahn - Eval Awareness is Becoming a Problem [Alignment Workshop] FAR․AI 386 views - 2 weeks ago
5:05 Chris Cundy - Peril and Potentials of Training with Lie Detectors [Alignment Workshop] FAR․AI 203 views - 2 weeks ago
5:42 Sarah Schwettmann - Scalable Oversight and Understanding [Alignment Workshop] FAR․AI 175 views - 3 weeks ago
10:25 Stephen Casper - Powerful Open-Weight AI Models: Wonderful, Terrible & Inevitable [Alignment Worksho FAR․AI 540 views - 3 weeks ago
55:19 Owain Evans - Weird Generalizations and Backdoors: New Ways to Corrupt LLMs FAR․AI 436 views - 1 month ago
9:21 Asa Cooper Stickland - AI Control Needs Redteaming [Alignment Workshop] FAR․AI 347 views - 1 month ago
9:25 Adam Gleave - STACK: Adversarial Attacks on LLM Safeguard Pipelines [AAAI 2026] FAR․AI 293 views - 1 month ago
9:41 Tomek Korbak - Chain of Thought Monitorability for AI Safety [Alignment Workshop] FAR․AI 506 views - 1 month ago
22:58 Adam Gleave - AI in 2025: Faster Progress, Harder Problems [Alignment Workshop] FAR․AI 766 views - 1 month ago
9:51 Sam Bowman - Lessons Learned from the First Misalignment Safety Case [Alignment Workshop] FAR․AI 598 views - 2 months ago
10:18 Maja Trębacz - Scalable Oversight: A Practical Approach to Verifying Code at Scale [Alignment Works FAR․AI 277 views - 2 months ago
10:20 Neel Nanda - Our Pivot To Pragmatic Interpretability [Alignment Workshop] FAR․AI 2K views - 2 months ago
10:01 Anka Reuel - How do we know what AI can (and can't) do? [Alignment Workshop] FAR․AI 411 views - 2 months ago
21:30 Yoshua Bengio - An Argument for the Safety of Scientist AI [UK AISI Alignment Conference] FAR․AI 359 views - 2 months ago
30:58 Alex Bores - How the States Should Regulate AI [Journalism Workshop] FAR․AI 304 views - 2 months ago
54:29 Gillian Hadfield - Alignment is Social: Lessons from Human Alignment for AI FAR․AI 211 views - 2 months ago
9:17 Sarah Cen - AI Supply Chains: An Emerging Ecosystem of AI Dependencies [Technical AI Policy] FAR․AI 376 views - 3 months ago
5:32 Irene Solaiman - Access Considerations for Generative AI Systems Beyond Release [Technical AI Polic FAR․AI 155 views - 3 months ago
5:41 Hamza Chaudhry - AI-Military Integration Under President Trump [Technical AI Policy] FAR․AI 176 views - 3 months ago
5:28 Tina Morrison - Verifiable Compute & Building Trust in Agentic Supply Chains [Technical AI Policy] FAR․AI 217 views - 3 months ago
10:07 Steve Kelly - National Security & Defense [Technical AI Policy] FAR․AI 132 views - 3 months ago
5:12 Olivia Shoemaker - Sandpaper Socks: Operational Considerations in AI Evals [Technical AI Policy] FAR․AI 194 views - 3 months ago
20:33 Ben Bucknall - An Overview of Technical AI Governance [Technical AI Policy] FAR․AI 363 views - 4 months ago
15:16 Mary Phuong - AI Control: Addressing Risks from Agentic Internal Deployments [Technical AI Policy] FAR․AI 295 views - 4 months ago
33:19 Aaron Scher - What Would it Take to Stop the Development of Superintelligence? FAR․AI 267 views - 4 months ago
17:12 Miranda Bogen - Making Sense of the AI Auditing Ecosystem [Technical AI Policy] FAR․AI 159 views - 4 months ago
5:30 Robert Trager - Prioritizing Technical Governance Investments: Verification [Technical AI Policy] FAR․AI 151 views - 4 months ago
12:51 Charles Yang - Hydropower: the Missing Piece [Technical AI Policy] FAR․AI 80 views - 4 months ago
6:30 Fatemeh Ganji - Application of Cryptographic Primitives in AI Accelerators [Technical AI Policy] FAR․AI 166 views - 4 months ago
7:46 Kevin Wei - Policy-Oriented AI Evaluations [Technical AI Policy] FAR․AI 236 views - 5 months ago
3:43 Ben Cottier - Rising Cost of Evaluations, Falling Cost of Intelligence [Technical AI Policy] FAR․AI 158 views - 5 months ago
4:36 Baoyuan Wu - Unthinking Vulnerability of Large Reasoning Models [Alignment Workshop] FAR․AI 134 views - 5 months ago
5:28 Huiqi Deng - Unified Explanation of DNN Inference Logic & Representation [Alignment Workshop] FAR․AI 95 views - 5 months ago
5:15 Cassidy Laidlaw - A New Definition & Improved Mitigation for Reward Hacking [Alignment Workshop] FAR․AI 137 views - 5 months ago
5:57 Animesh Mukherjee - Safety Alignment of LLMs [Alignment Workshop] FAR․AI 317 views - 5 months ago
5:39 Sara McNaughton - AI National Security Policy: Industry and Government [Technical AI Policy] FAR․AI 90 views - 5 months ago
10:10 Arnab Datta - Compute in America: Playbook for Secure Clusters at Home [Technical AI Policy] FAR․AI 47 views - 5 months ago
5:14 Onni Aarne - Hardware-Enabled Verifiability as a Data Center Security Property [Technical AI Policy] FAR․AI 163 views - 5 months ago
5:37 Asad Ramzanali - Normal Policy Tools for a Normal Technology [Technical AI Policy] FAR․AI 95 views - 5 months ago
42:02 Yoshua Bengio - AI Catastrophic Risks & Scientist AI Solution [Alignment Workshop] FAR․AI 7K views - 6 months ago
4:34 Jiaming Ji - Deceptive Alignment & Thinking Monitor in LLMs [Alignment Workshop] FAR․AI 211 views - 6 months ago