Paper Club: A Moore's Law for AI Capabilities

Date: Thursday 19 June 2025
Time: 19:00 - 21:00
Location: Singapore

About the event

Technical Note: This event is intended for participants with a technical background. We strongly encourage reading the paper ahead of time so you can engage fully with the discussion.

Join us as we explore "Measuring AI Ability to Complete Long Tasks," a paper that introduces a new way to track AI progress using an intuitive, human-centered metric. Instead of relying on traditional benchmarks, which often saturate quickly, the researchers propose measuring AI capabilities through a "task completion time horizon": how long, measured by the time they take humans, are the tasks that an AI system can complete with 50% reliability?

By combining three diverse task suites (HCAST, RE-Bench, and a new suite called SWAA), they build an evaluation that spans everything from 2-second decisions to 8-hour software projects.
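
To make the metric concrete before the session, here is a minimal sketch of one way a 50% time horizon could be estimated: fit a logistic curve of success against the log of task length, then solve for the length at which predicted success is 50%. This is an illustration, not the paper's code. The function names, the plain gradient-descent fit, and the task data are all made up for this example.

```python
import math

def fit_logistic(xs, ys, lr=0.1, steps=5000):
    """Fit p(success) = sigmoid(a + b * (x - mean_x)) by gradient descent on log loss."""
    n = len(xs)
    mean_x = sum(xs) / n
    zs = [x - mean_x for x in xs]  # centering keeps the fit well conditioned
    a, b = 0.0, 0.0
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for z, y in zip(zs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * z)))
            grad_a += p - y
            grad_b += (p - y) * z
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b, mean_x

def time_horizon_minutes(task_minutes, successes, reliability=0.5):
    """Task length (in minutes) at which the fitted success probability equals `reliability`."""
    xs = [math.log2(t) for t in task_minutes]
    a, b, mean_x = fit_logistic(xs, successes)
    logit = math.log(reliability / (1.0 - reliability))
    return 2 ** (mean_x + (logit - a) / b)

# Hypothetical results for one model: human task length in minutes, 1 = success.
task_minutes = [1, 2, 4, 8, 16, 32, 64, 128]
successes = [1, 1, 1, 0, 1, 0, 0, 0]

horizon = time_horizon_minutes(task_minutes, successes)
print(f"Estimated 50% time horizon: {horizon:.1f} minutes")
```

With this toy data the estimate comes out at about 11.3 minutes, the midpoint of the task lengths on a log scale, because the outcomes are mirror-symmetric around that point. Real evaluations would use many more tasks and runs, and the choice of reliability threshold (50% here) changes the resulting horizon.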