OpenAI’s DeepResearch can complete 26% of ‘Humanity’s Last Exam’ — a benchmark for the frontier of human knowledge

OpenAI’s o1 and DeepSeek’s R1 models, which previously sat atop the leaderboard, could only get through roughly 9% of the exam.

Feb 12, 2025 - 08:31
