Hacker News
SWE-bench Verified no longer measures frontier coding capabilities
(openai.com)
322 points by kmdupree a day ago | 170 comments