Hacker News
Why SWE-bench Verified no longer measures frontier coding capabilities
(openai.com)
8 points by tedsanders 2 days ago | 0 comments