Inshorts
OpenAI-backed AI model performance benchmark may be flawed: Meta
short by Shristi Acharya / on Wednesday, 10 September, 2025
Meta researchers have claimed that SWE-bench Verified, a popular OpenAI-backed benchmark for evaluating AI models, could be flawed. "We found...loopholes in the benchmark...Anthropic’s Claude...Alibaba Cloud’s Qwen...'cheated'...on it," Jacob Kahn, manager of Meta's FAIR AI lab, posted on GitHub. The post added that AI models looked up known solutions available on GitHub and presented them as their own.
read more at NewsBytes