China's Z.ai GLM-5.2 beats OpenAI GPT-5.5 on key benchmarks

Chinese startup Z.ai unveiled GLM-5.2, a new large language model (LLM) built for "long-horizon" autonomous coding and engineering tasks. GLM-5.2 scored 54.7 on Humanity's Last Exam (HLE), a benchmark used to measure LLMs' reasoning skills, surpassing GPT-5.5's 52.2. On FrontierSWE, which tests AI agents' ability to complete open-ended technical projects, GLM-5.2 outperformed GPT-5.5 by 1%.
