AI struggles to solve novel research-level math problems

A new "First Proof" benchmark tested four leading AI systems against 10 unpublished, research-level math problems. Human mathematicians solved all 10, whereas the top-performing AI model solved only six. The study highlights that AI excels at recognising patterns and applying known techniques but still lacks the genuine creative reasoning required for original mathematical discovery.