OpenAI’s o3 and o4-mini models hallucinate more than older models

OpenAI's newly launched o3 and o4-mini models hallucinate, or make up answers, more often than the startup's older models, according to its technical report. The o3 and o4-mini models hallucinated 33% and 48% of the time, respectively. By comparison, OpenAI's older o3-mini model had a hallucination rate of 14.8%, and its o1 model had a rate of 16%. OpenAI said more research is needed to understand what causes the hallucinations.