A new study gave five frontier AI models 1,000 real-world claims to fact-check. They disagreed on 67% of them.
AI tools behaving badly, like Microsoft’s Bing AI losing track of which year it is, has become a subgenre of reporting on AI. But often, it’s hard to tell the difference between a bug and poor ...