Jesus Christ, AI journalism is so bad right now.
Exhibit one: Gizmodo is raving about a paper that 'proves' LLMs have a "mathematical limit" and can't possibly do tasks that are too complex.
Let's look at the actual paper, shall we?
gizmodo.com/ai-agents-ar...
AI Agents Are Poised to Hit a Mathematical Wall, Study Finds
LLMs have their limits.
So the first thing to note is that it's an arXiv preprint, meaning it hasn't been peer reviewed. Also, it's the only paper by either of the two coauthors.
(Gotta give props to Gizmodo, at least they linked the paper. Some places don't do that and it's infuriating)
arxiv.org/abs/2507.07505
Hallucination Stations: On Some Basic Limitations of Transformer-Based Language Models
In this paper we explore hallucinations and related capability limitations in LLMs and LLM-based agents from the perspective of computational complexity. We show that beyond a certain complexity, LLMs...
Then we get to the paper and... hoo boy.
The central claim is that generating LLM output takes O(n^2) operations (n being the input length), and therefore an LLM can't solve any problem that requires more than O(n^2) computation.
Later they give the example of printing all length-k binary strings, which takes at least 2^k steps (there are 2^k such strings).
So how can a 2^k-step problem be solved in n^2 operations????!?!!
This is a pretty egregious misunderstanding: it treats "producing one output" as the same thing as "solving a problem".
I'm no LLM expert, but to my layfolk understanding, an LLM outputs one token at a time, over and over again, *in sequence*.
So how does an LLM output 2^k strings?
By outputting one character at a time, k*2^k times.
It doesn't matter that each of those steps costs quadratic time, the problem is still solvable!
(You could argue that this overflows the context window, but the strings have an ordering! The LLM just needs to know the *last* string printed.)
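Here's a toy Python sketch of what I mean (mine, not from the paper): each step only needs the *previous* string, the same way an autoregressive model only needs what it has already emitted.

```python
def next_binary_string(s: str) -> str | None:
    """Given one length-k binary string, return the next one in order,
    or None if s was the last one ('1' * k). Each call is cheap."""
    bits = list(s)
    # Binary increment: flip trailing 1s to 0s, then the first 0 to 1.
    for i in reversed(range(len(bits))):
        if bits[i] == "0":
            bits[i] = "1"
            return "".join(bits)
        bits[i] = "0"
    return None  # s was all 1s, so the enumeration is done

k = 3
s = "0" * k
while s is not None:
    print(s)  # 2^k strings in total, each derived only from the last one
    s = next_binary_string(s)
```

Each step is tiny; the total work is ~k*2^k only because there are exponentially many steps. A bound on the cost of *one step* tells you nothing about what a long sequence of steps can compute.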
An analogy: by this logic, it's impossible for a computer to uppercase a string, because that's an O(n) operation and a single CPU cycle can only do O(1) work.
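In code (again, my toy example, not theirs):

```python
def shout(s: str) -> str:
    out = []
    for ch in s:                # n iterations...
        out.append(ch.upper())  # ...each doing O(1) work. Scandalous.
    return "".join(out)

print(shout("many cheap steps add up"))  # MANY CHEAP STEPS ADD UP
```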
(BTW, their [7] reference in that screenshot goes to one of the author's personal blogs. It doesn't contain the calculations they claim it does.)
One more basic misunderstanding of the problem domain: "if a problem takes O(n^3) to solve, it must take O(n^3) to verify a solution"
Our best algorithms for solving 3SAT run in roughly O(1.4^n) time (n being the number of variables). That's exponential. But we can CHECK IF A SOLUTION IS CORRECT in linear time.
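If you haven't seen this before, here's roughly what "check in linear time" looks like, in a quick sketch of my own (the clause encoding is made up for illustration):

```python
def verify_3sat(clauses, assignment):
    """Check a candidate assignment in one pass over the clauses: O(#clauses).
    Each clause is 3 literals; +i means variable i, -i means its negation."""
    for clause in clauses:
        if not any(assignment[abs(lit)] == (lit > 0) for lit in clause):
            return False  # no literal in this clause is satisfied
    return True

# (x1 OR NOT x2 OR x3) AND (NOT x1 OR x2 OR x3)
clauses = [(1, -2, 3), (-1, 2, 3)]
print(verify_3sat(clauses, {1: True, 2: True, 3: False}))   # True
print(verify_3sat(clauses, {1: True, 2: False, 3: False}))  # False
```

Finding a satisfying assignment is the (believed) exponential part; checking one someone hands you is a single pass over the clauses. Solving and verifying just aren't the same cost.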
Now, I want to be clear, I'm not mad at Gizmodo for taking all the paper claims at face value. You don't need to know a lot of CS to realize this paper is bunk, but you need to know at least *some* CS.
No, I'm mad because they didn't bother to google the author.
Because if they did...
...they'd know the paper was written by A HIGH SCHOOLER.
First result on Kagi is his LinkedIn profile. He started college THIS YEAR. Two months AFTER publishing this preprint!
And no, he doesn't have any attestable experience in computer science
He's, like, a linguistics and entrepreneurship guy
So that's why AI journalism sucks. Journalists see a paper that affirms their priors, don't have the expertise to fact-check it, don't bother with the basic due diligence of making sure the author isn't an actual child, and push it to their audience as proof of a "mathematical limit" of LLMs.
Maybe I'm being too optimistic. Maybe they did check and did know that their source was AT MOST eighteen years old and published it anyway
(I'm picking on anti-AI journalism here, but pro-AI journalism is just as bad on this front.)
Futurism uncritically sharing the same bad article
futurism.com/artificial-i...
AI Agents Are Mathematically Incapable of Doing Functional Work, Paper Finds
A paper claims to mathematically prove that AI agents have a hard ceiling to their capabilities that they will never surpass.