
A recent experience with ChatGPT 5.5 Pro

Hacker News · May 9, 2026, 2:41 AM


We are all having to keep revising upwards our assessments of the mathematical capabilities of large language models. I have just made a fairly large revision as a result of ChatGPT 5.5 Pro, to which I am fortunate to have been given access, producing a piece of PhD-level research in an hour or so, with no serious mathematical input from me.

The background is that, as has been widely reported, LLMs are now capable of solving research-level problems, and have managed to solve several of the Erdős problems listed on Thomas Bloom's wonderful website. Initially it was possible to laugh this off: many of the solutions consisted in the LLM noticing that the problem had an answer sitting there in the literature already, or could be very easily deduced from known results. But little by little the laughter has become quieter. The message I am getting from what other mathematicians more involved in this enterprise have been saying is that LLMs have got to the point where, if a problem has an easy argument that for one reason or another human mathematicians have missed (that reason sometimes, but not always, being that the problem has not received all that much attention), then there is a good chance that the LLMs will spot it. Conversely, for problems where one's initial reaction is to be impressed that an LLM has come up with a clever argument, it often turns out on closer inspection that there are precedents for those arguments, so it is still just about possible to comfort oneself that LLMs are merely putting together existing knowledge rather than having truly original ideas. How much of a comfort that is I will not discuss here, other than to note that quite a lot of perfectly good human mathematics consists in putting together existing knowledge and proof techniques.

I decided to try something a little bit different. At least in combinatorics, there are quite a lot of papers that investigate some relatively new combinatorial parameter that leads naturally to several questions. Because of the sheer number of questions one can ask, the authors of such papers will not necessarily have the time to spend a week or two thinking about each one, so there is a decent probability that at least some of them will not be all that hard. This makes such papers very valuable as sources of problems for mathematicians who are doing research for the first time and who will be hugely encouraged by solving a problem that was officially open. Or rather, it used to make them valuable in that way, but it looks as though the bar has just been raised. It is no longer enough that somebody has posed a problem: it needs to be hard enough that an LLM cannot solve it.
