AI for TCS?

Jukka Suomela · July 29, 2025

While following the STOC 2025 presentations, I started to pay attention not to what was present, but to what was conspicuously missing: any reference to the use of modern generative AI and machine learning to accelerate research in theoretical computer science.

Sure, there were a couple of lighthearted references to chatbots in the business meeting, and Scott Aaronson’s keynote concluded with a joke about generative AI soon doing our work for us, but it seemed like nobody was taking AI seriously as something that could change the TCS community for the better.

It is not the case that TCS in general is still living in the 1970s: the TCS community is happy to work on post-quantum blockchain and other topics that might be considered modern or even hyped. There were even some talks of the flavor “TCS for AI”, with TCS researchers doing work that aims to help AI researchers understand the capabilities and limitations of generative chatbots. But where is the work of the flavor “AI for TCS”?

If you go to a conference on software engineering or corpus linguistics in 2025, you will of course find a range of papers discussing how AI can or could help with research and practice in those areas. Why not in TCS?

When I asked various colleagues about this, the first reaction was along the lines of “but what could we possibly do with AI in TCS, they can’t prove our theorems (yet)?” So let me try to explain what I mean by “AI for TCS”, with the help of a couple of example scenarios of possible futures. I try to make the scenarios concrete so that they are easy to understand (feel free to replace words like Lean and Rust with your own favorite tools).

Scenario 1: formalizations

Imagine a future, say 10 years from now, that looks like this:

All STOC/FOCS/SODA/ICALP submissions come with a link to an online appendix, where the main theorems of the paper have been formalized in Lean. This is little extra work for the authors, as they only need to write the statement of the theorem in Lean, and generative AI tools will then work together with Lean to translate the informal human-written proof into something that makes Lean happy.
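
To make this concrete, here is a minimal sketch of what such an author-written Lean statement might look like. Everything here is hypothetical: a toy claim about sorting lists of natural numbers, not a theorem from any actual paper. The “sorry” marks the proof that the AI tools, working together with Lean, would be expected to fill in from the informal human-written proof.

    -- Hypothetical example: the author writes only the statement; the proof is
    -- left as "sorry" for the AI + Lean pipeline to produce.
    def isSorted : List Nat → Prop
      | [] => True
      | [_] => True
      | a :: b :: rest => a ≤ b ∧ isSorted (b :: rest)

    -- A deliberately simplified claim (a real statement would also require
    -- l' to be a permutation of l).
    theorem exists_sorted_of_same_length (l : List Nat) :
        ∃ l' : List Nat, l'.length = l.length ∧ isSorted l' := by
      sorry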

I think this would be an amazing future. It would put advances in TCS on a much more robust foundation. It would help program committees and reviewers gain confidence that submissions are correct. We would catch mistakes much earlier, before they influence follow-up work. And it would make it much easier to build on prior work, as all assumptions would be stated in a formally precise manner.

Scenario 2: implementations

Now imagine another possible future, say 10 years from now, that looks like this:

Most STOC/FOCS/SODA/ICALP submissions come with a link to an online appendix, with efficient Rust implementations of all the new algorithms (and other algorithmic entities such as reductions) that were presented in the paper. The implementations come with test cases that demonstrate that the algorithm indeed works correctly, and there are benchmarks that compare the practical performance of the algorithm with all relevant prior algorithms. Everything is packaged as a neat Rust crate that practitioners can readily use with “cargo add” as part of their programs. This is little extra work for the authors, as generative AI tools will translate the informal human-written pseudocode into a concrete Rust implementation.
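
As a rough illustration (with a made-up crate and a stand-in algorithm, not taken from any actual paper), such a companion crate might expose the paper’s algorithm as an ordinary library function, with tests along these lines:

    // Hypothetical companion crate for a paper; the classic greedy 2-approximation
    // for minimum vertex cover is used here only as a stand-in algorithm.

    /// Returns a vertex cover of size at most twice the optimum for the graph
    /// on vertices 0..n given as an edge list.
    pub fn approx_vertex_cover(n: usize, edges: &[(usize, usize)]) -> Vec<usize> {
        let mut covered = vec![false; n];
        let mut cover = Vec::new();
        for &(u, v) in edges {
            // Take both endpoints of any edge that is not yet covered.
            if !covered[u] && !covered[v] {
                covered[u] = true;
                covered[v] = true;
                cover.push(u);
                cover.push(v);
            }
        }
        cover
    }

    #[cfg(test)]
    mod tests {
        use super::*;

        #[test]
        fn covers_every_edge() {
            let edges = [(0, 1), (1, 2), (2, 3), (3, 0)];
            let cover = approx_vertex_cover(4, &edges);
            for &(u, v) in &edges {
                assert!(cover.contains(&u) || cover.contains(&v));
            }
        }
    }

Benchmarks comparing against prior implementations could live alongside the tests (for example via “cargo bench”), and practitioners would pull the crate into their own projects with a single “cargo add”.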

I think this would also be an amazing future. It would again help us gain confidence that the results are indeed correct, but it would also help tremendously with ensuring that work done in the TCS community has broad impact in industry: why not use the latest, fastest algorithms also in practice, if they are readily available as a library? It would also help to identify which algorithmic ideas are primarily of theoretical interest and which also lead to efficient solutions on real-world computers. And it could inspire new work when TCS researchers see concretely that, for example, current algorithms are a poor fit for the intricacies of modern superscalar, pipelined, vectorized CPUs and their memory hierarchies.

Combined with scenario 1, in the more distant future we could eventually have automatically generated, formally verified implementations of the algorithms!

Can we get there?

Both of the above scenarios have the following in common:

  1. We will never get there without the help of automated tools like generative AI. Formalizing informal proofs and engineering efficient implementations of algorithms involve a huge amount of manual work, and this work is also outside the skill set of many TCS researchers. The research community will never agree to take on this extra burden if it remains manual work.
  2. It is not unreasonable to expect that generative AI might eventually be able to do this, if we start pushing it in the right direction. After all, both scenarios seem to boil down to translating text from one “language” to another and filling in some boilerplate code. AI tools are already very good at turning informal human-written descriptions into correct program code. There is also little need for human verification, especially in the formalization scenario: if Lean is convinced, I am convinced.
  3. These scenarios are highly unlikely to happen if we just sit and wait for advances in generative AI. We need to start planning, building scaffolding, and doing small-scale experiments. For both scenarios, we need some concrete starting point (e.g. key definitions commonly used in TCS already formalized in Lean, so that one can at least write theorem statements easily; see the sketch after this list). We need to start building a corpus of training material that we can use to train and fine-tune language models to best suit our needs.
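
As one hypothetical example of such a starting point, a shared Lean vocabulary for TCS could include basic notions like asymptotic upper bounds, so that running-time statements can be written in a single line. (Mathlib already contains far more general machinery; the sketch below is only an illustration, and the proof is again left for the AI tools.)

    -- Hypothetical "shared vocabulary" definition: big-O for Nat-valued functions.
    def IsBigO (f g : Nat → Nat) : Prop :=
      ∃ c n₀ : Nat, ∀ n ≥ n₀, f n ≤ c * g n

    -- With such definitions in place, stating a running-time bound is one line,
    -- and the proof can be left for the AI + Lean pipeline to fill in.
    example (t : Nat → Nat) (h : ∀ n, t n ≤ 3 * n + 7) :
        IsBigO t (fun n => n + 1) := by
      sorry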

This brings me to the main question: Where is this work going to take place? When do we get started? Are we already going to see work along these lines at STOC/FOCS/SODA/ICALP 2026?