AI and the Turing Test

This workshop explores historical, philosophical, and scientific questions relating to Turing Tests, LLMs, and the comparison between human and machine intelligence.

Please email karim.thebault@bristol.ac.uk to register.

28th April 2026, G2 Cotham House, University of Bristol, 29 Cotham Hill, Bristol BS6 6JL

Programme

10:00 – 11:00: Peter Millican (Oxford & Singapore), The Logical Route to the Turing Machine

11:00 – 12:00: Michele Pizzochero (Bath), Human or Machine?

12:00 – 13:15: Lunch (not catered)

13:15 – 14:15: James Ladyman, Max Jones, and Emanuele Ratti (Bristol), LLMs and the Evidence for General Intelligence

14:15 – 15:15: Theodor Nenu (Oxford), Computability, Incompleteness, and Intelligence

15:15 – 16:00: Panel Discussion, Chair: Ana-Maria Crețu (Bristol)

Abstracts


The Logical Route to the Turing Machine

Peter Millican
Gilbert Ryle Fellow and Professor of Philosophy, Hertford College, Oxford
Tan Chorh Chuan Professor, National University of Singapore
 
Alan Turing’s 1936 paper in the Proceedings of the London Mathematical Society presented his classic model of a universal computing machine, thus creating the discipline of theoretical Computer Science, to which his work remains central.  But what problem was he attempting to solve when he came up with what we now call a “Turing machine”?  This talk aims to answer that question.
 
The 1936 paper starts from the concept of a computable number, and the Turing machine is widely assumed to be intended as an abstract model of the processes that a human “computer” might employ in calculation.  The paper ends with “an application to the Entscheidungsproblem”, David Hilbert’s decision problem, and Turing has generally been understood as setting out to give a negative answer to this famous – and then very topical – problem.
 
However, a neglected reference within Turing’s paper points in a different direction, suggesting that his primary inspiration came neither from human computation nor from Hilbert’s problem, but instead, from a 1905 paradox about definable numbers which also inspired Gödel and Church.  Turing’s novel idea was to follow through how the paradoxical logic would work if definability were interpreted as mechanical computability, which then required him to devise a model of a computing machine.
 
Unlike its rivals, this account makes excellent sense of the sequence and structure of Turing’s mathematical argument, explaining how it leads directly to a close relative of the halting problem.  Turing then recognised that this invited a fairly obvious (albeit tricky) “application” to Hilbert’s decision problem, which in turn required addressing the issues of universality and of human computation.  But these issues came last in Turing’s work on the paper rather than first, as indeed the ordering of the sections suggests.
 
Turing later became further intrigued by the comparison between human and machine thinking, presenting the “Turing Test” as the focus of his equally famous (but far less substantial) 1950 paper in Mind.  If the interpretation given here is correct, however, the 1936 paper is squarely based on a mathematical and logical problem, rather than on the project of modelling human computation.


Human or Machine?

Michele Pizzochero
Department of Physics, University of Bath, UK

In this talk, I will present some of our recent quantitative results concerning the ability of large language models (LLMs) to imitate humans. I will begin by outlining our large-scale study involving more than 72,000 Turing-like tests across language and vision tasks, demonstrating that current LLMs are rapidly approaching the point where they can convincingly impersonate humans by deceiving human judges in a variety of domains [1]. Next, I will focus on Turing-like tests carried out at the population level, rather than the individual level. Drawing on a recent survey [2], I will show that the philosophical views of populations of physicists on science are nearly indistinguishable from those of their LLM-generated personas [3]. Finally, I will conclude with some considerations on the role of LLMs in learning physics at the university level, quantifying their ability to solve physics exams [4] and their potential to support students’ learning [5]. More broadly, these quantitative studies offer a framework for understanding how Turing-style tests can translate into the real-world deployment of LLMs across practical domains.

[1] M. Zhang, M. Pizzochero, …, G. Kreiman, Nature Human Behaviour, accepted (2026)
[2] C. Henne, H. Tomczyk, C. Sperber, Eur. J. Philos. Sci. 14, 27 (2025)
[3] M. Pizzochero & G. Dellaferrera, arXiv:2507.00675 (2025)
[4, 5] B. Dowsett, M. Rey, M. Pizzochero, submitted (2026)


LLMs and the Evidence for General Intelligence

James Ladyman, Max Jones, and Emanuele Ratti
University of Bristol

The history of AI is characterized by numerous attempts to evaluate new approaches through variations of the Turing Test. In a provocative commentary, Chen et al. (2026) take stock of the evidence concerning LLMs and how they perform in the imitation game and comparable settings. Their conclusion is that the evidence is clear and unambiguous: “LLMs have shown many signs of the sort of broad, flexible cognitive competence that was Turing’s focus – what we now call ‘general intelligence’” (p. 36).
This talk advances three claims that challenge Chen et al.’s views. First, the notion of ‘general intelligence’ assumed in the article is tied to the notion of ‘cognition’ in such a way that it raises difficult conceptual problems. In light of these issues, we should reconsider what the evidence presented actually supports. The second claim is that the evidence unambiguously indicates only that LLMs are capable of executing tasks that we typically associate with human-level thinking. However, this claim is neither particularly controversial nor problematic, unless LLMs are capable of executing those tasks autonomously. This leads to the third claim: if LLMs are not autonomous agents (as Chen et al. suggest), then assessing whether LLMs can execute tasks that are associated with thinking is as crucial as discussing whether pocket calculators can perform tasks (i.e., calculating) that are traditionally associated with humans.

Chen et al. 2026. Does AI Have Human-Level Intelligence? The Evidence Is Clear. Nature, Vol 650


Computability, Incompleteness, and Intelligence

Theodor Nenu
University of Oxford

This talk will examine various themes arising in the context of Alan Turing’s mathematical and philosophical work, including the question of whether he can properly be credited with proving the undecidability of the halting problem, as well as the question of whether his response to the “mathematical objection” to his proposed test of machine intelligence is philosophically viable.