It must be tough being a kid these days. Born too late to actually enjoy the internet, too early to declare yourself god-emperor of a desert wasteland run on water scarcity and guzzoline – and should you try to numb the pain with a little light math, you’ll most likely have to put up with coming second to a robot.
“The International Mathematical Olympiad is a modern-day arena for the world’s brightest high-school mathematicians,” write Trieu Trinh and Thang Luong, research scientists at Google DeepMind, in a new blog post about their breakthrough artificial intelligence (AI) system, AlphaGeometry.
AlphaGeometry is “an AI system that solves complex geometry problems at a level approaching a human Olympiad gold-medalist – a breakthrough in AI performance,” they announce. “In a benchmarking test of 30 Olympiad geometry problems, AlphaGeometry solved 25 within the standard Olympiad time limit. For comparison […] the average human gold medalist solved 25.9 problems.”
It’s not just the system’s score in the contest that’s impressive. It’s been almost 50 years since the first ever mathematical proof by computer – essentially a brute-force workthrough of the four-color theorem – and since then, the admittedly controversial realm of computer-assisted proofs has come on leaps and bounds.
But very recently, with the dawn of things like big data and advanced machine learning techniques, we’ve started to see a shift – however slight – away from using computers as simple number-crunchers, and towards artificial intelligence that can produce genuinely creative proofs.
The fact that AlphaGeometry can tackle the kinds of complex mathematical problems faced by Olympiad mathletes may signal a key milestone in AI research, Trinh and Luong believe.
Until now, such a program would face at least two major hurdles. Firstly, computers are, well, computers; as anybody who’s ever written out 50 pages of code only to have the whole thing foiled by one mistyped semicolon in line 337 can tell you, they’re not great at things like reasoning or deduction. Secondly, math is kind of difficult to teach to even the most cutting-edge machine learning system.
“Learning systems like neural networks are quite bad at doing ‘algebraic reasoning’,” David Saxton, also of DeepMind, told New Scientist back in 2019.
“Humans are good at [math],” he added, “but they are using general reasoning skills that current artificial learning systems don’t possess.”
AlphaGeometry, however, takes on these challenges by combining a neural language model – good at making quick predictions, but rubbish at making actual sense – with a symbolic deduction engine. These latter machines are “based on formal logic and use clear rules to arrive at conclusions,” Trinh and Luong write, making them better at rational deduction, but also slow and inflexible – “especially when dealing with large, complex problems on their own.”
Together, the two systems worked in a sort of loop: the symbolic deduction engine would chug away at the problem until it got stuck, at which point the language model would suggest a new construct, such as an extra point or line, for the engine to work with. It was a great theory – there was just one problem. What would they train the language model on?
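For the curious, that back-and-forth can be sketched in a few lines of Python. This is a toy under loud assumptions, not DeepMind's code: facts are plain strings, a rule is a hypothetical `(premises, conclusion)` pair, and `toy_proposer` is a hard-coded stand-in for the language model.

```python
from typing import Callable, Optional, Sequence

Fact = str
Rule = tuple[frozenset[str], str]  # (premises, conclusion) -- toy format, assumed


def deduce(facts: set[Fact], rules: Sequence[Rule]) -> set[Fact]:
    """Stand-in 'symbolic engine': apply rules until nothing new appears."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def solve(
    facts: set[Fact],
    rules: Sequence[Rule],
    goal: Fact,
    propose: Callable[[set[Fact], Fact], Optional[Fact]],
    max_hints: int = 5,
) -> tuple[bool, list[Fact]]:
    """Deduce until stuck, ask the proposer for a hint, then deduce again."""
    known = deduce(facts, rules)
    added: list[Fact] = []
    while goal not in known and len(added) < max_hints:
        hint = propose(known, goal)  # hypothetical language-model call
        if hint is None or hint in known:
            break  # the proposer has nothing new to offer
        added.append(hint)
        known = deduce(known | {hint}, rules)
    return goal in known, added


# Toy problem: show that the base angles of an isosceles triangle are equal.
RULES: list[Rule] = [
    (frozenset({"AB = AC", "D is the midpoint of BC"}),
     "triangle ABD is congruent to triangle ACD"),
    (frozenset({"triangle ABD is congruent to triangle ACD"}),
     "angle ABC = angle ACB"),
]
GOAL = "angle ABC = angle ACB"


def toy_proposer(known: set[Fact], goal: Fact) -> Optional[Fact]:
    """Hard-coded hint; a real system would predict this from training."""
    hint = "D is the midpoint of BC"
    return None if hint in known else hint


solved, hints = solve({"AB = AC"}, RULES, GOAL, toy_proposer)
```

On its own, the engine has no rule that fires from `AB = AC` alone, so it stalls; once the proposer adds the midpoint, the remaining steps follow mechanically.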
Ideally, the program would be fed millions if not billions of human-made geometric proofs, which it could then chew up and spit back out in varying levels of gobbledegook. But “human-made” and “geometric” don’t exactly work well with “computer program” – “[AlphaGeometry] does not ‘see’ anything about the problems that it solves,” Stanislas Dehaene, a cognitive neuroscientist at the Collège de France who studies foundational geometric knowledge, told the New York Times. “There is absolutely no spatial perception of the circles, lines and triangles that the system learns to manipulate.”
So the team had to come up with a different solution. “Using highly parallelized computing, the system started by generating one billion random diagrams of geometric objects and exhaustively derived all the relationships between the points and lines in each diagram,” Trinh and Luong explain.
“AlphaGeometry found all the proofs contained in each diagram, then worked backwards to find out what additional constructs, if any, were needed to arrive at those proofs,” they continue. They call this process “symbolic deduction and traceback.”
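In spirit, “deduce everything, then trace back” might look something like the Python sketch below. It is an illustration only: the string facts, the `(premises, conclusion)` rule format, and the convention that capital letters name points are all assumptions made for this toy, not details taken from AlphaGeometry itself.

```python
Fact = str
Rule = tuple[frozenset[str], str]  # (premises, conclusion) -- toy format, assumed


def deduce_with_parents(premises: list[Fact], rules: list[Rule]) -> dict[Fact, frozenset[str]]:
    """Derive every reachable fact, remembering which facts produced it."""
    parents: dict[Fact, frozenset[str]] = {p: frozenset() for p in premises}
    changed = True
    while changed:
        changed = False
        for needed, conclusion in rules:
            if conclusion not in parents and all(n in parents for n in needed):
                parents[conclusion] = needed
                changed = True
    return parents


def traceback(goal: Fact, parents: dict[Fact, frozenset[str]]) -> set[Fact]:
    """Walk back from a derived fact to the starting premises it depends on."""
    used: set[Fact] = set()
    seen: set[Fact] = set()
    stack = [goal]
    while stack:
        fact = stack.pop()
        if fact in seen:
            continue
        seen.add(fact)
        if parents[fact]:
            stack.extend(parents[fact])
        else:
            used.add(fact)  # no parents: this was a starting premise
    return used


def points(fact: Fact) -> set[str]:
    """Toy convention (assumed): every capital letter names a point."""
    return {ch for ch in fact if ch.isupper()}


def split_premises(goal: Fact, used: set[Fact]) -> tuple[set[Fact], set[Fact]]:
    """Premises mentioning points absent from the goal count as extra constructs."""
    extra = {p for p in used if not points(p) <= points(goal)}
    return used - extra, extra


# A tiny hand-written "diagram"; the real system generated its own at random.
DIAGRAM = ["AB = AC", "D is the midpoint of BC", "E is the midpoint of AB"]
RULES: list[Rule] = [
    (frozenset({"AB = AC", "D is the midpoint of BC"}),
     "triangle ABD is congruent to triangle ACD"),
    (frozenset({"triangle ABD is congruent to triangle ACD"}),
     "angle ABC = angle ACB"),
]
GOAL = "angle ABC = angle ACB"

parents = deduce_with_parents(DIAGRAM, RULES)
used = traceback(GOAL, parents)
problem, extra = split_premises(GOAL, used)
```

Here the traceback discards point E as irrelevant, keeps `AB = AC` as the problem statement, and flags the midpoint D as the extra construct a solver would have to come up with. That last piece is the sort of thing a language model could then be trained to suggest.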
And it was evidently successful: not only was the AI nearly as good as the average human IMO gold medalist, but it was 2.5 times as successful as the previous state-of-the-art system to attempt the challenge, which solved just 10 of the 30 problems. “Its geometry capability alone makes it the first AI model in the world capable of passing the bronze medal threshold of the IMO in 2000 and 2015,” the pair note.
While the system is currently confined to geometry problems, Trinh and Luong hope to expand the capabilities of math AI across far more disciplines.
“We’re not making incremental improvement,” Trinh told the Times. “We’re making a big jump, a big breakthrough in terms of the result.”
“Just don’t overhype it,” he added.