AI and Proving Theorems

Paul Erdős famously said that mathematicians are machines that turn coffee into theorems. A couple of recent articles in the Wall Street Journal revealed the current state-of-the-art for AI to do the same.

In July, Ben Cohen published the article The High-Schoolers Who Just Beat the World’s Smartest AI Models. The focus of the article was the 2025 International Mathematical Olympiad (IMO), the pinnacle of the calendar for high school mathematics competitions. In the United States, the pathway to the IMO is first excelling at a sequence of increasingly difficult exams: the AMC 12 (or possibly AMC 10), then the American Invitational Mathematics Examination (AIME), and then the USA Mathematical Olympiad (USAMO) or USA Junior Mathematical Olympiad (USAJMO). The top USAMO and USAJMO participants then get invited to a special camp from which the participants in that year’s IMO are selected.

My personal story: back in high school, my score on the AMC 12 (then called the AHSME) qualified me for the AIME my sophomore and junior years, where my run in the competition ended with a resounding thud. My senior year, I caught lightning in a bottle and somehow qualified for the USAMO; I’m not sure what the cut-off is these days, but back then only 150 or so high school students qualified for the USAMO each year. My excitement at qualifying for the USAMO gave way to utter humiliation after I actually attempted the exam (to say that I “took” the exam is probably a misuse of the word “took”). All this to say: I never came close to sniffing the IMO. From the Wall Street Journal article:

The famously grueling IMO exam is held over two days and gives students three increasingly difficult problems a day and more than four hours to solve them. The questions span algebra, geometry, number theory and combinatorics—and you can forget about answering them if you’re not a math whiz. You’ll give your brain a workout just trying to understand them. 

Because those problems are both complex and unconventional, the annual math test has become a useful benchmark for measuring AI progress from one year to the next. In this age of rapid development, the leading research labs dreamed of a day their systems would be powerful enough to meet the standard for an IMO gold medal, which became the AI equivalent of a four-minute mile. 

But nobody knew when they would reach that milestone or if they ever would—until now. 

The unthinkable occurred earlier this month when an AI model from Google DeepMind earned a gold-medal score at IMO by perfectly solving five of the six problems. In another dramatic twist, OpenAI also claimed gold despite not participating in the official event. The companies described their feats as giant leaps toward the future—even if they’re not quite there yet. 

In fact, the most remarkable part of this memorable event is that 26 students got higher scores on the IMO exam than the AI systems. 

A second article by Ben Cohen, The Math Legend Who Just Left Academia — for an AI Startup Run by a 24-Year-Old, might be a precursor of things to come. One of the two stars of the article is number theorist Dr. Ken Ono. From the article:

In recent years, Ono began tracking AI’s remarkable progress as it rapidly improved. He was intrigued, though not intimidated. AI was astonishing at cognitive tasks and solving problems it had already seen, but it struggled with the creative elements of his field, which require intuition and abstract thinking.

That creativity is so fundamental to pure mathematics that Ono figured his job would be safe for decades.

But last spring, he was one of 30 mathematicians invited to curate research-level problems as a test of the AI models. He left the symposium profoundly shaken by what he’d seen.

“The lead I had on the models was shrinking,” he said. “And in areas of mathematics that were not in my wheelhouse, I felt like the models were already blowing me away.”

For months afterward, Ono felt like he was grieving his identity. He didn’t know what to do next, knowing that AI models would only get smarter.

“Then I had an epiphany,” he said. “I realized what the models were offering was a different way of doing math.”

Dr. Ono is now taking an extended leave from the University of Virginia to join a new AI startup company called Axiom. From Tech Funding News:

Led by Carina Hong, Axiom Math is developing an AI system that not only solves complex math problems but also generates new mathematical knowledge by proposing conjectures: mathematical statements that have yet to be proven.

The model produces rigorous, step-by-step proofs that can be independently verified using proof assistants such as Lean and Coq. This approach aims to transform English-language math from textbooks and research papers into code, enabling the AI to create and validate new problems that push the boundaries of existing knowledge…

Currently, Axiom is working on models that can discover and solve new math problems. The researchers also hope to apply their work in areas like finance, aircraft design, chip design, and quantitative trading.

Beyond pure mathematics, Axiom’s AI tool is being tested for practical applications in fields requiring rigorous computational precision, including finance, aircraft and chip design, and quantitative trading.

Time will tell if the intersection of AI with mathematics can generate a profitable company. What I don’t doubt is that the previously unthinkable — original mathematical work by AI — will eventually happen, given enough time.

Different Ways of Expressing Small Proportions

I recently read the article Pipe Dreams about treating wastewater. I’m not an engineer and make no claims of expertise about the accuracy of the article. What did catch my attention, as a mathematician, is how the author chose to express small proportions. For example, the opening sentence:

Wastewater is 99.9 percent water, but boy, that last little bit.

Later in the article:

Orange County’s is an example of indirect potable reuse, where wastewater is cleansed to 99.9999999999 percent free of pathogens before it goes to an environmental buffer like a reservoir or an aquifer for further natural filtering and then to homes. 

And later:

After treating the water to even higher standards—demonstrating a 99.999999999999999999 percent removal rate of viruses and similarly high removal rates of protozoa—they may send the cleansed water directly into the water distribution system.

I was struck by the psychology of communicating all those consecutive 9s when expressing these proportions. For example, if the proportion of impurities had been given instead of the proportion of water, the previous sentences could be rewritten as:

Only one part per thousand of wastewater is impurities.

Orange County’s is an example of indirect potable reuse, where impurities are reduced to one part per trillion before it goes to an environmental buffer like a reservoir or an aquifer for further natural filtering and then to homes. 

After treating the water to even higher standards, reducing impurities to one part per 100 million trillion, they may send the cleansed water directly into the water distribution system.

Of these two different ways of expressing the same information, it seems to me that the author’s original prose is perhaps the more psychologically comforting. “One part per trillion” seems a little abstract, as most people don’t have an intuitive notion of just how big a trillion is. The phrase “99.9999999999 percent,” on the other hand, seems at first reading to be ridiculously close to 100 percent (which, of course, it is).
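For readers who want to check the conversions between the two ways of writing these proportions, here is a minimal Python sketch. The helper name one_part_per is my own (hypothetical), and the percentages are passed as strings so that Python's decimal module keeps every 9 exactly instead of rounding to a floating-point number.

```python
from decimal import Decimal

def one_part_per(percent_pure: str) -> int:
    """Convert 'percent pure' into N, where the remainder is one part per N."""
    remainder_percent = Decimal(100) - Decimal(percent_pure)  # e.g. 0.1 percent
    return int(Decimal(100) / remainder_percent)

# The three percentages quoted from the article.
for pct in ["99.9", "99.9999999999", "99.999999999999999999"]:
    print(f"{pct} percent pure -> one part per {one_part_per(pct):,}")
```

The three results are a thousand, a trillion (10^12), and 100 million trillion (10^20), matching the rewritten sentences above.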

High School Students Finding New Proofs of Old Theorems (Part 2): Pythagorean theorem

This is a new favorite story to share with students: two high school students recently figured out multiple new proofs of the Pythagorean theorem.

Professional article in the American Mathematical Monthly (requires a subscription): https://maa.tandfonline.com/doi/full/10.1080/00029890.2024.2370240

Video describing one of their five ideas:

Interview in MAA Focus: http://digitaleditions.walsworthprintgroup.com/publication/?i=836749&p=14&view=issueViewer

Interview by 60 Minutes:

https://www.youtube.com/watch?v=VHeWndnHuQs

Praise from Michelle Obama: https://www.facebook.com/michelleobama/posts/i-just-love-this-story-about-two-high-school-students-calcea-johnson-and-nekiya-/750580956432311/

High School Students Finding New Proofs of Old Theorems (Part 1): Dividing a line segment with straightedge and compass

This is one of my all-time favorite stories to share with students: how a couple of ninth graders in 1995 played with Geometer’s Sketchpad and stumbled upon a brand-new way of using only a straightedge and compass to divide a line segment into any number of equal-sized parts. This article was published in 1997 and made quite a media sensation at the time.

Higher derivatives in ordinary speech

Just about every calculus student is taught that the first derivative is useful for finding the slope of a curve and finding velocity from position, and that the second derivative is useful for finding the concavity of a curve and finding acceleration from position.

I recently came across a couple of quotes that, taken literally, are statements about third and fifth derivatives.

Per Wikipedia, President Nixon announced in 1972 that the rate of increase of inflation was decreasing. Taken literally, this claims that “the second derivative of inflation is negative, and so the third derivative of purchasing power [since inflation is the derivative of purchasing power] is negative.” As dryly stated in the Notices of the American Mathematical Society, “[t]his was the first time a sitting president used the third derivative to advance his case for reelection”; the article then ponders the implications of the abuse of mathematics.

More recently, the popular blog Math With Bad Drawings had some fun analyzing a clause that appeared in a 2013 op-ed piece: “As the rate of acceleration of innovation increases…” Taken literally, the words rate, innovation and increases all refer to a first derivative (innovation would be the rate at which technology changes), while the word acceleration refers to a second derivative. Therefore, taken literally and not rhetorically (which was clearly the authors’ intent), this brief clause is a claim that the fifth derivative of technology is positive.

Solving Problems Submitted to MAA Journals (Part 6e)

The following problem appeared in Volume 97, Issue 3 (2024) of Mathematics Magazine.

Two points P and Q are chosen at random (uniformly) from the interior of a unit circle. What is the probability that the circle whose diameter is segment \overline{PQ} lies entirely in the interior of the unit circle?

Let D_r be the interior of the circle centered at the origin O with radius r. Also, let C(P,Q) denote the circle with diameter \overline{PQ}, and let R = OP be the distance of P from the origin.

In the previous post, we showed that

\hbox{Pr}(C(P,Q) \subset D_1 \mid R = r) = \sqrt{1-r^2}.

To find \hbox{Pr}(C(P,Q) \subset D_1), I will integrate over this conditional probability:

\hbox{Pr}(C(P,Q) \subset D_1) = \displaystyle \int_0^1 \hbox{Pr}(C(P,Q) \subset D_1 \mid R = r) F'(r) \, dr,

where F(r) is the cumulative distribution function of R. For 0 \le r \le 1,

F(r) = \hbox{Pr}(R \le r) = \hbox{Pr}(P \in D_r) = \displaystyle \frac{\hbox{area}(D_r)}{\hbox{area}(D_1)} = \frac{\pi r^2}{\pi} = r^2.

Therefore,

\hbox{Pr}(C(P,Q) \subset D_1) = \displaystyle \int_0^1 \hbox{Pr}(C(P,Q) \subset D_1 \mid R = r) F'(r) \, dr

= \displaystyle \int_0^1 2 r \sqrt{1-r^2} \, dr.

To calculate this integral, I’ll use the substitution u = 1-r^2. Then the endpoints r=0 and r=1 become u = 1-0^2 = 1 and u = 1-1^2 = 0. Also, du = -2r \, dr. Therefore,

\hbox{Pr}(C(P,Q) \subset D_1) = \displaystyle \int_0^1 2 r \sqrt{1-r^2} \, dr

= \displaystyle \int_1^0 -\sqrt{u} \, du

= \displaystyle \int_0^1 \sqrt{u} \, du

= \displaystyle \frac{2}{3} \left[  u^{3/2} \right]_0^1

=\displaystyle  \frac{2}{3}\left[ (1)^{3/2} - (0)^{3/2} \right]

= \displaystyle \frac{2}{3},

confirming the answer I had guessed from simulations.
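As a sanity check on the calculus, the integral can also be approximated numerically. The following Python sketch uses a plain midpoint rule; the function name midpoint_rule and the number of subintervals are my own illustrative choices, not part of the original solution.

```python
import math

def midpoint_rule(f, a, b, n=100_000):
    """Approximate the integral of f over [a, b] using n midpoint rectangles."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Integrand: Pr(C(P,Q) inside D_1 | R = r) * F'(r) = sqrt(1 - r^2) * 2r
estimate = midpoint_rule(lambda r: 2 * r * math.sqrt(1 - r * r), 0.0, 1.0)
print(estimate, 2 / 3)
```

The numerical estimate should agree with 2/3 to several decimal places, which is reassuring but of course no substitute for the exact computation above.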

Solving Problems Submitted to MAA Journals (Part 6d)

The following problem appeared in Volume 97, Issue 3 (2024) of Mathematics Magazine.

Two points P and Q are chosen at random (uniformly) from the interior of a unit circle. What is the probability that the circle whose diameter is segment \overline{PQ} lies entirely in the interior of the unit circle?

As discussed in a previous post, I guessed from simulation that the answer is 2/3. Naturally, simulation is not a proof, and so I started thinking about how to prove this.

My first thought was to make the problem simpler by letting only one point be chosen at random instead of two. Suppose that the point P is fixed at a distance t from the origin. What is the probability that the point Q, chosen at random, uniformly, from the interior of the unit circle, has the desired property?

My second thought is that, by radial symmetry, I could rotate the figure so that the point P is located at (t,0). In this way, the probability in question is ultimately going to be a function of t.

There is a very nice way to compute such probabilities since Q is chosen uniformly from the unit circle. Let A_t be the set of all points Q within the unit circle that have the desired property. Since the area of the unit circle is \pi(1)^2 = \pi, the probability of the desired property happening is

\displaystyle \frac{\hbox{area}(A_t)}{\pi}.

Based on the simulations discussed in the previous post, my guess was that A_t was the interior of an ellipse centered at the origin with a semimajor axis of length 1 and a semiminor axis of length \sqrt{1-t^2}. Now I had to think about how to prove this.

As noted earlier in this series, the circle with diameter \overline{PQ} will lie within the unit circle exactly when MO+MP < 1, where M is the midpoint of \overline{PQ}. So suppose that P has coordinates (t,0), where t is known, and let the coordinates of Q be (x,y). Then the coordinates of M will be

\displaystyle \left( \frac{x+t}{2}, \frac{y}{2} \right),

so that

MO = \displaystyle \sqrt{ \left( \frac{x+t}{2} \right)^2 + \left( \frac{y}{2} \right)^2}

and

MP = \displaystyle \sqrt{ \left( \frac{x+t}{2} - t\right)^2 + \left( \frac{y}{2} \right)^2} =  \sqrt{ \left( \frac{x-t}{2} \right)^2 + \left( \frac{y}{2} \right)^2}.

Therefore, the condition MO+MP < 1 (again, equivalent to the condition that the circle with diameter \overline{PQ} lies within the unit circle) becomes

\displaystyle \sqrt{ \left( \frac{x+t}{2} \right)^2 + \left( \frac{y}{2} \right)^2} + \sqrt{ \left( \frac{x-t}{2} \right)^2 + \left( \frac{y}{2} \right)^2} < 1,

which simplifies to

\displaystyle \sqrt{ \frac{1}{4} \left[ (x+t)^2 + y^2 \right]} + \sqrt{ \frac{1}{4} \left[ (x-t)^2 + y^2 \right]} < 1

\displaystyle \frac{1}{2}\sqrt{   (x+t)^2 + y^2} +  \frac{1}{2}\sqrt{  (x-t)^2 + y^2} < 1

\displaystyle \sqrt{   (x+t)^2 + y^2} +  \sqrt{  (x-t)^2 + y^2} < 2.

When I saw this, light finally dawned. Given two points F_1 and F_2, called the foci, an ellipse is defined to be the set of all points Q so that QF_1 + QF_2 = 2a, where a is a constant. If the coordinates of Q, F_1, and F_2 are (x,y), (c,0), and (-c,0), then this becomes

\displaystyle \sqrt{   (x+c)^2 + y^2} +  \sqrt{  (x-c)^2 + y^2} = 2a.

Therefore, the set A_t is the interior of an ellipse centered at the origin with a = 1 and c = t. Furthermore, a = 1 is the semimajor axis of the ellipse, while the semiminor axis is equal to b = \sqrt{a^2-c^2} = \sqrt{1-t^2}.

At last, I could now return to the original question. Suppose that the point P is fixed at a distance t from the origin. What is the probability that the point Q, chosen at random, uniformly, from the interior of the unit circle, has the property that the circle with diameter \overline{PQ} lies within the unit circle? Since A_t is a subset of the interior of the unit circle, we see that this probability is equal to

\displaystyle \frac{\hbox{area}(A_t)}{\hbox{area of unit circle}} = \frac{\pi \cdot 1 \cdot \sqrt{1-t^2}}{\pi (1)^2} = \sqrt{1-t^2}.
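The area formula can be checked without any randomness by laying a fine grid over the square [-1,1] \times [-1,1] and counting the grid cells whose centers satisfy MO + MP < 1. This is only a rough Python sketch; the function name area_of_success_region and the grid size are my own assumptions.

```python
import math

def area_of_success_region(t, n=600):
    """Approximate area(A_t) by counting cells of an n-by-n grid over [-1, 1]^2."""
    h = 2.0 / n  # side length of each grid cell
    count = 0
    for i in range(n):
        x = -1 + (i + 0.5) * h
        for j in range(n):
            y = -1 + (j + 0.5) * h
            mo = math.hypot((x + t) / 2, y / 2)  # distance from M to the origin
            mp = math.hypot((x - t) / 2, y / 2)  # distance from M to P = (t, 0)
            if mo + mp < 1:
                count += 1
    return count * h * h

t = 0.6
print(area_of_success_region(t) / math.pi, math.sqrt(1 - t * t))
```

Since the ellipse lies inside the unit circle, there is no need to test separately whether each grid point is inside the unit circle. Dividing the approximate area by \pi should give a value close to \sqrt{1-t^2}.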

In the next post, I’ll use this intermediate step to solve the original question.

Solving Problems Submitted to MAA Journals (Part 6c)

The following problem appeared in Volume 97, Issue 3 (2024) of Mathematics Magazine.

Two points P and Q are chosen at random (uniformly) from the interior of a unit circle. What is the probability that the circle whose diameter is segment \overline{PQ} lies entirely in the interior of the unit circle?

As discussed in the previous post, I guessed from simulation that the answer is 2/3. Naturally, simulation is not a proof, and so I started thinking about how to prove this.

My first thought was to make the problem simpler by letting only one point be chosen at random instead of two. Suppose that the point P is fixed at a distance t from the origin. What is the probability that the point Q, chosen at random, uniformly, from the interior of the unit circle, has the desired property?

My second thought is that, by radial symmetry, I could rotate the figure so that the point P is located at (t,0). In this way, the probability in question is ultimately going to be a function of t.

There is a very nice way to compute such probabilities since Q is chosen uniformly from the unit circle. Let A_t be the set of all points Q within the unit circle that have the desired property. Since the area of the unit circle is \pi(1)^2 = \pi, the probability of the desired property happening is

\displaystyle \frac{\hbox{area}(A_t)}{\pi}.

So, if I could figure out the shape of A_t, I could compute this conditional probability given the location of the point P.

But, once again, I initially had no idea of what this shape would look like. So, once again, I turned to simulation with Mathematica. As noted earlier in this series, the circle with diameter \overline{PQ} will lie within the unit circle exactly when MO+MP < 1, where M is the midpoint of \overline{PQ}. For my initial simulation, I chose P to have coordinates (0.5,0).

To my surprise, I immediately recognized that the points had the shape of an ellipse centered at the origin. Indeed, with a little playing around, it looked like this ellipse had a semimajor axis of 1 and a semiminor axis of about 0.87.

My next thought was to attempt to find the relationship between the length of the semiminor axis and the distance t of P from the origin. I thought I’d draw a few of these simulations for different values of t and then try to see if there was some natural function connecting t to my guesses. My next attempt was t = 0.6; as it turned out, it looked like the semiminor axis now had a length of 0.8.

At this point, something clicked: (6,8,10) is a Pythagorean triple, meaning that

6^2 + 8^2 = 10^2

(0.6)^2 + (0.8)^2 = 1^2

(0.8)^2 = 1 - (0.6)^2

0.8 = \sqrt{1 - (0.6)^2}

Also, 0.87 is very close to \sqrt{3}/2, a very familiar number from trigonometry:

\displaystyle \frac{\sqrt{3}}{2} = \sqrt{1 - (0.5)^2}

So I had a guess: the semiminor axis has length \sqrt{1-t^2}. A few more simulations with different values of t confirmed this guess. For instance, here’s the picture with t = 0.9.
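My simulations were done in Mathematica, but the same experiment can be sketched in Python. The code below is an illustration, not my original code: it fixes P at (t,0), generates many points Q in the unit circle by rejection sampling, and records the largest |x| and |y| among the points Q that satisfy MO + MP < 1. The function names and the number of trials are hypothetical choices.

```python
import math
import random

def sample_unit_disk(rng):
    """Rejection sampling: draw from the square [-1, 1]^2 until the point lands in the disk."""
    while True:
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if x * x + y * y < 1:
            return x, y

def estimate_semi_axes(t, trials=200_000, seed=1):
    """Estimate the half-width and half-height of the successful region for P = (t, 0)."""
    rng = random.Random(seed)
    max_x = max_y = 0.0
    for _ in range(trials):
        x, y = sample_unit_disk(rng)
        mo = math.hypot((x + t) / 2, y / 2)  # distance from midpoint M to origin O
        mp = math.hypot((x - t) / 2, y / 2)  # radius of the circle with diameter PQ
        if mo + mp < 1:
            max_x = max(max_x, abs(x))
            max_y = max(max_y, abs(y))
    return max_x, max_y

for t in (0.5, 0.6, 0.9):
    a, b = estimate_semi_axes(t)
    print(f"t = {t}: semimajor ~ {a:.3f}, semiminor ~ {b:.3f}, "
          f"sqrt(1 - t^2) = {math.sqrt(1 - t * t):.3f}")
```

Because the extreme successful points lie just inside the boundary, these estimates approach the true semi-axes from below as the number of trials grows.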

Now that I was psychologically certain of the answer for A_t, all that remained was proving that this guess actually worked. That’ll be the subject of the next post.

Solving Problems Submitted to MAA Journals (Part 6b)

The following problem appeared in Volume 97, Issue 3 (2024) of Mathematics Magazine.

Two points P and Q are chosen at random (uniformly) from the interior of a unit circle. What is the probability that the circle whose diameter is segment \overline{PQ} lies entirely in the interior of the unit circle?

As discussed in the previous post, I guessed from simulation that the answer is 2/3. Naturally, simulation is not a proof, and so I started thinking about how to prove this.

My first thought was to make the problem simpler by letting only one point be chosen at random instead of two. Suppose that the point P is fixed at a distance t from the origin. What is the probability that the point Q, chosen at random, uniformly, from the interior of the unit circle, has the desired property?

My second thought is that, by radial symmetry, I could rotate the figure so that the point P is located at (t,0). In this way, the probability in question is ultimately going to be a function of t.

There is a very nice way to compute such probabilities since Q is chosen uniformly from the unit circle. Let A_t be the set of all points Q within the unit circle that have the desired property. Since the area of the unit circle is \pi(1)^2 = \pi, the probability of the desired property happening is

\displaystyle \frac{\hbox{area}(A_t)}{\pi}.

So, if I could figure out the shape of A_t, I could compute this conditional probability given the location of the point P.

But, once again, I initially had no idea of what this shape would look like. So, once again, I turned to simulation with Mathematica.

First, a technical detail that I ignored in the previous post. To generate points (x,y) at random inside the unit circle, one might think to let x = r \cos \theta and y = r \sin \theta, where the distance from the origin r is chosen at random between 0 and 1 and the angle \theta is chosen at random from 0 to 2\pi. Unfortunately, this simple simulation generates too many points that are close to the origin and not enough that are close to the circle:

To see why this happened, let R denote the distance of a randomly chosen point from the origin. Then the event R < r is the same as saying that the point lies inside the circle centered at the origin with radius r, so that the probability of this event should be

F(r) = P(R < r) = \displaystyle \frac{\pi r^2}{\pi (1)^2} = r^2.

However, in the above simulation, R was chosen uniformly from [0,1], so that P(R < r) = r. All this to say, the above simulation did not produce points uniformly chosen from the unit circle.

To remedy this, we employ the standard technique of using the inverse of the above function F(r), which is clearly F^{-1}(r) = \sqrt{r}. In other words, we will choose the random radius to have the form R = \sqrt{U}, where U is chosen uniformly on [0,1]. In this way,

P(R < r) = P( \sqrt{U} < r) = P(U < r^2) = r^2,

as required. Making this modification (highlighted in yellow) produces points that are more evenly distributed in the unit circle; any bunching of points or empty space is simply due to the luck of the draw.
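The difference between the two ways of choosing the radius can also be seen numerically. In the Python sketch below (a stand-in for the Mathematica code, with hypothetical function names), I estimate P(R < 0.5) under each method; for points chosen uniformly from the unit circle, this probability should be (0.5)^2 = 0.25.

```python
import math
import random

def naive_point(rng):
    """Radius chosen uniformly on [0, 1]: too many points near the origin."""
    r, theta = rng.random(), 2 * math.pi * rng.random()
    return r * math.cos(theta), r * math.sin(theta)

def corrected_point(rng):
    """Radius R = sqrt(U) with U uniform on [0, 1]: uniform over the disk."""
    r, theta = math.sqrt(rng.random()), 2 * math.pi * rng.random()
    return r * math.cos(theta), r * math.sin(theta)

def fraction_within(point_sampler, r, trials=100_000, seed=0):
    """Estimate P(R < r) for points produced by the given sampler."""
    rng = random.Random(seed)
    hits = sum(math.hypot(*point_sampler(rng)) < r for _ in range(trials))
    return hits / trials

print("naive:    ", fraction_within(naive_point, 0.5))      # theory: r = 0.5
print("corrected:", fraction_within(corrected_point, 0.5))  # theory: r^2 = 0.25
```

The naive method puts about half of the points within distance 0.5 of the origin, twice as many as a uniform distribution over the unit circle would.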

In the next post, I’ll turn to the simulation of A_t.

Solving Problems Submitted to MAA Journals (Part 6a)

The following problem appeared in Volume 97, Issue 3 (2024) of Mathematics Magazine.

Two points P and Q are chosen at random (uniformly) from the interior of a unit circle. What is the probability that the circle whose diameter is segment \overline{PQ} lies entirely in the interior of the unit circle?

It took me a while to wrap my head around the statement of the problem. In the figure, the points P and Q are chosen from inside the unit circle (blue). Then the circle (pink) with diameter \overline{PQ} has center M, the midpoint of \overline{PQ}. Also, the radius of the pink circle is MP=MQ.

The pink circle will lie entirely within the blue circle exactly when the green line containing the origin O, the point M, and a radius of the pink circle lies within the blue circle. Said another way, the condition is that the distance MO plus the radius of the pink circle is less than 1, or

MO + MP < 1.

As a first step toward wrapping my head around this problem, I programmed a simple simulation in Mathematica to count the number of times that MO + MP < 1 when points P and Q were chosen at random from the unit circle.

In the above simulation, out of about 61,000,000 attempts, 66.6644% of the attempts were successful. This leads to the natural guess that the true probability is 2/3. Indeed, the 95% confidence interval (0.666524, 0.666764) contains 2/3, so that the difference of 0.666644 from 2/3 can be plausibly attributed to chance.
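The confidence interval can be reproduced with the usual normal approximation for a proportion. This Python sketch assumes that approximation (with z = 1.96) and a sample size of exactly 61,000,000. Since the true count was only "about" that, the endpoints may differ from the ones quoted above in the last digit or so. The function name wald_interval is my own.

```python
import math

def wald_interval(p_hat, n, z=1.96):
    """Normal-approximation confidence interval for a proportion."""
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

low, high = wald_interval(0.666644, 61_000_000)
print(f"({low:.6f}, {high:.6f})")
print("contains 2/3:", low < 2 / 3 < high)
```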

I end with a quick programming note. This certainly isn’t the ideal way to perform the simulation. First, for a fast simulation, I should have programmed in C++ or Python instead of Mathematica. Second, the coordinates of P and Q are chosen from the square [-1,1] \times [-1,1] that circumscribes the unit circle, so it’s quite possible for P or Q or both to lie outside the unit circle. Indeed, the chance that both P and Q lie in the unit disk in this simulation is (\pi/4)^2 \approx 0.617, meaning that about 38.3\% of the simulations were simply wasted. So the only sense in which this was a quick simulation was that I could type it quickly in Mathematica and then let the computer churn out a result. (I’ll talk about a better way to perform the simulation in the next post.)
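For readers who would like to try this themselves, here is a rough Python sketch of the same kind of simulation. It is not my original Mathematica code. I am assuming that the four coordinates are drawn uniformly from [-1,1] and that an attempt is discarded whenever P or Q falls outside the unit circle. The function name simulate and the number of attempts are illustrative.

```python
import math
import random

def simulate(attempts=200_000, seed=0):
    """Return (estimated probability, fraction of attempts wasted)."""
    rng = random.Random(seed)
    valid = successes = 0
    for _ in range(attempts):
        px, py, qx, qy = (rng.uniform(-1, 1) for _ in range(4))
        if px * px + py * py >= 1 or qx * qx + qy * qy >= 1:
            continue  # P or Q is outside the unit circle: a wasted attempt
        valid += 1
        mx, my = (px + qx) / 2, (py + qy) / 2  # midpoint M of segment PQ
        mo = math.hypot(mx, my)                # distance from M to the origin
        mp = math.hypot(px - mx, py - my)      # radius of the circle with diameter PQ
        if mo + mp < 1:
            successes += 1
    return successes / valid, 1 - valid / attempts

probability, wasted = simulate()
print(f"estimated probability: {probability:.4f}")
print(f"fraction of attempts wasted: {wasted:.3f}")
```

With far fewer attempts than the 61,000,000 above, the estimate is only good to a couple of decimal places, but it should land near 2/3, with a wasted fraction near 0.383.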