# Should I rigorously prove math theorems

## Computers - the more rigorous mathematicians?

Mathematicians can be wrong too. That is why there are efforts today to put theorems on a solid basis by reducing them to a few axioms with the aid of a computer. In the future, attempts will be made to generate theorems entirely using computers.

Rigorous proof of theorems is the daily bread of mathematicians. The truth content of mathematical statements must be absolute. Nothing should remain unclear, no tacit assumptions should be made, nor should intuition simply be appealed to. But who makes sure that a proof meets these strict requirements? A new trend in mathematics is to entrust computers with verifying mathematical evidence. After the introduction of mathematical proof by the ancient Greeks in the 4th century BC. BC and the introduction of mathematical rigor in the 19th and early 20th centuries, the trend now emerging is another paradigm shift.

### When is proof a proof?

It may come as a surprise that some of the much-touted mathematical proofs are not as watertight as commonly believed. In general, a theorem is considered proven if enough colleagues are convinced that the derivation does not contain any errors or gaps. But even if every step of a proof is visible to everyone, human inadequacies also play a role in mathematics, and even the most conscientious mathematician can be wrong. After all, mathematics is also a social undertaking that aims to convince reviewers and colleagues of the correctness of their own conclusions with arguments that are as effective as possible. But even if this succeeds, a theorem cannot be considered proven with absolute certainty.

A typical example where the mathematician community has failed is the four-color theorem. According to this theorem, four colors are sufficient to color the countries on any possible geographical map in such a way that no two adjacent areas are the same color. In 1879 an English lawyer published a proof in the American Journal of Mathematics. The theorem was considered proven for eleven years, until a reader found a critical flaw in 1890. That made the problem an open question again and remained so for another eighty years.

In a similar sense it would be astonishing if the 15,000 pages of the classification of finite simple groups completed in 1982 did not hide one or the other error - perhaps even one that would ruin the whole project. Mathematicians are concerned that such complicated proofs cannot be fully understood by a single person, just as a nuclear power plant can unsettle the population because the complex functioning of the safety systems cannot be seen through by an individual.

The four-color theorem was finally proven in 1976. The mathematicians Kenneth Appel and Wolfgang Haken reduced all possible geographical maps to around 2000 basic shapes and were able to prove that none of them required more than four colors. This confirmed the truth of the theorem, but the proof had a flaw: the investigation of the basic forms was carried out by a computer. And with that, mathematical purists landed on the crucial question: Can a proof that is largely based on the computing power of a computer be accepted as rigorous proof? Nobody can understand the countless calculation steps of a computer program.

In 1998 the problem arose again. The centuries-old Kepler conjecture, according to which spheres are most densely packed when they are stacked in honeycomb-like layers, was proven by Thomas Hales by calculating the density of 5000 basic packings with a computer. To Hales' displeasure, the journal "Annals of Mathematics" initiated the publication of his work with the remark that the reviewers had not found any errors, but could not verify the correctness of the proof with absolute certainty.

### Translation into a formal language

Hales did not let the matter rest. Disappointed with the half-hearted approval of the journal, he and other mathematicians decided to use computers not only to comb through individual cases, but also to examine the validity of all the evidence and the correctness of the programs used. The first successes were not long in coming. In 2004, computer scientist Georges Gonthier from Microsoft Corporation succeeded in formalizing the proof of the four-color theorem and verifying it step by step with the aid of a computer.

The idea that Gonthier followed was not entirely new. Gottfried Leibniz recognized in the 17th century that a traditional proof formulated in natural language must first be translated into a formal language consisting only of symbols for logical verification. Translating a printed page into a formal, machine-readable form today takes an average of about a week of work, and the symbolic text is about four times as long as the original wording. The formalized evidence is then entered into what are known as evidence assistants. These are specialized computer programs, of which there are now about a dozen. Starting with the axioms, these test the proof step by step for its logical validity. Each line is either an axiom or an expression derived from the preceding lines with the help of the laws of deduction. Logical fallacies such as "Every French person speaks French, John speaks French, consequently John is French" are immediately uncovered by the algorithms.

The question immediately arises whether the evidence assistants themselves cannot be faulty. Is a computer even able to check its own correctness? Mathematicians and computer scientists try to get around the problem of self-reference by using different proof assistants to test them against each other. Even then, it cannot be ruled out with absolute certainty that several programs contain exactly the same error, but this is extremely unlikely. Since the system kernels of the evidence assistants are very slim with typically 500 lines (the Linux operating system comprises over 280 million lines), such checks can be controlled. In summary, Hales believes that the probability that a proof assistant will overlook an error when reviewing formal evidence is several orders of magnitude smaller than with the classic expert review procedure in specialist journals.

### Computers create new things

Today, as computers are no longer just used to comb through individual cases, but increasingly also to check existing evidence, the next trend is already emerging. Recently, mathematicians have been trying to have proofs or mathematical conjectures generated entirely by computers. The first successes have already been achieved. In a special issue of the "Notices of the American Mathematical Society", which was entirely devoted to formal proofs, Hales reported in December last year that graph theory (a branch of mathematics that deals with networks of points, for example) was already about a thousand conjectures have been formulated by computers. However, experts assess the development differently. Some see the future of mathematics in computer-verified or computer-generated evidence, others are frustrated at the perceived danger that mathematics could be degraded to a purely mechanical enterprise.

The fear that mathematicians could become superfluous because they rely on computers is unfounded, says Doron Zeilberger of Rutgers University, who has integrated computers into his research for many years. With all due respect for computer assistance, he recommends that his colleagues do not allow themselves to be choked by preoccupation with formal evidence. Mathematicians would never have used the axiomatic method to derive new theorems, only to verify the theorems they had found. Therefore, one should not allow oneself to be absorbed by the axiomatic tradition founded by the ancient Greeks, but rather adopt the empirical style that the Babylonians and Chinese would have used. Stephen Smale, who taught at the University of California at Berkeley, believes that proof in pure mathematics is generally overrated. Recognizing and understanding mathematical relationships is more important than deriving theorems.

### It's all about insight

But that's exactly where the formal evidence is lacking. Insight into the reasons why prime numbers or geometrical figures behave this way and not differently is usually lost when tracing back to the axioms. Ernst Specker, now a professor emeritus for mathematical logic at the ETH Zurich, remembers how more than fifty years ago he proved on three pages a thesis on set theory by the Harvard philosopher Willard Van Orman Quine. The evidence was convincing to any connoisseur, but Quine had commissioned a student to present it in such a way that it met the requirements of formal evidence. As a result, the proof was 70 pages long and hardly comprehensible even for experts. His colleague emeritus Christian Blatter also regards the attempts to formally develop existing mathematics from the axioms of logic and set theory as studies in programming without any knowledge of their own. The true value of a theorem lies in the mathematical insight, not in the formal proof.

While opinions differ about formal proofs in pure mathematics, such proofs are undisputed elsewhere. With the same means as those used by mathematicians today for the formal checking of theorems, it is possible to prove that computer programs are free of errors. This could prevent many a breakdown. One example is the "Pentium bug" discovered in 1994. Although an army of beta testers had put Intel's Pentium processor through its paces, a bug in the division of certain numbers went undetected. For Intel, this was one of the greatest embarrassments in its history. The error would have been avoidable if one had not contented oneself with checking the processor, but had rigorously proven that it delivers the correct result for all numbers without exception when dividing.