The Pentium FDIV bug is the most famous (or infamous) of the Intel microprocessor bugs. It was caused by an error in a lookup table that is part of the SRT division algorithm, which Intel adopted because it was meant to be faster and more accurate.
Aiming to run floating-point scalar code three times faster and vector code five times faster than the 486DX chip, Intel decided to use the SRT algorithm, which can generate two quotient bits per clock cycle, whereas the traditional shift-and-subtract algorithm of the 486 generated only one quotient bit per cycle. The SRT algorithm uses a lookup table to calculate the intermediate quotients necessary for floating-point division. Intel’s lookup table consists of 1066 table entries, of which, due to a programming error, five were not downloaded into the programmable logic array (PLA). When the floating-point unit (FPU) accesses any of these five cells, it fetches zero instead of the +2 that the “missing” cells were supposed to contain. This throws off the calculation and yields a result that is less precise than the correct answer (Byte Magazine, March 1995).
At its worst, this error can occur as high as the fourth significant digit of a decimal number, but the probability of this happening is 1 in 360 billion. Most commonly the error appears in the 9th or 10th decimal digit, with a probability of 1 in 9 billion.
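To make the mechanism concrete, here is a minimal sketch of a radix-4 digit-recurrence division in Python. It is not Intel's implementation: the real chip selects each digit from a PLA table indexed by a few leading bits of the partial remainder and the divisor, whereas `select_digit` below simply rounds an exact ratio. The function names and the `broken_step` fault injector are hypothetical, introduced only to show why fetching 0 where +2 was required leaves an error that later digits cannot repair.

```python
from fractions import Fraction


def select_digit(p, d):
    """Pick a radix-4 quotient digit in {-2, ..., 2} for partial remainder p."""
    # Stand-in for the lookup table: round 4p/d to the nearest integer,
    # then clamp to the allowed digit set.
    return max(-2, min(2, round(4 * p / d)))


def srt_divide(x, d, steps=20, broken_step=None):
    """Approximate x/d with `steps` radix-4 digits (two quotient bits each).

    Assumes d > 0 and |x| <= (2/3) * d so every partial remainder stays
    in range. `broken_step` is a hypothetical fault injector: at that step
    a digit of +2 is replaced by 0, mimicking a missing table entry.
    """
    p, d = Fraction(x), Fraction(d)
    quotient = Fraction(0)
    for j in range(1, steps + 1):
        q = select_digit(p, d)
        if j == broken_step and q == 2:
            q = 0  # the "missing cell" returns zero instead of +2
        quotient += Fraction(q, 4 ** j)
        p = 4 * p - q * d  # next partial remainder
    return quotient


healthy = srt_divide(2, 3)
faulty = srt_divide(2, 3, broken_step=8)
print(float(healthy), float(faulty), float(healthy - faulty))
```

In this toy model every later digit is already capped at +2, so the digit lost at step 8 is never recovered and the two results differ by exactly 2/4^8. The later the faulty lookup happens, the further to the right the first wrong digit appears, which fits the observation that the error usually shows up deep in the result and only rarely near the fourth significant digit.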
Intel has classified the bug (or the flaw, as the company refers to it) with the following characteristics:
- On certain input data, the FPDI (Floating Point Divide Instructions) on the Pentium processor produce inaccurate results.
- The error can occur in any of the three operating precisions, namely single, double, or extended, for the divide instruction. However, it has been noted that far fewer failures are found in single precision than in double or extended precisions.
- The incidence of the problem is independent of the processor rounding modes.
- The occurrence of the problem is highly dependent on the input data. Only certain data will trigger the problem. Roughly 1 in 9 billion randomly chosen divide or remainder operations will produce inaccurate results.
- The degree of inaccuracy depends on the input data and upon the instruction involved.
- The problem does not occur on the specific use of the divide instruction to compute the reciprocal of the input operand in single precision.
- Furthermore, the bug affects any instruction that references the lookup table or calls FDIV. Related instructions that are affected by the bug are FDIVP, FDIVR, FDIVRP, FIDIV, FIDIVR, FPREM, and FPREM1. The instructions FPTAN and FPATAN are also susceptible. The instructions FYL2X, FYL2XP1, FSIN, FCOS, and FSINCOS were suspected but are now considered safe.
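The 1 in 9 billion figure in the list above can be turned into rough exposure estimates. The sketch below assumes that every divide is an independent draw of random operands, which is exactly what the list says real workloads are not, so the numbers are illustrative only. The function names and the workload of 1,000 divides per day are hypothetical.

```python
import math

P_FAIL = 1 / 9e9  # chance that one random divide is inaccurate (figure quoted above)


def expected_days_between_errors(divides_per_day):
    """Mean number of days between inaccurate results at a given divide rate."""
    return 1 / (P_FAIL * divides_per_day)


def prob_at_least_one_error(n_divides):
    """Chance of at least one inaccurate result in n independent divides."""
    # 1 - (1 - p)**n, written with log1p/expm1 to stay accurate for tiny p
    return -math.expm1(n_divides * math.log1p(-P_FAIL))


# Hypothetical light workload: 1,000 random divides per day
print(expected_days_between_errors(1_000))
# A run of 9 billion random divides
print(prob_at_least_one_error(9e9))
```

Under these assumptions a light user would wait millions of days between errors, while a number-crunching job that performs billions of divides is more likely than not to hit one. The estimate is only as good as the assumed divide rate and the assumption of random operands.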
A 3-D plot of the ratio 4195835/3145727 calculated on a Pentium with the FDIV bug. The depressed triangular areas indicate where incorrect values have been computed. The correct values would all round to 1.3338, but the returned values are 1.3337, an error in the fifth significant digit. (Byte Magazine, March 1995.)
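The ratio in the caption makes a convenient self-test. A minimal sketch, assuming an FPU that implements IEEE 754 double precision correctly: divide, multiply back, and compare with the original dividend. The variable names are illustrative.

```python
x, y = 4195835.0, 3145727.0

quotient = x / y
residual = x - quotient * y  # expected to be 0.0 when the division is correct

print(f"{quotient:.4f}")  # four decimals, for comparison with the caption
print(residual)
# On flawed Pentiums this residual was widely reported to come out as 256
# rather than 0; that figure is from contemporary reports, not from this run.
```

Multiplying back is a cheap way to expose a bad quotient, because an error in the fifth significant digit of the quotient is scaled up by a divisor in the millions.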
The Pentium Chip Story: A Learning Experience
by Vince Emery
The pandemonium over Intel’s Pentium chip cost the company millions of dollars and could easily have been prevented. The uproar started and grew on the Internet.
June 1994: Intel testers discover a division error in the Pentium chip. Intel managers decide that the error will not affect many people and do not inform anyone outside the company. This was Intel’s first mistake. The company was right in that the division error could affect only a few customers, but not disclosing the information made Intel appear to hide a sinister secret. It sent the message to customers that Intel was not trustworthy. Disclosing the flaw upon discovery would have created only minor news, on the same low level as an automaker announcing a minor defect. (Today Intel posts all known flaws on the Internet to avoid a recurrence of this problem.) The same month, Dr. Thomas R. Nicely, a professor of mathematics at Lynchburg College, Virginia, notices a small difference in two sets of numbers. He double-checks all his work by computing everything twice, in two different ways. Dr. Nicely spends months successively eliminating possible causes such as PCI bus errors and compiler artifacts.
Wednesday, October 19: After testing on several 486 and Pentium-based computers, Dr. Nicely is certain that the error is caused by the Pentium processor.
Monday, October 24: Dr. Nicely contacts Intel technical support. Intel’s contact person duplicates the error and confirms it, but says that it was not reported before.
Sunday, October 30: After receiving no more information from Intel, Dr. Nicely sends an email message to a few people, announcing his discovery of a “bug” in Pentium processors. (Dr. Nicely’s original email message)
That same day, Andrew Schulman, author of Unauthorized Windows 95, receives Dr. Nicely’s email.
Tuesday, November 1: Schulman forwards Dr. Nicely’s message to Richard Smith, president of Phar Lap Software in Cambridge, MA. Phar Lap’s customers write number-crunching software that could be affected by the Pentium flaw. Phar Lap programmers test and confirm the division error. Realizing the significance of the flaw, Smith immediately forwards Dr. Nicely’s message to important Phar Lap customers, to Intel, and to people at compiler companies, including Microsoft, Borland, Metaware and Watcom. He also posts the message to the Canopus forum of CompuServe with a note asking people to run Dr. Nicely’s test and report results back to Smith. This is the first public disclosure of the flaw.
Wednesday, November 2: Smith receives about ten confirmations of the error from Canopus readers. Alex Wolfe, a reporter for Electronic Engineering Times, sees Smith’s post on Canopus and starts research for a story. He forwards Dr. Nicely’s message to several people, including Terje Mathisen of Norsk Hydro in Norway.
Thursday, November 3: Mathisen confirms the flaw and emails his findings back to reporter Wolfe. Mathisen goes to the Internet newsgroup comp.sys.intel and posts a message titled “Glaring FDIV Bug in Pentium!” Within 24 hours, hundreds of technical people all over the world know about the Pentium flaw. (Note that only two days have passed since Schulman forwarded Dr. Nicely’s original message.) All hell breaks loose on the newsgroup.
Monday, November 7: Wolfe’s article runs in Electronic Engineering Times, headlined INTEL FIXES A PENTIUM FPU GLITCH. In the story, Intel says it has corrected the glitch in subsequent runs of the chip, and Steve Smith of Intel dismisses the importance of the flaw, saying, “This doesn’t even qualify as an errata (sic).” This is only the first print article about the flaw, but by this time there are hundreds of postings about it in CompuServe forums and Internet newsgroups. All research results are posted in public on the Net for the world to criticize and contribute to.
Wednesday, November 9: The ruckus spills out of the technical newsgroups and into business and investment newsgroups.
Tuesday, November 15: Tim Coe of Vitesse Semiconductors and Mike Carleton of USC announce on the Net that they have reverse-engineered the way the Pentium chip handles division and created a model that predicts when the chip is wrong. By this time, a furor has erupted on the Net. Intel still claims there is no problem. Intel’s stock drops 1 3/8 points.
Tuesday, November 22: CNN’s Moneyline program looks at the issue. Steve Smith of Intel says the Pentium processor’s problem is minor.
Wednesday, November 23: MathWorks sends out a press release on the issue, headlined MATHWORKS DEVELOPS FIX FOR THE INTEL PENTIUM FLOATING POINT PROCESSOR.
Thursday, November 24 (Thanksgiving holiday): The New York Times runs a story by John Markoff, CIRCUIT FLAW CAUSES PENTIUM CHIP TO MISCALCULATE, INTEL ADMITS. In the story, an Intel spokesman says the company is still sending out the flawed chips. A similar story by the Associated Press is printed by more than 200 newspapers and run on radio and television news. Intel Applications Support Manager Ken Hendren posts a message on America Online and the Internet, revealing that Intel has no one providing customer support on the Internet. Intel seems unaware of the solidity of opinion on the Net about the Pentium processor’s flaw. At this point, an offer by Intel to replace any flawed Pentium chips would have smoothed the waters. Instead, Intel offers to replace a Pentium chip only after it has determined that the customer uses the chip in an application in which the flaw would cause a problem. Intel customers are irate. The chip hits the fan.
Friday, November 25: This weekend, the Internet’s humor newsgroups sprout Pentium jokes.
Sunday, November 27: A notice appears on the Internet newsgroup comp.sys.intel, from Intel’s president, Dr. Andrew Grove, but bearing someone else’s “return address”. (Dr. Grove’s original posting)
Monday, November 28: Internet newsgroups are flooded with furious messages such as, “Having conclusively demonstrated themselves utterly unworthy of the public’s trust, they still seem unable to comprehend what that means.” No one from Intel responds to these posts.
November 29 – December 11: Intel receives thousands of messages and phone calls saying that Intel misses the point. Intel becomes a laughingstock on the Internet joke circuit:
- At Intel, quality is job 0.999999998.
- Q: Know how the Republicans can cut taxes and pay the deficit at the same time? A: Their spreadsheet runs on a Pentium
- We are Pentium of Borg. Resistance is futile. You will be approximated.
- The Intel version of Casablanca: “Round off the usual suspects.”
- Q: How many Pentium designers does it take to screw in a light bulb? A: .99995827903, but that’s close enough for
The situation degrades to a point past any logical response. People believe Intel does not stand behind its products. While the fury grows, Intel remains silent.
Monday, December 12: IBM issues a press release: IBM HALTS SHIPMENTS OF PENTIUM-BASED PCS. Intel counters with INTEL SAYS IBM SHIPMENT HALT IS UNWARRANTED. Internet analysts immediately demonstrate that IBM’s claims are exaggerated, but at the same time no one believes Intel.
Wednesday, December 14: Intel releases a white paper explaining the situation rationally. Too late. Intel’s communications are jammed with tens of thousands of phone calls and email messages from worried and angry customers.
Friday, December 16: Intel stock closes at $59.50, down $3.25 for the week.
Monday, December 19: The New York Times prints a story by Laurie Flynn headlined INTEL FACING A GENERAL FEAR OF ITS PENTIUM CHIP. It says that eight product liability lawsuits and two shareholder suits have been filed against Intel. Flynn quotes Florida Deputy Attorney General Pete Antonacci: “They’ve got to stop acting like a rinky-dink two-person operation in a garage and start acting like the major corporation they are.” About the same time, a New York Times story about the New Jersey Nets basketball team is headlined MENTALLY SPEAKING, NETS ARE PENTIUMS. Intel’s lavishly promoted brand name has become an insult.
Tuesday, December 20: Intel finally apologizes and says it will replace all flawed Pentiums upon request. It sets aside a reserve of $420 million to cover costs. It hires hundreds of customer service employees to deal with customer requests. And it dedicates four full-time employees to read Internet newsgroups and respond immediately to any postings about Intel or its products.
January, 1995: Intel has received commitments to purchase all the Pentium chips it can manufacture through the end of 1995.