IEEE 754 Revisions Committee
----------------------------

EXTRA MEETING

Hewlett-Packard, Cupertino, CA
Thursday, September 7, 1:00-5:00 PM
Room: TBA
Phone: TBA
Host: Jim Thomas

Proposed Agenda
---------------

1. Roll Call & Introductions

In attendance were: Steve Canon, Steven Carlough, Marius Cornea, Mike Cowlishaw, John Crawford, Joe Darcy, Bob Davis, Mark Davis, Dick Delp, Mark Erle, Warren Ferguson, Alex Fit-Floria, Roger Golliver, Michel Hack, David Hough, Jerry Huck, Dave James, Prof Kahan, Jeff Kidder, Alex Liu, Raymond Mak, Peter Markstein, Jon Okada, Eric Postpischil, Eric Schwarz, Ron Smith, Pamela Taylor, Jim Thomas, Charles Tsen, Fred Zemke, & Dan Zuras.

2. Show Patent Slides

Shown.

3. Call Meeting to Order

4. Approval of Agenda (http://nonabelian.com/754/agenda)

Approved as amended.

5. Approval of Minutes (http://nonabelian.com/754/minutes/754/060809)

Approved without objection.

11. Motion: (Jim Thomas) move scaled-product operations to their own Annex (Fri, 25 Aug 2006 17:35:28 -0700) (2441)

Dave James seconded. The motion PASSED 20-2-4.

12. Motion: (Ron Smith) Comparison of Infinities (Sat, 26 Aug 2006 18:10:42 -0400) (2462)

Eric Schwarz seconded. The motion PASSED without objection.

13. Tabled motion 28 from August: (Ron Smith) Motion: Relaxation of Invalid Operation for Convert Float to Fixed (Thu, 3 Aug 2006 22:30:54 -0400) (2238)

July: Ron moved. Eric Schwarz seconded. Jim moved to amend motion so that it deletes clause h) in the draft and deletes the last clause ["it is language defined what results are delivered and whether invalid operation is signaled"] in the motion's text about clause l). Jeff seconded. Ron moved to table the motion-amendment stack. Prof Kahan seconded. Motion TABLED 18-3-2.

In September, motion untabled. Amendment accepted as friendly. The motion PASSED 13-4-9.

14. Motion: (Jim Thomas) Correct parameters for external decimal character sequences (Mon, 28 Aug 2006 13:47:01 -0700) (2489)

Michel Hack seconded.
The motion PASSED without objection.

15. Motion: (Jim Thomas) Change requirement to detect tininess in the same way to a recommendation (Mon, 28 Aug 2006 13:51:11 -0700) (2491)

Peter Markstein seconded. Michel Hack amends. Amendment seconded by David Hough. Amendment PASSED 21-5-0. Motion PASSED 26-0-0.

Amendment to agenda:
16. Motion: (Warren Ferguson) Motion: Make the draft consistent in distinguishing between numbers and non-numbers (NaNs). (Tue, 5 Sep 2006 13:57:55 -0700) (2576)

Amendment to agenda:
16. Amendment: (Michel Hack) Amendment to motion addressing the need for consistency in distinguishing between numbers and NaNs (Tuesday 05 Sep 2006 at 6:14 p.m. EDT) (2579)

Amendment to agenda:
16. Clarifications: (Michel Hack) Clarifications for unqualified "number" (Tuesday 05 Sep 2006 at 6:55 p.m. EDT) (2582)

Jim Thomas seconded. Michel Hack amended (to use 'number'). Warren seconded. Dan Zuras amended (to use 'datum'). Dave James seconded. Dave James amended (to use 'item'). Peter Markstein seconded. After 20 minutes of a 10 minute recess & more votes than the amendments were due, 'datum' was selected by 14-13-0. The amended motion PASSED 15-4-2.

17. Prof Kahan wished to have a chance to speak to the committee.

His remarks were wide ranging & too extensive for contemporaneous notes. But Eric very kindly transcribed them from a phone conversation held with Prof Kahan after the meeting. (Thank you both.) I reproduce them here without further edits.

From: "Eric Postpischil"
To: "Dan Zuras", "William Kahan Ph. D."
Subject: Transcription of Professor Kahan's comments
Date: Tue, 12 Sep 2006 21:43:21 -0700

Dan, here is my transcription of Professor Kahan's phone repetition of his comments at the meeting. I sent Professor Kahan a copy Sunday and haven't received any corrections, but I'll leave it to him to confirm.

-- edp (Eric Postpischil)
http://edp.org

"Geniuses are like thunderstorms. They go against the wind, terrify people, cleanse the air."
-- Kierkegaard

Regarding the termination of the old standard:

The fact that the old standard is terminated is not what is wrong with the new draft; it is what is wrong with the process that we have been obliged to endure. It is a decision made by some people according to some policy. It is a policy decision made by people who won't have to suffer the consequences of the expiration of the old standard. There really is not any particular reason to terminate the old standard even if a new proposal comes out because while the new proposal is being looked at but not yet implemented, we are still going to need something to guide constructors and to prevent someone from claiming to conform when in fact he does not. But, as I said, that is not why the present draft is a bad draft. That is just a bad policy that is being implemented.

Regarding the current draft:

So, what makes the draft a bad one? Well, there were a number of things that I mentioned. [I believe this refers to Professor Kahan's original comments in the meeting. These notes are prepared from a telephone call to recapture those comments for the record.] One was a failure to exercise due diligence. An example of such a failure was brought to our attention by Mori Nobuyoshi. We have been told, mainly by him, that having two schemes to encode the new decimal standard is going to cause enormous trouble. We have got some ideas of our own, that it will not cause enormous trouble, and I believe that, but my belief is based on experience that is 30 and 40 years old. We have not actually tried to find out what people are going to do when they exchange decimal data, to see whether our decision is justified. And we have to do something better than conduct a marketing poll.
The typical marketing poll asks, "What do you want?", and people have to make a response, so they say something that is notionally what they say they want, but it is notorious among marketing surveys that what people say they want is not actually what they want, and what they want is not what they need, and what they need is not what they are going to get. So we have to do better than a survey. We have to actually go out into the field and look at current practice to see what people are doing, and then try to predict, on the basis of this observation, what they will do if there are two decimal encodings and if that will be significantly worse for them than if there were only one. But instead of conducting that kind of investigation, what we did was to attempt to leave the decision to two proponents of different schemes, neither of whom performed this kind of investigation in order to decide the issue, and so they did not decide. We have also made other decisions which have left out consideration of important questions. There are important questions we have hardly discussed or not discussed at all. One of these questions is who should be the beneficiaries of the standard we intend to promulgate. We have not asked ourselves who should feel obliged to conform and who should not be expected to conform. For example, we have representatives from a database company. There is no reason for Oracle (the product Oracle, which is a body of software, rather than the company) to conform to the standard. On the other hand, the programming environment in which the various subprograms that constitute Oracle are programmed should indeed conform to the standard. Similarly, one can ask about Java. Java does not conform to 754 now. Is there any reason why Java should conform in the future? We have not asked the question. I think it should be modified to conform in the future. But we have not discussed it. 
So some programming languages should clearly conform, but there are probably many that should not care. And since we have not discussed that, we do not know what things ought to be in the standard and obligatory, because we do not know who will benefit or suffer according to the way we decide. We do not know where conformity should end. For example, the hardware people at present construe the flags and the modes as something provided by the hardware, in some status or control register. But, in fact, these things should not be bits in the register. Those bits should merely be used to help implement in software variables that represent the modes and the flags, so that they come under the usual scoping rules and customary disciplines in programming languages, rather than be this mysterious global variable. So our decisions have not been guided by an appreciation of who is expected to conform and in whose interest or for whose benefit the standard is to be promulgated.

Because we decided, I think mistakenly, to include decimal arithmetic along with binary, we have attracted a number of participants whose principal interest and experience comes from the world of decimal computation, most of which is commercial and administrative. The practices in that field are essentially fixed-point arithmetic with very little exposure to floating-point arithmetic. In consequence, the people who have been voting according to their interests and understanding in that field have not been informed by their experience about what is going to happen when they and their fellow practitioners start to work in the field of floating-point arithmetic while their practices are still rooted in the field of fixed-point arithmetic.

I think there are not very many people as old as I am and able to remember the IBM 650. That was a machine with decimal arithmetic and optional decimal floating-point, an option which my university exercised.
And at that time, I was what would now be called a help desk, and I know what anguish was caused to people who used floating-point unwittingly while they were thinking in the fixed-point mode of thought. Many of the things that happened to them are going to happen again, and the people who voted against making certain provisions obligatory have done so without realizing what is hurtling towards them if the standard's decimal arithmetic is adopted. They do not realize to what extent the rules for evaluations of subexpressions in various languages are going to conflict with long-standing practice in the field of fixed-point decimal arithmetic. They do not realize what is going to happen when, inadvertently, the numbers that are computed happen not to conform to the rules of expected values that Mike Cowlishaw has embedded in them properly. For example, his rules about how many digits to maintain after the decimal point and how many leading zeroes to retain are perfectly reasonable. But we have not made obligatory some kind of warning when those rules are inadvertently violated because of what would have been regarded as a field overflow but will now merely cause a number to go into a floating-point representation that a programmer did not expect.

In the past, that field overflow would have stopped computation. But what will happen now is that the field overflow will introduce a numerical anomaly that will not be detected by the program, because the programmer did not think to ask the question, and that will not be detected by the user of that program, because the user does not have the ability to insert the test at the right place in the program. He has a precompiled module; he cannot tamper with its innards. And there is no flag of any sort to warn him that something anomalous has occurred, which might explain subsequent discrepancies. I had experience with those discrepancies, and I know how nearly impossible they are for the perpetrators and the users to discover.
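
[Editorial illustration, not part of Professor Kahan's remarks.] The silent "field overflow" described above can be sketched with Python's decimal module, used here only as a stand-in for a decimal floating-point system. The seven-digit working precision, the helper name add_amounts, and the sample amounts are all assumptions chosen for the example. When a sum no longer fits the working precision it is rounded, its exponent shifts so that the two-places-after-the-point convention is lost, and computation simply continues. This module happens to record Rounded and Inexact signals that a program may test; whether such signals amount to the warning asked for in the remarks is left open here.

```python
from decimal import Decimal, Inexact, Rounded, localcontext


def add_amounts(a, b, digits=7):
    """Add two decimal strings in a context limited to `digits` significant digits.

    Returns (sum, rounded_signal, inexact_signal).
    `digits` is a hypothetical narrow "field" width, not a value from any draft.
    """
    with localcontext() as ctx:
        ctx.prec = digits      # working precision for this block only
        ctx.clear_flags()      # start from a clean set of status flags
        total = Decimal(a) + Decimal(b)
        return total, bool(ctx.flags[Rounded]), bool(ctx.flags[Inexact])


# The sum fits in seven digits: both decimal places survive, no signal is raised.
small = add_amounts("12345.67", "0.02")

# The exact sum needs eight digits: it is rounded and keeps one decimal place.
large = add_amounts("99999.99", "0.02")

print(small)
print(large)
```

In this sketch the second sum comes back as 100000.0 rather than 100000.01, and nothing stops the computation; a caller that does not inspect the returned signals has no indication that a cent was dropped.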
Well, David Hough has worked for years on his alternate exception handling. Now, the point about alternate exception handling is that, if made obligatory, it would be possible for programmers to use the exception modes and flags with at least the confidence that their programs would be portable to other conforming systems, and that would make it possible for the industry to discover which of these capabilities gets used enough that they should influence the design of the hardware. But right now, the very different computer architectures implement the various exception behaviors and access to the modes and flags in very different ways, so that even where a language attempts to make access to these things a part of the language, the variation in performance is so great, because of the gratuitous non-uniformity, that it is practically impossible to write programs that will use the exception-handling capabilities in ways that will not penalize performance appallingly badly on at least some commercial machines. For example, on many machines, a division by zero may indeed create the default infinity, but it may take tens of thousands of times as long to do so as a multiplication would have taken. What that means then is that a programmer who believed it okay to divide by zero, if he believes an infinity will be absorbed later harmlessly, discovers when he actually runs his program on some machines that those machines turn his program into cold molasses.

If we cannot provide adequate guidance to both the hardware producers and the language people about the uses to which the exception handling will be put, and that is an important part of the auxiliary exception handling, if we do not give them that guidance and then make its capabilities obligatory, then we will never be able to write portable software that handles exceptions in a decent way. This problem has come up over and over again in certain of the LAPACK and ScaLAPACK codes.
These are programs for handling large matrix computations, and ScaLAPACK is supposed to provide code that can perform large matrix computations on machines that provide concurrency in diverse ways and still scale up in performance as the amount of available concurrency increases.

So, in effect, when it comes to exception handling, what we have done is pass the buck.

Now, we have done similar things with the debugging capabilities. Again, we have relegated them to an annex, made them entirely optional, and the votes to do that have come in large measure from people in the decimal community who have not had experience with the way floating-point roundoff has to be debugged. They do not appreciate what has happened in my community of scientific and engineering computation that makes the transient roundoff anomalies impossible to debug nowadays. In order to exploit concurrency, we have been compelled to use algorithms with which we have a lot less experience than we built up with the strictly sequential algorithms. In consequence, we occasionally encounter misbehavior on some computer or a few computers when programs are tested on them, which programs have worked very satisfactorily on all the other computers. The misbehavior may in fact be due to some compiler anomaly. But it takes us longer to find this out than the expected service lifetime of the underlying platform, be it hardware or compiler. And so what has happened over and over again is that the underlying platform gets upgraded, and then the anomaly goes away -- actually it moves somewhere else -- and we never do find out what caused it.

The inadequacy of debugging tools for the purpose of floating-point computation in today's environment can be appreciated if you read any of the manuals for debuggers, such as the manual for gdb, the GNU debugger, and you see that their capabilities do nothing toward debugging modern floating-point codes.
These kinds of difficulties are going to befall the decimal community as they start using decimal floating-point increasingly, as they start using decimal floating-point often and unwittingly. They need these debugging capabilities just as badly as the scientific community, but because of their inexperience, they have not voted to make any sort of debugging capability mandatory in any system that conforms to the standard. By failing to do this, by promulgating a standard that does not make these very minimal capabilities mandatory, what we are doing is licensing the design of hardware that for reasons of compatibility will be subsequently unable to incorporate these capabilities. So we will never get them.

What that is going to do will be to throw us back to a situation like that in the middle and late 1950s when it was generally accepted, wrongly as it happened, that floating-point arithmetic was intrinsically refractory to error analysis. Most people believed then that you could not do an error analysis of floating-point, and, even though some of us succeeded in the late 50s and early 60s, it was not until the 70s that you could see this reflected in what was being taught and generally believed about floating-point. If people come to believe that floating-point roundoff problems simply cannot be debugged, then there will be no obligation of due diligence for people to try to debug their floating-point code. Why obligate people to try to do something that is generally believed to be impossible? And that is how things were in the late 1950s. So we had floating-point code which would work much of the time, and every now and then you would get an answer that was plausible but quite wrong, and that was considered normal.

I remember when there were I do not know how many dozens of different ways to compute eigenvalues of matrices, and each method was attached to somebody's name. Milne's method, and Danilewski's method, and Souriau-Leverrier-Faddeev's method.
There were just dozens of these methods. Each of them worked some of the time and failed much of the time if you knew it. But because floating-point was deemed intrinsically impossible to error analyze, it was thought that this type of unpredictable behavior was something you just had to endure. Well, we are going to re-enter that mindset if we allow systems to be built and conform to our standard when we cannot debug these transient rounding error difficulties.

Now, there may be people who believe that something else could be done, but I posted on my web page a careful analysis of all the schemes that have been proposed seriously for attempting to do error analysis automatically if possible, and none of them is 100% reliable. Most of them are extremely unreliable. There is only one that has a reasonable chance of working at a tolerable cost, and that is the one we have failed to make obligatory.

Of course, there are objections to many of our proposals, objections to this kind of exception handling, objections to this kind of debugging capability, from people who can see the difficulties but have simply rejected the proposal rather than try to overcome the difficulties in their environment. So instead of trying, we have just pushed these things out of our minds into an annex. Of course, people have made arguments that putting them in the annex will serve as a guide for language people and others. But all of the language people with whom I have had extensive conversations about this are unanimous that unless we make certain capabilities obligatory in the standard, the language community will always find that there is something else they have to do first.

To diminish the incidence of these almost intractable roundoff problems, some of us have advocated that the default rules for expression evaluation should resemble the old-fashioned Kernighan-Ritchie C in the world of binary floating-point or some of the conventions in COBOL and Ada in the world of decimal arithmetic.
These are conventions where the default mode of expression evaluation is to evaluate in the widest supported precision at a tolerable speed and then round to a narrower precision upon assignment to the programmer's specified narrower destination. This is what COBOL does with what used to be called its computational format of something in excess of 30 significant decimals, and, in Kernighan-Ritchie C, it used to be double-precision as the default even if all the operands in an expression were floats. The ability to declare all scratch variables to have this wider precision made it possible to avoid what are called spill anomalies where the compiler spills subexpressions from registers when it runs out of them and then reloads later when registers become available. We used to get these spill anomalies on machines that had this wider precision, but it was not supported by the language. If we make the widest evaluation mode the default, we will be doing what we can to diminish the incidence of roundoff anomalies that are very difficult to debug in codes written by people that are simply not accustomed to the discipline of experienced floating-point programmers. But we fail to do that. We fail because we have not agreed on the population that is supposed to be served by our standard.

So, for reasons like these, I have moved that we not submit our draft to the Microprocessor Standards Committee, and I would add that we still have work to do and should be allowed to do it instead of being disbanded, and that the old standard should continue in force regardless of when and whether a new standard is promulgated.

20. Invitation for sponsor ballot.

50. Adjournment

Next Meetings
-------------

Wednesday 9/20 full meeting, David Hough at Sun in Menlo Park
Thursday 9/21 overflow meeting
Monday 10/9 6:00 MSC
Wednesday 10/18 full meeting, Fred Zemke at Oracle in Redwood City
Thursday 10/19 overflow meeting, Fred Zemke at Oracle in Redwood City
Thursday 11/2 1:00-5:00 Eric Postpischil at Apple, presentations by Jean-Michel Muller & Paul Zimmermann
Wednesday 11/15 full meeting, Eric Postpischil at Apple in Ten Forward
Thursday 11/16 overflow meeting, Conference Room at 2 Infinite Loop, Cupertino
Wednesday 12/20 full meeting, TBA
Thursday 12/21 overflow meeting
754's PAR expires 12/06
Monday 1/8 6:00 MSC
Monday 4/9 6:00 MSC
Monday 7/9 6:00 MSC
Monday 10/8 6:00 MSC