This memo provides an overview of important differences between IEEE 754R draft 1.9.0 and IEEE 754-1985. That overview is followed by a ballot and specific comments to be submitted with the ballot. This memo highlights certain important issues to draw them to the attention of balloters and commenters. There might or might not be a subsequent review draft after 1.9.0, and ballot comments might or might not influence that subsequent draft. The intent of the 754R Ballot Review Committee is to respond to all comments on 1.9.0 without changing the text of 1.9.0.
Readers new to this draft standard will find some important differences between the current draft 1.9.0 and the previous 754-1985 standard. Some things were so obviously in the air that they amounted to unsurprising standardizing of existing practice: thus fused multiply add and 128-bit basic binary format. Others were more controversial: basic decimal formats with two possible encodings, min/max instructions with controversy about signed zeros and NaNs, and recommended correctly-rounded transcendental functions. Perhaps the most fundamental controversy was between those who wanted to extend 754 orthogonally, keeping it as close as possible to a hardware specification, and those who wanted to complete 754's unfinished business of creating suitable programming environments for diverse purposes. 754's most pressing gaps involve expression evaluation and exception handling.
In particular it came as a surprise to many programmers and users with less than graduate degrees in error analysis that the 754 "standard" didn't even standardize the result of z = x + y or z = x * y for double-precision variables. Differences among "conforming" implementations have been interpreted as hardware bugs or software bugs or performance bugs if the remedy were expensive, and in that form attracted the attention of executives who would not normally concern themselves with matters of language definition and implementation. 754R tries to recommend ways for languages to permit programmers to say what matters when they care.
A difference in expectations between 754 and 754R underlies the subtler structural differences between the two. 754 was drafted with no expectations about immediate language or operating system support, and so the hope was that some local implementations would arise without such support. Not surprisingly, the implementations that did arise did not permit coding the subtler aspects of 754 in a form portable across implementations, nor did these implementations facilitate obtaining reproducible results across implementations. Subsequent language and operating system standardizations made portability of source code possible, but not necessarily portable performance or reproducible results, as those subsequent standards generally did not remove any implementation freedoms granted by 754.
Based upon the success of 754 and subsequent language and operating system standardizations of aspects of it, 754R permits languages to focus on portability and reproducibility or on performance and encourages languages to provide means for programmers to select either one as needed. For parts of programs where performance is important, some languages might choose to delegate some choices to implementations. So many aspects of 754 that were explicitly or implicitly undefined or implementation-defined are called language-defined in 754R, with the understanding that a language standard might always, or upon programmer request, delegate some aspects to implementations.
Similarly many aspects of 754 that were best characterized as implementation mechanisms are replaced by end-user programmability features in 754R. Examples in this category were 754's extended rounding precision modes and traps, replaced by 754R's preferredWidth attributes and alternate exception handling attributes. 754R is explicit about what was implicit in 754: global dynamic modes and flags registers are only one possible implementation mechanism, and for many purposes not the best, so languages are encouraged to provide means for modifying semantics of operations and expressions in forms suitable for typical application programs rather than in the forms suitable for a particular hardware implementation. Thus languages intending to fully support 754R would do well to consider how to provide uniform and convenient syntactic means to associate static attributes with sections of program text. Not only could that apply to 754R's rounding directions, alternate exception handling, preferredWidth, optimization, and reproducibility attributes, but also to directives about parallelization and debugging that have grown into many languages by accretion from other sources. The experience of 754 implementations has shown that attribute changes are seldom invoked and limited in scope, yet the dynamic implementation mechanism in most 754 implementations is a performance and correctness burden for the entire hardware and software stack when programming conventions require instantaneous semantic response to controls changed asynchronously and remotely. The static attribute model of 754R is intended to accomplish what most programmers want most of the time without directly or indirectly burdening the majority of programs and programmers that find the default attributes satisfactory.
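One existing approximation of attributes attached to a section of program text is the `with` block of Python's decimal module. The sketch below is only an illustration, not a 754R language binding: the function name `one_third_bounds` is hypothetical, and Python's context is really a thread-local dynamic mode that called functions inherit, so it only resembles the static model in its lexical delimitation.

```python
from decimal import Decimal, localcontext, ROUND_FLOOR, ROUND_CEILING

def one_third_bounds():
    """Return (lower, upper) bounds on 1/3, each computed under its own
    block-scoped rounding direction."""
    with localcontext() as ctx:       # attribute applies only inside this block
        ctx.rounding = ROUND_FLOOR
        lower = Decimal(1) / Decimal(3)
    with localcontext() as ctx:       # a second block with a different attribute
        ctx.rounding = ROUND_CEILING
        upper = Decimal(1) / Decimal(3)
    return lower, upper

lower, upper = one_third_bounds()
print(lower, upper)
```

Code outside the two blocks continues to run under the default rounding direction, which is the property the static attribute model is after.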
For one purpose, global dynamic modes and flags (and 754 trap enable) registers are an excellent implementation: debugging numerical programs without source code. Flags and traps allow some insight into where exceptional behavior is occurring, and modes allow varying roundoff characteristics to provide some insight into where numerical sensitivity is high. 754 recommends such capabilities in an informative annex but explicitly does not require that such modes and flags be supported by debuggers.
In summary, the most important results of the 754R revision are to
Virtually every statement about 754 exceptions has itself an exception having to do with underflow. An exact subnormal result signals an underflow exception but does not raise a status flag by default. It's a feature of 754 for which consensus holds that 754R must maintain compatible behavior. Properly encompassing this case complicates every statement about exceptions.
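The underflow special case can be observed in an existing decimal arithmetic. The sketch below uses Python's decimal module, which follows the General Decimal Arithmetic specification rather than 754R, so it is an analogy only: that module models the exact-subnormal case with a separate `Subnormal` signal and reserves the `Underflow` flag for results that are both subnormal and inexact. The helper name `flags_after` and the small context parameters are assumptions chosen for illustration.

```python
from decimal import Context, Decimal, Subnormal, Underflow

def flags_after(operation):
    """Run one operation in a fresh small context with all traps disabled and
    report (result, subnormal_flag, underflow_flag)."""
    ctx = Context(prec=5, Emin=-10, Emax=10, traps=[])
    result = operation(ctx)
    return result, bool(ctx.flags[Subnormal]), bool(ctx.flags[Underflow])

# Exact subnormal result: below Emin, but no digits are lost.
exact = flags_after(lambda c: c.multiply(Decimal("1E-12"), Decimal(1)))

# Inexact subnormal result: digits are lost when the result is denormalized.
inexact = flags_after(lambda c: c.divide(Decimal("1.2345E-10"), Decimal(1000)))

print(exact)
print(inexact)
```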
If compatibility were not required, there are at least three plausible approaches to removing the special case:
The inexact exception has a couple of meanings: 1) "arithmetic roundoff" of a result not representable exactly in the destination format, and 2) "result value differs from operand value", which applies to all kinds of conversions between formats and conversions to integral values.
Optional alternate exception handling attributes of 754R replace the optional traps of 754. Those traps were easy to specify and implement but difficult to use in a portable way, even within one language.
Languages vary greatly in their control structures. Consequently 754R doesn't require that languages specify any particular control transfer idiom for alternate exception handling. Each language standard should adopt syntax and semantics that are idiomatic for that language.
The delayed alternate exception handling attributes could be implemented at reasonable cost with the existing status flags, at least for the exceptions themselves (though not for invalid sub-exceptions). Application programs that exploit such delayed alternate exception handling could then avoid most direct references to status flags and the awkward programming interface specified in 5.7.4.
The immediate transfers of control were envisioned to be implementable by (immediately) trapping on exceptions signaled. The delayed transfers of control were envisioned to be implementable by (eventually) testing flags raised by default exception handling. That leads to a slight difference between delayed and immediate trapping on underflow: an exact underflow will signal an exception but will not raise a flag by default.
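The delayed style can be sketched as follows: run a fast formula under default exception handling, then test the flag once at the end of the block and fall back to a slower alternate computation. This is an illustration in Python's decimal module, not a 754R binding; the function name `hypot_delayed` and the scaling fallback are assumptions made for the example.

```python
from decimal import Decimal, Overflow, localcontext

def hypot_delayed(x, y):
    """sqrt(x*x + y*y), with a scaled fallback chosen by testing the
    overflow flag after the fast attempt."""
    with localcontext() as ctx:
        ctx.traps[Overflow] = False   # default handling: deliver infinity, raise flag
        ctx.clear_flags()
        fast = (x * x + y * y).sqrt()
        overflowed = ctx.flags[Overflow]
    if not overflowed:
        return fast
    # Delayed transfer of control: the flag was tested once, after the block.
    scale = max(abs(x), abs(y))
    return scale * ((x / scale) ** 2 + (y / scale) ** 2).sqrt()

print(hypot_delayed(Decimal(3), Decimal(4)))
print(hypot_delayed(Decimal("3E+600000"), Decimal("4E+600000")))
```

The application code never names a trap handler, and touches the flag in exactly one place.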
The resuming alternate exception handling attributes do not involve control transfers but present a different kind of difficulty: efficient implementation might require resumable traps or some other equivalent hardware complexity. So none of these are required by 754R. But each language standard oriented toward numerical computation should specify these.
The scope of Clause 9's recommended correctly-rounded transcendental operations is rather broad and significantly increases the weight of a fully conforming implementation.
One positive aspect of clause 9 is its suggestion that languages, libraries, and application programs can avoid some of the gratuitous complexities of transcendental operations typically defined in programming languages. Thus while trigonometric operations are naturally measured in radians in mathematical analysis, as a practical matter radian measure poses a dilemma in finite precision: whether to perform correctly-rounded argument reduction with as many digits of pi as are necessary, or to use some finite approximation of pi for better performance but less accuracy and reproducibility... neither of which often matters much. Arguments expressed as multiples of pi (half-revolutions) can be reduced accurately and quickly, however, and so the trigPi operations in clause 9 avoid the issue entirely... and furthermore, they incorporate the factor of pi appearing in trig arguments of many physics formulas.
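The argument-reduction point can be made concrete with a minimal sketch of a sinPi-style function. The name `sin_pi` and the folding steps are assumptions for illustration, and the final `math.sin` call is not claimed to be correctly rounded; the point is only that every reduction step is exact in binary floating point.

```python
import math

def sin_pi(x):
    """Sketch of sin(pi * x) with exact argument reduction."""
    r = math.fmod(x, 2.0)             # exact remainder, same sign as x, |r| < 2
    if r == 0.0 or abs(r) == 1.0:
        return math.copysign(0.0, x)  # exact zero at every integer
    if r > 1.0:
        r -= 2.0                      # exact: fold into [-1, 1]
    elif r < -1.0:
        r += 2.0
    if r > 0.5:
        r = 1.0 - r                   # exact: sin(pi*(1 - r)) == sin(pi*r)
    elif r < -0.5:
        r = -1.0 - r
    return math.sin(math.pi * r)      # |r| <= 0.5, so only a small relative error

print(sin_pi(1.0))                    # 0.0
print(math.sin(math.pi * 1.0))        # small nonzero value: pi itself was rounded
print(sin_pi(1e15 + 0.5))             # 1.0, however large the argument
```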
Likewise by specifying pown, rootn, and powr separately, it would be possible for a language to avoid a rather complicated definition of a pow function that tries to preserve the behavior of pown for integral floating-point exponents.
Clause 9.4's recommended scaled product reduction operations replace 754's specification for wrapped exponents to be delivered to overflow/underflow trap handlers. Wrapped exponents and counting mode registers remain possible implementation mechanisms, among others.
The greatest shortcoming of 754-1985 was a lack of direction about how languages should evaluate expressions involving more than one 754 operation. That combined with sometimes-available extended-precision expression evaluation had the practical effect that nobody could predict the actual semantics of numerical expressions across different languages, compiler releases, optimization levels, and operating systems. 754R recommends that languages define expression evaluation and, if they tolerate or require more than one method, provide attributes for programmers to use to select the desired method.
Subclause 10.3 preferredWidth attributes specify the format of anonymous intermediate destinations of generic operations. A preferredWidth specification is mandatory for an implementation regardless of performance, roundoff, or reproducibility considerations. Subclause 10.4 also lists a licensable optimization to use wider intermediates for anonymous intermediate destinations which is suggestive; an implementation might act on that license to optimize performance. Both features are optional for a language standard to define and mandatory for an implementation of a language standard that defines them.
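The effect of the width of anonymous intermediates can be shown with a small simulation. The sketch below models binary32 by rounding Python's binary64 floats through the struct module; the function names are hypothetical and this is not a preferredWidth implementation, only a demonstration that the same source expression a + b + c yields different binary32 results under the two evaluation widths.

```python
import struct

def to_binary32(x):
    """Round a binary64 Python float to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def sum_narrow(a, b, c):
    """Every intermediate is rounded to binary32."""
    return to_binary32(to_binary32(a + b) + c)

def sum_wide(a, b, c):
    """The intermediate is kept in binary64; only the final result is narrowed."""
    return to_binary32((a + b) + c)

a, b, c = 16777216.0, 1.0, 1.0   # a = 2**24, where binary32 spacing becomes 2
print(sum_narrow(a, b, c))       # 16777216.0
print(sum_wide(a, b, c))         # 16777218.0
```

Neither answer is a bug; they are the results of two different, individually reasonable, expression evaluation rules.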
Gratuitous variation in numerical results among implementations conforming to 754 proved to be a significant obstacle to portable programming. It's relatively easy for anybody to determine if two results are the same or not, but much harder for even a numerical specialist to determine if they are equally good for a particular purpose.
754R addresses this by encouraging languages to provide a means of obtaining reproducible numerical results and exceptions across conforming platforms, subject to a number of restrictions on programs and programmers.
However in the course of discussing comments accompanying ballots, it became apparent that the critical issue to be confronted is NOT a separate reproducibility structure distinct from regular expression evaluation. Such an optional separable structure invites languages to not consider it.
Instead the critical recommendation is that each language fully define a primary mapping between the features of the language and the features of 754R, and require implementations to provide an implementation of that mapping. (This mapping is called "the literal meaning of the source code" in 754R). There is a separate recommendation that the primary mapping be the default mapping, but some languages defer more decisions to implementations for default performance reasons. Other separate recommendations are that languages define one or more attributes enabling value-changing optimizations as part of the portable source code of the language rather than as part of a separate, implementation-dependent mechanism, often implemented outside the source code in scripts and Makefiles.
Implementations are ultimately responsible for conformance to 754R. But to maximize portability of application programs, subclause 1.5 spells out 754R's preference that its incompletely-specified parts should be completed by language standards and required of their implementations; otherwise by class or library extensions to language standards; and lacking either, by implementations. In fact "language-defined" is used throughout 754R in exactly that sense. So there is no justification for calling out certain features as "implementation-defined" or placing requirements directly on implementations. Yet "implementations shall provide" is used for all the operations in clause 5. This leads to the confusing language about convertTo/FromHexCharacter: languages should provide but implementations shall provide.
In several places there is room for confusion introduced by vagueness about the names of attributes ("roundingDirection") and their values ("roundTiesToEven"). Each attribute should have been given a specific name, along with names for its 754R-defined values, if any. Subclause 9.3 adopts such an approach for the case of dynamic modes. It's gratuitously harder to specify something with no name.
There are a number of indeterminate cases in 754R:
These are not equally well known. Perhaps they should all be listed together. Certainly they must all be called out under reproducibility.
When one sometimes requires a simple specific facility and sometimes a complex general facility, which is better to standardize? It's all Turing machines; anything can simulate anything else, at some performance cost. But more specifically in matters such as formatOf operations and direct flag manipulation operations, there are simple cases which could be provided as macro invocations of more general cases, and there are more complex cases that could be provided as compositions of simpler cases. Frequent arguments of this sort should be resolved in terms that favor application programmers.
One of the ongoing sources of discomfort in standardization is how far to standardize the exceptional cases. For operations that can be implemented entirely in hardware, exceptional cases can often be handled without slowing down the non-exceptional case. For software, however, specification of exceptional cases is more likely to slow down non-exceptional cases. The extreme case is deciding whether to specify that the inexact exception be always signaled correctly for a transcendental function. It's an expensive feature to provide that is destined to be used far more often by test programs than by applications. Similarly the tail/residual operations that were part of early 754R drafts were undone by uncertainty about whether to specify exceptional cases for hardware or leave them unspecified for software. 754R's first specification for min and max functions was likewise arguably more slow than useful, again to provide deterministic results for NaNs, signed zeros, and differing values within a cohort... that seldom matter to applications. But that specification was revised so that now these mathematically commutative operations are numerically commutative in a more limited sense than 754R addition and multiplication.
Some proposals received by 754R are outside its scope and might be the subjects of future orthogonal standards activities built upon 754R, when the implementation tradeoffs are understood well enough to standardize.
In this category are complex arithmetic, varying-width arithmetic, interval arithmetic, and complete arithmetic. What these have in common is no complete hardware implementations current or likely and insufficient consensus in existing software implementations.
Remember that 854 was built upon 754, in 1987, as a separate standards activity, and that experience was incorporated much later into the specification of decimal formats in 754R. Similarly complex arithmetic could be a future standards activity built upon 754R.
Particularly when encoding of data types is considered, it matters greatly whether the types will be implemented entirely in hardware, partially in hardware, or entirely in software. Complicated data types are usually not going to be implemented entirely in hardware, and one would like enough software implementation experience to know what hardware helper operations might be useful.
The particular problem of complex arithmetic is the set of open issues concerning the representation of complex zeros, infinities, and NaNs (what is the interpretation of [+inf,NaN]?), whether multiple representations should be allowed, and whether any meaning or history can be distinguished among multiple representations of the same object.
The particular problem of interval arithmetic is that its encoding and its exception handling are tightly bound up with each other and with whether there should be a native interval type or one composed of a pair of ordinary floating-point numbers. In any event, software implementations of interval methods are available now, and on the whole not well accepted, not so much because of performance of single interval operations, but because of the massive intellectual effort required to reformulate existing point-oriented approaches to computational problems into interval methods that exploit the inherent advantages of intervals.
754R does not impose any implementation method or performance goal for any required or recommended feature. That is left to the competitive marketplace. As with other computer performance features, the way to get faster performance for interval methods is to get interval-oriented computations adopted as benchmarks by organizations like SPEC or in large government procurements.
The particular problem of complete arithmetic is that it encompasses a very high precision format that can exactly represent all dot products expressible in an underlying working format. Exact arithmetic on very high precision is most often of interest in mathematical problems on exact data, often associated with symbolic computation. Thus several existing varying-width floating-point and integer arithmetic formats are associated with symbolic algebra systems. The paradigms of expression evaluation in such systems are very different from those in 754R, which is oriented toward computation on physical problems for which the initial data and analytical models are only known to limited precision. Thus varying-width arithmetic and complete arithmetic are properly subjects of another standards effort directed by the considerations of exact mathematical symbolic algebra. That requires an entirely different set of expertise than that represented by the 754R working group.
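What an exact dot product buys can be sketched in software with rational arithmetic standing in for the very high precision accumulator. This is an illustration only: the function names are hypothetical, and Python's `Fraction` is a varying-width exact type, not the fixed complete format that proposals for complete arithmetic describe.

```python
from fractions import Fraction

def dot_naive(xs, ys):
    """Dot product with every product and sum rounded to binary64."""
    total = 0.0
    for x, y in zip(xs, ys):
        total += x * y
    return total

def dot_exact(xs, ys):
    """Dot product accumulated exactly, rounded once at the end."""
    total = Fraction(0)
    for x, y in zip(xs, ys):
        total += Fraction(x) * Fraction(y)   # floats convert to Fraction exactly
    return float(total)

xs = [1e16, 1.0, -1e16]
ys = [1.0, 1.0, 1.0]
print(dot_naive(xs, ys))   # 0.0: the 1.0 is absorbed by 1e16 and lost
print(dot_exact(xs, ys))   # 1.0
```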
Most of the members of the 754R ballot review committee have been involved with the 754R effort for over seven years. Consequently there is some urgency to "finishing" the draft and submitting it. I would rather take a month or two to eliminate as many ambiguities as possible than take a decade or two explaining why we didn't and thereby encouraged diverse gratuitous minor discrepancies to arise.
I vote DISAPPROVE on the proposed (recirculation ballot) DRAFT Standard for Floating-Point Arithmetic P754, draft 1.9.0 of 24 April 2008. Negative ballots must be accompanied by explicit changes required to change the negative ballot to affirmative, which may be found below.
Although the 754R Sponsor Ballot group membership is closed, anybody may submit comments on draft 1.9.0 during the ballot period to the MSC chair, Bob Davis, bob@scsi.com. To protect the IEEE's copyright, Draft 1.9.0 is not publicly available, but copies can be obtained for review purposes from Bob Davis.
According to IEEE-SA procedure, a DISAPPROVE vote must be accompanied by a list of changes required to convert it to an APPROVE vote.
The formatOf-based formulation of 5.4.1 for arithmetic operations should make manifest that part which is familiar to conventional programming languages and that part which is unfamiliar. Most languages have generic arithmetic operations +-*/. Most languages do not require an underlying implementation to directly support any combination of operand formats rounded only once to any narrower or wider format, yet that is what 5.4.1 requires.
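The unfamiliar part, rounding only once to a narrower format, can be illustrated by comparing it with the composition most languages provide today: add in binary64, then convert to binary32. The sketch below simulates binary32 with the struct module and uses exact rationals to obtain the once-rounded result. All function names are hypothetical, and `round_once_binary32` handles only positive normal values, which is enough for the example.

```python
import struct
from fractions import Fraction

def to_binary32(x):
    """Round a binary64 value to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def bits32(x):
    return struct.unpack("<I", struct.pack("<f", x))[0]

def from_bits32(b):
    return struct.unpack("<f", struct.pack("<I", b))[0]

def round_once_binary32(q):
    """Round a positive exact rational to binary32 with one rounding, ties to even."""
    guess = bits32(float(q))            # may be one ulp off because of double rounding
    best_key, best_value = None, None
    for b in (guess - 1, guess, guess + 1):
        candidate = from_bits32(b)
        key = (abs(Fraction(candidate) - q), b & 1)   # nearest first, then even
        if best_key is None or key < best_key:
            best_key, best_value = key, candidate
    return best_value

x = 1.0
y = 2.0**-24 + 2.0**-60                 # just above half a binary32 ulp of 1.0
twice = to_binary32(x + y)              # binary64 add, then narrow: two roundings
once = round_once_binary32(Fraction(x) + Fraction(y))
print(twice, once)
```

The two results differ by one binary32 ulp, which is exactly the discrepancy that a single-rounding formatOf addition is meant to exclude.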
This is not a hardware issue at all; there are many possible hardware implementations. The issue is strictly a matter of good exposition to an audience of programming language designers.
formatof.pdf is an attempt to better match the requirements of 754R and languages: homogeneous 754R arithmetic operations are defined that correspond to familiar language generic operators like +-*/, and 754R narrowing operations are defined that encapsulate the new capabilities that languages are required to provide. There are other reduced subsets of formatOf operations that can be composed to provide all the required variants, but these are closest to existing language structures.
The conversion operations of 5.4.1 and 5.4.2 could also be composed from a reduced subset, but these operations do not place unfamiliar requirements on programming languages.
5.12 and 5.12.3 specify different requirements for conversions between binary formats and hex character sequences. Implementations shall provide, but languages should provide. That's rather out of sync with the rest of the clause. And since hex character sequences are intended to promote exact interchange of data among systems, it's not very useful if the required interfaces are implementation-defined. As with every other feature of 754R, language-defined is better than implementation-defined.
Change "Language standards should provide" to "Language standards shall provide" in the same sense as elsewhere in 754R, that a language silent on some point defers it to implementation. Or alternately, be consistent in clause 5 that languages and implementations should provide the hex character sequences.
sin/cos/tan/asin/acos/atan/atan2 are all specified, but the only Pi variants specified are sinPi/cosPi/atanPi/atan2Pi. Either tan/asin/acos should be removed or tanPi/asinPi/acosPi should be added. Whatever the merit of any particular direct or inverse trigonometric function, it's the same regardless of how angles are measured.
Change: remove tan and add asinPi and acosPi. Special cases of asinPi and acosPi are the same as for asin and acos.
If instead it were desired to retain tan and add tanPi, its special cases can be derived as quotients of sinPi/cosPi:
tanPi(+-0) is +-0; for n>0, tanPi(2n) is +0, tanPi(2n-1) is -0; for n<0, tanPi(2n) is -0, tanPi(2n+1) is +0, reflecting the oddness of tanPi; for n>0, tanPi(2n-3/2) is +inf, tanPi(2n-1/2) is -inf; for n<0, tanPi(2n+3/2) is -inf, tanPi(2n+1/2) is +inf. These special cases do not arise for tan (but would arise for tandegrees).
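The derivation of these cases as quotients can be checked mechanically. The sketch below assumes the sign conventions that the quotients above imply for the two factors: sinPi of a positive integer is +0 and of a negative integer is -0, and cosPi of any half-integer is +0. The helper names are hypothetical, and `ieee_divide` exists only because Python raises an error on float division by zero instead of delivering a signed infinity.

```python
import math

def sin_pi_special(x):
    """sinPi at integers and half-integers only."""
    r = math.fmod(x, 2.0)                 # exact, same sign as x
    if r == 0.0 or abs(r) == 1.0:
        return math.copysign(0.0, x)      # assumed: zero takes the sign of x
    if r in (0.5, -1.5):
        return 1.0
    if r in (1.5, -0.5):
        return -1.0
    raise ValueError("not an integer or half-integer")

def cos_pi_special(x):
    """cosPi at integers and half-integers only."""
    r = abs(math.fmod(x, 2.0))
    if r == 0.0:
        return 1.0
    if r == 1.0:
        return -1.0
    if r in (0.5, 1.5):
        return 0.0                        # assumed: always +0 at half-integers
    raise ValueError("not an integer or half-integer")

def ieee_divide(a, b):
    """a / b for finite a, with signed infinity on division by zero."""
    if b == 0.0:
        if a == 0.0:
            return math.nan
        return math.copysign(math.inf, a) * math.copysign(1.0, b)
    return a / b

def tan_pi_special(x):
    return ieee_divide(sin_pi_special(x), cos_pi_special(x))

for x in (0.0, -0.0, 2.0, 1.0, -2.0, -1.0, 0.5, 1.5, -0.5, -1.5):
    print(x, tan_pi_special(x))
```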
Clause 10 has been repeatedly patched until it says almost the same things slightly differently in various places, and does so in an order that unfolds with less than mathematical logic. Clause 11 then revisits many of the same issues, from a slightly different perspective. Neither clause has attracted enough critical review because both are optional. Both need a coordinated rewrite, which does not fundamentally change the recommendation.
If nobody bothers to understand and implement the far-from-orthogonal concepts about expression evaluation rules, literal meaning, preferredWidth, optimization, and reproducibility, then 754R has failed to address 754 users' most serious issue: after 30 years nobody can be sure what z = x + y will be. What good is that kind of standard?
Rewrite completely, such as clause10.pdf. A patchwork can't be patched into wholeness.
The following arguments were submitted in previous ballots but rejected by the Ballot Review Committee.
The Scope and Purpose were taken from the 754R Project Authorization which was never amended. As a result, there are some embarrassing mislocutions:
The working group received several end-user declarations in favor of settling on one decimal interchange format encoding, and most of the testimony was in favor of the DPD encoding. There was some sentiment that end users shouldn't care about the exchange encoding, and if that were true, then there would be no reason for the standard to discriminate against a millennial encoding or even a straight BCD encoding. Straight BCD would likely be more efficient on some low end hardware, but would require either reducing the precision and range requirements of the interchange format valuesets, or allowing interchange format sizes to exceed 64 or 128 bits - but if the content is not interchangeable anyway, why does it need to be a fixed size?
The one thing that's certain is that there is no end-user demand for specifying two interchange encodings. Two encodings was strictly a business compromise between implementors. If the consensus were that there wasn't a consensus on the best decimal encoding for 64 and 128 bit formats, then the appropriate response would have been to just define decimal arithmetic formats and not define any decimal interchange formats. But while many aspects of 754-1985 could be said to be observed in the breach almost as often as not, everybody observes the binary encoding formats for 32 and 64 bits, and that could be said to be the most successful aspect of that standard.
Clause 3.3 defines seven basic formats suitable for arithmetic and interchange. They were chosen in the expectation that they would be suitable for almost all applications. By considering common applications, it was possible to allocate the bits to exponent and significand in a broadly-applicable way. Overall widths were multiples of 32 bits as befits almost all modern computer architectures.
Clause 3.6 attempts to extrapolate principles inferred from the limited data represented by the seven basic formats in order to define narrow and wide arithmetic and interchange format parameters. But specification of narrower and wider fixed-width formats is fraught with difficulties. Relatively few applications can profitably exploit such formats, and each of these applications has particular and differing requirements as to formats and also, often, as to other aspects of arithmetic as well. So the value of standardizing some choices of narrow or wide formats is much less than for the basic formats.
The principal issue with narrow formats is allocating the wordsize between precision and exponent range. This is very application-dependent. The principal issues with wide fixed-width formats are:
Fundamental physical constant accuracy seems to be refined by much less than one bit per decade. And historically in technical computing, most complaints of "I need higher precision" have been more economically resolved by other means like applying compensated summation to critical areas of ODE solvers, and QR factorizations to linear least squares problems. So it seems likely that demand for fixed width binary floating-point formats greater than 128 bits will not manifest itself earlier than the next 754 revision.
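Compensated summation, mentioned above as one of the cheaper remedies, can be sketched in a few lines. This is the classic Kahan formulation in binary64; the function names and the test data are chosen for illustration, and the sketch assumes the language evaluates each statement literally without value-changing optimization.

```python
import math

def naive_sum(values):
    total = 0.0
    for v in values:
        total += v
    return total

def compensated_sum(values):
    """Kahan summation: carry the rounding error of each addition forward."""
    total = 0.0
    carry = 0.0                     # estimate of low-order bits lost so far
    for v in values:
        y = v - carry               # apply the correction to the next term
        t = total + y               # low-order bits of y may be lost here
        carry = (t - total) - y     # recover what was lost
        total = t
    return total

values = [1.0] + [1e-16] * 10000    # each small term is below half an ulp of 1.0
print(naive_sum(values))            # 1.0: every small term is rounded away
print(compensated_sum(values))
print(math.fsum(values))            # correctly rounded reference sum
```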
So this clause seems unlikely to have much influence for good or ill. Better recommendations about formats might be:
3.7's extendable (should that be "extensible"?) formats permit users to define precision and range, but 3.7 doesn't say whether those parameters may vary dynamically at run time, nor does it suggest how expressions of variables with different parameters should be evaluated: to what precision and range should results be rounded? Thus 754R's attempts to specify varying precision amount in the end to hardly any specification at all. That may be the most that is warranted at this time, suggesting that the specification might as well have been omitted.
The value of a theorem is the extent by which it enables us to know MORE while remembering LESS.
Certain operations listed in clause 5 are exceptions to one or more general principles about clause 5 operations, which in most cases are fully specified and mandatory in all implementations. Perhaps these should have been part of clause 9, which would then include both required specialized operations and recommended operations.
The minNum, maxNum, minNumMag, maxNumMag operations have a unique property. They are not strictly commutative, or even deterministic, in certain cases of signed zeros and members of the same cohort.
In addition, one quiet NaN operand will disappear, but one signaling NaN operand will not. If the invalid signal is handled by default, then according to the rules for signaling NaNs, a quiet NaN will be the result of min/max. But why should min(1,qNaN) be different from min(1,sNaN) handled by default?
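The quiet-NaN and signed-zero behavior can be sketched as follows. The function `min_num` is a hypothetical illustration of a minNum-style operation for quiet NaNs only; Python does not expose signaling NaNs or the invalid flag, so the signaling case discussed above cannot be shown here. Python's own built-in `min` is included for contrast.

```python
import math

def min_num(x, y):
    """minNum-style minimum: a single quiet NaN operand disappears."""
    if math.isnan(x):
        return y
    if math.isnan(y):
        return x
    return y if y < x else x        # for +0 and -0 this returns the first operand

nan = math.nan
print(min_num(1.0, nan), min_num(nan, 1.0))     # 1.0 1.0
print(min(1.0, nan), min(nan, 1.0))             # 1.0 nan: built-in min depends on order
print(min_num(0.0, -0.0), min_num(-0.0, 0.0))   # 0.0 -0.0: sign depends on order
```

The last line shows the limited sense in which such an operation is commutative: the two results compare equal but are distinguishable by sign.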
754 specifies that operations should be provided to raise, lower, and test individual status flags, and to save, restore, and lower all status flags as a group. The following 18 operations are a direct mapping of that requirement:
But the interfaces that 754R specifies for accessing flags seem to be derived from some other consideration than actual needs of application programmers using the standard. They allow all possible combinations of flags to be manipulated as a group. That interface seems to be especially tailored to the Itanium fchkf instruction. But how many application programs could exploit such flexibility? Why not make common tasks easy to program and easy to optimize?
In contrast to either of these approaches, W. Kahan finds most useful "swap" versions of operations that store and load in one operation. Kahan also favors explicit operations for dealing with opaque types of flags and flagsets since flags could contain more implementation-defined or language-defined information than a boolean state of being raised or lowered.
A list of various possible application programming interfaces is in 574.pdf.
One possible conclusion is that the specific flag operations don't matter much if programmers don't need to explicitly invoke them for the most common tasks. So the recommended change to avoid explicit application program reference to flags is to require that conforming implementations provide delayed alternate exception handling attributes, which are designed to be implementable by testing flags. See comment p 48, 8.3, l 35: Require delayed alternate exception handling.
Signaling NaNs are yet another approach to alternate exception handling. 754 defined signaling NaNs in a way that facilitates few portable applications yet imposes an onerous burden on all implementations and on software that must be written to support those unlikely applications. 754R would have done well to allow existing practice to continue while encouraging something better: either a complete implementation (stnan.pdf) or allowing their omission entirely (snan.pdf).
It appears that Table 5.2 defines the default value for a predicate generating an invalid exception handled by default; when the relationship is UN,
compareSignaling{Equal,Greater,Less} is FALSE
compareSignalingNot{Equal,Greater,Less} is TRUE
Do we need to add a statement that the default value of an unordered-signaling predicate applied to unordered operands is the same as the value of the corresponding unordered-quiet predicate applied to the same operands?
In the sentence before Table 5.2, after "signal an invalid operation exception on quiet NaNs" add "and return, by default, the logical result of the corresponding unordered-quiet predicate of Table 5.3."
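The logical values in question are easy to tabulate. The sketch below evaluates the quiet forms of the six predicates on unordered operands; the function name `quiet_predicates` and the dictionary keys are hypothetical labels, and Python gives no access to the invalid flag, so only the delivered boolean results are shown. Under the proposed sentence, the signaling predicates would deliver the same values by default.

```python
import math

def quiet_predicates(x, y):
    """Boolean results of the six comparison predicates, with no exception visible."""
    return {
        "Equal": x == y,
        "Greater": x > y,
        "Less": x < y,
        "NotEqual": x != y,
        "NotGreater": not (x > y),
        "NotLess": not (x < y),
    }

for name, value in quiet_predicates(math.nan, 1.0).items():
    print(name, value)
```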
Much of the 5.11 text refers to quiet NaNs - nothing on signaling NaNs. Is it obvious, or should we say, that a signaling NaN operand with default invalid handling produces the same result as a quiet NaN?
After Table 5.3, add "All of these predicates signal an invalid operation exception for signaling NaN operand(s) and return, by default, the logical result of the same predicate applied to quiet NaN operand(s)."
The other logical places to discuss the default result of logical predicates would be in 6.2 and 7.2 but 5.11 is more central.
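The default results discussed above can be observed in an existing implementation. In Python's decimal module the ordering operators on Decimal values behave like signaling predicates. The sketch below is only an analogy, not the 754R operations themselves; the function name signaling_less is hypothetical. It disables the trap to get default handling and reports both the returned value and whether the invalid flag was raised.

```python
import decimal
from decimal import Decimal

def signaling_less(a, b):
    """Hypothetical stand-in for compareSignalingLess with default handling.
    Returns (default logical result, whether the invalid flag was raised)."""
    with decimal.localcontext() as ctx:
        ctx.traps[decimal.InvalidOperation] = False  # default handling
        ctx.clear_flags()
        result = a < b   # '<' on Decimal signals on NaN operands
        return result, bool(ctx.flags[decimal.InvalidOperation])

unordered = signaling_less(Decimal("NaN"), Decimal(1))  # unordered operands
ordered = signaling_less(Decimal(1), Decimal(2))        # ordinary operands
```

For the unordered case the result is FALSE with the flag raised, which is the same logical value a quiet less-than predicate would return; negating it gives the TRUE expected of the corresponding Not predicate.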
Add a NOTE in 6.2: "The sign bit and payload of a quiet NaN are not required to be deterministic."
The current specification can't decide whether 0 * inf + qNaN signals an invalid exception.
The preferred solution is to require the current common practice in the future: 0 * inf + NaN signals an invalid exception whether invoked as two separate operations or one FMA.
If existing implementations are a concern, why not grandfather them in and prescribe something definite for the future? There are no existing implementations of 754R, and the adoption process before languages, compilers, and operating systems support 754R features and applications can exploit them will cover several hardware generations, leaving plenty of time to adapt.
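One existing implementation that already behaves in the preferred way is Python's decimal module, whose fma signals invalid for 0 * inf even when the addend is a quiet NaN. The following sketch checks both evaluation orders; the helper invalid_raised is hypothetical, and decimal is used only as an illustration because binary floating-point flags are not accessible from standard Python.

```python
import decimal
from decimal import Decimal

ZERO, INF, QNAN = Decimal(0), Decimal("Infinity"), Decimal("NaN")

def invalid_raised(operation):
    """Hypothetical helper: run operation(ctx) with default handling and
    report whether the invalid operation flag ended up raised."""
    with decimal.localcontext() as ctx:
        ctx.traps[decimal.InvalidOperation] = False
        ctx.clear_flags()
        operation(ctx)
        return bool(ctx.flags[decimal.InvalidOperation])

# Two separate operations: multiply, then add the quiet NaN.
separate = invalid_raised(lambda ctx: ctx.add(ctx.multiply(ZERO, INF), QNAN))
# One fused multiply-add with the same operands.
fused = invalid_raised(lambda ctx: ctx.fma(ZERO, INF, QNAN))
# Control: a quiet NaN operand alone propagates without signaling.
quiet_only = invalid_raised(lambda ctx: ctx.add(QNAN, ZERO))
```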
The preferred solution is that new implementations detect underflow before rounding - easier to understand, easier to document, no harder to implement.
The less preferable solution is to let languages decide, to change:
The implementor shall choose how tininess is detected, but shall detect tininess in the same way for all operations of a given radix.
to
Tininess shall be detected in the same language-defined way for all operations of a given radix.
The choice of underflow tininess criterion is essentially no different from requiring or excluding extended-precision expression evaluation; a language that takes a stand one way or another will exclude high-performance conforming implementations on some platforms. As with all this standard's implementation options, a language that prescribes one option and proscribes others will impose a performance burden on hardware that exercised other options permitted by this standard. So language definitions that value performance over many platforms higher than reproducibility over many platforms will defer such choices to implementations.
If existing implementations are a concern, why not grandfather them in and prescribe something definite for the future?
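The two tininess criteria disagree only for exact results just below the smallest normal number that round up to it. The sketch below models this for binary64 with exact rational arithmetic; the function names and the round-to-nearest-even assumption are choices made for this illustration, not text from the draft.

```python
from fractions import Fraction

P = 53                               # binary64 precision in bits
MIN_NORMAL = Fraction(2) ** -1022    # smallest positive normal binary64 value

def round_unbounded(x, p=P):
    """Round a positive Fraction to p significant bits, ties to even,
    as if the exponent range were unbounded."""
    e = x.numerator.bit_length() - x.denominator.bit_length()
    if Fraction(2) ** e > x:         # ensure 2**e <= x < 2**(e+1)
        e -= 1
    ulp = Fraction(2) ** (e - p + 1)
    return round(x / ulp) * ulp      # round() on a Fraction is half-to-even

def tiny_before_rounding(x):
    """Tininess judged on the exact nonzero result."""
    return abs(x) < MIN_NORMAL

def tiny_after_rounding(x):
    """Tininess judged on the result rounded with unbounded exponent."""
    return round_unbounded(abs(x)) < MIN_NORMAL

# An exact result slightly below MIN_NORMAL that rounds up to MIN_NORMAL.
borderline = MIN_NORMAL * (1 - Fraction(1, 2 ** 55))
```

For the borderline value the first criterion reports tininess and the second does not, which is the only kind of case in which conforming implementations can differ on underflow.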
The special cases of atan2 and atan2Pi are taken from C99 and are inherently inconsistent due to a desire to coincide with previous definitions in C. The value of atan2(+-x,x) is +-pi/4 or +-3pi/4 for all infinite and all finite x except 0, where C99 instead specifies +-0 or +-pi. Although the angle/argument/phase associated with zero and infinity is somewhat arbitrary and unlikely to affect any correct programs except test programs, it might as well be consistent; the next version of C could adopt a consistent 754R definition under another name while retaining its current definition.
Change the specification of atan2(+-0,+0) to be +-pi/4, atan2(+-0,-0) to be +-3pi/4, and atan2Pi correspondingly.
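The inconsistency and the proposed change can be seen with Python's math.atan2, which follows the C99 special cases. The wrapper atan2_consistent below is a hypothetical illustration of the proposed zero cases only; it is not an existing library function.

```python
import math

def atan2_consistent(y, x):
    """Hypothetical variant: like math.atan2, except that the zero/zero
    cases follow the proposal (+-pi/4 for x = +0, +-3pi/4 for x = -0)."""
    if y == 0.0 and x == 0.0:
        x_positive = math.copysign(1.0, x) > 0      # distinguishes +0 from -0
        magnitude = math.pi / 4 if x_positive else 3 * math.pi / 4
        return math.copysign(magnitude, y)          # result takes the sign of y
    return math.atan2(y, x)

# C99 behavior: infinities give +-pi/4, but zeros give +-0 or +-pi.
c99_inf = math.atan2(math.inf, math.inf)
c99_zero = math.atan2(0.0, 0.0)
c99_zero_neg = math.atan2(0.0, -0.0)
```

Under the proposal, atan2_consistent(0.0, 0.0) agrees with the infinite and finite nonzero cases, while nonzero and infinite arguments are passed through unchanged.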
The exceptional cases of atan2 and atan2Pi could be presented much more systematically to make their rationale more apparent; see atan2u.pdf for an example.
Clause 9.4's recommended scaled product reductions provide higher-level support for extended products and replace 754's lower-level implementation mechanism specification of scaled exponents for trapped overflow and underflow. In each case the intent is efficient support for computations like Clebsch-Gordan coefficients.
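The idea behind a scaled product can be sketched as follows: the product is returned as a fraction together with a separate integer exponent, so that long products neither overflow nor underflow. This is a minimal illustration using math.frexp; the name scaled_prod and its return convention are assumptions for this sketch, and the result is not correctly rounded because each partial product is rounded.

```python
import math

def scaled_prod(xs):
    """Hypothetical scaled product: returns (m, e) with the product
    approximately equal to m * 2**e and 0.5 <= |m| < 1 (or m == 0)."""
    m, e = 1.0, 0
    for x in xs:
        fm, fe = math.frexp(x)    # x == fm * 2**fe with 0.5 <= |fm| < 1
        m *= fm                   # fraction stays within (0.25, 1) in magnitude
        e += fe
        m, de = math.frexp(m)     # renormalize so the fraction cannot drift
        e += de
    return m, e

big = [1e200] * 4                 # the ordinary product overflows to infinity
m, e = scaled_prod(big)
```

The pair (m, e) can be combined with other scaled products by multiplying fractions and adding exponents, and converted back with math.ldexp(m, e) only when the final value is known to be in range.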
In contrast, Clause 9.4's recommended unscaled sum reductions serve an unknown audience. The sums are implementation-defined, but they are to be computed without gratuitous intermediate exponent spill, and the sums of positive reductions have hypot's disappearing-NaN property. These requirements mean that a simple conventional reduction optimization that allows sums to be computed in any order on any number of available processors would not be an acceptable implementation. So on the one hand, performance is not a goal, while on the other hand, neither is correct rounding, so these operations don't support the Kulisch style of computation. 754R received no user request for such operations intermediate between a conventional reduction and a correctly-rounded scaled reduction.
A sufficiently careful reader might infer that short-circuit evaluation of product reductions is allowed: once the result is known to be a quiet NaN, no further terms need to be evaluated, and so a subsequent signaling NaN and its signal might be omitted. But it would not hurt to be more explicit by adding: "Reduction evaluation may be terminated once the result of a product reduction is known to be a quiet NaN."
The bulk of 754R's requirements is appropriate for Fortran or C but imposes excessive cost relative to benefit for applications like scripting languages and spreadsheets. But to encourage those languages and applications to conform where it makes sense, add a lower level of limited conformance. Add an annex defining a limited conformance level:
Language standards for application domains with limited computational requirements, for which requiring full compliance with this standard on the language's implementations would impose excessive language design and implementation costs relative to the benefit to programmers of applications in that limited domain, may choose to require conformance to this standard at a limited level: such a language need specify only one basic format, only the default rounding direction attribute, only default exception handling, and only clause 5's required operations.
The original notion of an informative Annex L providing a sample binding to one or more languages was never fulfilled, but an informative Annex L listing at least the choices that languages/implementations make would be helpful. The file annex-l.html corresponds to an earlier draft of the standard.
The working group discussed the possibility of including a rationale with the proposed standard, either interleaved with the normative text, or as a non-normative annex. Such an annex was started but in the end no person was willing to do the work for free and no organization was willing to pay for it. So no complaints from any of those persons or organizations if the naked standard is less than fully self-explanatory! Copious minutes, email debates, and drafts beyond count are available for study and condensation into a coherent whole. The working group's charter unfortunately omitted provision for publication of a technical report apart from a normative standard.
The late David James was a tireless advocate of a tabular representation of the standard's requirements. A tabular representation standing alone in place of an axiomatic presentation would obscure the global structure of the standard from the point of view of programmers. But as part of the supporting documentation of the final complete standard, such a tabular representation would be an invaluable aid to implementors.