Discussion:
Floating point question
jacob navia
2015-07-31 18:20:35 UTC
When two values produce an overflow in some operation, should the
rounding mode influence the result?

Or should the result always be +/- inf?
James Kuyper
2015-07-31 18:43:33 UTC
Post by jacob navia
When two values produce an overflow in some operation, should the
rounding mode influence the result?
Or the result should be always +/- inf?
At least for the operations performed by <math.h> functions, the
rounding mode is allowed to influence the required results:

"If a floating result overflows and default rounding is in effect, then
the function returns the value of the macro HUGE_VAL, HUGE_VALF, or
HUGE_VALL according to the return type, with the same sign as the
correct value of the function ..." 7.12.1p5

This also applies to the strto*() functions and the corresponding
wide-character functions:

"If the correct value overflows and default rounding is in effect
(7.12.1), plus or minus HUGE_VAL, HUGE_VALF, or HUGE_VALL is returned
(according to the return type and sign of the value) ..." 7.22.1.4p10,
7.29.4.1.1p10.

However, an implementation that always returns HUGE_* on overflow,
regardless of rounding mode, would conform to this requirement.

It seems reasonable to me that this should also apply to floating point
arithmetic operators, but the standard doesn't say so.
jacob navia
2015-07-31 20:00:19 UTC
Post by James Kuyper
"If a floating result overflows and default rounding is in effect, then
the function returns the value of the macro HUGE_VAL, HUGE_VALF, or
HUGE_VALL according to the return type, with the same sign as the
correct value of the function ..." 7.12.1p5
/tmp $ cat t.c
#include <stdio.h>
int main(void)
{
    long double x = 1e4000L;
    long double y = x*x;

    printf("%Lg\n",y);
}
/tmp $ gcc t.c
/tmp $ ./a.out
inf
/tmp $

An overflow in a multiplication returns "inf".
The result doesn't change if I add the -std=c99 option.

I am completely confused

JACOB
Martin Shobe
2015-07-31 20:30:50 UTC
Post by jacob navia
Post by James Kuyper
"If a floating result overflows and default rounding is in effect, then
the function returns the value of the macro HUGE_VAL, HUGE_VALF, or
HUGE_VALL according to the return type, with the same sign as the
correct value of the function ..." 7.12.1p5
/tmp $ cat t.c
#include <stdio.h>
int main(void)
{
long double x = 1e4000L;
long double y = x*x;
printf("%Lg\n",y);
}
/tmp $ gcc t.c
/tmp $ ./a.out
inf
/tmp $
An overflow in a multiplication returns "inf".
Result doesn't change if I add the std=c99 option.
I am completely confused
Print out HUGE_VAL (and friends) and see what you get.

Martin Shobe
James Kuyper
2015-07-31 20:36:41 UTC
Post by jacob navia
Post by James Kuyper
"If a floating result overflows and default rounding is in effect, then
the function returns the value of the macro HUGE_VAL, HUGE_VALF, or
HUGE_VALL according to the return type, with the same sign as the
correct value of the function ..." 7.12.1p5
/tmp $ cat t.c
#include <stdio.h>
int main(void)
{
long double x = 1e4000L;
long double y = x*x;
printf("%Lg\n",y);
}
/tmp $ gcc t.c
/tmp $ ./a.out
inf
/tmp $
An overflow in a multiplication returns "inf".
Result doesn't change if I add the std=c99 option.
I am completely confused
Why? What did you expect it to print? If the above clause applied to
floating point multiplication (and as I pointed out, it does not), then
the result would be HUGE_VALL. Unless __STDC_IEC_559__ is predefined,
the standard imposes only one requirement on HUGE_VALL: that it be
positive. It could even be as small as nextafterl(0.0L, 1.0L). If
__STDC_IEC_559__ is predefined, HUGE_VALL is required to be positive
infinity, which is consistent with your results.

#include <float.h>
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    long double x = 1e4000L;
    long double y = x*x;
    long double infinity = strtold("INF", NULL);
    printf("%Lg,%Lg,%Lg,%Lg\n", LDBL_MAX, HUGE_VALL, infinity, y);
}

Output:

1.18973e+4932,inf,inf,inf
jacob navia
2015-07-31 22:34:50 UTC
Post by James Kuyper
HUGE_VALL is required to be positive infinity
Ahhhhhhhhhhhhhh!!!!!!!!!!!

I was confusing it with LDBL_MAX!

OK, now I see where the bug in my float128 module was.

I was returning the maximum number (FLOAT128_MAX) instead of infinity.

The representation of infinity is FLOAT128_MAX+1 (0x7fff and then a
fraction of zero), where FLOAT128_MAX is 0x7ffe and then a fraction of
all ones.

So, according to the standard an overflow in a multiplication/addition
goes into infinity regardless of the rounding mode.

OK, now it is much clearer. Thank you for your answer Mr Kuyper.
Fred J. Tydeman
2015-08-01 00:59:09 UTC
Post by jacob navia
So, according to the standard an overflow in a multiplication/addition
goes into infinity regardless of the rounding mode.
Which standard? C or IEEE-754? If Annex F is in effect, then IEEE-754
is in effect.

IEEE-754 requires that overflows produce either infinity or maximum finite,
depending upon the rounding mode. By default, the rounding mode is
to nearest, which will produce an infinity. Change the rounding mode to zero
and you will get maximum finite.
James Kuyper
2015-08-01 04:00:26 UTC
Post by jacob navia
Post by James Kuyper
HUGE_VALL is required to be positive infinity
...
Post by jacob navia
So, according to the standard an overflow in a multiplication/addition
goes into infinity regardless of the rounding mode.
Post by Fred J. Tydeman
Which standard? C or IEEE-754?
His citation of my words didn't include enough of the context to make
that clear. The relevant standard is the C standard (C99 or later, to be
exact) and my comment was conditional on #ifdef __STDC_IEC_559__, which
requires that Annex F apply.
Post by Fred J. Tydeman
... If Annex F is in effect, then IEEE-754 is in effect.
Not quite: IEC 60559 is equivalent to IEEE-754, but "An implementation
that defines __STDC_IEC_559__ shall conform to the specifications in
this annex." (F.1p1). It need not conform to any provisions of IEC 60559
not specified, directly or indirectly, in Annex F. I've heard from a
member of the British committee that there are some provisions that were
not included there, and some provisions that are even in conflict with
IEC 60559, though I cannot personally vouch for the truth of that claim.

Annex F says "The +, −, *, and / operators provide the IEC 60559 add,
subtract, multiply, and divide operations.", so whatever IEC 60559 says
about the "add" operation applies to the '+' operator in C, which is,
IEEE-754 requires that overflows produce either infinity or maximum finite,
depending upon the rounding mode. By default, the rounding mode is
to nearest, which will produce an infinity. Change the rounding mode to zero
and you will get maximum finite.
--
James Kuyper
jacob navia
2015-08-01 12:22:29 UTC
Post by James Kuyper
Post by jacob navia
Post by James Kuyper
HUGE_VALL is required to be positive infinity
...
Post by jacob navia
So, according to the standard an overflow in a multiplication/addition
goes into infinity regardless of the rounding mode.
Which standard? C or IEEE-754?
His citation of my words didn't include enough of the context to make
that clear. The relevant standard is the C standard (C99 or later, to be
exact)
Obviously since this is comp.std.c!

Post by James Kuyper
and my comment was conditional on #ifdef __STDC_IEC_559__, which
requires that Annex F apply.
That is why I asked here. This is so complicated that you need good
experts to see all possible stuff!
Post by James Kuyper
... If Annex F is in effect, then IEEE-754 is in effect.
Not quite: IEC 60559 is equivalent to IEEE-754, but "An implementation
that defines __STDC_IEC_559__ shall conform to the specifications in
this annex." (F.1p1). It need not conform to any provisions of IEC 60559
not specified, directly or indirectly, in Annex F. I've heard from a
member of the British committee that there are some provisions that were
not included there, and some provisions that are even in conflict with
IEC 60559, though I cannot personally vouch for the truth of that claim.
Complications upon complications. Now, what is the difference between
754 and 60559 ?

IEEE wouldn't have brought out a NEW standard if it weren't *somehow*
different from the old one, would it?

Are they compatible? If not I have to figure out which to follow???
Post by James Kuyper
Annex F says "The +, −, *, and / operators provide the IEC 60559 add,
subtract, multiply, and divide operations.", so whatever IEC 60559 says
about the "add" operation applies to the '+' operator in C, which is,
IEEE-754 requires that overflows produce either infinity or maximum finite,
depending upon the rounding mode. By default, the rounding mode is
to nearest, which will produce an infinity. Change the rounding mode to zero
and you will get maximum finite.
Does 60559 say the same thing?

AAAAAARGH!
jacob navia
2015-08-01 12:40:16 UTC
IEEE-754 requires that overflows produce either infinity or maximum finite,
depending upon the rounding mode. By default, the rounding mode is
to nearest, which will produce an infinity. Change the rounding mode to zero
and you will get maximum finite.
So basically the result of this program depends on the rounding mode?


#include <math.h>   /* isfinite() */
#include <stdio.h>
int main(void)
{
    long double x = 1e4000L;
    long double y = x*x;   /* product is out of range for long double */

    printf("%Lg %d\n",y,isfinite(y));
}

I think this should always be

inf 0

independently of the rounding mode.

Why?

Because the representation of infinity is a special value for overflow,
a number outside the representable range.

If you use the maximum representable value as a marker for overflow, you
risk making any computation using the maximum representable value
WITHOUT OVERFLOW have the same representation as the result of an
overflow, which is utterly CONFUSING!!!

Mr Tydeman, maybe you can figure this out and write it explicitly in
the standard?

The current situation seems very confusing.
James Kuyper
2015-08-01 15:32:49 UTC
Post by jacob navia
Post by James Kuyper
Post by jacob navia
Post by James Kuyper
HUGE_VALL is required to be positive infinity
...
Post by jacob navia
So, according to the standard an overflow in a multiplication/addition
goes into infinity regardless of the rounding mode.
Which standard? C or IEEE-754?
His citation of my words didn't include enough of the context to make
that clear. The relevant standard is the C standard (C99 or later, to be
exact)
Obviously since this is comp.std.c!
Post by James Kuyper
and my comment was conditional on #ifdef __STDC_IEC_559__, which
requires that Annex F apply.
That is why I asked here. This is so complicated that you need good
experts to see all possible stuff!
Post by James Kuyper
... If Annex F is in effect, then IEEE-754 is in effect.
Not quite: IEC 60559 is equivalent to IEEE-754, but "An implementation
that defines __STDC_IEC_559__ shall conform to the specifications in
this annex." (F.1p1). It need not conform to any provisions of IEC 60559
not specified, directly or indirectly, in Annex F. I've heard from a
member of the British committee that there are some provisions that were
not included there, and some provisions that are even in conflict with
IEC 60559, though I cannot personally vouch for the truth of that claim.
Complications upon complications. Now, what is the difference between
754 and 60559 ?
IEEE wouldn't have brought a NEW standard if it wasn't *somehow*
different from the old one isn't it?
I would guess that it's a matter of jurisdiction - perhaps there are
some people obligated to conform to applicable IEEE standards, but not
to IEC standards (and vice versa)? I believe that's the reason why the C
standard is both an ISO standard and, separately, an ANSI standard, with
identical contents.
Post by jacob navia
Are they compatible? If not I have to figure out which to follow???
I've been told that IEC 60559 and IEEE-754 are equivalent, though I
can't vouch for that, not having a copy of either standard.

The difference I was talking about was between IEC 60559 and Annex F.
Annex F does not incorporate the entire IEC 60559 by reference, but only
bits and pieces. It incorporates most of IEC 60559 by reference, but
only one piece at a time, so it requires careful comparison to determine
whether it has incorporated the whole thing - a comparison I can't carry
out. I've been told that it does not, but I have been given no examples.
I've also been told that some things specified by Annex F are in
conflict with IEC 60559 - but the only examples I was given were not
clearly in conflict - they simply said two different things that could
both be true at the same time when describing the same implementation.
Post by jacob navia
Post by James Kuyper
Annex F says "The +, −, *, and / operators provide the IEC 60559 add,
subtract, multiply, and divide operations.", so whatever IEC 60559 says
about the "add" operation applies to the '+' operator in C, which is,
IEEE-754 requires that overflows produce either infinity or maximum finite,
depending upon the rounding mode. By default, the rounding mode is
to nearest, which will produce an infinity. Change the rounding mode to zero
and you will get maximum finite.
Does 60559 say the same thing?
Since they're supposed to be equivalent, I would expect that it does.
Fred J. Tydeman
2015-08-02 13:38:28 UTC
Post by jacob navia
Complications upon complications. Now, what is the difference between
754 and 60559 ?
There is supposed to be no difference. IEEE-754 is a USA standard (as far
as I understand). IEC 60559 is the international version of the same thing.
Post by jacob navia
IEEE wouldn't have brought a NEW standard if it wasn't *somehow*
different from the old one isn't it?
The 1985 version was just binary floating point.

The 2008 version has both binary and decimal floating-point.
It also has recommendations on the math library functions.
Post by jacob navia
Are they compatible? If not I have to figure out which to follow???
As far as I know, they are the same for binary floating-point
(which is what Annex F of the C standard is about).
Post by jacob navia
Annex F says "The +, −, *, and / operators provide the IEC 60559 add,
subtract, multiply, and divide operations.", so whatever IEC 60559 says
about the "add" operation applies to the '+' operator in C, which is,
IEEE-754 requires that overflows produce either infinity or maximum finite,
depending upon the rounding mode. By default, the rounding mode is
to nearest, which will produce an infinity. Change the rounding mode to zero
and you will get maximum finite.
Does 60559 say the same thing?
Yes.

---
Fred J. Tydeman Tydeman Consulting
***@tybor.com Testing, numerics, programming
+1 (775) 287-5904 Vice-chair of PL22.11 (ANSI "C")
Sample C99+FPCE tests: http://www.tybor.com
Savers sleep well, investors eat well, spenders work forever.
Keith Thompson
2015-08-02 20:39:58 UTC
Post by Fred J. Tydeman
Post by jacob navia
Complications upon complications. Now, what is the difference between
754 and 60559 ?
There is supposed to be no difference. IEEE-754 is a USA standard (as far
as I understand). IEC 60559 is the international version of the same thing.
Not quite. IEEE, IEC, and ISO are all international organizations. The
current standard is ISO/IEC/IEEE 60559:2011, which has content identical
to IEEE 754-2008.

https://en.wikipedia.org/wiki/IEEE_floating_point

[...]
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Hans-Bernhard Bröker
2015-08-02 14:29:56 UTC
Post by jacob navia
Complications upon complications. Now, what is the difference between
754 and 60559 ?
IEEE wouldn't have brought a NEW standard if it wasn't *somehow*
different from the old one isn't it?
You got that backwards. It was _ISO/IEC_ that made a new standard to make
the existing IEEE one internationally applicable. IEEE then
incorporated the new one. So now everyone has the same standard.

The only real difference between two standards to be found here is that
between the original IEEE 754-1985 and its update in 2008.
Post by jacob navia
Are they compatible? If not I have to figure out which to follow???
No, you don't have to figure it out. You have to follow IEC 60559,
because that's what the C Standard tells you to. It doesn't matter what
relations 60559 might have to some other documents.
James Kuyper
2015-08-03 17:07:18 UTC
Post by Hans-Bernhard Bröker
Post by jacob navia
Complications upon complications. Now, what is the difference between
754 and 60559 ?
IEEE wouldn't have brought a NEW standard if it wasn't *somehow*
different from the old one isn't it?
You got that backwards. It was _ISO/IEC_ that made a new standard to make
the existing IEEE one internationally applicable. IEEE then
incorporated the new one. So now everyone has the same standard.
The only real difference between two standards to be found here is that
between the original IEEE 754-1985 and its update in 2008.
Post by jacob navia
Are they compatible? If not I have to figure out which to follow???
No, you don't have to figure it out. You have to follow IEC 60559,
because that's what the C Standard tells you to. ...
No, it does not. All that the C standard does is provide a way for a
program to determine whether a given implementation claims to conform to
the requirements of Annex F. Those requirements are almost, but not
quite, exactly the same as IEC 60559 - but conforming to them is optional.
Post by Hans-Bernhard Bröker
... It doesn't matter what
relations 60559 might have to some other documents.
Those other documents are not completely irrelevant.
Annex F cites ANSI/IEEE 854, and NOT IEC 60559, as mandating the
nearbyint() functions. The strtold() function provides the "conv"
function recommended by ANSI/IEEE 854. While the logb() function is
recommended by IEC 60559, Annex F says that it matches the newer
specification provided by ANSI/IEEE 854. That's why, when I decided to
buy a copy of a floating point standard, I chose to buy ANSI/IEEE 854,
rather than ANSI/IEEE 754 (==IEC 60559).
Bruce Evans
2015-08-01 10:05:30 UTC
Post by jacob navia
Post by James Kuyper
HUGE_VALL is required to be positive infinity
Ahhhhhhhhhhhhhh!!!!!!!!!!!
I was confusing it with LDBL_MAX!
OK, now I see where the bug in my float128 module was.
I was returning the maximum number (FLOAT128_MAX) instead of infinity.
The representation of infinity is FLOAT128_MAX+1 (0x7fff and then a
fraction of zero), where FLOAT128_MAX is 0x7ffe and then a fraction of
all ones.
So, according to the standard an overflow in a multiplication/addition
goes into infinity regardless of the rounding mode.
Which standard? C or IEEE-754? If Annex F is in effect, then IEEE-754
is in effect.
IEEE-754 requires that overflows produce either infinity or maximum finite,
depending upon the rounding mode. By default, the rounding mode is
to nearest, which will produce an infinity. Change the rounding mode to zero
and you will get maximum finite.
More precisely, IEEE-754 requires that overflows produce infinity in all
3 rounding modes except towards zero, and maximum (or minimum) finite for
rounding towards zero. I didn't know of the special case for rounding
towards zero.

Whether overflow itself occurs is also determined by the rounding mode.
E.g., (float)(FLT_MAX + 1) always overflows in rounding-upwards mode.
Whether overflow occurs is determined by first rounding to the current
precision but with an infinite exponent range, and then checking if
the result is representable.

The combination of underflow (and denormal?) exceptions with rounding
is more machine-dependent. IEEE-754 allows implementations to check
for "tiny" values either before or after rounding.

Bruce
Fred J. Tydeman
2015-08-03 21:15:10 UTC
Post by Bruce Evans
More precisely, IEEE-754 requires that overflows produce infinity in all
3 rounding modes except towards zero, and maximum (or minimum) finite for
rounding towards zero. I didn't know of the special case for rounding
towards zero.
Sorry, wrong. 2 modes get infinity and 2 modes get max finite.

For inexact results, the correct value is between two machine "numbers" (one
of which could be infinity), which I will call 'low' and 'high' with low < high.

For the (+/-max finite) * (+/-max finite) case:

If the correct value is positive (low = +max finite, high = +infinity)

Value produced | Rounding mode
---------------+------------------------------
high           | upward or towards +infinity
low            | downward or towards -infinity
low            | towards zero
high           | nearest

If the correct value is negative (low = -infinity, high = -max finite)

Value produced | Rounding mode
---------------+------------------------------
high           | upward or towards +infinity
low            | downward or towards -infinity
high           | towards zero
low            | nearest
