Discussion:
is C implementation allowed to terminate an infinite loop?
(too old to reply)
John Regehr
2010-05-03 16:06:34 UTC
Permalink
Raw Message
The C99 standard, as I read it, is not completely clear about whether
a conforming implementation may change the termination behavior of a
conforming program. In contrast, the Java language definition makes
it clear that the JVM must not terminate an infinite loop. The C++0x
draft standard is explicit that an implementation *can* terminate an
infinite loop.

The arguments seem to be as follows:

One one hand: Non-terminating executions are not mentioned in the list
of "least requirements" in 5.1.2.3 of N1124, so the compiler is free
to terminate an infinite loop.

On the other hand: Termination of infinite loops will change the side
effecting operations performed by a program, so a conforming
implementation must not do this.

Can anyone comment on the intent of the standards committee on this
point? Is there any chance this can be explicitly cleared up in a
future standard?

Consider this example, where "icc" is the current version of Intel's
compiler for x86:

***@john-home:~$ icc fermat2.c -o fermat2
***@john-home:~$ ./fermat2
Fermat's Last Theorem has been disproved.

Where fermat2.c is this code:

int fermat (void)
{
const int MAX = 1000;
int a=1,b=1,c=1;
while (1) {
if (((a*a*a) == ((b*b*b)+(c*c*c)))) return 1;
a++;
if (a>MAX) {
a=1;
b++;
}
if (b>MAX) {
b=1;
c++;
}
if (c>MAX) {
c=1;
}
}
return 0;
}

#include <stdio.h>

int main (void)
{
if (fermat()) {
printf ("Fermat's Last Theorem has been disproved.\n");
} else {
printf ("Fermat's Last Theorem has not been disproved.\n");
}
return 0;
}

A number of other C implementations have this behavior, whereas others
preserve the termination behavior of their inputs.

The other day I wrote a slightly lighthearted blog entry about this
issue and it generated quite a bit of discussion:

http://blog.regehr.org/archives/140

Thanks.

John Regehr
Wojtek Lerch
2010-05-03 17:13:21 UTC
Permalink
Raw Message
Post by John Regehr
The C99 standard, as I read it, is not completely clear about whether
a conforming implementation may change the termination behavior of a
conforming program. In contrast, the Java language definition makes
it clear that the JVM must not terminate an infinite loop. The C++0x
draft standard is explicit that an implementation *can* terminate an
infinite loop.
One one hand: Non-terminating executions are not mentioned in the list
of "least requirements" in 5.1.2.3 of N1124, so the compiler is free
to terminate an infinite loop.
If by "terminating" you mean something along the lines of "letting the next
statement after the loop execute", then I think it's a question of whether
5.1.2.3 should be interpreted as allowing implementations to produce side
effects of expressions that would never be evaluated in the abstrace machine
(and I believe the answer is no). If, on the other hand, you mean just
terminating the program instead of letting it loop forever, then I think
it's much more subtle.

There was a discussion here about this a while ago:

http://groups.google.ca/group/comp.std.c/browse_thread/thread/bbebffa8a5d5fecf/cae898722d689f49?hl=en&lnk=gst
John Regehr
2010-05-03 17:46:42 UTC
Permalink
Raw Message
http://groups.google.ca/group/comp.std.c/browse_thread/thread/bbebffa...
Ah thanks. I'd done a cursory scan of the archives and missed this.

For the curious, here's the list of x86 compilers I've tried.

Will terminate an infinite loop in such a way that the observable
behavior of a program changes: Microsoft (Visual Studio 2008 and
2010), LLVM/Clang 2.7, Open64-x86 4.2.3, Sun CC 5.10, and Intel CC
11.1.072.

Has not been observed to change the observable behavior of a program
containing an infinite loop: various versions of gcc 3.x and 4.x, Wind
River Diab C compiler.

John Regehr
Eric Sosman
2010-05-03 17:17:16 UTC
Permalink
Raw Message
Post by John Regehr
[...]
Consider this example, where "icc" is the current version of Intel's
Fermat's Last Theorem has been disproved.
[... see up-thread ...]
Conjecture: The version that terminated was built with compiler
options resulting in a 16-bit `int'. (When I simulate this with a
whole lot of (short) casts, I get a "disproof" at a=745, b=30, c=1,
where a*a*a = 413493625 = 0x18A56979 is calculated as 0x6979 = 27001.)

Is today the First of April in the Orthodox calendar or in
hexadecimal or in metric or something like that?
--
Eric Sosman
***@ieee-dot-org.invalid
John Regehr
2010-05-03 17:34:39 UTC
Permalink
Raw Message
     Conjecture: The version that terminated was built with compiler
options resulting in a 16-bit `int'.  (When I simulate this with a
whole lot of (short) casts, I get a "disproof" at a=745, b=30, c=1,
where a*a*a = 413493625 = 0x18A56979 is calculated as 0x6979 = 27001.)
No-- the Intel compiler on x86 has sizeof(int)==4. My code is
designed to not let any int exceed 2^31-1.

John
Eric Sosman
2010-05-03 17:48:34 UTC
Permalink
Raw Message
Post by John Regehr
Post by Eric Sosman
Conjecture: The version that terminated was built with compiler
options resulting in a 16-bit `int'. (When I simulate this with a
whole lot of (short) casts, I get a "disproof" at a=745, b=30, c=1,
where a*a*a = 413493625 = 0x18A56979 is calculated as 0x6979 = 27001.)
No-- the Intel compiler on x86 has sizeof(int)==4.
It still seems an "astonishing coincidence," and worth a
closer look. Try printing the value of INT_MAX, for example.
Post by John Regehr
My code is
designed to not let any int exceed 2^31-1.
... whereas what's actually needed is code that doesn't
try to exceed INT_MAX. Initializing MAX to cbrt(INT_MAX/2)
instead of to 1000 might be prudent.
--
Eric Sosman
***@ieee-dot-org.invalid
John Regehr
2010-05-03 21:45:41 UTC
Permalink
Raw Message
closer look.  Try printing the value of INT_MAX, for example.
***@john-home:~$ icc fermat2.c -o fermat2
***@john-home:~$ ./fermat2
sizeof(int)=4, INT_MAX=2147483647
Fermat's Last Theorem has been disproved.
***@john-home:~$ cat fermat2.c
int fermat (void)
{
const int MAX = 10;
int a=1,b=1,c=1;
while (1) {
if (((a*a*a) == ((b*b*b)+(c*c*c)))) return 1;
a++;
if (a>MAX) {
a=1;
b++;
}
if (b>MAX) {
b=1;
c++;
}
if (c>MAX) {
c=1;
}
}
return 0;
}

#include <stdio.h>
#include <limits.h>

int main (void)
{
printf ("sizeof(int)=%d, INT_MAX=%d\n", sizeof(int), INT_MAX);
if (fermat()) {
printf ("Fermat's Last Theorem has been disproved.\n");
} else {
printf ("Fermat's Last Theorem has not been disproved.\n");
}
return 0;
}
Michal Nazarewicz
2010-05-03 17:59:03 UTC
Permalink
Raw Message
Post by John Regehr
     Conjecture: The version that terminated was built with compiler
options resulting in a 16-bit `int'.  (When I simulate this with a
whole lot of (short) casts, I get a "disproof" at a=745, b=30, c=1,
where a*a*a = 413493625 = 0x18A56979 is calculated as 0x6979 = 27001.)
No-- the Intel compiler on x86 has sizeof(int)==4. My code is
designed to not let any int exceed 2^31-1.
You could change to "unsigned long" to finish all comments like that.
--
Best regards, _ _
.o. | Liege of Serenly Enlightened Majesty of o' \,=./ `o
..o | Computer Science, Michal "mina86" Nazarewicz (o o)
ooo +--<mina86-tlen.pl>--<jid:mina86-jabber.org>--ooO--(_)--Ooo--
John Regehr
2010-05-03 21:52:03 UTC
Permalink
Raw Message
     Is today the First of April in the Orthodox calendar or in
hexadecimal or in metric or something like that?
I wish. Eric I see that you're at Sun, so perhaps you'll prefer this
version:

***@john-home:~$ suncc -V
cc: Sun C 5.10 Linux_i386 2009/06/03
usage: cc [ options] files. Use 'cc -flags' for details
***@john-home:~$ suncc -O fermat2.c -o fermat2
"fermat2.c", line 20: warning: statement not reached
***@john-home:~$ ./fermat2
sizeof(int)=4, INT_MAX=2147483647
Fermat's Last Theorem has been disproved.
***@john-home:~$ cat fermat2.c
int fermat (void)
{
const int MAX = 10;
int a=1,b=1,c=1;
while (1) {
if (((a*a*a) == ((b*b*b)+(c*c*c)))) return 1;
a++;
if (a>MAX) {
a=1;
b++;
}
if (b>MAX) {
b=1;
c++;
}
if (c>MAX) {
c=1;
}
}
return 0;
}

#include <stdio.h>
#include <limits.h>

int main (void)
{
printf ("sizeof(int)=%d, INT_MAX=%d\n", sizeof(int), INT_MAX);
if (fermat()) {
printf ("Fermat's Last Theorem has been disproved.\n");
} else {
printf ("Fermat's Last Theorem has not been disproved.\n");
}
return 0;
}
Eric Sosman
2010-05-03 22:31:19 UTC
Permalink
Raw Message
Post by John Regehr
Post by Eric Sosman
Is today the First of April in the Orthodox calendar or in
hexadecimal or in metric or something like that?
I wish. Eric I see that you're at Sun,
Old news, I'm afraid. On the bright side, I had an enviable
87.5% success rate in surviving layoffs there ...
Post by John Regehr
so perhaps you'll prefer this
cc: Sun C 5.10 Linux_i386 2009/06/03
usage: cc [ options] files. Use 'cc -flags' for details
"fermat2.c", line 20: warning: statement not reached
sizeof(int)=4, INT_MAX=2147483647
Fermat's Last Theorem has been disproved.
[... code snipped, see up-thread ...]
Okay, the "closer look" proves my conjecture wrong. Also,
by limiting all arithmetic results to no more than 2000 you've
ruled out overflow even on the narrowest possible int. There's
undefined behavior from printing a size_t with "%d", but I'm just
not going to believe that makes a difference.

At a guess -- and it's only a guess, the compiler "sees"

- That the `return 0;' cannot execute (it says as much)

- Hence, if any `return' executes it must be the `return 1;'
inside the loop

- That the only side-effects are to a,b,c, all unobservable
outside the function

- Hence, those side-effects are unneeded once the function
returns

... hence, the whole function body can be replaced with `return 1;'
with no change in externally observable behavior.

Ugh.

If (*if*) that's indeed what's going on. A couple tests you
might try:

- Make any or all of a,b,c static

- Make any or all of a,b,c file-scope static

- Return the value of a,b, or c (or an expression using them)

But still: My reaction remains "Ugh." I think the reasoning
behind the final "hence" above is simply putrid. 6.8.5p4 says
that "An iteration statement causes a statement called the loop
body to be executed repeatedly until the controlling expression
compares equal to 0," which never happens with your constant
controlling expression `1'. 6.8.4.1p2 says that "[...] the first
substatement [of an `if'] is executed if the expression compares
unequal to 0," which never happens in your code. Therefore the
`return 1;' should not be executed (because the `if' never fires),
and the compilers that execute it have oatmeal between their ears,
if not something worse.

Ugh.
--
Eric Sosman
***@ieee-dot-org.invalid
John Regehr
2010-05-03 23:01:20 UTC
Permalink
Raw Message
     Old news, I'm afraid.  On the bright side, I had an enviable
87.5% success rate in surviving layoffs there ...
Oops, there I go believing what I read on the web...

Your assessment about what's going on in the compilers sounds right.
and the compilers that execute it have oatmeal between their ears,
if not something worse.
Yeah I'd certainly agree with this.

John Regehr
l***@siemens.com
2010-05-04 15:57:32 UTC
Permalink
Raw Message
Post by Eric Sosman
At a guess -- and it's only a guess, the compiler "sees"
- That the `return 0;' cannot execute (it says as much)
- Hence, if any `return' executes it must be the `return 1;'
inside the loop
- That the only side-effects are to a,b,c, all unobservable
outside the function
- Hence, those side-effects are unneeded once the function
returns
... hence, the whole function body can be replaced with `return 1;'
with no change in externally observable behavior.
Ugh.
That kind of optimization is apparently becomming more common. The
current C1X draft explicitly allows it in 6.8.5:

An iteration statement that performs no input/output operations,
does not access volatile objects, and performs no
synchronization or atomic operations in its body, controlling
expression, or (in the case of a for statement) its expression-3,
may be assumed by the implementation to terminate.<footnote>

<footnote> This is intended to allow compiler transformations
such as removal of empty loops even when termination cannot be
proven.
--
Larry Jones

At times like these, all Mom can think of is how long she was in
labor with me. -- Calvin
Keith Thompson
2010-05-04 16:41:59 UTC
Permalink
Raw Message
[...]
Post by l***@siemens.com
Post by Eric Sosman
... hence, the whole function body can be replaced with `return 1;'
with no change in externally observable behavior.
Ugh.
That kind of optimization is apparently becomming more common. The
An iteration statement that performs no input/output operations,
does not access volatile objects, and performs no
synchronization or atomic operations in its body, controlling
expression, or (in the case of a for statement) its expression-3,
may be assumed by the implementation to terminate.<footnote>
<footnote> This is intended to allow compiler transformations
such as removal of empty loops even when termination cannot be
proven.
Let me say again:

Ugh.

If I write:

while (1) {
/* statements */
}
fputs("ERROR: Shouldn't get here\n", stderr);

I should never see that message. If I do see it, the compiler is
broken. If the compiler generates code that produces the message
because the language allows it to do so, the language is broken.

If the body of the loop has no effect, I'd welcome a warning, and I'd
even accept treating it as an error and rejecting the program. Silently
removing an infinite loop is, IMNSHO, unacceptable.

Can you think of an example where the permission to remove an infinite
loop is actually beneficial?
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Michal Nazarewicz
2010-05-04 20:33:41 UTC
Permalink
Raw Message
Post by John Regehr
Post by l***@siemens.com
That kind of optimization is apparently becomming more common. The
An iteration statement that performs no input/output operations,
does not access volatile objects, and performs no
synchronization or atomic operations in its body, controlling
expression, or (in the case of a for statement) its expression-3,
may be assumed by the implementation to terminate.<footnote>
<footnote> This is intended to allow compiler transformations
such as removal of empty loops even when termination cannot be
proven.
while (1) {
/* statements */
}
fputs("ERROR: Shouldn't get here\n", stderr);
It goes better:

for(;;);
start_a_termonuclear_war();

;)
--
Best regards, _ _
.o. | Liege of Serenly Enlightened Majesty of o' \,=./ `o
..o | Computer Science, Michal "mina86" Nazarewicz (o o)
ooo +--<mina86-tlen.pl>--<jid:mina86-jabber.org>--ooO--(_)--Ooo--
Keith Thompson
2010-05-04 22:06:32 UTC
Permalink
Raw Message
Post by Michal Nazarewicz
Post by John Regehr
Post by l***@siemens.com
That kind of optimization is apparently becomming more common. The
An iteration statement that performs no input/output operations,
does not access volatile objects, and performs no
synchronization or atomic operations in its body, controlling
expression, or (in the case of a for statement) its expression-3,
may be assumed by the implementation to terminate.<footnote>
<footnote> This is intended to allow compiler transformations
such as removal of empty loops even when termination cannot be
proven.
while (1) {
/* statements */
}
fputs("ERROR: Shouldn't get here\n", stderr);
for(;;);
start_a_termonuclear_war();
;)
Nevertheless, I think it's safe to assume that the members of the
C standard committee aren't actually batcrackers insane. There's
probably some good rational reason for adding this permission to
the standard. I look forward to reading about this reason.
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
John Regehr
2010-05-04 22:56:45 UTC
Permalink
Raw Message
Post by Keith Thompson
Nevertheless, I think it's safe to assume that the members of the
C standard committee aren't actually batcrackers insane.  There's
probably some good rational reason for adding this permission to
the standard.  I look forward to reading about this reason.
Just to be clear: the C standard is not explicit on this matter,
whereas the C++0x draft standard says that a C++ implementation may
assume that loops free of side effects are terminating, meaning that
non-terminating, side-effect free loops effectively have undefined
behavior.

Hans Boehm explained the C++ committee's rationale to me as follows
(I'm paraphrasing -- any mistakes/mischaracterizations are mine):

- A variety of advanced optimizations are handicapped or broken by the
requirement to match the termination behavior of the abstract machine.

- It's hard to find real examples where this matters.

- It's better for the standard to take a position than to leave this
up in the air, as it seems to be in C.

- Many compilers already have this behavior anyway.

Hope this is helpful.

John Regehr
Keith Thompson
2010-05-04 23:37:43 UTC
Permalink
Raw Message
Post by John Regehr
Post by Keith Thompson
Nevertheless, I think it's safe to assume that the members of the
C standard committee aren't actually batcrackers insane.  There's
probably some good rational reason for adding this permission to
the standard.  I look forward to reading about this reason.
Just to be clear: the C standard is not explicit on this matter,
whereas the C++0x draft standard says that a C++ implementation may
assume that loops free of side effects are terminating, meaning that
non-terminating, side-effect free loops effectively have undefined
behavior.
As Larry Jones mentioned upthread, the C201X draft (not yet a standard)
is explicit. See
<http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1425.pdf>, 6.8.5p6:

An iteration statement that performs no input/output operations,
does not access volatile objects, and performs no synchronization
or atomic operations in its body, controlling expression,
or (in the case of a for statement) its expression-3, may be
assumed by the implementation to terminate.

with a footnote:

This is intended to allow compiler transformations such as removal
of empty loops even when termination cannot be proven.

I would argue that the C99 standard implicitly but unambiguously
disallows such removal by its description of the semantics of loops:

An iteration statement causes a statement called the loop body to be
executed repeatedly until the controlling expression compares
equal to 0.

I just don't see any wiggle room there.
Post by John Regehr
Hans Boehm explained the C++ committee's rationale to me as follows
- A variety of advanced optimizations are handicapped or broken by the
requirement to match the termination behavior of the abstract machine.
I'm sure they're also handicapped by the requirement to produce correct
results. 8-)}
Post by John Regehr
- It's hard to find real examples where this matters.
- It's better for the standard to take a position than to leave this
up in the air, as it seems to be in C.
- Many compilers already have this behavior anyway.
I don't think it's currently up in the air. But I'm sure Hans
Boehm knows this stuff better than I do. It's likely I'm missing
something; I just don't know what.
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Michal Nazarewicz
2010-05-05 07:13:40 UTC
Permalink
Raw Message
Post by John Regehr
Hans Boehm explained the C++ committee's rationale to me as follows
Any paper on that as it sounds interesting and I don't really see it?
Post by John Regehr
- A variety of advanced optimizations are handicapped or broken by the
requirement to match the termination behavior of the abstract machine.
- It's hard to find real examples where this matters.
I just gave one (Linux 2.4.0) in mere minutes. I'm sure people who are
in embeded and operating systems are likely to give more.
Post by John Regehr
- It's better for the standard to take a position than to leave this
up in the air, as it seems to be in C.
I think we all agree on that.

Personally I'd rather standard said compiler can reject such programs
or produce diagnostic then modify it in an undefined way.
--
Best regards, _ _
.o. | Liege of Serenly Enlightened Majesty of o' \,=./ `o
..o | Computer Science, Michal "mina86" Nazarewicz (o o)
ooo +--<mina86-tlen.pl>--<jid:mina86-jabber.org>--ooO--(_)--Ooo--
Michal Nazarewicz
2010-05-05 07:21:14 UTC
Permalink
Raw Message
Post by Michal Nazarewicz
Post by John Regehr
- It's better for the standard to take a position than to leave this
up in the air, as it seems to be in C.
Personally I'd rather standard said compiler can reject such programs
or produce diagnostic then modify it in an undefined way.
On second thought, implementations can already produce diagnostics
whenever they want so let the standard just say that compiler cannot
assume loops terminate.

If it's such a burden for optimisation (which I still don't see)
implementations can simply provide a standard-incompatible option which
would turn such optimisations of.
--
Best regards, _ _
.o. | Liege of Serenly Enlightened Majesty of o' \,=./ `o
..o | Computer Science, Michal "mina86" Nazarewicz (o o)
ooo +--<mina86-tlen.pl>--<jid:mina86-jabber.org>--ooO--(_)--Ooo--
l***@siemens.com
2010-05-05 15:43:10 UTC
Permalink
Raw Message
Post by John Regehr
Hans Boehm explained the C++ committee's rationale to me as follows
- A variety of advanced optimizations are handicapped or broken by the
requirement to match the termination behavior of the abstract machine.
- It's hard to find real examples where this matters.
- It's better for the standard to take a position than to leave this
up in the air, as it seems to be in C.
- Many compilers already have this behavior anyway.
The C committee's rationale is basically, "what they said". :-)
--
Larry Jones

It COULD'VE happened by accident! -- Calvin
Keith Thompson
2010-05-05 16:47:06 UTC
Permalink
Raw Message
Reference: N1425 6.8.5p6:

An iteration statement that performs no input/output operations,
does not access volatile objects, and performs no synchronization
or atomic operations in its body, controlling expression,
or (in the case of a for statement) its expression-3, may be
assumed by the implementation to terminate.
Post by l***@siemens.com
Post by John Regehr
Hans Boehm explained the C++ committee's rationale to me as follows
- A variety of advanced optimizations are handicapped or broken by the
requirement to match the termination behavior of the abstract machine.
Are there concrete examples of this?
Post by l***@siemens.com
Post by John Regehr
- It's hard to find real examples where this matters.
At least one has been posted in this thread. Here's another
hypothetical but entirely plausible one:

while (1) {
#ifdef FEATURE_ONE
do_feature_one();
#endif
#ifdef FEATURE_TWO
do_feature_two();
#endif
#ifdef FEATURE_THREE
do_feature_three();
#endif
}
fputs("Internal error\n", stderr);
Post by l***@siemens.com
Post by John Regehr
- It's better for the standard to take a position than to leave this
up in the air, as it seems to be in C.
In my opinion, it's not up in the air at all. The C standard
defines the semantics of iteration statements. It doesn't allow
non-terminating loops to terminate, any more than it allows
``if (0) puts("oops")'' to print "oops".

I ask again, am I missing something? Can the current wording of
the C99 standard could be read in such a way as to permit this kind
of transformation?
Post by l***@siemens.com
Post by John Regehr
- Many compilers already have this behavior anyway.
Examples?
Post by l***@siemens.com
The C committee's rationale is basically, "what they said". :-)
My own rationale is this: If the standard permits the compiler to
violate the explicit semantics of a program in one special case,
it damages the credibility of the standard and of the language.

Just to make sure we all understand this correctly, does the new
rule actually permit this program:

#include <stdio.h>
int main(void)
{
while (1)
;
puts("Hello");
return 0;
}

to print "Hello"? As far as I can tell, it does; the proposed
wording doesn't exclude cases where the loop can trivially be shown
not to terminate. (I can imagine an argument that no real compiler
would perform the transformation in this particular case, but that's
not my question.)
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Ersek, Laszlo
2010-05-05 18:18:11 UTC
Permalink
Raw Message
On Wed, 5 May 2010, Keith Thompson wrote:

[snip]
The C standard defines the semantics of iteration statements. It
doesn't allow non-terminating loops to terminate, any more than it
allows ``if (0) puts("oops")'' to print "oops".
If you look back at the original code, the thing that is actually violated
is not the while statement's semantics, but that of "if".

static int
f(void)
{
while (1) {
if (impossible) return 0;
}
return -1;
}

Quoting myself (sorry!) from
<http://blog.regehr.org/archives/140#comment-323>,

"The standard doesn't say anything about the selection statement *having*
to evaluate to nonzero, and the compiler errs when it still assumes so."

(Obviously badly worded, it should have been "the controlling expression
of the selection statement".)

Cheers,
lacos
John Regehr
2010-05-05 19:04:12 UTC
Permalink
Raw Message
In my opinion, it's not up in the air at all.  The C standard
defines the semantics of iteration statements.  It doesn't allow
non-terminating loops to terminate, any more than it allows
``if (0) puts("oops")'' to print "oops".
I ask again, am I missing something?  Can the current wording of
the C99 standard could be read in such a way as to permit this kind
of transformation?
N1124 says this:

"The least requirements on a conforming implementation are:
— At sequence points, volatile objects are stable in the sense that
previous accesses are complete and subsequent accesses have not yet
occurred.
— At program termination, all data written into files shall be
identical to the result that execution of the program according to the
abstract semantics would have produced.
— The input and output dynamics of interactive devices shall take
place as specified in 7.19.3. The intent of these requirements is that
unbuffered or line-buffered output appear as soon as possible, to
ensure that prompting messages actually appear prior to a program
waiting for input."

The question is: Is the middle bullet talking about termination in the
abstract or actual semantics? If the former, then the compiler seems
to be permitted to terminate infinite loops. If the latter, then the
compiler is not permitted to do so. That's my reading at least.
Post by John Regehr
- Many compilers already have this behavior anyway.
Examples?
C compilers from Intel, Sun, Open64, LLVM, and Microsoft.
#include <stdio.h>
int main(void)
{
    while (1)
        ;
    puts("Hello");
    return 0;
}
to print "Hello"?  As far as I can tell, it does; the proposed
wording doesn't exclude cases where the loop can trivially be shown
not to terminate.  (I can imagine an argument that no real compiler
would perform the transformation in this particular case, but that's
not my question.)
The behavior of this program is undefined in C++0x. In C we don't
know...

John Regehr
Keith Thompson
2010-05-05 19:50:35 UTC
Permalink
Raw Message
Post by John Regehr
In my opinion, it's not up in the air at all.  The C standard
defines the semantics of iteration statements.  It doesn't allow
non-terminating loops to terminate, any more than it allows
``if (0) puts("oops")'' to print "oops".
I ask again, am I missing something?  Can the current wording of
the C99 standard could be read in such a way as to permit this kind
of transformation?
It probably doesn't matter in this case, but N1256 is newer.
N1124 includes the C99 standard plus TC1 and TC2; N1256 adds TC3.
<http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf>
Post by John Regehr
— At sequence points, volatile objects are stable in the sense that
previous accesses are complete and subsequent accesses have not yet
occurred.
— At program termination, all data written into files shall be
identical to the result that execution of the program according to the
abstract semantics would have produced.
— The input and output dynamics of interactive devices shall take
place as specified in 7.19.3. The intent of these requirements is that
unbuffered or line-buffered output appear as soon as possible, to
ensure that prompting messages actually appear prior to a program
waiting for input."
The question is: Is the middle bullet talking about termination in the
abstract or actual semantics? If the former, then the compiler seems
to be permitted to terminate infinite loops. If the latter, then the
compiler is not permitted to do so. That's my reading at least.
Are you saying that since the second bullet applies only to
file contents *at program termination*, all bets are off for
non-terminating programs?

If so, that's a much stronger statement than the proposal to allow
possibly infinite loops to be removed. And adding permission
to remove such loops really changes nothing; the behavior is
undefined anyway, so the compiler could replace the loop with a
call to format_disk().

Hmm. I suppose the requirements regarding volatile objects and
interactive devices are still in place.
Post by John Regehr
Post by John Regehr
- Many compilers already have this behavior anyway.
Examples?
C compilers from Intel, Sun, Open64, LLVM, and Microsoft.
According to <http://blog.regehr.org/archives/140>, LLVM no longer
does this; apparently its maintainers considered the behavior to
be a bug.
Post by John Regehr
#include <stdio.h>
int main(void)
{
    while (1)
        ;
    puts("Hello");
    return 0;
}
to print "Hello"?  As far as I can tell, it does; the proposed
wording doesn't exclude cases where the loop can trivially be shown
not to terminate.  (I can imagine an argument that no real compiler
would perform the transformation in this particular case, but that's
not my question.)
The behavior of this program is undefined in C++0x. In C we don't
know...
The latest C201X draft contains wording similar (possibly identical, I
haven't checked) to the wording in the latest C++ draft.

I meant to ask about the behavior under the proposed C201X standard.
I don't see how it's undefined. The compiler has permission to
remove the loop. If it does so, the program prints "Hello"; if
it doesn't, the program never terminates and produces no output.
I'd say it's unspecified, not undefined.
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Keith Thompson
2010-05-05 20:04:55 UTC
Permalink
Raw Message
[...]
Post by Keith Thompson
Post by John Regehr
Post by John Regehr
#include <stdio.h>
int main(void)
{
    while (1)
        ;
    puts("Hello");
    return 0;
}
to print "Hello"?  As far as I can tell, it does; the proposed
wording doesn't exclude cases where the loop can trivially be shown
not to terminate.  (I can imagine an argument that no real compiler
would perform the transformation in this particular case, but that's
not my question.)
The behavior of this program is undefined in C++0x. In C we don't
know...
The latest C201X draft contains wording similar (possibly identical, I
haven't checked) to the wording in the latest C++ draft.
I meant to ask about the behavior under the proposed C201X standard.
I don't see how it's undefined. The compiler has permission to
remove the loop. If it does so, the program prints "Hello"; if
it doesn't, the program never terminates and produces no output.
I'd say it's unspecified, not undefined.
Sorry, I didn't read carefully enough. The proposed language
doesn't just give the compiler permission to remove the loop; it
says that it "may be assumed by the implementation to terminate".
I suppose that means that if the program violates this assumption
(by not terminating), then the behavior is undefined.

Another way to state this is that it's a change to the meaning of
a loop.

By writing a loop, the programmer is asserting *either* that the
loop does something with visible behavior (performs I/O, accesses
volatile objects, or performs synchronization or atomic operations),
*or* that it eventually terminates. If I violate that assumption
by writing an infinite loop with no side effects, I am lying to the
compiler, and my program's behavior is undefined. (If I'm lucky,
the compiler might warn me about it.) The language doesn't provide
(no longer provides?) a way to express an infinite loop without
visible behavior.

I suggest that something similar to the previous paragraph is
a clearer way to express the idea than what's currently in the
draft. It expresses what the code means rather than giving the
implementation permission to do something to the code.

I'm still not at all convinced this is a good idea, but it's probably
not as obviously insane as I've been thinking it is.
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
John Regehr
2010-05-05 20:21:09 UTC
Permalink
Raw Message
Post by Keith Thompson
By writing a loop, the programmer is asserting *either* that the
loop does something with visible behavior (performs I/O, accesses
volatile objects, or performs synchronization or atomic operations),
*or* that it eventually terminates.  If I violate that assumption
by writing an infinite loop with no side effects, I am lying to the
compiler, and my program's behavior is undefined.
This is, I believe, exactly right.

John Regehr
Wojtek Lerch
2010-05-05 20:45:40 UTC
Permalink
Raw Message
Post by Keith Thompson
Sorry, I didn't read carefully enough. The proposed language
doesn't just give the compiler permission to remove the loop; it
says that it "may be assumed by the implementation to terminate".
I suppose that means that if the program violates this assumption
(by not terminating), then the behavior is undefined.
Perhaps, although if that's the case, then I think it would be nice if the
standard made that clearer by either saying "shall terminate" or explicitly
mentioning undefined behaviour.
Post by Keith Thompson
Another way to state this is that it's a change to the meaning of
a loop.
By writing a loop, the programmer is asserting *either* that the
loop does something with visible behavior (performs I/O, accesses
volatile objects, or performs synchronization or atomic operations),
*or* that it eventually terminates. If I violate that assumption
by writing an infinite loop with no side effects, I am lying to the
compiler, and my program's behavior is undefined. (If I'm lucky,
the compiler might warn me about it.) The language doesn't provide
(no longer provides?) a way to express an infinite loop without
visible behavior.
Actually, "back: goto back;" is such a loop, and since it's not an iteration
statement, the text in question does not apply to it.
Post by Keith Thompson
I suggest that something similar to the previous paragraph is
a clearer way to express the idea than what's currently in the
draft. It expresses what the code means rather than giving the
implementation permission to do something to the code.
I wouldn't even think that making an assumption about the code counts as
doing something *to* the code. :)
Post by Keith Thompson
I'm still not at all convinced this is a good idea, but it's probably
not as obviously insane as I've been thinking it is.
--
<http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Keith Thompson
2010-05-05 21:21:00 UTC
Permalink
Raw Message
[...]
Post by Wojtek Lerch
Post by Keith Thompson
By writing a loop, the programmer is asserting
[snip]
Post by Wojtek Lerch
Actually, "back: goto back;" is such a loop, and since it's not an
iteration statement, the text in question does not apply to it.
By "loop", I really meant "iteration statement".

In other words, I was wrong. Thanks for the correction.
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Wojtek Lerch
2010-05-05 20:53:19 UTC
Permalink
Raw Message
Post by Keith Thompson
I'm still not at all convinced this is a good idea, but it's probably
not as obviously insane as I've been thinking it is.
My main concern would be that it may make it harder to debug loops that are
infinite by mistake.
Keith Thompson
2010-05-05 21:19:52 UTC
Permalink
Raw Message
Post by Wojtek Lerch
Post by Keith Thompson
I'm still not at all convinced this is a good idea, but it's probably
not as obviously insane as I've been thinking it is.
My main concern would be that it may make it harder to debug loops that are
infinite by mistake.
My main concern is that it's surprising. Programmers who aren't aware
of the new rule (even if it's really a consequence of existing rules)
will naturally assume that this:

while (1)
;

is an infinite loop, and nothing following it can ever execute.
Whether this is a good idea or not, it's bound to cause confusion.

And sometimes (rarely) you really need a way to execute forever without
doing anything. With this new rule, they'll either need to add an
otherwise unnecessary side effect, or use some other construct such as
``loop: goto loop;'' (with at least a paragraph of comments explaining
why it's necessary), or compile without optimization.
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Keith Thompson
2010-05-05 21:34:56 UTC
Permalink
Raw Message
Post by Keith Thompson
Post by Wojtek Lerch
Post by Keith Thompson
I'm still not at all convinced this is a good idea, but it's probably
not as obviously insane as I've been thinking it is.
My main concern would be that it may make it harder to debug loops that are
infinite by mistake.
My main concern is that it's surprising. Programmers who aren't aware
of the new rule (even if it's really a consequence of existing rules)
while (1)
;
is an infinite loop, and nothing following it can ever execute.
Whether this is a good idea or not, it's bound to cause confusion.
And sometimes (rarely) you really need a way to execute forever without
doing anything. With this new rule, they'll either need to add an
otherwise unnecessary side effect, or use some other construct such as
``loop: goto loop;'' (with at least a paragraph of comments explaining
why it's necessary), or compile without optimization.
I also have a sneaking suspicion that this new permission to assume
that certain loops terminate is potentially just one special case
of a whole class of such things.

The new rule covers "while (1) {}" but not "loop: goto loop;". Why?
And what other cases might there, where breaking the naive semantics
of certain code can make optimization easier?

Is this a slippery slope that logically leads to making the behavior
of even the most obvious code undefined, producing wrong answers
*really* quickly?

Or is there a more general rule hiding behind this special case,
something that can be expressed clearly enough that you don't have
to be a compiler expert to understand it?

(By asking these questions, I don't mean to suggest that I have a
clue what the answers might be.)
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Fred Bloggs
2010-05-06 18:06:01 UTC
Permalink
Raw Message
Post by Keith Thompson
Post by Keith Thompson
Post by Wojtek Lerch
Post by Keith Thompson
I'm still not at all convinced this is a good idea, but it's
probably not as obviously insane as I've been thinking it is.
My main concern would be that it may make it harder to debug loops
that are infinite by mistake.
My main concern is that it's surprising. Programmers who aren't
aware of the new rule (even if it's really a consequence of existing
while (1)
;
is an infinite loop, and nothing following it can ever execute.
Whether this is a good idea or not, it's bound to cause confusion.
And sometimes (rarely) you really need a way to execute forever
without doing anything. With this new rule, they'll either need to
add an otherwise unnecessary side effect, or use some other construct
such as ``loop: goto loop;'' (with at least a paragraph of comments
explaining why it's necessary), or compile without optimization.
I also have a sneaking suspicion that this new permission to assume
that certain loops terminate is potentially just one special case
of a whole class of such things.
The new rule covers "while (1) {}" but not "loop: goto loop;". Why?
And what other cases might there, where breaking the naive semantics
of certain code can make optimization easier?
Is this a slippery slope that logically leads to making the behavior
of even the most obvious code undefined, producing wrong answers
*really* quickly?
Or is there a more general rule hiding behind this special case,
something that can be expressed clearly enough that you don't have
to be a compiler expert to understand it?
(By asking these questions, I don't mean to suggest that I have a
clue what the answers might be.)
As previously stated by others, LOTS of embedded code relies on infinite
loops not terminating. Some of this code is safety critical and exiting
from the loop could cause safety issues.

Changing the behaviour in such a way will break a lot of existing code!

C99 is already so badly defined that C90 is used as the basis for MISRA
C. There was hope that C1X would have resolved some of the issues.

However if C1X breakes safety critical code then there will be no uptake
of the new standard and we are back to the situation as we are with C99.

I've always stood up to the people who say C should not be used for
safety critical code and Ada should be used instead. I will find it much
more difficly not to agree with them in future!
Hallvard B Furuseth
2010-05-07 15:20:16 UTC
Permalink
Raw Message
Post by Fred Bloggs
As previously stated by others, LOTS of embedded code relies on infinite
loops not terminating. Some of this code is safety critical and exiting
from the loop could cause safety issues.
Including side-effect-free infinite loops, which is the ones affected?
Something like, the program is driven completely by signals and waits
for next signal with while(1); instead of with some system call?
--
Hallvard
Hallvard B Furuseth
2010-05-07 15:26:54 UTC
Permalink
Raw Message
Post by Hallvard B Furuseth
Post by Fred Bloggs
As previously stated by others, LOTS of embedded code relies on infinite
loops not terminating. Some of this code is safety critical and exiting
from the loop could cause safety issues.
Including side-effect-free infinite loops, which is the ones affected?
Never mind, I should have read the whole thread first. "Yes."
--
Hallvard
Clive D. W. Feather
2010-05-07 20:12:19 UTC
Permalink
Raw Message
Post by Hallvard B Furuseth
Including side-effect-free infinite loops, which is the ones affected?
Something like, the program is driven completely by signals and waits
for next signal with while(1); instead of with some system call?
Right. There may well not be any "system" to call. The infinite loop may
be the only way not to do anything.
--
Clive D.W. Feather | Home: <***@davros.org>
Mobile: +44 7973 377646 | Web: <http://www.davros.org>
Please reply to the Reply-To address, which is: <***@davros.org>
Dan Henry
2010-05-08 16:04:08 UTC
Permalink
Raw Message
Post by Keith Thompson
My main concern is that it's surprising. Programmers who aren't aware
of the new rule (even if it's really a consequence of existing rules)
while (1)
;
is an infinite loop, and nothing following it can ever execute.
Whether this is a good idea or not, it's bound to cause confusion.
Surprising and confusing. Many embedded designs initialize peripheral
subsystems to generate interrupts and then enter an infinite (empty)
do-nothing loop. *Everything* happens in the interrupt handlers. The
for(;;); or while(1); is there to prevent main() from returning.
Post by Keith Thompson
And sometimes (rarely) you really need a way to execute forever without
doing anything. With this new rule ...
With this new rule, a lot of code will become broken.
--
Dan Henry
Francis Glassborow
2010-05-09 12:47:30 UTC
Permalink
Raw Message
Post by Dan Henry
Post by Keith Thompson
My main concern is that it's surprising. Programmers who aren't aware
of the new rule (even if it's really a consequence of existing rules)
while (1)
;
is an infinite loop, and nothing following it can ever execute.
Whether this is a good idea or not, it's bound to cause confusion.
Surprising and confusing. Many embedded designs initialize peripheral
subsystems to generate interrupts and then enter an infinite (empty)
do-nothing loop. *Everything* happens in the interrupt handlers. The
for(;;); or while(1); is there to prevent main() from returning.
Post by Keith Thompson
And sometimes (rarely) you really need a way to execute forever without
doing anything. With this new rule ...
With this new rule, a lot of code will become broken.
Exactly, forever loops are very common when you get down to basics. If
ever the machine in front of me breaks out of the forever loop that
scans for input I have to restart my machine. Granted that those loops
are not always empty but as pointed out above, they are a reasonable
design solution.
James Kuyper
2010-05-09 16:19:17 UTC
Permalink
Raw Message
..
Post by Francis Glassborow
Post by Dan Henry
With this new rule, a lot of code will become broken.
Exactly, forever loops are very common when you get down to basics. If
ever the machine in front of me breaks out of the forever loop that
scans for input I have to restart my machine. Granted that those loops
are not always empty but as pointed out above, they are a reasonable
design solution.
Please note that such a loop would probably not be covered by the
proposed rule. Whatever method is used to scan for input would almost
certainly violate one of the requirements for application of the rule:
"... performs no input/output operations, does not access volatile
objects, and performs no synchronization or atomic operations ...".
Dan Henry
2010-05-09 19:28:58 UTC
Permalink
Raw Message
On Sun, 09 May 2010 12:19:17 -0400, James Kuyper
Post by James Kuyper
..
Post by Francis Glassborow
Post by Dan Henry
With this new rule, a lot of code will become broken.
Exactly, forever loops are very common when you get down to basics. If
ever the machine in front of me breaks out of the forever loop that
scans for input I have to restart my machine. Granted that those loops
are not always empty but as pointed out above, they are a reasonable
design solution.
Please note that such a loop would probably not be covered by the
proposed rule. Whatever method is used to scan for input would almost
"... performs no input/output operations, does not access volatile
objects, and performs no synchronization or atomic operations ...".
The case I am worried about is the one where after hardware
initialization, the program enters a (truly) empty infinite loop at
the bottom of main() leaving *all* processing to be performed by
interrupt handlers. Translating main(), the compiler is unaware that
operations are performed outside the empty loop. Would that loop be
eliminated such that main() could return?
--
Dan Henry
Fred Bloggs
2010-05-09 19:48:26 UTC
Permalink
Raw Message
Post by Dan Henry
On Sun, 09 May 2010 12:19:17 -0400, James Kuyper
Post by James Kuyper
..
Post by Francis Glassborow
Post by Dan Henry
With this new rule, a lot of code will become broken.
Exactly, forever loops are very common when you get down to basics. If
ever the machine in front of me breaks out of the forever loop that
scans for input I have to restart my machine. Granted that those loops
are not always empty but as pointed out above, they are a reasonable
design solution.
Please note that such a loop would probably not be covered by the
proposed rule. Whatever method is used to scan for input would almost
"... performs no input/output operations, does not access volatile
objects, and performs no synchronization or atomic operations ...".
The case I am worried about is the one where after hardware
initialization, the program enters a (truly) empty infinite loop at
the bottom of main() leaving *all* processing to be performed by
interrupt handlers. Translating main(), the compiler is unaware that
operations are performed outside the empty loop. Would that loop be
eliminated such that main() could return?
Or the ones in safety critical code such as:

for(;;){} /* Fatal error, watchdog will reset */
Keith Thompson
2010-05-09 20:06:15 UTC
Permalink
Raw Message
Post by Dan Henry
On Sun, 09 May 2010 12:19:17 -0400, James Kuyper
Post by James Kuyper
..
Post by Francis Glassborow
Post by Dan Henry
With this new rule, a lot of code will become broken.
Exactly, forever loops are very common when you get down to basics. If
ever the machine in front of me breaks out of the forever loop that
scans for input I have to restart my machine. Granted that those loops
are not always empty but as pointed out above, they are a reasonable
design solution.
Please note that such a loop would probably not be covered by the
proposed rule. Whatever method is used to scan for input would almost
"... performs no input/output operations, does not access volatile
objects, and performs no synchronization or atomic operations ...".
The case I am worried about is the one where after hardware
initialization, the program enters a (truly) empty infinite loop at
the bottom of main() leaving *all* processing to be performed by
interrupt handlers. Translating main(), the compiler is unaware that
operations are performed outside the empty loop. Would that loop be
eliminated such that main() could return?
The new rule gives the implementation permission to *assume* that
the loop terminates. One possible consequence of this assumption
is, as you say, that the code for the loop will be removed. Or,
of course, the compiler can leave it alone and do what you expect;
the compiler isn't *required* to act on the assumption.

What the new rule really says (and IMHO should say more clearly)
is that the behavior of a non-terminating loop that doesn't perform
any of the listed actions is undefined. In effect, by writing such
a loop, you're lying to the compiler, and the standard places no
constraints on the implementation's response.
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Phil Carmody
2010-05-10 07:27:58 UTC
Permalink
Raw Message
Post by Keith Thompson
What the new rule really says (and IMHO should say more clearly)
is that the behavior of a non-terminating loop that doesn't perform
any of the listed actions is undefined. In effect, by writing such
a loop, you're lying to the compiler, and the standard places no
constraints on the implementation's response.
Precisely what is the lie in ``for(;;);''? That's what I mean,
that's what I want, that's what I ask for. I see no lie. You've
criticised others for abusing the word 'lie', but it looks like
you're abusing it here yourself.

Phil
--
I find the easiest thing to do is to k/f myself and just troll away
-- David Melville on r.a.s.f1
Hallvard B Furuseth
2010-05-10 12:02:17 UTC
Permalink
Raw Message
Post by Phil Carmody
Post by Keith Thompson
What the new rule really says (and IMHO should say more clearly)
is that the behavior of a non-terminating loop that doesn't perform
any of the listed actions is undefined. In effect, by writing such
a loop, you're lying to the compiler, and the standard places no
constraints on the implementation's response.
Precisely what is the lie in ``for(;;);''? That's what I mean,
that's what I want, that's what I ask for. I see no lie.
You're getting it backwards. Note "in effect". With the new rule, the
description of how the compiler may deal with an empty loop is the same
as for undefined behavior: The compiler may assume that such code
doesn't get invoked. That's why "invoking undefined behavior" is "lying
to the compiler" in Standardese: It allows the compiler to make an
incorrect assumption.

The point is that the new rule _doesn't_ say that but it should, since
it _in effect_ says the same thing. The lie would then be "this program
doesn't execute empty loops", clearly spelled out in Standardese
terminology. Instead we are left to infer it. Or maybe it's supposed
to mean something slightly gentler since they didn't call it "undefined
behavior", but if so I'm not sure how it's supposed to gentler.
--
Hallvard
Keith Thompson
2010-05-10 15:38:01 UTC
Permalink
Raw Message
Post by Phil Carmody
Post by Keith Thompson
What the new rule really says (and IMHO should say more clearly)
is that the behavior of a non-terminating loop that doesn't perform
any of the listed actions is undefined. In effect, by writing such
a loop, you're lying to the compiler, and the standard places no
constraints on the implementation's response.
Precisely what is the lie in ``for(;;);''? That's what I mean,
that's what I want, that's what I ask for. I see no lie. You've
criticised others for abusing the word 'lie', but it looks like
you're abusing it here yourself.
It's a reference to Henry Spencer's famous quotation: "If you lie
to the compiler, it will get its revenge". I didn't mean to imply
deliberate deception, as I would if I applied the word "lie" made
by a person to another person; note that I wrote "in effect",
and that the target of the "lie" is a compiler, not a person.

My real point is that the proposed rule permits the compiler to treat
``for(;;);'' as a lie, in the same way that it can treat a call to
printf via a pointer of type ``int(*)int'' as a lie.

I object to this proposed rule because it breaks existing code that is
very widely seen as perfectly valid.

I also object to the confusing way that it's stated, as a permission
for what the compiler is allowed to do with a certain construct
rather than as a description of the behavior (or lack of defined
behavior) of that construct. The standard's target audience is, or
should be, both implementers and programmers; the latter shouldn't
have to reason backwards from the compiler's permitted behavior to
infer the behavior of their code.
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Eric Sosman
2010-05-10 16:56:20 UTC
Permalink
Raw Message
Post by Keith Thompson
[...]
My real point is that the proposed rule permits the compiler to treat
``for(;;);'' as a lie, in the same way that it can treat a call to
printf via a pointer of type ``int(*)int'' as a lie.
We're not talking about the programmer lying to the compiler,
but about the compiler deliberately misunderstanding the programmer.

The C1x Charter says "Trust the programmer, as a goal, is
outdated," to which I can only respond "Trust the compiler, as
a practical policy, is outdated." In other words, Ugh.
--
Eric Sosman
***@ieee-dot-org.invalid
Keith Thompson
2010-05-10 18:20:43 UTC
Permalink
Raw Message
Post by Eric Sosman
Post by Keith Thompson
[...]
My real point is that the proposed rule permits the compiler to treat
``for(;;);'' as a lie, in the same way that it can treat a call to
printf via a pointer of type ``int(*)int'' as a lie.
We're not talking about the programmer lying to the compiler,
but about the compiler deliberately misunderstanding the programmer.
I agree with your point, but I'd say what we're talking about is the
potential of a programmer *unintentionally* lying to the compiler
because of a misunderstanding about the language the compiler and
programmer are using to communicate.

It's as if X told Y "The sky is blue", and Y responded that X is
lying because the ISO committee in change of standarding English
recently ruled that the word "blue" may be used in reference to
the color of the sky only if there are no visible clouds at all
(a rule that doesn't apply to other potentially blue objects,
such as those implemented by goto statements).

The real lie, of course, is my implicit claim that the above is a
reasonable analogy. 8-)}

[...]
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Phil Carmody
2010-05-11 05:01:48 UTC
Permalink
Raw Message
Post by Eric Sosman
Post by Keith Thompson
[...]
My real point is that the proposed rule permits the compiler to treat
``for(;;);'' as a lie, in the same way that it can treat a call to
printf via a pointer of type ``int(*)int'' as a lie.
We're not talking about the programmer lying to the compiler,
but about the compiler deliberately misunderstanding the programmer.
Well worded response.
Post by Eric Sosman
The C1x Charter says "Trust the programmer, as a goal, is
outdated," to which I can only respond "Trust the compiler, as
a practical policy, is outdated." In other words, Ugh.
Or in long-hand "find me another language, this one's beginning to
smell".

Phil
--
I find the easiest thing to do is to k/f myself and just troll away
-- David Melville on r.a.s.f1
James Kuyper
2010-05-09 23:07:13 UTC
Permalink
Raw Message
Post by Dan Henry
On Sun, 09 May 2010 12:19:17 -0400, James Kuyper
Post by James Kuyper
..
Post by Francis Glassborow
Post by Dan Henry
With this new rule, a lot of code will become broken.
Exactly, forever loops are very common when you get down to basics. If
ever the machine in front of me breaks out of the forever loop that
scans for input I have to restart my machine. Granted that those loops
are not always empty but as pointed out above, they are a reasonable
design solution.
Please note that such a loop would probably not be covered by the
proposed rule. Whatever method is used to scan for input would almost
"... performs no input/output operations, does not access volatile
objects, and performs no synchronization or atomic operations ...".
The case I am worried about is the one where after hardware
initialization, the program enters a (truly) empty infinite loop at
the bottom of main() leaving *all* processing to be performed by
interrupt handlers. Translating main(), the compiler is unaware that
operations are performed outside the empty loop. Would that loop be
eliminated such that main() could return?
Yes, it seems to me that the suggested change would allow elimination of
such a loop, and I agree that this is a problem.


However, it wasn't clear to me that Francis was talking about that kind
of empty loop. "scan for input" sounds more active to me than what
you've described. He might have been thinking about the same thing
you're referring to, which is why I used the words "probably" and
"almost", but it sounded more to me like he was talking about doing
non-blocking I/O, by calling a OS-specific function to determine whether
there was any input waiting to be processed. Such a loop would not be
covered by this change: it must still actually be executed.
Vincent Lefevre
2010-05-06 01:58:14 UTC
Permalink
Raw Message
Post by Keith Thompson
Sorry, I didn't read carefully enough. The proposed language
doesn't just give the compiler permission to remove the loop; it
says that it "may be assumed by the implementation to terminate".
I suppose that means that if the program violates this assumption
(by not terminating), then the behavior is undefined.
Just like the behavior is undefined *after* a program terminates
"normally" (but rather than saying that the behavior is undefined,
I'd say that this is no longer the scope of the C standard).
--
Vincent Lefèvre <***@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)
Keith Thompson
2010-05-06 02:20:33 UTC
Permalink
Raw Message
Post by Vincent Lefevre
Post by Keith Thompson
Sorry, I didn't read carefully enough. The proposed language
doesn't just give the compiler permission to remove the loop; it
says that it "may be assumed by the implementation to terminate".
I suppose that means that if the program violates this assumption
(by not terminating), then the behavior is undefined.
Just like the behavior is undefined *after* a program terminates
"normally" (but rather than saying that the behavior is undefined,
I'd say that this is no longer the scope of the C standard).
I don't think that's a good analogy. Keep in mind that this isn't
just about program termination; it's about the termination of a
(potentially) infinite loop within a program.
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Vincent Lefevre
2010-05-06 21:30:30 UTC
Permalink
Raw Message
Post by Keith Thompson
Post by Vincent Lefevre
Just like the behavior is undefined *after* a program terminates
"normally" (but rather than saying that the behavior is undefined,
I'd say that this is no longer the scope of the C standard).
I don't think that's a good analogy. Keep in mind that this isn't
just about program termination; it's about the termination of a
(potentially) infinite loop within a program.
But assuming the loop is infinite, how the program terminates would be
unspecified by the standard. In particular, the implementation would
be allowed to say that the behavior is like a second program starting
in the same environment, just as if the loop terminated in some way.
--
Vincent Lefèvre <***@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)
Keith Thompson
2010-05-06 02:38:41 UTC
Permalink
Raw Message
Keith Thompson <kst-***@mib.org> writes:
[...]
Post by Keith Thompson
Sorry, I didn't read carefully enough. The proposed language
doesn't just give the compiler permission to remove the loop; it
says that it "may be assumed by the implementation to terminate".
I suppose that means that if the program violates this assumption
(by not terminating), then the behavior is undefined.
Another way to state this is that it's a change to the meaning of
a loop.
By writing a loop, the programmer is asserting *either* that the
loop does something with visible behavior (performs I/O, accesses
volatile objects, or performs synchronization or atomic operations),
*or* that it eventually terminates. If I violate that assumption
by writing an infinite loop with no side effects, I am lying to the
compiler, and my program's behavior is undefined. (If I'm lucky,
the compiler might warn me about it.) The language doesn't provide
(no longer provides?) a way to express an infinite loop without
visible behavior.
I suggest that something similar to the previous paragraph is
a clearer way to express the idea than what's currently in the
draft. It expresses what the code means rather than giving the
implementation permission to do something to the code.
Even better would be an explicit statement that the behavior is
undefined. For example, replace this (from N1425):

An iteration statement that performs no input/output operations,
does not access volatile objects, and performs no synchronization
or atomic operations in its body, controlling expression,
or (in the case of a for statement) its expression-3, may be
assumed by the implementation to terminate.

by something like this:

If an iteration statement that performs no input/output
operations, does not access volatile objects, and performs no
synchronization or atomic operations in its body, controlling
expression, or (in the case of a for statement) its expression-3
never terminates, the behavior is undefined.

I'm not happy with the structure of that sentence. By the time I get to
the end of it I've almost forgotten how it started -- and I wrote it.

An iteration statement that performs no input/output operations,
does not access volatile objects, and performs no synchronization or
atomic operations in its body, controlling expression, or (in the
case of a for statement) its expression-3 does not terminate, is
called a _foo iteration statement_. If a foo iteration statement
never terminates, the behavior is undefined.

Obviously we need a better term than "foo".

Again, I think that a statement about the behavior of the program is
better than a statement about what the implementation is permitted to do
with the program. Explicit statements of undefined behavior are not
required (behavior that's not defined is undefined by omission), but it
doesn't hurt to be explicit.
Post by Keith Thompson
I'm still not at all convinced this is a good idea, but it's probably
not as obviously insane as I've been thinking it is.
And I think it's perfectly reasonable, given the current standard,
for a programmer to assume that an infinite loop never terminates.
There are valid arguments that this isn't the case, but plenty of
programmers either would reject such arguments or aren't even aware
of them.

Given the difficulty of proving that this isn't a quiet change,
I strongly suggest that it should be treated as one. That affects
both the decision whether to make the change, and the need for
programmer education if it is made.
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Phil Carmody
2010-05-07 08:11:06 UTC
Permalink
Raw Message
Post by Keith Thompson
[...]
Post by Keith Thompson
Sorry, I didn't read carefully enough. The proposed language
doesn't just give the compiler permission to remove the loop; it
says that it "may be assumed by the implementation to terminate".
I suppose that means that if the program violates this assumption
(by not terminating), then the behavior is undefined.
Another way to state this is that it's a change to the meaning of
a loop.
By writing a loop, the programmer is asserting *either* that the
loop does something with visible behavior (performs I/O, accesses
volatile objects, or performs synchronization or atomic operations),
*or* that it eventually terminates. If I violate that assumption
by writing an infinite loop with no side effects, I am lying to the
compiler, and my program's behavior is undefined. (If I'm lucky,
the compiler might warn me about it.) The language doesn't provide
(no longer provides?) a way to express an infinite loop without
visible behavior.
I suggest that something similar to the previous paragraph is
a clearer way to express the idea than what's currently in the
draft. It expresses what the code means rather than giving the
implementation permission to do something to the code.
Even better would be an explicit statement that the behavior is
An iteration statement that performs no input/output operations,
does not access volatile objects, and performs no synchronization
or atomic operations in its body, controlling expression,
or (in the case of a for statement) its expression-3, may be
assumed by the implementation to terminate.
If an iteration statement that performs no input/output
operations, does not access volatile objects, and performs no
synchronization or atomic operations in its body, controlling
expression, or (in the case of a for statement) its expression-3
never terminates, the behavior is undefined.
I'm not happy with the structure of that sentence. By the time I get to
the end of it I've almost forgotten how it started -- and I wrote it.
An iteration statement that performs no input/output operations,
does not access volatile objects, and performs no synchronization or
atomic operations in its body, controlling expression, or (in the
case of a for statement) its expression-3 does not terminate, is
called a _foo iteration statement_. If a foo iteration statement
never terminates, the behavior is undefined.
Obviously we need a better term than "foo".
You're chosing bikeshed colour. Which colour of broken-and-wrong
is best?
Post by Keith Thompson
Again, I think that a statement about the behavior of the program is
better than a statement about what the implementation is permitted to do
with the program. Explicit statements of undefined behavior are not
required (behavior that's not defined is undefined by omission), but it
doesn't hurt to be explicit.
Post by Keith Thompson
I'm still not at all convinced this is a good idea, but it's probably
not as obviously insane as I've been thinking it is.
And I think it's perfectly reasonable, given the current standard,
for a programmer to assume that an infinite loop never terminates.
There are valid arguments that this isn't the case, but plenty of
programmers either would reject such arguments or aren't even aware
of them.
Given the difficulty of proving that this isn't a quiet change,
I strongly suggest that it should be treated as one. That affects
both the decision whether to make the change, and the need for
programmer education if it is made.
Absolutely.

Phil
--
I find the easiest thing to do is to k/f myself and just troll away
-- David Melville on r.a.s.f1
Wojtek Lerch
2010-05-05 20:25:44 UTC
Permalink
Raw Message
Post by Keith Thompson
Post by John Regehr
— At sequence points, volatile objects are stable in the sense that
previous accesses are complete and subsequent accesses have not yet
occurred.
— At program termination, all data written into files shall be
identical to the result that execution of the program according to the
abstract semantics would have produced.
...
Post by Keith Thompson
Are you saying that since the second bullet applies only to
file contents *at program termination*, all bets are off for
non-terminating programs?
As far as the file contents go, yes. C permits buffered I/O. If your
program gets stuck in an infinite loop, buffered writes may never make it to
the disk.
Post by Keith Thompson
If so, that's a much stronger statement than the proposal to allow
possibly infinite loops to be removed. And adding permission
to remove such loops really changes nothing; the behavior is
undefined anyway, so the compiler could replace the loop with a
call to format_disk().
Hmm. I suppose the requirements regarding volatile objects and
interactive devices are still in place.
Right. It's not undefined behaviour -- only the contents of files are
unspecified.
William Ahern
2010-05-05 21:01:38 UTC
Permalink
Raw Message
<snip>
Post by Keith Thompson
Post by John Regehr
The question is: Is the middle bullet talking about termination in the
abstract or actual semantics? If the former, then the compiler seems
to be permitted to terminate infinite loops. If the latter, then the
compiler is not permitted to do so. That's my reading at least.
Are you saying that since the second bullet applies only to
file contents *at program termination*, all bets are off for
non-terminating programs?
The standard allows a seeming infinite loop to terminate the program. For
instance, the standard allows a divide-by-zero to "trap-and-terminate" via
SIGFPE. Thus:

for (unsigned char i = 1; i > -1; i++) {
int n = 1/i;
}

Of course, divide-by-zero causes undefined behavior, so it's a little
incongruous that the standard recognizes and condones SIGFPE and it's
termination behavior. Really divide-by-zero occupies a middle ground between
unspecified and undefined (i.e. one of the enumerated behaviors is
undefinedness), as in reality a lot of other things in the standard, such as
perhaps type-punning.

Developers rely on, and C derives strength from, this structural ambiguity.
But the proposed change now gives the compiler license to shoot you in the
head for doing things that have always been condoned, if disfavored.
Condoned yet disfavored because in choosing not to incorporate more
sophisticated typing and control flow mechanisms, it's understood that the
trade-off includes providing some guarantees for unorthodox usage,
epitomized by the "as-if" rule. By abrogating this rule the standard is
shredding it's foundational contract. It's giving license to the compiler to
make assumptions, leaving the developer in the lurch without the ability to
refute those assumptions with new control flow constructs--e.g. not reached
or "may reach" statement qualifiers.

Without the complementary constructs the committee is shirking its duty. The
end result will be less confidence in the standard by developers, making
dependence on extensions comparatively more appealing, and so while
intending to bring implementors into the fold paradoxically creating the
conditions for the opposite to happen.
Wojtek Lerch
2010-05-05 19:28:26 UTC
Permalink
Raw Message
Post by l***@siemens.com
An iteration statement that performs no input/output operations,
does not access volatile objects, and performs no synchronization
or atomic operations in its body, controlling expression,
or (in the case of a for statement) its expression-3, may be
assumed by the implementation to terminate.
I can think of several possible interpretations of the above -- it would be
good if the final standard made it clearer which of them, if any, is the
intended one:

#1 If, for a certain combination of program inputs, an iteration statement
described above loops forever (in the abstract machine), the behaviour is
undefined.

#2 If such an iteration statement loops forever, the implementation is free
to make it behave as if the controlling expression returned zero after an
arbitrary number of operations (even if it is a non-zero constant or some
other expression that cannot possibly ever return zero).

#3 If such an iteration statement loops forever, the implementation is free
to make it behave as if the controlling expression returned zero, or to
choose an arbitrary combination of "if" and "switch" statements containing a
"break" or "return" statement or a "goto" statement leading outside the
loop, and make them behave as if their controlling expressions returned
values that led to the execution of the "break" or "return" or "goto"
statement, even if some of them are constant expressions inconsistent with
such behaviour.

#4 Same as #3, but also add impossible "goto" statements leading to places
in the loop that make termination possible.

#5 Same as #4, but also allows executing "dead" statements such as the one
after a "break" or "return" statement.

#5 Same as #4, but also allow termination be reaching an unreachable "exit",
"raise", or "abort" call.

Example:

int fun( void ) {
int x = 0;
while ( 1 ) {
if ( 0 ) {
x = 1;
break;
ret2: return 2;
}
if ( 0 ) {
goto ret2;
return 3;
}
}
return x;
}

Presumably the function is allowed to return 0; but can it return 1? 2? 3?
Wojtek Lerch
2010-05-05 19:31:17 UTC
Permalink
Raw Message
Post by Wojtek Lerch
I can think of several possible interpretations of the above
Forgot about an important one:

#0 If, for a certain combination of program inputs, an iteration statement
described above loops forever (in the abstract machine), the implementation
is free to make it behave as if it called exit().
Wojtek Lerch
2010-05-05 19:44:34 UTC
Permalink
Raw Message
... after an arbitrary number of operations...
Um, *iterations*...
Vincent Lefevre
2010-05-06 01:38:00 UTC
Permalink
Raw Message
Post by Keith Thompson
In my opinion, it's not up in the air at all. The C standard
defines the semantics of iteration statements. It doesn't allow
non-terminating loops to terminate, any more than it allows
``if (0) puts("oops")'' to print "oops".
But does the time of the abstract machine need to correspond to the
physical time? What I mean is that an infinite time (in the abstract
machine) can mathematically be compressed into a finite time (in the
real world), and what is beyond this finite time is out of the scope
of the C standard. So,

while (1) { }
printf ("foo\n");

could actually be transformed into an empty statement with an
implementation-defined termination (at least in a freestanding
environment).
--
Vincent Lefèvre <***@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)
Richard Bos
2010-05-11 14:06:03 UTC
Permalink
Raw Message
Post by John Regehr
Hans Boehm explained the C++ committee's rationale to me as follows
- A variety of advanced optimizations are handicapped or broken by the
requirement to match the termination behavior of the abstract machine.
Those optimisations will just have to become more advanced, then.
Post by John Regehr
- It's hard to find real examples where this matters.
As amply proven in this thread, this is simply not true.
Post by John Regehr
- It's better for the standard to take a position than to leave this
up in the air, as it seems to be in C.
I'm astounded that Boehm seems to believe that the _current_ C Standard
leaves this undefined. Of course it does not.
Post by John Regehr
- Many compilers already have this behavior anyway.
I've never found one, and if I did, I'd delete it forthwith. I cannot
work with a compiler I cannot trust.

One half out of four - poor show.

Richard
Eric Sosman
2010-05-11 14:47:05 UTC
Permalink
Raw Message
Post by Richard Bos
Post by John Regehr
Hans Boehm explained the C++ committee's rationale to me as follows
- A variety of advanced optimizations are handicapped or broken by the
requirement to match the termination behavior of the abstract machine.
Those optimisations will just have to become more advanced, then.
Post by John Regehr
- It's hard to find real examples where this matters.
As amply proven in this thread, this is simply not true.
Post by John Regehr
- It's better for the standard to take a position than to leave this
up in the air, as it seems to be in C.
I'm astounded that Boehm seems to believe that the _current_ C Standard
leaves this undefined. Of course it does not.
Post by John Regehr
- Many compilers already have this behavior anyway.
I've never found one, and if I did, I'd delete it forthwith. I cannot
work with a compiler I cannot trust.
John Regehr has already reported five compilers (or six,
depending on how you count) that are broken in this way. You
might want to expunge from your system all of
Post by Richard Bos
[...] Microsoft (Visual Studio 2008 and 2010), LLVM/Clang 2.7,
Open64-x86 4.2.3, Sun CC 5.10, and Intel CC 11.1.072.
One half out of four - poor show.
He's earned at least one-point-five out of four, I think.
--
Eric Sosman
***@ieee-dot-org.invalid
Geoff
2010-05-12 04:32:28 UTC
Permalink
Raw Message
On Tue, 11 May 2010 10:47:05 -0400, Eric Sosman
Post by Eric Sosman
John Regehr has already reported five compilers (or six,
depending on how you count) that are broken in this way. You
might want to expunge from your system all of
Post by Richard Bos
[...] Microsoft (Visual Studio 2008 and 2010), LLVM/Clang 2.7,
Open64-x86 4.2.3, Sun CC 5.10, and Intel CC 11.1.072.
What settings did he use to prove VS 2008 does this?
______________________________________________________________

I created the following source:
______________________________________________________________

#include <stdio.h>

void f(void)
{
while(1);
}

int main(void)

{
for(;;){}
puts("Hello world!");
}
______________________________________________________________

The optimized code looks like this:
______________________________________________________________

; Listing generated by Microsoft (R) Optimizing Compiler Version
15.00.30729.01

TITLE c:\Documents and Settings\Geoff\My Documents\Visual
Studio 2008\Projects\inf_loop\inf_loop.cpp
.686P
.XMM
include listing.inc
.model flat

INCLUDELIB OLDNAMES

PUBLIC ??***@_0N@***@Hello?5world?$CB?$AA@ ; `string'
EXTRN @***@4:PROC
EXTRN __imp__puts:PROC
; COMDAT ??***@_0N@***@Hello?5world?$CB?$AA@
CONST SEGMENT
??***@_0N@***@Hello?5world?$CB?$AA@ DB 'Hello world!', 00H ;
`string'
CONST ENDS
PUBLIC _main
; Function compile flags: /Ogtpy
; File c:\documents and settings\geoff\my documents\visual studio
2008\projects\inf_loop\inf_loop.cpp
; COMDAT _main
_TEXT SEGMENT
_main PROC ; COMDAT

; 7 : {

$***@main:

; 8 : for(;;){}

00000 eb fe jmp SHORT $***@main
_main ENDP
_TEXT ENDS
END

______________________________________________________________

Clearly, the unreferenced f() was removed but the infinite loop in
main wasn't and the program never outputs the text. The linker removed
the unused text block from the executable image. The result was the
same for inf_loop.c.
______________________________________________________________
John Regehr
2010-05-12 15:20:42 UTC
Permalink
Raw Message
Post by Geoff
What settings did he use to prove VS 2008 does this?
Sorry, it was one of my students that did this, I don't have VS
running anywhere. He said optimization level was /Ox. Source code is
below, Here's the main emitted by VC2008 (2010 is basically the same):

PUBLIC _main
; Function compile flags: /Ogtpy
_TEXT SEGMENT
_main PROC
; Line 30
xor eax, eax
; Line 31
ret 0

Let me know if you get something different.

John Regehr




#include <stdio.h>

#define MAX 1000

void fermat (void)
{
int a=1,b=1,c=1;
while ((a*a*a) != ((b*b*b)+(c*c*c))) {
a++;
if (a>MAX) {
a=1;
b++;
}
if (b>MAX) {
b=1;
c++;
}
if (c>MAX) {
c=1;
}
}
}

int main (void)
{
fermat();
return 0;
}
Geoff
2010-05-12 23:45:28 UTC
Permalink
Raw Message
On Wed, 12 May 2010 08:20:42 -0700 (PDT), John Regehr
Post by John Regehr
Let me know if you get something different.
John Regehr
John,

Thanks for providing that information. I got the same results you did
at /Ox optimization using your source.

Applying the same optimization level to while(1) and for(;;)
statements yielded correct results (infinite loops) when applied to my
test code. It looks like the compiler is trusting the programmer in
those cases. :)

PUBLIC _main
; Function compile flags: /Ogtpy
; COMDAT _main
_TEXT SEGMENT
_main PROC ; COMDAT

; 7 : {

$***@main:

; 8 : for(;;){}

jmp SHORT $***@main
_main ENDP
_TEXT ENDS
END
_________________________________________________________

I then went back to your original 5/3 fermat2.c code and it calls the
fermat function. It looks like VC++ 2008 Microsoft (R) Optimizing
Compiler Version 15.00.30729.01 doesn't have the bug.
John Regehr
2010-05-13 19:30:32 UTC
Permalink
Raw Message
Post by Geoff
Applying the same optimization level to while(1) and for(;;)
statements yielded correct results (infinite loops) when applied to my
test code. It looks like the compiler is trusting the programmer in
those cases. :)
It's odd that compilers delete infinite loops that contain stuff, but
fail to delete empty infinite loops. I'm not an optimizer expert but
I'd have expected one pass to delete useless code and another pass to
delete empty loops, but it's apparently not quite that simple.
Post by Geoff
I then went back to your original 5/3 fermat2.c code and it calls the
fermat function. It looks like VC++ 2008 Microsoft (R) Optimizing
Compiler Version 15.00.30729.01 doesn't have the bug.
My observation was that the Microsoft C compiler will terminate an
infinite loop, but not if it contains multiple exit paths. On the
other hand, the other C compilers I mentioned will delete an infinite
loop even when it has multiple (dead of course) return paths,
effectively choosing one at random.

John
Hallvard B Furuseth
2010-05-13 21:42:32 UTC
Permalink
Raw Message
Post by John Regehr
It's odd that compilers delete infinite loops that contain stuff, but
fail to delete empty infinite loops.
Maybe due to the concerns expressed here. They'll forcibly terminate a
loop only if it looks like the loop is doing something other than just
wait. So if you write "for(;;);" they assume you mean what you say, but
with with "while (complicated) { more complicated; }" they assume the
complications are there to be optimized in any way as possible.

Also it seems to me this optimization is only useful if the compiler
cannot determine whether the loop will terminate, or it would take much
work to figure it out. If it notices that the loop won't terminate, it
has no reason to assume the programmer wants the loop terminated.
Post by John Regehr
Post by Geoff
I then went back to your original 5/3 fermat2.c code and it calls the
fermat function. It looks like VC++ 2008 Microsoft (R) Optimizing
Compiler Version 15.00.30729.01 doesn't have the bug.
My observation was that the Microsoft C compiler will terminate an
infinite loop, but not if it contains multiple exit paths.
Of course not. The compiler may assume the loop will terminate through
_some_ path, but it can't simply guess which path it would be.
Post by John Regehr
On the other hand, the other C compilers I mentioned will delete an
infinite loop even when it has multiple (dead of course) return paths,
effectively choosing one at random.
All paths out are dead: The compiler is allowed to just pick one.
Exactly one path possibly live (or the compiler can't eliminate it):
This optimization has to pick that path. More than one path and the
compiler can't determine which will be used: It has to let the program
run the loop and figure it out the old-fashioned way.
--
Hallvard
Wojtek Lerch
2010-05-15 03:39:28 UTC
Permalink
Raw Message
Post by John Regehr
It's odd that compilers delete infinite loops that contain stuff, but
fail to delete empty infinite loops. I'm not an optimizer expert but
I'd have expected one pass to delete useless code and another pass to
delete empty loops, but it's apparently not quite that simple.
An empty infinite loop is completely uninteresting to an optimizer: there's
nothing to optimize there. Obviously, the programmer intended for it to
loop forever.

A loop with complicated stuff in it is another story.

I'm not an optimizer expert either, but my understanding is that optimizers
generally don't work directly with high-level constructs such as iteration
statements, but disassemble your code into a graph and then optimize it by
transforming that graph.

Imagine that at some point the optimizer notices that the only way to reach
node B in the graph leads from node A through a complicated knot that
performs a lot of operations on local variables but doesn't touch any
volatiles or make any I/O calls. The next thing your optimizer notices is
that none of the local variables modified in the knot affects anything that
happens after the execution reaches node B.

The only thing that all those complicated operations can possibly affect is
whether node B is reached at all or not.

But the knot is too complicated for the optimizer to figure out whether B is
reached or not and what that depends on.

The choice the optimizer has now is to either generate code that actually
performs all those complicated operations, for the sole purpose of
determining whether B is ever reached or not; or to just "assume" that the
knot is not an infinite loop, and just throw it away. Most of the time, a
complicated loop that does nothing is *not* infinite -- a sane programer
that wants an infinite loop will keep it simple. A complicated loop that
does nothing useful is more likely a product of optimizing a complicated
loop that may or may not do something useful -- for instance, an inlined
function call whose result isn't checked, or whose arguments have known
values. If your compiler keeps complicated loops that do nothing useful,
just because they're too complicated to prove that they're not infinite, but
your competitor's compiler deletes them to make the code run faster, whose
compiler do you think people will like more?
Eric Sosman
2010-05-15 13:43:57 UTC
Permalink
Raw Message
[...] If your compiler keeps complicated
loops that do nothing useful, just because they're too complicated to
prove that they're not infinite, but your competitor's compiler deletes
them to make the code run faster, whose compiler do you think people
will like more?
The one that gets the wrong answer faster, of course.

The old joke goes "It's not a `supercomputer' unless it can
finish an infinite loop in under a minute." ISTM the optimizers'
attempts to get supercomputer performance out of every wristwatch
chip have gone a little too far.
--
Eric Sosman
***@ieee-dot-org.invalid
Keith Thompson
2010-05-11 16:53:07 UTC
Permalink
Raw Message
[...]
Post by Richard Bos
Post by John Regehr
- It's better for the standard to take a position than to leave this
up in the air, as it seems to be in C.
I'm astounded that Boehm seems to believe that the _current_ C Standard
leaves this undefined. Of course it does not.
[...]

As I understand it, the rationale for this is C99 5.1.2.3p5:

The least requirements on a conforming implementation are:

-- At sequence points, volatile objects are stable in the sense that
previous accesses are complete and subsequent accesses have not
yet occurred.

-- At program termination, all data written into files shall be
identical to the result that execution of the program according
to the abstract semantics would have produced.

-- The input and output dynamics of interactive devices shall take
place as specified in 7.19.3. The intent of these requirements is
that unbuffered or line-buffered output appear as soon as
possible, to ensure that prompting messages actually appear prior
to a program waiting for input.

These are forms of *program behavior* that are actually required to
occur. Any manipulations based on the as-if rule may not alter them.
Note that whether the program terminates or not is not listed as
a form of required behavior.

Given this program:

int main(void) {
while (1);
return 0;
}

one could argue that a generated executable that does nothing and hangs
forever and a generated executable that does nothing and terminates
satisfy the "least requirements" equally well. (The conclusion *I*
draw from this is that the "least requirements" are insufficient.)

Even given this program;

#include <stdio.h>
int main(void) {
while (1);
puts("OOPS!");
return 0;
}

one could, I suppose, argue that it's permitted to print "OOPS!".
The standard only requires file data to be consistent "At program
termination". Then again, the same argument could imply that it's
permitted to print "Gazinga!". (The new wording apparently makes
the behavior of the above program undefined, so printing "Gazinga!"
would be permitted.)

I have a harder time with this:

int main(void) {
volatile int v;
while (1);
v = 42;
return 0;
}

since the "least requirements" for volatile objects apply at each
sequence point, not just at program termination.

Please note that I'm attempting to summarize arguments I don't agree
with, that lead to conclusions that I dislike. I don't claim that
I'm presenting these arguments entirely accurately.

I've seen committee members here, when presented with problematic
statements in the standard, say that the standard should be read
with the assumption that it makes sense. If something in the
standard seems contradictory or ridiculous, one should consider
the obvious intent when interpreting the text. For example, we all
assume that ((void*)0) is a null pointer constant, even though the
standard doesn't say that a parenthesized NPC is an NPC (I don't
think a committee member has addressed that particular case, but
it's an example of what I'm talking about). But in my opinion,
this principle leads to the obvious conclusion that ``while (1);''
really is an infinite loop.

I'd love to hear from someone who sincerely thinks the proposed
change is a good idea (and I encourage everyone else to be kind).
It might also be interesting to see an explanation (in layman's
terms, if that's possible) of why keeping infinite loops infinite
is such a problem for optimizers.
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Wojtek Lerch
2010-05-11 18:45:39 UTC
Permalink
Raw Message
...
Post by Keith Thompson
-- At program termination, all data written into files shall be
identical to the result that execution of the program according
to the abstract semantics would have produced.
-- The input and output dynamics of interactive devices shall take
place as specified in 7.19.3. The intent of these requirements is
that unbuffered or line-buffered output appear as soon as
possible, to ensure that prompting messages actually appear prior
to a program waiting for input.
...
Post by Keith Thompson
Even given this program;
#include <stdio.h>
int main(void) {
while (1);
puts("OOPS!");
return 0;
}
one could, I suppose, argue that it's permitted to print "OOPS!".
The standard only requires file data to be consistent "At program
termination".
That applies when stdout is a "file"; but when it's an "interactive device",
the requirements are tighter.
Post by Keith Thompson
Then again, the same argument could imply that it's
permitted to print "Gazinga!".
If stdout is a file, then yes, the program is free to write arbitrary junk
to it as long as it makes sure that the file contains the expected data when
the program terminates, *if* it ever terminates.
Hallvard B Furuseth
2010-05-11 20:06:19 UTC
Permalink
Raw Message
Aha...
(...snippety...)
int main(void) {
volatile int v;
while (1);
v = 42;
return 0;
}
since the "least requirements" for volatile objects apply at each
sequence point, not just at program termination.
The optimization can break 5.1.2.3p5 at program termination, as another
component of the rationale seems to be "Signal handlers do not count":

#include <signal.h>
#include <stdlib.h>
volatile sig_atomic_t foo;

void done(int sig) { foo = sig; _Exit(0); }
int main(void) { signal(SIGTERM, done); for(;;); }
--
Hallvard
Michal Nazarewicz
2010-05-12 00:25:30 UTC
Permalink
Raw Message
Post by Keith Thompson
I'd love to hear from someone who sincerely thinks the proposed
change is a good idea (and I encourage everyone else to be kind).
It might also be interesting to see an explanation (in layman's
terms, if that's possible) of why keeping infinite loops infinite
is such a problem for optimizers.
Precisely. I'm still waiting for some example.

If thing there are two basic cases. One is there is at most one
possible (meaning that possible by looking at the code and not proving
otherwise) exit:

for(;;);

or:

for (unsigned i = 0; i % 2 == 0; i += 2);

and one where there are more possible exits:

for (unsigned i = 0; 1; i += 4) {
if (i % 4 == 1) return 1;
if (i % 4 == 2) return 2;
}

In the later case the assumption that the loop terminates has no real
meaning since compiler still does not know which exit path is the proper
one. So in the end, it can't optimise the loop.

The former case is another story -- if compiler assumes that loop
terminates and there is only one possible exit path then surely that's
the one compiler can choose and remove the whole loop.

However, I have the impression that this case is so distinct that if
present then it means programmer made it deliberately and so compiler
should not optimise this too much.


So what I mean is that the assumption is either useless (because
compiler does not know which exit path to choose) or breaks intentional
code.
--
Best regards, _ _
.o. | Liege of Serenly Enlightened Majesty of o' \,=./ `o
..o | Computer Science, Michal "mina86" Nazarewicz (o o)
ooo +--<mina86-tlen.pl>--<jid:mina86-jabber.org>--ooO--(_)--Ooo--
Hallvard B Furuseth
2010-05-13 21:58:58 UTC
Permalink
Raw Message
Post by Keith Thompson
I'd love to hear from someone who sincerely thinks the proposed
change is a good idea (and I encourage everyone else to be kind).
It might also be interesting to see an explanation (in layman's
terms, if that's possible) of why keeping infinite loops infinite
is such a problem for optimizers.
C++ templates, maybe? Your 2nd question for C++ translates to "why
can't the compiler solve the halting problem?" since templates are
a Turing-complete programming language evaluated at compile time.

I haven't kept track of modern C++ practice, but: As people keep
stuffing more advanced programs into templates, they will need C++
compilers to get more aggressive about optimiziong. Maybe by now this
has become a noticeable problem in C++.

But if that's the reason, it seems irrelevant for C. I notice that
several compilers mentioned in the thread are C/C++ compilers, where C
maybe got the optimization because the C++ compiler had it anyway,
rather than for a particular good "C" reason.
--
Hallvard
Hallvard B Furuseth
2010-05-14 15:01:44 UTC
Permalink
Raw Message
Post by Hallvard B Furuseth
Post by Keith Thompson
I'd love to hear from someone who sincerely thinks the proposed
change is a good idea (and I encourage everyone else to be kind).
(...)
C++ templates, maybe? Your 2nd question for C++ translates to "why
can't the compiler solve the halting problem?" since templates are
a Turing-complete programming language evaluated at compile time.
Eh. You didn't see me say that, typical late-night posting:(

Anyway, the point of templates being turing-complete here is more that
since they are a powerful programming language, they are being used as
such to _generate_ a program and can do so in very complex ways.

So they're being used to compose bits of code - one piece of behavior
(such as a loop terminating condition) from one template, part of the
loop from another, etc. There's no particular part of the program text
which might know a loop would be empty, just like a complex C macro
might not know that it'd evaluate to a compile-time constant. That's
what probably needs increasing levels of optimization as C++ template
programming gets more complex.
Post by Hallvard B Furuseth
I haven't kept track of modern C++ practice, but: As people keep
stuffing more advanced programs into templates, they will need C++
compilers to get more aggressive about optimiziong. Maybe by now this
has become a noticeable problem in C++.
But if that's the reason, it seems irrelevant for C. I notice that
several compilers mentioned in the thread are C/C++ compilers, where C
maybe got the optimization because the C++ compiler had it anyway,
rather than for a particular good "C" reason.
--
Hallvard
Walter Banks
2010-05-14 13:47:12 UTC
Permalink
Raw Message
Post by Keith Thompson
I'd love to hear from someone who sincerely thinks the proposed
change is a good idea (and I encourage everyone else to be kind).
It might also be interesting to see an explanation (in layman's
terms, if that's possible) of why keeping infinite loops infinite
is such a problem for optimizers.
Once a compiler doesn't assume that a loop will exit then
it isn't a problem. We also assume that our users have a
reason for writing what appears bad or inefficient but
syntactically correct code. Our compilers are primarily used
in non hosted embedded systems.

I agree with the implication in your comments that it should
not be a problem for compilers. Hosted compilers have to
deal with issues that I have no need to know. My customers
have applications that have a three line main in a tight loop that
never exits that may call a function or may be entirely event
driven. Hosted code may have OS considerations.

Regards,


Walter..
--
Walter Banks
Byte Craft Limited
http://www.bytecraft.com
s***@casperkitty.com
2015-10-08 20:52:11 UTC
Permalink
Raw Message
Post by Keith Thompson
int main(void) {
volatile int v;
while (1);
v = 42;
return 0;
}
That one could be rationalized non-astonishingly by a rule that specified
that the time required to perform an operation, even if infinite, is not
a side effect. Such a rule would allow a compiler is allowed to split
code execution onto multiple threads provided all side-effects and
dependencies are consistent with single-threaded execution.

If one of the threads handles the "while(1)" and the other thread handles everything else, all side-effects from both threads will complete before
the program terminates, so behavior would be consistent with single-threaded
behavior.

A more interesting case would be:

unsigned int q=0;
while(function_that_will_never_return_zero())
{
q++;
}
if (q)
return 1<<(32|q);
else
return 1/q;

Here there is a data dependency between the loop termination and the code
that follows. If the loop exits when q is zero then a division by zero
will occur, which a compiler may use as justification to do whatever it
likes. If the loop exits when q is one, then an over-sized left shift will
occur likewise justify arbitrary action by the compiler. Once execution
reaches the first line of the snippet above, no execution path will avoid
Undefined Behavior, but I would aver that under a good execution model the
data dependency should prevent a compiler from eliding the loop.

BTW, I find it odd that the Standard would require that code wishing to
avoid loop elision must access a volatile variable, rather than having
the standard also define a __DUMMY_SIDE_EFFECT() directive which a
compiler would be required to treat as though it performed a side-effect
but wouldn't have to actually have it generate any code. Having to add
useless code to prevent an optimizer from taking out useful code seems
contrary to the whole point of optimization.
Keith Thompson
2015-10-08 21:35:57 UTC
Permalink
Raw Message
Post by s***@casperkitty.com
Post by Keith Thompson
int main(void) {
volatile int v;
while (1);
v = 42;
return 0;
}
I apparently wrote that more than 5 years ago. I don't remember writing
it, and my server doesn't have a copy of the article.
Post by s***@casperkitty.com
That one could be rationalized non-astonishingly by a rule that specified
that the time required to perform an operation, even if infinite, is not
a side effect. Such a rule would allow a compiler is allowed to split
code execution onto multiple threads provided all side-effects and
dependencies are consistent with single-threaded execution.
I wouldn't call that non-astonishing.
Post by s***@casperkitty.com
If one of the threads handles the "while(1)" and the other thread
handles everything else, all side-effects from both threads will
complete before the program terminates, so behavior would be
consistent with single-threaded behavior.
I fail to see how that's consistent with single-threaded execution.
The straightforward interpretation of that code says that control
flow provably *cannot* reach the statement `v = 42;`. Splitting it
up into threads doesn't change that.

The discussion probably had something to do with N1570 6.8.5p6:

An iteration statement whose controlling expression is not a
constant expression, that performs no input/output operations,
does not access volatile objects, and performs no synchronization
or atomic operations in its body, controlling expression,
or (in the case of a for statement) its expression-3, may be
assumed by the implementation to terminate.

with a footnote:

This is intended to allow compiler transformations such as
removal of empty loops even when termination cannot be proven.

Since the controlling expression is a constant expression, 6.8.5p6
doesn't even apply.

In any case, I dislike that paragraph; I would have preferred that
compilers *not* be allowed to remove empty loops unless they can
prove termination. Apparently the committee was convinced that it
provides enough opportunities for optimization to make it worthwhile.
I particularly dislike the way it's phrased. It states that the
implementation may *assume* about the loop, but doesn't directly say
what it can do with that assumption. It should IMHO say something
about the permitted behavior of such a loop.

[...]
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
s***@casperkitty.com
2015-10-09 19:26:17 UTC
Permalink
Raw Message
Post by Keith Thompson
I apparently wrote that more than 5 years ago. I don't remember writing
it, and my server doesn't have a copy of the article.
I can repost if you like.
Post by Keith Thompson
Post by s***@casperkitty.com
That one could be rationalized non-astonishingly by a rule that specified
that the time required to perform an operation, even if infinite, is not
a side effect. Such a rule would allow a compiler is allowed to split
code execution onto multiple threads provided all side-effects and
dependencies are consistent with single-threaded execution.
I wouldn't call that non-astonishing.
I should perhaps have been more clear. In general, astonishment is a
consequence of code being executed according to rules which differ from
one's expectations. I would posit that the rules required for the behavior
not to be astonishing are both reasonable and useful. Consider the code:

int x(int y)
{
int r1 = slow_calculation_no_side_effects(y);
int r2 = not_quite_as_slow_maybe_side_effects(y);
if (r2)
return r2;
else
return r1;
}

In cases where r1 will complete without side effects, I would not consider
it astonishing for a compiler to rewrite the code as

int x(int y)
{
int r = not_quite_as_slow_maybe_side_effects(y);
if (r)
return r;
else
return slow_calculation_no_side_effects(y);
}

and on a multi-tasking system might like it if a compiler could start
evaluating both functions simultaneously (perhaps giving the first one
a low CPU priority), and then depending upon the result of the second
function either boosting the CPU priority for the first function, waiting
for it to complete and returning its result, or else abandoning all
efforts to complete the first function (and abandoning any work done thus
far) once it becomes clear that such work would never be used for anything.

While it might be somewhat surprising how quickly x() can finish in the
case where the second function returns a non-zero value, I would not regard
those optimizations as "astonishing" in the case where the first function
will eventually complete. Unfortunately, it's often difficult for compilers
to prove that a function will complete, and if compilers had to regard as a
side-effect the failure of a loop to terminate, that would greatly impair
their ability to optimize performance in cases where function do in fact
terminate but cannot be proven to do so.

If one adds a rule which says "The time required to complete a piece of
code, even if infinite, need not be regarded as a side-effect", then a
compiler would be allowed to make the above useful optimizations without
having to prove that the first function will in fact terminate. While I
think compilers should be required to indicate that they might perform
such optimizations, I think such optimizations have sufficient value that
they should not be forbidden by the C Standard.
Post by Keith Thompson
I fail to see how that's consistent with single-threaded execution.
The straightforward interpretation of that code says that control
flow provably *cannot* reach the statement `v = 42;`. Splitting it
up into threads doesn't change that.
In the absence of special permissions regarding endless loops, it would
be legitimate for compilers to defer the execution of loops which would
take 500 billion years to execute, but not those which could never complete
no matter how much time they were given. I would suggest that from a
practical perspective, in situations where a compiler can tell that a
piece of code will either produce a result nobody cares about or loop
forever, and where producing the result could take a very long time,
it will often be more helpful for the compiler to omit the code than
to generate it.
Post by Keith Thompson
An iteration statement whose controlling expression is not a
constant expression, that performs no input/output operations,
does not access volatile objects, and performs no synchronization
or atomic operations in its body, controlling expression,
or (in the case of a for statement) its expression-3, may be
assumed by the implementation to terminate.
This is intended to allow compiler transformations such as
removal of empty loops even when termination cannot be proven.
I'll grant that the fact that the controlling expression changes things,
but in general I think there is limited value in requiring that compilers
distinguish between loops which might take 500 billion years to execute
and those which may never finish.
Post by Keith Thompson
In any case, I dislike that paragraph; I would have preferred that
compilers *not* be allowed to remove empty loops unless they can
prove termination. Apparently the committee was convinced that it
provides enough opportunities for optimization to make it worthwhile.
I particularly dislike the way it's phrased. It states that the
implementation may *assume* about the loop, but doesn't directly say
what it can do with that assumption. It should IMHO say something
about the permitted behavior of such a loop.
Agree 100%.

BTW, I would suggest that there's another behavior which should be allowed,
and which should not be astonishing: an implementation should be allowed to
document that at any time after it has determined, via whatever means, that
execution has reached a state such that no side-effects can possibly occur
in the absence of a trap, it may generate a trap to get execution out of
that state.

Because the existing lexicon of terms does not have any other way to
describe trapping behaviors, the only way to allow such functionality *within
the existing lexicon* is to say that endless loops have Undefined Behavior.
Alas, some people seem to think that Undefined Behavior should be taken not
as an invitation to add documented traps, but rather as an invitation to
behave as though that any condition which must be true to avoid Undefined
Behavior, is true. While that can be particularly problematic with regard
to endless loops, I'd consider the general problem much broader.

What's needed is a category of behavior where implementations would be
allowed to have traps for certain things, but must document the existence
of any such traps and a means by which they could be prevented from causing
"totally" undefined behavior. It would not be necessary that the trap's
behavior be predictable in any particularly *useful* way (e.g. it would
be acceptable for an implementation to document that it would fill all
memory with unspecified content and then jump to an unspecified address)
but most implementations' authors would probably at least try to be helpful.
Eric Sosman
2010-05-04 17:28:57 UTC
Permalink
Raw Message
[... "optimization" finitizes an infinite loop ...]
That kind of optimization is apparently becomming more common. The
An iteration statement that performs no input/output operations,
does not access volatile objects, and performs no
synchronization or atomic operations in its body, controlling
expression, or (in the case of a for statement) its expression-3,
may be assumed by the implementation to terminate.<footnote>
<footnote> This is intended to allow compiler transformations
such as removal of empty loops even when termination cannot be
proven.
I've only two things to say: (1) "Ugh," which I've said
before, and (2) "Time for the torchlight parade; bring your
own pitchfork."
--
Eric Sosman
***@ieee-dot-org.invalid
Dag-Erling Smørgrav
2010-05-04 17:33:01 UTC
Permalink
Raw Message
Post by l***@siemens.com
That kind of optimization is apparently becomming more common. The
An iteration statement that performs no input/output operations,
does not access volatile objects, and performs no
synchronization or atomic operations in its body, controlling
expression, or (in the case of a for statement) its expression-3,
may be assumed by the implementation to terminate.<footnote>
<footnote> This is intended to allow compiler transformations
such as removal of empty loops even when termination cannot be
proven.
I appreciate the sentiment - loops are notoriously hard to prove even
for humans, and provably unprovable in the general case - but this is a
radical departure from the as-if rule. I can't think of a case where
applying this rule would break *useful* code, but it still feels wrong.

DES
--
Dag-Erling Smørgrav - ***@des.no
Eric Sosman
2010-05-04 18:03:00 UTC
Permalink
Raw Message
Post by Dag-Erling Smørgrav
Post by l***@siemens.com
That kind of optimization is apparently becomming more common. The
An iteration statement that performs no input/output operations,
does not access volatile objects, and performs no
synchronization or atomic operations in its body, controlling
expression, or (in the case of a for statement) its expression-3,
may be assumed by the implementation to terminate.<footnote>
<footnote> This is intended to allow compiler transformations
such as removal of empty loops even when termination cannot be
proven.
I appreciate the sentiment - loops are notoriously hard to prove even
for humans, and provably unprovable in the general case - but this is a
radical departure from the as-if rule. I can't think of a case where
applying this rule would break *useful* code, but it still feels wrong.
In a free-standing setting, perhaps something like

void kernel_panic(int code) {
stop_devices_abruptly();
display_panic_code(code);
for (;;) {
/* This would be a good place for a HALT, if the
* CPU's instruction set included such a thing.
*/
}
}

... would count as "useful." I for one would be unhappy if a
call to kernel_panic() returned.
--
Eric Sosman
***@ieee-dot-org.invalid
Antoine Leca
2010-05-06 09:23:41 UTC
Permalink
Raw Message
Post by Eric Sosman
In a free-standing setting, perhaps something like
void kernel_panic(int code) {
stop_devices_abruptly();
display_panic_code(code);
for (;;) {
/* This would be a good place for a HALT, if the
* CPU's instruction set included such a thing.
*/
}
}
... would count as "useful." I for one would be unhappy if a
call to kernel_panic() returned.
I believe this case is intended to be covered by
An iteration statement that performs no input/output operations,
^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^
(full text with context within <msgid:cos5b7-***@jones.homeip.net>)


However I won't argue with you this is a very dangerous path to open,
since what would constitute an I/O operation, and whether a HALT cpu
instruction counts as such or not, is something the implementation
should be responsible of; and even if it is said to be "implementation-
defined" (which the Standard should certainly state) hence described in
the accompagnying documentation, I am affraid many people will fail to
read that, and will be caught "unhappy", as Eric said.


Antoine
John Regehr
2010-05-04 18:05:25 UTC
Permalink
Raw Message
radical departure from the as-if rule.  I can't think of a case where
applying this rule would break *useful* code, but it still feels wrong.
It's a corner case for sure. However...

In embedded software one occasionally writes a deliberate infinite
loop. For example to hang up the CPU if main() returns, or to hang
the processor waiting for the watchdog timer to reboot the system. To
stop the compiler from deleting these loops, people are going to have
to put in a side-effecting operation such as an access to a volatile-
qualified object.

Deleting infinite loops also seems like an impediment to debugging.
If I accidentally write an infinite loop, I'd prefer my program to
hang so I can use a debugger to find the problem. If the compiler
deletes the loop and also computes a nonsensical result, as in the
Fermat example, I have no way to find the latent error in my system.

John Regehr
Michal Nazarewicz
2010-05-04 20:32:25 UTC
Permalink
Raw Message
Post by John Regehr
radical departure from the as-if rule.  I can't think of a case where
applying this rule would break *useful* code, but it still feels wrong.
Part of Linux 2.4.0:

NORET_TYPE void panic(const char * fmt, ...)
{
/* ... */
for(;;) {
CHECK_EMERGENCY_SYNC
/* the above macro may be empty */
}
}

I admit, pretty old, but hey! Besides, didn't check versions between
2.4.0 and 2.6.12-rc2 so this code may be present even in newer
versions.

Suffice to say, the wording of the draft would allow a compiler to break
a rather widely used application (and that's just a single example).
Post by John Regehr
In embedded software one occasionally writes a deliberate infinite
loop. For example to hang up the CPU if main() returns, or to hang
the processor waiting for the watchdog timer to reboot the system. To
stop the compiler from deleting these loops, people are going to have
to put in a side-effecting operation such as an access to a volatile-
qualified object.
Either that or people will include some inline assembler. In fact,
I think there was this comment that for gcc the following is enough:

for(;;) asm volatile("":::"memory");
--
Best regards, _ _
.o. | Liege of Serenly Enlightened Majesty of o' \,=./ `o
..o | Computer Science, Michal "mina86" Nazarewicz (o o)
ooo +--<mina86-tlen.pl>--<jid:mina86-jabber.org>--ooO--(_)--Ooo--
John Regehr
2010-05-04 18:05:31 UTC
Permalink
Raw Message
radical departure from the as-if rule.  I can't think of a case where
applying this rule would break *useful* code, but it still feels wrong.
It's a corner case for sure. However...

In embedded software one occasionally writes a deliberate infinite
loop. For example to hang up the CPU if main() returns, or to hang
the processor waiting for the watchdog timer to reboot the system. To
stop the compiler from deleting these loops, people are going to have
to put in a side-effecting operation such as an access to a volatile-
qualified object.

Deleting infinite loops also seems like an impediment to debugging.
If I accidentally write an infinite loop, I'd prefer my program to
hang so I can use a debugger to find the problem. If the compiler
deletes the loop and also computes a nonsensical result, as in the
Fermat example, I have no way to find the latent error in my system.

John Regehr
John Regehr
2010-05-04 18:05:44 UTC
Permalink
Raw Message
radical departure from the as-if rule.  I can't think of a case where
applying this rule would break *useful* code, but it still feels wrong.
It's a corner case for sure. However...

In embedded software one occasionally writes a deliberate infinite
loop. For example to hang up the CPU if main() returns, or to hang
the processor waiting for the watchdog timer to reboot the system. To
stop the compiler from deleting these loops, people are going to have
to put in a side-effecting operation such as an access to a volatile-
qualified object.

Deleting infinite loops also seems like an impediment to debugging.
If I accidentally write an infinite loop, I'd prefer my program to
hang so I can use a debugger to find the problem. If the compiler
deletes the loop and also computes a nonsensical result, as in the
Fermat example, I have no way to find the latent error in my system.

John Regehr
Wojtek Lerch
2010-05-04 18:44:31 UTC
Permalink
Raw Message
Post by l***@siemens.com
That kind of optimization is apparently becomming more common. The
An iteration statement that performs no input/output operations,
does not access volatile objects, and performs no
synchronization or atomic operations in its body, controlling
expression, or (in the case of a for statement) its expression-3,
may be assumed by the implementation to terminate.<footnote>
Even if the controllong expression is a constant "1" and there's no "break"
or "return" inside?

Ugh.

I hope there's no plan to also remove goto from C1X?

Or to extend the above to also apply to loops made with goto?
James Kuyper
2010-05-05 12:33:04 UTC
Permalink
Raw Message
***@siemens.com wrote:
...
Post by l***@siemens.com
That kind of optimization is apparently becomming more common. The
An iteration statement that performs no input/output operations,
does not access volatile objects, and performs no
synchronization or atomic operations in its body, controlling
expression, or (in the case of a for statement) its expression-3,
may be assumed by the implementation to terminate.<footnote>
<footnote> This is intended to allow compiler transformations
such as removal of empty loops even when termination cannot be
proven.
I could not imagine code qualifying for such removal as being correct
(which, given the examples other people have provided from the kernal
and embedded programming world, simply proves my lack of imagination).
Therefore, I would always want it to execute in accordance to the way it
was actually written, which would at least allow it's unintended
behavior to be debugged.

If an implementation can determine that a particular loop can never
terminate (which, of course, it cannot do in the general case), it could
generate a warning message, and I'd be happy to see that, because in my
code it would almost always be the result of a coding error. However,
given the examples other people have provided, rejecting such code or
specifying that it's behavior is undefined would not be acceptable.

I wouldn't mind if the compiler also gave a warning even when it
couldn't prove non-termination, so long as it was also having an unusual
amount of trouble proving that the loop can terminate. This warning,
however, should definitely be one that can be turned off.

Since these warnings cannot be made mandatory, and must not prevent
compilation of the code according to it's specified semantics, saying
anything about such warnings falls outside the scope of the standard. It
should simply remain silent on this issue.

That is my opinion, at least until I see a more detailed and compelling
argument that the corresponding optimizations are sufficiently useful
and beneficial to make up for these disadvantages.
Richard Bos
2010-05-11 14:06:02 UTC
Permalink
Raw Message
Post by l***@siemens.com
That kind of optimization is apparently becomming more common. The
An iteration statement that performs no input/output operations,
does not access volatile objects, and performs no
synchronization or atomic operations in its body, controlling
expression, or (in the case of a for statement) its expression-3,
may be assumed by the implementation to terminate.<footnote>
<footnote> This is intended to allow compiler transformations
such as removal of empty loops even when termination cannot be
proven.
Let me just add my voice to the chorus: this is _EVIL_. And not in the
street talk meaning of the word, but in the Vade Retro meaning.

Richard
Dag-Erling Smørgrav
2010-05-04 08:33:22 UTC
Permalink
Raw Message
Post by John Regehr
Consider this example, where "icc" is the current version of Intel's
Fermat's Last Theorem has been disproved.
What happens if you move a, b, c out of fermat() and inspect them in
main() if / when fermat() returns? Does the program behave differently?
I still have a very hard time believing any compiler would do this; my
money is on some sort of overflow / wraparound.

DES
--
Dag-Erling Smørgrav - ***@des.no
John Regehr
2010-05-04 16:08:57 UTC
Permalink
Raw Message
Post by Dag-Erling Smørgrav
What happens if you move a, b, c out of fermat() and inspect them in
main() if / when fermat() returns?  Does the program behave differently?
For icc and suncc, moving any of a, b, or c out of function scope
causes the emitted program to hang.
Post by Dag-Erling Smørgrav
I still have a very hard time believing any compiler would do this; my
money is on some sort of overflow / wraparound.
I don't think so, but am happy to be corrected. All of these
compilers are easy to obtain.

Everything I've seen is consistent with C implementations that prefer
performance over correctness.

John Regehr
Michal Nazarewicz
2010-06-18 17:10:02 UTC
Permalink
Raw Message
I've been recently wondering about the whole infinite loop optimisation
problem and came to conclusion that maybe after all it makes some
sense. Not much, but a little. Consider some a bit silly code:

int get_number_category(unsigned n) {
for (unsigned i = 0;; ++i) {
if (hard_to_disprove(n, i)) return 0;
if (easy_to_disprove_1(n, i)) return 1;
if (easy_to_disprove_2(n, i)) return 2;
}
}

A function that categorises a number into four categories. It's a bit
unrealistic maybe but shows my point.

Now, if compiler were able to prove that easy_to_disprove_1 and
easy_to_disprove_2 will always be false, then it is left with:

int get_number_category(unsigned n) {
for (unsigned i = 0;; ++i) {
if (hard_to_disprove(n, i)) return 0;
}
}

at witch point, if it were allowed to assume loop will finish it could
change the whole function call to simple “return 0”.

I think a situation where there are several exit paths of with all but
one can be easily disproved could be the motivation for allowing to
assume loops terminate.
--
Best regards, _ _
.o. | Liege of Serenly Enlightened Majesty of o' \,=./ `o
..o | Computer Science, Michal "mina86" Nazarewicz (o o)
ooo +--<mina86-tlen.pl>--<jid:mina86-jabber.org>--ooO--(_)--Ooo--
Keith Thompson
2010-06-18 17:54:17 UTC
Permalink
Raw Message
Post by Michal Nazarewicz
I've been recently wondering about the whole infinite loop optimisation
problem and came to conclusion that maybe after all it makes some
int get_number_category(unsigned n) {
for (unsigned i = 0;; ++i) {
if (hard_to_disprove(n, i)) return 0;
if (easy_to_disprove_1(n, i)) return 1;
if (easy_to_disprove_2(n, i)) return 2;
}
}
A function that categorises a number into four categories. It's a bit
unrealistic maybe but shows my point.
Do you mean three categories, or does never returning from the function
constitute a fourth category?
Post by Michal Nazarewicz
Now, if compiler were able to prove that easy_to_disprove_1 and
int get_number_category(unsigned n) {
for (unsigned i = 0;; ++i) {
if (hard_to_disprove(n, i)) return 0;
}
}
at witch point, if it were allowed to assume loop will finish it could
change the whole function call to simple “return 0”.
I think a situation where there are several exit paths of with all but
one can be easily disproved could be the motivation for allowing to
assume loops terminate.
Hmm. I'm still not convinced. The proposed optimization transforms the
behavior of get_number_category() from

If we detect that n is in category 0, return 0, otherwise keep
trying.

to

Say than n is in category 0 whether it is or not.

I wouldn't claim that the original behavior is necessarily good
design, but it's not 100% obviously wrong. Maybe treating n as if
it were in category 0 when it really isn't would have some horribly
bad consequences, and it's better to keep testing forever than to
terminate the loop and continue with bad information.

If a compiler is able to do the analysis necessary to convert
the for loop into something that's guaranteed to terminate, it
can issue a warning (and generate the code the user asked for).
With the proposed change to the standard, a compiler could *silently*
change the code; that is, IMHO, unacceptable.
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Michal Nazarewicz
2010-06-25 12:11:08 UTC
Permalink
Raw Message
Post by Keith Thompson
Post by Michal Nazarewicz
I've been recently wondering about the whole infinite loop optimisation
problem and came to conclusion that maybe after all it makes some
int get_number_category(unsigned n) {
for (unsigned i = 0;; ++i) {
if (hard_to_disprove(n, i)) return 0;
if (easy_to_disprove_1(n, i)) return 1;
if (easy_to_disprove_2(n, i)) return 2;
}
}
A function that categorises a number into four categories. It's a bit
unrealistic maybe but shows my point.
Do you mean three categories, or does never returning from the function
constitute a fourth category?
I realised that I had something different in mind. Consider the
following instead:

int get_number_category(unsigned n) {
for (unsigned i = 0;; ++i) {
if (hard_to_prove_but_true_for_some_i(n, i)) return 0;
if (easy_to_disprove_1(n, i)) return 1;
if (easy_to_disprove_2(n, i)) return 2;
}
return 0;
}

So the point is that a function categorises a number into one of few
categories and for some n it may be trivial to disprove all but one
condition.
--
Best regards, _ _
.o. | Liege of Serenly Enlightened Majesty of o' \,=./ `o
..o | Computer Science, Michal "mina86" Nazarewicz (o o)
ooo +--<mina86-tlen.pl>--<jid:mina86-jabber.org>--ooO--(_)--Ooo--
Loading...