Discussion:
Why was the `static` keyword so important to include inside array declarator?
s***@gmail.com
2016-06-12 18:51:47 UTC
What I'm referring to is the confusing new syntax which allows constructs like this:

void f(int p[static 6]) { }

Not only is the declaration of a parameter as "array of type" confusing enough, but now we have decided to bloat it with additional keywords and new meanings with the intent of allowing some fishy "optimizations".

First, if those optimizations were so important, why limit ourselves to array parameters? Why don't we allow similar information to be added for all other declarations?

Though to be honest I don't think any new syntax is needed anyway, because we already have a similar construct in the language called "array types". If we wanted to restrict the passed array to a specific number of elements, we could instead declare the parameter as "pointer to array", which would not only provide the compiler with usable information for possible optimizations but would also provide static type information which could be used to diagnose function calls with incompatible argument types. I.e.:

void f(int (*p)[6]) { }

Of course this solution won't be available when compatibility is needed with old code calling a function declared with a parameter of type "pointer to T", where T is the type of the elements of the passed array.

In this case, though, there is still a way to give the compiler similar information by simply casting the received pointer of type "pointer to T" to "pointer to array of N elements of type T" inside the function, where N is a constant expression. For example:

// old version: no size information about the pointed-to array is available to the optimizer
void f(int p[6]) { }

// size information about the pointed-to array can be retrieved from the array from which 'p' is updated
void f(int *p) { p = *((int (*)[6])p); }

Now, as long as you don't make any copies of p before the assignment, the compiler will presumably know the size of the array whose first element p points to.

I know that the code above could produce UB, but chances are that conversion from "pointer to T" to "pointer to array of T" will be legal on all implementations, as arrays are so limited in C anyway.

Though my biggest question here is why weren't array parameters made obsolescent considering their confusing and inconsistent behavior (compared to other array declarations).
s***@casperkitty.com
2016-06-12 19:27:50 UTC
Post by s***@gmail.com
void f(int p[6]) { } // old version: no size information about the pointed-to array is available to the optimizer
There are a few differences between an array parameter with a static
size versus a pointer-to-array parameter:

1. A compiler would be entitled to pre-fetch data from the first without
having to validate that it is non-null, even if code would refrain from
dereferencing the pointer if it were null.

2. The syntax to access a member of a pointer-to-array-of-int type is
somewhat awkward compared with the syntax to access a member of a
normal array type.

3. An array parameter with a static size may be used to access an
arbitrary slice of any size array, provided the entire slice fits
within the array in question. I don't think it's clear exactly
what is or is not permitted when using pointer-to-array types, but
compilers are likely to assume that if "p" and "q" are both variables
of "pointer to array of int" type, then (*p)[i] and (*q)[j] cannot
possibly alias when "i" and "j" are unequal, but they would not be
able to make such an assumption when using parameters with normal
array types and static sizes.
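
The slice point in item 3 can be sketched as follows. This is only an illustration: the names `sum3` and `best_window` are made up for the example, and the sketch says nothing about what any particular optimizer actually does with the `static 3` promise.

```c
#include <stddef.h>

/* Callers promise that w points to at least 3 accessible ints.
   The parameter type is still plain `const int *`. */
static int sum3(const int w[static 3])
{
    return w[0] + w[1] + w[2];
}

/* Returns the start index of the 3-element window with the largest sum.
   Requires n >= 3. Every call to sum3 passes a slice of the larger
   array, which is fine as long as the whole slice lies inside it. */
size_t best_window(const int *a, size_t n)
{
    size_t best = 0;
    for (size_t i = 1; i + 3 <= n; i++) {
        if (sum3(&a[i]) > sum3(&a[best]))
            best = i;
    }
    return best;
}
```

With a pointer-to-array parameter (`int (*w)[3]`), the same calls would need a cast from `int *` to `int (*)[3]` at every call site, which is the part whose validity is unclear.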

I do think the design of array declarations in prototypes was a fundamental
and needless mistake. While it would be rare that one would want to pass
arrays by value, there would have been no need to treat arrays specially
in function prototypes if there were a rule that functions which take arrays
by value could *only* be called with a prototype in scope. The value of
having "int foo(int bar[5]);" be synonymous with "int foo(int *bar);" seems
rather limited, compared with the value of having all parameter types work
consistently.
Keith Thompson
2016-06-12 20:29:56 UTC
Post by s***@gmail.com
void f(int p[static 6]) { }
It's hardly new. It was introduced in the 1999 ISO C standard, 17 years
ago. It's explained in the C99 Rationale
http://www.open-std.org/jtc1/sc22/wg14/www/C99RationaleV5.10.pdf
section 6.7.5.2.
Post by s***@gmail.com
Not only that declaration of a parameter as ‘‘array of type’’ is
confusing enough but now we decided to bloat it with additional
keywords and new meanings with the intend of allowing some fishy
"optimizations".
Optimizations are the point. I don't know why you feel the need to put
that word in scare quotes.
Post by s***@gmail.com
First if those optimizations were so important why limit ourselves
with only array parameters - why don't we allow similar information to
be added for all other declarations.
Because C doesn't have array parameters. What looks like an array
parameter declaration actually declares a pointer parameter.
Using array syntax provided an opportunity to provide extra
information about the array object that the pointer points to
(more precisely, the pointer points to an element of the array).

When you define an array *object*, you have to specify the actual size.
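
A small sketch of that adjustment, assuming a C11 compiler (for `_Generic`). The function names are invented for the illustration. Taking the address avoids array-to-pointer decay, so the type of `&p` reveals what `p` really is.

```c
/* Returns 1 if a parameter written as `int p[6]` really has type `int *`. */
int param_is_pointer(int p[6])
{
    /* p was adjusted to `int *`, so &p has type `int **`. */
    return _Generic(&p, int **: 1, default: 0);
}

/* Returns 1 if a genuine array object keeps its array type and size. */
int object_is_array(void)
{
    int a[6] = {0};
    /* a is a real array object, so &a has type `int (*)[6]`. */
    return _Generic(&a, int (*)[6]: 1, default: 0);
}
```

Both functions return 1 on a conforming C11 implementation: the `6` in the parameter declaration does not survive into the parameter's type, while the `6` in the object definition does.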
Post by s***@gmail.com
Though to be honest I don't think any new syntax is needed anyway
because we already have similar construct in the language called
"array types". If we wanted to restrict the passed array as of
specific number of elements we could instead declare the parameter as
"pointer to array" which will not only provide the compiler with
usable information for possible optimizations but it'll also provide
usable static type information which could be used to diagnose
Sure, you can define a parameter of type "pointer to array" -- but that
specifies the *exact* size of the array. The C99 "static" mechanism
specifies that the array object has *at least* the specified number of
elements.
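
To make the "exact" versus "at least" distinction concrete, here is a minimal sketch; `sum_at_least6` and `sum_exactly6` are hypothetical names chosen for the example.

```c
#include <stddef.h>

/* `static 6`: callers promise at least 6 accessible elements.
   Longer arrays are acceptable arguments. */
int sum_at_least6(const int p[static 6])
{
    int total = 0;
    for (size_t i = 0; i < 6; i++)
        total += p[i];
    return total;
}

/* Pointer to array: the argument must have type `int (*)[6]`,
   so the pointed-to array type has exactly 6 elements. */
int sum_exactly6(int (*p)[6])
{
    int total = 0;
    for (size_t i = 0; i < 6; i++)
        total += (*p)[i];
    return total;
}
```

Given `int six[6]` and `int ten[10]`, the calls `sum_at_least6(six)`, `sum_at_least6(ten)` and `sum_exactly6(&six)` are all fine, whereas `sum_exactly6(&ten)` passes an `int (*)[10]` where an `int (*)[6]` is required, which a conforming compiler has to diagnose.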

[...]
Post by s***@gmail.com
Though my biggest question here is why weren't array parameters made
obsolescent considering their confusing and inconsistent behavior
(compared to other array declarations).
Array parameters are nonexistent. I agree that pointer parameters
defined with array syntax are confusing. I personally wouldn't mind
if they were made obsolescent, but there's a valid argument that
they're meaningful to the human reader even if not to the compiler.
For example, if you write:

void func(int param[]);

it suggests that function will treat the parameter as a pointer
to an element of an array. And of course removing array parameter
syntax would have lost the opportunity to add the "static" keyword
to enable optimizations.

The reuse of the "static" keyword with a distinct meaning has
been justly criticized, but reusing a keyword avoided breaking
existing code. And we've had 17 years to get used to it.
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
James Kuyper
2016-06-13 03:11:33 UTC
Post by s***@gmail.com
void f(int p[static 6]) { }
Keith has answered most of your points already, I'll just add my two cents:
Post by s***@gmail.com
First if those optimizations were so important why limit ourselves
with only array parameters - why don't we allow similar information
to be added for all other declarations.
Please feel free to identify what other information you would consider
"similar". The information provided by this syntax is the minimum length
of the accessible part of the array pointed at by the pointer p.
Offhand, I can't think of any other piece of information that could be
declared, which is at all similar to this piece of information, that
isn't already declarable.

...
Post by s***@gmail.com
Though my biggest question here is why weren't array parameters made
obsolescent considering their confusing and inconsistent behavior
(compared to other array declarations).
Because use and reliance upon array notation to declare pointer
parameters is extremely common. Such a change would break a very large
fraction of all the C code in the world. I'm not sure whether it's more
than half of all C code, but I wouldn't be surprised if it were.
s***@casperkitty.com
2016-06-13 14:54:31 UTC
Post by James Kuyper
Please feel free to identify what other information you would consider
"similar". The information provided by this syntax is the minimum length
of the accessible part of the array pointed at by the pointer p.
Offhand, I can't think of any other piece of information that could be
declared, which is at all similar to this piece of information, that
isn't already declarable.
How about:

1. After a pointer value is assigned, a compiler may prefetch up to N items
using it [same meaning as static size when used in parameter, except
that there's no nice syntax for a pointer having such traits].

2. A pointer may be regarded as identifying the start of an array of unknown
size and will not alias any other such pointer *unless* the two pointers
are equal.

3. Neither pointer X *nor any pointer derived from it* will ever be used to
modify an object [so an object which is not exposed to outside code
except via pointer x can be assumed not to be modifiable by outside
code]. Not sure the best syntax, since "const" exists with a looser
meaning.

4. A particular object will never (again) be modified within its lifetime
and so any value that is ever read from it may be cached indefinitely.
As above, I'm not sure of the best syntax.

5. A variable will never be accessed via pointer except within the
context of a function that is directly passed its address, and will
never be accessed by name in such a context (IMHO, the register keyword
could be very useful if extended with this meaning).

6. A function will not persist any copy of a particular pointer that it
receives as a parameter (applying a "register" qualifier to the pointer's
target type could indicate that).

All of those could enable clearly-useful optimizations, but I don't know of
any way to express any of them within C.
s***@gmail.com
2016-06-13 16:17:00 UTC
OK, I'm starting to see things clearly now: since ANSI introduced function prototypes with the same confusing array-parameter adjustment as in the existing function declarators (which of course could have been avoided), they now don't want to admit their mistake. It's not that easy to admit that you were wrong. I'm also wondering whether this confusing syntax would be seen less often if they had admitted the mistake and made the construct obsolescent in time. Now I can guess their policy is that it's perfectly normal and expected syntax, not at all confusing, and has been part of the language all along.

Anyway, the exact piece of code (or any other with different constant-expression):

void f(int p[static 6]) { /* code using p */ }

Can be replaced with:

void f(int *p) { p = *((int (*)[6])p); /* code using p */ }


Which gives away the same information to the compiler as the new form. This construct could also be used for other declarations as well, though I don't know how useful it could be but at least it's an option if needed:

void f1(int n){

int *p = *((int (*)[6])malloc(sizeof(int[6]) + sizeof(int) * n));
}

I actually wrote the option for f at the bottom of my first post. It will be useful when optimizing existing functions which accept the array via a pointer to its first element, where existing code relies on this specific function prototype.

Of course, as I wrote there too, the replacement code for f can potentially invoke UB, but the chances are it won't (it's very unlikely that conversion from "pointer to T" to "pointer to array of N elements of type T" will not be defined). The code in f1 is perfectly valid though.
Keith Thompson
2016-06-13 16:41:59 UTC
Post by s***@gmail.com
OK I start to see things clear now - as ANSI introduced function
prototyping with the same confusing array parameters adjustment as in
existing function declarators (which of-course could be avoided) -
they now don't want to admit their mistake. It's not that easy to
admit that you were wrong.
It's particularly difficult to admit that you were wrong when you
weren't.

Keeping the array syntax for parameters wasn't about being right or
wrong. It was about not breaking existing code.

And in what sense is the existing syntax "wrong"? I agree that it can
be confusing, and it's not the way I would have designed it if I had
started from scratch. But as I mentioned before, it does provide some
documentation that the argument is expected to point to the initial
element of an array.
Post by s***@gmail.com
I'm also wondering if this confusing syntax
would have been less observed if they had admitted making a mistake
and so had made this construct obsolescent in-time. Now I can guess
their policy about it is that it's pretty normal and expected syntax,
not at all confusing and being part of the language all the time.
Which happens to be the truth.
Post by s***@gmail.com
Anyway, the exact piece of code (or any other with different
void f(int p[static 6]) { /* code using p */ }
void f(int *p) { p = *((int (*)[6])p); /* code using p */ }
<sarcasm>Right, that's *much* less confusing.</sarcasm>

[...]
s***@gmail.com
2016-06-13 16:49:30 UTC
Post by Keith Thompson
Post by s***@gmail.com
OK I start to see things clear now - as ANSI introduced function
prototyping with the same confusing array parameters adjustment as in
existing function declarators (which of-course could be avoided) -
they now don't want to admit their mistake. It's not that easy to
admit that you were wrong.
It's particularly difficult to admit that you were wrong when you
weren't.
Keeping the array syntax for parameters wasn't about being right or
wrong. It was about not breaking existing code.
I was left with the impression (both from reading the second edition of the K&R book last night and from an old version of the ANSI C rationale) that function prototypes were included as a new feature rather than as a way of supporting old code. I believe there were already function declarators but not function prototypes at the time. Function prototypes were only part of the C++ language back then.
James Kuyper
2016-06-13 18:37:16 UTC
....
Post by s***@gmail.com
Post by Keith Thompson
Keeping the array syntax for parameters wasn't about being right or
wrong. It was about not breaking existing code.
I have left with the impression (both by reading K&R book second
edition last night and an old version of the ANSI C rationale) that
function prototypes were included as a new feature rather than as a
way of supporting old code. There were already function declarators
but not function prototypes at the time I believe. Function
prototypes were only part of the C++ language back then.
True, but the use of array syntax to declare pointer parameters applies
just as much to old-style function definitions as it does to function
prototypes. As such it dates all the way back to K&R, decades before the
first C standard described function prototypes.
s***@casperkitty.com
2016-06-13 18:52:50 UTC
Post by James Kuyper
True, but the use of array syntax to declare pointer parameters applies
just as much to old-style function definitions as it does to function
prototypes. As such it dates all the way back to K&R, decades before the
first C standard described function prototypes.
Do you know of any unified chronology of when various features were added
to the C language in its development, and when various guarantees were
added or relaxed? I know the earliest versions of C had neither "long" nor
"unsigned" types, but from what I understand guaranteed that the existing
"int" type would silently wrap around on overflow (at least on two's-
complement platforms, and I'm unaware of that version of the language
supporting anything other than two's-complement platforms).

Of particular interest here would be the relationship between typedefs and
array declarations. If a language can't have parameters of array types,
having "foo[]" be a syntactical substitute for "*foo" doesn't really hurt
anything, but there's something a bit odd about the fact that given:

void foo(x)
  bar x;
{
  boz y;
  printf("%d %d", sizeof x, sizeof y);
}

the reported sizes of x and y may be different. Did the ability to
use typedef with array types come before or after the ability to pass
structures by value?
Keith Thompson
2016-06-13 19:13:33 UTC
Post by s***@casperkitty.com
Post by James Kuyper
True, but the use of array syntax to declare pointer parameters applies
just as much to old-style function definitions as it does to function
prototypes. As such it dates all the way back to K&R, decades before the
first C standard described function prototypes.
Do you know of any unified chronology of when various features were added
to the C language in its development, and when various guarantees were
added or relaxed? I know the earliest versions of C had neither "long" nor
"unsigned" types, but from what I understand guaranteed that the existing
"int" type would silently wrap around on overflow (at least on two's-
complement platforms, and I'm unaware of that version of the language
supporting anything other than two's-complement platforms).
I see no such guarantee in the C Reference Manual from, I believe, 1975.
It says that:

Integers (int) are represented in 16-bit 2’s complement notation.

As far as I can tell, there's no discussion of numeric overflow.
It's likely that it behaved as you describe (that's only a guess
on my part), but because it was simpler, not because the language
definition required it. In modern terms, behavior on overflow
would be undefined by omission.

https://www.bell-labs.com/usr/dmr/www/cman.pdf

[...]
Richard Kettlewell
2016-06-13 19:26:12 UTC
Post by Keith Thompson
Post by s***@casperkitty.com
Post by James Kuyper
True, but the use of array syntax to declare pointer parameters applies
just as much to old-style function definitions as it does to function
prototypes. As such it dates all the way back to K&R, decades before the
first C standard described function prototypes.
Do you know of any unified chronology of when various features were added
to the C language in its development, and when various guarantees were
added or relaxed? I know the earliest versions of C had neither "long" nor
"unsigned" types, but from what I understand guaranteed that the existing
"int" type would silently wrap around on overflow (at least on two's-
complement platforms, and I'm unaware of that version of the language
supporting anything other than two's-complement platforms).
I see no such guarantee in the C Reference Manual from, I believe, 1975.
Integers (int) are represented in 16-bit 2’s complement notation.
As far as I can tell, there's no discussion of numeric overflow.
It's likely that it behaved as you describe (that's only a guess
on my part), but because it was simpler, not because the language
definition required it. In modern terms, behavior on overflow
would be undefined by omission.
https://www.bell-labs.com/usr/dmr/www/cman.pdf
One of the examples explicitly relies on the behavior of signed integer
overflow, so I don’t think that the author can have thought it was
(in modern terms) undefined.
--
http://www.greenend.org.uk/rjk/
Keith Thompson
2016-06-13 19:36:19 UTC
Post by Richard Kettlewell
Post by Keith Thompson
Post by s***@casperkitty.com
Post by James Kuyper
True, but the use of array syntax to declare pointer parameters applies
just as much to old-style function definitions as it does to function
prototypes. As such it dates all the way back to K&R, decades before the
first C standard described function prototypes.
Do you know of any unified chronology of when various features were added
to the C language in its development, and when various guarantees were
added or relaxed? I know the earliest versions of C had neither "long" nor
"unsigned" types, but from what I understand guaranteed that the existing
"int" type would silently wrap around on overflow (at least on two's-
complement platforms, and I'm unaware of that version of the language
supporting anything other than two's-complement platforms).
I see no such guarantee in the C Reference Manual from, I believe, 1975.
Integers (int) are represented in 16-bit 2’s complement notation.
As far as I can tell, there's no discussion of numeric overflow.
It's likely that it behaved as you describe (that's only a guess
on my part), but because it was simpler, not because the language
definition required it. In modern terms, behavior on overflow
would be undefined by omission.
https://www.bell-labs.com/usr/dmr/www/cman.pdf
One of the examples explicitly relies on the behavior of signed integer
overflow, so I don’t think that the author can have thought it was
(in modern terms) undefined.
What example?
Richard Kettlewell
2016-06-13 20:10:34 UTC
Post by Keith Thompson
Post by Richard Kettlewell
Post by Keith Thompson
I see no such guarantee in the C Reference Manual from, I believe, 1975.
Integers (int) are represented in 16-bit 2’s complement notation.
As far as I can tell, there's no discussion of numeric overflow.
It's likely that it behaved as you describe (that's only a guess
on my part), but because it was simpler, not because the language
definition required it. In modern terms, behavior on overflow
would be undefined by omission.
https://www.bell-labs.com/usr/dmr/www/cman.pdf
One of the examples explicitly relies on the behavior of signed integer
overflow, so I don’t think that the author can have thought it was
(in modern terms) undefined.
What example?
Page 24, near the top.
--
http://www.greenend.org.uk/rjk/
Keith Thompson
2016-06-13 21:59:11 UTC
Post by Richard Kettlewell
Post by Keith Thompson
Post by Richard Kettlewell
Post by Keith Thompson
I see no such guarantee in the C Reference Manual from, I believe, 1975.
Integers (int) are represented in 16-bit 2’s complement notation.
As far as I can tell, there's no discussion of numeric overflow.
It's likely that it behaved as you describe (that's only a guess
on my part), but because it was simpler, not because the language
definition required it. In modern terms, behavior on overflow
would be undefined by omission.
https://www.bell-labs.com/usr/dmr/www/cman.pdf
One of the examples explicitly relies on the behavior of signed integer
overflow, so I don’t think that the author can have thought it was
(in modern terms) undefined.
What example?
Page 24, near the top.
Ah, you're right.

if (x < 0) {
    x = -x;
    if (x < 0) { /* is - infinity */
        printf("-32768");
        continue;
    }
    putchar('-');
}
s***@casperkitty.com
2016-06-14 01:51:46 UTC
Post by Keith Thompson
Ah, you're right.
if (x < 0) {
    x = -x;
    if (x < 0) { /* is - infinity */
        printf("-32768");
        continue;
    }
    putchar('-');
}
Interestingly, the code will work just fine without modification on systems
with sign-magnitude or ones'-complement integers, no matter what those
systems would do in case of overflow. It can only cause overflow on two's-
complement systems, and on those systems computations that exceeded the
range of "int" had a clear mathematically-defined meaning (yield a result
which is congruent to the mathematically-correct result mod 65536).
Tim Rentsch
2016-06-20 15:24:51 UTC
Post by Richard Kettlewell
Post by Keith Thompson
Post by s***@casperkitty.com
Post by James Kuyper
True, but the use of array syntax to declare pointer parameters applies
just as much to old-style function definitions as it does to function
prototypes. As such it dates all the way back to K&R, decades before the
first C standard described function prototypes.
Do you know of any unified chronology of when various features were added
to the C language in its development, and when various guarantees were
added or relaxed? I know the earliest versions of C had neither "long" nor
"unsigned" types, but from what I understand guaranteed that the existing
"int" type would silently wrap around on overflow (at least on two's-
complement platforms, and I'm unaware of that version of the language
supporting anything other than two's-complement platforms).
I see no such guarantee in the C Reference Manual from, I believe, 1975.
Integers (int) are represented in 16-bit 2's complement notation.
As far as I can tell, there's no discussion of numeric overflow.
It's likely that it behaved as you describe (that's only a guess
on my part), but because it was simpler, not because the language
definition required it. In modern terms, behavior on overflow
would be undefined by omission.
https://www.bell-labs.com/usr/dmr/www/cman.pdf
One of the examples explicitly relies on the behavior of signed integer
overflow, so I don't think that the author can have thought it was
(in modern terms) undefined.
I don't think that follows necessarily. It could be that the
example is simply relying on implementation-specific behavior,
with an awareness that it might not be appropriate for other
implementations. Certainly using the constant value -32768 when
the overflow condition occurs is not meant to be implementation
independent. Also, note the third sentence in the second
paragraph:

This paper is a manual only for the C language itself as
implemented on the PDP11.

A clear and explicit statement that the paper does not address
what goes on in other implementations (except to give occasional
hints about implementation-dependent features, as noted in the
next sentence).
s***@casperkitty.com
2016-06-20 16:23:21 UTC
Notably, it identifies a case where overflow is forbidden (and that this is
a bug). The inference I draw from that is that the authors are
aware of overflow as an issue and consider it worth mentioning in cases
where programmers must avoid it.
I think the part you're thinking of says essentially that comparisons
between operands which differ by more than +/-32767 are not "illegal" as
such, but should be regarded as arbitrarily yielding 0 or 1 in Unspecified
fashion. I don't think it was intended to say that such operations must
be prevented at all costs even in cases where either behavior would end
up yielding correct results (e.g. because the choice between the true and
false branches was only important in cases where the operands would be
close enough together to yield a deterministic result).
Tim Rentsch
2016-07-11 22:49:17 UTC
Post by Tim Rentsch
Post by Richard Kettlewell
One of the examples explicitly relies on the behavior of signed integer
overflow, so I don't think that the author can have thought it was
(in modern terms) undefined.
I don't think that follows necessarily. It could be that the
example is simply relying on implementation-specific behavior,
with an awareness that it might not be appropriate for other
implementations. Certainly using the constant value -32768 when
the overflow condition occurs is not meant to be implementation
independent. Also, note the third sentence in the second
This paper is a manual only for the C language itself as
implemented on the PDP11.
A clear and explicit statement that the paper does not address
what goes on in other implementations (except to give occasional
hints about implementation-dependent features, as noted in the
next sentence).
I'd agree that the paper has roughly the modern concept of
implementation-defined behavior.
I would say that it isn't clear whether the author distinguishes
between (what is now called) undefined behavior and (what is now
called) implementation-defined behavior. Maybe he does, maybe he
doesn't, but based on what I see in the paper (and admittedly I
haven't read all of it carefully) it might be either one.
Despite the caveat quoted, though, it does actually go
into some detail about implementation differences (appendix 2).
For me that is covered under the "occasional hints" exception,
but I take your point.
Notably, it identifies a case where overflow is forbidden (and that this is
a bug). The inference I draw from that is that the authors are
aware of overflow as an issue and consider it worth mentioning in cases
where programmers must avoid it.
Oh I don't read it that way. First what is being described is a
bug in the implementation, not about user programs. More
important though is what it (tacitly) says about overflow. The
problem isn't that overflow "misbehaves". The problem is that
when a comparison is done using subtraction, overflow behaves as
they expect but the comparison operation delivers a wrong result.
I don't see any indication that this statement implies overflow
is generally unreliable. Of course it may be that overflow /was/
thought to be generally unreliable, but I wouldn't conclude that
just from this bug description.
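
The general hazard of comparing by subtraction can be sketched as below. This is a simulation under stated assumptions, not a reconstruction of the manual's compiler: 16-bit subtraction is modeled with unsigned arithmetic (which wraps by definition), and the function names are invented for the example.

```c
#include <stdint.h>

/* Models "a < b" decided by the sign bit of a 16-bit a - b.
   The difference is computed modulo 65536, as a 16-bit subtract would. */
int less_by_subtraction16(int16_t a, int16_t b)
{
    uint16_t diff = (uint16_t)((unsigned)a - (unsigned)b);
    return (diff & 0x8000u) != 0;
}

/* The straightforward comparison, for reference. */
int less_direct(int16_t a, int16_t b)
{
    return a < b;
}
```

For operands that are close together the two agree, but `less_by_subtraction16(30000, -30000)` reports "less" because the true difference, 60000, wraps into the negative half of the 16-bit range. The subtraction itself wraps exactly as expected; it is the comparison built on top of it that goes wrong.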
s***@casperkitty.com
2016-07-11 23:42:30 UTC
Post by Tim Rentsch
I would say that it isn't clear whether the author distinguishes
between (what is now called) undefined behavior and (what is now
called) implementation-defined behavior. Maybe he does, maybe he
doesn't, but based on what I see in the paper (and admittedly I
haven't read all of it carefully) it might be either one.
The distinction is IMHO rather unhelpful in the modern Standard, since in
many cases it would be very cheap for the vast majority of implementations
to offer *some* useful behavioral guarantees, but expensive to guarantee
a *specific* behavior. What would have been helpful is to have a category of
behavior which would allow an implementation to either specify a list of
possible behaviors, from which it could select in arbitrary (Unspecified)
fashion, or explicitly state that it would be impractical to guarantee
anything about the behavior. While implementations would be allowed to
regard any behavior as UB if they document it as such, doing so without a
good reason would be considered a sign of poor quality.

For example, requiring that an expression like "x+y > z" be evaluated using
wrapping two's-complement semantics could be expensive in cases where y and
z were loop invariants allowing (z-y) to be evaluated outside the loop (with
a check to handle the situation where that computation overflows). On the
other hand, requiring that the expression must for any combination of values
yield 0 or 1 with no side-effects, but allowing it to arbitrarily return 0
or 1 in case of overflow, would be much cheaper but almost as useful if
the compiler allowed a programmer to write e.g. "(int)(x+y) > z" in cases
where precise wrapping semantics were needed.

Loose overflow semantics could be very useful in situations where e.g. a
program needs to examine a large number of items to find a small number
that meet some criterion. If a quick testing function wouldn't cause
overflow on any items meeting the criteria, but might cause overflow on a
small fraction of items not meeting the criteria, ignoring overflows during
the quick test but then filtering out false positives may be faster than
having to prevent overflows at all costs. If the whole purpose of modern
treatment of UB is to facilitate optimization, requiring that programmers
must add extra logic to prevent overflow at all costs would seem counter-
productive.
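
A minimal sketch of the quick-test-then-filter idea described above. Because signed overflow is undefined in standard C, the sketch substitutes unsigned arithmetic, whose wraparound is well-defined, as a stand-in for the "loose" semantics being proposed. The names quick_test, exact_test and count_matches, and the criterion x + y < limit, are hypothetical and chosen only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Cheap test: the 32-bit sum may wrap, which can produce a false
   positive, but never a false negative: a pair whose exact sum is
   below limit (itself at most UINT32_MAX) cannot wrap. */
int quick_test(uint32_t x, uint32_t y, uint32_t limit)
{
    return (uint32_t)(x + y) < limit;
}

/* Exact test: a 64-bit sum of two 32-bit values cannot overflow. */
int exact_test(uint32_t x, uint32_t y, uint32_t limit)
{
    return (uint64_t)x + y < limit;
}

/* Count the pairs meeting the criterion. The exact test runs only
   on the items that survive the quick test. */
size_t count_matches(const uint32_t *xs, const uint32_t *ys,
                     size_t n, uint32_t limit)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (quick_test(xs[i], ys[i], limit) &&
            exact_test(xs[i], ys[i], limit))
            count++;
    return count;
}
```

Whether the two-stage form is actually faster depends on how rare the false positives are and on what the compiler can do with the quick test; the sketch only shows that the results stay correct.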
Tim Rentsch
2016-07-20 14:43:23 UTC
Permalink
Post by Tim Rentsch
I would say that it isn't clear whether the author distinguishes
between (what is now called) undefined behavior and (what is now
called) implementation-defined behavior. Maybe he does, maybe he
doesn't, but based on what I see in the paper (and admittedly I
haven't read all of it carefully) it might be either one.
The distinction is IMHO rather unhelpful in the modern Standard, [...]
IMO the distinction is helpful. But in any case that has no
bearing on what I was saying.

Tim Rentsch
2016-07-11 22:51:56 UTC
Permalink
Post by Tim Rentsch
This paper is a manual only for the C language itself as
implemented on the PDP11.
A clear and explicit statement that the paper does not address
what goes on in other implementations (except to give occasional
hints about implementation-dependent features, as noted in the
next sentence).
It is unfortunate that the Standard failed to acknowledge that by far
the most common use case of C involves a compiler which generates code
for an execution platform which is outside the compiler writer's
control, since that would clarify the compilers' and execution
platforms' duties in many cases where the Standard presently imposes
no requirements. If a CPU will catch fire if an "ADD" instruction
overflows, it is reasonable to say that any programmer who does not
1. Ensure that overflows don't happen, or
2. Refrain from running his code on such a machine.
I've seen no indication that the authors of the Standard intended that
#2 shouldn't be sufficient.
The Standard explicitly chooses not to concern itself with such
questions. See section 1.
s***@casperkitty.com
2016-07-11 23:20:24 UTC
Permalink
Post by Tim Rentsch
It is unfortunate that the Standard failed to acknowledge that by far
the most common use case of C involves a compiler which generates code
for an execution platform which is outside the compiler writer's
control, since that would clarify the compilers' and execution
platforms' duties in many cases where the Standard presently imposes
no requirements. If a CPU will catch fire if an "ADD" instruction
overflows, it is reasonable to say that any programmer who does not
1. Ensure that overflows don't happen, or
2. Refrain from running his code on such a machine.
I've seen no indication that the authors of the Standard intended that
#2 shouldn't be sufficient.
The Standard explicitly chooses not to concern itself with such
questions. See section 1.
The choice may have been deliberate, but I would still regard it as
unfortunate.

In deciding when implementations should or should not be required to
implement an even-somewhat-predictable behavior, the authors of the
Standard seem to have focused their consideration primarily on systems--
real or hypothetical--where specifying anything about the behavior might
be difficult, and whether imposing a mandate would make the language
more useful *even on those systems*. Such a focus would make sense if
they expected that features which were widely supported on architectures
that could support them cheaply would remain so with or without a mandate.

Perhaps the biggest failing of the Standard from that point of view is
that it fails to explicitly say whether the requirement that extensions be
documented applies only to syntactical extensions or to behavioral
guarantees as well. The list of common extensions in Annex J of the
C89 draft makes no mention of silent wraparound two's-complement behavior
despite the fact that the Committee was obviously not only aware of it but
regarded it as useful and desirable for things like (assuming 8/16-bit
char/int):

unsigned int mult(unsigned char a, unsigned char b) { return a*b; }

If the Committee had taken the lead in regarding silent wraparound behavior
as a common extension, compiler writers may well have taken the lead in
expressly documenting what they could or could not guarantee about overflow
behavior. Since the Standard didn't appear to regard it as an extension,
however, there was no reason for compiler writers to do so.
Tim Rentsch
2016-07-20 14:42:25 UTC
Permalink
Post by s***@casperkitty.com
Post by Tim Rentsch
It is unfortunate that the Standard failed to acknowledge that by far
the most common use case of C involves a compiler which generates code
for an execution platform which is outside the compiler writer's
control, since that would clarify the compilers' and execution
platforms' duties in many cases where the Standard presently imposes
no requirements. If a CPU will catch fire if an "ADD" instruction
overflows, it is reasonable to say that any programmer who does not
1. Ensure that overflows don't happen, or
2. Refrain from running his code on such a machine.
I've seen no indication that the authors of the Standard intended that
#2 shouldn't be sufficient.
The Standard explicitly chooses not to concern itself with such
questions. See section 1.
The choice may have been deliberate, but I would still regard it as
unfortunate. [...]
I regard it as fortunate.
s***@casperkitty.com
2016-06-13 20:52:56 UTC
Permalink
Post by Keith Thompson
Integers (int) are represented in 16-bit 2’s complement notation.
As far as I can tell, there's no discussion of numeric overflow.
It's likely that it behaved as you describe (that's only a guess
on my part), but because it was simpler, not because the language
definition required it. In modern terms, behavior on overflow
would be undefined by omission.
It might if the document had been written in modern times. When it was
written, 16-bit two's-complement operations were presumed to yield the
lower 16 bits of an arbitrary-precision computation in the absence of any
specification to the contrary, since such behavior was cheaper than doing
anything else and it was useful.
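
In current C the "lower 16 bits of an arbitrary-precision computation" result has to be requested explicitly. A small sketch of one portable way to do so; the name add16_wrapping is hypothetical.

```c
#include <stdint.h>

/* Compute the sum in a wider unsigned type, then keep only the
   low 16 bits by converting back to uint16_t. Both steps are
   well-defined for every pair of inputs. */
uint16_t add16_wrapping(uint16_t a, uint16_t b)
{
    return (uint16_t)((uint32_t)a + b);
}
```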
s***@casperkitty.com
2016-06-13 16:56:50 UTC
Permalink
Post by Keith Thompson
Keeping the array syntax for parameters wasn't about being right or
wrong. It was about not breaking existing code.
I think it would be fair to note that prototypes were invented as part of
C++, and were absorbed into C by compiler vendors who saw that they were a
good idea. For C++ to have defined array parameters in prototypes as it
did was a mistake which wasn't necessary for compatibility with existing
code, since until prototypes were invented there *was* no existing code
that used array parameters in prototypes (or used prototypes at all for that
matter). By the time the Standards Committee got involved with prototypes,
code which used the style of prototypes in C++ already existed, so it was
too late for the Committee to change it.
Keith Thompson
2016-06-13 18:11:10 UTC
Permalink
Post by Keith Thompson
Post by s***@gmail.com
OK I start to see things clear now - as ANSI introduced function
prototyping with the same confusing array parameters adjustment as in
existing function declarators (which of-course could be avoided) -
they now don't want to admit their mistake. It's not that easy to
admit that you were wrong.
It's particularly difficult to admit that you were wrong when you
weren't.
Keeping the array syntax for parameters wasn't about being right or
wrong. It was about not breaking existing code.
Let me amend that. I suppose that C89, particularly the introduction of
prototypes, could have been an opportunity to deprecate or remove array
syntax for pointer parameters. Pre-ANSI C allowed (and ISO C still
allows) a function definition like this:

int main(argc, argv)
int argc;
char *argv[];
{
/* ... */
}

C89 *could* have left things alone for non-prototype definitions, but
required `char **argv` for prototypes.

The committee didn't choose to make that change -- and in my opinion
that was a reasonable decision.
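
A short sketch of what the parameter adjustment means for prototypes: the two declarations below are accepted together because, after adjustment, they declare the same function type. The function count_args is a hypothetical helper used only for illustration.

```c
#include <stddef.h>

/* The array declarator in the first prototype is adjusted to a
   pointer, so the second prototype is a compatible redeclaration
   rather than a conflicting one. */
int count_args(int argc, char *argv[]);
int count_args(int argc, char **argv);

/* Counts the leading non-null entries of argv, up to argc. */
int count_args(int argc, char **argv)
{
    int n = 0;
    while (n < argc && argv[n] != NULL)
        n++;
    return n;
}
```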
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
s***@gmail.com
2016-06-13 16:42:04 UTC
Permalink
Or actually, even if the size of the declared array is not an integer constant expression (I was somehow assuming that only constant expressions are allowed after the static keyword), my proposed alternative will work too.
James Kuyper
2016-06-13 18:45:30 UTC
Permalink
Post by s***@gmail.com
Or actually even if not an integer constant expression is the size
of the declared array (as I was somehow assuming that only constant
expression are allowed after the static keyword) - my proposed
alternative will work too.
No, there are no additional restrictions just because of the presence of
the "static" keyword. The same requirements apply, whether or not it's
there:
6.7.6p1: the size must be any assignment-expression.
6.7.6.2p1: "... the expression shall have an integer type. If the
expression is a constant expression, it shall have a value greater than
zero."
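
A minimal sketch of a non-constant size after "static", assuming an implementation that supports variably modified parameter types (required in C99, optional in C11). The name sum_first is hypothetical.

```c
#include <stddef.h>

/* n is an ordinary parameter, so the bound after 'static' is not a
   constant expression. The caller promises that p points to the
   first of at least n accessible elements; n must be greater than
   zero at each call. */
long sum_first(size_t n, const int p[static n])
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += p[i];
    return total;
}
```

A call such as sum_first(2, a + 1) is fine as long as at least two elements follow a + 1.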
James Kuyper
2016-06-13 18:33:14 UTC
Permalink
Post by s***@gmail.com
OK I start to see things clear now - as ANSI introduced function
prototyping with the same confusing array parameters adjustment as in
existing function declarators (which of-course could be avoided) -
they now don't want to admit their mistake. It's not that easy to
admit that you were wrong. I'm also wondering if this confusing
syntax would have been less observed if they had admitted making a
mistake and so had made this construct obsolescent in-time. Now I can
guess their policy about it is that it's pretty normal and expected
syntax, not at all confusing and being part of the language all the
time.
Well, you're entitled to consider it a mistake, and I won't argue with you
about that, but many people consider it a convenience. The simple fact
that it is a widely used feature of the language means that it's not
feasible to remove that feature while retaining any minimal degree of
respect for the need to maintain backwards compatibility. You're free to
pay no respect to that need - but the C committee decided otherwise.
Maintaining backwards compatibility is one of their highest priorities
(though it is very far from being their only priority).
Post by s***@gmail.com
Anyway, the exact piece of code (or any other with different
void f(int p[static 6]) { /* code using p */ }
void f(int *p) { p = *((int (*)[6])p); /* code using p */ }
For the purposes of the code below, I'll call those two functions f1()
and f2(), respectively, to keep them distinct.
Post by s***@gmail.com
Which gives away the same information to the compiler as the new
form. This construct could also be used for other declarations as
well, though I don't know how useful it could be but at least it's an
Consider the following code:

void func(int *p)
{
    int five[5];
    int six[6];
    int nine[9];

    f1(five);
    f1(six);
    f1(p);
    f1(nine);
    f1(nine+3);

    f2(five);
    f2(six);
    f2(p);
    f2(nine);
    f2(nine+3);
}

int main(void)
{
    int fifteen[15];
    func(fifteen + 4);
}

The first call to f1() has undefined behavior, while all of the other
calls to f1() are fine.

The second call to f2() is the only call to f2() that is not a
constraint violation.

The authors of the standard intended f1(p), f1(nine), and f1(nine+3) to
have well-defined behavior, so f2() is not an acceptable substitute for
f1().
s***@gmail.com
2016-06-13 19:24:35 UTC
Permalink
Post by James Kuyper
Post by s***@gmail.com
Anyway, the exact piece of code (or any other with different
void f(int p[static 6]) { /* code using p */ }
void f(int *p) { p = *((int (*)[6])p); /* code using p */ }
For the purposes of the code below, I'll call those two functions f1()
and f2(), respectively, to keep then distinct.
Post by s***@gmail.com
Which gives away the same information to the compiler as the new
form. This construct could also be used for other declarations as
well, though I don't know how useful it could be but at least it's an
void func(int *p)
{
int five[5];
int six[6];
int nine[9];
f1(five);
f1(six);
f1(p);
f1(nine);
f1(nine+3);
f2(five);
f2(six);
f2(p);
f2(nine);
f2(nine+3);
}
int main(void)
{
int fifteen[15];
func(fifteen + 4);
}
The first call to f1() has undefined behavior, while all of the other
calls to f1() are fine.
The second call to f2() is the only call to f2() that is not a
constraint violation.
The authors of the standard intended f1(p), f1(nine), and f1(nine+3) to
have well-defined behavior, so f2() is not an acceptable substitute for
f1().
True (I guess?), but I can't see a real implementation that would have different side effects when calling f2 instead of f1 in any context (and with any type of arguments).
James R. Kuyper
2016-06-14 00:00:17 UTC
Permalink
Post by s***@gmail.com
Post by James Kuyper
Post by s***@gmail.com
Anyway, the exact piece of code (or any other with different
void f(int p[static 6]) { /* code using p */ }
void f(int *p) { p = *((int (*)[6])p); /* code using p */ }
For the purposes of the code below, I'll call those two functions f1()
and f2(), respectively, to keep then distinct.
Post by s***@gmail.com
Which gives away the same information to the compiler as the new
form. This construct could also be used for other declarations as
well, though I don't know how useful it could be but at least it's an
void func(int *p)
{
int five[5];
int six[6];
int nine[9];
f1(five);
f1(six);
f1(p);
f1(nine);
f1(nine+3);
f2(five);
f2(six);
f2(p);
f2(nine);
f2(nine+3);
}
int main(void)
{
int fifteen[15];
func(fifteen + 4);
}
The first call to f1() has undefined behavior, while all of the other
calls to f1() are fine.
The second call to f2() is the only call to f2() that is not a
constraint violation.
The authors of the standard intended f1(p), f1(nine), and f1(nine+3) to
have well-defined behavior, so f2() is not an acceptable substitute for
f1().
True (I guess ?) but I can't see a real implementation which will have different side-effects by calling f2 instead of f1 in any context (and with any type of arguments).
I have to agree. All of the calls to f2() above except for f2(&six) are
constraint violations, for which at least one diagnostic message is
mandatory. Every real implementation I'm familiar with refuses to
compile code for which diagnostic messages are mandatory - this is not
required by the standard, but it is extremely common. You can sometimes
force a compiler to accept code for which diagnostics are mandatory, but
that's not true in general, and when it is true, it generally requires
using special command line options.

So the only case where it's possible to compare the behavior of f1() and
f2() are f1(six) and f2(&six), for which the behavior of f1() is
well-defined, and identical to the behavior of f2().

For real implementations, the difference between f1() and f2() isn't the
side-effects - it's whether the compiler will even let you use the
function. f1(p), f1(nine), and f1(nine+3) all compile and execute
properly under any conforming implementation of C2011, whereas the
corresponding calls to f2() won't even compile (under any version of the
standard).

In my personal opinion, the most valuable thing about this new feature
is that it makes it possible for f1(five) to trigger a diagnostic
message. The standard specifies undefined behavior, rather than a
constraint violation (probably because there's no easy way for a
compiler to detect that f1(p) has undefined behavior if func(fifteen+10)
were called). Therefore, such a diagnostic message is not mandatory.
However, I'd expect any decent implementation of C2011 to provide it
anyway, in the cases (such as f1(five)) where it can easily be detected.

People have said that optimizations can be enabled by this feature,
and I have no doubt that this is the case, but I've no idea what those
optimizations would be. The non-mandatory diagnostic is far more
important to me than those optimizations, despite the fact that it isn't
mandatory.
However, whatever those optimizations are, I guarantee you that they
could have effects on the behavior of code such as func(fifteen+10) that
has undefined behavior.
s***@gmail.com
2016-06-14 00:20:51 UTC
Permalink
Post by James R. Kuyper
Post by s***@gmail.com
Post by James Kuyper
Post by s***@gmail.com
void f(int p[static 6]) { /* code using p */ }
void f(int *p) { p = *((int (*)[6])p); /* code using p */ }
For the purposes of the code below, I'll call those two functions f1()
and f2(), respectively, to keep then distinct.
Post by s***@gmail.com
Which gives away the same information to the compiler as the new
form. This construct could also be used for other declarations as
well, though I don't know how useful it could be but at least it's an
void func(int *p)
{
int five[5];
int six[6];
int nine[9];
f1(five);
f1(six);
f1(p);
f1(nine);
f1(nine+3);
f2(five);
f2(six);
f2(p);
f2(nine);
f2(nine+3);
}
int main(void)
{
int fifteen[15];
func(fifteen + 4);
}
The first call to f1() has undefined behavior, while all of the other
calls to f1() are fine.
The second call to f2() is the only call to f2() that is not a
constraint violation.
The authors of the standard intended f1(p), f1(nine), and f1(nine+3) to
have well-defined behavior, so f2() is not an acceptable substitute for
f1().
True (I guess ?) but I can't see a real implementation which will have different side-effects by calling f2 instead of f1 in any context (and with any type of arguments).
I have to agree. All of the calls to f2() above except for f2(&six) are
constraint violations, for which at least one diagnostic message is
mandatory. Every real implementation I'm familiar with refuses to
compile code for which diagnostic messages are mandatory - this is not
required by the standard, but it is extremely common. You can sometimes
force a compiler to accept code for which diagnostics are mandatory, but
that's not true in general, and when it is true, it generally requires
using special command line options.
So the only case where it's possible to compare the behavior of f1() and
f2() are f1(six) and f2(&six), for which the behavior of f1() is
well-defined, and identical to the behavior of f2().
For real implementations, the difference between f1() and f2() isn't the
side-effects - it's whether the compiler will even let you use the
function. f1(p), f1(nine), and f1(nine+3) all compile and execute
properly under any conforming implementation of C2011, whereas the
corresponding calls to f2() won't even compile (under any version of the
standard).
I'm very curious whether you could provide me with more information about those implementations, maybe even some links to them. You see, the experience I have with C compilers (msvc, clang, gcc) leads me to believe that code like this will most likely compile fine (and in fact I tested it with gcc recently, resulting in 0 warnings and 0 errors).
Post by James R. Kuyper
In my personal opinion, the most valuable thing about this new feature
is that it makes it possible for f1(five) to trigger a diagnostic
message. The standard specifies undefined behavior, rather than a
constraint violation (probably because there's no easy way for a
compiler to detect that f1(p) has undefined behavior if func(fifteen+10)
were called). Therefore, such a diagnostic message is not mandatory.
However, I'd expect any decent implementation of C2011 to provide it
anyway, in the cases (such as f1(five)) where it can easily be detected.
People have said that optimizations that can be enabled by this feature,
and I have no doubt that this is the case, but I've no idea what those
optimizations would be. The non-mandatory diagnostic is far more
important to me than those optimizations, despite the fact that it isn't
mandatory.
However, whatever those optimizations are, I guarantee you that they
could have effects on the behavior of code such as func(fifteen+10) that
has undefined behavior.
And as you've mentioned it - I was wondering the same thing: which implementations actually use this feature for optimizations? It's been more than 16 years, I believe - maybe there is at least one?
s***@casperkitty.com
2016-06-14 00:41:40 UTC
Permalink
Post by s***@gmail.com
And as you've mentioned it - I was wondering the same thing what implementations actually use this feature for optimizations. It's been past 16 years I believe - maybe there is at least one?
I don't know to what extent compilers do implement such an optimization, but
consider something like:

#include <math.h>

float hey(float foo[static 256])
{
    int i;
    float total = 0.0f;   /* must start at zero before accumulating */
    for (i=0; i<256; i++)
    {
        if (sin(foo[i]) < -0.999) return 0.0;
        total += foo[i];
    }
    return total;
}

In the absence of the [static 256] declaration, behavior would be well-
defined if the code was given a pointer to an array which was smaller than
256 elements but at least one of the elements in the array it was given
was -1. Consequently, a compiler would be required to write code that
doesn't access any element of foo[] until it has finished processing the
previous one. The [static 256] declaration tells the compiler that it may
fetch the first element and then on subsequent iterations overlap the fetch
of each element with the sine computation of the previous. In addition, if
the processor has a vector unit, the code may be able to use it to perform
multiple compares and additions simultaneously.
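
A sketch of the well-defined short-array case mentioned above, for a function declared without [static 256]. To keep it free of the math library, the sine test is replaced here by a simple sign test; that substitution, and the name sum_until_negative, are assumptions made only for illustration.

```c
/* No [static 256]: the function may legitimately be given a shorter
   array, provided the early return fires before the loop would read
   past its end. A compiler therefore may not fetch foo[i+1] until it
   knows that iteration i did not return. */
float sum_until_negative(const float *foo)
{
    float total = 0.0f;
    for (int i = 0; i < 256; i++)
    {
        if (foo[i] < 0.0f) return 0.0f;   /* early exit */
        total += foo[i];
    }
    return total;
}
```

Called on a three-element array whose last element is negative, the function reads only those three elements, so the call is well-defined even though the loop bound is 256.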
s***@casperkitty.com
2016-06-14 01:27:27 UTC
Permalink
... but at least one of the elements in the array it was given
was -1.
Correction [I neglected to adjust the above when I tweaked the example]: at
least one of the elements would yield a sine less than -0.999.
James R. Kuyper
2016-06-14 01:27:18 UTC
Permalink
Post by s***@gmail.com
Post by James R. Kuyper
Post by James Kuyper
Post by s***@gmail.com
void f(int p[static 6]) { /* code using p */ }
void f(int *p) { p = *((int (*)[6])p); /* code using p */ }
For the purposes of the code below, I'll call those two functions f1()
and f2(), respectively, to keep then distinct.
...
Post by s***@gmail.com
Post by James R. Kuyper
Post by James Kuyper
void func(int *p)
{
...
Post by s***@gmail.com
Post by James R. Kuyper
Post by James Kuyper
f2(five);
f2(six);
f2(p);
f2(nine);
f2(nine+3);
Note: the misconception I describe means that each of the arguments in
the above function calls should have been preceded by an '&'. You would
probably have noticed my mistake earlier if I had written those calls
"correctly" according to my misconception.
Post by s***@gmail.com
Post by James R. Kuyper
Post by James Kuyper
}
...
Post by s***@gmail.com
Post by James R. Kuyper
I have to agree. All of the calls to f2() above except for f2(&six) are
constraint violations, for which at least one diagnostic message is
mandatory. Every real implementation I'm familiar with refuses to
compile code for which diagnostic messages are mandatory - this is not
required by the standard, but it is extremely common. You can sometimes
force a compiler to accept code for which diagnostics are mandatory, but
that's not true in general, and when it is true, it generally requires
using special command line options.
...
Post by s***@gmail.com
I'm very curios if you could provide me with more information about those implementations. Even maybe some links to them. You see because the experience I have with C compilers (msvc, clang, gcc) lead me to believe that code like this will mostly likely compile fine (and in-fact I tested it with gcc recently resulting in 0 warning and 0 errors).
My profoundest apologies. I've been guilty of reading what I expected to
be reading, rather than what you actually wrote. I've had very little
spare time since the twins were born, and I find myself constantly
rushing through these usenet messages without paying sufficient
attention to the details (not that I was particularly attentive to
detail before they were born!). You wrote:

void f(int *p) { p = *((int (*)[6])p); /* code using p */ }

What I thought you had written, because it's something similar that I
expected to see in this context, was

void f(int(*p)[6]) { /* code using p */ }

The reason I thought this, is because, in my opinion, the single most
important benefit of this new meaning for 'static' is that it enables
(without, unfortunately, mandating) a diagnostic when f() is called on
an array with a length that is too short. What you actually wrote
doesn't enable any such diagnostic. What I thought you wrote makes a
diagnostic mandatory for the case where the array is too short - in
itself, that would make it superior to the 'static' approach. However,
it unfortunately also makes a diagnostic mandatory when the array is
longer than 6 elements. Also, it provides no way to use the function
when all you have is a pointer to the first element of a sufficiently
long array, without having the array itself in scope.
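
The trade-off described above can be sketched side by side. The names sum6_static and sum6_ptr are hypothetical; the calls that would be rejected or undefined are mentioned only in comments so that the sketch itself compiles cleanly.

```c
/* 'static 6': the caller promises at least 6 elements. Longer arrays
   and pointers into the middle of an array are acceptable. Passing an
   array of 5 ints would be undefined behavior, and a diagnostic for
   it is permitted but not required. */
int sum6_static(int p[static 6])
{
    int total = 0;
    for (int i = 0; i < 6; i++)
        total += p[i];
    return total;
}

/* Pointer to array: the argument must have type int (*)[6]. Passing
   the address of an int[5] or an int[9] is a constraint violation,
   so a diagnostic is mandatory in both the too-short and the
   too-long case. */
int sum6_ptr(int (*p)[6])
{
    int total = 0;
    for (int i = 0; i < 6; i++)
        total += (*p)[i];
    return total;
}
```

Given int six[6] and int nine[9], the calls sum6_static(six), sum6_static(nine + 3) and sum6_ptr(&six) are all valid, while there is no equally direct way to hand nine + 3 to sum6_ptr.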

So, you are correct - your suggested alternative is not a constraint
violation in any of those cases. However, for precisely that reason, it
is also not an acceptable substitute, as far as I'm concerned. Getting
diagnostics in some (but not all) of those cases is the main reason I
want this feature.

The optimizations that are enabled by this feature would probably also
be enabled by your suggested alternative - a fact that is of negligible
importance to me.
Keith Thompson
2016-06-14 00:27:55 UTC
Permalink
"James R. Kuyper" <***@verizon.net> writes:
[...]
Post by James R. Kuyper
I have to agree. All of the calls to f2() above except for f2(&six) are
constraint violations, for which at least one diagnostic message is
mandatory. Every real implementation I'm familiar with refuses to
compile code for which diagnostic messages are mandatory - this is not
required by the standard, but it is extremely common. You can sometimes
force a compiler to accept code for which diagnostics are mandatory, but
that's not true in general, and when it is true, it generally requires
using special command line options.
[...]

(Straying a bit from the topic of `static`.)

Alas, that's not my experience. gcc in particular *very* commonly
issues non-fatal warning messages for many constraint violations. You
need to use extra command-line options (such as -pedantic-errors) to
force it to reject such code.

An example:

$ cat c.c
int main(void) {
int n = "hello";
}
$ gcc-6.1.0 -c c.c
c.c: In function ‘main’:
c.c:2:13: warning: initialization makes integer from pointer without
a cast [-Wint-conversion]
int n = "hello";
^~~~~~~
$

Even with "-pedantic -std=c11", it's just a warning. clang follows
gcc's example.

In fact all the C compilers I currently have access to (at least the
ones I've tried) print a non-fatal warning.
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"