Discussion:
printf("%p\n", (void*)0)
Keith Thompson
2016-10-18 00:08:35 UTC
N1570 7.1.4 Use of library functions:

Each of the following statements applies unless explicitly
stated otherwise in the detailed descriptions that follow:
If an argument to a function has an invalid value (such as
a value outside the domain of the function, or a pointer
outside the address space of the program, or a null pointer,
or a pointer to non-modifiable storage when the corresponding
parameter is not const-qualified) or a type (after promotion)
not expected by a function with variable number of arguments,
the behavior is undefined.

I infer from this that passing a null pointer to a library function has
undefined behavior unless there's an explicit statement to the contrary
for that function. For example, strlen(NULL) has undefined behavior,
but free(NULL) is well defined because there's an explicit statement to
that effect.
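The contrast can be made concrete. A minimal sketch (the guard helper `safe_strlen` is hypothetical, not from the standard or this thread):

```c
#include <stdlib.h>
#include <string.h>

/* free() has an explicit allowance ("If ptr is a null pointer, no
   action occurs"), so calling it with NULL is well defined: */
void demo_free(void) {
    free(NULL); /* no-op, by explicit statement in the standard */
}

/* strlen() has no such allowance, so strlen(NULL) is undefined; a
   caller holding a possibly-null pointer must guard the call itself: */
size_t safe_strlen(const char *s) {
    return s ? strlen(s) : 0; /* hypothetical guard; 0 for NULL */
}
```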

7.21.6.1p8:

p The argument shall be a pointer to void. The value of the
pointer is converted to a sequence of printing characters, in
an implementation-defined manner.

There is no explicit statement that a null pointer is allowed.
That implies, I think, that a null pointer is an invalid argument
value, and that
printf("%p\n", (void*)0);
has undefined behavior.

On the other hand, being able to print null pointer values, even
in an implementation-defined manner, is extremely useful, and I
know of no implementation that doesn't handle this as expected,
or any reason not to handle it.
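Code that wants to stay strictly conforming under the literal reading above can sidestep the question entirely. A sketch (the helper `format_ptr` and its "(null)" spelling are hypothetical, not from the standard):

```c
#include <stdio.h>
#include <string.h>

/* Special-case NULL rather than rely on %p accepting a null pointer;
   for non-null pointers the output form is implementation-defined. */
void format_ptr(char *buf, size_t bufsz, const void *p) {
    if (p == NULL)
        snprintf(buf, bufsz, "(null)");      /* our own chosen spelling */
    else
        snprintf(buf, bufsz, "%p", p);       /* implementation-defined */
}
```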

I'm assuming the intent is that the cases mentioned in the "such as"
clause of 7.1.4 are *always* invalid unless stated otherwise. The null
pointer case is the only one that could reasonably *sometimes* be valid.
On the other hand, the presence of explicit statements for some
functions that null pointers are allowed might suggest that such a
statement is required. On the other other hand, in most cases there has
to be an explicit description to describe what the behavior is.

In my opinion:

- Null pointers are always invalid arguments to library functions
unless explicitly permitted for a given function;

- An (overly) literal reading of the standard implies that
printf("%p\n", (void*)0) therefore has undefined behavior; but

- The intent is that it should be permitted, and it should print
an implementation-defined representation of a null pointer,
and therefore:

- The description of the "%p" conversion specifier *should*
explicitly state that a null pointer (of type void*) is permitted.

(Note that if my first point is wrong, and if a null pointer can
be valid even if there's no explicit statement to that effect, then
that raises questions about memcpy(NULL, NULL, 0). If I'm correct,
then memcpy(NULL, NULL, 0) has undefined behavior.)
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Kaz Kylheku
2016-10-18 00:53:34 UTC
Post by Keith Thompson
p The argument shall be a pointer to void. The value of the
pointer is converted to a sequence of printing characters, in
an implementation-defined manner.
There is no explicit statement that a null pointer is allowed.
Nice find!

That omission is arguably defective; of course we want %p to be able to
convey that a pointer is null.
Tim Rentsch
2016-10-18 12:53:35 UTC
Post by Keith Thompson
Each of the following statements applies unless explicitly
If an argument to a function has an invalid value (such as
a value outside the domain of the function, or a pointer
outside the address space of the program, or a null pointer,
or a pointer to non-modifiable storage when the corresponding
parameter is not const-qualified) or a type (after promotion)
not expected by a function with variable number of arguments,
the behavior is undefined.
I infer from this that passing a null pointer to a library function has
undefined behavior unless there's an explicit statement to the contrary
for that function. For example, strlen(NULL) has undefined behavior,
but free(NULL) is well defined because there's an explicit statement to
that effect.
p The argument shall be a pointer to void. The value of the
pointer is converted to a sequence of printing characters, in
an implementation-defined manner.
There is no explicit statement that a null pointer is allowed.
That implies, I think, that a null pointer is an invalid argument
value, and that
printf("%p\n", (void*)0);
has undefined behavior. [.. snip elaboration ..]
I reach a different conclusion. Let me try to explain what it is
and why I think so.

Library functions that take pointer-valued arguments do not
automatically regard a null pointer as an invalid value. Rather,
it depends in each case on how the particular argument is defined
or described in the function's semantics. Here are some examples.
(I have made no attempt to be exhaustive.)

For a %s value in fprintf() -

If no l length modifier is present, the argument shall be a
pointer to the initial element of an array of character
type. [...]

Clearly a null pointer doesn't satisfy the "shall" clause. Null
pointers are invalid values for these arguments.

For the first two arguments to memcpy() -

The memcpy function copies n characters from the object
pointed to by s2 into the object pointed to by s1.

The parameters s1 and s2 are expected (ie, required) to point to
objects. Null pointers are invalid values for these arguments.

For the first two arguments to strncpy() -

The strncpy function copies not more than n characters
(characters that follow a null character are not copied)
from the array pointed to by s2 to the array pointed to by
s1.

The parameters s1 and s2 are expected (ie, required) to point to
arrays. Null pointers are invalid values for these arguments.

For the first argument to snprintf() -

The snprintf function is equivalent to fprintf, except that
the output is written into an array (specified by argument
s) rather than to a stream. If n is zero, nothing is
written, and s may be a null pointer.

The parameter s1 is nominally expected to point to an array, but
the second sentence allows a null pointer in the case where n is
zero. Null pointers are valid values in such cases, and invalid
values otherwise.
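The "s may be a null pointer [when] n is zero" allowance quoted above is exactly what makes the common measure-then-allocate idiom well defined. A sketch (the wrapper `format_int` is an illustrative helper, not from the thread):

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns a freshly malloc'd formatted string, or NULL on failure. */
char *format_int(int v) {
    int need = snprintf(NULL, 0, "value=%d", v); /* null s, zero n: OK */
    if (need < 0)
        return NULL;
    char *buf = malloc((size_t)need + 1);
    if (buf != NULL)
        snprintf(buf, (size_t)need + 1, "value=%d", v);
    return buf;
}
```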

Now let's look at arguments corresponding to a %p conversion
in fprintf() (copied from what you wrote in your posting) -

The argument shall be a pointer to void. The value of the
pointer is converted to a sequence of printing characters,
[...]

What is required is an argument of type pointer to void. The
description doesn't say anything about what the pointer does or
does not point to; the action depends only on the value of the
pointer, not whether or not it's a null pointer. Null pointers
are valid values for these arguments.

To restate: in my view whether null pointers are meant to be
valid values depends on the description of the parameter in each
case. There is not a default presumption that null pointers are
invalid values - whether they are or not depends on the semantics
of the particular function and argument. I admit the wording
used in 7.1.4 p1 is somewhat misleading, and could give the
impression that null pointers are invalid values unless there
is an explicit statement to the contrary. The key point though
is not the null-pointer-ness but the invalid-ness - an argument
described as a pointer value, with no mention made of what the
pointer points to, admits a null pointer as a valid value.

I fully agree that text in the Standard deserves clarification
on this topic, so on that point I agree with you.
s***@casperkitty.com
2016-10-18 16:19:07 UTC
Post by Tim Rentsch
For the first two arguments to strncpy() -
The strncpy function copies not more than n characters
(characters that follow a null character are not copied)
from the array pointed to by s2 to the array pointed to by
s1.
The parameters s1 and s2 are expected (ie, required) to point to
arrays. Null pointers are invalid values for these arguments.
Given:

struct fam { int size; char dat[]; };

void test(void *src, int size)
{
    struct fam *dest = malloc(size + sizeof (struct fam));
    if (dest)
        memcpy(dest->dat, src, size);
}

would behavior be defined in the case where size is zero? If so, to
what kind of object would fam->dat point? It would seem unlikely that
the authors of the C89 Standard would have expected that any
implementation of memcpy() would do anything weird in the above case,
but nothing in the language of the Standard would imply that a one-past
pointer would be any different from a null pointer.

In any case, I would say that it would not be patently unreasonable for
someone reading the Standard to believe that copying zero bytes will be
a no-op on any commonplace implementation, and that anyone who is trying
to write a commonplace implementation should treat as defined anything
that would be regarded as defined under any remotely-reasonable reading
of the Standard.
Keith Thompson
2016-10-18 18:25:35 UTC
Post by s***@casperkitty.com
Post by Tim Rentsch
For the first two arguments to strncpy() -
The strncpy function copies not more than n characters
(characters that follow a null character are not copied)
from the array pointed to by s2 to the array pointed to by
s1.
The parameters s1 and s2 are expected (ie, required) to point to
arrays. Null pointers are invalid values for these arguments.
struct fam { int size; char dat[]; };
void test(void *src, int size)
{
    struct fam *dest = malloc(size + sizeof (struct fam));
    if (dest)
        memcpy(dest->dat, src, size);
}
would behavior be defined in the case where size is zero? If so, to
what kind of object would fam->dat point? It would seem unlikely that
the authors of the C89 Standard would have expected that any
implementation of memcpy() would do anything weird in the above case,
but nothing in the language of the Standard would imply that a one-past
pointer would be any different from a null pointer.
In any case, I would say that it would not be patently unreasonable for
someone reading the Standard to believe that copying zero bytes will be
a no-op on any commonplace implementation, and that anyone who is trying
to write a commonplace implementation should treat as defined anything
that would be regarded as defined under any remotely-reasonable reading
of the Standard.
The question is not whether it's reasonable for memcpy(NULL, NULL, 0)
to have defined behavior, doing nothing. The question is whether
the standard defines the behavior.
Tim Rentsch
2016-10-19 08:13:26 UTC
Post by s***@casperkitty.com
In any case, I would say that it would not be patently unreasonable for
someone reading the Standard to believe that copying zero bytes will be
a no-op on any commonplace implementation, and that anyone who is trying
to write a commonplace implementation should treat as defined anything
that would be regarded as defined under any remotely-reasonable reading
of the Standard.
Or if you'd like it put more simply: Never imagine yourself
not to be otherwise than what it might appear to others that
what you were or might have been was not otherwise than what
you had been would have appeared to them to be otherwise.
s***@casperkitty.com
2016-10-19 14:42:07 UTC
Post by Tim Rentsch
Post by s***@casperkitty.com
In any case, I would say that it would not be patently unreasonable for
someone reading the Standard to believe that copying zero bytes will be
a no-op on any commonplace implementation, and that anyone who is trying
to write a commonplace implementation should treat as defined anything
that would be regarded as defined under any remotely-reasonable reading
of the Standard.
Or if you'd like it put more simply: Never imagine yourself
not to be otherwise than what it might appear to others that
what you were or might have been was not otherwise than what
you had been would have appeared to them to be otherwise.
Let me break it down for you:

1. It is not patently unreasonable to expect that copying 0 bytes would be
a no-op regardless of the pointers, when using a general-purpose build
on anything resembling remotely-commonplace hardware. Some people might
think it reasonable, some not, but it is not so unreasonable that no
reasonable person could think it otherwise.

2. No reasonable person would be astonished at a general-purpose compiler
which completely ignored the pointer arguments whenever the size is zero.

3. When choosing between a behavior which is guaranteed not to cause any
astonishment versus one which is likely to astonish people who are being
even remotely reasonable, a quality compiler should opt for the former
in the absence of a compelling reason to use the latter.
Tim Rentsch
2016-10-19 22:42:58 UTC
Post by s***@casperkitty.com
Post by s***@casperkitty.com
In any case, I would say that it would not be patently unreasonable for
someone reading the Standard to believe that copying zero bytes will be
a no-op on any commonplace implementation, and that anyone who is trying
to write a commonplace implementation should treat as defined anything
that would be regarded as defined under any remotely-reasonable reading
of the Standard.
Or if you'd like it put more simply: Never imagine yourself
not to be otherwise than what it might appear to others that
what you were or might have been was not otherwise than what
you had been would have appeared to them to be otherwise.
1. It is not patently unreasonable to expect that copying 0 bytes would be
a no-op regardless of the pointers, when using a general-purpose build
on anything resembling remotely-commonplace hardware. Some people might
think it reasonable, some not, but it is not so unreasonable that no
reasonable person could think it otherwise.
2. No reasonable person would be astonished at a general-purpose compiler
which completely ignored the pointer arguments whenever the size is zero.
3. When choosing between a behavior which is guaranteed not to cause any
astonishment versus one which is likely to astonish people who are being
even remotely reasonable, a quality compiler should opt for the former
in the absence of a compelling reason to use the latter.
You are welcome to your opinion. My earlier remarks are
only about whether the Standard /does/ define the behavior,
not whether or who may think it /should/ define the behavior.
I will offer no opinion on the latter question at this time
(and probably not any other time either).
Keith Thompson
2016-10-22 23:47:25 UTC
Post by s***@casperkitty.com
Post by Tim Rentsch
Post by s***@casperkitty.com
In any case, I would say that it would not be patently unreasonable for
someone reading the Standard to believe that copying zero bytes will be
a no-op on any commonplace implementation, and that anyone who is trying
to write a commonplace implementation should treat as defined anything
that would be regarded as defined under any remotely-reasonable reading
of the Standard.
Or if you'd like it put more simply: Never imagine yourself
not to be otherwise than what it might appear to others that
what you were or might have been was not otherwise than what
you had been would have appeared to them to be otherwise.
1. It is not patently unreasonable to expect that copying 0 bytes would be
a no-op regardless of the pointers, when using a general-purpose build
on anything resembling remotely-commonplace hardware. Some people might
think it reasonable, some not, but it is not so unreasonable that no
reasonable person could think it otherwise.
2. No reasonable person would be astonished at a general-purpose compiler
which completely ignored the pointer arguments whenever the size is zero.
3. When choosing between a behavior which is guaranteed not to cause any
astonishment versus one which is likely to astonish people who are being
even remotely reasonable, a quality compiler should opt for the former
in the absence of a compelling reason to use the latter.
Thanks to James Kuyper's recent observation on comp.lang.c, the standard
is not ambiguous on the question of whether memcpy(NULL, NULL, 0) has
defined behavior.

One might argue that the "such as" clause in 7.1.4 doesn't imply that a
null pointer is *always* an invalid argument unless otherwise stated.
(I don't agree, but it could be a reasonable interpretation.)

But 7.24.1p2, which covers the <string.h> functions, says:

Where an argument declared as size_t n specifies the length
of the array for a function, n can have the value zero on a
call to that function. Unless explicitly stated otherwise in
the description of a particular function in this subclause,
pointer arguments on such a call shall still have valid values,
as described in 7.1.4. On such a call, a function that locates
a character finds no occurrence, a function that compares two
character sequences returns zero, and a function that copies
characters copies zero characters.

A null pointer is clearly not a valid value "as described
in 7.1.4", so the standard does not define the behavior of
memcpy(NULL, NULL, 0).

If you want to advocate changing the standard so that
memcpy(NULL, NULL, 0) is a well-defined no-op, you're free to do so,
but I suggest there's no point in repeating that argument for the
Nth time.
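Under this reading, portable callers that may legitimately hold null pointers alongside a zero count have to guard the call themselves. A sketch (the wrapper `memcpy_checked` is a hypothetical helper, not anything the standard or the thread proposes):

```c
#include <string.h>

/* Skip the call entirely when n is zero, so possibly-null pointers
   are never actually passed to memcpy. */
void *memcpy_checked(void *dst, const void *src, size_t n) {
    if (n == 0)
        return dst; /* nothing to copy */
    return memcpy(dst, src, n);
}
```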
Tim Rentsch
2016-10-23 15:20:53 UTC
Post by Keith Thompson
Post by s***@casperkitty.com
In any case, I would say that it would not be patently unreasonable for
someone reading the Standard to believe that copying zero bytes will be
a no-op on any commonplace implementation, and that anyone who is trying
to write a commonplace implementation should treat as defined anything
that would be regarded as defined under any remotely-reasonable reading
of the Standard.
Or if you'd like it put more simply: Never imagine yourself
not to be otherwise than what it might appear to others that
what you were or might have been was not otherwise than what
you had been would have appeared to them to be otherwise.
1. It is not patently unreasonable to expect that copying 0 bytes would be
a no-op regardless of the pointers, when using a general-purpose build
on anything resembling remotely-commonplace hardware. Some people might
think it reasonable, some not, but it is not so unreasonable that no
reasonable person could think it otherwise.
2. No reasonable person would be astonished at a general-purpose compiler
which completely ignored the pointer arguments whenever the size is zero.
3. When choosing between a behavior which is guaranteed not to cause any
astonishment versus one which is likely to astonish people who are being
even remotely reasonable, a quality compiler should opt for the former
in the absence of a compelling reason to use the latter.
Thanks to James Kuyper's recent observation on comp.lang.c, the standard
is not ambiguous on the question of whether memcpy(NULL, NULL, 0) has
defined behavior.
One might argue that the "such as" clause in 7.1.4 doesn't imply that a
null pointer is *always* an invalid argument unless otherwise stated.
(I don't agree, but it could be a reasonable interpretation.)
Where an argument declared as size_t n specifies the length
of the array for a function, n can have the value zero on a
call to that function. Unless explicitly stated otherwise in
the description of a particular function in this subclause,
pointer arguments on such a call shall still have valid values,
as described in 7.1.4. On such a call, a function that locates
a character finds no occurrence, a function that compares two
character sequences returns zero, and a function that copies
characters copies zero characters.
A null pointer is clearly not a valid value "as described
in 7.1.4", so the standard does not define the behavior of
memcpy(NULL, NULL, 0).
Furthermore, the description of memcpy() says:

The memcpy function copies n characters from the object
pointed to by s2 into the object pointed to by s1.

To my way of thinking this sentence eliminates null pointers from
the set of valid argument values even without the "such as"
clause.
Jakob Bohm
2016-10-23 17:59:05 UTC
Post by Tim Rentsch
Post by Keith Thompson
Post by s***@casperkitty.com
In any case, I would say that it would not be patently unreasonable for
someone reading the Standard to believe that copying zero bytes will be
a no-op on any commonplace implementation, and that anyone who is trying
to write a commonplace implementation should treat as defined anything
that would be regarded as defined under any remotely-reasonable reading
of the Standard.
Or if you'd like it put more simply: Never imagine yourself
not to be otherwise than what it might appear to others that
what you were or might have been was not otherwise than what
you had been would have appeared to them to be otherwise.
1. It is not patently unreasonable to expect that copying 0 bytes would be
a no-op regardless of the pointers, when using a general-purpose build
on anything resembling remotely-commonplace hardware. Some people might
think it reasonable, some not, but it is not so unreasonable that no
reasonable person could think it otherwise.
2. No reasonable person would be astonished at a general-purpose compiler
which completely ignored the pointer arguments whenever the size is zero.
3. When choosing between a behavior which is guaranteed not to cause any
astonishment versus one which is likely to astonish people who are being
even remotely reasonable, a quality compiler should opt for the former
in the absence of a compelling reason to use the latter.
Thanks to James Kuyper's recent observation on comp.lang.c, the standard
is not ambiguous on the question of whether memcpy(NULL, NULL, 0) has
defined behavior.
One might argue that the "such as" clause in 7.1.4 doesn't imply that a
null pointer is *always* an invalid argument unless otherwise stated.
(I don't agree, but it could be a reasonable interpretation.)
Where an argument declared as size_t n specifies the length
of the array for a function, n can have the value zero on a
call to that function. Unless explicitly stated otherwise in
the description of a particular function in this subclause,
pointer arguments on such a call shall still have valid values,
as described in 7.1.4. On such a call, a function that locates
a character finds no occurrence, a function that compares two
character sequences returns zero, and a function that copies
characters copies zero characters.
A null pointer is clearly not a valid value "as described
in 7.1.4", so the standard does not define the behavior of
memcpy(NULL, NULL, 0).
The memcpy function copies n characters from the object
pointed to by s2 into the object pointed to by s1.
To my way of thinking this sentence eliminates null pointers from
the set of valid argument values even without the "such as"
clause.
One reasonably good reason not to implement the memcpy(NULL, NULL, 0)
case (but still implement the memcpy(one_past_the_post,
one_past_another_post, 0) case) would be the following:

Imagine a computer system broadly similar to the 16-bit protected mode
x86 platform (i.e. 80286 and later), with the sole difference being that
loading a 0 into a "segment address register" causes an immediate
exception rather than delaying this to each subsequent use of said
register (which is what the x86 does). Now for performance (memcpy is
a frequently optimized C library performance hotspot) it is faster to
load the pointer arguments into address registers before whichever
instruction prevents doing anything on a 0 count. However that causes
the rare (and apparently optional) memcpy(NULL, NULL, 0) case to
instantly raise a fatal exception. If the standard required
memcpy(NULL, NULL, 0) to be a safe NOP, then such an implementation
would have to slow down millions and millions of non-NOP memcpy() calls
just to implement that additional requirement.

Note that supercat's example posted on the 18th does not invoke the
memcpy(NULL, NULL, 0) case, only the memcpy(one_past_the_post,
one_past_another_post, 0) case, thus being a nice example of code
unaffected by the argument above.


Enjoy

Jakob
--
Jakob Bohm, CIO, Partner, WiseMo A/S. https://www.wisemo.com
Transformervej 29, 2860 Søborg, Denmark. Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded
Tim Rentsch
2016-10-24 01:20:06 UTC
Post by Jakob Bohm
Post by Tim Rentsch
Post by Keith Thompson
Post by s***@casperkitty.com
Post by s***@casperkitty.com
In any case, I would say that it would not be patently unreasonable for
someone reading the Standard to believe that copying zero bytes will be
a no-op on any commonplace implementation, and that anyone who is trying
to write a commonplace implementation should treat as defined anything
that would be regarded as defined under any remotely-reasonable reading
of the Standard.
Or if you'd like it put more simply: Never imagine yourself
not to be otherwise than what it might appear to others that
what you were or might have been was not otherwise than what
you had been would have appeared to them to be otherwise.
1. It is not patently unreasonable to expect that copying 0 bytes would be
a no-op regardless of the pointers, when using a general-purpose build
on anything resembling remotely-commonplace hardware. Some people might
think it reasonable, some not, but it is not so unreasonable that no
reasonable person could think it otherwise.
2. No reasonable person would be astonished at a general-purpose compiler
which completely ignored the pointer arguments whenever the size is zero.
3. When choosing between a behavior which is guaranteed not to cause any
astonishment versus one which is likely to astonish people who are being
even remotely reasonable, a quality compiler should opt for the former
in the absence of a compelling reason to use the latter.
Thanks to James Kuyper's recent observation on comp.lang.c, the standard
is not ambiguous on the question of whether memcpy(NULL, NULL, 0) has
defined behavior.
One might argue that the "such as" clause in 7.1.4 doesn't imply that a
null pointer is *always* an invalid argument unless otherwise stated.
(I don't agree, but it could be a reasonable interpretation.)
Where an argument declared as size_t n specifies the length
of the array for a function, n can have the value zero on a
call to that function. Unless explicitly stated otherwise in
the description of a particular function in this subclause,
pointer arguments on such a call shall still have valid values,
as described in 7.1.4. On such a call, a function that locates
a character finds no occurrence, a function that compares two
character sequences returns zero, and a function that copies
characters copies zero characters.
A null pointer is clearly not a valid value "as described
in 7.1.4", so the standard does not define the behavior of
memcpy(NULL, NULL, 0).
The memcpy function copies n characters from the object
pointed to by s2 into the object pointed to by s1.
To my way of thinking this sentence eliminates null pointers from
the set of valid argument values even without the "such as"
clause.
One reasonably good reason not to implement the memcpy(NULL, NULL, 0)
case (but still implement the memcpy(one_past_the_post,
Imagine a computer system vastly similar to the 16 bit protected mode
x86 platform (=80286 and later), with the sole difference being that
loading a 0 into a "segment address register" causes an immediate
exception rather than delaying this to each subsequent use of said
register (which is what the x86 does). Now for performance (memcpy is
a frequently optimized C library performance hotspot) it is faster to
load the pointer arguments into address registers before whichever
instruction prevents doing anything on a 0 count. However that causes
the rare (and apparently optional) memcpy(NULL, NULL, 0) case to
instantly raise a fatal exception. If the standard required
memcpy(NULL, NULL, 0) to be a safe NOP, then such an implementation
would have to slow down millions and millions of non-NOP memcpy() calls
just to implement that additional requirement.
My concern here is only what the Standard does allow, not what it
should or should not allow.
Post by Jakob Bohm
Note that supercat's example posted on the 18th does not invoke the
memcpy(NULL, NULL, 0) case, only the memcpy(one_past_the_post,
one_past_another_post, 0) case, thus being a nice example of code
unaffected by the argument above.
In my reading of the Standard, this question is addressed by
section 7.24.1, paragraphs 1 and 2, and section 7.1.4 paragraph 1
(noting that 7.24.1 p2 references 7.1.4). A key sentence in
7.1.4 p1 concerns arrays:

If a function argument is described as being an array, the
pointer actually passed to the function shall have a value
such that all address computations and accesses to objects
(that would be valid if the pointer did point to the first
element of such an array) are in fact valid.

Generally a pointer to one past the last element of an array is
considered to be part of the array for purposes of address
computation. Taken in conjunction with 7.24.1 p2 (quoted above),
the quoted sentence makes a pretty good case that the behavior
in the one-past-the-last-element-of-an-array scenario is indeed
defined behavior.
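That reading can be illustrated directly. A sketch (assuming, as argued above, that a zero-length copy through one-past-the-end pointers is defined and copies nothing):

```c
#include <string.h>

/* Returns 1 if a zero-length memcpy through one-past-the-end pointers
   leaves both arrays untouched; such pointers are valid for address
   computation, unlike null pointers. */
int zero_copy_one_past_end(void) {
    char a[4] = "abc";
    char b[4] = "xyz";
    memcpy(a + 4, b + 4, 0); /* one-past-the-end pointers, n == 0 */
    return a[0] == 'a' && b[0] == 'x';
}
```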
Jakob Bohm
2016-10-24 01:54:04 UTC
Post by Tim Rentsch
Post by Jakob Bohm
Post by Tim Rentsch
Post by Keith Thompson
Post by s***@casperkitty.com
Post by s***@casperkitty.com
In any case, I would say that it would not be patently unreasonable for
someone reading the Standard to believe that copying zero bytes will be
a no-op on any commonplace implementation, and that anyone who is trying
to write a commonplace implementation should treat as defined anything
that would be regarded as defined under any remotely-reasonable reading
of the Standard.
Or if you'd like it put more simply: Never imagine yourself
not to be otherwise than what it might appear to others that
what you were or might have been was not otherwise than what
you had been would have appeared to them to be otherwise.
1. It is not patently unreasonable to expect that copying 0 bytes would be
a no-op regardless of the pointers, when using a general-purpose build
on anything resembling remotely-commonplace hardware. Some people might
think it reasonable, some not, but it is not so unreasonable that no
reasonable person could think it otherwise.
2. No reasonable person would be astonished at a general-purpose compiler
which completely ignored the pointer arguments whenever the size is zero.
3. When choosing between a behavior which is guaranteed not to cause any
astonishment versus one which is likely to astonish people who are being
even remotely reasonable, a quality compiler should opt for the former
in the absence of a compelling reason to use the latter.
Thanks to James Kuyper's recent observation on comp.lang.c, the standard
is not ambiguous on the question of whether memcpy(NULL, NULL, 0) has
defined behavior.
One might argue that the "such as" clause in 7.1.4 doesn't imply that a
null pointer is *always* an invalid argument unless otherwise stated.
(I don't agree, but it could be a reasonable interpretation.)
Where an argument declared as size_t n specifies the length
of the array for a function, n can have the value zero on a
call to that function. Unless explicitly stated otherwise in
the description of a particular function in this subclause,
pointer arguments on such a call shall still have valid values,
as described in 7.1.4. On such a call, a function that locates
a character finds no occurrence, a function that compares two
character sequences returns zero, and a function that copies
characters copies zero characters.
A null pointer is clearly not a valid value "as described
in 7.1.4", so the standard does not define the behavior of
memcpy(NULL, NULL, 0).
The memcpy function copies n characters from the object
pointed to by s2 into the object pointed to by s1.
To my way of thinking this sentence eliminates null pointers from
the set of valid argument values even without the "such as"
clause.
One reasonably good reason not to implement the memcpy(NULL, NULL, 0)
case (while still supporting the memcpy(one_past_the_post,
one_past_another_post, 0) case):
Imagine a computer system vastly similar to the 16 bit protected mode
x86 platform (=80286 and later), with the sole difference being that
loading a 0 into a "segment address register" causes an immediate
exception rather than delaying this to each subsequent use of said
register (which is what the x86 does). Now for performance (memcpy is
a frequently optimized C library performance hotspot) it is faster to
load the pointer arguments into address registers before whichever
instruction prevents doing anything on a 0 count. However that causes
the rare (and apparently optional) memcpy(NULL, NULL, 0) case to
instantly raise a fatal exception. If the standard required
memcpy(NULL, NULL, 0) to be a safe NOP, then such an implementation
would have to slow down millions and millions of non-NOP memcpy() calls
just to implement that additional requirement.
My concern here is only what the Standard does allow, not what it
should or should not allow.
Someone else suggested there might be no reasonable reason for the
standard not requiring memcpy(NULL, NULL, 0) to be a safe NOP, so I
provided a counter-example.
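For code that has to stay portable while the question is open, the issue can be sidestepped at the call site. The sketch below is only an illustration; safe_memcpy is a made-up helper name, not a standard function:

```c
#include <string.h>

/* Hypothetical helper: forwards to memcpy only when there is something
 * to copy, so a null (or one-past-the-end) pointer never reaches
 * memcpy itself in the n == 0 case. */
static void *safe_memcpy(void *dest, const void *src, size_t n)
{
    if (n > 0)
        memcpy(dest, src, n);   /* pointers must be valid here */
    return dest;
}
```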
Post by Tim Rentsch
Post by Jakob Bohm
Note that supercat's example posted on the 18th does not invoke the
memcpy(NULL, NULL, 0) case, only the memcpy(one_past_the_post,
one_past_another_post, 0) case, thus being a nice example of code
unaffected by the argument above.
In my reading of the Standard, this question is addressed by
section 7.24.1, paragraphs 1 and 2, and section 7.1.4 paragraph 1
(noting that 7.24.1 p2 references 7.1.4). A key sentence in
If a function argument is described as being an array, the
pointer actually passed to the function shall have a value
such that all address computations and accesses to objects
(that would be valid if the pointer did point to the first
element of such an array) are in fact valid.
Generally a pointer to one past the last element of an array is
considered to be part of the array for purposes of address
computation. Taken in conjunction with 7.24.1 p2 (quoted above),
the quoted sentence makes a pretty good case that the behavior
in the one-past-the-last-element-of-an-array scenario is indeed
defined behavior.
No argument with that.

Enjoy

Jakob
--
Jakob Bohm, CIO, Partner, WiseMo A/S. https://www.wisemo.com
Transformervej 29, 2860 Søborg, Denmark. Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded
Richard Kettlewell
2016-10-24 08:22:38 UTC
Permalink
Raw Message
Post by Jakob Bohm
Imagine a computer system vastly similar to the 16 bit protected mode
x86 platform (=80286 and later), with the sole difference being that
loading a 0 into a "segment address register" causes an immediate
exception rather than delaying this to each subsequent use of said
register (which is what the x86 does).
Does such a platform exist?
Post by Jakob Bohm
Now for performance (memcpy is a frequently optimized C library
performance hotspot) it is faster to load the pointer arguments into
address registers before whichever instruction prevents doing anything
on a 0 count. However that causes the rare (and apparently optional)
memcpy(NULL, NULL, 0) case to instantly raise a fatal exception. If
the standard required memcpy(NULL, NULL, 0) to be a safe NOP, then
such an implementation would have to slow down millions and millions
of non-NOP memcpy() calls just to implement that additional
requirement.
If there is such a platform then it could easily be supported without
performance compromises while nevertheless keeping memcpy() NULL-safe,
for instance:

void *memcpy(void *dest, const void *src, size_t n);
-- same rules as memcpy now, except that src=NULL or dest=NULL are
permitted when n=0
void *memcpy_unsafe(void *dest, const void *src, size_t n);
-- same rules as memcpy now.
--
http://www.greenend.org.uk/rjk/
Jakob Bohm
2016-10-24 08:39:13 UTC
Permalink
Raw Message
Post by Richard Kettlewell
Post by Jakob Bohm
Imagine a computer system vastly similar to the 16 bit protected mode
x86 platform (=80286 and later), with the sole difference being that
loading a 0 into a "segment address register" causes an immediate
exception rather than delaying this to each subsequent use of said
register (which is what the x86 does).
Does such a platform exist?
I don't know, it is a big world out there. Note that x86 does raise an
exception the moment an invalid non-NULL address is loaded into those
registers.
Post by Richard Kettlewell
Post by Jakob Bohm
Now for performance (memcpy is a frequently optimized C library
performance hotspot) it is faster to load the pointer arguments into
address registers before whichever instruction prevents doing anything
on a 0 count. However that causes the rare (and apparently optional)
memcpy(NULL, NULL, 0) case to instantly raise a fatal exception. If
the standard required memcpy(NULL, NULL, 0) to be a safe NOP, then
such an implementation would have to slow down millions and millions
of non-NOP memcpy() calls just to implement that additional
requirement.
If there is such a platform then it could easily be supported without
performance compromises while nevertheless keeping memcpy() NULL-safe,
void *memcpy(void *dest, const void *src, size_t n);
-- same rules as memcpy now, except that src=NULL or dest=NULL are
permitted when n=0
void *memcpy_unsafe(void *dest, const void *src, size_t n);
-- same rules as memcpy now.
Except that such a solution would slow down portable programs that use
memcpy() within its current limitations.

Enjoy

Jakob
--
Jakob Bohm, CIO, Partner, WiseMo A/S. https://www.wisemo.com
Transformervej 29, 2860 Søborg, Denmark. Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded
Richard Kettlewell
2016-10-24 09:38:25 UTC
Permalink
Raw Message
Post by Jakob Bohm
Post by Richard Kettlewell
If there is such a platform then it could easily be supported without
performance compromises while nevertheless keeping memcpy() NULL-safe,
void *memcpy(void *dest, const void *src, size_t n);
-- same rules as memcpy now, except that src=NULL or dest=NULL are
permitted when n=0
void *memcpy_unsafe(void *dest, const void *src, size_t n);
-- same rules as memcpy now.
Except that such a solution would slow down portable programs that use
memcpy() within its current limitations.
If you think you have such a program, and the trivial performance
difference mattered, it would be extremely easy to bulk-replace memcpy
with memcpy_unsafe.
--
http://www.greenend.org.uk/rjk/
Jakob Bohm
2016-10-24 09:43:59 UTC
Permalink
Raw Message
Post by Richard Kettlewell
Post by Jakob Bohm
Post by Richard Kettlewell
If there is such a platform then it could easily be supported without
performance compromises while neverthless keeping memcpy() NULL-safe,
void *memcpy(void *dest, const void *src, size_t n);
-- same rules as memcpy now, except that src=NULL or dest=NULL are
permitted when n=0
void *memcpy_unsafe(void *dest, const void *src, size_t n);
-- same rules as memcpy now.
Except that such a solution would slow down portable programs that use
memcpy() within its current limitations.
If you think you have such a program, and the trivial performance
difference mattered, it would be extremely easy to bulk-replace memcpy
with memcpy_unsafe.
Compiler authors consistently seem to think such programs are very
common, given how they tend to provide fine tuned assembler
implementations of memcpy, sometimes with compile-time inlining of the
implementation.

And changing a portable program to do something compiler-specific by
systematically replacing a standard library function by a non-standard
function is not really practical if the program is to remain portable.

Enjoy

Jakob
--
Jakob Bohm, CIO, Partner, WiseMo A/S. https://www.wisemo.com
Transformervej 29, 2860 Søborg, Denmark. Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded
Richard Kettlewell
2016-10-24 10:27:58 UTC
Permalink
Raw Message
Post by Jakob Bohm
And changing a portable program to do something compiler-specific by
systematically replacing a standard library function by a non-standard
function is not really practical if the program is to remain portable.
I’m not proposing a compiler-specific change, I’m suggesting the above
would be an improvement to the language specification.
--
http://www.greenend.org.uk/rjk/
s***@casperkitty.com
2016-10-24 17:21:08 UTC
Permalink
Raw Message
Post by Jakob Bohm
Post by Richard Kettlewell
If you think you have such a program, and the trivial performance
difference mattered, it would be extremely easy to bulk-replace memcpy
with memcpy_unsafe.
Compiler authors consistently seem to think such programs are very
common, given how they tend to provide fine tuned assembler
implementations of memcpy, sometimes with compile-time inlining of the
implementation.
Most optimized versions of memcpy() have a smaller per-byte cost but higher
loop-setup cost than would a straightforward implementation of the version
shown in K&R [which wouldn't even look at the pointers in the size-zero
case]. To minimize the extra cost when copying small blocks, such
routines generally start with a test for whether the size is sufficient to
justify using anything other than the simple loop. Even on a platform
which pre-loaded segment registers even before the simple copy loop, making
the size-zero case ignore the pointers would at worst require changing the
code for the simple form from:

load source ptr
load dest ptr
goto loopCheck
loop:
decrement size
copy source to dest
inc source and dest
loopCheck:
if size is non-zero goto loop
exit:
return

to

if size==0 goto exit
load source ptr
load dest ptr
loop:
decrement size
copy source to dest
inc source and dest
if size is non-zero goto loop
exit:
return

That would at most require a size-zero check to be added to the machine
code, without affecting the total number of such checks executed (note that
if execution time matters more than code size, the extra test should be
included in the code *anyway*).
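Rendered as C, the second variant might look like the following. This is only a sketch of a plain byte-wise loop; a real library memcpy is far more elaborate:

```c
#include <stddef.h>

/* Simple-loop copy with the size-zero early-out discussed above: when
 * n == 0 the function returns before the pointers are dereferenced or
 * offset at all. */
void *memcpy_sketch(void *dest, const void *src, size_t n)
{
    if (n == 0)
        return dest;            /* exit before touching the pointers */
    unsigned char *d = dest;
    const unsigned char *s = src;
    do {
        *d++ = *s++;            /* copy one byte, advance both pointers */
    } while (--n != 0);         /* "if size is non-zero goto loop" */
    return dest;
}
```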

The only case I can imagine where guaranteeing that zero-byte memcpy is
always a no-op would cost anything is a machine where the implementation
isn't in some kind of shared library, ROM, etc. And that seems like too
rare and arcane a situation to justify requiring programmers to include
extra size-zero checks before calling memcpy.
Ben Bacarisse
2016-10-23 19:36:38 UTC
Permalink
Raw Message
Keith Thompson <kst-***@mib.org> writes:
<snip>
Post by Keith Thompson
Thanks to James Kuyper's recent observation on comp.lang.c, the standard
is not ambiguous on the question of whether memcpy(NULL, NULL, 0) has
defined behavior.
One might argue that the "such as" clause in 7.1.4 doesn't imply that a
null pointer is *always* an invalid argument unless otherwise stated.
(I don't agree, but it could be a reasonable interpretation.)
Where an argument declared as size_t n specifies the length
of the array for a function, n can have the value zero on a
call to that function. Unless explicitly stated otherwise in
the description of a particular function in this subclause,
pointer arguments on such a call shall still have valid values,
as described in 7.1.4. On such a call, a function that locates
a character finds no occurrence, a function that compares two
character sequences returns zero, and a function that copies
characters copies zero characters.
A null pointer is clearly not a valid value "as described
in 7.1.4", so the standard does not define the behavior of
memcpy(NULL, NULL, 0).
Not sure if it's already come up, but that wording was added as the
result of a defect report[1], the response to which makes the
committee's intent very clear. Had the special remark about zero length
"objects" been intended to exempt any other requirements on the pointer
arguments, the requirement that they "still have valid values" would
have been omitted (at the very least).

[1] http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_054.html

<snip>
--
Ben.
Keith Thompson
2016-10-18 18:49:10 UTC
Permalink
Raw Message
Post by Tim Rentsch
Post by Keith Thompson
Each of the following statements applies unless explicitly
If an argument to a function has an invalid value (such as
a value outside the domain of the function, or a pointer
outside the address space of the program, or a null pointer,
or a pointer to non-modifiable storage when the corresponding
parameter is not const-qualified) or a type (after promotion)
not expected by a function with variable number of arguments,
the behavior is undefined.
I infer from this that passing a null pointer to a library function has
undefined behavior unless there's an explicit statement to the contrary
for that function. For example, strlen(NULL) has undefined behavior,
but free(NULL) is well defined because there's an explicit statement to
that effect.
p The argument shall be a pointer to void. The value of the
pointer is converted to a sequence of printing characters, in
an implementation-defined manner.
There is no explicit statement that a null pointer is allowed.
That implies, I think, that a null pointer is an invalid argument
value, and that
printf("%p\n", (void*)0);
has undefined behavior. [.. snip elaboration ..]
I reach a different conclusion. Let me try to explain what it is
and why I think so.
I find your reasoning plausible, but I disagree with it.
Post by Tim Rentsch
Library functions that take pointer-valued arguments do not
automatically regard a null pointer as an invalid value. Rather,
it depends in each case on how the particular argument is defined
or described in the function's semantics. Here are some examples.
(I have made no attempt to be exhaustive.)
My problem with that is that all the other items in the "such as" list
are things that are *always* invalid:

- a value outside the domain of the function,
- a pointer outside the address space of the program,
- a null pointer (this one is perhaps questionable),
- a pointer to non-modifiable storage when the corresponding parameter
is not const-qualified.

and there is no wording to suggest that the null pointer case is to be
treated specially, as something that may or may not be invalid.

Furthermore, the wording just before the "such as" clause allows for
exceptions to be stated explicitly -- and those exceptions *are* stated
explicitly for several functions that take pointers.

Perhaps these two arguments are somewhat in opposition to each other;
null pointers are sometimes valid, so that case *is* special. But the
"unless explicitly stated otherwise" clause applies to the rest of that
very long paragraph, and perhaps to all of 7.1.4.
Post by Tim Rentsch
For a %s value in fprintf() -
If no l length modifier is present, the argument shall be a
pointer to the initial element of an array of character
type. [...]
Clearly a null pointer doesn't satisfy the "shall" clause. Null
pointers are invalid values for these arguments.
Yes, but that doesn't *just* ban null pointers. I'd say null pointers
are not permitted both because of that specific clause and because null
pointers are invalid unless otherwise stated.
Post by Tim Rentsch
For the first two arguments to memcpy() -
The memcpy function copies n characters from the object
pointed to by s2 into the object pointed to by s1.
The parameters s1 and s2 are expected (ie, required) to point to
objects. Null pointers are invalid values for these arguments.
Which perhaps raises the question of memcpy(NULL, NULL, 0). But yes,
even where n==0, this refers to "the object", and if there is no such
object it doesn't apply.
Post by Tim Rentsch
For the first two arguments to strncpy() -
The strncpy function copies not more than n characters
(characters that follow a null character are not copied)
from the array pointed to by s2 to the array pointed to by
s1.
The parameters s1 and s2 are expected (ie, required) to point to
arrays. Null pointers are invalid values for these arguments.
Again, this does ban null pointers, but one could argue that 7.1.4
also bans null pointers since there's no explicit statement here
that they're permitted. (It might have been reasonable to permit
strncpy(NULL, NULL, 0) as a no-op, but the standard does not do so.)
Post by Tim Rentsch
For the first argument to snprintf() -
The snprintf function is equivalent to fprintf, except that
the output is written into an array (specified by argument
s) rather than to a stream. If n is zero, nothing is
written, and s may be a null pointer.
The parameter s1 is nominally expected to point to an array, but
the second sentence allows a null pointer in the case where n is
zero. Null pointers are valid values in such cases, and invalid
values otherwise.
Yes, a case where null pointers are explicitly permitted, as required
(IMHO) by 7.1.4. Without the "and s may be a null pointer" wording,
I suggest that passing a null pointer would cause undefined behavior
even if n==0.
Post by Tim Rentsch
Now let's look at arguments corresponding to a %p conversion
in fprintf() (copied from what you wrote in your posting) -
The argument shall be a pointer to void. The value of the
pointer is converted to a sequence of printing characters,
[...]
What is required is an argument of type pointer to void. The
description doesn't say anything about what the pointer does or
does not point to; the action depends only on the value of the
pointer, not whether or not it's a null pointer. Null pointers
are valid values for these arguments.
To restate: in my view whether null pointers are meant to be
valid values depends on the description of the parameter in each
case. There is not a default presumption that null pointers are
invalid values - whether they are or not depends on the semantics
of the particular function and argument. I admit the wording
used in 7.1.4 p1 is somewhat misleading, and could give the
impression that null pointers are invalid values unless there
is an explicit statement to the contrary. The key point though
is not the null-pointer-ness but the invalid-ness - an argument
described as a pointer value, with no mention made of what the
pointer points to, admits a null pointer as a valid value.
By your interpretation, I think, the validity of a null pointer for a
given function has to be determined by "common sense" for each function
where it's not stated explicitly. I dislike relying on that.
Post by Tim Rentsch
I fully agree that text in the Standard deserves clarification
on this topic, so on that point I agree with you.
Excellent.
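For concreteness, a minimal sketch of the construct under discussion; the helper name print_null_pointer is made up here, and the text produced is implementation-defined:

```c
#include <stdio.h>

/* Prints a null pointer via %p and returns the character count.  This
 * is the call whose definedness the thread debates; the output text is
 * implementation-defined ("(nil)", "0x0", "00000000", ...). */
int print_null_pointer(void)
{
    return printf("%p\n", (void *)0);
}
```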
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
s***@casperkitty.com
2016-10-18 20:18:54 UTC
Permalink
Raw Message
Post by Keith Thompson
My problem with that is that all the other items in the "such as" list
- a pointer to non-modifiable storage when the corresponding parameter
is not const-qualified.
What if the pointer is a "one-past" pointer for an object from which zero
bytes are to be copied? Or how about:

char foo[50];
sprintf(foo, "%s", "Hello");

Nothing I could see in the Standard specifies that "%s" is retrieved
using a const-qualified character pointer, but string literals are
considered to be non-writable storage. By what logic should the sprintf
call above not be regarded as invoking UB, beyond the fact that nothing
in the specification for the function indicates that it would have any
reason to write to that pointer?

I haven't looked through the signatures of all library functions, but I
would not be surprised if there were some others which have a variety of
usage cases, some of which write to storage which is given to them and
some of which don't, and which neglected to explicitly indicate that
passing a pointer to a const-qualified object is allowable in cases where
the object won't be written. I'm pretty certain that some of the functions
in POSIX behave like that.
Post by Keith Thompson
Post by Tim Rentsch
The parameters s1 and s2 are expected (ie, required) to point to
objects. Null pointers are invalid values for these arguments.
Which perhaps raises the question of memcpy(NULL, NULL, 0). But yes,
even where n==0, this refers to "the object", and if there is no such
object it doesn't apply.
Is doing nothing with an object considered an action upon that object?
Post by Keith Thompson
By your interpretation, I think, the validity of a null pointer for a
given function has to be determined by "common sense" for each function
where it's not stated explicitly. I dislike relying on that.
It's not so much "common sense" as a general principle that pointers given
to a function must support the operations which a reasonable implementation
of that function would perform upon those pointers. There is one aspect in
which a null source or destination to memcpy could be dodgy, which is that
C, unlike C++, fails to specify that the sum of a null pointer and 0 is a
null pointer, and that the difference between any two null pointers is zero.
A memcpy implementation might plausibly try to compute a one-past pointer
for the source or destination before checking the length, and on some
implementations such a computation could fail if the pointer is null even
if the length is zero.
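The pattern being described can be sketched as follows (memcpy_endptr is a hypothetical implementation-internal routine; computing s + n before examining n is the questionable step when s is null, since C does not say NULL + 0 yields a null pointer):

```c
#include <stddef.h>

/* A copy loop that computes the one-past-the-end source pointer up
 * front, before looking at n.  With src == NULL and n == 0, the very
 * first addition is the step of dubious definedness. */
void *memcpy_endptr(void *dest, const void *src, size_t n)
{
    const unsigned char *s = src;
    const unsigned char *end = s + n;   /* computed before any n check */
    unsigned char *d = dest;
    while (s != end)
        *d++ = *s++;
    return dest;
}
```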
Post by Keith Thompson
Post by Tim Rentsch
I fully agree that text in the Standard deserves clarification
on this topic, so on that point I agree with you.
Excellent.
Politically, such a thing might be difficult unless done as part of a
general restructuring of the Standard. Otherwise, the fact that the
existing wording had proven adequate for many years could be taken to
suggest that it was being changed specifically to discredit the behavior
of certain compilers.
Keith Thompson
2016-10-18 21:02:35 UTC
Permalink
Raw Message
Post by s***@casperkitty.com
Post by Keith Thompson
My problem with that is that all the other items in the "such as" list
- a pointer to non-modifiable storage when the corresponding parameter
is not const-qualified.
What if the pointer is a "one-past" pointer for an object from which zero
char foo[50];
sprintf(foo, "%s", "Hello");
Nothing I could see in the Standard specifies that "%s" is retrieved
using a const-qualified character pointer, but string literals are
considered to be non-writable storage. By what logic should the sprintf
call above not be regarded as invoking UB, beyond the fact that nothing
in the specification for the function indicates that it would have any
reason to write to that pointer?
The one-past-the-end case should probably be clarified. I agree that it
makes sense for this:
char target[10];
char source[10];
memcpy(source+10, target+10, 0);
to be a well defined no-op, and the standard does not currently make
that clear.

As for sprintf(foo, "%s", "hello"), the clause in 7.1.4 does not apply.
It refers to "a pointer to non-modifiable storage when the corresponding
parameter is not const-qualified". There is no corresponding
parameter, so 7.1.4 doesn't imply that "hello" is invalid. Validity is
defined by 7.21.6.1p8, which requires "a pointer to the initial element
of an array of character type". The string literal "hello", after
array-to-pointer conversion, certainly qualifies.
Post by s***@casperkitty.com
I haven't looked through the signatures of all library functions, but I
would not be surprised if there were some others which have a variety of
usage cases, some of which write to storage which is given to them and
some of which don't, and which neglected to explicitly indicate that
passing a pointer to a const-qualified object is allowable in cases where
the object won't be written. I'm pretty certain that some of the functions
in POSIX behave like that.
Post by Keith Thompson
Post by Tim Rentsch
The parameters s1 and s2 are expected (ie, required) to point to
objects. Null pointers are invalid values for these arguments.
Which perhaps raises the question of memcpy(NULL, NULL, 0). But yes,
even where n==0, this refers to "the object", and if there is no such
object it doesn't apply.
Is doing nothing with an object considered an action upon that object?
Irrelevant, since there is no object.
Post by s***@casperkitty.com
Post by Keith Thompson
By your interpretation, I think, the validity of a null pointer for a
given function has to be determined by "common sense" for each function
where it's not stated explicitly. I dislike relying on that.
It's not so much "common sense" as a general principle that pointers given
to a function must support the operations which a reasonable implementation
of that function would perform upon those pointers. There is one aspect in
which a null source or destination to memcpy could be dodgy, which is that
C, unlike C++, fails to specify that the sum of a null pointer and 0 is a
null pointer, and that the difference between any two null pointers is zero.
A memcpy implementation might plausibly try to compute a one-past pointer
for the source or destination before checking the length, and on some
implementations such a computation could fail if the pointer is null even
if the length is zero.
"Reasonableness" is a valid criterion for suggesting how a given
implementation should operate within the requirements of the standard.
I do not accept, at least in this case, that it's a valid criterion for
inferring requirements to be imposed on all conforming implementations.

And you've just given a possible justification for memcpy(NULL, NULL, 0)
behaving as something other than a no-op.
Post by s***@casperkitty.com
Post by Keith Thompson
Post by Tim Rentsch
I fully agree that text in the Standard deserves clarification
on this topic, so on that point I agree with you.
Excellent.
Politically, such a thing might be difficult unless done as part of a
general restructuring of the Standard. Otherwise, the fact that the
existing wording had proven adequate for many years could be taken to
suggest that it was being changed specifically to discredit the behavior
of certain compilers.
I hardly think that would be a problem. I know of no implementation
whose behavior is inconsistent with any clarification being
proposed, either the one I advocate (which would state explictly
that printf("%p", (void*)0) is well defined, of course with
implementation-defined output), or the one I presume you would
advocate (which would state that memcpy(NULL, NULL, 0) is a no-op).

Is there some implementation you have in mind?

(Another likely outcome is that the committee would state that the
existing wording is clear enough.)
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Keith Thompson
2016-10-18 21:28:03 UTC
Permalink
Raw Message
Keith Thompson <kst-***@mib.org> writes:
[...]
Post by Keith Thompson
The one-past-the-end case should probably be clarified. I agree that it
char target[10];
char source[10];
memcpy(source+10, target+10, 0);
to be a well defined no-op, and the standard does not currently make
that clear.
I meant to write:

memcpy(target+10, source+10, 0);
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Kaz Kylheku
2016-10-18 20:28:00 UTC
Permalink
Raw Message
Post by Keith Thompson
Furthermore, the wording just before the "such as" clause allows for
exceptions to be stated explicitly -- and those exceptions *are* stated
explicitly for several functions that take pointers.
I can easily agree with Tim on this in the following way: suppose we
regard the list introduced by "such as" to be a list of examples.

This is justifiable: in the English language, "such as" is a a phrase
which introduces examples, just like "for example".

In that case, that list is not normative; it doesn't have the power
to define a requirement about what is valid and what isn't.
The Introduction doesn't assert that examples are always introduced
with an EXAMPLE heading, and that only examples delimited in that
way count as non-normative examples.

It is then a situation that could benefit from a clarification,
as Tim says, not an outright defect.
Keith Thompson
2016-10-18 21:15:33 UTC
Permalink
Raw Message
Post by Kaz Kylheku
Post by Keith Thompson
Furthermore, the wording just before the "such as" clause allows for
exceptions to be stated explicitly -- and those exceptions *are* stated
explicitly for several functions that take pointers.
I can easily agree with Tim on this in the following way: suppose we
regard the list introduced by "such as" to be a list of examples.
This is justifiable: in the English language, "such as" is a a phrase
which introduces examples, just like "for example".
Another interpretation is that, as you say, "such as" doesn't introduce
an *exhaustive* list of invalid arguments, but all the examples in the
list are invalid (unless explicitly stated to be valid for particular
functions).

Consider a phrase like "prime numbers, such as 11, 43, and 51". I would
call the presence of 51 (3*17) in that list an error.
Post by Kaz Kylheku
In that case, that list is not normative; it doesn't have the power
to define a requirement about what is valid and what isn't.
The Introduction doesn't assert that examples are always introduced
with an EXAMPLE heading, and that only examples deliminated in that
way count as non-normative examples.
It is then a situation that could benefit from a clarification,
as Tim says, not an outright defect.
Leaving aside the detailed wording as it currently exists, I think the
ideal way to define which arguments are invalid would be something like
this:

The following (insert rigorously defined list of cases) are invalid
arguments unless explicitly stated otherwise in the detailed
descriptions that follow.

with *all* exceptions clearly stated in the descriptions of individual
functions.

I suggest that the current wording is an attempt to do exactly that.
The current discussion indicates, I think, that the attempt was not
entirely successful.

English are hard.
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
s***@casperkitty.com
2016-10-18 22:10:35 UTC
Permalink
Raw Message
Post by Keith Thompson
Another interpretation is that, as you say, "such as" doesn't introduce
an *exhaustive* list of invalid arguments, but all the examples in the
list are invalid (unless explicitly stated to be valid for particular
functions).
It implies that the items in the list would be invalid under at least some,
probably most, circumstances. Saying "this veterinary clinic treats
domestic animals such as cats and dogs, and does not treat exotic animals
such as rabbits and birds" does not imply that there are not some dogs which
the clinic would decline to treat [e.g. one who has been dead for a few
hours], nor that there are no circumstances in which the clinic would treat
a rabbit or bird [the clinic would likely refer a customer with such
an animal to a specialty clinic whenever practical, but might still be
willing to attempt emergency treatment in cases where the animal would
almost certainly not live long enough to reach a specialty clinic unless
stabilized first].
Post by Keith Thompson
Consider a phrase like "prime numbers, such as 11, 43, and 51". I would
call the presence of 51 (3*17) in that list an error.
Every number is either prime in all circumstances, or composite in all
circumstances. By contrast, many parameter values may be valid in some
cases and invalid in others.
Post by Keith Thompson
Leaving aside the detailed wording as it currently exists, I think the
ideal way to define which arguments are invalid would be something like
The following (insert rigorously defined list of cases) are invalid
arguments unless explicitly stated otherwise in the detailed
descriptions that follow.
with *all* exceptions clearly stated in the descriptions of individual
functions.
I would add "...and programmers should assume that an implementation may
behave in wacky fashion if any of them are passed to library functions,
even on platforms where there would be no plausible reason to expect
such behavior, unless the implementation expressly promises to behave
sanely", unless the authors of the Standard didn't really mean that.
Keith Thompson
2016-10-18 22:35:44 UTC
[...]
Post by s***@casperkitty.com
Post by Keith Thompson
Leaving aside the detailed wording as it currently exists, I think the
ideal way to define which arguments are invalid would be something like
The following (insert rigorously defined list of cases) are invalid
arguments unless explicitly stated otherwise in the detailed
descriptions that follow.
with *all* exceptions clearly stated in the descriptions of individual
functions.
I would add "...and programmers should assume that an implementation may
behave in wacky fashion if any of them are passed to library functions,
even on platforms where there would be no plausible reason to expect
such behavior, unless the implementation expressly promises to behave
sanely", unless the authors of the Standard didn't really mean that.
I wouldn't, and that wording would not be helpful to anyone who
understands what the phrase "undefined behavior" means.
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
s***@casperkitty.com
2016-10-19 14:54:45 UTC
Post by Keith Thompson
Post by s***@casperkitty.com
I would add "...and programmers should assume that an implementation may
behave in wacky fashion if any of them are passed to library functions,
even on platforms where there would be no plausible reason to expect
such behavior, unless the implementation expressly promises to behave
sanely", unless the authors of the Standard didn't really mean that.
I wouldn't, and that wording would not be helpful to anyone who
understands what the phrase "undefined behavior" means.
That's not what it used to mean. It used to mean that the Standard
neither imposed requirements, **nor revoked any expectations to which
programmers would have been entitled in its absence**. You are
saying that even if every single compiler for XYZ platform has behaved
in a certain useful fashion, a significant amount of code for that
platform relies upon that behavior, and there is no plausible reason
why any other behavior would be more efficient in non-contrived
situations, programmers would still not be entitled to expect that
behavior if the Standard doesn't mandate it.

Would a programmer be entitled to expect that (int16_t)(uint16_t)-42u will
yield -42 without having to read compiler documentation? Why or why not?
Is there anything in the Standard that would forbid a two's-complement
implementation which does not use padding from documenting some unusual
rules for performing such a conversion? Or would the fact that no machine
which implements int16_t has ever done such a thing be adequate to justify
an expectation that future machines won't do so either?
Kaz Kylheku
2016-10-19 15:15:49 UTC
Post by s***@casperkitty.com
Post by Keith Thompson
Post by s***@casperkitty.com
I would add "...and programmers should assume that an implementation may
behave in wacky fashion if any of them are passed to library functions,
even on platforms where there would be no plausible reason to expect
such behavior, unless the implementation expressly promises to behave
sanely", unless the authors of the Standard didn't really mean that.
I wouldn't, and that wording would not be helpful to anyone who
understands what the phrase "undefined behavior" means.
That's not what it used to mean. It used to mean that the Standard
neither imposed requirements, **nor revoked any expectations to which
programmers would have been entitled to in its absence**.
Pretty much all situations which are *explicitly* worded as being
"undefined behavior" are precisely doing that: taking away from some
of the real or imagined requirements that would otherwise apply.

I can't imagine it has ever been otherwise in any ANSI or ISO
programming language standard using the term "undefined behavior"
or equivalent.

Sometimes such wording takes away very reasonable requirements,
either stated or inferred with good justification.
It's a form of "collateral damage".
s***@casperkitty.com
2016-10-19 16:30:08 UTC
Post by Kaz Kylheku
Pretty much all situations which are *explicitly* worded as being
"undefined behavior" are precisely doing that: taking away from some
of the real or imagined requirements that would otherwise apply.
There are many behavioral aspects that differ between platforms. For some
of them it would be hard to imagine a platform where ensuring a single
predictable consistent behavior would cost anything. For others, such
platforms exist or might plausibly exist. The authors of the
Standard labeled behavioral aspects of the first type "Implementation-
defined" and those of the second type "Undefined".

I see no indication that they intended such labeling to bind programmers
to the limitations of platforms upon which their code would never run or
which, in some cases, *might not even exist* [e.g. two's-complement
platforms where left-shifting a negative number will do anything other
than yield a value in cases where only "1" bits are shifted into or
through the sign bit].
Post by Kaz Kylheku
I can't imagine it has ever been otherwise in any ANSI or ISO
programming language standard using the term "undefined behavior"
or equivalent.
C was invented as a low-level language, and the range of platforms upon
which it is run and the range of tasks which it is called upon to perform
are both huge. If one defines "language" as a mapping between source texts
and program behaviors, there is no way to define a single language which
is usable on that range of platforms and for that range of tasks.

In the 1990s, C was generally recognized not as a single language, but rather
as a mapping between execution platforms and languages. Platforms with
two's-complement arithmetic implement a language where -2 & 15 == 14. Those
with one's-complement arithmetic implement one where -2 & 15 == 13. As it
is, C still *is* such mapping, but for some reason people like to pretend
it's a single language.

Perhaps what's needed is a terminology to distinguish the language "C for
platform X", where C constructs that would have a clear natural mapping to
the features of platform X have the behaviors implied by that mapping,
from "C, targeting platform X".
Post by Kaz Kylheku
Sometimes such wording takes away very reasonable requirements,
either stated or inferred with good justification.
It's a form of "collateral damage".
In many cases there is no justification whatsoever, because the authors
of the Standard thought they were simply upholding the status quo where
edge cases were treated as having defined behavior on platforms where
such treatment was cheap and useful, but not on those where it would be
expensive or problematic.
Keith Thompson
2016-10-19 15:46:04 UTC
Post by s***@casperkitty.com
Post by Keith Thompson
Post by s***@casperkitty.com
I would add "...and programmers should assume that an implementation may
behave in wacky fashion if any of them are passed to library functions,
even on platforms where there would be no plausible reason to expect
such behavior, unless the implementation expressly promises to behave
sanely", unless the authors of the Standard didn't really mean that.
I wouldn't, and that wording would not be helpful to anyone who
understands what the phrase "undefined behavior" means.
That's not what it used to mean. It used to mean that the Standard
neither imposed requirements, **nor revoked any expectations to which
programmers would have been entitled in its absence**.
No doubt you have a written citation for that.
Post by s***@casperkitty.com
You are
saying that even if every single compiler for XYZ platform has behaved
in a certain useful fashion, a significant amount of code for that
platform relies upon that behavior, and there is no plausible reason
why any other behavior would be more efficient in non-contrived
situations, programmers would still not be entitled to expect that
behavior if the Standard doesn't mandate it.
I'm saying that undefined behavior is behavior for which the standard
imposes no requirements. I'm also saying that the standard does
not impose any requirements on the behavior of memcpy(NULL, NULL, 0).

If you wish to assume that memcpy(NULL, NULL, 0) is a no-op,
you're free to do so. I know of no implementation that would
violate that assumption.

[snip]
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
s***@casperkitty.com
2016-10-19 16:53:44 UTC
Post by Keith Thompson
Post by s***@casperkitty.com
That's not what it used to mean. It used to mean that the Standard
neither imposed requirements, **nor revoked any expectations to which
programmers would have been entitled in its absence**.
No doubt you have a written citation for that.
Is there anything in the Standard that would indicate an intention to
revoke what had previously been reasonable behavioral expectations for
particular platforms? From what I've seen, the intention was to avoid
breaking existing code, which would if anything imply that if all
implementations for platform X supported some feature which some code
for platform X relied, C89 implementations for such platform would
continue to do so.
Post by Keith Thompson
If you wish to assume that memcpy(NULL, NULL, 0) is a no-op,
you're free to do so. I know of no implementation that would
violate that assumption.
I would consider memcpy(ptr,x,y) in cases where (x,y) will sometimes
be (null,0) to be about as safe on conventional-architecture machines
as:

unsigned mul(unsigned short x, unsigned short y) { return x*y; }

I.e. safe on sane compilers, not safe on gcc. The developers of gcc are
actively adding new "optimizations" to identify cases where the natural
straightforward behavior of code would be useful, but where the Standard
imposes no requirements and substituting some other arbitrary behavior
might make the program more "efficient". When they do this, they often
add compiler switches to block such optimizations, but do so in such a
way that there's no way to ensure that a build file will be compatible
with a future version of gcc. Further, gcc's documentation fails to
actually guarantee much of anything about the behavior of its switches
like -fno-strict-aliasing, -fwrapv, and -fno-strict-overflow.
Keith Thompson
2016-10-19 18:23:34 UTC
Post by s***@casperkitty.com
Post by Keith Thompson
Post by s***@casperkitty.com
That's not what it used to mean. It used to mean that the Standard
neither imposed requirements, **nor revoked any expectations to which
programmers would have been entitled in its absence**.
No doubt you have a written citation for that.
Is there anything in the Standard that would indicate an intention to
revoke what had previously been reasonable behavioral expectations for
particular platforms? From what I've seen, the intention was to avoid
breaking existing code, which would if anything imply that if all
implementations for platform X supported some feature which some code
for platform X relied, C89 implementations for such platform would
continue to do so.
So no written citation, just your inference about the intent of the
committee. Got it.

[...]
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
James R. Kuyper
2016-10-19 18:47:19 UTC
Post by s***@casperkitty.com
Post by Keith Thompson
Post by s***@casperkitty.com
That's not what it used to mean. It used to mean that the Standard
neither imposed requirements, **nor revoked any expectations to which
programmers would have been entitled in its absence**.
No doubt you have a written citation for that.
Is there anything in the Standard that would indicate an intention to
revoke what had previously been reasonable behavioral expectations for
particular platforms?
No - but those expectations came from the compiler vendors for those
platforms, and that's where you should go to complain if those
expectations were "revoked". The standard was never intended to silently
promote those platform-specific expectations to standard-mandated
guarantees. An implementation does not become non-conforming just
because it violates "reasonable expectations" that are not reflected in
the actual words of the standard. It might become "useless" from your
point of view, but it's still conforming.
Post by s***@casperkitty.com
... From what I've seen, the intention was to avoid
breaking existing code, which would if anything imply that if all
implementations for platform X supported some feature which some code
for platform X relied, C89 implementations for such platform would
continue to do so.
As a quality-of-implementation issue, sure - but QoI is always a matter
of judgement, on which different people can be expected to make
different judgements. Be prepared to fight it out with people who would
consider the implementation quality to be low unless it provides
precisely the same optimizations that you complain about. No one creates
these optimizations just for the fun of listening to you complain about
them - they're motivated by actual real-world benefits that they
consider to be sufficiently important to justify breaking code that
should never have been written that way in the first place, because it
has always had undefined behavior. The marketplace is what's going to
decide whether their desires or yours are more important. It was never
intended that the standard guarantee that you can rely on things not
explicitly guaranteed in the standard itself.
--
s***@casperkitty.com
2016-10-19 21:05:04 UTC
Post by James R. Kuyper
Post by s***@casperkitty.com
Is there anything in the Standard that would indicate an intention to
revoke what had previously been reasonable behavioral expectations for
particular platforms?
No - but those expectations came from the compiler vendors for those
platforms, and that's where you should go to complain if those
expectations were "revoked". The standard was never intended to silently
promote those platform-specific expectations to standard-mandated
guarantees. An implementation does not become non-conforming just
because it violates "reasonable expectations" that are not reflected in
the actual words of the standard. It might become "useless" from your
point of view, but it's still conforming.
The authors of gcc claim that even if every previous compiler for a
platform (e.g. 80386) has behaved in a certain way where the Standard
imposes no requirements, programmers have no right to expect that future
compilers won't change the behavior without notice.

Prior to the C89, many execution platforms defined behaviors in situations
other platforms would behave unpredictably, and much of the usefulness of
the language came from the fact that if a platform's natural behavior
would meet requirements in all cases without explicit edge-case code, such
code could be omitted from both the source and executable.

In 1986, if a compiler would have to go out of its way not to support a
behavioral guarantee some target platforms would naturally offer (e.g.
multiplying two integers, or performing a properly-aligned read of any
storage that is within an allocation, won't have any side-effects) and
there was no imaginable benefit an implementation could receive from doing
so, would a programmer who was exclusively targeting such platforms not
be reasonably entitled to expect that quality implementations for them
would support those behaviors?

If such an expectation would have been reasonable in 1986, at what point
would it have ceased to be reasonable, and what would have prompted such
a change? If it would not have been reasonable, what reason would a
programmer have had to expect that quality implementations might go out
of their way not to support such behaviors, when doing so offered no
imaginable benefit?
Post by James R. Kuyper
As a quality-of-implementation issue, sure - but QoI is always a matter
of judgement, on which different people can be expected to make
different judgements. Be prepared to fight it out with people who would
consider the implementation quality to be low unless it provides
precisely the same optimizations that you complain about.
The authors of gcc seem to take the view that the authors of the Standard
already decided that certain behaviors which common platforms could
support at essentially zero cost aren't worth supporting even on such
platforms, and that there is thus no reason they should weigh the costs and
benefits on commonplace platforms.
Post by James R. Kuyper
No one creates
these optimizations just for the fun of listening to you complain about
them - they're motivated by actual real-world benefits that they
consider to be sufficiently important to justify breaking code that
should never have been written that way in the first place, because it
has always had undefined behavior.
Code which had been able to run without difficulty on just about
every remotely-conforming general-purpose implementation for
commonplace hardware before the authors of gcc decided to add a
breaking "optimization" may be viewed as "non-portable" by the
authors of gcc, but I question the "shouldn't have been written
that way in the first place" comment in cases where the code as
written would have been processed more efficiently by nearly all
of the aforementioned implementations than any strictly-conforming
program could have been.
Post by James R. Kuyper
The marketplace is what's going to
decide whether their desires or yours are more important. It was never
intended that the standard guarantee that you can rely on things not
explicitly guaranteed in the standard itself.
I've certainly seen a lot of build files which explicitly disable a whole
bunch of gcc optimizations. That seems to be a form of voting, though not
one the authors of gcc seem very interested in.
Keith Thompson
2016-10-19 21:21:57 UTC
Post by s***@casperkitty.com
Post by James R. Kuyper
Post by s***@casperkitty.com
Is there anything in the Standard that would indicate an intention to
revoke what had previously been reasonable behavioral expectations for
particular platforms?
No - but those expectations came from the compiler vendors for those
platforms, and that's where you should go to complain if those
expectations were "revoked". The standard was never intended to silently
promote those platform-specific expectations to standard-mandated
guarantees. An implementation does not become non-conforming just
because it violates "reasonable expectations" that are not reflected in
the actual words of the standard. It might become "useless" from your
point of view, but it's still conforming.
The authors of gcc claim that even if every previous compiler for a
platform (e.g. 80386) has behaved in a certain way where the Standard
imposes no requirements, programmers have no right to expect that future
compilers won't change the behavior without notice.
So take it up with them. They have mailing lists.
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
j***@verizon.net
2016-10-19 23:07:18 UTC
On Wednesday, October 19, 2016 at 5:05:06 PM UTC-4, ***@casperkitty.com wrote:
...
Post by s***@casperkitty.com
In 1986, if a compiler would have to go out of its way not to support a
behavioral guarantee some target platforms would naturally offer (e.g.
multiplying two integers, or performing a properly-aligned read of any
storage that is within an allocation, won't have any side-effects) and
there was no imaginable benefit an implementation could receive from doing
so, would a programmer who was exclusively targeting such platforms not
be reasonably entitled to expect that quality implementations for them
would support those behaviors?
No. That would only make sense if "no imaginable benefit" were
misinterpreted as being the same as "no benefit". That reflects a
failure of that programmer to be humble enough to realize that he's
not actually bright enough to imagine every possibility. What you can
reasonably assume is:
a) No reasonable implementor will violate those assumptions unless
there is sufficient benefit of some type (by definition, one not
imagined by said programmer) to justify doing so. Since such code has
undefined behavior, it should never have been written in the first
place, so "sufficient benefit to justify doing so" will usually be a
very low bar to clear.
b) An implementor unreasonable enough to violate those assumptions
without sufficient benefit to justify doing so will incur the anger
of some of their customers. If they generate too much anger, they'll
lose customers. They might or might not care about that.
Post by s***@casperkitty.com
If such an expectation would have been reasonable in 1986,
That is a contrary-to-fact assumption for any such code.

...
Post by s***@casperkitty.com
The authors of gcc seem to take the view that the authors of the Standard
already decided that certain behaviors which common platforms could
support at essentially zero cost aren't worth supporting even on such
platforms, and that there is thus no reason they should weigh the costs and
benefits on commonplace platforms.
That's their right to decide. If you disagree, complain to them - but
complain about their quality of implementation - don't pretend that
it's a conformance issue. They had no obligation to conform to your
unsupported expectations. They do have an obligation to produce a
product with sufficient quality to justify people using it. Threaten
to not use it unless they change this - as a customer, you have an
inherent right to make such a threat - just as they have an inherent
right to ignore such threats, which they probably will.

Alternatively, convince someone to create a compiler that handles
such issues the way you want them to be handled.

...
Post by s***@casperkitty.com
authors of gcc, but I question the "shouldn't have been written
that way in the first place" comment in cases where the code as
written would have been processed more efficiently by nearly all
of the aforementioned implementations than any strictly-conforming
program could have been.
If that's the case, you should complain to the implementors of those
compilers for failing to optimize the equivalent strictly-conforming
code by implementing it the same way as your preferred code.

...
Post by s***@casperkitty.com
I've certainly seen a lot of build files which explicitly disable a whole
bunch of gcc optimizations. That seems to be a form of voting, though not
one the authors of gcc seem very interested in.
The only form of voting that really matters is lost income from lost
sales - your leverage to coerce providers of free compilers to do
anything they don't want to do is quite negligible. On the other
hand, your power to change the behavior of an open-source compiler by
creating your own version of it is immense.
Kaz Kylheku
2016-10-19 23:27:19 UTC
Post by j***@verizon.net
The only form of voting that really matters is lost income from lost
sales - your leverage to coerce providers of free compilers to do
anything they don't want to do is quite negligible. On the other hand,
your power to change the behavior of an open-source compiler by
creating your own version of it is immense.
It's also very leveraged. If all you care about is that gcc has certain
behaviors in undefined areas, that is a very narrow scope for a fork.
You can keep up with upstream gcc development and just keep rebasing
the relatively small number of changes to fix things you don't like.

I'm maintaining a fork of the Cygwin DLL called Cygnal which provides
"native-Windows-like" behaviors in some areas, allowing Cygwin to be
used as a run-time library for standalone Windows programs that don't
carry the Cygwin environment with them, and are for Windows users who
expect Windows conventions.

http://www.kylheku.com/cygnal/
s***@casperkitty.com
2016-10-20 05:06:04 UTC
Post by j***@verizon.net
...
Post by s***@casperkitty.com
In 1986, if a compiler would have to go out of its way not to support a
behavioral guarantee some target platforms would naturally offer (e.g.
multiplying two integers, or performing a properly-aligned read of any
storage that is within an allocation, won't have any side-effects) and
there was no imaginable benefit an implementation could receive from doing
so, would a programmer who was exclusively targeting such platforms not
be reasonably entitled to expect that quality implementations for them
would support those behaviors?
No. That would only make sense if "no imaginable benefit" was
misinterpreted as being the same as "no benefit". That reflects a
failure of that programmer to be humble enough to realize that he's
not actually bright enough to imagine every possibility.
How much of a performance hit should programmers have been willing to
accept to guard against the possibility that someone writing a compiler
for their platform 20 years later might decide to regard as "broken" the
constructs which allowed the code to be more efficient than it otherwise
could have?
Post by j***@verizon.net
a) No reasonable implementor will violate those assumptions unless
there is sufficient benefit of some type (by definition, one not
imagined by said programmer) to justify doing so. Since such code has
undefined behavior, it should never have been written in the first
place, so "sufficient benefit to justify doing so" will usually be a
very low bar to clear.
In 1986, *ALL* C code had behavior which was not defined by the C Standard.

Further, in many cases code using behaviors which were unanimously supported
by implementations for some platforms but not others could achieve 50% or
better speedups on such platforms compared with code that relied only upon
behaviors that the Standard would mandate for all platforms. Are you saying
that programmers should have written slower code? What would they have
gained by doing so?
Post by j***@verizon.net
b) An implementor unreasonable enough to violate those assumptions
without sufficient benefit to justify doing so will incur the anger
of some of their customers. If they generate too much anger, they'll
lose customers. They might or might not care about that.
Commercial compilers tend to support coding constructs which are useful,
whether or not the Standard compels them to do so.
Post by j***@verizon.net
That's their right to decide. If you disagree, complain to them - but
complain about their quality of implementation - don't pretend that
it's a conformance issue. They had no obligation to conform to your
unsupported expectations. They do have an obligation to produce a
product with sufficient quality to justify people using it. Threaten
to not use it unless they change this - as a customer, you have an
inherent right to make such a threat - just as they have an inherent
right to ignore such threats, which they probably will.
It is a QOI issue. As I've said many times, the authors of the Standard
explicitly state in the rationale that they make no effort to prevent
someone from writing a conforming-but-useless implementation.
Post by j***@verizon.net
Alternatively, convince someone to create a compiler that handles such issues the way you want them to be handled.
Better compilers do exist in the commercial marketplace. I use one.
Post by j***@verizon.net
If that's the case, you should complain to the implementors of those compilers for failing to optimize the equivalent strictly-conforming code by implementing it the same way as your preferred code.
There was and in some cases *still is* no way by which some tasks can
be expressed in strictly-conforming code in such a way as to allow
a conforming compiler to produce an executable as efficient as what could
be produced by a relatively simplistic compiler using source code that
exploits behaviors that could be cheaply supported on 90%+ of platforms.

C was designed to allow a programmer armed with a very simple compiler to
produce efficient code on the kinds of machines that were becoming popular
in the 1970s, by letting the programmer take advantage of the features of
the particular machines being targeted. A programmer who complained that
a compiler didn't turn:

int i;
for (i=0; i<10; i++) a[i] += b[i];

into code equivalent to:

int *p1=a,*p2=b;
int i=10;
do { *p1++ += *p2++; } while(--i);

would have been soundly laughed at, since it was the job of the programmer,
not the C compiler, to perform such transforms.

Changes in architectures mean that many kinds of optimization can be
performed better by compilers than by programmers. On the other hand,
especially when targeting simpler platforms (e.g. ARM Cortex-M0) there
will be many things a programmer can optimize that a compiler can't, but
relatively few that a compiler could optimize but a programmer who wanted
to make the effort, couldn't.
Tim Rentsch
2016-10-19 07:59:45 UTC
Post by Keith Thompson
Post by Tim Rentsch
Post by Keith Thompson
Each of the following statements applies unless explicitly
If an argument to a function has an invalid value (such as
a value outside the domain of the function, or a pointer
outside the address space of the program, or a null pointer,
or a pointer to non-modifiable storage when the corresponding
parameter is not const-qualified) or a type (after promotion)
not expected by a function with variable number of arguments,
the behavior is undefined.
I infer from this that passing a null pointer to a library function has
undefined behavior unless there's an explicit statement to the contrary
for that function. For example, strlen(NULL) has undefined behavior,
but free(NULL) is well defined because there's an explicit statement to
that effect.
p The argument shall be a pointer to void. The value of the
pointer is converted to a sequence of printing characters, in
an implementation-defined manner.
There is no explicit statement that a null pointer is allowed.
That implies, I think, that a null pointer is an invalid argument
value, and that
printf("%p\n", (void*)0);
has undefined behavior. [.. snip elaboration ..]
I reach a different conclusion. Let me try to explain what it is
and why I think so.
I find your reasoning plausible, but I disagree with it.
[.. elaboration on the particulars ..]
I think I get your reasoning, and although I'm not convinced
by it, it does seem not unreasonable. However, I must demur
on a small point coming up...
Post by Keith Thompson
Post by Tim Rentsch
To restate: in my view whether null pointers are meant to be
valid values depends on the description of the parameter in each
case. There is not a default presumption that null pointers are
invalid values - whether they are or not depends on the semantics
of the particular function and argument. I admit the wording
used in 7.1.4 p1 is somewhat misleading, and could give the
impression that null pointers are invalid values unless there
is an explicit statement to the contrary. The key point though
is not the null-pointer-ness but the invalid-ness - an argument
described as a pointer value, with no mention made of what the
pointer points to, admits a null pointer as a valid value.
By your interpretation, I think, the validity of a null pointer
for a given function has to be determined by "common sense" for
each function where it's not stated explicitly. I dislike relying
on that.
I'm sorry but I have to object to the "common sense" dependency
here. That isn't what I said, and moreover I believe it isn't
true. In every case I've looked at (specifically with regard to
the possibility of a null pointer) the question has a clear
answer to anyone who is generally familiar with the Standard.
It does not require "common sense". Let me ask you to try the
following experiment: pretend the first parenthesized phrase in
the second sentence of 7.1.4 p1 were not there, and look at the
descriptions for functions that have pointer parameter(s). Do
you find any where what is intended (for null pointers being
allowed) disagrees with a straightforward textual reading of
the description? I expect you will not. The result doesn't
depend on any "common sense" reasoning, only an understanding
of what various terms in the Standard mean. (Or if they do
depend on that, that's a defect in the writing in question,
but I haven't found any of those.)

I have by now looked carefully at the descriptions of pointer
parameters in lots of library functions, and AFAICS there isn't
a one that needs that "such as ..." phrase to disambiguate it.
(ie, as regards a null pointer value being valid.) I agree with
what I think your basic point is here, which is that "rhetorical
interpretation" should not be required. In this case though I
don't think any is.
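
As an aside, code that wants to stay clear of this gray area entirely
can check for a null pointer before handing it to %p. A minimal sketch
(the helper name fmt_ptr and the "(null)" spelling are arbitrary choices
of mine, not anything the Standard specifies):

```c
#include <stdio.h>

/* Hypothetical helper: hands a pointer to the %p conversion only when it
   is non-null, so the question of whether %p accepts a null pointer
   never arises.  Returns what snprintf returns. */
static int fmt_ptr(char *buf, size_t bufsize, void *p)
{
    if (p != NULL)
        return snprintf(buf, bufsize, "%p", p);  /* pointer known valid here */
    return snprintf(buf, bufsize, "(null)");     /* arbitrary placeholder text */
}
```

A caller would then write fmt_ptr(buf, sizeof buf, maybe_null) and print
the resulting string, regardless of how the 7.1.4 question is resolved.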
Keith Thompson
2016-10-19 15:39:13 UTC
[...]
Post by Tim Rentsch
Post by Keith Thompson
I find your reasoning plausible, but I disagree with it.
[.. elaboration on the particulars ..]
I think I get your reasoning, and although I'm not convinced
by it, it does seem not unreasonable. However, I must demur
on a small point coming up...
[snip]
Post by Tim Rentsch
Post by Keith Thompson
By your interpretation, I think, the validity of a null pointer
for a given function has to be determined by "common sense" for
each function where it's not stated explicitly. I dislike relying
on that.
I'm sorry but I have to object to the "common sense" dependency
here. That isn't what I said, and moreover I believe it isn't
true. In every case I've looked at (specifically with regard to
the possibility of a null pointer) the question has a clear
answer to anyone who is generally familiar with the Standard.
It does not require "common sense". Let me ask you to try the
following experiment: pretend the first parenthesized phrase in
the second sentence of 7.1.4 p1 were not there, and look at the
descriptions for functions that have pointer parameter(s). Do
you find any where what is intended (for null pointers being
allowed) disagrees with a straightforward textual reading of
the description? I expect you will not. The result doesn't
depend on any "common sense" reasoning, only an understanding
of what various terms in the Standard mean. (Or if they do
depend on that, that's a defect in the writing in question,
but I haven't found any of those.)
I tentatively concede the point, and I'll try to find the time to look
over the library functions that take pointer arguments as you suggest.

[...]
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
s***@casperkitty.com
2016-10-18 16:06:10 UTC
Post by Keith Thompson
I'm assuming the intent is that the cases mentioned in the "such as"
clause of 7.1.4 are *always* invalid unless stated otherwise. The null
pointer case is the only one that could reasonably *sometimes* be valid.
On the other hand, the presence of explicit statements for some
functions that null pointers are allowed might suggest that such a
statement is required. On the other other hand, in most cases there has
to be an explicit description to describe what the behavior is.
In most of the cases where the Standard expressly allows a null pointer,
nearly all implementations would typically be required to add an explicit
check to avoid doing anything bad. For snprintf the explicit check is on
the buffer length rather than the pointer, but the general principle
applies.

For the vast majority of platforms, an implementation of memcpy would have
to go out of its way to make memcpy(x,y,0) even look at the pointers it
is given; even on those where one might plausibly write code that looks
at pointers before copying, it would usually be desirable to do something
like:

if (size <= 4)  // threshold value depends on optimization effort in second branch
{
    switch(size)
    {
      case 4: *dest++ = *src++;  /* fall through */
      case 3: *dest++ = *src++;  /* fall through */
      case 2: *dest++ = *src++;  /* fall through */
      case 1: *dest++ = *src++;  /* fall through */
      case 0:
        return dest-size;
    }
}
else
{
    // Examine pointers, check for alignment, etc.
}

Is there any evidence that the authors of the Standard considered the
possibility that implementations would have any reason to treat
memcpy(any,any,0) as anything other than a no-op, or that they would have
been inclined to forbid compiler writers from doing something that they
wouldn't do anyway?

I'm curious what conforming C compiler was the first to regard
memcpy(any,any,0) as anything other than a no-op in a non-sanitizing
build.
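
Whatever the answer, code that wants to be strictly conforming under the
literal reading of 7.1.4 can sidestep the question by guarding the call.
A minimal sketch of such a wrapper (the name memcpy_checked is mine):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: behaves like memcpy, but is well defined even
   when n == 0 and the pointers are null, because it skips the library
   call entirely in that case. */
static void *memcpy_checked(void *dest, const void *src, size_t n)
{
    if (n != 0)
        memcpy(dest, src, n);  /* pointers need only be valid when n > 0 */
    return dest;
}
```

On the implementations being discussed this guard costs one predictable
branch, which is arguably the price of not relying on behavior the
Standard never promises.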
s***@casperkitty.com
2016-10-18 20:42:23 UTC
Post by Keith Thompson
- Null pointers are always invalid arguments to library functions
unless explicitly permitted for a given function;
My own inclination would be to limit that to functions which are allowed
to read or write through the pointers in question, and to have the
prohibition against passing a pointer to non-modifiable storage where the
parameter is not const-qualified apply only to functions which would be
allowed to write the storage identified by the pointer, or cause it to be
written.

Otherwise, by my reading of the Standard, printf("%s", "Hello"); would
invoke UB, since "Hello" may reside in non-writable storage, and nothing
in the Standard promises that printf will retrieve that argument using
a const-qualified pointer type.