Discussion:
Comments on N2008, proposed C2X enum enhancement
Keith Thompson
2016-03-22 01:30:58 UTC
I think this was mentioned here recently, but I can't find the article.

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2008.pdf

is a proposal for an enhancement to C's enumerated types, allowing the
programmer to specify the type used to represent a given enumerated
type. For example, you could write:

enum foo : unsigned short { zero, one, two };

and an object of type enum foo would have the same representation as
unsigned short (rather than some implementation-defined integer type as
is currently the case).

I generally like the idea, but I have a few suggestions on the wording.

Currently each enumerated type is compatible with some integer type.
The proposal doesn't discuss type compatibility. Either "enum foo" in
the example above should be compatible with "unsigned short", or it
shouldn't be compatible with any integer type.

The syntax for an *enum-specifier* (the tokens between the ":" and the
"{") is too restrictive. It permits "unsigned short" but not "short
unsigned", unlike in any other context. Worse, it forbids the use of a
typedef name.

Rather than elaborately reinventing the syntax of an integer type name,
it should simply use a *type-name*, with a constraint that it must be
the name of an integer type.

Should "enum : _Bool { ... }" be permitted?

Proposal two (which depends on proposal one but may be ignored without
affecting proposal one) uses the phrase "implicit cast". The correct
term is "implicit conversion".

It proposes removing the implicit conversion from integer types to enums
(only for enums defined with the new syntax), but permits implicit
conversion from enums to integers. I'm not convinced this asymmetry is
desirable. On the other hand, it's consistent with C++.
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Ian Collins
2016-03-22 04:11:07 UTC
Post by Keith Thompson
I think this was mentioned here recently, but I can't find the article.
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2008.pdf
is a proposal for an enhancement to C's enumerated types, allowing the
programmer to specify the type used to represent a given enumerated
type. For example, you could write:
enum foo : unsigned short { zero, one, two };
and an object of type enum foo would have the same representation as
unsigned short (rather than some implementation-defined integer type as
is currently the case).
I generally like the idea, but I have a few suggestions on the wording.
Currently each enumerated type is compatible with some integer type.
The proposal doesn't discuss type compatibility. Either "enum foo" in
the example above should be compatible with "unsigned short", or it
shouldn't be compatible with any integer type.
Compatibility with enums declared with the same type specifier should
also be mentioned. Proposal two does partially address compatibility.
Post by Keith Thompson
The syntax for an *enum-specifier* (the tokens between the ":" and the
"{") is too restrictive. It permits "unsigned short" but not "short
unsigned", unlike in any other context. Worse, it forbids the use of a
typedef.
Agree, the C++ rule is "an enum-base shall name an integral type", which
makes more sense. Type specifiers are particularly useful in embedded
code where fixed-width types are widely used. One of the main drivers
for type-specific enums is to set the width.
Post by Keith Thompson
Rather than elaborately reinventing the syntax of an integer type name,
it should simply use a *type-name*, with a constraint that it must be
the name of an integer type.
Should "enum : _Bool { ... }" be permitted?
Using the integral type rule would answer that.
Post by Keith Thompson
Proposal two (which depends on proposal one but may be ignored without
affecting proposal one) uses the phrase "implicit cast". The correct
term is "implicit conversion".
It proposes removing the implicit conversion from integer types to enums
(only for enums defined with the new syntax), but permits implicit
conversion from enums to integers. I'm not convinced this asymmetry is
desirable. On the other hand, it's consistent with C++.
I believe removing the implicit conversion from integer types is a long
overdue improvement. Mind you, my view is tainted by having had to debug
too much code where "invalid" values were assigned to enums... It will
move this class of error from run time to compile time, which is always a
bonus. Only applying this rule to new enums won't break existing
(possibly broken!) code.
--
Ian Collins
s***@casperkitty.com
2016-03-22 06:32:05 UTC
Post by Ian Collins
I believe removing the implicit conversion from integer types is a long
overdue improvement. Mind you, my view is tainted by having had to debug
too much code where "invalid" values were assigned to enums... It will
move this class of error from run time to compile time, which is always a
bonus. Only applying this rule to new enums won't break existing
(possibly broken!) code.
IMHO, there should exist a means of declaring enum types which implicitly
convert, and a means of declaring ones that don't. Both use cases are
important.
Ian Collins
2016-03-22 06:46:55 UTC
Post by s***@casperkitty.com
Post by Ian Collins
I believe removing the implicit conversion from integer types is a long
overdue improvement. Mind you, my view is tainted by having had to debug
too much code where "invalid" values were assigned to enums... It will
move this class of error from run time to compile time, which is always a
bonus. Only applying this rule to new enums won't break existing
(possibly broken!) code.
IMHO, there should exist a means of declaring enum types which implicitly
convert, and a means of declaring ones that don't. Both use cases are
important.
Isn't that covered by the second part of the proposal? C++ successfully
addressed this issue with new enums, so it makes sense for C to do the
same.
--
Ian Collins
David Brown
2016-03-22 10:12:05 UTC
Post by Ian Collins
Post by Keith Thompson
The syntax for an *enum-specifier* (the tokens between the ":" and the
"{") is too restrictive. It permits "unsigned short" but not "short
unsigned", unlike in any other context. Worse, it forbids the use of a
typedef.
Agree, the C++ rule is "an enum-base shall name an integral type", which
makes more sense. Type specifiers are particularly useful in embedded
code where fixed-width types are widely used. One of the main drivers
for type-specific enums is to set the width.
Yes. The rationale in n2008 is for use in embedded systems, especially
among users of the MISRA guidelines. MISRA pretty much bans the use of
"short int", "int" and "long int" - it heavily pushes the use of
size-specific types. So this proposal would force its target users to
have #define'd macros for type names rather than typedefs or <stdint.h>
types.

The obvious solution here is for the C implementation to copy that of
C++ (except for the "enum class" scoped enumerations).
Post by Ian Collins
Post by Keith Thompson
Rather than elaborately reinventing the syntax of an integer type name,
it should simply use a *type-name*, with a constraint that it must be
the name of an integer type.
Should "enum : _Bool { ... }" be permitted?
Using the integral type rule would answer that.
Post by Keith Thompson
Proposal two (which depends on proposal one but may be ignored without
affecting proposal one) uses the phrase "implicit cast". The correct
term is "implicit conversion".
It proposes removing the implicit conversion from integer types to enums
(only for enums defined with the new syntax), but permits implicit
conversion from enums to integers. I'm not convinced this asymmetry is
desirable. On the other hand, it's consistent with C++.
I believe removing the implicit conversion from integer types is a long
overdue improvement. Mind you, my view is tainted by having had to debug
too much code where "invalid" values were assigned to enums... It will
move this class of error from run time to compile time, which is always a
bonus. Only applying this rule to new enums won't break existing
(possibly broken!) code.
Richard Bos
2016-03-22 16:10:57 UTC
Post by Keith Thompson
It proposes removing the implicit conversion from integer types to enums
(only for enums defined with the new syntax), but permits implicit
conversion from enums to integers. I'm not convinced this asymmetry is
desirable. On the other hand, it's consistent with C++.
If there are valid type safety reasons for this constraint - I haven't
studied the proposal yet, but I can imagine there are - then it should
at least be symmetric. If you require a cast on only one side, you get
silliness as with C++'s treatment of void * and, e.g., malloc().

Richard
s***@casperkitty.com
2016-03-22 16:26:53 UTC
Post by Richard Bos
If there are valid type safety reasons for this constraint - I haven't
studied the proposal yet, but I can imagine there are - then it should
at least be symmetric. If you require a cast on only one side, you get
silliness as with C++'s treatment of void * and, e.g., malloc().
In situations where every X will meet the requirements of a Y, but not all
Y will meet the requirements of an X, it makes sense to limit conversions
to a single direction. On the other hand, many languages over-simplify
some parts of their type-related rules in ways that make it impossible to
write consistently-sensible rules for other parts, thus leading to much
annoyance [e.g. many languages require a cast for something like
"someByte=(byte)someInt;" which makes sense since not all "int" values will
fit in a byte, while allowing "someByte = someOtherByte;" or "someByte = 3;"
but a desire to use the same rules for all operators means that they end up
requiring a cast for "someByte = someByte & 3;" which makes far less sense
since there's no way the result of "someByte & 3;" could fail to fit into a
byte.] I would hope any rules about implicit conversions would be written
with such issues in mind.
Jakob Bohm
2016-03-22 16:52:29 UTC
Post by s***@casperkitty.com
Post by Richard Bos
If there are valid type safety reasons for this constraint - I haven't
studied the proposal yet, but I can imagine there are - then it should
at least be symmetric. If you require a cast on only one side, you get
silliness as with C++'s treatment of void * and, e.g., malloc().
In situations where every X will meet the requirements of a Y, but not all
Y will meet the requirements of an X, it makes sense to limit conversions
to a single direction. On the other hand, many languages over-simplify
some parts of their type-related rules in ways that make it impossible to
write consistently-sensible rules for other parts, thus leading to much
annoyance [e.g. many languages require a cast for something like
"someByte=(byte)someInt;" which makes sense since not all "int" values will
fit in a byte, while allowing "someByte = someOtherByte;" or "someByte = 3;"
but a desire to use the same rules for all operators means that they end up
requiring a cast for "someByte = someByte & 3;" which makes far less sense
since there's no way the result of "someByte & 3;" could fail to fit into a
byte.] I would hope any rules about implicit conversions would be written
with such issues in mind.
Incidentally, at least one C90 compiler emitted "lost precision"
warnings for every calculation involving smaller-than-int types, even
the use of operators like "+=". Took a lot of dummy casts to reduce
the noise from that compiler. I always considered that to be a
misreading of the "implicit int conversion" rules, but the compiler
could not be upgraded because the next version suddenly dropped support
for a needed target CPU. This happened back in the 1990's.

Enjoy

Jakob
--
Jakob Bohm, CIO, Partner, WiseMo A/S. https://www.wisemo.com
Transformervej 29, 2860 Søborg, Denmark. Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded
s***@casperkitty.com
2016-03-22 17:23:26 UTC
Post by Jakob Bohm
Incidentally, at least one C90 compiler emitted "lost precision"
warnings for every calculation involving smaller-than-int types, even
the use of operators like "+=". Took a lot of dummy casts to reduce
the noise from that compiler. I always considered that to be a
misreading of the "implicit int conversion" rules, but the compiler
could not be upgraded because the next version suddenly dropped support
for a needed target CPU. This happened back in the 1990's.
It still happens in Java and C#.

I wonder why language designers are so fond of the notion that types
should only flow from the inside of an expression outward. Making that
work in a language with overloadable operators would require a means
of setting binding priorities, but being able to control such things
would be useful anyway (and is IMHO necessary if one wants to prevent
astonishment while allowing implicit conversions to be considered in
overload evaluation). Given an expression like:

someDouble = someFloat + someLong;

does it really make sense to convert someLong to float, do the addition
with float precision, and then convert the result to double, or would it
make more sense to either convert both operands to double or else squawk
that the expression is ambiguous?
Jakob Bohm
2016-03-22 17:51:01 UTC
Post by s***@casperkitty.com
Post by Jakob Bohm
Incidentally, at least one C90 compiler emitted "lost precision"
warnings for every calculation involving smaller-than-int types, even
the use of operators like "+=". Took a lot of dummy casts to reduce
the noise from that compiler. I always considered that to be a
misreading of the "implicit int conversion" rules, but the compiler
could not be upgraded because the next version suddenly dropped support
for a needed target CPU. This happened back in the 1990's.
It still happens in Java and C#.
I wonder why language designers are so fond of the notion that types
should only flow from the inside of an expression outward. Making that
work in a language with overloadable operators would require a means
of setting binding priorities, but being able to control such things
would be useful anyway (and is IMHO necessary if one wants to prevent
astonishment while allowing implicit conversions to be considered in
overload evaluation). Given an expression like:
someDouble = someFloat + someLong;
does it really make sense to convert someLong to float, do the addition
with float precision, and then convert the result to double, or would it
make more sense to either convert both operands to double or else squawk
that the expression is ambiguous?
I can't remember right now, but isn't float implicitly converted to
double, or did I get that one wrong?

Anyway, it certainly makes sense for a language to at least act
consistently in regards to expression evaluation, though few languages
would go as far as APL in this regard (strict right to left evaluation,
no operator priorities, lots of operators).

Enjoy

Jakob
--
Jakob Bohm, CIO, Partner, WiseMo A/S. https://www.wisemo.com
Transformervej 29, 2860 Søborg, Denmark. Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded
Tim Rentsch
2016-03-23 20:03:54 UTC
Post by Keith Thompson
I think this was mentioned here recently, but I can't find the article.
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2008.pdf
is a proposal for an enhancement to C's enumerated types, allowing the
programmer to specify the type used to represent a given enumerated
type. For example, you could write:
enum foo : unsigned short { zero, one, two };
and an object of type enum foo would have the same representation as
unsigned short (rather than some implementation-defined integer type as
is currently the case).
I generally like the idea, but I have a few suggestions on the wording.
Currently each enumerated type is compatible with some integer type.
The proposal doesn't discuss type compatibility. Either "enum foo" in
the example above should be compatible with "unsigned short", or it
shouldn't be compatible with any integer type.
The syntax for a *enum-specifier* (the tokens between the ":" and the
"{") is too restrictive. It permits "unsigned short" but not "short
unsigned", unlike in any other context. Worse, it forbids the use of a
typedef [name].
Rather than elaborately reinventing the syntax of an integer type name,
it should simply use a *type-name*, with a constraint that it must be
the name of an integer type.
I agree with your reservations. I also have some additional
reservations and further suggestions.

The new-style enum types should have the same representation (and
alignment) as their carrier type but not be compatible with it.
Any (allowed) arithmetic with a new-style enum type should follow
the same rules as if the carrier type of each enum's type were
substituted for the enum type in question.

Sometimes a small number of enumeration names are put inside an
enumeration definition even though these names don't "belong" to
the enumeration. An example is the lowest value not part of the
enumeration. With type 'int' usually this isn't a problem, but
for smaller types, e.g. 'char', that value might be outside the
range of the enum type's carrier type. It should be possible to
designate certain enumeration literals as not being subject to
the same range constraints as the regular ones. A possible
syntax:

    enum foo : char {
        X = 'x',
        Y = 'y',
        Z = 'z',
        foo_limit : int = Z+1
    };

One other point. Rather than limiting the set of allowed carrier
types to just integer types, any scalar type should be allowed.
Post by Keith Thompson
Should "enum : _Bool { ... }" be permitted?
I think I would say yes to this. The semantics of converting to
such a type should parallel the semantics of converting to _Bool.
Post by Keith Thompson
Proposal two (which depends on proposal one but may be ignored without
affecting proposal one) uses the phrase "implicit cast". The correct
term is "implicit conversion".
It proposes removing the implicit conversion from integer types to enums
(only for enums defined with the new syntax), but permits implicit
conversion from enums to integers. I'm not convinced this asymmetry is
desirable. On the other hand, it's consistent with C++.
Personally I think proposal two is a bad idea. The Standard
might suggest or recommend in a footnote that implementations
may want to give a diagnostic in such cases, but I don't think
it should be an outright constraint violation.

On the other hand, given that the carrier type of a new-style
enum is known without needing to know its full definition, it
might be nice to allow an "opaque" enum definition, as for
example:

enum blah : unsigned short;

How this would work is by analogy with struct or union types that
are declared but not defined. In fact, such "opaque" enum types
are somewhat nicer than opaque struct types, because here the
size is known so clients could declare instances of the enum type
itself (and not just a pointer to one). For converting values,
probably what makes the most sense is not to allow conversion in
either direction (even with casting), except perhaps for equality
and inequality comparisons. It would be nice to have something
in C that really enforces type safety; maybe this is a way to
do that.
Keith Thompson
2016-03-23 21:02:00 UTC
Post by Tim Rentsch
Post by Keith Thompson
I think this was mentioned here recently, but I can't find the article.
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2008.pdf
is a proposal for an enhancement to C's enumerated types, allowing the
programmer to specify the type used to represent a given enumerated
type. For example, you could write:
enum foo : unsigned short { zero, one, two };
and an object of type enum foo would have the same representation as
unsigned short (rather than some implementation-defined integer type as
is currently the case).
I generally like the idea, but I have a few suggestions on the wording.
I got a response from the author. He says he'll make sure my feedback
is noted at the upcoming WG14 meeting where N2008 will be discussed. In
particular, he agrees with my criticism of the way the base type is
specified.
Post by Tim Rentsch
Post by Keith Thompson
Currently each enumerated type is compatible with some integer type.
The proposal doesn't discuss type compatibility. Either "enum foo" in
the example above should be compatible with "unsigned short", or it
shouldn't be compatible with any integer type.
The syntax for a *enum-specifier* (the tokens between the ":" and the
"{") is too restrictive. It permits "unsigned short" but not "short
unsigned", unlike in any other context. Worse, it forbids the use of a
typedef [name].
Rather than elaborately reinventing the syntax of an integer type name,
it should simply use a *type-name*, with a constraint that it must be
the name of an integer type.
I agree with your reservations. I also have some additional
reservations and further suggestions.
The new-style enum types should have the same representation (and
alignment) as their carrier type but not be compatible with it.
Any (allowed) arithmetic with a new-style enum type should follow
the same rules as if the carrier type of each enum's type were
substituted for the enum type in question.
Sometimes a small number of enumeration names are put inside an
enumeration definition even though these names don't "belong" to
the enumeration. An example is the lowest value not part of the
enumeration. With type 'int' usually this isn't a problem, but
for smaller types, eg, 'char', that value might be outside the
range of the enum type's carrier type. It should be possible to
designate certain enumeration literals as not being subject to
the same range constraints as the regular ones. A possible
syntax:
    enum foo : char {
        X = 'x',
        Y = 'y',
        Z = 'z',
        foo_limit : int = Z+1
    };
Hmm. No objection, I suppose, but it might be considered out of scope.
Very nearly the same thing could be accomplished by a separate
declaration:

    enum foo : char { X = 'x', Y = 'y', Z = 'z' };
    enum : int { foo_limit = Z + 1 };

Whether that works depends on the implicit type conversion rules.
Post by Tim Rentsch
One other point. Rather than limiting the set of allowed carrier
types to just integer types, any scalar type should be allowed.
So you'd add floating-point, complex, and pointer types. Of those,
floating-point types are the least weird IMHO, but I'm not sure
const double foo = 42.0;
wouldn't serve the same purpose (unless implicit conversions are
restricted). I don't see how complex or pointer enums would be useful.
Post by Tim Rentsch
Post by Keith Thompson
Should "enum : _Bool { ... }" be permitted?
I think I would say yes to this. The semantics of converting to
such a type should parallel the semantics of converting to _Bool.
I agree, but either way it should be addressed explicitly.
Post by Tim Rentsch
Post by Keith Thompson
Proposal two (which depends on proposal one but may be ignored without
affecting proposal one) uses the phrase "implicit cast". The correct
term is "implicit conversion".
It proposes removing the implicit conversion from integer types to enums
(only for enums defined with the new syntax), but permits implicit
conversion from enums to integers. I'm not convinced this asymmetry is
desirable. On the other hand, it's consistent with C++.
Personally I think proposal two is a bad idea. The Standard
might suggest or recommend in a footnote that implementations
may want to give a diagnostic in such cases, but I don't think
it should be an outright constraint violation.
On the other hand, given that the carrier type of a new-style
enum is known without needing to know its full definition, it
might be nice to allow an "opaque" enum definition, as for
enum blah : unsigned short;
This reminds me of Ada's "derived types". A derived type is defined in
terms of an existing type, but it's a distinct type with no implicit
conversion. (It's what some inexperienced C programmers incorrectly
assume "typedef" means.)

I like the idea of adding such a feature to C, but I don't think "enum"
is a good way to do it. Off the top of my head, there could be a new
_Type keyword, treated syntactically like typedef but creating a new
type. So your example would be:

_Type unsigned short blah;

which would create a new type `blah` with the same characteristics as
`unsigned short`, but incompatible with it.

There would be a lot of details to work out. For example, if there's no
implicit conversion then

blah obj = 42;

would be invalid, and would have to be written as

blah obj = (blah)42;

And I wouldn't want to break the ability to write

some_type obj = { 0 };

for any type. (Allowing `{ }`, with *all* members/elements defaulting
to zero, would be a nice feature.)
Post by Tim Rentsch
How this would work is by analogy with struct or union types that
are declared but not defined. In fact, such "opaque" enum types
are somewhat nicer than opaque struct types, because here the
size is known so clients could declare instances of the enum type
itself (and not just a pointer to one). For converting values,
probably what makes the most sense is not to allow conversion in
either direction (even with casting), except perhaps for equality
and inequality comparisons. It would be nice to have something
in C that really enforces type safety; maybe this is a way to
do that.
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
s***@casperkitty.com
2016-03-24 04:57:19 UTC
Post by Keith Thompson
So you'd add floating-point, complex, and pointer types. Of those,
floating-point types are the least weird IMHO, but I'm not sure
const double foo = 42.0;
wouldn't serve the same purpose (unless implicit conversions are
restricted). I don't see how complex or pointer enums would be useful.
In many contexts, "scalar type" would exclude pointers.
Post by Keith Thompson
This reminds me of Ada's "derived types". A derived type is defined in
terms of an existing type, but it's a distinct type with no implicit
conversion. (It's what some inexperienced C programmers incorrectly
assume "typedef" means.)
C really needs a means of indicating that a structure type should be
considered to be "derived" from another. In the 1990s, on almost every
compiler, if two structure types started with a common initial sequence,
one could cast a pointer of either type to a pointer of the other, and
use either type of pointer to access members of that initial sequence.
Unfortunately, notwithstanding the fact that such an ability was useful
and a lot of code relied upon it to express things that can't be expressed
any other way, compiler writers have sought to remove such abilities from
the language without bothering to define any viable replacement. Being
able to say that types X and Y derive from Z, such that a pointer to X or Y
may be used to access anything in a Z, and a pointer to Z may be used to
access things in an X or Y, though a pointer to an X can't access a Y or
vice versa, would be a useful enhancement that would restore the semantic
power that was available in classic dialects.
Ian Collins
2016-03-24 05:16:59 UTC
Post by s***@casperkitty.com
Post by Keith Thompson
So you'd add floating-point, complex, and pointer types. Of those,
floating-point types are the least weird IMHO, but I'm not sure
const double foo = 42.0;
wouldn't serve the same purpose (unless implicit conversions are
restricted). I don't see how complex or pointer enums would be useful.
In many contexts, "scalar type" would exclude pointers.
But not this one, so restricting the type to an integral type makes sense.
Post by s***@casperkitty.com
Post by Keith Thompson
This reminds me of Ada's "derived types". A derived type is defined in
terms of an existing type, but it's a distinct type with no implicit
conversion. (It's what some inexperienced C programmers incorrectly
assume "typedef" means.)
C really needs a means of indicating that a structure type should be
considered to be "derived" from another. In the 1990s, on almost every
compiler, if two structure types started with a common initial sequence,
one could cast a pointer of either type to a pointer of the other, and
use either type of pointer to access members of that initial sequence.
Unfortunately, notwithstanding the fact that such an ability was useful
and a lot of code relied upon it to express things that can't be expressed
any other way, compiler writers have sought to remove such abilities from
the language without bothering to define any viable replacement. Being
able to say that types X and Y derive from Z such that a pointer to X or Y
may be used to access any things in Z, and a pointer to Z may be used to
access things in an X or Y, though a pointer to an X can't access a Y or
vice versa, would be a useful enhancement that would restore the semantic
power that was available in classic dialects.
How is this relevant to enums?
--
Ian Collins
s***@casperkitty.com
2016-03-24 18:43:34 UTC
Post by Ian Collins
How is this relevant to enums?
For enums, a question of whether a pointer to one type is compatible with
a pointer to the type from which it is derived would seem highly relevant,
would you not agree? Would there be any reason to resolve such issues for
enum types only, without adding a more general mechanism for other types?
Keith Thompson
2016-03-24 15:25:06 UTC
Post by s***@casperkitty.com
Post by Keith Thompson
So you'd add floating-point, complex, and pointer types. Of those,
floating-point types are the least weird IMHO, but I'm not sure
const double foo = 42.0;
wouldn't serve the same purpose (unless implicit conversions are
restricted). I don't see how complex or pointer enums would be useful.
In many contexts, "scalar type" would exclude pointers.
Not in C. N1570 6.2.5p21:

    Arithmetic types and pointer types are collectively called scalar
    types. Array and structure types are collectively called aggregate
    types.

The original proposal was restricted to integer types. Tim Rentsch
proposed extending it to scalar types. It's not likely that Tim would
misuse the word (by "misuse" I mean use it in a manner that's
inconsistent with the standard's definition). But we'll have to wait
for Tim to reply.

[...]
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
s***@casperkitty.com
2016-03-24 16:03:37 UTC
Post by Keith Thompson
Post by s***@casperkitty.com
In many contexts, "scalar type" would exclude pointers.
Arithmetic types and pointer types are collectively called scalar
types. Array and structure types are collectively called aggregate
types.
The original proposal was restricted to integer types. Tim Rentsch
proposed extending it to scalar types. It's not likely that Tim would
misuse the word (by "misuse" I mean use it in a manner that's
inconsistent with the standard's definition). But we'll have to wait
for Tim to reply.
Thanks for the cite. With regard to pointer-enums, I could perhaps see
some usefulness on systems which define a global ranking of all pointers,
if a pointer-enum was guaranteed to compare distinctly from all other
pointers of the indicated type, but was not guaranteed to have any actual
storage allocated to it. An implementation could handle the declaration
of an enum with base type FOO* with values "foo" and "bar" simply by
saying "FOO foo,bar;" but if the compiler/linker knew of parts of the
address space which did not have real storage associated with them, and
in which the compiler or linker would never place a FOO it could allocate
"foo" and "bar" in those areas. Some kinds of code need pointers
with more than one sentinel value, and an enum of pointer type might
work usefully for such a purpose.
Keith Thompson
2016-03-24 18:17:52 UTC
Post by s***@casperkitty.com
Post by Keith Thompson
Post by s***@casperkitty.com
In many contexts, "scalar type" would exclude pointers.
Arithmetic types and pointer types are collectively called scalar
types. Array and structure types are collectively called aggregate
types.
The original proposal was restricted to integer types. Tim Rentsch
proposed extending it to scalar types. It's not likely that Tim would
misuse the word (by "misuse" I mean use it in a manner that's
inconsistent with the standard's definition). But we'll have to wait
for Tim to reply.
Thanks for the cite. With regard to pointer enums, I could perhaps see
some usefulness on systems which define a global ranking of all pointers,
if a pointer-enum was guaranteed to compare distinctly from all other
pointers of the indicated type, but was not guaranteed to have any actual
storage allocated to it. An implementation could handle the declaration
of an enum with base type FOO* with values "foo" and "bar" simply by
saying "FOO foo,bar;" but if the compiler/linker knew of parts of the
address space which did not have real storage associated with them, and
in which the compiler or linker would never place a FOO it could allocate
"foo" and "bar" in those areas. Some kinds of code need pointers
with more than one sentinel value, and an enum of pointer type might
work usefully for such a purpose.
You can already do something similar, like this:

static char do_not_use_0, do_not_use_1;
const FOO *const distinct0 = (FOO*)&do_not_use_0;
const FOO *const distinct1 = (FOO*)&do_not_use_1;

Or you can define FOO objects if alignment might be an issue.

The pointers distinct0 and distinct1 are guaranteed to have values
distinct from any other FOO* values. It doesn't give you ordering, but
I don't think it makes sense to add a new language feature that assumes
that kind of ordering (unless the standard is also updated to require
total ordering of pointer values).
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
s***@casperkitty.com
2016-03-24 18:37:39 UTC
Post by Keith Thompson
The pointers distinct0 and distinct1 are guaranteed to have values
distinct from any other FOO* values. It doesn't give you ordering, but
I don't think it makes sense to add a new language feature that assumes
that kind of ordering (unless the standard is also updated to require
total ordering of pointer values).
Is there any reason the language spec should not be updated so as to require
that implementations report via macro whether, for arbitrary p and q...

0 -- p>q could do anything

1 -- p>q will always yield 0 or 1, in a completely unspecified fashion

2 -- p>q will define a ranking which is consistent but not necessarily
unique (implying that p>q and p<q might both be false).

3 -- p>q will define a ranking which is consistent, and in which disjoint
objects register as disjoint.

All that would be required would be to come up with a name for such an
identifier and meanings for the values it indicates. Since almost all
implementations could easily support level 1, and since for each higher
level there are both algorithms that can benefit from it and
implementations that would be unable to meet its requirements, I'd
suggest the above as a minimum level of granularity, though perhaps
finer distinctions might be useful.

Code which uses the new spec could run on any existing compilers which
naturally support the proper semantics and would allow the appropriate
macro to be predefined (e.g. via the command line).
Tim Rentsch
2016-03-24 22:20:19 UTC
Post by Keith Thompson
Post by s***@casperkitty.com
Post by Keith Thompson
So you'd add floating-point, complex, and pointer types. Of those,
floating-point types are the least weird IMHO, but I'm not sure
const double foo = 42.0;
wouldn't serve the same purpose (unless implicit conversions are
restricted). I don't see how complex or pointer enums would be useful.
In many contexts, "scalar type" would exclude pointers.
Arithmetic types and pointer types are collectively called scalar
types. Array and structure types are collectively called aggregate
types.
The original proposal was restricted to integer types. Tim Rentsch
proposed extending it to scalar types. It's not likely that Tim would
misuse the word (by "misuse" I mean use it in a manner that's
inconsistent with the standard's definition). But we'll have to wait
for Tim to reply.
Yes, I meant scalar type in the same way the Standard uses the
term.
Kaz Kylheku
2016-03-24 05:11:42 UTC
Post by Keith Thompson
Post by Tim Rentsch
enum blah : unsigned short;
This reminds me of Ada's "derived types". A derived type is defined in
terms of an existing type, but it's a distinct type with no implicit
conversion. (It's what some inexperienced C programmers incorrectly
assume "typedef" means.)
This:

enum IDENT : TYPE ... ;

also reminds me of some languages.

Unfortunately, none of them are related to C in any way.

The syntactic aspect of the proposal shows a marked disregard for the
design style of C declarations.

Syntactically, this is completely doable by supporting an enum specifier
as just another declaration specifier that can be mixed with the integer
ones:

// Doh, obvious:

unsigned int enum { a, b, c } x, *y, z(void);
^^^^^^^^ ^^^ ^^^^^^^^^^^^^^^^
1 2 3

Here, we effectively have three declaration specifiers: unsigned, int
and enum { a, b, c }.

The usual perturbation of order can be permitted:

int enum triple { a, b, c } unsigned x, *y, z(void);

(Or not: it could be a constraint violation if the enum isn't the last
specifier, to prevent a poorly readable split such as the above.)

This, of course, is not allowed:

typedef enum ... whatever;

unsigned int whatever x;

The current enum declaration is then just a special case in which no
integer type specifiers are present. The type is chosen according to
the old rules for that situation.
Post by Keith Thompson
I like the idea of adding such a feature to C, but I don't think "enum"
is a good way to do it. Off the top of my head, there could be a new
_Type keyword, treated syntactically like typedef but creating a new
_Type unsigned short blah;
This could appear together with the header

#include <stdrlytpdf.h>

which defines the macro:

#define really_typedef _Type /* not just an alias, darn it! */

:)
Ian Collins
2016-03-24 07:32:34 UTC
Post by Kaz Kylheku
Post by Keith Thompson
Post by Tim Rentsch
enum blah : unsigned short;
This reminds me of Ada's "derived types". A derived type is defined in
terms of an existing type, but it's a distinct type with no implicit
conversion. (It's what some inexperienced C programmers incorrectly
assume "typedef" means.)
enum IDENT : TYPE ... ;
also reminds me of some languages.
Unfortunately, none of them are related to C in any way.
C++?
Post by Kaz Kylheku
The syntactic aspect of the proposal shows a marked disregard for the
design style of C declarations.
Syntactically, this is completely doable by supporting an enum specifier
as just another declaration specifier that can be mixed with the integer
unsigned int enum { a, b, c } x, *y, z(void);
^^^^^^^^ ^^^ ^^^^^^^^^^^^^^^^
1 2 3
Given how often headers are shared between C and C++, why would you want
to be deliberately incompatible?
--
Ian Collins
Kaz Kylheku
2016-03-24 14:35:59 UTC
Post by Ian Collins
Post by Kaz Kylheku
Post by Keith Thompson
Post by Tim Rentsch
enum blah : unsigned short;
This reminds me of Ada's "derived types". A derived type is defined in
terms of an existing type, but it's a distinct type with no implicit
conversion. (It's what some inexperienced C programmers incorrectly
assume "typedef" means.)
enum IDENT : TYPE ... ;
also reminds me of some languages.
Unfortunately, none of them are related to C in any way.
C++?
I see this made it into C++11. In that language, it harmonizes with
class declarations, resembling inheritance: class X : Y { ...

In C, it is foreign.
Post by Ian Collins
Post by Kaz Kylheku
The syntactic aspect of the proposal shows a marked disregard for the
design style of C declarations.
Syntactically, this is completely doable by supporting an enum specifier
as just another declaration specifier that can be mixed with the integer
unsigned int enum { a, b, c } x, *y, z(void);
^^^^^^^^ ^^^ ^^^^^^^^^^^^^^^^
1 2 3
Given how often headers are shared between C and C++, why would you want
to be deliberately incompatible?
Headers /with typed enums/ aren't shared between C and C++ now (at
least not without some hacky workarounds).

If it's introduced in the above way, they still won't be for a while.
(Is there a rush?)

C++ can easily catch up with the more cogent C-like syntax by adopting it
as an alternative.
Ian Collins
2016-03-27 23:38:37 UTC
Post by Kaz Kylheku
Post by Ian Collins
Given how often headers are shared between C and C++, why would you want
to be deliberately incompatible?
Headers /with typed enums/ aren't shared between C and C++ now; (at least
not without the inclusion of some hacky workarounds).
If it's introduced in the above way, they still won't be for a while.
(Is there a rush?)
Given both C and C++ are popular languages for embedded programming and
C++ already has them, I can see these being a popular extension with
embedded C compilers.
--
Ian Collins
Tim Rentsch
2016-03-24 22:19:40 UTC
Post by Keith Thompson
Post by Tim Rentsch
Post by Keith Thompson
I think this was mentioned here recently, but I can't find the article.
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2008.pdf
is a proposal for an enhancement to C's enumerated types, allowing the
programmer to specify the type used to represent a given enumerated
enum foo : unsigned short { zero, one, two };
and an object of type enum foo would have the same representation as
unsigned short (rather than some implementation-defined integer type as
is currently the case).
I generally like the idea, but I have a few suggestions on the wording.
I got a response from the author. He says he'll make sure my feedback
is noted at the upcoming WG14 meeting where N2008 will be discussed. In
particular, he agrees with my criticism of the way the base type is
specified.
Okay, cool.
Post by Keith Thompson
Post by Tim Rentsch
Post by Keith Thompson
Currently each enumerated type is compatible with some integer type.
The proposal doesn't discuss type compatibility. Either "enum foo" in
the example above should be compatible with "unsigned short", or it
shouldn't be compatible with any integer type.
The syntax for a *enum-specifier* (the tokens between the ":" and the
"{") is too restrictive. It permits "unsigned short" but not "short
unsigned", unlike in any other context. Worse, it forbids the use of a
typedef [name].
Rather than elaborately reinventing the syntax of an integer type name,
it should simply use a *type-name*, with a constraint that it must be
the name of an integer type.
I agree with your reservations. I also have some additional
reservations and further suggestions.
The new-style enum types should have the same representation (and
alignment) as their carrier type but not be compatible with it.
Any (allowed) arithmetic with a new-style enum type should follow
the same rules as if the carrier type of each enum's type were
substituted for the enum type in question.
Sometimes a small number of enumeration names are put inside an
enumeration definition even though these names don't "belong" to
the enumeration. An example is the lowest value not part of the
enumeration. With type 'int' usually this isn't a problem, but
for smaller types, eg, 'char', that value might be outside the
range of the enum type's carrier type. It should be possible to
designate certain enumeration literals as not being subject to
the same range constraints as the regular ones. A possible
enum foo : char {
X = 'x',
Y = 'y',
Z = 'z',
foo_limit : int = Z+1
};
Hmm. No objection, I suppose, but it might be considered out of scope.
Very nearly the same thing could be accomplished by a separate definition:
enum foo : char { X = 'x', Y = 'y', Z = 'z' };
enum : int { foo_limit = Z + 1 };
Whether that works depends on the implicit type conversion rules.
I think declaring "extra" enumeration literals is a fairly common
pattern, depending of course on circumstances. If it is a common
pattern then it seems appropriate for the new feature proposal to
recognize it. Even if this idea ends up not being included it
should at least be considered.
Post by Keith Thompson
Post by Tim Rentsch
One other point. Rather than limiting the set of allowed carrier
types to just integer types, any scalar type should be allowed.
So you'd add floating-point, complex, and pointer types. Of those,
floating-point types are the least weird IMHO, but I'm not sure
const double foo = 42.0;
wouldn't serve the same purpose (unless implicit conversions are
restricted). I don't see how complex or pointer enums would be useful.
Enumeration types typically serve one of two purposes: one, to
have a set of mutually exclusive values whose specific values are
(usually) not important as long as they are distinct; and two, to
define a related set of named constants, with specified values,
and which might not be mutually exclusive, for convenience,
documentation, and (sometimes) to facilitate better compile-time
checking (eg, switch() statements). It's in connection with the
second pattern that I think allowing other scalar types might
sometimes be useful. Also, given that the particular carrier type
is specified by the developer, restricting those to be integer
types seems like an arbitrary restriction. In general I think
programming language features should not impose arbitrary
restrictions, assuming of course there is no compelling reason
otherwise to do so. Here I don't see one. Do you?

A difference between a 'double' enum literal and the declared
variable 'foo' above is that enum literals are constant
expressions, but declared variables are not.
Post by Keith Thompson
Post by Tim Rentsch
Post by Keith Thompson
Should "enum : _Bool { ... }" be permitted?
I think I would say yes to this. The semantics of converting to
such a type should parallel the semantics of converting to _Bool.
I agree, but either way it should be addressed explicitly.
Absolutely.
Post by Keith Thompson
Post by Tim Rentsch
Post by Keith Thompson
Proposal two (which depends on proposal one but may be ignored without
affecting proposal one) uses the phrase "implicit cast". The correct
term is "implicit conversion".
It proposes removing the implicit conversion from integer types to enums
(only for enums defined with the new syntax), but permits implicit
conversion from enums to integers. I'm not convinced this asymmetry is
desirable. On the other hand, it's consistent with C++.
Personally I think proposal two is a bad idea. The Standard
might suggest or recommend in a footnote that implementations
may want to give a diagnostic in such cases, but I don't think
it should be an outright constraint violation.
On the other hand, given that the carrier type of a new-style
enum is known without needing to know its full definition, it
might be nice to allow an "opaque" enum declaration, for example:
enum blah : unsigned short;
This reminds me of Ada's "derived types". A derived type is defined in
terms of an existing type, but it's a distinct type with no implicit
conversion. (It's what some inexperienced C programmers incorrectly
assume "typedef" means.)
I like the idea of adding such a feature to C, but I don't think "enum"
is a good way to do it. [..ideas about how derived types might
be added to C..]
I agree there is some resemblance between derived types in Ada and
what I'm calling "opaque enums", but I don't think of either as a
subset of the other. My key point here is I think there should be
a way to declare a specified-carrier-type enum *without having to
also define the enumeration literals that are part of it*. Under
the current proposal there is no way to do that. The resulting
enum type being "opaque" is just a (happy) natural consequence of
having the two be separated. It might also be nice to have a
facility for derived types in C, but that doesn't take the place
of the "split enum" construct as I mean to convey it.
Post by Keith Thompson
And I wouldn't want to break the ability to write
some_type obj = { 0 };
for any type. (Allowing `{ }`, with *all* members/elements defaulting
to zero, would be a nice feature.)
This problem needs to be thought through, because it isn't
necessarily the case that 0 is a legal value for a particular
enumeration type. IMO this issue should be pointed out to the
proposal's original author.
Keith Thompson
2016-03-24 22:50:50 UTC
[SNIP]
Post by Tim Rentsch
Enumeration types typically serve one of two purposes: one, to
have a set of mutually exclusive values whose specific values are
(usually) not important as long as they are distinct; and two, to
define a related set of named constants, with specified values,
and which might not be mutually exclusive, for convenience,
documentation, and (sometimes) to facilitate better compile-time
checking (eg, switch() statements). It's in connection with the
second pattern that I think allowing other scalar types might
sometimes be useful. Also, given that the particular carrier type
is specified by the developer, restricting those to be integer
types seems like an arbitrary restriction. In general I think
programming language features should not impose arbitrary
restrictions, assuming of course there is no compelling reason
otherwise to do so. Here I don't see one. Do you?
Perhaps not. I'm content to sit back and let the committee consider it.
Post by Tim Rentsch
A difference between a 'double' enum literal and the declared
variable 'foo' above is that enum literals are constant
expressions, but declared variables are not.
True. There aren't many cases where it matters whether a
floating-point expression is a constant expression or not. One such
case is in an initializer for a static object.

[...]
Post by Tim Rentsch
Post by Keith Thompson
And I wouldn't want to break the ability to write
some_type obj = { 0 };
for any type. (Allowing `{ }`, with *all* members/elements defaulting
to zero, would be a nice feature.)
This problem needs to be thought through, because it isn't
necessarily the case that 0 is a legal value for a particular
enumeration type. IMO this issue should be pointed out to the
proposal's original author.
Currently, 0 (suitably converted) is always a legal value for an
enumeration type.

enum foo { a = 10, b = 20 };
enum foo obj = 0; /* perfectly valid */

I don't think this proposal breaks that.

I've given the author a Google Groups URL for this thread.
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Ian Collins
2016-03-24 23:01:13 UTC
Post by Keith Thompson
Currently, 0 (suitably converted) is always a legal value for an
enumeration type.
enum foo { a = 10, b = 20 };
enum foo obj = 0; /* perfectly valid */
I don't think this proposal breaks that.
Surly this is something that deserves to be broken!
--
Ian Collins
Keith Thompson
2016-03-24 23:27:51 UTC
Post by Ian Collins
Post by Keith Thompson
Currently, 0 (suitably converted) is always a legal value for an
enumeration type.
enum foo { a = 10, b = 20 };
enum foo obj = 0; /* perfectly valid */
I don't think this proposal breaks that.
Surly this is something that deserves to be broken!
"Surly" is an amusing typo. 8-)}

Enumerated types are integer types, not (as they are in some languages)
a distinct kind of type that doesn't interoperate with integers. Each
enumerated type is compatible with some integer type.

This proposal doesn't make enumerated types more abstract. It ties them
even more closely to the underlying integer types by allowing you to
specify exactly what the underlying type is.

And if `enum foo obj = 0;` were broken, it would also break the current
ability to write

some_type obj = { 0 };

for *any* type; it would fail if some_type is either a new-style
enumerated type or an aggregate type containing something of such a type.
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Ian Collins
2016-03-27 23:42:32 UTC
Post by Keith Thompson
Post by Ian Collins
Post by Keith Thompson
Currently, 0 (suitably converted) is always a legal value for an
enumeration type.
enum foo { a = 10, b = 20 };
enum foo obj = 0; /* perfectly valid */
I don't think this proposal breaks that.
Surly this is something that deserves to be broken!
"Surly" is an amusing typo. 8-)}
Enumerated types are integer types, not (as they are in some languages)
a distinct kind of type that doesn't interoperate with integers. Each
enumerated type is compatible with some integer type.
This proposal doesn't make enumerated types more abstract. It ties them
even more closely to the underlying integer types by allowing you to
specify exactly what the underlying type is.
And if `enum foo obj = 0;` were broken, it would also break the current
ability to write
some_type obj = { 0 };
for *any* type; it would fail if some_type is either a new-style
enumerated type or an aggregate type containing something of such a type.
I've just checked this out and the above is legal in C++, so we can write

struct X {
int n;
enum : unsigned { one = 42, two } e;
};

X x = {0};
--
Ian Collins
Tim Rentsch
2016-03-25 17:33:05 UTC
Post by Keith Thompson
[SNIP]
Post by Tim Rentsch
Enumeration types typically serve one of two purposes: one, to
have a set of mutually exclusive values whose specific values are
(usually) not important as long as they are distinct; and two, to
define a related set of named constants, with specified values,
and which might not be mutually exclusive, for convenience,
documentation, and (sometimes) to facilitate better compile-time
checking (eg, switch() statements). It's in connection with the
second pattern that I think allowing other scalar types might
sometimes be useful. Also, given that the particular carrier type
is specified by the developer, restricting those to be integer
types seems like an arbitrary restriction. In general I think
programming language features should not impose arbitrary
restrictions, assuming of course there is no compelling reason
otherwise to do so. Here I don't see one. Do you?
Perhaps not. I'm content to sit back and let the committee consider it.
My question here is not "would you recommend this?" but "do you
see any problems that would come up if it were adopted?". I
think that's a fair question considering the previous discussion.
I don't mean to settle the question of what should be adopted
(ie, in the Standard) by talking about it in the newsgroup.
Post by Keith Thompson
[...]
Post by Tim Rentsch
Post by Keith Thompson
And I wouldn't want to break the ability to write
some_type obj = { 0 };
for any type. (Allowing `{ }`, with *all* members/elements
defaulting to zero, would be a nice feature.)
This problem needs to be thought through, because it isn't
necessarily the case that 0 is a legal value for a particular
enumeration type. IMO this issue should be pointed out to the
proposal's original author.
Currently, 0 (suitably converted) is always a legal value for an
enumeration type.
enum foo { a = 10, b = 20 };
enum foo obj = 0; /* perfectly valid */
I don't think this proposal breaks [ie, forbids] that.
Right, but the question is what happens if part II is
adopted, which prevents the conversion of int to an
enumerated type that has an explicit carrier type.
Post by Keith Thompson
I've given the author a Google Groups URL for this thread.
Excellent.
s***@casperkitty.com
2016-03-25 19:29:38 UTC
Post by Tim Rentsch
Right, but the question is what happens if part II is
adopted, which prevents the conversion of int to an
enumerated type that has an explicit carrier type.
There are a few possibilities. For one, it would be possible to allow
direct conversions from int to enumeration or from enumeration to int,
without allowing direct implicit conversion between enumerations based
on the same type. For another, it would be possible to regard "0" as a
special case, as is done with pointers.
Tim Rentsch
2016-04-01 00:03:33 UTC
Post by s***@casperkitty.com
Post by Tim Rentsch
Right, but the question is what happens if part II is
adopted, which prevents the conversion of int to an
enumerated type that has an explicit carrier type.
There are a few possibilities. For one, it would be possible to
allow direct conversions from int to enumeration or from enumeration
to int, without allowing direct implicit conversion between
enumerations based on the same type. For another, it would be
possible to regard "0" as a special case, as is done with pointers.
The question is not what could be done but what does the proposal
say about what would be done (ie, under the proposal).
Ian Collins
2016-03-24 22:59:13 UTC
Post by Tim Rentsch
Post by Keith Thompson
So you'd add floating-point, complex, and pointer types. Of those,
floating-point types are the least weird IMHO, but I'm not sure
const double foo = 42.0;
wouldn't serve the same purpose (unless implicit conversions are
restricted). I don't see how complex or pointer enums would be useful.
Enumeration types typically serve one of two purposes: one, to
have a set of mutually exclusive values whose specific values are
(usually) not important as long as they are distinct; and two, to
define a related set of named constants, with specified values,
and which might not be mutually exclusive, for convenience,
documentation, and (sometimes) to facilitate better compile-time
checking (eg, switch() statements). It's in connection with the
second pattern that I think allowing other scalar types might
sometimes be useful. Also, given that the particular carrier type
is specified by the developer, restricting those to be integer
types seems like an arbitrary restriction. In general I think
programming language features should not impose arbitrary
restrictions, assuming of course there is no compelling reason
otherwise to do so. Here I don't see one. Do you?
The second case is where the use of enumerated types differs between C
and C++. In C++, the second case is seldom used so there is little
point in allowing anything other than an integral type for the enum base
type. Opening up the possibilities for enum base types to non-integral
types opens up a can of specification worms, such as what should the
rules be for initialisers? Simply extending the base type from one
integral type (int) to any integral type allows the current set of
rules to be retained.

Another potentially entertaining result of allowing scalar types is
that an enum's base type could be another enum...
Post by Tim Rentsch
A difference between a 'double' enum literal and the declared
variable 'foo' above is that enum literals are constant
expressions, but declared variables are not.
In C, would you want to introduce constant expression scalar types in
this way? Wouldn't it be better to admit them to the language directly
and keep enumerated type bases integral?

While compile time floating point constants have some use in numerical
algorithms, their use cases are far fewer than compile time integral
constants.
Post by Tim Rentsch
Post by Keith Thompson
Post by Tim Rentsch
Post by Keith Thompson
Should "enum : _Bool { ... }" be permitted?
I think I would say yes to this. The semantics of converting to
such a type should parallel the semantics of converting to _Bool.
I agree, but either way it should be addressed explicitly.
Absolutely.
As it does for C++, "an enum-base shall name an integral type" would
suffice.
--
Ian Collins
s***@casperkitty.com
2016-03-24 23:05:47 UTC
Post by Ian Collins
Another potentially entertaining result of allowing scalar types is
that an enum's base type could be another enum...
Not entirely without some reasonable uses, e.g. in cases where many
functions that use an enumerated "option-specifier" type have some
options which are common to all of them, and others which vary among
the different types. A
programmer would have to keep track of when covariance or contravariance
makes the most sense, but if a language is going to regard different enum
types as not being universally compatible, being able to specify what types
are compatible with each other would seem like an important feature.
Keith Thompson
2016-03-24 23:29:26 UTC
Ian Collins <ian-***@hotmail.com> writes:
[...]
Post by Ian Collins
Another potentially entertaining result of allowing scalar types is
that an enum's base type could be another enum...
Enumerated types are already integer types (N1570 6.2.5p17), so if the
tweaks I've suggested are accepted that would be allowed anyway.
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Tim Rentsch
2016-03-25 16:55:29 UTC
Post by Ian Collins
Post by Tim Rentsch
Post by Keith Thompson
So you'd add floating-point, complex, and pointer types. Of those,
floating-point types are the least weird IMHO, but I'm not sure
const double foo = 42.0;
wouldn't serve the same purpose (unless implicit conversions are
restricted). I don't see how complex or pointer enums would be useful.
Enumeration types typically serve one of two purposes: one, to
have a set of mutually exclusive values whose specific values are
(usually) not important as long as they are distinct; and two, to
define a related set of named constants, with specified values,
and which might not be mutually exclusive, for convenience,
documentation, and (sometimes) to facilitate better compile-time
checking (eg, switch() statements). It's in connection with the
second pattern that I think allowing other scalar types might
sometimes be useful. Also, given that the particular carrier type
is specified by the developer, restricting those to be integer
types seems like an arbitrary restriction. In general I think
programming language features should not impose arbitrary
restrictions, assuming of course there is no compelling reason
otherwise to do so. Here I don't see one. Do you?
The second case is where the use of enumerated types differs between C
and C++. In C++, the second case is seldom used so there is little
point in allowing anything other than an integral type for the enum
base type.
Since the question is about a change to C, what matters is
how C is used, not how C++ is used.
Post by Ian Collins
Opening up the possibilities for enum base types to non-integral
types opens up a can of specification worms, such as what should
the rules be for initialisers? [...]
I see no particular problem. Without thinking about it too
deeply, ISTM that the same rule could apply for non-integer types
as for integer types -- an initializer could have any type that
is assignable to the carrier type for an "open" enumerated type,
and any type that is assignable to the enumerated type itself for
a "closed" enumerated type. So what's the problem?
Ian Collins
2016-03-25 20:06:26 UTC
Post by Tim Rentsch
Post by Ian Collins
Post by Tim Rentsch
Post by Keith Thompson
So you'd add floating-point, complex, and pointer types. Of those,
floating-point types are the least weird IMHO, but I'm not sure
const double foo = 42.0;
wouldn't serve the same purpose (unless implicit conversions are
restricted). I don't see how complex or pointer enums would be useful.
Enumeration types typically serve one of two purposes: one, to
have a set of mutually exclusive values whose specific values are
(usually) not important as long as they are distinct; and two, to
define a related set of named constants, with specified values,
and which might not be mutually exclusive, for convenience,
documentation, and (sometimes) to facilitate better compile-time
checking (eg, switch() statements). It's in connection with the
second pattern that I think allowing other scalar types might
sometimes be useful. Also, given that the particular carrier type
is specified by the developer, restricting those to be integer
types seems like an arbitrary restriction. In general I think
programming language features should not impose arbitrary
restrictions, assuming of course there is no compelling reason
otherwise to do so. Here I don't see one. Do you?
The second case is where the use of enumerated types differs between C
and C++. In C++, the second case is seldom used so there is little
point in allowing anything other than an integral type for the enum
base type.
Since the question is about a change to C, what matters is
how C is used, not how C++ is used.
Context, given this style of enum is already there.
Post by Tim Rentsch
Post by Ian Collins
Opening up the possibilities for enum base types to non-integral
types opens up a can of specification worms, such as what should
the rules be for initialisers? [...]
I see no particular problem. Without thinking about it too
deeply, ISTM that the same rule could apply for non-integer types
as for integer types -- an initializer could have any type that
is assignable to the carrier type for an "open" enumerated type,
and any type that is assignable to the enumerated type itself for
a "closed" enumerated type. So what's the problem?
enum : void* {
first, // What value do you assign?
last // What value do you assign?
--
Ian Collins
s***@casperkitty.com
2016-03-25 20:16:26 UTC
Post by Ian Collins
Post by Tim Rentsch
I see no particular problem. Without thinking about it too
deeply, ISTM that the same rule could apply for non-integer types
as for integer types -- an initializer could have any type that
is assignable to the carrier type for an "open" enumerated type,
and any type that is assignable to the enumerated type itself for
a "closed" enumerated type. So what's the problem?
enum : void* {
first, // What value do you assign?
last // What value do you assign?
For first, any address that is guaranteed not to represent the address of
any object whose address might be taken, and isn't shared with any other
enum. The compiler could treat the above as:

char __dummy_enum_storage[2];

and then regard "first" as equivalent to (void*)(__dummy_enum_storage), and
"last" as equivalent to (void*)(__dummy_enum_storage + 1), but it could also
use any other addresses meeting the requirement, including the addresses of
static variables or addresses within functions. Such a thing would likely
not be very useful absent some guarantees beyond those normally required by
the Standard [e.g. a guarantee that if p==q, (uintptr_t)p == (uintptr_t)q,
which is not required by the Standard but which most implementations could
offer at no cost] but could be useful in such contexts, especially if
there were a guarantee that all of the enumerations created within a single
declaration would sort consecutively.
Tim Rentsch
2016-04-01 00:08:55 UTC
Post by Ian Collins
Post by Tim Rentsch
Post by Ian Collins
Post by Tim Rentsch
Post by Keith Thompson
So you'd add floating-point, complex, and pointer types. Of those,
floating-point types are the least weird IMHO, but I'm not sure
const double foo = 42.0;
wouldn't serve the same purpose (unless implicit conversions are
restricted). I don't see how complex or pointer enums would be useful.
Enumeration types typically serve one of two purposes: one, to
have a set of mutually exclusive values whose specific values are
(usually) not important as long as they are distinct; and two, to
define a related set of named constants, with specified values,
and which might not be mutually exclusive, for convenience,
documentation, and (sometimes) to facilitate better compile-time
checking (eg, switch() statements). It's in connection with the
second pattern that I think allowing other scalar types might
sometimes be useful. Also, given that the particular carrier type
is specified by the developer, restricting those to be integer
types seems like an arbitrary restriction. In general I think
programming language features should not impose arbitrary
restrictions, assuming of course there is no compelling reason
otherwise to do so. Here I don't see one. Do you?
The second case is where the use of enumerated types differs between C
and C++. In C++, the second case is seldom used so there is little
point in allowing anything other than an integral type for the enum
base type.
Since the question is about a change to C, what matters is
how C is used, not how C++ is used.
Context, given this style of enum is already there.
What C++ has done can be taken into consideration, but there is
no reason it should dictate what choices are made on the C side.
If what C implements is a superset of what C++ implements then
there shouldn't be any problem.
Post by Ian Collins
Post by Tim Rentsch
Post by Ian Collins
Opening up the possibilities for enum base types to non-integral
types opens up a can of specification worms, such as what should
the rules be for initialisers? [...]
I see no particular problem. Without thinking about it too
deeply, ISTM that the same rule could apply for non-integer types
as for integer types -- an initializer could have any type that
is assignable to the carrier type for an "open" enumerated type,
and any type that is assignable to the enumerated type itself for
a "closed" enumerated type. So what's the problem?
enum : void* {
first, // What value do you assign?
last // What value do you assign?
No value, it's a constraint violation: an expression of integer
type may not be assigned to a pointer type without a cast.
Keith Thompson
2016-04-01 01:37:21 UTC
[...]
Post by Tim Rentsch
Post by Ian Collins
Post by Tim Rentsch
I see no particular problem. Without thinking about it too
deeply, ISTM that the same rule could apply for non-integer types
as for integer types -- an initializer could have any type that
is assignable to the carrier type for an "open" enumerated type,
and any type that is assignable to the enumerated type itself for
a "closed" enumerated type. So what's the problem?
enum : void* {
first, // What value do you assign?
last // What value do you assign?
No value, it's a constraint violation: an expression of integer
type may not be assigned to a pointer type without a cast.
0 can, which suggests that this:

enum : void* { first };

might be legal, making `first` a constant whose value is a null pointer
(though not necessarily a null pointer constant).

Can you present an example of `enum : void* ...` being useful enough to
justify adding the feature to the language?
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Tim Rentsch
2016-04-01 02:52:28 UTC
Post by Keith Thompson
[...]
Post by Tim Rentsch
Post by Ian Collins
Post by Tim Rentsch
I see no particular problem. Without thinking about it too
deeply, ISTM that the same rule could apply for non-integer types
as for integer types -- an initializer could have any type that
is assignable to the carrier type for an "open" enumerated type,
and any type that is assignable to the enumerated type itself for
a "closed" enumerated type. So what's the problem?
enum : void* {
first, // What value do you assign?
last // What value do you assign?
No value, it's a constraint violation: an expression of integer
type may not be assigned to a pointer type without a cast.
enum : void* { first };
might be legal, making `first` a constant whose value is a null pointer
(though not necessarily a null pointer constant).
Right, this case is plausibly reasonable to allow. I thought of
this before but (perhaps too optimistically) didn't say anything
about it. My sense is that the least surprising thing to do is
not allow it, because only null pointer constants -- and not an
integer value 0 that is not a null pointer constant -- are allowed
for implicit conversion to pointer types; since there is no null
pointer constant, the conversion isn't allowed. But I expect I
could be talked into the other point of view, if a good case were
made for it. It seems a minor matter.
Post by Keith Thompson
Can you present an example of `enum : void* ...` being useful enough to
justify adding the feature to the language?
I have an example for another pointer type, if you'll allow me:

enum : void (*)(int) {
SIG_DFL = ...,
SIG_ERR = ...,
SIG_IGN = ...,
};

This way of defining these <signal.h> identifiers seems nicer
than having to define them as macros. I admit this is a small
example, but I expect others would come up if the capability
were available. I think the more important question is Does
allowing all scalar types (including pointer types) cause any
problems? I don't know of any, but possibly I have overlooked
something.
Jakob Bohm
2016-04-01 03:22:13 UTC
...
Post by Tim Rentsch
Post by Keith Thompson
Can you present an example of `enum : void* ...` being useful enough to
justify adding the feature to the language?
enum : void (*)(int) {
SIG_DFL = ...,
SIG_ERR = ...,
SIG_IGN = ...,
};
This way of defining these <signal.h> identifiers seems nicer
than having to define them as macros...
Except that code checking if the platform it is being compiled for has
a definition for a given signal name would then fail because the
preprocessor doesn't detect enum values.



Enjoy

Jakob
--
Jakob Bohm, CIO, Partner, WiseMo A/S. https://www.wisemo.com
Transformervej 29, 2860 Søborg, Denmark. Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded
Kaz Kylheku
2016-04-01 03:38:28 UTC
Post by Jakob Bohm
...
Post by Tim Rentsch
Post by Keith Thompson
Can you present an example of `enum : void* ...` being useful enough to
justify adding the feature to the language?
enum : void (*)(int) {
SIG_DFL = ...,
SIG_ERR = ...,
SIG_IGN = ...,
};
This way of defining these <signal.h> identifiers seems nicer
than having to define them as macros...
Except that code checking if the platform it is being compiled for has
a definition for a given signal name would then fail because the
preprocessor doesn't detect enum values.
This could be taken care of by a "compiler conditional". This is an if
statement whose antecedent condition is a "compile-time expression".

A compile-time expression is an expression which is composed of only
constant subexpressions, and at least one constituent which is a
compile-time operator such as the "defined" operator.

Example:

if (defined (SIG_DFL)) {
// consequent
} else {
// alternative
}

The rule for processing a conditional with a compile-time conditional is
that the not-taken branch of the conditional is not fully parsed.

The not-taken branch is only scanned for balancing parentheses, brackets
and braces.

For instance, suppose SIG_DFL is not defined:

// this is OK:
if (defined (SIG_DFL)) {
{ blah { blah [ x ] } ((42,)) } // garbage syntax, but closure
// punctuators balance
}

In a compile-time ignored block, no syntax or constraint violation is
enforced other than the balancing of the closure punctuator tokens.
This allows:

if (defined (foo_type_t)) {
foo_type_t x;
// ...
}

There is no constraint violation that foo_type_t is undeclared if
the condition is false.

The keyword static could mark constant expressions as compile time.

if (static sizeof (foo) > sizeof (bar)) { ... }

Or perhaps:

if static (cond) ...
Jakob Bohm
2016-04-01 04:13:36 UTC
Post by Kaz Kylheku
Post by Jakob Bohm
...
Post by Tim Rentsch
Post by Keith Thompson
Can you present an example of `enum : void* ...` being useful enough to
justify adding the feature to the language?
enum : void (*)(int) {
SIG_DFL = ...,
SIG_ERR = ...,
SIG_IGN = ...,
};
This way of defining these <signal.h> identifiers seems nicer
than having to define them as macros...
Except that code checking if the platform it is being compiled for has
a definition for a given signal name would then fail because the
preprocessor doesn't detect enum values.
This could be taken care of by a "compiler conditional". This is an if
statement whose antecedent condition is a "compile-time expression".
A compile-time expression is an expression which is composed of only
constant subexpressions, and at least one constituent which is a
compile-time operator such as the "defined" operator.
if (defined (SIG_DFL)) {
// consequent
} else {
// alternative
}
The rule for processing a conditional with a compile-time conditional is
that the not-taken branch of the conditional is not fully parsed.
The not-taken branch is only scanned for balancing parentheses, brackets
and braces.
if (defined (SIG_DFL)) {
{ blah { blah [ x ] } ((42,)) } // garbage syntax, but closure
// punctuators balance
}
In a compile-time ignored block, no syntax or constraint violation is
enforced other than the balancing of the closure punctuator tokens.
if (defined (foo_type_t)) {
foo_type_t x;
// ...
}
There is no constraint violation that foo_type_t is undeclared if
the condition is false.
The keyword static could mark constant expressions as compile time.
if (static sizeof (foo) > sizeof (bar)) { ... }
if static (cond) ...
Haha, you just described the preprocessor (except for its non-access to
language semantics).


Enjoy

Jakob
--
Jakob Bohm, CIO, Partner, WiseMo A/S. https://www.wisemo.com
Transformervej 29, 2860 Søborg, Denmark. Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded
s***@casperkitty.com
2016-04-01 14:45:41 UTC
Post by Jakob Bohm
Haha, you just described the preprocessor (except for its non-access to
language semantics).
And a car is just like an airplane except that it can't fly. While there
can be some advantages to having preprocessing be a separate step before
compilation (e.g. the possibility of inserting additional filters between
the preprocessor and the compiler), there are also situations where
conditional compilation should depend upon information which an initial
preprocessor pass would not have available to it (e.g. the sizes of
structures), and the present design doesn't work particularly well at
handling such things.
Jakob Bohm
2016-04-01 15:27:44 UTC
Post by s***@casperkitty.com
Post by Jakob Bohm
Haha, you just described the preprocessor (except for its non-access to
language semantics).
And a car is just like an airplane except that it can't fly. While there
can be some advantages to having preprocessing be a separate step before
compilation (e.g. the possibility of inserting additional filters between
the preprocessor and the compiler), there are also situations where
conditional compilation should depend upon information which an initial
preprocessor pass would not have available to it (e.g. the sizes of
structures), and the present design doesn't work particularly well at
handling such things.
As an alternative to inventing a new syntax, one could do what Borland
C/C++ did in the 1990s: Integrate the preprocessor and the language
parser and allow references to sizeof() etc. in #if directives.

Simple, obvious and has/had an existing production implementation.

Of course their implementation had the unfortunate side effect that when
using the preprocessor "on its own" (via a different command line), it
would still complain about C or C++ specific semantic issues in the
file being preprocessed.

Enjoy

Jakob
--
Jakob Bohm, CIO, Partner, WiseMo A/S. https://www.wisemo.com
Transformervej 29, 2860 Søborg, Denmark. Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded
Keith Thompson
2016-04-01 16:38:23 UTC
Jakob Bohm <jb-***@wisemo.com> writes:
[...]
Post by Jakob Bohm
As an alternative to inventing a new syntax, one could do what Borland
C/C++ did in the 1990s: Integrate the preprocessor and the language
parser and allow references to sizeof() etc. in #if directives.
Simple, obvious and has/had an existing production implementation.
Of course their implementation had the unfortunate side effect that when
using the preprocessor "on its own" (via a different command line), it
would still complain about C or C++ specific semantic issues in the
file being preprocessed.
Some years ago, a poster either here or in comp.lang.c reminisced about
the Old Days when you could use sizeof in preprocessor expressions.

Dennis Ritchie posted a followup: "Must have been before my time."
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
s***@casperkitty.com
2016-04-01 17:22:47 UTC
Post by Jakob Bohm
As an alternative to inventing a new syntax, one could do what Borland
C/C++ did in the 1990s: Integrate the preprocessor and the language
parser and allow references to sizeof() etc. in #if directives.
Simple, obvious and has/had an existing production implementation.
How would that work with existing projects that require the injection of a
filter between the preprocessor and the compiler? As a simple example,
consider an embedded system with a non-ASCII display device (a lot of on-
screen display chips have character sets that don't resemble ASCII at all).
Such a thing could be accommodated nicely by having a filter between the
preprocessor and the main compiler which looks for strings of the form
$"text"$ and substitutes, e.g. "\x13\x04\x17\x13". Some existing projects
make use of such injection; how could a one-pass compiler support it?

I could see some usefulness for defining a standardized means of injecting
filters for things [e.g. have the compiler accept the name of, and parameters
for, an executable that would act as a filter] but not all compilers run on
systems that could support such a thing, and so I doubt the Standards
Committee would be eager to require it.
Jakob Bohm
2016-04-04 15:33:20 UTC
Post by s***@casperkitty.com
Post by Jakob Bohm
As an alternative to inventing a new syntax, one could do what Borland
C/C++ did in the 1990s: Integrate the preprocessor and the language
parser and allow references to sizeof() etc. in #if directives.
Simple, obvious and has/had an existing production implementation.
How would that work with existing projects that require the injection of a
filter between the preprocessor and the compiler? As a simple example,
consider an embedded system with a non-ASCII display device (a lot of on-
screen display chips have character sets that don't resemble ASCII at all).
Such a thing could be accommodated nicely by having a filter between the
preprocessor and the main compiler which looks for strings of the form
$"text"$ and substitutes, e.g. "\x13\x04\x17\x13". Some existing projects
make use of such injection; how could a one-pass compiler support it?
I could see some usefulness for defining a standardized means of injecting
filters for things [e.g. have the compiler accept the name of, and parameters
for, an executable that would act as a filter] but not all compilers run on
systems that could support such a thing, and so I doubt the Standards
Committee would be eager to require it.
For the compiler in question, one could use its (C semantics aware)
preprocessor mode to produce a preprocessed file, apply the private
filter and then feed the result back to the full compiler (which would
run the preprocessor again but not encounter preprocessor directives
other than #pragma) (I don't recall if the standalone mode of the
Borland C preprocessor would actually pass through pragmas or not).

For a standard definition one could describe a C/C++ semantics aware
textual preprocessor which parsed and understood (but discarded) the
semantics of the C and C++ languages (it would take an external
language identifying argument not mentioned in the standard because
each standard would refer only to said preprocessor running according
to that standard). This would not be an efficient implementation
style, just a way to describe the behavior in the standard and a way
to still run such a preprocessor standalone when desired.

It is worth noting that a custom filter could then not change any
semantics actually checked by the preprocessing directives of the
actual program, but it could still cause the "compile" pass to see
different semantics than what the preprocessor saw.



Enjoy

Jakob
--
Jakob Bohm, CIO, Partner, WiseMo A/S. https://www.wisemo.com
Transformervej 29, 2860 Søborg, Denmark. Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded
Kaz Kylheku
2016-04-01 16:11:36 UTC
Post by s***@casperkitty.com
Post by Jakob Bohm
Haha, you just described the preprocessor (except for its non-access to
language semantics).
And an car is just like an airplane except that it can't fly. While there
A parser for the C grammar which spits out a syntax tree is just a
"preprocessor" for the stage which analyzes the tree.
Philip Lantz
2016-04-02 20:01:56 UTC
... a "compiler conditional" ... is an if
statement whose antecedent condition is a "compile-time expression".
A compile-time expression is an expression which is composed of only
constant subexpressions, and at least one constituent which is a
compile-time operator such as the "defined" operator.
if (defined (SIG_DFL)) {
// consequent
} else {
// alternative
}
The rule for processing a conditional with a compile-time conditional is
that the not-taken branch of the conditional is not fully parsed.
The not-taken branch is only scanned for balancing parentheses, brackets
and braces.
And quotation marks, presumably? (Both kinds.)
Keith Thompson
2016-04-02 20:17:24 UTC
Post by Philip Lantz
... a "compiler conditional" ... is an if
statement whose antecedent condition is a "compile-time expression".
A compile-time expression is an expression which is composed of only
constant subexpressions, and at least one constituent which is a
compile-time operator such as the "defined" operator.
if (defined (SIG_DFL)) {
// consequent
} else {
// alternative
}
The rule for processing a conditional with a compile-time conditional is
that the not-taken branch of the conditional is not fully parsed.
The not-taken branch is only scanned for balancing parentheses, brackets
and braces.
And quotation marks, presumably? (Both kinds.)
That's covered by tokenization. The not-taken branch would have to
consist of a sequence of valid tokens. Parentheses, brackets, and
braces would be matched after that's determined.
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Kaz Kylheku
2016-04-02 22:34:24 UTC
Post by Philip Lantz
... a "compiler conditional" ... is an if
statement whose antecedent condition is a "compile-time expression".
A compile-time expression is an expression which is composed of only
constant subexpressions, and at least one constituent which is a
compile-time operator such as the "defined" operator.
if (defined (SIG_DFL)) {
// consequent
} else {
// alternative
}
The rule for processing a conditional with a compile-time conditional is
that the not-taken branch of the conditional is not fully parsed.
The not-taken branch is only scanned for balancing parentheses, brackets
and braces.
And quotation marks, presumably? (Both kinds.)
Quotation marks are constituents of string literal tokens. The above
processing would happen in a phase in which the program is already
decomposed into tokens. (In fact, after preprocessing).
Philip Lantz
2016-04-03 08:12:28 UTC
Post by Kaz Kylheku
Post by Philip Lantz
... a "compiler conditional" ... is an if
statement whose antecedent condition is a "compile-time expression".
A compile-time expression is an expression which is composed of only
constant subexpressions, and at least one constituent which is a
compile-time operator such as the "defined" operator.
if (defined (SIG_DFL)) {
// consequent
} else {
// alternative
}
The rule for processing a conditional with a compile-time conditional is
that the not-taken branch of the conditional is not fully parsed.
The not-taken branch is only scanned for balancing parentheses, brackets
and braces.
And quotation marks, presumably? (Both kinds.)
Quotation marks are constituents of string literal tokens. The above
processing would happen in a phase in which the program is already
decomposed into tokens. (In fact, after preprocessing).
Ah, yes, of course.
s***@casperkitty.com
2016-04-02 20:54:46 UTC
Post by Kaz Kylheku
This could be taken care of by a "compiler conditional". This is an if
statement whose antecedent condition is a "compile-time expression".
A compile-time expression is an expression which is composed of only
constant subexpressions, and at least one constituent which is a
compile-time operator such as the "defined" operator.
I don't think I like having something that is so similar syntactically to a
normal 'if' but has such different semantics [requiring the compiler to
refrain from generating link-time references to functions called within
an ignored block, etc.]. Such a design may also work poorly in cases where
the purpose of such a conditional would be to define a variable whose type
should vary based upon compile-time factors [e.g. have the type of a struct
member vary depending upon the size of another structure].

On the other hand, since "if" requires an open parenthesis as the next token,
it might be possible to have a different syntax that could allow for such
possibilities.
Kaz Kylheku
2016-04-02 22:37:01 UTC
Post by s***@casperkitty.com
Post by Kaz Kylheku
This could be taken care of by a "compiler conditional". This is an if
statement whose antecedent condition is a "compile-time expression".
A compile-time expression is an expression which is composed of only
constant subexpressions, and at least one constituent which is a
compile-time operator such as the "defined" operator.
I don't think I like having something that is so similar syntactically to a
normal 'if' but has such different semantics [requiring the compiler to
refrain from generating link-time references to functions called within
an ignored block, etc.] Such a design may also work poorly in cases where
the purpose of such a conditional would be to define a variable whose type
should vary based upon compile-time factors [e.g. have the type of a struct
member vary depending upon the size of another structure].
Sure.
Post by s***@casperkitty.com
On the other hand, since "if" requires an open parenthesis as the next token,
it might be possible to have a different syntax that could allow for such
possibilities.
Or that "static if (expr)" idea. No juxtaposition of the keywords static
and if is correct syntax currently, so it can be introduced without
ambiguity.

Also, you might want such a static if around some external
declarations/definitions, not only in a context where a statement is
allowed.

// File scope
static if (whatever) {
int foo() { ... } // external definition
} else {
// something else
}
Tim Rentsch
2016-04-06 15:36:58 UTC
Post by Jakob Bohm
...
Post by Tim Rentsch
Post by Keith Thompson
Can you present an example of `enum : void* ...` being useful enough to
justify adding the feature to the language?
enum : void (*)(int) {
SIG_DFL = ...,
SIG_ERR = ...,
SIG_IGN = ...,
};
This way of defining these <signal.h> identifiers seems nicer
than having to define them as macros...
Except that code checking if the platform it is being compiled for has
a definition for a given signal name would then fail because the
preprocessor doesn't detect enum values.
It was just an example for purposes of illustration. I'm not
suggesting this be used for the <signal.h> identifiers, which
have additional requirements such as the one you point out.
Kaz Kylheku
2016-04-01 03:24:08 UTC
Post by Tim Rentsch
enum : void (*)(int) {
SIG_DFL = ...,
SIG_ERR = ...,
SIG_IGN = ...,
};
This way of defining these <signal.h> identifiers seems nicer
than having to define them as macros.
But the set of signal handlers is broader than just these constants.
Thus there is no use for the enum type; it's just used for its side
effect of defining the constants.

These constants can't be used in a switch(), or for declaring
arrays or anything that requires a constant expression.

In other words: what is the remaining advantage over:

typedef void (*sighand_t)(int);

const sighand_t SIG_DFL = ...,
SIG_ERR = ...,
SIG_IGN = ...;
Tim Rentsch
2016-04-06 15:45:23 UTC
Post by Kaz Kylheku
Post by Tim Rentsch
enum : void (*)(int) {
SIG_DFL = ...,
SIG_ERR = ...,
SIG_IGN = ...,
};
This way of defining these <signal.h> identifiers seems nicer
than having to define them as macros.
But the set of signal handlers is broader than just these constants.
Thus there is no use for the enum type; it's just used for its side
effect of defining the constants.
That depends on how the type conversion rules work. Types like
this one could be used to provide more type-safe interfaces.
Post by Kaz Kylheku
These constants can't be used in a switch(), or for declaring
arrays or anything that requires a constant expression.
Not for declaring arrays, but plausibly in a switch() by means
of an extension. It might be nice if switch() were extended
to other scalar types, in which case the more general enum
types would fit very nicely.
Post by Kaz Kylheku
typedef void (*sighand_t)(int);
const sighand_t SIG_DFL = ...,
SIG_ERR = ...,
SIG_IGN = ...;
The identifiers in an enum type don't occupy any run-time
storage, and are constant expressions. Personally I think using
enum types for manifest constants is a better choice than the
"overloaded const" feature that C++ has.
BartC
2016-04-01 17:54:12 UTC
Post by Tim Rentsch
Post by Keith Thompson
Can you present an example of `enum : void* ...` being useful enough to
justify adding the feature to the language?
enum : void (*)(int) {
SIG_DFL = ...,
SIG_ERR = ...,
SIG_IGN = ...,
};
This is going a long way from what enumerated types were intended for.
Which was to define a related set of symbols that corresponded to some
integer, and where the integer values were usually assigned
automatically, and consecutively.

What your example wants is Named Constants (I'm lost with what C and C++
are up to with these, but I mean entities that don't require storage).
--
Bartc
s***@casperkitty.com
2016-04-01 19:38:55 UTC
Post by BartC
What your example wants is Named Constants (I'm lost with what C and C++
are up to with these, but I mean entities that don't require storage).
Actually, they're very close to Java's "enum" types, though the underlying
type for all of those would be an Object reference (for which "void*" would
be the closest C analogue).

On many C implementations, it could be practical and useful to be able to
have an area of address space which is reserved for "enum pointer" values,
and have an intrinsic which identifies whether a given pointer has a value
associated with it [and--if so--what that value is]. For most non-
segmented architectures, the code could be something like:

int __enum_ptr_has_value(void* it)
{ return (char*)it>=__ENUM_ALLOC_START && (char*)it<__ENUM_ALLOC_END; }

int __get_enum_ptr_value(void* it)
{ return (char*)it-__ENUM_ALLOC_START; }

Note that assigning overly large numbers to enums would require allocating
a corresponding amount of address space (which may or may not be affordable)
and some implementations may require more complicated behavior for the
above functions, but being able to have more than one sentinel value for
pointers could be handy, and on many systems such a feature would be cheap
to implement.
BartC
2016-04-01 20:17:59 UTC
Post by s***@casperkitty.com
Post by BartC
What your example wants is Named Constants (I'm lost with what C and C++
are up to with these, but I mean entities that don't require storage).
Actually, they're very close to Java's "enum" types, though the underlying
type for all of those would be an Object reference (for which "void*" would
be the closest C analogue).
Sorry, which are close to Java: the proposed pointer constants, or what
I was on about? The examples I've seen of Java enums look like the
latter: versions of the classic enumeration types of Pascal.

(And Pascal enums have lots of nice features and are more type-safe than
what is used in C, but they would be a bad fit in C.)
Post by s***@casperkitty.com
int __enum_ptr_has_value(void* it)
{ return it>=__ENUM_ALLOC_START && it<__ENUM_ALLOC_END; }
int __get_enum_ptr_value(void* it)
{ return (char*)it-__ENUM_ALLOC_START; }
I think this is getting even further away from what enums are supposed
to be! Unless the aim is to think of as many uses for 'enum' as there
are for 'static'.
--
Bartc
s***@casperkitty.com
2016-04-01 20:34:46 UTC
Post by BartC
Sorry, which are close to Java: the proposed pointer constants, or what
I was on about? The examples I've seen of Java enums look like the
latter: versions of the classic enumeration types of Pascal.
Each enumeration value in Java is a distinct constructed object, and an
enum-type variable is a reference which can identify one of those
constructed objects. That's distinct from most dialects of Pascal, in
which enumerations are thinly-disguised integers.
BartC
2016-04-01 20:54:28 UTC
Post by s***@casperkitty.com
Post by BartC
Sorry, which are close to Java: the proposed pointer constants, or what
I was on about? The examples I've seen of Java enums look like the
latter: versions of the classic enumeration types of Pascal.
Each enumeration value in Java is a distinct constructed object, and an
enum-type variable is a reference which can identify one of those
constructed objects.
That sounds like a rather inefficient implementation then (how do they
work for switch case labels?). But from a programmer's point of view
they look like ordinary enums.
Post by s***@casperkitty.com
That's distinct from most dialects of Pascal, in
which enumerations are thinly-disguised integers.
But thinly disguised integers are exactly what we want! Then they can
generate more efficient code, be used as switch-case labels, as array
bounds...

However the C implementation is limited:

enum colours {red, green, blue};
enum lights {red, amber, green};

* C doesn't allow this because there are two reds and two greens

* C doesn't stop an int containing, say, colour.red from being assigned
lights.green. There is no type protection.

* There is no way, given an int containing an enum value, to get the
previous or successive enum (since there is no type identity)

* It's not possible to pick up the first or last of each enum sequence

* etc.

These are quite difficult to add to C without massive changes to the
type system. But, multiple enums, like my example, aren't so difficult.
It does mean having to type colours.red instead of red, as enums still
aren't proper types.

Another way is to have enums inside struct definitions, but that's
more of a can of worms because it can lead to full classes and all that.
--
Bartc
s***@casperkitty.com
2016-04-01 21:15:16 UTC
Post by BartC
Post by s***@casperkitty.com
Each enumeration value in Java is a distinct constructed object, and an
enum-type variable is a reference which can identify one of those
constructed objects.
That sounds like a rather inefficient implementation then (how do they
work for switch case labels?). But from a programmer's point of view
they look like ordinary enums.
In Java, every variable is either a primitive or a reference to a user-
defined object. Each live reference will slightly increase the time required
for a garbage-collection cycle, but references can otherwise be manipulated
just as cheaply as integers, so there's not much of a performance penalty.

I'm not positive, but I believe that enumeration objects in Java have a
field which holds a sequential index, and switch/case constructs use that
field. This adds an extra layer of indirection to a switch/case, but if
an enum value is used very often the target of that indirection will be
cached so the cost won't be too high.
Tim Rentsch
2016-04-06 16:02:52 UTC
Post by BartC
Post by Tim Rentsch
Post by Keith Thompson
Can you present an example of `enum : void* ...` being useful enough to
justify adding the feature to the language?
enum : void (*)(int) {
SIG_DFL = ...,
SIG_ERR = ...,
SIG_IGN = ...,
};
This is going a long way from what enumerated types were intended
for. Which was to define a related set of symbols that corresponded to
some integer, and where the integer values were usually assigned
automatically, and consecutively.
Yes, it is changing the notion of what enumerated types are,
and that is deliberate even in the original proposal. What
I'm suggesting is that the set of carrier types allowed not
be artificially restricted to integer types, but accept the
more general set of all scalar types.
Post by BartC
What your example wants is Named Constants (I'm lost with what C and
C++ are up to with these, but I mean entities that don't require
storage).
Not quite. What I think you may be missing is the "opaque"
enumerated type idea, where a header file declaration specifies
the carrier type but does not define any of the associated
identifiers. The type definition would be fully fleshed out only
in a single TU (or conceivably more than one, but importantly not
in every TU that includes the "opaque" type declaration). There
are lots of cases where having opaque types could be helpful, to
limit the creation of values of these types to just the TU(s)
that implement the type abstraction. In C, scalar types are the
right fit (ie, as the carrier type) for this application, as
their sizes are known to callers, with the enum "wrapping"
providing the opaqueness.

Ian Collins
2016-04-01 08:43:03 UTC
Post by Tim Rentsch
Post by Ian Collins
Post by Tim Rentsch
I see no particular problem. Without thinking about it too
deeply, ISTM that the same rule could apply for non-integer types
as for integer types -- an initializer could have any type that
is assignable to the carrier type for an "open" enumerated type,
and any type that is assignable to the enumerated type itself for
a "closed" enumerated type. So what's the problem?
enum : void* {
first, // What value do you assign?
last // What value do you assign?
No value, it's a constraint violation: an expression of integer
type may not be assigned to a pointer type without a cast.
So what possible benefit is there in allowing anything other than an
integral type as the base type of an enum?
--
Ian Collins
Ian Collins
2016-03-24 03:26:30 UTC
Post by Tim Rentsch
On the other hand, given that the carrier type of a new-style
enum is known without needing to know its full definition, it
might be nice to allow an "opaque" enum definition, as for
enum blah : unsigned short;
I agree with this, it would be compatible with the new enums in C++11.
--
Ian Collins
Tim Rentsch
2016-03-24 22:21:15 UTC
Post by Ian Collins
Post by Tim Rentsch
On the other hand, given that the carrier type of a new-style
enum is known without needing to know its full definition, it
might be nice to allow an "opaque" enum definition, as for
enum blah : unsigned short;
I agree with this, it would be compatible with the new enums in C++11.
Wow. Great minds think alike, eh? :)
Hans-Bernhard Bröker
2016-03-24 21:10:21 UTC
Post by Keith Thompson
I think this was mentioned here recently, but I can't find the article.
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2008.pdf
is a proposal for an enhancement to C's enumerated types, allowing the
programmer to specify the type used to represent a given enumerated
type.
I have one more observation about this proposal, that I think has been
overlooked here so far. The proposal (and most of the discussion here)
only talks about how enumeration types are defined, but much less about
how the enumeration values themselves behave. AFAICS, C99 6.4.4.3 is still
the rule:

"An identifier declared as an enumeration constant has type int."

N2008 shows no intent to change that. I don't see how that's supposed
to square up with uses of, say, an enum : unsigned long long or enum :
char type as proposed here. Particularly part two of the proposal
really makes no sense unless 6.4.4.3 is also changed.

enum E2 d = m21; /* legal */
enum E2 e = 2; /* illegal */

I don't see how one could be legal, but not the other. Both these lines
rely on the same implicit conversion of a constant of type "int" to an
enum type. It's used to initialize an enum-typed object here, but they
evidently intend likewise for assignments and other uses.
s***@casperkitty.com
2016-03-24 21:22:55 UTC
Post by Hans-Bernhard Bröker
"An identifier declared as an enumeration constant has type int."
Presumably that rule would only apply to enumeration types that don't specify
some other behavior. It would be silly and obnoxious, for example, to say
that one is allowed to declare an enumeration whose underlying type is long
long, but that it may not have any named members beyond INT_MAX.

Further, if one of the goals is to add semantic validation to the language,
it would be necessary to recognize enumerations as distinct types, and
change the rules for arithmetic operators to accommodate their existence.
Given that the way arithmetic operators work on basic types was designed
more for compiler simplicity than semantic expressiveness, I would think
having new types which use different rules would help ease portability
issues without breaking existing code. Such rules might be too complex to
handle on a machine with 16K of RAM, and would thus not have been suitable
as part of the language in 1975, but this isn't 1975, and it should be
practical for compilers to offer much better semantics than would have
been practical then.
Ian Collins
2016-03-24 21:41:28 UTC
Post by Hans-Bernhard Bröker
Post by Keith Thompson
I think this was mentioned here recently, but I can't find the article.
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2008.pdf
is a proposal for an enhancement to C's enumerated types, allowing the
programmer to specify the type used to represent a given enumerated
type.
I have one more observation about this proposal, that I think has been
overlooked here so far. The proposal (and most of the discussion here)
only talks about how enumeration types are defined, but much less about
the enumeration values themselves behave. AFAICS, C99 6.4.4.3 is still
"An identifier declared as an enumeration constant has type int."
The C++ wording would be appropriate:

"Each enumeration also has an underlying type. The underlying type can
be explicitly specified using enum-base; if not explicitly specified,
the underlying type of a scoped enumeration type is int."
--
Ian Collins
Hans-Bernhard Bröker
2016-03-25 01:23:23 UTC
Post by Ian Collins
Post by Hans-Bernhard Bröker
I have one more observation about this proposal, that I think has been
overlooked here so far. The proposal (and most of the discussion here)
only talks about how enumeration types are defined, but much less about
the enumeration values themselves behave. AFAICS, C99 6.4.4.3 is still
"An identifier declared as an enumeration constant has type int."
"Each enumeration also has an underlying type. The underlying type can
be explicitly specified using enum-base; if not explicitly specified,
the underlying type of a scoped enumeration type is int."
I don't think that would apply. That wording is about the enum's type,
not that of its enumerators.

The actual relevant wording, as I understand C++11 draft n3337, is 7.2
[decl.enum] p5s5:

"Following the closing brace of an enum-specifier, each enumerator has
the type of its enumeration."

Because of this, C++ needs implicit integer promotion from enum to
integer for

enum { foo = 3 };
int bar = foo;

to be allowed. But in C, this `foo' already _is_ int.
Ian Collins
2016-03-25 01:49:27 UTC
Post by Hans-Bernhard Bröker
Post by Ian Collins
Post by Hans-Bernhard Bröker
I have one more observation about this proposal, that I think has been
overlooked here so far. The proposal (and most of the discussion here)
only talks about how enumeration types are defined, but much less about
the enumeration values themselves behave. AFAICS, C99 6.4.4.3 is still
"An identifier declared as an enumeration constant has type int."
"Each enumeration also has an underlying type. The underlying type can
be explicitly specified using enum-base; if not explicitly specified,
the underlying type of a scoped enumeration type is int."
I don't think that would apply. That wording is about the enum's type,
not that of its enumerators
The actual relevant wording, as I understand C1x draft n3337, is 7.2
"Following the closing brace of an enum-specifier, each enumerator has
the type of its enumeration."
Yes, I agree.

Probably the easiest change is to substitute "enum-type-specifier" for
"int" in the current wording. That is probably the only option that
makes sense given the type specifier can be wider than int.
Post by Hans-Bernhard Bröker
Because of this, C++ needs implicit integer promotion from enum to
integer for
enum { foo = 3 };
int bar = foo;
to be allowed. But In C, this `foo' already _is_ int.
Yes, the difference is expressed in the choice of names:
"enum-type-specifier" in the C proposal and "enum-base" in the C++ standard.
--
Ian Collins
Keith Thompson
2016-03-24 21:55:51 UTC
Post by Hans-Bernhard Bröker
Post by Keith Thompson
I think this was mentioned here recently, but I can't find the article.
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2008.pdf
is a proposal for an enhancement to C's enumerated types, allowing the
programmer to specify the type used to represent a given enumerated
type.
I have one more observation about this proposal, that I think has been
overlooked here so far. The proposal (and most of the discussion here)
only talks about how enumeration types are defined, but much less about
the enumeration values themselves behave. AFAICS, C99 6.4.4.3 is still
"An identifier declared as an enumeration constant has type int."
N2008 shows no intent to change that. I don't see how that's supposed
char type as proposed here. Particularly part two of the proposal
really makes no sense unless 6.4.4.3 is also changed.
enum E2 d = m21; /* legal */
enum E2 e = 2; /* illegal */
I don't see how one could be legal, but not the other. Both these lines
rely on the same implicit conversion of a constant of type "int" to an
enum type. It's used to initialize an enum-typed object here, but they
evidently intend likewise for assignments and other uses.
I presume it was simply an oversight. It makes sense for the constants
for enum types of the new form to be of the enumerated type or of the
underlying integer type. (Which one, and whether it matters, depends on
how the implicit conversion rules are adjusted.)
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Hans-Bernhard Bröker
2016-03-25 01:40:27 UTC
Post by Keith Thompson
Post by Hans-Bernhard Bröker
I have one more observation about this proposal, that I think has been
overlooked here so far. The proposal (and most of the discussion here)
only talks about how enumeration types are defined, but much less about
the enumeration values themselves behave. AFAICS, C99 6.4.4.3 is still
"An identifier declared as an enumeration constant has type int."
N2008 shows no intent to change that.
I presume it was simply an oversight.
If so, then quite an oversight it is.

In the MISRA C environment this proposal originated in, I suspect fixing
this oversight might well end up negating a sizable portion of the
intended benefits.
Post by Keith Thompson
It makes sense for the constants
for enum types of the new form to be of the enumerated type or of the
underlying integer type. (Which one, and whether it matters, depends on
how the implicit conversion rules are adjusted.)
Whatever the types and their implicit conversions are decided to be,
they probably won't matter much to MISRA users. MISRA frowns upon all
implicit conversions, so people have to spell them out as explicit
conversions every single time.