Discussion:
Can we add @ yet?
luserdroog
2016-11-23 23:58:17 UTC
Are there any extant or possible implementations of C which
do not yet conform to ASCII 63? If not, can we add @ to the
execution character set already? I vote for it to be a regular
identifier character like _.
Jakob Bohm
2016-11-24 00:23:54 UTC
Post by luserdroog
Are there any extant or possible implementations of C which
do not yet conform to ASCII 63? If not, can we add @ to the
execution character set already? I vote for it to be a regular
identifier character like _.
I believe from some work I did recently, that systems using the Mach-O
and ELF binary file formats reserve use of the @ sign in assembler
identifiers (which such systems use as the intermediary format when
compiling C programs).

Furthermore the C extension known as "Objective C" uses the @ character
as an additional language operator, thus precluding its use in
identifiers in application code.

From a long time ago, I recall the situation for using the $ sign in
identifiers was somewhat similar, but on other systems.

Thus I don't think allowing either character ($ or @) in portable C
identifiers is a good idea.

Now all of that is about the source code character set. The execution
character set (i.e. what can appear in strings, string constants, I/O
library call arguments etc.) is a completely different matter, though I
don't know if there are still systems around that lack those ASCII
characters in their runtime character set.

Enjoy

Jakob
--
Jakob Bohm, CIO, Partner, WiseMo A/S. https://www.wisemo.com
Transformervej 29, 2860 Søborg, Denmark. Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded
BartC
2016-11-24 01:00:15 UTC
Post by Jakob Bohm
Post by luserdroog
Are there any extant or possible implementations of C which
do not yet conform to ASCII 63? If not, can we add @ to the
execution character set already? I vote for it to be a regular
identifier character like _.
I believe from some work I did recently, that systems using the Mach-O
and ELF binary file formats reserve use of the @ sign in assembler
identifiers (which such systems use as the intermediary format when
compiling C programs).
Furthermore the C extension known as "Objective C" uses the @ character
as an additional language operator, thus precluding its use in
identifiers in application code.
From a long time ago, I recall the situation for using the $ sign in
identifiers was somewhat similar, but on other systems.
Thus I don't think allowing either character ($ or @) in portable C
identifiers is a good idea.
What's wrong with using $? (Is it used for anything else?)

Most C compilers I use accept $ in identifiers. But not all (eg. Tiny
C). Which I find a nuisance as $ is invaluable in generated code to
distinguish user-identifiers from translator-created ones.

Using _ for that purpose is hopeless as it's hard to see (eg. names such
as _1, _2, _3 instead of $1, $2, $3) and leading underscores sometimes
have special meaning that you need to steer clear of.

Repeated underscores are also a problem, especially with a proportional
font: take __one and ___one for example, which appear as $$one and
$$$one using '$'.
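
The "special meaning" of leading underscores can be made concrete. The sketch below encodes the C standard's reserved-identifier rules for names that start with an underscore; the function names are hypothetical and chosen only for this illustration.

```c
#include <ctype.h>
#include <stdbool.h>

/* True if `name` starts with an underscore followed by an uppercase
   letter or a second underscore. Such names are reserved for the
   implementation in every context. (Function name is hypothetical.) */
bool is_always_reserved(const char *name)
{
    return name[0] == '_' &&
           (name[1] == '_' || isupper((unsigned char)name[1]));
}

/* True if `name` starts with an underscore at all. Such names are
   reserved at file scope, so generated globals should avoid them too. */
bool is_reserved_at_file_scope(const char *name)
{
    return name[0] == '_';
}
```

Under these rules __one and ___one are always reserved, while _1 is reserved only at file scope, which is one reason a code generator cannot freely use underscore prefixes.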
--
Bartc
Keith Thompson
2016-11-24 01:16:07 UTC
BartC <***@freeuk.com> writes:
[...]
Post by BartC
What's wrong with using $? (Is it used for anything else?)
Most C compilers I use accept $ in identifiers. But not all (eg. Tiny
C). Which I find a nuisance as $ is invaluable in generated code to
distinguish user-identifiers from translator-created ones.
If the standard were changed to permit $ in identifiers, you'd lose that
advantage.
--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
BartC
2016-11-24 11:22:46 UTC
Post by Keith Thompson
[...]
Post by BartC
What's wrong with using $? (Is it used for anything else?)
Most C compilers I use accept $ in identifiers. But not all (eg. Tiny
C). Which I find a nuisance as $ is invaluable in generated code to
distinguish user-identifiers from translator-created ones.
If the standard were changed to permit $ in identifiers, you'd lose that
advantage.
But gcc and msvc allow it. They are both very popular, meaning that many
exclusively use those compilers with no intention of using any other,
yet you rarely see $ being used much, if at all.

Clashes would be very unlikely simply because $ /is/ perceived as special.

(I suppose you could have the rare situation of auto-translated code
itself being passed through an auto-translator, both generating names
using $, but there, in fact even in the simple case of one translating
pass, the software could transform most $s occurring in input code into
something that can't clash.)
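
One possible way to do that transformation, as a minimal sketch: double every $ in identifiers taken from the input, and give translator-created names a single $ followed by a number. This particular scheme and both function names are assumptions made for illustration, not something any existing translator is known to use. It also assumes a compiler that accepts $ in identifiers and an execution character set that contains '$'.

```c
#include <stddef.h>
#include <stdio.h>

/* Copy a user identifier into `out`, doubling every '$'.
   Returns 0 on success, -1 if `out` (capacity `cap`) is too small.
   Hypothetical helper for illustration. */
int escape_user_name(const char *in, char *out, size_t cap)
{
    size_t n = 0;
    if (cap == 0)
        return -1;
    for (; *in != '\0'; ++in) {
        size_t need = (*in == '$') ? 2 : 1;
        if (n + need >= cap)        /* keep room for the terminator */
            return -1;
        if (*in == '$')
            out[n++] = '$';         /* first half of the doubled '$' */
        out[n++] = *in;
    }
    out[n] = '\0';
    return 0;
}

/* Build a translator-created name: a single '$' followed by a number.
   Returns 0 on success, -1 if the name does not fit. */
int make_temp_name(unsigned id, char *out, size_t cap)
{
    int len = snprintf(out, cap, "$%u", id);
    return (len >= 0 && (size_t)len < cap) ? 0 : -1;
}
```

After escaping, every run of $ in a user name has even length, while a generated name starts with a run of exactly one $, so the two sets of names cannot collide. A user identifier $1 becomes $$1, distinct from the generated $1.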
--
Bartc
Keith Thompson
2016-11-24 19:43:06 UTC
Post by BartC
Post by Keith Thompson
[...]
Post by BartC
What's wrong with using $? (Is it used for anything else?)
Most C compilers I use accept $ in identifiers. But not all (eg. Tiny
C). Which I find a nuisance as $ is invaluable in generated code to
distinguish user-identifiers from translator-created ones.
If the standard were changed to permit $ in identifiers, you'd lose that
advantage.
But gcc and msvc allow it. They are both very popular, meaning that many
exclusively use those compilers with no intention of using any other,
yet you rarely see $ being used much, if at all.
Clashes would be very unlikely simply because $ /is/ perceived as special.
There are two options.

1. $ is not permitted in identifiers, except perhaps as an extension.
You can't assume that an identifier containing $ is legal.

2. $ is permitted in identifiers. You can use $ in identifiers -- and
so can anyone else. You can't assume that $ is special.

I suppose there are other possibilities. You could permit $ in
identifiers, but reserve all such identifiers to the implementation
-- but that wouldn't let you safely use them in auto-generated code.
You could reserve them for auto-generated code, but IMHO that
would be a bad idea; for one thing, how do you define which code
is auto-generated?
Post by BartC
(I suppose you could have the rare situation of auto-translated code
itself being passed through an auto-translator, both generating names
using $, but there, in fact even in the simple case of one translating
pass, the software could transform most $s occurring in input code into
something that can't clash.)
Keith Thompson
2016-11-24 00:53:39 UTC
Post by luserdroog
Are there any extant or possible implementations of C which
do not yet conform to ASCII 63?
There are certainly systems that use EBCDIC rather than ASCII -- but
Post by luserdroog
If not, can we add @ to the
execution character set already? I vote for it to be a regular
identifier character like _.
Probably not; others have mentioned some of the reasons. I've never
found the inability to use "@" in identifiers to be a problem.

Note that, of the printable ASCII characters, '@', '$', and '`' are all
missing from the required C character set.
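
That count can be checked mechanically. The sketch below lists the letters, digits, space and 29 graphic characters of the basic character set as it stood when this thread was written, then collects every printable ASCII code that is absent from the list. It assumes the execution character set is ASCII; the names basic_set and missing_from_basic_set are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Letters, digits, the 29 graphic characters and space from the
   basic character set (control characters are left out because only
   printable codes are examined). */
static const char basic_set[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
    "!\"#%&'()*+,-./:;<=>?[\\]^_{|}~ ";

/* Store in `out` every printable ASCII code (0x20..0x7E) that is not
   in basic_set, and return how many were found. `out` needs room for
   at least 96 bytes. Assumes an ASCII execution character set. */
size_t missing_from_basic_set(char *out)
{
    size_t n = 0;
    for (int c = 0x20; c <= 0x7E; ++c) {
        if (strchr(basic_set, c) == NULL)
            out[n++] = (char)c;
    }
    out[n] = '\0';
    return n;
}
```

On an ASCII system the function finds exactly three characters: $, @ and `.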

C already uses more punctuation characters than most other languages do.
I don't think adding more would be useful.
Kaz Kylheku
2016-11-24 01:20:11 UTC
Post by luserdroog
Are there any extant or possible implementations of C which
do not yet conform to ASCII 63? If not, can we add @ to the
execution character set already? I vote for it to be a regular
identifier character like _.
That's rather poor; standardizing it as a constituent of an identifier
precludes it from being used for special syntax.
(Like it is in Objective C).

A compromise might be that it is a "non-terminating constituent
character", the way # is treated in Common Lisp.

This means that an identifier may not start with @, but @ can occur in
the middle.

Lisp has numerous prefix notations dispatched with #, yet abc#def is
just a symbol.

Under this compromise, if an implementation introduces an extension
in which @ behaves as a binary operator, then A@B must be written A @B,
otherwise it denotes the identifier A@B.
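
As a rough sketch of how a lexer might apply that rule: an identifier must start with a letter or underscore, and @ is accepted only in later positions. This models the hypothetical compromise only, not any existing compiler, and scan_identifier and its helpers are made-up names.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Characters that may begin an identifier: letters and underscore. */
static bool is_ident_start(int c)
{
    return c == '_' || isalpha((unsigned char)c);
}

/* Characters that may continue an identifier. Under the hypothetical
   compromise, '@' is allowed here but not at the start. */
static bool is_ident_char(int c)
{
    return c == '_' || c == '@' || isalnum((unsigned char)c);
}

/* Return the length of the identifier token at the start of `s`,
   or 0 if `s` does not begin with one. */
size_t scan_identifier(const char *s)
{
    size_t n = 0;
    if (!is_ident_start(s[0]))
        return 0;
    while (is_ident_char(s[n]))
        ++n;
    return n;
}
```

With this scanner "A@B" yields one token of length 3, "A @B" yields the identifier A of length 1, and "@B" yields no identifier at all, which leaves a leading @ free for use as an operator or prefix.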
BartC
2016-11-24 11:31:33 UTC
Post by luserdroog
Are there any extant or possible implementations of C which
do not yet conform to ASCII 63? If not, can we add @ to the
execution character set already? I vote for it to be a regular
identifier character like _.
(There was a post in comp.lang.c about 20 minutes before yours putting
forward some bizarre proposals for "@" inside expressions; not for
identifiers. Any connection with your post, or a coincidence?)

"@" I think looks too much like a separate symbol to be really thought
of as part of identifier. It looks too overbearing even to be used as a
mere separator.

Apart from which, a lot of software will think an identifier with an
embedded @, such as abc@def, is an email address, and highlight it
accordingly!
--
Bartc
luserdroog
2016-11-24 19:33:52 UTC
Post by BartC
Post by luserdroog
Are there any extant or possible implementations of C which
do not yet conform to ASCII 63? If not, can we add @ to the
execution character set already? I vote for it to be a regular
identifier character like _.
(There was a post in comp.lang.c about 20 minutes before yours putting
forward some bizarre proposals for "@" inside expressions; not for
identifiers. Any connection with your post, or a coincidence?)
Coincidence from my end AFAIAA. I had not seen that
one when I wrote mine.
Post by BartC
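"@" I think looks too much like a separate symbol to be really thought
of as part of identifier. It looks too overbearing even to be used as a
mere separator.
Apart from which, a lot of software will think an identifier with an
embedded @, such as abc@def, is an email address, and highlight it
accordingly!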
Those seem like pretty good reasons, especially with
Jakob's evidence suggesting that the implementations
which might want to make use of '@' already are doing
so, and messing with the situation would do more harm
than good.
Jakob Bohm
2016-11-24 19:42:29 UTC
Post by BartC
Post by luserdroog
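Are there any extant or possible implementations of C which
do not yet conform to ASCII 63? If not, can we add @ to the
execution character set already? I vote for it to be a regular
identifier character like _.
(There was a post in comp.lang.c about 20 minutes before yours putting
forward some bizarre proposals for "@" inside expressions; not for
identifiers. Any connection with your post, or a coincidence?)
"@" I think looks too much like a separate symbol to be really thought
of as part of identifier. It looks too overbearing even to be used as a
mere separator.
Apart from which, a lot of software will think an identifier with an
embedded @, such as abc@def, is an email address, and highlight it
accordingly!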
Common POSIX systems (those using the ELF and Mach-O object
file formats, at least) use an assembler syntax in which "abc@def"
applies the "def" modifier to the "abc" identifier. Since most C
compilers for these platforms compile via a textual assembler
representation, that precludes use of @ in the middle of identifiers.

Additionally (I didn't mention that previously), Microsoft compatible
compilers use @ and $ as part of the name mangling of C and C++
identifiers. Most notably, functions that use the __stdcall calling
convention are mangled by appending @nn where nn is the decimal number
of argument bytes removed from the stack by the function upon its
return.
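
A minimal sketch of that decoration for a C function, assuming the 32-bit x86 convention: besides the @nn suffix described above, that convention also prefixes an underscore, which the sketch includes. The function name decorate_stdcall is hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Build the decorated name of a C __stdcall function as used on
   32-bit x86: "_" + name + "@" + decimal count of argument bytes.
   Returns 0 on success, -1 if the result does not fit in `out`.
   (Hypothetical helper for illustration.) */
int decorate_stdcall(const char *name, unsigned arg_bytes,
                     char *out, size_t cap)
{
    int len = snprintf(out, cap, "_%s@%u", name, arg_bytes);
    return (len >= 0 && (size_t)len < cap) ? 0 : -1;
}
```

For a function named func taking three 4-byte arguments, this produces _func@12. A source-level identifier that already contained @ would make such decorated names ambiguous.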

Enjoy

Jakob
Martin Str|mberg
2016-11-24 17:12:35 UTC
Post by luserdroog
Are there any extant or possible implementations of C which
do not yet conform to ASCII 63? If not, can we add @ to the
execution character set already? I vote for it to be a regular
identifier character like _.
Are you talking about ASCII 63 (question mark) or ASCII 64 (at-sign)?
(Or should I assume ASCII 063 (3) as you _didn't_ mention that?)
--
MartinS
luserdroog
2016-11-24 19:27:39 UTC
Post by Martin Str|mberg
Post by luserdroog
Are there any extant or possible implementations of C which
do not yet conform to ASCII 63? If not, can we add @ to the
execution character set already? I vote for it to be a regular
identifier character like _.
Are you talking about ASCII 63 (question mark) or ASCII 64 (at-sign)?
(Or should I assume ASCII 063 (3) as you _didn't_ mention that?)
Sorry for the confusion, I meant 1963. But, upon checking
my facts, I really meant 1967. https://en.wikipedia.org/wiki/ASCII