Post by Douglas A. Gwyn
Post by Paul Mensonides
The wording of the C standard makes it clear that the rest of the preprocessing
tokens in the file are *not* included. "If the name of the macro being replaced
is found during this scan of the replacement list (not including the rest of the
file's preprocessing tokens), it is not replaced. Furthermore, if any nested
replacements encounter the name of the macro being replaced it is not replaced."
You left out the preceding paragraph, which sets the
"... Then, the resulting preprocessing token
sequence is rescanned, along with all subsequent
preprocessing tokens of the source file, for
more macro names to replace."
So, by your logic, every macro expansion is a contained process that recurses
down the entire file. That is completely flawed reasoning. The above paragraph
shows how and where the *resumption* of scanning occurs. The following
paragraph defines the concept of identifier painting with the explicit caveat
that the rest of the preprocessing tokens are not part of that painting process.
Post by Douglas A. Gwyn
Then the first sentence of the part you quoted is not
applicable to the inner "B" in our example, because it
does not exist yet during "this" scan (the scan of the
immediate consequence of the parameter substitution
and #,## processing, looking for macro names).
However, the second sentence ("Furthermore, ...") does
apply to the name "B" that results from the nested "A"
replacement (which recursively involves its own scan
for further macro names).
The scan of the "immediate consequence" is normal scanning that proceeds
directly into the tokens that follow the outer invocation. The tokens that came
from the replacement list of that invocation constitute the explicit boundaries
for identifier painting. The second sentence is still under the context
established by the first sentence. The only question is whether a partial
invocation constitutes a nested invocation. Further, as I say again, your model
is flawed. The macro expansion process is not, by definition, like a function
call that ultimately returns the rescanned tokens of the replacement list. An
invocation is replaced by the replacement list before rescanning ever occurs
(which is explicit and unarguable). As such, there is no procedural nesting,
which ultimately mimics the intent of the recursive algorithm from which the
standard text was derived.
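A minimal sketch of that ordering, assuming nothing beyond standard C (the names ADD, PICK, and pick_demo are hypothetical and not from this thread): the invocation is replaced by the replacement list first, and the rescan then runs on into the source tokens that follow the invocation.

```c
/* Hypothetical illustration names: ADD, PICK, pick_demo. */
#define ADD(a, b) ((a) + (b))
#define PICK ADD /* object-like; its replacement list is the single token ADD */

/* PICK is replaced by ADD. Rescanning then continues past the end of
   PICK's replacement list, so ADD finds its "(" and its arguments in the
   source tokens that follow the invocation of PICK. */
int pick_demo(void)
{
    return PICK (1, 2); /* becomes ((1) + (2)) */
}
```

If rescanning were confined to the replacement list, ADD would never see its arguments here.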
Post by Douglas A. Gwyn
Post by Paul Mensonides
No, I didn't. The context no longer exists because the invocation is not nested
(see below). You are blatantly ignoring what the standard dictates because of a
typical view that macro expansion is like a procedural hierarchy. Yes, scanning
is not finished with the preprocessing tokens of B's replacement list, but is
finished with the *replacement* of the invocation of B.
No, it is not. The subclauses of 6.10.3 all are part
of the process of macro replacement described by clause
6.10.3. That includes 6.10.3.4 (rescanning and further
replacement).
This is what the standard literally says. This is *unarguable* because there is
no other literal way to interpret it.
[quote]
A preprocessing directive of the form
# define identifier replacement-list new-line
defines an object-like macro that causes each subsequent instance of the macro
name to be replaced by the replacement list of preprocessing tokens that
constitute the remainder of the directive.
[end quote]
It then further goes on to say:
[quote]
Each subsequent instance of the function-like macro name followed by a ( as the
next preprocessing token introduces the sequence of preprocessing tokens that is
replaced by the replacement list in the definition (an invocation of the macro).
[end quote]
This is explicit. An invocation is replaced by the replacement list in the
definition of the macro, not by the rescanned replacement list (not even by the
argument-substituted, stringized, and token-pasted replacement list).
Further, your logic that 6.10.3 describes a single macro expansion from start to
finish is incorrect. 6.10.3 describes macros. That includes the definition of
macros (which has nothing to do with the macro expansion process), the scope of
macro definitions, the invocation of macros, the semantics of macros, and the
undefinition of macros (which also has nothing to do with the macro expansion
process).
Post by Douglas A. Gwyn
There are ways for implementations to take shortcuts,
as it is well known that tail recursion can be replaced
by iteration. But the specification does not refer to
any such shortcuts, and if an implementation loses a
context due to being in too big a hurry to move on, it
is simply a defective implementation.
This is based on an incorrect assumption that a procedural model exists--which
it doesn't. You have to prove that "replaced by the replacement list in the
definition" and similar does not mean what it literally says. Instead, you
assume that it is a procedural model and try to bend the standard text to that
assumption. As such, an implementation can implement macro expansion
recursively, but then it has to mimic the behavior described in the standard.
If it does not, then it is a defective implementation.
Post by Douglas A. Gwyn
If we were to take your view to its logical extreme,
then there would be no possibility of an in-process
macro name appearing within a nested expansion (past
the first level). That is contradicted by the standard
specifying special consequences for cases where that
*does* occur.
No, it isn't. It doesn't appear procedurally nested, but it does appear
geographically or syntactically nested. If we were to take your view to its
logical extreme, then each macro expansion would denote a recursion into all the
rest of the preprocessing tokens of a file, each time going deeper and deeper
into a procedural hierarchy that ultimately returns only when the end of the
file is reached. That is ridiculous.
Post by Douglas A. Gwyn
However, the standard specifies when they are examined and/or
replaced and when they are not examined and/or replaced.
When I say colloquially that tokens are "fetched", I mean
merely that they are available for examination and possible
replacement.
Okay. The only reason that I mention it is that you've made references to a
"replacement buffer" which (though it is a possible implementation strategy) does
not exist in the conceptual model defined by the standard.
Post by Douglas A. Gwyn
Post by Paul Mensonides
... That is not what the standard specifies. It specifies replacement first,
then rescanning, not rescanning followed by replacement.
I am not at all confused about the sequencing.
No, you're not confused. You're flat out wrong.
Post by Douglas A. Gwyn
The
process of macro replacement, starting just after
an identifier has been recognized as a defined macro
name during the appropriate phase of processing,
involves temporarily flagging that global identifier
as "in the process of being replaced", locating the
tokens for the macro arguments and fully macro-replacing
each of *them* [recursive subprocessing], substituting
^^^^^^^^^^^^^^^^^^^^^^^
Yes.
Post by Douglas A. Gwyn
each fully-expanded argument for the corresponding
parameter in the definition for the current macro,
concurrent with #,## processing, then rescanning the
result of the previous operation, looking for
identifiers corresponding to defined macros, and for
if (global identifier is "in the process of
being expanded") then apply permanent blue
paint to that identifier pp token (not to
the global identifier, as I might have
mistakenly indicated in a previous posting).
if (identifier token has ever had blue paint
applied) then leave it intact;
else (identifier is a defined macro name,
and has not been painted blue) so
begin a macro replacement process for
that identifier, and if it is a
function-like macro, it is allowed to
access remaining pp tokens (i.e. those
not involved in any nesting replacement
process from expansion of macros that
may still be in process) when fetching
its arguments.
*After* each such name (we're now back to the original
replacement buffer) has been fully macro-replaced, the
current macro replacement process (6.10.3) is complete
and the "in the process of being replaced" flag is
removed from the global identifier. The context now
pops back to whatever scanning was being done (top
level or some nested macro replacement).
That is absolutely *not* what the standard specifies. That is what your
flawed conceptual model specifies. This is what the standard actually
specifies:
The process of macro replacement starts just after an identifier has been
recognized as a defined *object-like* macro name or just after a ( following an
identifier that has been recognized as a defined *function-like* macro name. If
it is a function-like macro invocation, the arguments are identified and each of
them is recursively subprocessed *if* the corresponding parameter appears in
the replacement list of the macro without being an operand of the stringizing or
token-pasting operators. The replacement list of the macro replaces the
invocation of the macro. Actual parameters (or fully macro expanded actual
parameters depending on the # and ## operators) are substituted for formal
parameters in the sequence of preprocessing tokens that came from the
replacement list. Stringizing and token-pasting occur. Scanning resumes at
the first preprocessing token that came from the replacement list, and a context
is established (that will paint the macro name preprocessing tokens if found)
that endures until just after the last preprocessing token that came from the
replacement list.
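The condition on argument subprocessing can be seen in a minimal sketch (the names VALUE, STR, XSTR, str_direct, and str_indirect are hypothetical, chosen only for illustration): an argument is macro-expanded before substitution only when its parameter is not an operand of # or ##.

```c
/* Hypothetical illustration names throughout. */
#define VALUE 42
#define STR(x) #x       /* x is an operand of #: the argument is used unexpanded */
#define XSTR(x) STR(x)  /* x is a plain parameter: the argument is expanded first */

/* The argument VALUE is stringized as written. */
const char *str_direct(void)
{
    return STR(VALUE); /* "VALUE" */
}

/* The argument VALUE is first replaced by 42, then handed to STR. */
const char *str_indirect(void)
{
    return XSTR(VALUE); /* "42" */
}
```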
This single process proceeds from the beginning of the file to the end of the
file--with the exception of the interpretation of directives (or
"non-directives") which only has a defined meaning if no contexts currently
exist and when no recursive subprocess of an argument is currently active. Note
once again that this process is not a direct translation of the "intent
algorithm". If it was, it would "engender" implementation strategies that would
be horribly inefficient, though it would produce similar results. The point
that is particularly important about the model is that no procedural nesting
occurs. Therefore, "nested" as used in the standard text can only mean nesting
within a context. The only question is whether an invocation that is partially
nested is considered nested. Even though that is unclear in a literal sense,
examination of the original intent shows that partial nesting should not be
considered nested. Granted, "original intent" is non-normative :), but it is a
reasonable thing to mimic in an implementation when the standard contains a
non-normative note in Annex J that says it is unspecified and when a C++ DR says
that it should be that way as well. Further, examination of the underlying
purpose of painting (which is to prevent the preprocessor implementation from
having to deal with infinite macro expansion which is very difficult to detect
because of the iterative model) yields the logical conclusion that only an
invocation that is completely nested could cause infinite expansion. This,
coupled with a great deal of existing code and the semantics of virtually every
existing C and C++ preprocessor in common use, results in only a single viable
way of interpreting an invocation that is partially nested.
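For readers who want to see which way a given preprocessor goes, here is a small probe, offered as a sketch only. The macros F and G and the two helper functions are hypothetical names, not the "A" and "B" of the example under discussion. No expected output is stated, because the result is exactly the unspecified case described above.

```c
#include <string.h>

/* Hypothetical illustration names throughout. */
#define STR(x) #x
#define XSTR(x) STR(x)

#define F(a) a*G
#define G(a) F(a)

/* F(2) is replaced by 2*G. The trailing G then takes "(9)" from outside
   F's replacement list and is replaced by F(9). Whether that F counts as
   nested in the original F, and is therefore painted, is the partial
   nesting question: the text is either 2*9*G or 2*F(9). */
const char *partial_nesting_text(void)
{
    return XSTR(F(2)(9));
}

/* Returns 1 if the second F was expanded (no F remains in the text),
   0 if it was painted and left in place. */
int partial_nesting_expanded(void)
{
    return strstr(partial_nesting_text(), "F") == NULL;
}
```

Printing partial_nesting_text() under each compiler of interest is a quick way to compare implementations on this point.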
Post by Douglas A. Gwyn
Post by Paul Mensonides
... it is as if every invocation is physically inlined into the stream
of preprocessing tokens where the invocation existed (which is either the
top-level stream or an argument being processed as a separate stream).
That is merely what "substitution" consists of, and
has no deeper meaning.
"Substitution" occurs with arguments, replacement occurs for a macro invocation.
That replacement consists of the exact preprocessing tokens of the replacement
list as explicitly specified by the standard. There is no deeper meaning
involved, that is simply what the standard specifies.
Post by Douglas A. Gwyn
Post by Paul Mensonides
No it doesn't. The invocation of B at this point is nested within the
rescanning of A's replacement list, but not B's.
That makes no sense if "nesting" has its logical
(procedural) meaning rather than a geographical
interpretation.
Yes, it makes no sense if "nesting" has a procedural meaning, but it doesn't. I
disagree that "logical" meaning implies procedural. Both procedural nesting and
syntactic nesting are perfectly logical forms of nesting.
Post by Douglas A. Gwyn
Repeating what I said above, that
would mean that there would be no possibility of an
in-process macro name appearing within a nested expansion
past the first level, which is contradicted by the
standard discussing exactly that case.
The standard never refers to it in those terms at all. The standard discusses only
the nesting within a context.
Post by Douglas A. Gwyn
Post by Paul Mensonides
Post by Douglas A. Gwyn
Note that the process described in 6.10.3.4 is one
*component* of the process 6.10.3, not something that
happens after 6.10.3 is complete.
(re)scanning -> replacement -> substitution -> #/## ->
^                                                    |
|____________________________________________________|
That is contrary to all similar constructions in the
standard, such as what constitutes the referred-to
operand in a subexpression,
Yes, macro expansion is quite different than a procedural hierarchy. The
assumption that it is because it seems "normal" is flawed. Just the fact that
partial nesting can occur shows that the model is very foreign to the normal
semantics of a procedural hierarchy.
Post by Douglas A. Gwyn
and is clearly contradicted
by the explicitly recursive wording of the specification,
The standard does not have any explicitly recursive wording regarding macro
expansion whatsoever. Instead, it literally and explicitly dictates an
iterative model (except for argument processing which is correctly recursively
subprocessed as you say). In fact, it is incredibly clear in this regard. The
*only* thing that isn't clear is what happens when an invocation is partially
nested. If procedural nesting occurred, "partial nesting" would be impossible.
(Also, if procedural nesting occurred, there would probably be an
implementation-defined nesting limit, which there isn't.) What you keep
referring to are the general notions of "recursive wording" and that 6.10.3 defines a
single complete process for any given macro replacement. Neither of those
notions is present in the text of the standard. 6.10.3 refers to many things
besides the replacement process; the fact that 6.10.3.1 through 6.10.3.4 are
nested within 6.10.3 does not imply that they are subprocesses of some macro
expansion that began at 6.10.3 (and it especially doesn't imply that they are
subprocesses of macro definition or sibling processes of macro undefinition).
Post by Douglas A. Gwyn
and the explicit reference to more than one level of
nesting. (Your model involves only one level.)
The model involves only one level of *procedural* nesting (except for arguments
which can cause any number of levels), but any number of levels of *contextual*
nesting.
Post by Douglas A. Gwyn
Note
also that macro replacement within each argument to a
macro is necessarily performed as if 6.10.3 is invoked
as a subroutine.
Not 6.10.3 as a whole (which makes no sense). Rather, only the parts dealing
with the macro replacement process. It is as if it was a separate file (i.e.
stream of preprocessing tokens), but possibly with some contexts already
established.
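A minimal sketch of the "separate stream" treatment of an argument (ID, ONE, and separate_stream_demo are hypothetical names used only for illustration): while the argument is processed on its own, a function-like macro name with no following "(" inside that stream is not an invocation; it can still become one later, when rescanning at the outer level meets the tokens that follow.

```c
/* Hypothetical illustration names: ID, ONE, separate_stream_demo. */
#define ID(x) x
#define ONE() 1

/* The argument of ID is the single token ONE. Processed as its own
   stream, ONE is not followed by "(" and so is left alone. After
   substitution, the rescan meets ONE followed by the "()" that comes
   from the source text after ID(ONE), and only then is ONE replaced. */
int separate_stream_demo(void)
{
    return ID(ONE)(); /* becomes 1 */
}
```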
Post by Douglas A. Gwyn
Post by Paul Mensonides
6.10.3/9 - "...that causes each subsequent instance of the macro name to be
***replaced by the replacement list of preprocessing tokens that constitute the
remainder of the directive***."
6.10.3/10 - "...that is ***replaced by the replacement list*** in the
definition..."
Yes, that explains the eventual purpose of the "body"
of a macro *definition*. The actual replacement
process does not begin at that point, but upon
subsequent recognition of a defined identifier during
the top-level pass across the input to translation
phase 4, and also upon recognition of defined
identifiers during collection of macro arguments, and
also during recognition of defined identifiers during
the rescan phase of *each* macro expansion process,
except when the identifier pp token has had blue paint
applied.
No, those two locations are the only locations that describe the semantics of
replacing an instance of a macro invocation. It does not say that it does that
*anywhere* else. I agree that this is not a good way of describing the process,
but nevertheless, that is the way it is.
Post by Douglas A. Gwyn
Post by Paul Mensonides
A macro invocation is replaced by the preprocessing tokens of the replacement
list, *not* by the rescanned (i.e. macro-expanded) preprocessing tokens of the
replacement list. Conceptually, this step happens even before argument
substitution.
No, not at all. The complete replacement can involve
much more than what was contained explicitly within
the body of the definition. That is why we spell out
the process in detail.
"Completely replaced" is a term used only when referring to argument
subprocessing, not a single macro expansion. The standard uses "completely
replaced" referring to all the preprocessing tokens that make up an argument.
It does not say that all of the macro invocations within that sequence of
preprocessing tokens are completely replaced--only that the entire sequence is
completely replaced. (It does say that all the macro invocations are
"expanded", however.)
The replacement itself is specified in only two places (those that I quoted).
Those are not forward references to macro expansion, they define the initial
step of macro expansion as scanning proceeds.
Post by Douglas A. Gwyn
Post by Paul Mensonides
... There is no procedural nesting, there is only
physical/geographical nesting designed to prevent macro expansion from looping
forever.
Actually, "looping forever" is a logical/procedural
notion, not a geographic one.
No, recursing forever is a procedural notion. Looping forever is an iterative
notion. And, once again, "logical" is not a synonym of "procedural" even when
a computer science context is applied.
Post by Douglas A. Gwyn
Post by Paul Mensonides
Post by Douglas A. Gwyn
... Further, the specification makes it very clear
that blue paint is permanent, so there is no way for the
newly created "B" to ever trigger macro replacement.
Absolutely, blue paint is permanent--
But since the inner "B" was not *geographically* nested,
by your interpretation you'd have to say that it wasn't
painted blue.
The second (i.e. "inner") B invocation (at point #2) is completely nested within
the A-disabling context. It therefore expands within that context which
subsequently causes the A identifier preprocessing token to be permanently
painted.
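The uncontested part of painting, a name found completely inside its own disabling context, can be sketched as follows. PING, PONG, ID, and painted_demo are hypothetical names, not the "A" and "B" of the example under discussion, and the enumerator exists only so that the leftover identifier compiles.

```c
/* Hypothetical illustration names throughout. */
enum { PING = 7 }; /* ordinary identifier, declared before the macro exists */

#define PING PONG
#define PONG PING
#define ID(x) x

/* In the argument, PING is replaced by PONG and PONG by PING. That last
   PING appears while PING's own replacement is still being rescanned,
   so it is painted. The painted token is then substituted into ID's
   replacement list and rescanned once more, where it is left alone and
   ends up naming the enumerator. */
int painted_demo(void)
{
    return ID(PING); /* the surviving PING is the enumerator, value 7 */
}
```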
Post by Douglas A. Gwyn
I still think this whole matter boils down to whether
"nested" means procedurally or geographically.
I think that it boils down to whether partially nested is considered nested in
this context. Either view amounts to the same thing though.
Post by Douglas A. Gwyn
I know
that we recently discussed this in a WG14 meeting, but
I seem to recall that it was during an informal session
and that hope was expressed that nobody would file a DR
asking for a clarification. If as you have said there
is a "ton" of existing software that *relies* on that
detail, then it might even be unwise to ask for an
official ruling, which is likely to be as I have
explained, forcing implementations that have done it
wrong to make a choice between conformance or making
existing bogus code continue to do what the programmer
expected.
What you say is "wrong" is the original intent as specified by the (now
infamous) "intent algorithm". I realize that you probably don't think that C++
should matter to C in this area, but C++ already has a DR for this exact thing
with a probable resolution that it should adhere to the original intent as
defined by the algorithm in this area though it needs to be agreed on by both
committees. Some examples of the existing code to which I refer are C++
libraries in Boost whose use is widespread, the Boost preprocessor library
(which is both a C and C++ library) whose use outside of Boost exists in many
places (I even heard that someone was porting Loki to C with it.), Andrei
Alexandrescu's and John Torjo's SMART_ASSERT article/implementation (available
at CUJ), and Chaos (which is a significantly more advanced version of Boost
preprocessor that supports C99 constructs fully (such as variadic macros and
placemarkers)). The moral of the story is that the preprocessor is used for
code generation in many places, and its use for that purpose is growing. An
interpretation other than the original intent regarding partial invocation
causes a horribly exponential increase in the number of macros required to
implement a general purpose component in such a framework. Such a specification
would be pointless, and likely cause a divergence between the C and C++
preprocessors which would be unfortunate--especially considering that the C++
EWG has actively agreed that compatibility with the C preprocessor wherever
possible should happen.
Post by Douglas A. Gwyn
(More likely, another option would be added
to the compiler, and mentioned in the conformance
section of the documentation.)
It already is in Annex J of C99.
Post by Douglas A. Gwyn
It would be good to advise programmers not to rely on
your model (the geographic nesting one) nor on my model
(the procedural nesting one) since implementations may
differ on this score.
And according to the (non-normative) note in Annex J, the partial invocation
case can go either way. However, the behavior that I support is by far the most
prevalent in existing preprocessors and matches the result of the original
intent and does not defy the underlying point of painting in the first place.
Regards,
Paul Mensonides