msize(ptr) function to report allocated area (was Is there really no standard-conforming function pointer equivalent of void*)

Depending upon how msize is defined, it may be possible for every possible
malloc implementation to implement it, though the usefulness of such
implementations could vary.

That definition is too weak as your dummy implementation shows.

The definition is perfectly adequate for a number of usage cases. There
are other possible usage cases for which it is inadequate. On the other
hand, the way I have defined the function would allow it to be implemented
legitimately on every C implementation. Code which was written to take
advantage of the function to yield better performance on systems that could
implement it usefully would not receive those performance benefits on those
systems which didn't implement it usefully, but would still run correctly
regardless. Since most systems could implement it somewhat usefully, adding
the function to the Standard would make it possible for portable code to
receive performance improvement on most systems while remaining compatible
with all.

It is >= the allocation size passed to the allocation function.

Adding that to the standard would increase the storage cost of malloc() on
some platforms, even when using code which didn't take any advantage of it.
In some cases, when using code that allocates many small objects, the extra
cost could be very significant.

Note that the only requirement of this hypothetical implementation is
that memory passed to free() can be returned by later calls to malloc().

I would suggest that in many common usage scenarios a quick answer to the
question "Is block X guaranteed to be large enough to hold Y bytes" would
be more useful than a really slow answer to the question "Exactly how many
bytes are available in block X". There could be some benefits to having
a more precise estimate available, but in many cases there would be a
workable strategy which code could follow if it didn't know the block size
(e.g. use realloc). The only question would be whether a faster
alternative strategy might be available (e.g. don't use realloc, and
avoid the need to regenerate any pointers which might have been changed
if code had used realloc).
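
A minimal sketch of that fallback strategy, assuming a hypothetical
msize_hint() with the weak semantics discussed here (it may return 0,
meaning "unknown"). The names msize_hint and ensure_capacity are
illustrative only and are not part of any standard library:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical: returns a byte count guaranteed usable at ptr, or 0 if the
   implementation cannot say. This dummy always answers "unknown", which
   would be a legitimate implementation under the weak definition. */
static size_t msize_hint(const void *ptr)
{
    (void)ptr;
    return 0;
}

/* Make sure *block can hold at least `needed` bytes.
   Returns 0 on success, -1 if realloc failed (*block is left untouched). */
static int ensure_capacity(void **block, size_t needed)
{
    if (msize_hint(*block) >= needed)
        return 0;   /* fast path: no realloc, existing pointers stay valid */

    void *bigger = realloc(*block, needed);   /* fallback: always correct */
    if (bigger == NULL)
        return -1;
    *block = bigger;
    return 0;
}
```

With the dummy msize_hint the code always takes the realloc path, so it
stays correct everywhere; a more informative msize_hint would only make
the fast path fire more often.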

I would suggest that a similar principle should be applied to things like
relational operators on pointers. I can't think of any implementations for
which efficient code generation would be impaired by a requirement that
use of relational operators on pointers which are either valid or null
will never do anything other than yield 0, yield 1, or raise an
implementation-defined signal, but there are many cases where adding such
a guarantee to the language would make it possible for portable code to
run much more efficiently on common implementations. To be sure, stronger
guarantees would be better still, but even a weak guarantee like the
above would be vastly more useful than having no sane way to check things
like whether two regions could be guaranteed not to overlap, except via:

    for (int i=0; i<region_size; i++)
        if ((char*)p1 == ((char*)p2+i) ||
            (char*)p2 == ((char*)p1+i))
            return 1;

which would be portable, but would on many implementations be absurdly
slow.
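
For reference, the loop above can be packaged as a complete function.
This is only a sketch: the name regions_overlap is made up here, and it
assumes both regions have the same length n.

```c
#include <stddef.h>

/* Returns 1 if the n-byte regions starting at p1 and p2 share any byte,
   0 otherwise. Only == is applied to the pointers, which is defined even
   when they point into different objects; the cost is O(n) comparisons. */
static int regions_overlap(const void *p1, const void *p2, size_t n)
{
    const char *a = p1;
    const char *b = p2;
    for (size_t i = 0; i < n; i++) {
        if (a == b + i || b == a + i)
            return 1;
    }
    return 0;
}
```

With usable relational operators the same answer would take two
comparisons instead of up to 2*n.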

Jakob Bohm

2016-02-02 01:41:46 UTC

Thanks for making this a new thread

Depending upon how msize is defined, it may be possible for every possible
malloc implementation to implement it, though the usefulness of such
implementations could vary.

That definition is too weak as your dummy implementation shows.

It is >= the allocation size passed to the allocation function.

And what platforms would that be, given the argument that you snipped
right here?

Note that the only requirement of this hypothetical implementation is
that memory passed to free() can be returned by later calls to malloc().

The intended definition (in less precise terms) would be that it
answers exactly "Is block X guaranteed to be large enough to hold Y
bytes", with the additional provision that it cannot contradict the
defined semantics of the other malloc-family functions by returning
less than what those calls guarantee by definition.

Nothing in the definition prevents an implementation from returning
less than what is actually available, just not less than what was
explicitly allocated.

Post by s***@casperkitty.com
I would suggest that a similar principle should be applied to things like
relational operators on pointers. I can't think of any implementations for
which efficient code generation would be impaired by a requirement that
use of relational operators on pointers which are either valid or null
will never do anything other than yield 0, yield 1, or raise an
implementation-defined signal, but there are many cases where adding such
a guarantee to the language would make it possible for portable code to
run much more efficiently on common implementations. To be sure, stronger
guarantees would be better still, but even a weak guarantee like the
above would be vastly more useful than having no sane way to check things
like whether two regions could be guaranteed not to overlap, except via:
    for (int i=0; i<region_size; i++)
        if ((char*)p1 == ((char*)p2+i) ||
            (char*)p2 == ((char*)p1+i))
            return 1;
which would be portable, but would on many implementations be absurdly
slow.

With that I agree.

However there are two special cases not handled by your proposed weak
guarantees for pointer comparison:

If an implementation has a maximum object/allocation size a lot less
than the address range, relational operators could be implemented by
comparing only the low bits of the pointers. This would work if those
low bits are never 0 for a non-null pointer and objects/allocations
never cross the corresponding boundaries. An example would be some
compilers for the x86 platform in 32 or 16 bit modes (using 48 or 32
bit seg:ofs pointers).

If an implementation aliases certain portions of the address space,
but the comparisons compare the full pointer value, and the platform
creates situations where programs are likely to process such aliased
pointer values. One example would be x86_16 in the classic "real" or
"virtual real" modes where real_address = hi16bits(pointer) * 16 +
lo16bits(pointer).
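
The aliasing in the second case can be shown with plain integer
arithmetic. The sketch below models a far pointer as a seg:ofs pair (the
struct and function names are invented for illustration) and ignores the
address wrap-around at 1 MiB on the original 8086:

```c
#include <stdint.h>

/* A 16-bit real-mode far pointer: segment in the high half, offset in
   the low half of the 32-bit pointer value. */
struct far_ptr {
    uint16_t seg;
    uint16_t off;
};

/* Address formed from the pair: segment * 16 + offset. */
static uint32_t linear_address(struct far_ptr p)
{
    return (uint32_t)p.seg * 16u + p.off;
}

/* Comparison of the full pointer value, without normalisation. */
static int same_representation(struct far_ptr a, struct far_ptr b)
{
    return a.seg == b.seg && a.off == b.off;
}
```

For example, 1234:0010 and 1235:0000 both give linear address 0x12350,
yet same_representation() reports them as different.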

Enjoy

Jakob

--
Jakob Bohm, CIO, Partner, WiseMo A/S. https://www.wisemo.com
Transformervej 29, 2860 Søborg, Denmark. Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded

s***@casperkitty.com

2016-02-02 17:16:44 UTC

Post by Jakob Bohm

Post by s***@casperkitty.com
Adding that to the standard would increase the storage cost of malloc() on
some platforms, even when using code which didn't take any advantage of it.
In some cases, when using code that allocates many small objects, the extra
cost could be very significant.

And what platforms would that be, given the argument that you snipped
right here?

Those in which memory allocations are rounded up to a significant size
multiple. If memory allocations need to be a multiple of 16 bytes and are
handled by an allocation service outside the control of the C library (may
be necessary for inter-operation with other languages) which does not make
the desired information available, the only way for malloc and friends to
make the information available to user code would be for malloc to add its
own 16-byte header to every object.
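
A rough sketch of what such a header-adding wrapper could look like,
layered here on the ordinary malloc() purely for illustration. The names
hdr_malloc, hdr_msize and hdr_free are hypothetical, and max_align_t
requires C11:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Header padded to the strictest fundamental alignment, so that the user
   block following it is still suitably aligned for any object type. */
union size_header {
    size_t size;
    max_align_t pad;
};

static void *hdr_malloc(size_t size)
{
    if (size > SIZE_MAX - sizeof(union size_header))
        return NULL;                      /* header would overflow size_t */
    union size_header *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->size = size;                       /* remember the requested size */
    return h + 1;                         /* user block starts after header */
}

static size_t hdr_msize(const void *ptr)
{
    return ((const union size_header *)ptr - 1)->size;
}

static void hdr_free(void *ptr)
{
    if (ptr != NULL)
        free((union size_header *)ptr - 1);
}
```

Every allocation pays sizeof(union size_header) extra bytes whether or
not the program ever calls hdr_msize, which is the cost being objected
to for programs that allocate many small objects.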

Post by Jakob Bohm
The intended definition (in less precise terms) would be that it
answers exactly "Is block X guaranteed to be large enough to hold Y
bytes", with the additional provision that it cannot contradict the
defined semantics of the other malloc-family functions by returning
less than what those calls guarantee by definition.

I would leave out that additional provision.

Post by Jakob Bohm

Post by s***@casperkitty.com
I would suggest that a similar principle should be applied to things like
relational operators on pointers.

However there are two special cases not handled by your proposed weak
guarantees for pointer comparison:
[...]
If an implementation aliases certain portions of the address space,
but the comparisons compare the full pointer value, and the platform
creates situations where programs are likely to process such aliased
pointer values. One example would be x86_16 in the classic "real" or
"virtual real" modes where real_address = hi16bits(pointer) * 16 +
lo16bits(pointer).

My proposed pointer semantics would allow for that. If two pointers x
and y identify different objects, the expression "x > y" would, in the
absence of an implementation-defined signal, be required to yield 0 or
yield 1, but could make the selection in Unspecified fashion.

Keith Thompson

2016-02-02 19:26:56 UTC

Post by Jakob Bohm

Post by s***@casperkitty.com
Adding that to the standard would increase the storage cost of
malloc() on some platforms, even when using code which didn't take
any advantage of it. In some cases, when using code that allocates
many small objects, the extra cost could be very significant.

And what platforms would that be, given the argument that you snipped
right here?

Those in which memory allocations are rounded up to a significant size
multiple. If memory allocations need to be a multiple of 16 bytes and
are handled by an allocation service outside the control of the C
library (may be necessary for inter-operation with other languages)
which does not make the desired information available, the only way
for malloc and friends to make the information available to user code
would be for malloc to add its own 16-byte header to every object.

Such platforms certainly could exist -- but do they?

[...]

Post by s***@casperkitty.com
My proposed pointer semantics would allow for that. If two pointers x
and y identify different objects, the expression "x > y" would, in the
absence of an implementation-defined signal, be required to yield 0 or
yield 1, but could make the selection in Unspecified fashion.

I presume the "implementation-defined signal" wording is inspired by the
semantics of an overflowing conversion to a signed integer type (that
was added in C99).

What is the benefit of permitting a pointer comparison to raise an
implementation-defined signal? It would be impossible for a portable
program to handle such a signal. I have the same question about signed
integer conversion, but I'm currently asking why you want to add such
wording for pointer comparison.

--
Keith Thompson (The_Other_Keith) kst-***@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

s***@casperkitty.com

2016-02-02 21:52:11 UTC

Post by s***@casperkitty.com
Those in which memory allocations are rounded up to a significant size
multiple. If memory allocations need to be a multiple of 16 bytes and
are handled by an allocation service outside the control of the C
library (may be necessary for inter-operation with other languages)
which does not make the desired information available, the only way
for malloc and friends to make the information available to user code
would be for malloc to add its own 16-byte header to every object.

Such platforms certainly could exist -- but do they?

Platforms where memory must be rounded up to a significant multiple are
common. Platforms where a C library couldn't readily ascertain the size
of an allocated block are less common, but I can imagine some allocation
systems where that would be the case, especially on systems where the
stack grows from the bottom of memory and the heap grows from the top.

The authors of the Standard are very loath to add any new requirements for
a system to be capable of "running C programs".

I presume the "implementation-defined signal" wording is inspired by the
semantics of an overflowing conversion to a signed integer type (that
was added in C99).

Yep.

Post by Keith Thompson
What is the benefit of permitting a pointer comparison to raise an
implementation-defined signal? It would be impossible for a portable
program to handle such a signal. I have the same question about signed
integer conversion, but I'm currently asking why you want to add such
wording for pointer comparison.

If code isn't expected to use any relational comparisons between pointers
to disjoint objects, being able to trap on such accesses could possibly
be useful. I expect such an allowance would probably get used about as
much as the provision allowing out-of-range integer conversions to raise
signals, but it would make clear that the real requirement is that any
implementation which does anything other than yield a value must document
its unusual behavior.

Keith Thompson

2016-02-03 00:30:49 UTC

Such platforms certainly could exist -- but do they?

That sounds like an "I don't know". (BTW, neither do I.)

Post by s***@casperkitty.com
The authors of the Standard are very loath to add any new requirements for
a system to be capable of "running C programs".

True -- but they've been willing to quietly add new requirements
that all real-world implementations already meet. For example, the
requirement that all-bits-zero is a representation of 0 (perhaps not
the only or even canonical representation, but *a* representation)
for all integer types was added in one of the post-C99 Technical
Corrigenda.

If it turns out that all existing hosted implementations allow a
reliable msize() implementation, then the theoretical possibility of
an implementation that doesn't wouldn't necessarily be an obstacle
to standardizing it. Of course the committee could have other
reasons not to do so.

I presume the "implementation-defined signal" wording is inspired by the
semantics of an overflowing conversion to a signed integer type (that
was added in C99).

Yep.

I've never heard of a compiler that takes advantage of the
permission to raise a signal on a signed conversion overflow.
That suggests to me that the main result of adding it in C99
was to give C programmers one more thing to worry about, with no
corresponding real-world benefit. (I welcome counterarguments.)
It would have made more sense, IMHO, to permit a standard-defined
signal to be raised, so that portable programs can have some hope of
handling it. (It's also odd that conversion is treated differently
than arithmetic operations.)

As for pointer comparisons that currently have undefined behavior,
I suggest that an unspecified result, either 0 or 1, would be better
than permitting a signal to be raised.

As I understand it, relational comparisons between pointers to
different objects are undefined to allow for segmented architectures,
permitting such implementations to compare only the offset portion
of the pointers. I'm not aware of any current implementations that
do anything other than yielding 0 or 1, though the result might
not be meaningful. In other words, I *think* that changing such
comparisons from undefined behavior to an unspecified result would
not affect any existing implementations.

s***@casperkitty.com

2016-02-03 18:16:32 UTC

Post by s***@casperkitty.com
Platforms where memory must be rounded up to a significant multiple are
common. Platforms where a C library couldn't readily ascertain the size
of an allocated block are less common, but I can imagine some allocation
systems where that would be the case, especially on systems where the
stack grows from the bottom of memory and the heap grows from the top.

That sounds like an "I don't know". (BTW, neither do I.)

Many embedded systems "weakly" define malloc() and friends so that they
may be replaced by programmer-supplied alternates; the behavior of a
program which defines its own "malloc" substitutes is not covered by the
Standard,
but many implementations define it as usefully employing the programmer's
supplied alternate function. If mspace were allowed to return zero, a
system where malloc is weakly defined as equivalent to __builtin_malloc
could define mspace as

    if (malloc == __builtin_malloc)
        return __builtin_mspace(ptr);
    else
        return 0;

and remain compatible with existing code that overrides malloc. What would
you suggest as an implementation for mspace() if it couldn't return zero?

Post by Keith Thompson
True -- but they've been willing to quietly add new requirements
that all real-world implementations already meet. For example, the
requirement that all-bits-zero is a representation of 0 (perhaps not
the only or even canonical representation, but *a* representation)
for all integer types was added in one of the post-C99 Technical
Corrigenda.

What's needed is a willingness to acknowledge behaviors as normative but
not necessarily universal. If a system represents null pointers as all
bits zero, using "calloc" to create a new array of structures holding
a mix of pointers and integers will be cleaner than having to wipe them
all individually, and if it's impossible for two structures whose storage
was initially zeroed to have all of their corresponding elements be equal
if the underlying bytes don't all match, then "memcmp" may be cleaner than
a bunch of structure-element comparisons. Such optimizations would be
valid on the vast majority of C implementations; it's too bad there are no
standard macros programs can test to determine or assert their usability.
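
Lacking such macros, a program can at least test the representation at
run time. The sketch below is one possible approach, not an established
idiom: null_is_all_bits_zero and new_nodes are made-up names, and the
check covers only void *, on the assumption that other pointer types
behave the same way.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if a null void * is stored as all-bits-zero here, else 0. */
static int null_is_all_bits_zero(void)
{
    void *null_ptr = NULL;
    unsigned char zeros[sizeof null_ptr];
    memset(zeros, 0, sizeof zeros);
    return memcmp(&null_ptr, zeros, sizeof null_ptr) == 0;
}

struct node {
    struct node *next;
    int value;
};

/* Allocate n nodes with next == NULL and value == 0. Uses calloc when the
   run-time check says that is sufficient, else initialises each member. */
static struct node *new_nodes(size_t n)
{
    if (null_is_all_bits_zero())
        return calloc(n, sizeof(struct node));

    if (n > SIZE_MAX / sizeof(struct node))
        return NULL;
    struct node *nodes = malloc(n * sizeof *nodes);
    if (nodes == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++) {
        nodes[i].next = NULL;
        nodes[i].value = 0;
    }
    return nodes;
}
```

A compile-time macro would let the slow path be dropped entirely; with
the run-time test both paths have to be written and kept in the binary.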

Post by Keith Thompson
I've never heard of a compiler that takes advantage of the
permission to raise a signal on a signed conversion overflow.
That suggests to me that the main result of adding it in C99
was to give C programmers one more thing to worry about, with no
corresponding real-world benefit. (I welcome counterarguments.)
It would have made more sense, IMHO, to permit a standard-defined
signal to be raised, so that portable programs can have some hope of
handling it. (It's also odd that conversion is treated differently
than arithmetic operations.)

For many applications, except when it would cause compatibility problems
with existing code, it would be very useful to guarantee that if/when the
program returns from main(), one of the following will hold:

1. All operations on "number" types [as distinct from "algebraic ring"
types which are specified as wrapping values outside their range]
will have been performed in arithmetically-correct fashion.

2. The implementation will have indicated via some means that overflow
or narrowing integer conversion may have caused results to be computed
in unexpectedly-incorrect fashion.

Many other languages can, at least optionally, provide such a guarantee. If
the guarantees are as stated (note that #1 does not assert that no overflows
have occurred--merely that they have not caused the computation of incorrect
results) the performance impact from having a language offer such guarantees
could be far less than that of requiring user code to include all the checks
necessary to provide them. Perhaps the authors of the Standard didn't wish
to preclude the possibility that C might catch up to other programming
languages, some of which have been able to offer such guarantees for over
half a century.
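
To make the cost argument concrete, here is roughly what user code has to
write today to get guarantee #2 by hand. This is a sketch with invented
names (overflow_seen, checked_add, checked_narrow); it handles only int
addition and long long to int narrowing.

```c
#include <limits.h>

/* Sticky flag: set on the first overflow and never cleared, so a single
   test before returning from main() covers the whole computation. */
static int overflow_seen = 0;

/* Adds a and b; on overflow sets the flag and returns a placeholder 0
   that the caller must not treat as a meaningful result. */
static int checked_add(int a, int b)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        overflow_seen = 1;
        return 0;
    }
    return a + b;
}

/* Narrows v to int; out-of-range values set the flag instead of relying
   on the implementation-defined conversion. */
static int checked_narrow(long long v)
{
    if (v < INT_MIN || v > INT_MAX) {
        overflow_seen = 1;
        return 0;
    }
    return (int)v;
}
```

Every operation carries its own explicit test here, whereas an
implementation providing the guarantee itself could skip checks wherever
an overflow cannot affect the final result.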

I don't think it would be useful for an implementation to raise an
implementation-defined signal if it doesn't also specify a behavior on
overflow, but I would think it could be useful for an implementation to
specify traps on overflow and specify those same traps on out-of-range
narrowing conversions.

Post by Keith Thompson
As I understand it, relational comparisons between pointers to
different objects are undefined to allow for segmented architectures,
permitting such implementations to compare only the offset portion
of the pointers. I'm not aware of any current implementations that
do anything other than yielding 0 or 1, though the result might
not be meaningful. In other words, I *think* that changing such
comparisons from undefined behavior to an unspecified result would
not affect any existing implementations.

That's probably true, but allowing implementations to specify that something
else will happen would avoid foreclosing the possibility that some present
or future implementation might have some alternative useful behavior.

The key point is that right now programmers have to worry that compilers
can arbitrarily change the behavior of code which uses such comparisons so
as to yield results totally different from anything that could happen if
the comparisons yielded 1 or yielded 0. Under the present standard, given:

    int arr[10];
    void foo(int *p)
    {
        *p = 123;
        if (p >= arr && p < arr+10)
            ...
        ...
    }

an 8086 compiler which knew that "arr" was in the main data segment
could "safely" ignore the segment part of *p not only when doing the
comparison, but **also when doing the assignment**. Allowing a compiler
to raise an implementation-defined signal at the point of comparison
wouldn't make a programmer worry about anything the programmer doesn't
have to worry about, but would instead let the programmer know that
unless an implementation documents unusual behavior, it's not going
to do anything wacky.

Kaz Kylheku

2016-02-03 01:11:10 UTC

Such platforms certainly could exist -- but do they?

Platforms where the C library *doesn't* are very common. Malloc
implementations commonly do not retain this information at all; the
request size is mapped to some available block size which fits the
request, and then forgotten.

Even this allocated size may not be readily available. For instance
a request for 47 bytes could come from a special heap which contains
only objects of 48 bytes, tightly packed together with no meta-data
between them. At free() time the implementation takes the pointer,
and from it recovers the pointer to the entire heap structure.
Only then does it know that this is a 48 byte object, in the 205th
position in the array (and no trace of the original 47 is
retained anywhere).

Any requirement to recover the 47 would interfere with the design goals
of this scheme, which is to have very tight packing of allocated
objects, with only a few bits of overhead per unit: perhaps even less
than one bit.
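
A small sketch of such a fixed-size heap, under simplifying assumptions:
the pool is passed explicitly instead of being recovered from the
pointer, there is a single pool of 256 slots, and all names are
invented. Requires C11 for _Alignas and max_align_t.

```c
#include <stddef.h>
#include <string.h>

#define SLOT_SIZE  48u
#define SLOT_COUNT 256u

struct pool {
    size_t slot_size;                    /* one size record for the pool */
    unsigned char used[SLOT_COUNT / 8];  /* one bit of metadata per slot */
    _Alignas(max_align_t) unsigned char slots[SLOT_COUNT][SLOT_SIZE];
};

static void pool_init(struct pool *p)
{
    p->slot_size = SLOT_SIZE;
    memset(p->used, 0, sizeof p->used);
}

/* Any request up to 48 bytes gets a whole slot; the request size itself
   is not stored anywhere. */
static void *pool_alloc(struct pool *p, size_t request)
{
    if (request > p->slot_size)
        return NULL;
    for (size_t i = 0; i < SLOT_COUNT; i++) {
        if (!(p->used[i / 8] & (1u << (i % 8)))) {
            p->used[i / 8] |= (unsigned char)(1u << (i % 8));
            return p->slots[i];
        }
    }
    return NULL;                         /* pool exhausted */
}

/* Position of the slot in the array, derived from the address alone. */
static size_t pool_index(const struct pool *p, const void *ptr)
{
    const unsigned char *base = (const unsigned char *)p->slots;
    return (size_t)((const unsigned char *)ptr - base) / SLOT_SIZE;
}

static void pool_free(struct pool *p, void *ptr)
{
    size_t i = pool_index(p, ptr);
    p->used[i / 8] &= (unsigned char)~(1u << (i % 8));
}

/* All this design can report is the slot size, never the original 47. */
static size_t pool_msize(const struct pool *p, const void *ptr)
{
    (void)ptr;
    return p->slot_size;
}
```

Recovering the requested 47 would need at least six more bits per slot,
several times the one bit of bookkeeping this layout spends.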

Keith Thompson

2016-02-03 01:33:28 UTC

Such platforms certainly could exist -- but do they?

Platforms where the C library *doesn't* are very common. Malloc
implementations commonly do not retain this information at all; the
request size is mapped to some available block size which fits the
request, and then forgotten.
Even this allocated size may not be readily available. For instance
a request for 47 bytes could come from a special heap which contains
only objects of 48 bytes, tightly packed together with no meta-data
between them. At free() time the implementation takes the pointer,
and from it recovers the pointer to the entire heap structure.
Only then does it know that this is a 48 byte object, in the 205th
position in the array (and no trace of the original 47 is
retained anywhere).
Any requirement to recover the 47 would interfere with the design goals
of this scheme, which is to have very tight packing of allocated
objects, with only a few bits of overhead per unit: perhaps even less
than one bit.

I don't think anyone was suggesting that msize() must report the exact
size that was requested. Rather, it would return a value >= the
originally requested size, such that memory up to the reported size is
accessible.

For example:

    char *p = malloc(47); // allocates, say, 48 bytes; the 47 is forgotten
    size_t size = msize(p); // size == 48
    for (size_t i = 0; i < size; i++) {
        // p[i] is accessible
    }

So my question is whether retrieving the actual allocated size is
possible on all existing implementations.

Of course it could be implemented by storing the allocated size
separately, but I don't think we'd want to impose a non-zero overhead
for that unless the user asks for it.
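
Several existing libraries already expose something along these lines,
though none of it is standard C: glibc has malloc_usable_size() in
<malloc.h>, macOS has malloc_size() in <malloc/malloc.h>, and the
Microsoft runtime has _msize() in <malloc.h>. A hedged sketch of a
wrapper follows; the name usable_size and the choice to return 0 for
"unknown" elsewhere are assumptions made for this example.

```c
#include <stddef.h>
#include <stdlib.h>

#if defined(__GLIBC__)
#  include <malloc.h>
/* glibc: bytes usable in the block, which may exceed the request. */
static size_t usable_size(void *p) { return malloc_usable_size(p); }
#elif defined(__APPLE__)
#  include <malloc/malloc.h>
static size_t usable_size(void *p) { return malloc_size(p); }
#elif defined(_MSC_VER)
#  include <malloc.h>
static size_t usable_size(void *p) { return _msize(p); }
#else
/* No known query on this platform: report 0, meaning "unknown". */
static size_t usable_size(void *p) { (void)p; return 0; }
#endif
```

Whether it is wise to actually write into the bytes beyond the requested
size varies by library; the glibc manual page, for one, describes its
function as intended for diagnostics and statistics.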

Kaz Kylheku

2016-02-03 05:01:09 UTC

Such platforms certainly could exist -- but do they?

Platforms where the C library *doesn't* are very common. Malloc
implementations commonly do not retain this information at all; the
request size is mapped to some available block size which fits the
request, and then forgotten.
Even this allocated size may not be readily available. For instance
a request for 47 bytes could come from a special heap which contains
only objects of 48 bytes, tightly packed together with no meta-data
between them. At free() time the implementation takes the pointer,
and from it recovers the pointer to the entire heap structure.
Only then does it know that this is a 48 byte object, in the 205th
position in the array (and no trace of the original 47 is
retained anywhere).
Any requirement to recover the 47 would interfere with the design goals
of this scheme, which is to have very tight packing of allocated
objects, with only a few bits of overhead per unit: perhaps even less
than one bit.

I don't think anyone was suggesting that msize() must report the exact
size that was requested. Rather, it would return a value >= the
originally requested size, such that memory up to the reported size is
accessible.
char *p = malloc(47); // allocates, say, 48 bytes; the 47 is forgotten
size = msize(p); // size == 48
for (i = 0; i < size; i ++) {
// p[i] is accessible
}
So my question is whether retrieving the actual allocated size is
possible on all existing implementations.

Modulo the API function to do it being actually available,
it must be possible in any implementation that meaningfully supports
the free function.

The proof is by Reductio ad Absurdum. If we suppose that it is not
possible (that is to say, after handing out the block to the program,
the implementation has absolutely no recollection of that block's extent,
and no way to calculate it) then it is not possible for the free
function to return the block to storage. This contradicts the working
assumption that we have a working, meaningful free function for
recycling storage.

Here I am making the claim that given a pointer to some block which was
previously allocated, it is not possible to liberate the memory, so that
it is available to new allocations, without knowing how far that memory
extends. The memory can only be liberated as far as the lesser of the
next allocated object, or else the end of memory. If we know where the
next allocated object lies, then in fact we do know the extent of the
allocated block, contradicting the assumption that this information is
not known or computable. If we don't know where the next allocated
object lies, then we have no basis for liberating the memory. We have
to know how much to liberate, and if we have no idea whether the guessed
extent includes an allocated object or not, we can hardly proceed.

Not to know the size of the block means not to have it stored anywhere
(such as in a header), *and* not having a list of all allocated objects
which would allow us to traverse them and determine the lowest addressed
one whose address is higher than the given block.

(If we instead have a list of all free space, and only that, then the
areas between the free spaces are allocated blocks. But those areas
alone don't provide the information about how they are divided into
allocated blocks. A given allocated zone could be a single object, or
it could be a pair of objects, or three. In these cases, we have no idea
where the boundaries lie. Our free function needs a size argument.)

Post by Keith Thompson
Of course it could be implemented by storing the allocated size
separately, but I don't think we'd want to impose a non-zero overhead
for that unless the user asks for it.

Any allocator which doesn't have this info needs a free function that
takes the size as an argument (and that size is blindly trusted).

Such an allocator need not associate any meta-data with allocated blocks
at all, and can thus have low space overhead. It can keep meta-data
about free regions only.

This is actually reasonable. Many programs, or program modules, not only
don't need to inquire about the size of an allocated object, but in
fact they know what it is, statically (from the type it is used for).
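
A toy version of such an allocator, to show where the metadata lives.
Everything here is an illustrative assumption: the names, the fixed
static arena, first-fit search, and the lack of coalescing or validation
of the size passed to sized_free. Requires C11 for max_align_t.

```c
#include <stddef.h>

/* A free region describes itself inside its own storage; allocated
   blocks carry no metadata at all. The union doubles as the allocation
   unit, so every block is aligned for any object type. */
union free_region {
    struct {
        union free_region *next;
        size_t units;               /* length of this free region */
    } info;
    max_align_t align;
};

#define ARENA_UNITS 128u
static union free_region arena[ARENA_UNITS];
static union free_region *free_list;
static int arena_ready;

static size_t units_for(size_t bytes)
{
    const size_t u = sizeof(union free_region);
    return bytes == 0 ? 1 : (bytes + u - 1) / u;
}

static void *sized_alloc(size_t bytes)
{
    if (!arena_ready) {             /* first call: one big free region */
        arena[0].info.next = NULL;
        arena[0].info.units = ARENA_UNITS;
        free_list = arena;
        arena_ready = 1;
    }
    if (bytes > sizeof arena)
        return NULL;
    size_t need = units_for(bytes);
    for (union free_region **link = &free_list; *link != NULL;
         link = &(*link)->info.next) {
        union free_region *r = *link;
        if (r->info.units < need)
            continue;
        if (r->info.units == need) {
            *link = r->info.next;   /* exact fit: unlink the region */
        } else {
            union free_region *rest = r + need;   /* split off the tail */
            rest->info.next = r->info.next;
            rest->info.units = r->info.units - need;
            *link = rest;
        }
        return r;
    }
    return NULL;
}

/* The caller supplies the size, and it is trusted blindly. */
static void sized_free(void *ptr, size_t bytes)
{
    union free_region *r = ptr;
    r->info.units = units_for(bytes);
    r->info.next = free_list;
    free_list = r;
}
```

An msize()-style query has nothing to consult in this design: between
sized_alloc and sized_free, only the caller knows how large a block is.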

Jakob Bohm

2016-02-03 10:34:15 UTC

Such platforms certainly could exist -- but do they?

Platforms where the C library *doesn't* are very common. Malloc
implementations commonly do not retain this information at all; the
request size is mapped to some available block size which fits the
request, and then forgotten.
Even this allocated size may not be readily available. For instance
a request for 47 bytes could come from a special heap which contains
only objects of 48 bytes, tightly packed together with no meta-data
between them. At free() time the implementation takes the pointer,
and from it recovers the pointer to the entire heap structure.
Only then does it know that this is a 48 byte object, in the 205th
position in the array (and no trace of the original 47 is
retained anywhere).
Any requirement to recover the 47 would interfere with the design goals
of this scheme, which is to have very tight packing of allocated
objects, with only a few bits of overhead per unit: perhaps even less
than one bit.

I don't think anyone was suggesting that msize() must report the exact
size that was requested. Rather, it would return a value >= the
originally requested size, such that memory up to the reported size is
accessible.
char *p = malloc(47); // allocates, say, 48 bytes; the 47 is forgotten
size = msize(p); // size == 48
for (i = 0; i < size; i ++) {
// p[i] is accessible
}
So my question is whether retrieving the actual allocated size is
possible on all existing implementations.

Modulo the API function to do it being actually available,
it must be possible in any implementation that meaningfully supports
the free function.
The proof is by Reductio ad Absurdum. If we suppose that it is not
possible (that is to say, after handing out the block to the program,
the implementation has absolutely no recollection of that block's extent,
and no way to calculate it) then it is not possible for the free
function to return the block to storage. This contradicts the working
assumption that we have a working, meaningful free function for
recycling storage.
Here I am making the claim that given a pointer to some block which was
previously allocated, it is not possible to liberate the memory, so that
it is available to new allocations, without knowing how far that memory
extends. The memory can only be liberated as far as the lesser of the
next allocated object, or else the end of memory. If we know where the
next allocated object lies, then in fact we do know the extent of the
allocated block, contradicting the assumption that this information is
not known or computable. If we don't know where the next allocated
object lies, then we have no basis for liberating the memory. We have
to know how much to liberate, and if we have no idea whether the guessed
extent includes an allocated object or not, we can hardly proceed.
Not to know the size of the block means not to have it stored anywhere
(such as in a header), *and* not having a list of all allocated objects
which would allow us to traverse them and determine the lowest addressed
one whose address is higher than the given block.
(If we instead have a list of all free space, and only that, then the
areas between the free spaces are allocated blocks. But those areas
alone don't provide the information about how they are divided into
allocated blocks. A given allocated zone could be a single object, or
it could be a pair of objects, or three. In these cases, we have no idea
where the boundaries lie. Our free function needs a size argument.)

Post by Keith Thompson
Of course it could be implemented by storing the allocated size
separately, but I don't think we'd want to impose a non-zero overhead
for that unless the user asks for it.

Any allocator which doesn't have this info needs a free function that
takes the size as an argument (and that size is blindly trusted).
Such an allocator need not associate any meta-data with allocated blocks
at all, and can thus have low space overhead. It can keep meta-data
about free regions only.
This is actually reasonable. Many programs, or program modules, not only
don't need to inquire about the size of an allocated object, but in
fact they know what it is, statically (from the type it is used for).

Indeed, the Pascal language's standard allocator does take the size as an
(explicit or implicit) argument to its memory deallocation function.
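
To make the "free function that takes the size as an argument" idea concrete, here is a minimal sketch of such an allocator. Everything in it is an assumption made for illustration: the names pool_alloc() and pool_free(), the 4096-byte arena borrowed from malloc(), and the first-fit policy. Allocated blocks carry no header at all; the only meta-data is an address-ordered list stored inside the free regions.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical pool with sized deallocation. Allocated blocks carry no
 * header; the only bookkeeping lives inside the free regions themselves. */
typedef struct free_region {
    size_t size;               /* bytes in this free region */
    struct free_region *next;  /* next free region, in address order */
} free_region;

#define POOL_BYTES 4096                 /* assumed arena size */
#define GRANULE    sizeof(free_region)  /* allocation granularity */

static free_region *free_list;
static int pool_ready;

static size_t round_up(size_t n)
{
    if (n == 0)
        n = 1;
    return (n + GRANULE - 1) / GRANULE * GRANULE;
}

void *pool_alloc(size_t size)
{
    if (size > POOL_BYTES)              /* also avoids overflow in round_up */
        return NULL;
    if (!pool_ready) {                  /* borrow one arena from malloc() */
        free_list = malloc(POOL_BYTES);
        if (!free_list)
            return NULL;
        free_list->size = POOL_BYTES;
        free_list->next = NULL;
        pool_ready = 1;
    }
    size = round_up(size);
    for (free_region **link = &free_list; *link; link = &(*link)->next) {
        free_region *r = *link;
        if (r->size < size)
            continue;
        if (r->size == size) {
            *link = r->next;            /* region used up entirely */
        } else {                        /* carve from the front */
            free_region *rest = (free_region *)((unsigned char *)r + size);
            rest->size = r->size - size;
            rest->next = r->next;
            *link = rest;
        }
        return r;
    }
    return NULL;
}

/* The caller supplies the size it asked for; it is trusted blindly. */
void pool_free(void *p, size_t size)
{
    free_region *r = p, *prev = NULL, *cur = free_list;
    if (!p)
        return;
    while (cur && cur < r) {            /* find address-ordered position */
        prev = cur;
        cur = cur->next;
    }
    r->size = round_up(size);
    r->next = cur;
    if (cur && (unsigned char *)r + r->size == (unsigned char *)cur) {
        r->size += cur->size;           /* merge with the region after */
        r->next = cur->next;
    }
    if (prev && (unsigned char *)prev + prev->size == (unsigned char *)r) {
        prev->size += r->size;          /* merge with the region before */
        prev->next = r->next;
    } else if (prev) {
        prev->next = r;
    } else {
        free_list = r;
    }
}
```

With this layout an msize()-style query has nothing to look up: the pool cannot tell where one allocated block ends and the next begins, which is exactly why pool_free() needs the size from its caller.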

Enjoy

Jakob

Keith Thompson

2016-02-03 16:57:45 UTC

[...]

Post by Keith Thompson
I don't think anyone was suggesting that msize() must report the exact
size that was requested. Rather, it would return a value >= the
originally requested size, such that memory up to the reported size is
accessible.
char *p = malloc(47);    // allocates, say, 48 bytes; the 47 is forgotten
size_t size = msize(p);  // size == 48
for (size_t i = 0; i < size; i++) {
    // p[i] is accessible
}
So my question is whether retrieving the actual allocated size is
possible on all existing implementations.

Modulo the API function to do it being actually available,

Right, it might not be available to the C library implementation.

Post by Kaz Kylheku
it must be possible in any implementation that meaningfully supports
the free function.
The proof is by Reductio ad Absurdum. If we suppose that it is not
possible (that is to say, after handing out the block to the program,
the implementation has absolutely no recollection of that block's extent,
and no way to calculate it) then it is not possible for the free
function to return the block to storage.

Hmm. I'm not entirely convinced of that. I can imagine implementations
where free() can work, but retrieving the allocated size is difficult.
I haven't taken the time to construct a plausible (or implausible)
scenario. (I have some vague thoughts about a collection of linked
lists, one list for each allocation size.)

Post by Kaz Kylheku
This contradicts the working
assumption that we have a working, meaningful free function for
recycling storage.
Here I am making the claim that given a pointer to some block which was
previously allocated, it is not possible to liberate the memory, so that
it is available to new allocations, without knowing how far that memory
extends.

Even if the size has to be known at some level, it might not be known to
the C library.

But if there are no such systems in the real world, it might be
reasonable to standardize an msize() function (assuming it's
sufficiently useful).

[...]

Post by Kaz Kylheku
This is actually reasonable. Many programs, or program modules, not only
don't need to inquire about the size of an allocated object, but in
fact they know what it is, statically (from the type it is used for).

That doesn't work if you allocate a non-constant sized array:

double *data = malloc(n * sizeof *data);

(Presumably the program will remember the value of n.)

Jakob Bohm

2016-02-03 22:40:07 UTC

Post by Keith Thompson
[...]

Modulo the API function to do it being actually available,

Right, it might not be available to the C library implementation.

Hence the reason my description of msize() provided no performance
guarantees.

Even if the size has to be known at some level, it might not be known to
the C library.
But if there are no such systems in the real world, it might be
reasonable to standardize an msize() function (assuming it's
sufficiently useful).
[...]

double *data = malloc(n * sizeof *data);
(Presumably the program will remember the value of n.)

Those kinds of arrays are indeed the reason, combined with a reluctance
to explicitly remember n (especially in layered contexts where multiple
layers would otherwise have to allocate space for, and execute code to
manage, a layer-specific copy of the allocation size).

For example, here are two common uses I have had before (simplified);
neither would work if msize() could return less than the requested
allocation size (or even 0 for a nonzero allocation size).

/* Use instead of free() where the memory may have contained passwords
 * etc. */
void free_secure(void *p)
{
    if (p) {
        memset_no_optimize_away(p, 0, msize(p));
        free(p);
    }
}

/* Same as realloc(), but ensures any added memory is all zero,
 * Use with a similar calloc()-wrapper that zeroes any
 * overallocation, because calloc() doesn't guarantee that */
void *realloc_zero(void *p, size_t s_new)
{
    size_t s_old = 0;
    void *p_new;

    if (p)
        s_old = msize(p);
    p_new = realloc(p, s_new);
    if (p_new)
        p = p_new; /* In case realloc succeeded */
    if (p) { /* Check if old p grew in a failed realloc too */
        s_new = msize(p);
        if (s_new > s_old)
            memset(((char*)p) + s_old, 0, s_new - s_old);
    }

    return p_new;
}

Enjoy

Jakob

s***@casperkitty.com

2016-02-03 23:59:54 UTC

Post by Jakob Bohm
For example, here are some common uses I have had before (simplified),
neither would work if msize() could return less than the requested
allocation size (or even 0 for a nonzero allocation size).

If there were a macro which indicated what level of guarantees an msize
implementation could provide, code which would rely upon such guarantees
could refuse compilation on platforms which couldn't provide them, without
affecting the usability of platforms that can't offer such guarantees to
run programs that don't need them.

Post by Jakob Bohm
/* Use instead of free() where the memory may have contained passwords
* etc. */
void free_secure(void *p);

That really needs to be a separate library function, given that even if you
zero out memory before freeing it a compiler could still legitimately omit
that operation since there would be no defined means by which a legitimate
C program could detect such omission.

Post by Jakob Bohm
/* Same as realloc(), but ensures any added memory is all zero,
* Use with a similar calloc()-wrapper that zeroes any
* overallocation, because calloc() doesn't guarantee that */
void *realloc_zero(void *p, size_t s_new);

What should be the effect of:

unsigned char *p = realloc_zero(0, 1000000); // Create new region
p[999999] = 123;
for (int i=999999; i>=1; i--)
p = realloc_zero(p, i);
p = realloc_zero(p, 1000000);

How long should the code take if only a few (if any) of the realloc
operations bother to resize p? If a 1000000-byte block is reallocated
as 900000 but might report itself as being 1000000 afterward the only
way to know that a future realloc_zero(p, 1000000) will have the last
100,000 bytes zeroed out will be to zero out the memory before the
realloc_zero. If after that operation the size is still reported as
1000000 then the next attempt to realloc_zero(p, 899999) will have to
clear out the last 100k+1 bytes again.

While I can see some uses for msize() which would require that it yield
a value at least as big as the requested space, and would favor a macro
indicating whether msize can be counted upon to behave that way, I can
also see uses for an msize which might not always be able to report the
allocated size and think a design should allow for that.

Jakob Bohm

2016-02-05 06:50:50 UTC

Post by Jakob Bohm
/* Use instead of free() where the memory may have contained passwords
* etc. */
void free_secure(void *p);

Hence my reference to the special memset_no_optimize_away()
hypothetical library function, which can also do this for e.g.
automatic variables that are about to be "removed" from the
stack.

A similar zeroizing function is often found in libraries that need
such a free_secure()-style wrapper.
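
One common way to approximate the hypothetical memset_no_optimize_away() in portable C is to write through a pointer to volatile, as in the sketch below. This is only an illustration of the technique, not a guarantee: whether volatile-qualified access to an object that was not declared volatile must be preserved is itself debated, although compilers generally honour it in practice. Where available, purpose-built routines such as the optional C11 Annex K memset_s(), explicit_bzero() on glibc and the BSDs, or SecureZeroMemory() on Windows are usually preferable.

```c
#include <stddef.h>

/* Hypothetical stand-in for memset_no_optimize_away(). Each store goes
 * through a volatile lvalue, so the compiler is expected to treat it as
 * an observable side effect instead of a dead store before free(). */
void *memset_no_optimize_away(void *p, int c, size_t n)
{
    volatile unsigned char *bytes = p;

    while (n--)
        *bytes++ = (unsigned char)c;
    return p;
}
```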

unsigned char *p = realloc_zero(0, 1000000); // Create new region
p[999999] = 123;
for (int i=999999; i>=1; i--)
p = realloc_zero(p, i);
p = realloc_zero(p, 1000000);

It may end up with p[999999] equal to 123 or 0, but not any arbitrary
uninitialized value.

Post by s***@casperkitty.com
How long should the code take if only a few (if any) of the realloc
operations bother to resize p? If a 1000000-byte block is reallocated
as 900000 but might report itself as being 1000000 afterward the only
way to know that a future realloc_zero(p, 1000000) will have the last
100,000 bytes zeroed out will be to zero out the memory before the
realloc_zero. If after that operation the size is still reported as
1000000 then the next attempt to realloc_zero(p, 899999) will have to
clear out the last 100k+1 bytes again.

Actually not, hence the check of both old and new size. If a
particular realloc() call acts like a NOP, the ifs would skip the
memset() call, but use two msize() calls to determine that it actually
acted as NOP.

Post by s***@casperkitty.com
While I can see some uses for msize() which would require that it yield
a value at least as big as the requested space, and would favor a macro
indicating whether msize can be counted upon to behave that way, I can
also see uses for an msize which might not always be able to report the
allocated size and think a design should allow for that.

So far, I have seen no realistic arguments why any real implementation
would need to make msize() fail/return "unknown".

Specifically, for the "overridable malloc" case, those would obviously
need to also allow msize() override and document this as an important
thing to do when upgrading programs that actually override malloc.

For the "calls lower level allocator", virtually all of those lower
level allocators either provide a usable msize-equivalent or require
the malloc implementation to store the requested size for passing to
the lower level free-equivalent.

Enjoy

Jakob

Francis Glassborow

2016-02-04 11:20:25 UTC

Post by Keith Thompson
[...]

Modulo the API function to do it being actually available,

Right, it might not be available to the C library implementation.

Even if the size has to be known at some level, it might not be known to
the C library.
But if there are no such systems in the real world, it might be
reasonable to standardize an msize() function (assuming it's
sufficiently useful).
[...]

double *data = malloc(n * sizeof *data);
(Presumably the program will remember the value of n.)

Who owns unallocated memory? Does the standard prohibit it being owned by
the OS? All that free has to do is to notify the heap manager that the
block of memory it originally handed out with this address is no longer
in use. If the heap manager is not directly accessible to the runtime
system (it works by receiving and sending messages) then it is entirely
possible that all your executable can do is use the block it has
acquired with malloc and free it at the end. If the program tries to
access memory outside the block chaos can occur but such access is
undefined behaviour so chaos is allowed.

Where does the Standard require that the runtime system track the size
of allocated blocks of dynamic memory?

Francis

James Kuyper

2016-02-04 15:18:21 UTC

On 02/04/2016 06:20 AM, Francis Glassborow wrote:
...

Post by Francis Glassborow
Who owns unallocated memory? Does the standard prohibit it being owned by
the OS? All that free has to do is to notify the heap manager that the
block of memory it originally handed out with this address is no longer
in use. If the heap manager is not directly accessible to the runtime
system (it works by receiving and sending messages) then it is entirely
possible that all your executable can do is use the block it has
acquired with malloc and free it at the end. If the program tries to
access memory outside the block chaos can occur but such access is
undefined behaviour so chaos is allowed.

That just passes the issue one level farther down. Whoever actually is
managing unallocated memory must know how much space was set aside for a
given allocation. It should, therefore, be feasible for the memory
manager (whether it's the OS or the malloc() family of functions or some
other piece of software) to answer questions about the size. If it
doesn't, the people responsible for that software can be asked to add
that feature, which should be no more expensive in terms of
development and execution time than free().

Post by Francis Glassborow
Where does the Standard require that the runtime system track the size
of allocated blocks of dynamic memory?

It doesn't. If it did, there would be no need to propose that the
standard be changed to require it. The issue is whether imposing such a
requirement would be unacceptably burdensome on whoever wrote the
relevant memory management system. I don't see how it could be - even on
systems like the one you describe, it shouldn't be a difficult request
to make of the authors of the OS.

--
James Kuyper

s***@casperkitty.com

2016-02-04 17:23:42 UTC

Post by James Kuyper
If it
doesn't, the people responsible for that software can be asked to add
that feature, which should be no more expensive in terms of
development and execution time than free().

The free() function does not need to determine the size of an allocation
at the time that it is freed. It would be legitimate (and in some cases
helpful) for an implementation to merely have free() set a bit in the byte
preceding an allocation, and have the memory allocation function examine
the marker bytes for allocations at its convenience (not necessarily before
every allocation) and see which ones are still in use.
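
A minimal sketch of that deferred scheme, under simplifying assumptions that are not part of the post: a fixed arena of equal-sized slots, a marker in the byte immediately before each payload, and the hypothetical names lazy_alloc() and lazy_free(). Because the slots are all the same size, the sketch says nothing about whether msize() could be supported; it only shows that free() can be a single store, with the scan for reusable blocks postponed until the allocator runs out of fresh slots.

```c
#include <stddef.h>

#define SLOT_PAYLOAD 48   /* assumed usable bytes per block */
#define SLOT_COUNT   8    /* assumed number of blocks in the arena */
#define HDR          16   /* bytes before each payload; the last is the marker */

enum { MARK_LIVE = 0, MARK_FREED = 1 };

static _Alignas(max_align_t) unsigned char arena[SLOT_COUNT][HDR + SLOT_PAYLOAD];
static size_t next_unused;   /* index of the first slot never handed out */

/* Freeing only sets a bit in the byte preceding the allocation. */
void lazy_free(void *p)
{
    if (p)
        ((unsigned char *)p)[-1] |= MARK_FREED;
}

void *lazy_alloc(void)
{
    /* Fast path: hand out a slot that has never been used. */
    if (next_unused < SLOT_COUNT) {
        unsigned char *payload = arena[next_unused++] + HDR;
        payload[-1] = MARK_LIVE;
        return payload;
    }
    /* Slow path, run at the allocator's convenience: examine the
     * markers and recycle the first slot that was freed earlier. */
    for (size_t i = 0; i < SLOT_COUNT; i++) {
        unsigned char *payload = arena[i] + HDR;
        if (payload[-1] & MARK_FREED) {
            payload[-1] = MARK_LIVE;
            return payload;
        }
    }
    return NULL;   /* every slot is still in use */
}
```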

There exist compilers, certainly for embedded systems and probably for hosted
systems as well, which allow user-installed allocation handlers; such things
are sometimes necessary to make code written in C coexist nicely with code
written in other languages. I've seen no indication of how msize() could
return useful information in such an implementation.

BTW, I think a more fundamental weakness lies in realloc; there are a number
of usage cases for that function; the optimal behavior for the function
varies widely among the different cases, but there's no way an implementation
can know which usage case fits any particular call. At minimum, I'd suggest
at least the following usage cases:

1. Expand the block size if practical without relocating it, or else report
that relocation might be necessary. If the block is larger than the
requested size, keep its size. Report actual allocated size if
practical.

2. Expand or shrink the block size if practical without relocating it, or
else report that relocation might be necessary. Report the actual size
if practical.

3. Expand the block size to at least the requested value; if relocation is
necessary, over-allocate on the expectation that further expansions may
be required. If the block is larger than requested size, keep its size.
Report allocated size if practical, but report at least requested size
unless allocation fails.

4. Set the block size to the exact requested value if practical; eagerly
relocate blocks in cases where it will reduce fragmentation. Report
size as with #3.

In many cases, code may need to have multiple objects that can change size
while they're being built, but will know at some point that objects have
reached their final size. Over-allocation is useful while objects are being
built, but wasteful once construction is complete. An implementation could
implement the above usage cases by having #1 and #2 simply report that
relocation may be required, and size is unavailable, and having #3 and #4
simply call realloc and report that relocation may have occurred. That
wouldn't be as efficient as having code tailored to the above usage
patterns, but would allow code written for the new usage patterns to be used
on systems that can't implement them by adding a simple wrapper.
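
The "simple wrapper" fallback described above could look roughly like the following sketch. The function resize_block(), its enums, and the RESIZE_SIZE_UNKNOWN sentinel are all invented for illustration; nothing of the kind exists in the C standard. Cases 1 and 2 always answer "relocation might be necessary, size unavailable", while cases 3 and 4 defer to plain realloc(). The caller is assumed to pass a nonzero size.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical request kinds, numbered as in the list above. */
enum resize_mode {
    RESIZE_GROW_IN_PLACE,    /* case 1 */
    RESIZE_ADJUST_IN_PLACE,  /* case 2 */
    RESIZE_GROW_GENEROUS,    /* case 3 */
    RESIZE_EXACT             /* case 4 */
};

enum resize_status {
    RESIZE_FAILED,           /* allocation failed; *block is unchanged */
    RESIZE_NEEDS_RELOCATION, /* in-place request declined; *block unchanged */
    RESIZE_MAY_HAVE_MOVED    /* *block updated; the old pointer is dead */
};

#define RESIZE_SIZE_UNKNOWN ((size_t)-1)

/* Lowest-common-denominator implementation on top of realloc(). */
enum resize_status resize_block(void **block, size_t size,
                                enum resize_mode mode, size_t *actual)
{
    void *moved;

    switch (mode) {
    case RESIZE_GROW_IN_PLACE:
    case RESIZE_ADJUST_IN_PLACE:
        /* Plain realloc() cannot promise to stay in place, so decline. */
        *actual = RESIZE_SIZE_UNKNOWN;
        return RESIZE_NEEDS_RELOCATION;
    default:
        moved = realloc(*block, size);
        if (moved == NULL) {
            *actual = RESIZE_SIZE_UNKNOWN;
            return RESIZE_FAILED;
        }
        *block = moved;
        *actual = size;   /* report at least the requested size */
        return RESIZE_MAY_HAVE_MOVED;
    }
}
```

Code written against such an interface would try an in-place request first and fall back to a relocating one, so it would still run correctly on this fallback while gaining speed on allocators that can honour the in-place cases.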

Richard Damon

2016-02-04 23:30:37 UTC

Post by Keith Thompson
So my question is whether retrieving the actual allocated size is
possible on all existing implementations.

Modulo the API function to do it being actually available,
it must be possible in any implementation that meaningfully supports
the free function.
The proof is by Reductio ad Absurdum. If we suppose that it is not
possible (that is to say, after handing out the block to the program,
the implementation has absolutely no recollection of that block's extent,
and no way to calculate it) then it is not possible for the free
function to return the block to storage. This contradicts the working
assumption that we have a working, meaningful free function for
recycling storage.
Here I am making the claim that given a pointer to some block which was
previously allocated, it is not possible to liberate the memory, so that
it is available to new allocations, without knowing how far that memory
extends. The memory can only be liberated as far as the lesser of the
next allocated object, or else the end of memory. If we know where the
next allocated object lies, then in fact we do know the extent of the
allocated block, contradicting the assumption that this information is
not known or computable. If we don't know where the next allocated
object lies, then we have no basis for liberating the memory. We have
to know how much to liberate, and if we have no idea whether the guessed
extent includes an allocated object or not, we can hardly proceed.
Not to know the size of the block means not to have it stored anywhere
(such as in a header), *and* not having a list of all allocated objects
which would allow us to traverse them and determine the lowest addressed
one whose address is higher than the given block.
(If we instead have a list of all free space, and only that, then the
areas between the free spaces are allocated blocks. But those areas
alone don't provide the information about how they are divided into
allocated blocks. A given allocated zone could be a single object, or
it could be a pair of objects, or three. In these cases, we have no idea
where the boundaries lie. Our free function needs a size argument.)

Post by Keith Thompson
Of course it could be implemented by storing the allocated size
separately, but I don't think we'd want to impose a non-zero overhead
for that unless the user asks for it.

Any allocator which doesn't have this info needs a free function that
takes the size as an argument (and that size is blindly trusted).
Such an allocator need not associate any meta-data with allocated blocks
at all, and can thus have low space overhead. It can keep meta-data
about free regions only.
This is actually reasonable. Many programs, or program modules, not only
don't need to inquire about the size of an allocated object, but in
fact they know what it is, statically (from the type it is used for).

The fundamental question is whether there exists any (reasonable)
implementation that defers (at least some) allocations to the OS, where
the OS doesn't provide an API to get some form of block size.

I could see an implementation that uses a normal heap for 'normal' sized
objects, but for efficiency defers 'large' allocations to the OS to
handle (especially for things like calloc, the OS might be able to
provide zero pages cheaper). If the OS doesn't currently support the
ability to get the size of a block allocated this way, then the
implementation can't support the msize definition. Asking the OS to add
the feature isn't very useful. What would you think of an implementation
that says it only conforms when the generated program is run on Windows 11
or later (when Windows 10 is the current version)?

James Kuyper

2016-02-05 03:10:08 UTC

On 02/04/2016 06:30 PM, Richard Damon wrote:
...

Post by Richard Damon
The fundamental question is does there exist any (reasonable)
implementation that defers (at least some) allocations to the OS, and
the OS doesn't provide an API to get some form of block size.
I could see an implementation that uses a normal heap for 'normal' sized
objects, but for efficiency defers 'large' allocations to the OS to
handle (especially for things like calloc, the OS might be able to
provide zero pages cheaper). If the OS doesn't currently support the
ability to get the size of a block allocated this way, then the
implementation can't support the msize definition.

I think you're losing sight of the relevant issue. No matter what the
OS does, the malloc() family can always support msize() by simply
keeping track of the information itself. If you request a block from the
OS of size N, and the OS provides it, then msize() could, at a minimum,
simply return N. The OS might allocate a block of memory larger than N,
and might not provide any mechanism of determining how much larger, but
the malloc() family doesn't need to know how much extra space the OS
allocated - the specification for msize() allows it to return the
requested size.
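
As a rough illustration of "keeping track of the information itself", the sketch below wraps malloc() and stores the requested size in a header placed in front of each block. The tracked_* names are hypothetical. The header is padded to the size of max_align_t so the payload stays suitably aligned, and that padding is exactly the per-allocation overhead the objections in this thread are about.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Header padded to maximum alignment so the payload after it is
 * aligned as well as a plain malloc() result would be. */
typedef union {
    size_t requested;
    max_align_t align;
} size_header;

void *tracked_malloc(size_t n)
{
    size_header *h;

    if (n > SIZE_MAX - sizeof *h)      /* avoid overflow in the sum */
        return NULL;
    h = malloc(sizeof *h + n);
    if (h == NULL)
        return NULL;
    h->requested = n;
    return h + 1;                      /* payload starts after the header */
}

/* Reports the requested size, which is a legal answer under a
 * "value >= the requested size" specification. */
size_t tracked_msize(const void *p)
{
    return ((const size_header *)p)[-1].requested;
}

void tracked_free(void *p)
{
    if (p)
        free((size_header *)p - 1);
}
```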

The question in this thread has never been about whether it would be
possible to implement msize() - it obviously is. The question has been
about whether it would be trivial to implement it on all platforms. It
certainly would be trivial on most platforms, because they already have
ways of quickly determining that information. The question has been
about whether there are any significant number of obscure platforms that
would require major changes to the C standard library to implement msize().

Post by Richard Damon
... Asking the OS to add
the feature isn't very useful. What would you think of an implementation
that says it only conforms when the generated program is run on Windows 11
or later (when Windows 10 is the current version)?

I'd have little use for it; I'd have equally little use for it even if
it ran on all versions of Windows. But there's nothing particularly odd
about applications that only work with the most recent versions of an OS.

--
James Kuyper

Richard Damon

2016-02-05 12:09:56 UTC

I think you're losing sight of the relevant issue. No matter what the
OS does, the malloc() family can always support msize() by simply
keeping track of the information itself. If you request a block from the
OS of size N, and the OS provides it, then msize() could, at a minimum,
simply return N. The OS might allocate a block of memory larger than N,
and might not provide any mechanism of determining how much larger, but
the malloc() family doesn't need to know how much extra space the OS
allocated - the specification for msize() allows it to return the
requested size.
The question in this thread has never been about whether it would be
possible to implement msize() - it obviously is. The question has been
about whether it would be trivial to implement it on all platforms. It
certainly would be trivial on most platforms, because they already have
ways of quickly determining that information. The question has been
about whether there are any significant number of obscure platforms that
would require major changes to the C standard library to implement msize().

You seem to be talking on both sides of the coin. The question has been
whether this feature can be implemented everywhere at low cost. If we have
an implementation that forwards 'page' sized requests directly to the OS,
then requiring the implementation to add overhead to each such request to
track the size of the allocation can be expensive. I would imagine that
a large percentage of such requests could well be exactly the size of a
page (particularly for applications wanting efficiency and knowing that
such requests get forwarded as page-size requests to the OS). As such,
adding ANY information as a header block has a reasonable chance of
incurring an overhead of a full memory page to store that data.
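
The arithmetic behind that worry is easy to check. Assuming, purely for illustration, a 4096-byte page and a 16-byte size header, a request for exactly one page fits in one page without the header but needs two pages with it, which doubles the cost of that allocation.

```c
#include <stddef.h>

/* Whole pages needed to hold `bytes` when memory is granted per page. */
size_t pages_needed(size_t bytes, size_t page_size)
{
    return (bytes + page_size - 1) / page_size;
}

/* With the assumed figures:
 *   pages_needed(4096, 4096)      is 1   (no header)
 *   pages_needed(4096 + 16, 4096) is 2   (header pushes it over the edge)
 */
```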

I don't know if there are currently any implementations that do work
this way, but I do believe there have been.

Note, I didn't say only the most recent version of Windows, but only on
the NEXT version of Windows. If the standard added a feature which
requires an OS upgrade to support, then it would not work on the
'current' version of the OS, only something later. Perhaps Microsoft's
update cycle is faster than the ISO Standard Body, but it would be
highly unusual for the ISO committee to make a requirement that is
costly for an implementation to meet.

If the addition allows the implementation to return 0 (and not a number
at least as large as the original request), then it can be trivially
added (as you always have the option of just always returning 0) but
then programs can't depend on the value being useful. I could see the
requirement being a number between the requested size and the allocated
size, or 0, so the program can at least know if the implementation is
being unhelpful (except for the case of an implementation returning
valid pointers for a 0 sized block, but there isn't much the program can
do with such blocks anyway).

Jakob Bohm

2016-02-05 14:53:13 UTC

I think you're losing sight of the relevant issue. No matter what the
OS does, the malloc() family can always support msize() by simply
keeping track of the information itself. If you request a block from the
OS of size N, and the OS provides it, then msize() could, at a minimum,
simply return N. The OS might allocate a block of memory larger than N,
and might not provide any mechanism of determining how much larger, but
the malloc() family doesn't need to know how much extra space the OS
allocated - the specification for msize() allows it to return the
requested size.
The question in this thread has never been about whether it would be
possible to implement msize() - it obviously is. The question has been
about whether it would be trivial to implement it on all platforms. It
certainly would be trivial on most platforms, because they already have
ways of quickly determining that information. The question has been
about whether there are any significant number of obscure platforms that
would require major changes to the C standard library to implement msize().

You seem to be talking on both sides of the coin. The question has been
can this feature be implemented everywhere at low cost. If we have an
implementation that forwards 'page' sized request directly to the OS,
then to require the implementation to add overhead to that request to
track the size of the allocation can be expensive. I would imagine that
a large percentage of such request could well be exactly the size of a
page (particularly for application wanting efficiency and knowing that
such requests get forwarded as a page size request to the OS). As such
adding ANY information as a header block has a reasonable chance of
incurring an overhead of a full memory page to store that data.
I don't know if there are currently any implementations that do work
this way, but I do believe there have been.

This would only be a problem if the OS lacks a call to indicate the
number of pages allocated in a page-allocation request (i.e. an
msize-like syscall for page allocations).

For instance Win32 and Win64 have the VirtualQuery() call to get this
information (and this API is as old as the Win32 API itself).

Similarly, Win16 has/had the GlobalSize() and LocalSize() calls for
those system allocators, and GlobalSize() could usually be inlined as a
CPU instruction.

I am not certain what the equivalent is for an anonymous mmap() in the
POSIX standard, which is always a useful place to check when adjusting
the C standard.

Note, I didn't say only the most recent version of Windows, but only on
the NEXT version of Windows. If the standard added a feature which
requires an OS upgrade to support, then it would not work on the
'current' version of the OS, only something later. Perhaps Microsoft's
update cycle is faster than the ISO Standard Body, but it would be
highly unusual for the ISO committee to make a requirement that is
costly for an implementation to meet.
If the addition allows the implementation to return 0 (and not a number
at least as large as the original request), then it can be trivially
added (as you always have the option of just always returning 0) but
then programs can't depend on the value being useful. I could see the
requirement being a number between the requested size and the allocated
size, or 0, so the program can at least know if the implementation is
being unhelpful (except for the case of an implementation returning
valid pointers for a 0 sized block, but there isn't much the program can
do with such blocks anyway).

Such implementations should at least signal via a define (standardized
along with msize itself) whether they

a) Always fail msize() calls, in which case applications can avoid
trying to use it.
b) Sometimes fail msize() calls, in which case applications need to
check for the error return value (0 in your suggestion), or may
decide they have no use for an unreliable msize() function.
c) Never fail msize() calls for valid memory allocations, in which
case applications can omit the overhead of checking for the error
return value.

For type b implementations, it should be required that it fails for a
valid allocation if and only if it will consistently fail for that
allocation as long as it remains valid (i.e. until it is freed by a
call to free(), realloc() with some arg/retval combos or any future
similar API).
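
A sketch of how application code might consume such a define. The macro name MSIZE_RELIABILITY, its three values, and the stub msize() that always reports 0 are assumptions made up for this illustration; the three branches correspond to cases a), b) and c) above.

```c
#include <stddef.h>

/* Hypothetical feature-test values, one per case in the list above. */
#define MSIZE_NEVER_WORKS     0   /* a) always fails   */
#define MSIZE_SOMETIMES_WORKS 1   /* b) may return 0   */
#define MSIZE_ALWAYS_WORKS    2   /* c) never fails    */

#ifndef MSIZE_RELIABILITY
#define MSIZE_RELIABILITY MSIZE_SOMETIMES_WORKS   /* assumed for the demo */
#endif

/* Stand-in for the proposed function: this stub never knows the size. */
static size_t msize(const void *p)
{
    (void)p;
    return 0;
}

/* Best available size for a block: msize() where it can be trusted,
 * otherwise the size the caller remembered for itself. */
size_t usable_size(const void *p, size_t remembered)
{
#if MSIZE_RELIABILITY == MSIZE_ALWAYS_WORKS
    (void)remembered;
    return msize(p);                  /* no error check needed */
#elif MSIZE_RELIABILITY == MSIZE_SOMETIMES_WORKS
    size_t reported = msize(p);
    return reported != 0 ? reported : remembered;
#else
    (void)p;
    return remembered;                /* never bother calling msize() */
#endif
}
```

Only in case c) can the caller stop remembering sizes altogether; in the other two cases the fallback bookkeeping has to exist, which is the overhead a compile-time indication would let wrappers omit where it is not needed.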

Enjoy

Jakob

Richard Bos

2016-02-05 16:39:32 UTC

Post by Jakob Bohm
b) Sometimes fails msize() calls, in which case applications need to
check for the error return value (0 in your suggestion), or may
decide they have no use for an unreliable msize() function.

Erm.

I would suggest that, as with malloc(), this is _always_ a good idea,
regardless of the theoretical capabilities of the system.

For one, you Implementors are only _slightly_ more perfect than us
normal programmers. I'll trust you, but I _will_ tie my camel's leg.

Richard

Jakob Bohm

2016-02-05 17:01:02 UTC

Post by Richard Bos

Erm.
I would suggest that, as with malloc(), this is _always_ a good idea,
regardless of the theoretical capabilities of the system.
For one, you Implementors are only _slightly_ more perfect than us
normal programmers. I'll trust you, but I _will_ tie my camel's leg.
Richard

Actually, I am not a C language implementer, but an advanced C language
user who often writes wrappers that abstract away compiler and system
differences.

And dealing with systems where basic functions don't always do what
they should adds real overhead to code that needs to check for those
errors and dynamically adapt. For instance some wrappers would need to
completely abandon msize() use if it won't work every time. Others
would have to set up a separate data structure of their own for allocations where
msize() fails than for those where it succeeds (specifically to store
the allocation size when the system doesn't). Either type would
benefit immensely from knowing what the system does and does not
provide in a way so the unneeded or unused alternative algorithm and
data structure can be completely omitted in the many common cases where
they are not needed.

Enjoy

Jakob

James Kuyper

2016-02-05 19:25:34 UTC

...

Note, I didn't say only the most recent version of Windows, but only on
the NEXT version of Windows.

You're saying that someone would release an implementation of C that
requires Windows 11, before Windows 11 has become "the current version
of Windows"? I knew the Windows world differed from the Linux world, but
I didn't think the differences extended that far.

Post by Richard Damon
... If the standard added a feature which
requires an OS upgrade to support, then it would not work on the
'current' version of the OS, only something later.

That's not particularly odd. New features of C almost never work on the
current version of the C implementation, either. There's always a
waiting period before full conformance to a new version of the standard
becomes commonplace. Arguably, C99 never even reached that point.

Post by Richard Damon
... Perhaps Microsoft's
update cycle is faster than the ISO Standard Body, but it would be
highly unusual for the ISO committee to make a requirement that is
costly for an implementation to meet.

True - but unless and until you can explain why, I find it difficult to
imagine that adding the necessary functionality to Windows to enable
implementation of msize() would be "costly".

Post by Richard Damon
If the addition allows the implementation to return 0 (and not a number
at least as large as the original request), then it can be trivially
added (as you always have the option of just always returning 0) but
then programs can't depend on the value being useful. I could see the
requirement being a number between the requested size or the allocated
size, or 0, so the program can at least know if the implementation is
being unhelpful (except for the case of an implementation returning
valid pointers for a 0 sized block, but there isn't much the program can
do with such blocks anyway).
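
To make the "0 means unhelpful" variant concrete, here is a sketch of how a
caller might cope with it. Everything named here is an assumption for
illustration: hyp_msize() stands in for the proposed function and is
deliberately the trivial always-0 version, and buf, buf_capacity() and
buf_append() are made-up names. The caller keeps its own record of the
requested size and only prefers the reported size when it is nonzero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the proposed function: the "unhelpful"
   implementation that is always allowed to return 0. */
size_t hyp_msize(const void *ptr)
{
    (void)ptr;
    return 0;
}

typedef struct {
    char  *data;
    size_t len;        /* bytes in use */
    size_t requested;  /* size last passed to realloc(), kept as a fallback */
} buf;

/* Usable capacity: the reported size if the system gives one, otherwise
   the size we asked for ourselves. */
size_t buf_capacity(const buf *b)
{
    size_t reported = b->data ? hyp_msize(b->data) : 0;
    return reported != 0 ? reported : b->requested;
}

/* Appends n bytes; returns 0 on success, -1 on overflow or out of memory. */
int buf_append(buf *b, const void *src, size_t n)
{
    assert(b != NULL);
    if (n > SIZE_MAX - b->len)
        return -1;
    size_t need = b->len + n;
    if (need > buf_capacity(b)) {
        size_t want = need > SIZE_MAX / 2 ? need : need * 2;
        char *p = realloc(b->data, want);
        if (p == NULL)
            return -1;
        b->data = p;
        b->requested = want;
    }
    if (n != 0)
        memcpy(b->data + b->len, src, n);
    b->len = need;
    return 0;
}
```

The code runs correctly either way, but with an always-0 msize the requested
field can never be dropped, so such an implementation gains the caller nothing.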

No one has suggested that msize() be specified in a way that would allow
such an implementation. More tellingly, no one has suggested a plausible
reason why it would be difficult to add the necessary functionality (if
necessary, at the OS level rather than in the malloc() family), so
there's no obvious need to allow such an implementation. I would expect
that implementations which let the OS handle some of those allocations,
where the OS implementor is unwilling to provide the needed information,
are sufficiently rare that I'm quite comfortable with the idea of
mandating that such implementations keep track of the information
themselves.

Richard Damon

2016-02-08 03:59:37 UTC

Note, I didn't say only the most recent version of Windows, but only on
the NEXT version of Windows.

What I am saying is that IF it is the case that an implementation sometimes
defers malloc calls directly to the OS, and the OS doesn't provide a
suitable API to get the allocated block size, then if the standard
required the function, that implementation would only be able to
provide it based on a future update of the OS. This would be a good
reason for someone to object to that addition to the standard.

Post by Richard Damon
... If the standard added a feature which
requires an OS upgrade to support, then it would not work on the
'current' version of the OS, only something later.

Not future C implementation, future version of OS to run the program.
This means that not only do you need the implementation updated, but the
USERS (not just the programmers) need to use an updated environment.
This is an unusual requirement.

True - but unless and until you can explain why, I find it difficult to
imagine that adding the necessary functionality to Windows to enable
implementation of msize() would be "costly"

The basis is that for a function like calloc, which needs to zero the
memory, there are often tricks that the OS can do to vastly improve the
efficiency of creating a block of all zeros.

The key issue here is that if an implementation like GCC wanted to add
this feature, THEY have no ability to add the feature to the OS, THEY
are not the provider of the OS.

Again, if it needs to be added at the OS level, then it might be
possible that only the OS provider would be able to provide an efficient
C implementation.

Keith Thompson

2016-02-08 04:48:50 UTC

Note, I didn't say only the most recent version of Windows, but only on
the NEXT version of Windows.

That would be necessary only to implement msize() without additional
overhead.

Regardless of the underlying OS, a C implementation could implement
msize() by recording the requested size for each call to malloc().
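
A minimal sketch of that approach, layered on top of the existing malloc()
rather than built into it. The names sz_malloc(), sz_msize() and sz_free()
are hypothetical, and C11 is assumed for max_align_t and static_assert. Each
block gets a small header in front of it that holds the requested size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Header placed in front of every block. The max_align_t member pads the
   header so the memory after it stays suitably aligned for any object. */
typedef union {
    size_t      size;
    max_align_t align;
} sz_header;

static_assert(sizeof(sz_header) % _Alignof(max_align_t) == 0,
              "header must preserve the alignment of the user block");

/* Like malloc(), but remembers the requested size. */
void *sz_malloc(size_t size)
{
    if (size > SIZE_MAX - sizeof(sz_header))
        return NULL;                      /* header would overflow size_t */
    sz_header *h = malloc(sizeof(sz_header) + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    return h + 1;                         /* user memory follows the header */
}

/* Reports the size originally requested for ptr (0 for a null pointer). */
size_t sz_msize(const void *ptr)
{
    if (ptr == NULL)
        return 0;
    return ((const sz_header *)ptr - 1)->size;
}

/* Releases a block obtained from sz_malloc(). */
void sz_free(void *ptr)
{
    if (ptr != NULL)
        free((sz_header *)ptr - 1);
}
```

The price is one header per block (typically 16 bytes where max_align_t has
16-byte alignment), which is the additional overhead in question when a
program allocates many small objects.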

BTW, the C standard already imposes a constraint on library
implementations: the ability to release memory without being told
how big it is. If the standard specified
    void *malloc(size_t size);
    void dealloc(void *ptr, size_t size);
and somebody proposed adding:
    void free(void *ptr);
then we'd now be having a very similar discussion.

[...]

James Kuyper

2016-02-08 16:08:31 UTC

Note, I didn't say only the most recent version of Windows, but only on
the NEXT version of Windows.

I don't see the problem. The implementation has two options:
1. Announce that it will not be fully conforming with the new
requirements until after that OS has been updated. There's nothing
particularly unusual about it taking some time before implementations
fully conform to a new version of the standard. Many never got around to
fully conforming to C99, and as I remember it, it was at least 5 years
after C90 before I saw a fully conforming implementation.
2. Keep track of the requested size of every block of memory requested
from the OS, and have msize() report that size.

Post by Richard Damon
... If the standard added a feature which
requires an OS upgrade to support, then it would not work on the
'current' version of the OS, only something later.

Not future C implementation, future version of OS to run the program.

You mean that a current C implementation could make use of a future
version of the OS, before it's even been released? That doesn't make any
sense to me. If it uses a feature that's only provided by a future
version of the OS, it's necessarily also a future implementation of C.

Post by Richard Damon
This means that not only do you need the implementation updated, but the
USERS (not just the programmers) need to use an updated environment.
This is an unusual requirement.

Well, since the requirement can be met by the implementation keeping
track of the requested size, it's not necessary to require users to wait
for the upgraded environment.

A good analogy would be floating point support. If an early version of C
had been released that did not require floating point support, an
implementation targeting a platform with optional hardware support for
floating point operations would probably not have bothered providing
software emulation for them; if the customer wanted floating point
operations, they could install the optional FPU. If a later version of C
had mandated floating point support, it would require either that the
user change his environment (install the optional FPU) or that the
implementation provide software emulation.

True - but unless and until you can explain why, I find it difficult to
imagine that adding the necessary functionality to Windows to enable
implementation of msize() would be "costly"

The basis is that for a function like calloc, which needs to zero the
memory, there are often tricks that the OS can do to vastly improve the
efficiency of creating a block of all zeros.

That doesn't say anything to me about why it would be difficult for the
OS to report the size of an allocated block of memory.

Post by Richard Damon
The key issue here is that if an implementation like GCC wanted to add
this feature, THEY have no ability to add the feature to the OS, THEY
are not the provider of the OS.

So, talk to the provider of the OS and request the feature. Unless and
until they agree to provide the feature, record the requested size and
have msize() return it.