Does `sizeof` *really* evaluate to a `std::size_t`? Can it?

60

2

Take the following standard passage:

[C++11: 5.3.3/6]: The result of sizeof and sizeof... is a constant of type std::size_t. [ Note: std::size_t is defined in the standard header <cstddef> (18.2). —end note ]

Now:

[C++11: 18.2/6]: The type size_t is an implementation-defined unsigned integer type that is large enough to contain the size in bytes of any object.

Granted, the passage doesn't require that size_t is a type alias defined with typedef, but since it's explicitly stated to be made available by the standard header <cstddef>, I think we can take as read that failing to include <cstddef> should remove any guarantee that size_t shall be available to a program.

However, according to that first quote, we can regardless obtain an expression of type std::size_t.

We can actually demonstrate both of these facts:

int main()
{
    typedef decltype(sizeof(0)) my_size_t;

    my_size_t x   = 0;  // OK
    std::size_t y = 1;  // error: 'size_t' is not a member of 'std'
}

std::size_t is not visible to the program, but sizeof(0) still gives us one? Really?

Is it therefore not correct to say that 5.3.3/6 is flawed, and that the result actually has "the same type as whatever std::size_t resolves to", but not std::size_t itself?

Sure, the two are one and the same if std::size_t is a type alias but, again, nowhere is this actually required.

Lightness Races in Orbit

Posted 2013-12-27T22:51:52.273

Reputation: 279 083

I'm pretty sure "the same type as whatever std::size_t resolves to" is std::size_t, regardless of how size_t is defined or whether it's an alias for anything. That said, I'm not enough of a C++ language lawyer to answer this question. – user2357112 – 2013-12-27T22:56:14.113

17"I think we can take as read that failing to include <cstddef> should remove any guarantee that size_t shall be available to a program" - you need to include cstddef to use the size_t name, but not the size_t type. – user2357112 – 2013-12-27T22:57:46.230

@user2357112: int and long are distinct types, despite having the same properties on my platform. – Lightness Races in Orbit – 2013-12-27T22:57:48.317

@user2357112: That might be the answer. I'd like to see a further exploration of that idea! – Lightness Races in Orbit – 2013-12-27T22:58:07.050

1@LightnessRacesinOrbit Related: operator new, operator delete and variations thereof are also available without the inclusion of any header files, but somewhere in the standard (I suppose 3.7.4 but not sure), I read that using them "does not make the operator std::operator new(std::size_t) visible", or something like that. – None – 2013-12-27T23:01:53.470

To be clear, I didn't expect y to compile just because x did. – Lightness Races in Orbit – 2013-12-27T23:03:00.367

4@H2CO3 the type is available in the language, no library needed. The library provides a convenient name for it, that's all. – R. Martinho Fernandes – 2013-12-27T23:05:01.543

@R.MartinhoFernandes It's just that the standard wording refers to this type with this name, which seems to be trying to make a binding between sizeof and this name, which is wrong?! Of course I'm being massively picky... – Lightness Races in Orbit – 2013-12-27T23:05:47.910

Answers

45

Do not confuse the map for the territory.

Types can be named by typenames. These typenames can be built-in, they can be user-defined types, or they could even be template parameters and refer to multiple different types depending on the instantiation.

But the names are not the types. Clearly the standard does not mandate that all types have names -- the classic struct {} is a type without a name.

std::size_t is a typename. It names the type that sizeof(expression) returns.

The compiler could have a canonical name for the type -- __size_t would be one way for it to have a unique built-in canonical typename.

The standard guarantees in that clause that whatever the type of sizeof(expression) is, once you #include <cstddef>, the name std::size_t now refers to that type.
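As a minimal sketch of that guarantee (the alias name sizeof_result_t is made up for illustration and is not part of any library), one can obtain the type from sizeof alone and then check that the library name denotes it once the header is included:

```cpp
#include <cstddef>      // provides the name std::size_t
#include <type_traits>  // std::is_same

// The type of a sizeof expression, obtained without spelling std::size_t.
// "sizeof_result_t" is a hypothetical alias chosen for this example.
typedef decltype(sizeof(0)) sizeof_result_t;

// With <cstddef> included, the library name must denote that same type.
static_assert(std::is_same<sizeof_result_t, std::size_t>::value,
              "std::size_t names the type yielded by sizeof");
```

The static_assert is checked at compile time, so a conforming C++11 compiler should accept this translation unit without running anything.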

In the standard, they refer to types by names. They do not say "the type that this typename refers to", but simply say "the type $NAME$". The compiler could decide that int is another name for __int_32_fast if it wanted to, and the standard would have no objection either.

This same thing happens with std::nullptr_t and std::initializer_list<Ts> and std::type_info: use of variables of those types does not always require that the header that provides you with a name for those types be included in your program.
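The std::nullptr_t case can be sketched the same way. The names null_type and accepts_null below are hypothetical, introduced only for this example; <cstddef> is included solely so that the library name is available for the comparison:

```cpp
#include <cstddef>      // provides the name std::nullptr_t
#include <type_traits>  // std::is_same

// The type of the nullptr literal, named without any library name.
// "null_type" is a hypothetical alias chosen for this example.
typedef decltype(nullptr) null_type;

// The library name, once visible, denotes that same type.
static_assert(std::is_same<null_type, std::nullptr_t>::value,
              "std::nullptr_t names the type of nullptr");

// A function can take the type through our own alias (hypothetical helper).
constexpr bool accepts_null(null_type) { return true; }
```

Only the static_assert needs the header; the typedef and the function would compile without it.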

The traditional C/C++ built-in types all had canonical names that did not require a header. The downside of adding more such names is that it breaks existing code, as new typenames in the global scope collide with other identifiers.

By having "nameless types", where you can get a name for them via including a header file, we avoid that problem.

Yakk - Adam Nevraumont

Posted 2013-12-27T22:51:52.273

Reputation: 179 086

What do you think about 5.7.6? Where it does exactly what I suggest, for std::ptrdiff_t. Isn't this, at the very least, an inconsistency in the choice of wording? – Lightness Races in Orbit – 2013-12-27T23:23:39.000

1@LightnessRacesInOrbit Yes, inconsistency in wording. Same with nullptr_t -- lots of inconsistent wording. Probably could be cleared up? But is pretty esoteric, is it worth the effort? – Yakk - Adam Nevraumont – 2013-12-27T23:29:59.140

Nope, doubt it. Just wanted us all to agree :) – Lightness Races in Orbit – 2013-12-27T23:36:25.080

If we add this in, that's the answer I think – Lightness Races in Orbit – 2013-12-27T23:49:40.340

Types and typenames... this reminds me of variables' names and pointers. We can access a variable by name, but we can also access it through a pointer returned from a function (in which case we don't need to include a header that declares the variable's name). Data is just a record in memory which can be accessed by name or by pointer. Similarly, types are records in compiler memory which can be referred to from a program by typename or by what operators return (using auto or decltype). – anton_rh – 2015-12-05T17:41:50.650

51

The standard just mandates that the type of sizeof(expr) is the same type as std::size_t. There is no mandate that using sizeof(expr) makes the name std::size_t available, and since std::size_t just names one of the built-in integral types, there isn't really a problem.

Dietmar Kühl

Posted 2013-12-27T22:51:52.273

Reputation: 124 519

It says that the type of sizeof(expr) is std::size_t, which is where I have a problem. – Lightness Races in Orbit – 2013-12-27T22:57:12.877

20@LightnessRacesinOrbit: There's a difference between types and type names. The type is always there, but the name isn't. – Kerrek SB – 2013-12-27T22:58:05.723

@KerrekSB: That sounds promising. Can you back it up with standard wording? I suppose it comes from the fact that numeric types cannot be magically created by a program, but it's all very indirect, isn't it? – Lightness Races in Orbit – 2013-12-27T22:58:51.960

3@LightnessRacesinOrbit: it is a bit like the symbol 1: until you know how to write it, you can't spell it directly. Just because you don't know how to spell it doesn't mean the concept isn't there. – Dietmar Kühl – 2013-12-27T22:58:59.287

@DietmarKühl: I just find it hard to accept that sizeof be guaranteed to yield a std::size_t when that type name is not visible to the program. Or, at least, I wish the standard were clearer about this. I think the answer I'm looking for touches on auto with private types, and the nullptr debacle from http://stackoverflow.com/q/17069315/560648. – Lightness Races in Orbit – 2013-12-27T23:00:09.293

@LightnessRacesinOrbit Also, I wish that an operator, which is part of the core language, weren't tied to a type available in the library. That's just conceptually wrong, isn't it? – None – 2013-12-27T23:02:41.877

6@LightnessRacesinOrbit: Well, it's quite straight-forward, isn't it? You can declare names, and you can define types. A type can have many names, declared with typedef or using, or as a template parameter, but only one definition - it's either built-in or user-defined. And traits only work on types, not names, so there's no trait to tell you whether something is a typedef or a primary name. – Kerrek SB – 2013-12-27T23:03:39.400

@H2CO3: The same observation, from the Lounge, spawned this question. :) – Lightness Races in Orbit – 2013-12-27T23:03:50.247

@LightnessRacesinOrbit Ugh... So it doesn't only twist my mind :P – None – 2013-12-27T23:04:28.680

4@H2CO3: note that there is, conceptually, no separation between the language and the library! It is one thing, and language features may be implemented in terms of what we might consider a library, or library features could be implemented by the language. – Dietmar Kühl – 2013-12-27T23:05:33.250

1@DietmarKühl But that's just downright stupid. – None – 2013-12-27T23:06:26.610

@LightnessRacesinOrbit: I think the beginning of Clause 3 (Basic Concepts) contains useful pointers... – Kerrek SB – 2013-12-27T23:10:09.297

@H2CO3: Not necessarily. There are traits in the standard library that cannot be implemented without compiler support (is_class is one I think, plus all the traits like is_nothrow_***_constructible), yet they are exposed as class templates – Andy Prowl – 2013-12-27T23:10:42.970

@AndyProwl I beg to differ. "There are traits in the standard library that cannot be implemented without compiler support" - and who said that this was a good idea? The library and the core language should be entirely decoupled. – None – 2013-12-27T23:11:38.773

@H2CO3: There is probably not much point in re-designing the standard library such that it can function without the existence of the core language... – Lightness Races in Orbit – 2013-12-27T23:12:13.297

@LightnessRacesinOrbit That's not what I'm saying. – None – 2013-12-27T23:12:34.480

@H2CO3: Messin' with ya ;) – Lightness Races in Orbit – 2013-12-27T23:13:06.177

@LightnessRacesinOrbit Ah OK :) Admittedly my phrasing wasn't perfect... but I'm sure you get the idea. – None – 2013-12-27T23:13:28.340

5Anyway I believe the issue here lies in the distinction between types and their names. A type may exist and classify an expression even if it is unutterable (see lambdas). The fact that sizeof(T) yields the same type as std::size_t does not conflict with the fact that std::size_t may be unutterable (if <cstddef> was not included). Spelling out sizeof(T) does not assume you can "say" std::size_t. – Andy Prowl – 2013-12-27T23:15:41.440

5@H2CO3: And don't forget the std::type_info situation. And the std::initializer_list... – rodrigo – 2013-12-27T23:16:51.600

@rodrigo I always tell people that there are a couple of very good reasons I don't like C++... – None – 2013-12-27T23:17:29.577

1@H2CO3: I see what you mean, but then std::is_class should become an operator (e.g. is_class()). Not sure if I would like that. It would also mean I can't treat is_class as a class template if I need it (e.g. in generic code) - not saying it necessarily makes sense, might be a bad example. – Andy Prowl – 2013-12-27T23:17:49.230

4@H2CO3: Separating the language core from the library is a pointless and arbitrary goal. What do you gain? C++ is a language comprised of syntax, semantics, and libraries, just like any other language. – GManNickG – 2013-12-28T00:12:09.517

2@GManNickG I am not willing to argue about this further. "What do you gain" -- for example, interchangeability between compilers and libraries. (You seem to be an experienced C++ programmer. I am surprised and sad that you find this "arbitrary" and "pointless"...) – None – 2013-12-28T00:17:57.590

5@H2CO3: That idea went down with varargs in C, about 3 decades ago. WG21 never assumed the two could be entirely separate. – MSalters – 2013-12-28T02:01:55.077

@MSalters Yeah, varargs in C are also guilty of blurring the borders. Horrible. One of the worst parts of C. – None – 2013-12-28T02:14:38.680

1@H2CO3 A problem with operators like is_class is that functions called is_class now fail to compile. If the only supported way to access it is via an include followed by using it, you get a pretty name and it doesn't break existing code (outside of extreme cases). You could go with a __is_class operator (ie, use reserved tokens), but those look ugly. – Yakk - Adam Nevraumont – 2013-12-28T03:51:48.043

5

As I understand it, this standard passage requires the following expression:

typeid(sizeof(0)) == typeid(std::size_t)

will always yield true. Whether you use the actual identifier std::size_t, ::size_t or any other alias/typedef is irrelevant, as long as the identity of the type, as per std::type_info::operator==(), is preserved.

The same type identity issue appears in other places of the language. For example, on my 64-bit machine the following code fails to compile because of function redefinition:

#include <cstddef>
void foo(std::size_t x)
{}

void foo(unsigned long x)
{}
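Whether those two overloads collide depends on which type the implementation picks. A rough sketch of how to probe that at compile time follows; the helper name matching_candidates is hypothetical, and the choice of three candidate types is an assumption, since an implementation could in principle use some other unsigned type:

```cpp
#include <cstddef>      // std::size_t
#include <type_traits>  // std::is_same

// Counts how many of three candidate unsigned types are the very same type
// as std::size_t. "matching_candidates" is a hypothetical helper name.
constexpr int matching_candidates() {
    return (std::is_same<std::size_t, unsigned int>::value ? 1 : 0)
         + (std::is_same<std::size_t, unsigned long>::value ? 1 : 0)
         + (std::is_same<std::size_t, unsigned long long>::value ? 1 : 0);
}

// The three candidates are distinct types, so at most one of them can match.
static_assert(matching_candidates() <= 1,
              "std::size_t is a single type, whatever its name");
```

On an implementation where the unsigned long test is the one that matches, foo(std::size_t) and foo(unsigned long) declare the same function, which is why the second definition is rejected.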

rodrigo

Posted 2013-12-27T22:51:52.273

Reputation: 62 405

So it should say "is a constant of type std::size_t (or the same type given by one of its other names, if that name is not visible to the program, just saying)"? – Lightness Races in Orbit – 2013-12-27T23:05:16.253

@LightnessRacesinOrbit: But all this wording is unnecessary. If I write typedef int number; number x;, then x is of type int, and it is of type number, as they are the very same. Two names, one type. – rodrigo – 2013-12-27T23:08:25.093

It just unnerves me that this is left to the imagination. – Lightness Races in Orbit – 2013-12-27T23:09:26.113

3@LightnessRacesinOrbit: But then, you'd have to do the same everywhere a specific type is required. Am I allowed to write typedef int number; number main() {}? If so, should 3.6.1 say "It [main] shall have a return type of type int (or the same type given by one of its other names)"? – rodrigo – 2013-12-27T23:14:28.743

That is a very good point. So the answer then is "live with it — it would be boring as heck if the standard didn't take these lexical 'shortcuts'!"? – Lightness Races in Orbit – 2013-12-27T23:18:13.417

@LightnessRacesinOrbit: Yeah, I don't think you'll get enough momentum to convince the committee to change this. But for your peace of mind, read 5.7.6: "The type of the result [pointer subtraction] is an implementation-defined signed integral type; this type shall be the same type that is defined as std::ptrdiff_t in the <cstddef> header". And later, when pointers to the same object are subtracted, "the result compares equal to the value 0 converted to the type std::ptrdiff_t". – rodrigo – 2013-12-27T23:20:54.697

So there's a precedent! Now I'm even more keen for a canonical answer on this :) – Lightness Races in Orbit – 2013-12-27T23:22:31.180

1@LightnessRacesinOrbit: Well, std::ptrdiff_t is only named in 5.7.6/7, but std::size_t is all over the document. Particularly interesting is a note in 3.7.4: "a new-expression, delete-expression or function call that refers to one of these functions [operator new/delete] without including the header <new> is well-formed. However, referring to std or std::size_t is ill-formed unless the name has been declared by including the appropriate header". – rodrigo – 2013-12-27T23:28:11.527

5

Yes.

The type yielded by sizeof is some unsigned integer type; the implementation defines which one it is.

For example, on some particular implementation, the type of a sizeof expression might be unsigned long.

std::size_t, if it's a typedef, is nothing more than an alternative name for unsigned long. So these two statements:

The type of sizeof ... is a constant of type unsigned long

and

The type of sizeof ... is a constant of type std::size_t

are saying exactly the same thing for that implementation. The type unsigned long and the type std::size_t are the same type. The difference is that the latter is accurate for all (conforming) implementations, where std::size_t might be an alias for, say, unsigned int or some other unsigned type.

As far as the compiler is concerned, sizeof yields a result of type unsigned long; the compiler (as opposed to the runtime library) needn't have any knowledge of the name size_t.
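The properties claimed above can be checked without ever writing the name size_t. This is only a sketch: size_result and bytes_of_char are hypothetical names for this example, and <type_traits> is used purely for the checks (it may itself pull in <cstddef> on some implementations, which does not affect the point):

```cpp
#include <type_traits>  // std::is_integral, std::is_unsigned

// The type the compiler gives to sizeof, named only through decltype.
// "size_result" is a hypothetical alias chosen for this example.
typedef decltype(sizeof(0)) size_result;

// Whatever that type is on this implementation, it is an unsigned integer type.
static_assert(std::is_integral<size_result>::value,
              "sizeof yields an integer type");
static_assert(std::is_unsigned<size_result>::value,
              "sizeof yields an unsigned type");

// The result is usable in constant expressions; sizeof(char) is 1 by definition.
constexpr size_result bytes_of_char = sizeof(char);
static_assert(bytes_of_char == 1, "sizeof(char) is always 1");
```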

This all assumes that std::size_t (or just size_t if you're talking about C) is a typedef. That's not spelled out in either the C or the C++ standard. Nevertheless, an implementation can straightforwardly conform to the requirements of the standard by making size_t a typedef. I don't believe there's any other portable way to satisfy those requirements. (It can't be a macro or an implementation-defined keyword because that would infringe on the user's name space, and a macro wouldn't be scoped within the std namespace.) A compiler could make size_t some implementation-specific construct other than a typedef, but since a typedef works perfectly well, there's no point in doing so. It would be nice, IMHO, if the standard stated that size_t is a typedef.

(An irrelevant aside: The real problem is that the standard refers to the result as a "constant". In ISO C, a "constant" is a token, such as an integer literal. C++, as far as I know, doesn't define the noun "constant", but it does refer to the ISO C definition of the term. sizeof ... is a constant expression; it's not a constant. Calling the result a "constant value" would have been reasonable.)

Keith Thompson

Posted 2013-12-27T22:51:52.273

Reputation: 188 449

This all seems to rely on std::size_t being a typedef. – Lightness Races in Orbit – 2013-12-27T23:38:16.463

@LightnessRacesinOrbit: What else could it be? – Keith Thompson – 2013-12-27T23:39:40.343

Nothing I suppose. Would still prefer it spelt out. – Lightness Races in Orbit – 2013-12-27T23:49:09.837

@LightnessRacesinOrbit: Agreed. I've updated my answer. – Keith Thompson – 2013-12-27T23:56:20.957

2

It is the same type but you have to include that header to use it.

user2030677

Posted 2013-12-27T22:51:52.273

Reputation: 1 480

I just used it in the variable x, without any trouble, without including the header. – Lightness Races in Orbit – 2013-12-27T22:56:37.090