Date: Sat, 5 Nov 94 08:19:06 JST
From: Stephen Turnbull
To: nummedal AT pvv DOT unit DOT no
Cc: djgpp AT sun DOT soe DOT clarkson DOT edu
Subject: NULL pointers...

Dag Nummedal writes:

   The point was to get GCC to change its internal representation of
   the token 0 when used in a pointer context.  That is,

	int *p = 0;

   would in assembly make p point to a high memory location, not to
   memory location zero.  This way only badly broken C code, which
   assumes that

	int i = 0;

   can be cast to a null pointer, would fail.

I don't have access to a "modern" C-specific reference at the moment,
but according to Stroustrup, "The C++ Programming Language" 1e (1986):

   A pointer may be explicitly converted to any of the integral types
   large enough to hold it.  ...  The mapping function is also machine
   dependent....

   An object of integral type may be explicitly converted to a
   pointer.  The mapping always carries an integer converted from a
   pointer back to the same pointer, but is otherwise machine
   dependent.

Something like these rules has to be enforced, or systems programming
(specifically, memory-mapped device drivers) can't be done in C.

So I guess that code which searches for a video buffer (eg) and
returns an integer value which is the absolute address of that buffer
(hardware addresses are *not* C pointers) must therefore return

	(int) (void *) 0

to indicate "not found".  (A pretty good trick if that code was
written in another language.)  Code checking for it does something
like

	switch (address)
	  {
	  case (int) (void *) 0:	/* didn't find a video buffer */
	    break;
	  /* etc */

Otherwise it's "badly broken."  What you're saying, then, is that 0 is
0 except when it's the inverse of (void *) 0.  A programming language
is supposed to make life easier for humans, not for compilers and
automatic code verifiers.

(It occurs to me that the semantics of a cast are not the same as the
semantics of initialization, and therefore the above code doesn't
necessarily work, either.
Maybe it could be made to do so, but if not we need to define an
otherwise unused variable to do this:

	void * const null_pointer = 0;

	(int) null_pointer

gaaaakkk.  This may be unbroken code, but it looks like a pretty
broken language to me.)

I agree that for portability's sake, the above is necessary.  I can
imagine that there could be machines where it makes sense for some
reason to put NULL somewhere other than address 0.  But the
convenience of having 0 mean 0 in the context of the argument of the
conversion (type *) 0 seems overwhelming to me.  If you don't like
this, then define a new keyword (NULL would be a good candidate,
except lots of code seems to use it to mean "false") which means what
we currently call a NULL pointer, and disallow the use of *any*
integral value to mean NULL pointer.

I understand the logic of saying that in a program "0" is just a
token, and the internal representation can be whatever you want.  But
I think it's an unfair burden to place on everyday programming: "is
this an integer, or is it the special token used to represent null
pointers in initialization and comparison?  oh yeah, it is."

   The problem with this is that probably most of libc and go32 (at
   least the assembly parts) would assume that the null pointer points
   to memory location zero.

   If implemented, this could catch a lot of bad code, but it would
   also make porting programs to DJGPP harder.

I don't see why this catches bad code better than protecting page
zero.  Unless you're planning on protecting both, and logging the
names of any programmers whose code checks for (int) p == 0 where p is
a T*.

						--Steve