
Re: `low slots'



At 07:04 -0600 97/10/06, Nelson H. F. Beebe wrote:
>Hans Aberg writes
>>> there is nothing in the C standard demanding strings to be null
>>>terminated...
>
>That is incorrect.  Section 2.2.1 of ANSI X3.159-1989 on p 11 says:
>
>	A byte with all bits set to 0, called the {\em null character},
>	shall exist in the basic execution character set; it is used
>	to terminate a character string literal.

  Strictly speaking, C has no built-in string type, only types such as
char* that can be used to point to character data. That data may be
null-terminated (commonly called "C-strings"), or it may not be, in which
case it is handled with an explicit length ("memory operations"). The
standard library has routines such as strcpy() that support
null-terminated strings, but it also has routines such as memcpy() for
handling data that is not null-terminated.
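
  To make the distinction concrete, here is a minimal sketch of a
length-counted buffer in plain C. The names counted_buf, counted_set and
counted_equal are hypothetical, invented for this illustration; only
memcpy(), memcmp() and assert() come from the standard library. The
fixed capacity of 16 bytes is likewise just an assumption to keep the
example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical counted-buffer type: the length is stored explicitly,
   so a 0 byte is ordinary data rather than a terminator. */
struct counted_buf {
    unsigned char data[16];
    size_t len;
};

/* Copy n bytes into the buffer with memcpy(); no terminator is needed. */
void counted_set(struct counted_buf *b, const void *src, size_t n)
{
    assert(n <= sizeof b->data);   /* caller must respect the capacity */
    memcpy(b->data, src, n);
    b->len = n;
}

/* Compare by length and content with memcmp(), which does not stop
   at an embedded 0 byte the way strcmp() would. */
int counted_equal(const struct counted_buf *a, const struct counted_buf *b)
{
    return a->len == b->len && memcmp(a->data, b->data, a->len) == 0;
}
```

  With this, the five bytes of "ab\0cd" can be stored and compared as a
whole, whereas strlen() and strcmp() applied to the same bytes would see
only "ab".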

>Significant portions of the C library, and virtually every non-trivial
>C program in existence, depend on this property.

   So, if you program in C, there is nothing in the C standard forcing you
to use C-strings. In fact, if you do not want to use null-terminated
C-strings, an easy way around them is to write a C++ class that expands to
the C library memory operations; this is what I did. (But I think the new
C++ library <http://www.cygnus.com/misc/wp/index.html> has some string
classes in it.)

  Of course, people who start programming in C often think they have to
use the C library routines for null-terminated strings, which is why a lot
of programs use them. Apart from that, sloppily written C software can also
mishandle the byte 0xff ( = -1 as a signed C char), because -1 is also the
usual value of the end-of-file indicator EOF; but EOF is an int, equal to
0xffffffff where int is 32 bits wide, so it stays distinct from every byte
value as long as the result of the read is kept in an int. So it is
possible to get around that in C alone, but I prefer using the C++
IOstreams library, which is much better in such respects.
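
  The 0xff problem can be sketched as follows. The function names
count_bytes and count_bytes_sloppy are hypothetical, made up for this
example; getc() and EOF are standard. The sloppy version uses signed char
explicitly, and I am assuming the usual behaviour in which converting 255
to signed char gives -1 and EOF is -1; the C standard leaves both details
to the implementation.

```c
#include <assert.h>
#include <stdio.h>

/* Count the bytes in a stream. The result of getc() is kept in an int,
   so the data byte 0xff (returned as 255) stays distinct from EOF. */
long count_bytes(FILE *fp)
{
    long n = 0;
    int c;                         /* int, not char */

    assert(fp != NULL);
    while ((c = getc(fp)) != EOF)
        n++;
    return n;
}

/* Sloppy variant: the result is squeezed into a signed char first.
   On typical implementations the byte 0xff then becomes -1, compares
   equal to EOF, and ends the loop early. */
long count_bytes_sloppy(FILE *fp)
{
    long n = 0;
    signed char c;                 /* the bug: too narrow to hold EOF */

    assert(fp != NULL);
    while ((c = (signed char)getc(fp)) != EOF)
        n++;
    return n;
}
```

  On a stream holding the three bytes 'a', 0xff, 'b', the first function
reads all of them, while the second should stop at the 0xff byte under
the assumptions above.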

  I know, for example, that the computer language Haskell
<http://haskell.org/> uses strings formed from 2-byte Unicode chars, and
its highly non-trivial implementations use C; so, just because you are
using C, you do not have to use null-terminated C-strings.

  Hans Aberg
                  * AMS member: Listing <http://www.ams.org/cml/>
                  * Email: Hans Aberg <haberg@member.ams.org>