# Re: low slots'

• To: "Nelson H. F. Beebe" <beebe@math.utah.edu>
• Subject: Re: low slots'
• From: Hans Aberg <haberg@matematik.su.se>
• Date: Mon, 6 Oct 1997 16:22:13 +0200
• Cc: math-font-discuss@cogs.susx.ac.uk

At 07:04 -0600 97/10/06, Nelson H. F. Beebe wrote:
>Hans Aberg writes
>>> there is nothing in the C standard demanding strings to be null
>>>terminated...
>
>That is incorrect.  Section 2.2.1 of ANSI X3.159-1989 on p 11 says:
>
>	A byte with all bits set to 0, called the {\em null character},
>	shall exist in the basic execution character set; it is used
>	to terminate a character string literal.

Strictly speaking, there are no strings in C, only types such as char*
that can be used to point to character data. That data may be
null-terminated (commonly called "C-strings") or not (the case handled by
the "memory operations"). The standard library has routines such as
strcpy() that support null-terminated strings, but it also has routines
such as memcpy() for handling data that is not null-terminated.
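
To make the distinction concrete, here is a minimal sketch of my own (the
names data, DATA_LEN, cstring_length and copy_all_bytes are made up for
illustration). It assumes nothing beyond <string.h>: strlen() stops at the
first zero byte, while memcpy() and memcmp() work on an explicit count.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Six bytes of data with a zero byte in the middle; the compiler also
   appends one more zero byte at the end of the literal. */
static const char data[] = "abc\0de";
enum { DATA_LEN = sizeof data - 1 };   /* 6: the embedded zero counts */

/* Length as the str*() routines see it: ends at the first zero byte. */
size_t cstring_length(void)
{
    return strlen(data);
}

/* Copy all DATA_LEN bytes, embedded zero included, into dst (which must
   have room for at least DATA_LEN bytes). Returns 1 if the copy matches
   the original byte for byte, 0 otherwise. */
int copy_all_bytes(char *dst)
{
    assert(dst != NULL);
    memcpy(dst, data, DATA_LEN);
    return memcmp(dst, data, DATA_LEN) == 0;
}
```

With this data, cstring_length() reports 3 although six bytes are stored,
whereas the counted copy preserves "de" after the zero byte.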

>Significant portions of the C library, and virtually every non-trivial
>C program in existence, depend on this property.

So, if you program in C, there is nothing in the C standard forcing you
to use C-strings. In fact, if you do not want to use the null-terminated
C-strings, an easy way around them is to write a C++ class that expands to
the C library memory operations; this is what I did. (But I think the new
C++ library <http://www.cygnus.com/misc/wp/index.html> has some string
classes in it.)
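
The same idea can be sketched in plain C with a struct that carries its
own length. This is only an illustration under my own assumptions, not the
class mentioned above; the names counted_str, cs_init, cs_equal and cs_free
are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counted string: the length is stored, not inferred from
   a terminator, so any byte value (including 0) may appear in it. */
typedef struct {
    size_t len;
    char  *bytes;
} counted_str;

/* Build a counted string from n bytes at src.
   Returns 0 on success, -1 if memory could not be allocated. */
int cs_init(counted_str *s, const void *src, size_t n)
{
    assert(s != NULL && src != NULL);
    s->bytes = malloc(n ? n : 1);      /* avoid a zero-size request */
    if (s->bytes == NULL)
        return -1;
    memcpy(s->bytes, src, n);
    s->len = n;
    return 0;
}

/* Equality compares the lengths first, then the bytes via memcmp(). */
int cs_equal(const counted_str *a, const counted_str *b)
{
    return a->len == b->len && memcmp(a->bytes, b->bytes, a->len) == 0;
}

/* Release the storage and leave the struct in an empty state. */
void cs_free(counted_str *s)
{
    free(s->bytes);
    s->bytes = NULL;
    s->len = 0;
}
```

Two values built from the three bytes "x\0y" and "x\0z" compare unequal
with cs_equal(), although strcmp() would see both as the one-character
string "x".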

Of course, people who start programming in C often think they have to use
the C library routines for null-terminated strings, which is why a lot of
programs use them. Sloppily written C software can also mishandle the
byte 0xff ( = -1 as a C char, where char is signed), because -1 is also
used to indicate end-of-file; but the end-of-file value -1 is an int, equal
to 0xffffffff on a 32-bit machine. So it is possible to get around that in
C alone, but I prefer using the C++ IOstreams library, which is much better
in such respects.
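
A small sketch of the end-of-file point, again with names of my own
choosing (count_bytes, looks_like_eof_as_char). It assumes the usual case
where EOF is -1; the second function gives a platform-dependent answer,
since whether char is signed is implementation-defined.

```c
#include <assert.h>
#include <stdio.h>

/* Count the bytes in a stream. The result of getc() is kept in an int,
   so the byte 0xff (returned as 255) stays distinct from EOF. */
long count_bytes(FILE *fp)
{
    long n = 0;
    int c;                             /* int, not char */

    assert(fp != NULL);
    while ((c = getc(fp)) != EOF)
        n++;
    return n;
}

/* What the sloppy version effectively tests: once the value has been
   narrowed to char, does it compare equal to EOF? For 0xff this is
   typically true where char is signed and EOF is -1. */
int looks_like_eof_as_char(int byte)
{
    char c = (char)byte;
    return c == EOF;
}
```

Keeping the return value of getc() in an int is what lets count_bytes()
read past a 0xff byte instead of stopping there.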

I know, for example, that the computer language Haskell
<http://haskell.org/> uses strings formed from 2-byte Unicode chars, and
its highly non-trivial implementations use C; so, just because you are
using C, you do not have to use the null-terminated C-strings.

Hans Aberg
* AMS member: Listing <http://www.ams.org/cml/>
* Email: Hans Aberg <haberg@member.ams.org>