[tex-k] A bug? Maybe just a terrible inconvenience to work around...

Stefan Ulrich stefan.ulrich@dsl.pipex.com
Tue, 17 Sep 2002 00:34:37 +0100


Martin Schroeder <martin@oneiros.de> writes:

> AFAIK despite the common usage, "~" is not the valid symbol in
> URIs; it must be encoded. The relevant rfc or the html-spec
> should tell you how. :-)

I beg to differ; RFC 2396 (which updates RFC 1738, which used to
recommend escaping the ~) says:


2.3. Unreserved Characters

   Data characters that are allowed in a URI but do not have a reserved
   purpose are called unreserved.  These include upper and lower case
   letters, decimal digits, and a limited set of punctuation marks and
   symbols.

      unreserved  = alphanum | mark

      mark        = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"

   Unreserved characters can be escaped without changing the semantics
   of the URI, but this should not be done unless the URI is being used
   in a context that does not allow the unescaped character to appear.

[...]
G.2. Modifications from both RFC 1738 and RFC 1808
  [...]
  The tilde "~" character was added to those in the "unreserved" set,
  since it is extensively used on the Internet in spite of the
  difficulty to transcribe it with some keyboards.
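
To make the rule above concrete, here is a minimal Python sketch of an
escaper that follows the "unreserved" set quoted from section 2.3. It is
only an illustration, not anything taken from the RFC or from an existing
tool; the names MARK, UNRESERVED and escape_rfc2396 are made up for this
example, and it assumes the input is a single path segment encoded as
UTF-8 (so "/" gets escaped as well).

```python
import string

# "mark" characters exactly as listed in RFC 2396, section 2.3.
MARK = "-_.!~*'()"

# unreserved = alphanum | mark
UNRESERVED = set(string.ascii_letters + string.digits + MARK)


def escape_rfc2396(text):
    """Percent-escape every byte that is not an unreserved character.

    Hypothetical helper for illustration only: it treats the input as one
    path segment, so reserved characters such as "/" are escaped too.
    """
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if ch in UNRESERVED:
            # Unreserved characters (including "~") pass through unchanged.
            out.append(ch)
        else:
            # Everything else becomes %XX with uppercase hex digits.
            out.append("%{:02X}".format(byte))
    return "".join(out)
```

With this, escape_rfc2396("~stefan") leaves the tilde alone, while
escape_rfc2396("my file") turns the space into %20.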


Best,
Stefan