[metapost] [mp-implementors] MetaPost 1.750 (alpha version)

Nelson H. F. Beebe beebe at math.utah.edu
Fri Jun 24 00:09:09 CEST 2011

Hans Hagen responds to a comment earlier today:

>> ...
>> On 23-6-2011 9:41, Troy Henderson wrote:
>> > Is there a built-in way with MetaPost 1.750 and with --math=double to
>> > "round" a number by using only a certain number of significant digits?
>> >
>> > For example, I'm getting an output of
>> >
>> > 7.999999999999998
>> >
>> > which is "begging" to be 8.
>> So maybe we need a round(n,precision) ?
>> ...
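
Pending a built-in, the effect of a round(n,precision) that works on
significant digits is easy to sketch outside MetaPost.  Here is a
minimal Python version; the name round_sig and its interface are
hypothetical, a sketch of the semantics under discussion rather than
any existing primitive:

```python
import math

def round_sig(x, ndigits):
    """Round x to ndigits significant decimal digits.

    Hypothetical sketch of a round(n,precision) primitive; the result
    is still a binary double, so later printing may reintroduce digits.
    """
    if x == 0:
        return 0.0
    # Position of the leading significant digit of |x|.
    exponent = math.floor(math.log10(abs(x)))
    return round(x, ndigits - 1 - exponent)

print(round_sig(7.999999999999998, 15))  # the value from the thread
```

With 15 significant digits the stray low-order bits vanish and the
value prints as 8.0, which is exactly what the original question asks
for.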

On the subject of number conversion, please note that Don Knuth
thought hard about that problem for TeX and Metafont, and wrote a
now-famous paper on it:

@InCollection{Knuth:1990:SPW,
  author =       "Donald E. Knuth",
  title =        "A Simple Program Whose Proof Isn't",
  crossref =     "Feijen:1990:BOB",
  chapter =      "27",
  pages =        "233--242",
  year =         "1990",
  bibdate =      "Mon Feb 03 07:07:55 2003",
  note =         "This paper discusses the algorithm used in {\TeX} for
                 converting between decimal and scaled fixed-point
                 binary values, and for guaranteeing a minimum number of
                 digits in the decimal representation. See also
                 \cite{Clinger:1990:HRF,Clinger:2004:RHR} for decimal to
                 binary conversion,
                 \cite{Steele:1990:HPF,Steele:2004:RHP} for binary to
                 decimal conversion, and \cite{Gries:1990:BDO} for an
                 alternate proof of Knuth's algorithm.",
  acknowledgement = ack-nhfb,
  keywords =     "decimal floating-point arithmetic",
}

It took him about a decade to prove (to his satisfaction) the short
algorithms that he used for input and output conversions in TeX and
Metafont, which is why the title reads as it does.
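
For concreteness, the output half of that algorithm (TeX's
print_scaled procedure) can be transliterated almost line for line.
The Python sketch below follows the Pascal source of TeX; "scaled"
here means a nonnegative 16.16 fixed-point integer in units of
1/65536:

```python
UNITY = 65536  # 2^16, TeX's "unity": one unit in the last place is 2^-16

def print_scaled(s):
    """Decimal string for a nonnegative scaled (16.16 fixed-point)
    value, following TeX's print_scaled procedure."""
    out = str(s // UNITY) + "."
    s = 10 * (s % UNITY) + 5  # first fractional digit plus rounding fudge
    delta = 10
    while True:
        if delta > UNITY:
            s += 32768 - 50000  # round the last digit printed
        out += chr(ord("0") + s // UNITY)
        s = 10 * (s % UNITY)
        delta *= 10
        if s <= delta:
            break
    return out

print(print_scaled(21845))  # the scaled value closest to 1/3
```

Knuth proves that this emits the fewest fractional digits that still
identify the scaled value exactly; print_scaled(21845) yields 0.33333,
the same string TeX prints for 1/3 of a point.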

There is a related paper, with another proof, in the same book volume:

@InCollection{Gries:1990:BDO,
  author =       "David Gries",
  title =        "Binary to Decimal, One More Time",
  crossref =     "Feijen:1990:BOB",
  chapter =      "16",
  pages =        "141--148",
  year =         "1990",
  bibdate =      "Sat Sep 03 09:41:32 1994",
  note =         "This paper presents an alternate proof of Knuth's
                 algorithm \cite{Knuth:1990:SPW} for conversion between
                 decimal and fixed-point binary numbers.",
  acknowledgement = ack-nhfb,
  keywords =     "decimal floating-point arithmetic",
}

@String{pub-SV                  = "Spring{\-}er-Ver{\-}lag"}
@String{pub-SV:adr              = "Berlin, Germany~/ Heidelberg,
                                  Germany~/ London, UK~/ etc."}

@Book{Feijen:1990:BOB,
  editor =       "W. H. J. Feijen and A. J. M. {van Gasteren} and D.
                 Gries and J. Misra",
  booktitle =    "Beauty is our business: a birthday salute to {Edsger
                 W. Dijkstra}",
  title =        "Beauty is our business: a birthday salute to {Edsger
                 W. Dijkstra}",
  publisher =    pub-SV,
  address =      pub-SV:adr,
  pages =        "xix + 453",
  year =         "1990",
  ISBN =         "0-387-97299-4",
  ISBN-13 =      "978-0-387-97299-2",
  LCCN =         "QA76 .B326 1990",
  bibdate =      "Thu Mar 24 09:27:40 1994",
  acknowledgement = ack-nhfb,
}

Don's solution works for the two arithmetic choices that he made for
TeX and Metafont; extending it to other arithmetic systems, such as
the IEEE 754 64-bit type suggested by the --math=double option, is
distinctly nontrivial.  It is addressed by David Gay in freely
available software that MIGHT be usable with MetaPost:

@TechReport{Gay:1990:CRB,
  author =       "David M. Gay",
  title =        "Correctly Rounded Binary-Decimal and Decimal-Binary
                 Conversions",
  type =         "Numerical Analysis Manuscript",
  number =       "90-10",
  institution =  "AT\&T Bell Laboratories",
  pages =        "16",
  month =        nov # " 30",
  year =         "1990",
  bibdate =      "Sat Apr 28 18:42:55 2001",
  bibsource =    "ftp://garbo.uwasa.fi/pc/doc-soft/fpbibl18.zip",
  URL =          "http://cm.bell-labs.com/cm/cs/doc/90/4-10.ps.gz",
  acknowledgement = ack-nj,
  keywords =     "decimal floating-point arithmetic",
}
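
Descendants of Gay's dtoa.c and strtod.c are what many C libraries use
today, so a correctly rounded decimal-to-binary conversion is usually
already at hand.  A quick way to inspect the resulting nearest double,
sketched here in Python (whose float() performs a correctly rounded
conversion):

```python
from decimal import Decimal

x = float("0.1")   # correctly rounded to the nearest IEEE 754 double
print(Decimal(x))  # the exact decimal value of that double
```

The exact value printed,
0.1000000000000000055511151231257827021181583404541015625, shows why
short decimal inputs acquire long decimal tails once converted to
binary.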

For more on the base-conversion problem, see these 14-year
retrospectives:

@String{j-SIGPLAN               = "ACM SIGPLAN Notices"}

@Article{Clinger:2004:RHR,
  author =       "William D. Clinger",
  title =        "Retrospective: How to read floating point numbers
                 accurately",
  journal =      j-SIGPLAN,
  volume =       "39",
  number =       "4",
  pages =        "360--371",
  month =        apr,
  year =         "2004",
  CODEN =        "SINODQ",
  DOI =          "http://doi.acm.org/10.1145/989393.989430",
  ISSN =         "0362-1340",
  bibdate =      "Wed May 26 06:21:19 2004",
  note =         "Best of PLDI 1979--1999. Reprint of, and retrospective
                 on, \cite{Clinger:1990:HRF}.",
  abstract =     "Converting decimal scientific notation into binary
                 floating point is nontrivial, but this conversion can
                 be performed with the best possible accuracy without
                 sacrificing efficiency. Consider the problem of
                 converting decimal scientific notation for a number
                 into the best binary floating point approximation to
                 that number, for some fixed precision. This problem
                 cannot be solved using arithmetic of any fixed
                 precision. Hence the IEEE Standard for Binary
                 Floating-Point Arithmetic does not require the result
                 of such a conversion to be the best approximation. This
                 paper presents an efficient algorithm that always finds
                 the best approximation. The algorithm uses a few extra
                 bits of precision to compute an IEEE-conforming
                 approximation while testing an intermediate result to
                 determine whether the approximation could be other than
                 the best. If the approximation might not be the best,
                 then the best approximation is determined by a few
                 simple operations on multiple-precision integers, where
                 the precision is determined by the input. When using 64
                 bits of precision to compute IEEE double precision
                 results, the algorithm avoids higher-precision
                 arithmetic over 99\% of the time. The input problem
                 considered by this paper is the inverse of an output
                 problem considered by Steele and White: Given a binary
                 floating point number, print a correctly rounded
                 decimal representation of it using the smallest number
                 of digits that will allow the number to be read without
                 loss of accuracy. The Steele and White algorithm
                 assumes that the input problem is solved; an imperfect
                 solution to the input problem, as allowed by the IEEE
                 standard and ubiquitous in current practice, defeats
                 the purpose of their algorithm.",
  acknowledgement = ack-nhfb,
}

@Article{Steele:2004:RHP,
  author =       "Guy L. {Steele Jr.} and Jon L. White",
  title =        "Retrospective: How to Print Floating-Point Numbers
                 Accurately",
  journal =      j-SIGPLAN,
  volume =       "39",
  number =       "4",
  pages =        "372--389",
  month =        apr,
  year =         "2004",
  CODEN =        "SINODQ",
  DOI =          "http://doi.acm.org/10.1145/989393.989431",
  ISSN =         "0362-1340",
  bibdate =      "Tue Jun 15 10:00:43 2004",
  note =         "Best of PLDI 1979--1999. Reprint of, and retrospective
                 on, \cite{Steele:1990:HPF}.",
  abstract =     "We present algorithms for accurately converting
                 floating-point numbers to decimal representation. The
                 key idea is to carry along with the computation an
                 explicit representation of the required rounding
                 accuracy. We begin with the simpler problem of
                 converting fixed-point fractions. A modification of the
                 well-known algorithm for radix-conversion of
                 fixed-point fractions by multiplication explicitly
                 determines when to terminate the conversion process; a
                 variable number of digits are produced. The algorithm
                 has these properties: (*) No information is lost; the
                 original fraction can be recovered from the output by
                 rounding. (*) No ``garbage digits'' are produced. (*)
                 The output is correctly rounded. (*) It is never
                 necessary to propagate carries on rounding. We then
                 derive two algorithms for free-format out-put of
                 floating-point numbers. The first simply scales the
                 given floating-point number to an appropriate
                 fractional range and then applies the algorithm for
                 fractions. This is quite fast and simple to code but
                 has inaccuracies stemming from round-off errors and
                 oversimplification. The second algorithm guarantees
                 mathematical accuracy by using multiple-precision
                 integer arithmetic and handling special cases. Both
                 algorithms produce no more digits than necessary
                 (intuitively, the ``1.3 prints as 1.2999999'' problem
                 does not occur). Finally, we modify the free-format
                 conversion algorithm for use in fixed-format
                 applications. Information may be lost if the fixed
                 format provides too few digit positions, but the output
                 is always correctly rounded. On the other hand, no
                 ``garbage digits'' are ever produced, even if the fixed
                 format specifies too many digit positions (intuitively,
                 the ``4/3 prints as 1.333333328366279602'' problem does
                 not occur).",
  acknowledgement = ack-nhfb,
}
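
The Steele--White guarantee (the shortest digit string that reads back
to the same value) is what Python's repr has provided since version
3.1, so the behavior the thread complains about can be examined
directly; a small sketch:

```python
x = 7.999999999999998       # the value from the thread
assert float(repr(x)) == x  # the round trip is exact
print(repr(x))   # shortest identifying string; not "8.0", because
                 # x really is a different double than 8.0
print(x == 8.0)
```

Shortest-digits printing will never turn this value into 8: the two
are distinct doubles, so only an explicit precision-reducing round (as
suggested earlier in the thread) can merge them.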

That entry, and thousands more, can be found in the floating-point
bibliography at


See particularly these entries:


Here are some little functions in hoc that can be used to figure out
how many digits are needed for correct round-trip conversion; they are
based on proofs in the first two papers in that list of seven:

	### goldberg(ndecdig): return number of bits needed to ensure correct
	### round-trip conversion between binary and ndecdig decimal digits
	func goldberg(ndecdig) { return ceil(ndecdig/log10(2) + 1) }

	### goldbergb(b,ndecdig): return number of base-b digits needed to
	### ensure correct round-trip conversion between base b and ndecdig
	### decimal digits
	func goldbergb(b, ndecdig) {
		if (b == 10) return ndecdig
		x = 1 + ndecdig / log10(b)
		y = ceil(x)
		if (x == y) return (y + 1) else return y
	}

	### matula(nbits): return number of decimal digits needed to ensure
	### correct round-trip conversion between binary and decimal
	func matula(nbits) { return ceil(nbits/log2(10) + 1) }

	### matulab(b,nbaseb): return number of decimal digits needed to ensure
	### correct round-trip conversion between base-b and decimal
	func matulab(b, nbaseb) {
		if (b == 10) return nbaseb
		x = 1 + nbaseb / logbase(b, 10)
		y = ceil(x)
		if (x == y) return (y + 1) else return y
	}
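
The two binary-specific functions translate directly into any language
with ceil and logarithms; here is the same computation, restated in
Python:

```python
import math

def goldberg(ndecdig):
    """Bits needed so that ndecdig significant decimal digits survive
    a decimal -> binary -> decimal round trip."""
    return math.ceil(ndecdig / math.log10(2) + 1)

def matula(nbits):
    """Decimal digits needed so that an nbits binary significand
    survives a binary -> decimal -> binary round trip."""
    return math.ceil(nbits / math.log2(10) + 1)

print(matula(24), matula(53))  # IEEE 754 single and double significands
```

For IEEE 754 arithmetic these give matula(24) = 9 and matula(53) = 17,
the familiar digit counts for single and double precision, and
goldberg(15) = 51 bits for fifteen-digit decimal input.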

- Nelson H. F. Beebe                    Tel: +1 801 581 5254                  -
- University of Utah                    FAX: +1 801 581 4148                  -
- Department of Mathematics, 110 LCB    Internet e-mail: beebe at math.utah.edu  -
- 155 S 1400 E RM 233                       beebe at acm.org  beebe at computer.org -
- Salt Lake City, UT 84112-0090, USA    URL: http://www.math.utah.edu/~beebe/ -
