Thu 13 Apr: TeX Hour: unlatex; Fermi estimate for reprocessing the arXiv: 6:30 to 7:30pm BST

Jonathan Fine jfine2358 at gmail.com
Wed Apr 12 21:30:39 CEST 2023


Hi

Well, next week, on Monday 17 April 1:00-5:00pm Eastern Time, we have the
very first arXiv Access Forum. I'm looking forward to that. I'm most
grateful to the organisers, and I hope they're not overwhelmed. They have a
massive responsibility. As do the esteemed presenters and panelists, and
the many participants.

Tomorrow's TeX Hour (Thursday 6:30 to 7:30pm UK Summer Time) is about my
emerging unlatex tool for reprocessing TeX documents, to provide more
accessible outputs. I'm close to creating in Python an equivalent to TeX's
internal boxes. This involves an interesting parser + builder combination,
linked by a stream of control symbols, constructors and leaf nodes.
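
To make the parser + builder idea concrete, here is a minimal sketch in
Python. It is not unlatex's code or design: the event names (OPEN, CLOSE,
LEAF), the Box class and the toy brace syntax are all hypothetical, chosen
only to show a parser that emits a stream of events and a builder that
turns that stream into nested boxes.

```python
from dataclasses import dataclass, field

# Hypothetical event kinds, for illustration only (not taken from unlatex).
OPEN, CLOSE, LEAF = "open", "close", "leaf"


@dataclass
class Box:
    """A toy stand-in for a TeX box: a kind plus an ordered list of children."""
    kind: str
    children: list = field(default_factory=list)


def parse(text):
    """Toy parser: '{' opens an hbox, '}' closes it, anything else is a leaf."""
    for ch in text:
        if ch == "{":
            yield (OPEN, "hbox")    # constructor: start a new box
        elif ch == "}":
            yield (CLOSE, None)     # control symbol: finish the current box
        else:
            yield (LEAF, ch)        # leaf node: goes into the current box


def build(events):
    """Consume the event stream and assemble nested boxes using a stack."""
    root = Box("vbox")
    stack = [root]
    for kind, value in events:
        if kind == OPEN:
            box = Box(value)
            stack[-1].children.append(box)
            stack.append(box)
        elif kind == CLOSE:
            if len(stack) == 1:
                raise ValueError("close without a matching open")
            stack.pop()
        else:
            stack[-1].children.append(value)
    if len(stack) != 1:
        raise ValueError("box left open at end of input")
    return root


tree = build(parse("a{bc}d"))
print(tree)
```

The parser knows nothing about boxes and the builder knows nothing about
the input syntax; the event stream is the only link between them, so
either side could be replaced on its own.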

Here are some arXiv stats (in round numbers):

Total number of submissions: 2.25 million.
Downloads per month: 25 million.
Seconds in a month: 2.6 million.
Registered for arXiv Access Forum: 2,000 people.

Why seconds in a month? Well, it's approximately equal to the total number
of submissions. So we can make a Fermi estimate as to how long it will take
to reprocess the entire arXiv to get accessible outputs (assuming suitable
software).

Suppose we have a desktop PC with 12 cores and 24 threads, which is worth
about 20 cores doing useful work. On such a machine, if not bottlenecked,
we could do the whole lot in a month provided each item takes only about
20 seconds. The download might take a while, and the electricity would be
about £150 (or $150).
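
The arithmetic behind this estimate can be checked with a few lines of
Python. This is only a sketch using the round numbers above; the function
names are made up for illustration, and the figure of 20 workers is the
assumption stated in the text, not a measurement.

```python
# Round numbers from the stats above.
SUBMISSIONS = 2_250_000
SECONDS_PER_MONTH = 2_600_000
WORKERS = 20  # assumed number of cores doing useful work


def seconds_budget_per_item(submissions, wall_seconds, workers):
    """Processing time available per item if everything must finish in wall_seconds."""
    return wall_seconds * workers / submissions


def months_needed(submissions, seconds_per_item, workers):
    """Wall-clock months to process everything at a given per-item cost."""
    return submissions * seconds_per_item / (workers * SECONDS_PER_MONTH)


budget = seconds_budget_per_item(SUBMISSIONS, SECONDS_PER_MONTH, WORKERS)
months = months_needed(SUBMISSIONS, 20, WORKERS)

print(f"Budget per item for a one-month run: {budget:.1f} seconds")
print(f"Months needed at 20 seconds per item: {months:.2f}")
```

So the one-month budget is a little over 20 seconds per item, and at
exactly 20 seconds per item the run finishes slightly inside a month.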

Harder is to make a Fermi estimate for creating suitable software, and yet
harder is writing and testing the software. Also very important is field
testing of its outputs for accessibility.

Here's the URL for Monday's arXiv forum:
https://accessibility2023.arxiv.org/
The TeX Hour zoom URL:
https://us02web.zoom.us/j/78551255396?pwd=cHdJN0pTTXRlRCtSd1lCTHpuWmNIUT09
The home page for tomorrow's TeX Hour:
https://texhour.github.io/2023/04/13/unlatex-results-prospects/

wishing you happy arXiving

Jonathan

