Chemical structures with plain TeX

Shreevatsa R shreevatsa.public at gmail.com
Fri Jul 5 20:38:45 CEST 2019


On Fri, 5 Jul 2019 at 03:47, Taylor, P <P.Taylor at rhul.ac.uk> wrote:

> As I wrote off-list to Peter :
>
>
> Sometimes I just want to weep.  There can be no doubt, based even on just
> the evidence above, that the Unix operating system is a very powerful tool,
> and the simple fact that one can identify all packages that do not have the
> string "LaTeX" (presumably case-insensitive) in their CTAN path is a clear
> demonstration of that fact.  And yet the entire thing is gibberish.  It
> could be Mayan, for all I know.  I could stare at it for the rest of my
> life and still not have the slightest idea how it works.  Why oh why oh why
> does someone not come up with a command-line interpreter (or as I fear you
> would call it, "a shell") that uses English verbs as its commands and
> English nouns/adjectives/adverbs/etc as its qualifiers ?  How on earth is
> anyone expected to know what "-i -o" implies, especially as what it implies
> is almost certainly a function of the command to which it is applied ?  And
> why can one not apply 2>/dev/null distributively, such that it applies to
> *all* commands in the sequence rather than having to be spelled out in
> full for each.
>
Most of these are a matter of style and cultural preference, I guess. After
all, the Mayans did manage to communicate with each other. :-)
For instance, "wget -O -" could be written as "wget --output-document=-" or
even "wget --output-document=/dev/stdout".
Similarly, "tidy -n -i" could be written as "tidy -numeric -indent".
And "grep -v" as "grep --invert-match".
I think very few people prefer to write the longer versions though.
Given that wget, tidy, lxprintf etc are all separate programs written by
unrelated programmers with their own conventions for specifying options, I
think one is not expected to know what a specific option to a specific
command/program means by simply reading it; it must be looked up in the
corresponding program's manual. (For instance even with "-numeric" in place
of "-n" I wouldn't have guessed that it means "output numeric rather than
named entities".)
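
As a rough illustration of why the short and long spellings are
interchangeable, here is a minimal sketch of how a single program might
declare both forms of the same option. The program name "toygrep" and its
two options are hypothetical, chosen only to mirror "grep -v -i"; this is
not how grep, wget or tidy are actually implemented.

```python
import argparse

# A hypothetical toy program; each real program declares its own options
# in its own way, which is why the manual is the only reliable reference.
parser = argparse.ArgumentParser(prog="toygrep")
parser.add_argument("-v", "--invert-match", action="store_true",
                    help="select lines that do NOT match")
parser.add_argument("-i", "--ignore-case", action="store_true",
                    help="ignore case distinctions")

# The short and the long spellings are parsed into the same settings.
short = parser.parse_args(["-v", "-i"])
long = parser.parse_args(["--invert-match", "--ignore-case"])

print(short == long)
```

The meaning of "-v" or "-i" exists only inside this one parser, so another
program is free to use the same letters for something else entirely.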
Also, "2>/dev/null" could be applied distributively, by enclosing the whole
thing in parentheses and appending "2>/dev/null" to that.
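
The effect of the parentheses can be checked with a small sketch like the
one below. It assumes a POSIX "sh" is available (it will not work with the
Windows command interpreter), and the shell function "warn" is a made-up
helper that just writes its argument to standard error.

```python
import subprocess

# A hypothetical sequence of commands: define a helper that writes to
# standard error, then call it twice.
commands = 'warn() { echo "$1" >&2; }; warn first; warn second'

# Without grouping, the redirection attaches only to the last command.
ungrouped = subprocess.run(commands + " 2>/dev/null", shell=True,
                           capture_output=True, text=True)

# With parentheses, the whole sequence runs in a subshell, and that
# subshell's standard error is redirected once.
grouped = subprocess.run("(" + commands + ") 2>/dev/null", shell=True,
                         capture_output=True, text=True)

print(repr(ungrouped.stderr))  # the message from "warn first" still leaks
print(repr(grouped.stderr))    # nothing reaches standard error
```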

Anyway, here is a python3 script that I guess (because I couldn't install
lxprintf either) is the equivalent of the above; hopefully it is slightly
easier to understand:

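import requests
from bs4 import BeautifulSoup

# Fetch the CTAN topic page that lists the chemistry packages.
chemistry_response = requests.get('https://ctan.org/topic/chemistry')
chemistry_soup = BeautifulSoup(chemistry_response.text, 'html.parser')
for link in chemistry_soup.find_all('a'):
    href = link.get('href')
    # Some <a> tags have no href at all, so guard against None.
    if href and href.startswith('/pkg/'):
        # Fetch each package page and look for its "Sources" table row.
        uri = 'https://ctan.org' + href
        package = BeautifulSoup(requests.get(uri).text, 'html.parser')
        for td in package.find_all('td'):
            if td.text == 'Sources':
                path = td.next_sibling.a.code.text
                # Case-insensitive: keep only paths that do not mention LaTeX.
                if 'latex' not in path.lower():
                    print(path)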

Of course all this does is replace Unix programs written by different
people with Python packages (libraries) written by different people; so it
may not be any better.