making bibtex easy to find - TUG vs IJVSBT for example lol

Mike Marchywka marchywka at hotmail.com
Wed Oct 20 09:58:38 CEST 2021


On Tue, Oct 19, 2021 at 08:33:14PM -0700, Paulo Ney de Souza wrote:
>    On Tue, Oct 19, 2021 at 4:55 PM Mike Marchywka <marchywka at hotmail.com> wrote:
> 
>      Here is a link that turns up on Google Scholar,
>      https://www.ijvsbt.org/index.php/journal/article/download/1386/1058
>      It has a DOI in it but a scraper would be hard pressed to find it. Zotero did
>      not find it and I did not find it.
> 
>    The link is to a PDF file that absolutely does NOT contain any DOI numbers.

What is your final determination on the "doi" below?

>    The scraper I use, which is a half-an-hour job on BeautifulSoup, finds a candidate

Creating a regex or something to find, as you say, candidates is not hard,
but finding the right thing is, well, a big hack.

>    for a DOI in the string:
>            10.21887/ijvsbt.17.1.24
>    but validation marks it as a non-valid DOI on 2021-10-20-03:03:28 UTC.
>    I was wondering what you think is a DOI in this document. Our script thinks
>    there are none and a quick check confirms that.

I did not bother to run it, but the thing that looks like a DOI
(maybe you can find it in the PDF with a regex) is listed in the
citation's BibTeX as the DOI for the document:

doi = {10.21887/ijvsbt.17.1.24},
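
For what it is worth, here is a minimal sketch of the "find candidates" step, assuming the common "10.<registrant>/<suffix>" shape for a DOI. The pattern and the function name find_doi_candidates are illustrative, not taken from TooBib or from Paulo's scraper, and a match only says that a string looks like a DOI, not that it resolves.

```python
import re

# Assumption: candidates follow the common "10.<4-9 digits>/<suffix>" shape.
# Real DOI suffixes can be looser than this, so treat matches as candidates only.
DOI_CANDIDATE = re.compile(r'\b10\.\d{4,9}/[^\s"<>{}]+')


def find_doi_candidates(text):
    """Return unique DOI-shaped substrings in order of first appearance."""
    seen = []
    for match in DOI_CANDIDATE.finditer(text):
        # Drop punctuation that usually belongs to the surrounding sentence.
        candidate = match.group(0).rstrip(".,;)")
        if candidate not in seen:
            seen.append(candidate)
    return seen


if __name__ == "__main__":
    print(find_doi_candidates("doi = {10.21887/ijvsbt.17.1.24},"))
```

Deciding which candidate, if any, is the document's own DOI is the hard part and is not attempted here.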

>    We all know that if you start with a false premise in math you can prove anything
>    you want, so it is essential to start with something that is valid.
> 
>      It turns out, however, that the URL
>      can be modified to find the BibTeX, but this is harder with a local file
>      and no URL info. I got my code to work as another special
>      case. However, it would be nice if there was some simplicity and
>      uniformity to the process, especially for works with no DOI.
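
A rough sketch of that URL modification, assuming the pattern seen in this one example: the PDF lives at .../article/download/<article id>/<file id> and the citation page at .../article/view/<article id> (the pdf_url and citeurl in the record at the end of this message). The function name landing_page_url is hypothetical and this is not the TooBib code.

```python
import re

# Assumption: download URLs end in /article/download/<article_id>/<file_id>
# and the matching citation page is /article/view/<article_id>.
DOWNLOAD_URL = re.compile(
    r'^(?P<base>.*/article)/download/(?P<article_id>\d+)(?:/\d+)?/?$'
)


def landing_page_url(pdf_url):
    """Map a download URL to its article page, or None if the pattern does not fit."""
    match = DOWNLOAD_URL.match(pdf_url)
    if match is None:
        return None
    return f"{match.group('base')}/view/{match.group('article_id')}"


if __name__ == "__main__":
    print(landing_page_url(
        "https://www.ijvsbt.org/index.php/journal/article/download/1386/1058"
    ))
```

With only a local file and no URL there is nothing to feed into this, which is the special-case problem described above.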
> 
>    Our scraper knows the rules for finding DOIs in some 500 math journals. We used to
>    have that many rules and they were numbered rule-1, rule-2, ...; we have now merged
>    them into about 50, and they are named rule-springer, rule-elsevier, rule-ams, ..., and
>    it is becoming a bit more manageable.
>    It is very hard to even come up with a rule for a journal, since there are journals
>    with certain rules for the years under JSTOR and another set of rules for the years published
>    under somebody else.
>    The only solution here will be to associate a unique identifier (ISSN + Year) with a set
>    of well-defined rules ... but we first need to define the language that describes these rules.
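
One way the (ISSN + Year) association could look, as a sketch only: a table of year spans per ISSN, each pointing at a named rule. The ISSN, the year ranges, the name rule-jstor, and the function lookup_rule are all made up for illustration; the language for describing the rules themselves is left open, as it is in the message.

```python
from typing import NamedTuple, Optional


class RuleSpan(NamedTuple):
    issn: str
    first_year: int
    last_year: int  # inclusive
    rule_name: str


# Hypothetical table: placeholder ISSN, invented year ranges and rule names.
RULE_TABLE = [
    RuleSpan("0000-0000", 1950, 1999, "rule-jstor"),
    RuleSpan("0000-0000", 2000, 2021, "rule-springer"),
]


def lookup_rule(issn: str, year: int, table=RULE_TABLE) -> Optional[str]:
    """Return the rule name covering (issn, year), or None if no span matches."""
    for span in table:
        if span.issn == issn and span.first_year <= year <= span.last_year:
            return span.rule_name
    return None


if __name__ == "__main__":
    print(lookup_rule("0000-0000", 1975))
    print(lookup_rule("0000-0000", 2010))
```

Keeping the spans as data rather than code means a change of publisher becomes one more row in the table instead of a new special case.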
> 
>      What objections would there be to just including machine-readable
>      citation info in a PDF file? Absent that, a domain-specific document
>      number and lookup facility? lol.
> 
>    Try! I'll give you the database of the managers of some 2000 math journals and you
>    can try asking them ...



I was going to start with TUG :) And with neomutt it's easy to send out
dear colleague letters :)

Although I'm curious now how TooBib would do on some of them applying
the rules it has now.... 

>    Paulo Ney
>

Thanks.
 
>      % mjmhandler: toobib guessijvsbt<-guessijvsbt<-handleadhochtml<-citation
>      % date 2021-10-19:19:39:12 Tue Oct 19 19:39:12 EDT 2021
>      % srcurl:
>      https://www.ijvsbt.org/index.php/journal/article/view/1386
>      https://www.ijvsbt.org/index.php/journal/article/download/1386/1058
>      % citeurl:
>      https://www.ijvsbt.org/index.php/journal/article/view/1386
>      @article{ClinicalManagementHypothyroidismGunajitPubaleem2021,
>      X_TooBib = {publisher: ReWriteParse be.get(s)= be.get(dest)=},
>      abstract_html_url = {https://www.ijvsbt.org/index.php/journal/article/view/1386},
>      author = {Gunajit Das and Pubaleem Deka and Kongkon Jyoti Dutta},
>      author_institution = {Department of Veterinary Medicine, Lakhimpur College of Veterinary Science, Assam Agricultural
>      University, Joyhing, Assam, India and Department of Veterinary Epidemiology and Preventive Medicine, College of
>      Veterinary Science, Assam Agricultural University, Khanapara, Assam, India and Department of Veterinary Pathology,
>      Lakhimpur College of Veterinary Science, Assam Agricultural University, Joyhing, Assam, India},
>      date = {2021/01/25},
>      day = {25},
>      doi = {10.21887/ijvsbt.17.1.24},
>      firstpage = {91},
>      issn = {2395-1176},
>      issue = {01},
>      journal = {THE INDIAN JOURNAL OF VETERINARY SCIENCES AND BIOTECHNOLOGY},
>      journal_abbrev = {IJ Vet Sci \& Bio},
>      journal_title = {THE INDIAN JOURNAL OF VETERINARY SCIENCES AND BIOTECHNOLOGY},
>      keywords = {.},
>      language = {en},
>      lastpage = {92},
>      month = {01},
>      pagetitle = {Clinical Management of Hypothyroidism Associated Dermatological Signs in a Labrador: A Case Report | THE
>      INDIAN JOURNAL OF VETERINARY SCIENCES AND BIOTECHNOLOGY},
>      pdf_url = {https://www.ijvsbt.org/index.php/journal/article/download/1386/1058},
>      title = {Clinical Management of Hypothyroidism Associated Dermatological Signs in a Labrador: A Case Report},
>      volume = {17},
>      year = {2021},
>      url={https://www.ijvsbt.org/index.php/journal/article/download/1386/1058},
>      srcurl={https://www.ijvsbt.org/index.php/journal/article/download/1386/1058},
>      xsrcurl={https://www.ijvsbt.org/index.php/journal/article/view/1386},
>      citeurl={https://www.ijvsbt.org/index.php/journal/article/view/1386}
>      }
>       Mike Marchywka
>      306 Charles Cox Drive
>      Canton, GA 30115
>      470-758-0799
>      404-788-1216

-- 

mike marchywka
306 charles cox
canton GA 30115
USA, Earth 
marchywka at hotmail.com
404-788-1216
ORCID: 0000-0001-9237-455X


More information about the texhax mailing list.