something dumb on bibtex entry format,

barbara beeton bnb at tug.org
Tue Jan 12 20:09:25 CET 2021


I'm not sure how you'd accomplish it, and can't offer help, but if
you can add this check, to at least give a warning, you'd almost
certainly be able to identify all likely problems of this sort:

In a bibliographic entry, it's extremely unlikely that either a
leading group-defining open brace or printing brace ( \{ ) would
be "matched" by a closing brace of the other variety.  So checking
that braces of each type are properly nested, and reporting unmatched
ones, would let you identify problem entries and correct them manually.
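
As a rough illustration of that check (only a sketch: the function name
check_braces and the "group"/"printing" labels are made up here, and this
is a lint over the raw field text as written in the .bib file, not a
reproduction of how BibTeX itself scans braces), something like the
following Python would do:

```python
def check_braces(value):
    """Lint one raw field value, outer delimiters included.

    Returns a list of (position, message) tuples. "group" means a bare
    { or }, "printing" means the escaped forms \\{ and \\}.
    """
    problems = []
    stack = []  # open braces seen so far, as (kind, position)

    def close(kind, pos):
        if not stack:
            problems.append((pos, f"unmatched closing {kind} brace"))
            return
        open_kind, open_pos = stack.pop()
        if open_kind != kind:
            problems.append(
                (pos, f"closing {kind} brace pairs with opening "
                      f"{open_kind} brace at {open_pos}")
            )

    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            nxt = value[i + 1]
            if nxt == "{":
                stack.append(("printing", i))
            elif nxt == "}":
                close("printing", i)
            i += 2  # skip the escaped character, whatever it is
            continue
        if ch == "{":
            stack.append(("group", i))
        elif ch == "}":
            close("group", i)
        i += 1

    # Anything still open at the end was never closed.
    for kind, pos in stack:
        problems.append((pos, f"unmatched opening {kind} brace"))
    return problems


if __name__ == "__main__":
    for raw in (r"{John Wiley \& Sons, Ltd \}", r"{John Wiley \& Sons, Ltd }"):
        print(raw, check_braces(raw))
```

Run on the publisher value from the original question, the version ending
in \} yields one report (a printing close paired with the opening group
brace), and the version without the backslash yields none.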

Also, though largely unrelated, initial or terminal spaces within
individual bib items (author, title, etc.) are generally undesirable
and can result in unwanted spaces in output, unless care is taken to
ignore them in processing (may be done, but not guaranteed; I don't
know how rigorous bib-processing macros are in this regard).
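
A check for such spaces is easy to sketch.  The Python below assumes the
entry has already been parsed into a plain dict of field name to value,
with the outer braces or quotes removed; padded_fields is an invented
name, not part of any existing tool:

```python
def padded_fields(fields):
    """Return {field name: (leading, trailing)} for every value that
    starts or ends with whitespace; counts are in characters."""
    report = {}
    for name, value in fields.items():
        leading = len(value) - len(value.lstrip())
        trailing = len(value) - len(value.rstrip())
        if leading or trailing:
            report[name] = (leading, trailing)
    return report


if __name__ == "__main__":
    # Field values abbreviated from the entry quoted later in this thread.
    entry = {
        "address": "Boston, MA, USA",
        "month": "2015-09-08 00:00:00 ",
        "publisher": "John Wiley \\& Sons, Ltd ",
        "title": " Therapeutic deep eutectic solvents",
        "year": "2015",
    }
    for name, (lead, trail) in padded_fields(entry).items():
        print(f"{name}: {lead} leading, {trail} trailing")
```

For that entry it flags month, publisher, and title, and leaves address
and year alone.  Whether to strip the spaces automatically or only warn
is a separate decision.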

Hope these observations are useful.
 						-- bb

On Tue, 12 Jan 2021, Mike Marchywka wrote:

> On Tue, Jan 12, 2021 at 04:13:51PM +0000, David Carlisle wrote:
>>    \} is the latex syntax for the character } so in fields taking tex streams you have a literal } character and a missing }
>>    to match the {  at the beginning.
>>    in fields taking numeric entries it's again a missing closing brace and then a spurious } in the date I would expect.
>>    actually what happens is that bibtex takes this as a literal \ so the generated bbl file has
>>    \newblock Boston, MA, USA, 2015-09-08 00:00:00 2015. John Wiley \& Sons, Ltd \.
>>    but \. is the accent command  and the argument here is the \par from the following blank line so latex gives the error
>>    ! Paragraph ended before \OT1\. was complete.
>>    same with the month field the \ is just passed to latex but here you get
>>    \newblock Boston, MA, USA, 2015-09-08 00:00:00 \ 2015. John Wiley \& Sons, Ltd.
>>    so you get the safe \   rather than \.{\par} just by accidental chance of this bib style.  If the bib style had put a full
>>    stop rather than a space after the date you would have had the same error as before.
>>
>
> Thanks, I was originally just trying to do syntax but apparently that is not possible without a format.
> My syntax parser is just that and  I wanted a format-independent validation. However, your point
> makes sense and now I just returned the rendered reference which really is what matters as long
> as the format picks up the right fields. The rendered/pdftotext outputs appear identical with the
> backslash or not in the month field,
>
>
> [1] J. C. Silva, I. M. Aroso, F. Mano, I. Sá-Nogueira, S. Barreiros, R. L. Reis,
> A. Paiva, and A. R. C. Duarte. Therapeutic deep eutectic solvents as solubility enhancers fordifferent active pharmaceutical ingredients. Boston, MA,
> USA, 2015-09-08 00:00:00 2015. John Wiley & Sons, Ltd.
>
>
> [1] J. C. Silva, I. M. Aroso, F. Mano, I. Sá-Nogueira, S. Barreiros, R. L. Reis,
> A. Paiva, and A. R. C. Duarte. Therapeutic deep eutectic solvents as solubility enhancers fordifferent active pharmaceutical ingredients. Boston, MA,
> USA, 2015-09-08 00:00:00 2015. John Wiley & Sons, Ltd.
>
> 1
>
> I guess I'll have to make a validation bst entry that picks up everything and makes errors
> more apparent.
>
>
> echo val xxx  | ../a.out
> mjm_assemble_putative_bibtex.h615  MJM_ASSEMBLE_PUTATIVE_BIBTEX Jan 12 2021 13:34:29
>
> ../../mjm/hlib/mjm_pawnoff.h439 ONCE  fuxed m_today to exclude time wtf
> ../../mjm/hlib/mjm_instruments.h791  popping an old stream
> mjm>val xxx
> ../../mjm/hlib/mjm_pawnoff.h347 ONCE  Fileio is not thread of process safe doh
> mjm_assemble_putative_bibtex.h290  cmd=cat checkbib_test_output.xxx | grep -i "output\|error\|warning" | sed -e 's/  */ /g'
> mjm_assemble_putative_bibtex.h292  c=0 StrTy(err)= StrTy(out)=No pages of output .
> Warning - - empty booktitle in 18538
> ( There was 1 warning )
> LaTeX Warning : Label ( s ) may have changed . Rerun to get cross - references right .
> Output written on xxx . pdf ( 1 page , 28929 bytes ) .
> StrTy(data)=
> mjm_assemble_putative_bibtex.h295  rcl=0 fnerr=checkbib_test_output.xxx
> mjm_assemble_putative_bibtex.h299  StrTy(rendered)=References
> [1] J. C. Silva, I. M. Aroso, F. Mano, I. Sá-Nogueira, S. Barreiros, R. L. Reis,
> A. Paiva, and A. R. C. Duarte. Therapeutic deep eutectic solvents as solubility enhancers fordifferent active pharmaceutical ingredients. Boston, MA,
> USA, 2015-09-08 00:00:00 2015. John Wiley & Sons, Ltd.
>
> 1
>
>
>
> mjm_assemble_putative_bibtex.h648  m_n=0 m_errors=0 m_be.name()= m_be.type()= m_be.size()=0 m_be.errors()=0 m_latex_output=No pages of output .
> Warning - - empty booktitle in 18538
> ( There was 1 warning )
> LaTeX Warning : Label ( s ) may have changed . Rerun to get cross - references right .
> Output written on xxx . pdf ( 1 page , 28929 bytes ) .
>
> mjm>../../mjm/hlib/mjm_instruments.h340  readline returns null danger will robinson 01
> marchywka at happy:/home/documents/cpp/proj/toobib/junk$ vi xxx
> marchywka at happy:/home/documents/cpp/proj/toobib/junk$ echo val xxx  | ../a.out
> mjm_assemble_putative_bibtex.h615  MJM_ASSEMBLE_PUTATIVE_BIBTEX Jan 12 2021 13:34:29
>
> ../../mjm/hlib/mjm_pawnoff.h439 ONCE  fuxed m_today to exclude time wtf
> ../../mjm/hlib/mjm_instruments.h791  popping an old stream
> mjm>val xxx
> ../../mjm/hlib/mjm_pawnoff.h347 ONCE  Fileio is not thread of process safe doh
> mjm_assemble_putative_bibtex.h290  cmd=cat checkbib_test_output.xxx | grep -i "output\|error\|warning" | sed -e 's/  */ /g'
> mjm_assemble_putative_bibtex.h292  c=0 StrTy(err)= StrTy(out)=No pages of output .
> Warning - - empty booktitle in 18538
> ( There was 1 warning )
> LaTeX Warning : Label ( s ) may have changed . Rerun to get cross - references right .
> Output written on xxx . pdf ( 1 page , 28937 bytes ) .
> StrTy(data)=
> mjm_assemble_putative_bibtex.h295  rcl=0 fnerr=checkbib_test_output.xxx
> mjm_assemble_putative_bibtex.h299  StrTy(rendered)=References
> [1] J. C. Silva, I. M. Aroso, F. Mano, I. Sá-Nogueira, S. Barreiros, R. L. Reis,
> A. Paiva, and A. R. C. Duarte. Therapeutic deep eutectic solvents as solubility enhancers fordifferent active pharmaceutical ingredients. Boston, MA,
> USA, 2015-09-08 00:00:00 2015. John Wiley & Sons, Ltd.
>
> 1
>
>
>
> mjm_assemble_putative_bibtex.h648  m_n=0 m_errors=0 m_be.name()= m_be.type()= m_be.size()=0 m_be.errors()=0 m_latex_output=No pages of output .
> Warning - - empty booktitle in 18538
> ( There was 1 warning )
> LaTeX Warning : Label ( s ) may have changed . Rerun to get cross - references right .
> Output written on xxx . pdf ( 1 page , 28937 bytes ) .
>
> mjm>../../mjm/hlib/mjm_instruments.h340  readline returns null danger will robinson 01
> marchywka at happy:/home/documents/cpp/proj/toobib/junk$
>
>
>
>>    On Tue, 12 Jan 2021 at 15:59, Mike Marchywka <marchywka at hotmail.com> wrote:
>>
>>      I'm finally moving my bibtex scraping script to c++ and cleaning up a lot
>>      of stuff. I validate a foreign or scraped bibtex entry using latex run on
>>      a test document and my own parser. It's unlikely my parser conforms to
>>      bibtex requirements exactly so I want to do both. I found what appears
>>      to be an old test case and it still does not make sense. The question
>>      seems to be about backslashes preceding a terminating right brace.
>>      Sometimes they are ok, others not. The test files are xxx.tex and xxx.bib
>>      as shown below.  If I put a backslash on the "month" line before the right brace
>>      it seems to work ( originally there was an abstract entry with the problem but
>>      I deleted it for space and clarity ),
>>          month = {2015-09-08 00:00:00 \},
>>      However, doing it on the publisher line fails,
>>          publisher = {John Wiley \& Sons, Ltd \},
>>       cat /tmp/xxx.tex
>>      \documentclass{article}
>>      \begin{document}
>>      \nocite{*}
>>      \bibliographystyle{plain}
>>      \bibliography{xxx}
>>      \end{document}
>>      marchywka at happy:/home/documents/cpp/proj/toobib/junk$ cat /tmp/xxx.bib
>>      % programmatically fixed probably bu toobib
>>      % loaded from bbb written on 2019-11-02:17:33:45
>>      %0 prior 0
>>      @inproceedings{18538,
>>          address = {Boston, MA, USA},
>>          author = {Silva, J. C. and Aroso, I. M. and Mano, F. and S{\'a}-Nogueira, I. and Barreiros, S. and Reis, R. L. and
>>      Paiva, A. and Duarte, A. R. C.},
>>          doi = {10.1089/ten.tea.2015.5000.abstracts},
>>          journal = {Tissue Engineering Part A},
>>          keywords = {Drug delivery systems, green chemistry, Therapeutic deep eutectic solvents},
>>          month = {2015-09-08 00:00:00 },
>>          publisher = {John Wiley \& Sons, Ltd },
>>          title = { Therapeutic deep eutectic solvents as solubility enhancers fordifferent active pharmaceutical
>>      ingredients},
>>          url = {http://online.liebertpub.com/doi/full/10.1089/ten.tea.2015.5000.abstracts},
>>          year = {2015}
>>      }
>>      marchywka at happy:/home/documents/cpp/proj/toobib/junk$
>>      I run 3 times, latex, bibtex, and latex again, then grep for error, output, and warning
>>      giving the following lines in the two cases,
>>      No pages of output .
>>      Warning - - empty booktitle in 18538
>>      ( There was 1 warning )
>>      . / xxx . bbl : 9 : = = > Fatal error occurred , no output PDF file produced !
>>      versus,
>>      =No pages of output .
>>      Warning - - empty booktitle in 18538
>>      ( There was 1 warning )
>>      LaTeX Warning : Label ( s ) may have changed . Rerun to get cross - references right .
>>      Output written on xxx . pdf ( 1 page , 28937 bytes ) .
>>      What is the backslash before the brace supposed to do or is there something silly I'm
>>      missing? Thanks.
>>      note new address
>>       Mike Marchywka 306 Charles Cox Drive Canton, GA 30115
>>       2295 Collinworth  Drive Marietta GA 30062.  formerly 487 Salem Woods Drive Marietta GA 30067 404-788-1216 (C)<- leave
>>      message 989-348-4796 (P)<- emergency
>
> -- 
>
> mike marchywka
> 306 charles cox
> canton GA 30115
> USA, Earth
> marchywka at hotmail.com
> 404-788-1216
> ORCID: 0000-0001-9237-455X
>

