[tex-k] accessing MySQL w/kpathsea

Johannes Wilm johanneswilm at gmail.com
Tue Jun 10 09:48:33 CEST 2008


Hey,
I would like to ask the list about the feasibility of the following
hack. I'm currently trying to help with an open source solution for compiling
LaTeX files, which includes storing and compiling them on a server. It would be a
lot nicer not to have to check everything out and store it as files before running
latex & co on it. Just figuring out which files need to be checked
out creates considerable headaches, so I wondered whether this is
something a hack on kpathsea could alleviate.


As I understand it, the way it works is that it gets handed a string such as
the following:

".:/usr/local/texmf/:/home/www/marx.su/www_docs/lab/inc"

and a file-name like:

"include-me.tex"

It then splits the string at the colons and tests whether the file
exists in each directory in turn, starting from the first. Once it
finds a match, under normal circumstances it returns just that one filename
with its path, like "/usr/local/texmf/include-me.tex"; however, one can also set
it to find all instances.

(http://www.tug.org/svn/texlive/trunk/Build/source/texk/kpathsea/pathsearch.c?view=markup)
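
To make sure I describe the lookup correctly, here is a minimal sketch of it in Python (not kpathsea's actual C code). The function name `find_in_path` and the `find_all` flag are made up for illustration, and the real library does a good deal more (ls-R databases, `//` subdirectory recursion, variable expansion) that is left out here:

```python
import os


def find_in_path(search_path, filename, find_all=False):
    """Search a colon-separated list of directories for filename.

    Returns the first matching path (or None), or a list of all
    matching paths when find_all is True.
    """
    matches = []
    for directory in search_path.split(":"):
        if not directory:
            # Empty elements have a special meaning in kpathsea;
            # this simplified sketch just skips them.
            continue
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            if not find_all:
                return candidate
            matches.append(candidate)
    return matches if find_all else None
```

Called with the path string and "include-me.tex" from above, this would return the first directory's copy of the file, or every copy when `find_all=True`.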

My idea is to modify this in two steps:

First, add an option to retrieve from database, using a string like:

".:/usr/local/texmf:[[database=mysql,dbserver='localhost',dbusername='USERNAME',dbpassword='PASSWORD',query='SELECT content FROM tex-files WHERE name='.$filename.';',outdir='/tmp/5grTj3/']]:/home/www/marx.su/www_docs/lab/inc"

In the loop that goes through the directories, it will then call a
function "check_mysql" when it encounters a "[[" coupled with a
"database=mysql". That function connects to the database and then
runs the query. If the query returns something, the function takes the first
result, creates a file in the directory specified as 'outdir', and writes
the contents obtained from the database into it.
It then returns the path to that file (e.g. "/tmp/5grTj3/include-me.tex").
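
A rough, self-contained sketch of that loop, in Python rather than C, might look like this. Several things are assumptions made only so the example runs: sqlite3 from the standard library stands in for MySQL, the already-open connection is passed in instead of being built from dbserver/dbusername/dbpassword, the query uses a `?` placeholder instead of pasting the filename into the SQL string, and the table is called `tex_files`. All function names are hypothetical.

```python
import os
import re
import sqlite3


def split_search_path(search_path):
    """Split at colons, but keep each [[...]] database element in one piece."""
    return re.findall(r"\[\[.*?\]\]|[^:]+", search_path)


def parse_db_element(element):
    """Turn "[[key=value,key='value',...]]" into a dict.

    Quoted values may contain commas, but not further quotes.
    """
    body = element[2:-2]
    pairs = re.findall(r"(\w+)=(?:'([^']*)'|([^,]*))", body)
    return {key: quoted or bare for key, quoted, bare in pairs}


def check_database(conn, spec, filename):
    """Run spec['query'] for filename. On a hit, write the first result
    into spec['outdir'] and return the new file's path, else None."""
    row = conn.execute(spec["query"], (filename,)).fetchone()
    if row is None:
        return None
    os.makedirs(spec["outdir"], exist_ok=True)
    path = os.path.join(spec["outdir"], filename)
    with open(path, "w", encoding="utf-8") as out:
        out.write(row[0])
    return path


def find_file(search_path, filename, conn):
    """Walk the path elements in order; database elements are checked
    out to disk, plain directories are tested as before."""
    for element in split_search_path(search_path):
        if element.startswith("[["):
            found = check_database(conn, parse_db_element(element), filename)
        else:
            candidate = os.path.join(element, filename)
            found = candidate if os.path.isfile(candidate) else None
        if found:
            return found
    return None
```

Because the database element sits in the middle of the path, directories listed before it still take precedence, and the caller only ever sees an ordinary filename.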

That way we should save the database, and even more so the filesystem, some
load, because it only checks out the files that are actually required
for the compilation of the current document. Also, this should work
seamlessly with all current applications.

In addition, when packaging, this directory should now contain all
LaTeX files except those that were found in the filesystem (likely
standard system LaTeX files), so they can be packaged together with the
currently accessed file into a nice *.tar.gz or included in a compiled
PDF file with a name such as *.latex.pdf .

The second change would be to add an option "get_contents". If that is
specified, then kpathsea will return the contents of the file that is
being looked for, rather than just the filename. It should be easy to
add this part to the filesystem-based file-lookup, and of course it
would simplify the mysql-based part. Common applications (latex,
bibtex) would likely only need to be modified slightly to take
advantage of this change as well.
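
For the filesystem case, what I have in mind is roughly the following (a Python sketch only; `lookup` is a made-up name, and how such an option would actually be exposed in kpathsea's C interface is an open question):

```python
import os


def lookup(search_path, filename, get_contents=False):
    """Return the path of the first match, or its contents (as bytes)
    when get_contents is True. Returns None if nothing is found."""
    for directory in search_path.split(":"):
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            if get_contents:
                with open(candidate, "rb") as handle:
                    return handle.read()
            return candidate
    return None
```

With the database variant, the same flag would let the lookup hand back the query result directly instead of first writing it to 'outdir'.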

One thing I am unsure about is the database connection.
The compilation of a single document might open and close
connections to the database 40-200 times (a completely unscientific
estimate). I just couldn't figure out how to make the connection
persistent, as it would need to be closed at some point, and we never know
when the last MySQL query is being made, if I understand the structure of
kpathsea correctly.
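
One possibility, which I have not tried against kpathsea itself, would be to open the connection lazily on the first lookup and register the close to run at process exit. In C that would presumably mean `atexit()` together with `mysql_close()`. The idea in Python, with sqlite3 standing in for MySQL and `get_connection` being a made-up name:

```python
import atexit
import sqlite3

_connection = None  # shared by all lookups in this process


def get_connection(database=":memory:"):
    """Open the connection on first use and reuse it afterwards.

    The close is registered with atexit, so nobody needs to know
    which query is the last one.
    """
    global _connection
    if _connection is None:
        _connection = sqlite3.connect(database)
        atexit.register(_connection.close)
    return _connection
```

This only helps within a single process; a run that starts latex and bibtex separately would still open one connection per program.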

Do you have any thoughts/considerations in connection with this? Do similar
projects exist already?


I have only started to look at your code, but maybe you can tell me whether
I'm looking in vain and should rather start somewhere else…
-- 
Johannes Wilm
http://www.johanneswilm.org
tel: +5059173717