Suggested feature: Portable document archive

Barry MacKichan barry.mackichan at mackichan.com
Mon Dec 30 21:56:35 CET 2019


I can provide another data point for this discussion.

Scientific Word and Scientific WorkPlace use a scheme quite like this. 
The “native file type” extension for our documents is .sci, but in 
fact a .sci file is just a zip file under another name. By changing the 
extension we can change the default operation selected when the user 
double clicks on it.
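
As an illustration of the point that the extension is only a label, here is a minimal Python sketch. It is not code from Scientific Word or Scientific WorkPlace; the file name example.sci and the member name main.xhtml are made up for the example. It writes a zip archive under a .sci name and reads it back with the standard zipfile module, which inspects the file contents rather than the extension.

```python
import os
import tempfile
import zipfile


def write_renamed_zip(path, members):
    """Write a zip archive at `path`, whatever its extension is."""
    with zipfile.ZipFile(path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, text in members.items():
            archive.writestr(name, text)


def list_members(path):
    """Return the member names if `path` is a zip archive, else None."""
    if not zipfile.is_zipfile(path):
        return None
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()


# Hypothetical file and member names, used only for this demonstration.
workdir = tempfile.mkdtemp()
sci_path = os.path.join(workdir, "example.sci")
write_renamed_zip(sci_path, {"main.xhtml": "<html/>"})
names = list_members(sci_path)
```

Which program opens the file on a double click is decided by the operating system's file association for the extension, not by anything inside the archive.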

It is possible, by making some changes in the Windows registry, to make 
the .sci file behave like a zip file on some occasions and like a 
document on others. In particular, in the Windows file browser, which 
shows the directory tree in one pane on the left and the directory 
contents in the other, the user can launch the program by clicking on 
the file on the right, or can open and explore the contents of the zip 
file by clicking on it in the tree on the left. The special treatment 
of zip files in Windows applies to .sci files as well.

Here is what we keep in the .sci file:

One or more xhtml files, a directory of css files, and a directory of 
graphics files — mostly these don’t apply to this discussion.

A directory that contains a tex file generated by our program from the 
xhtml version of the document and temporary copies of the intermediate 
files generated by running TeX.

Three directories that can contain graphics files. The first holds the 
original copy of a graphic imported by the user. If that graphic can’t 
be displayed in a browser, we run a conversion program to make a 
displayable copy. If neither of those can be imported into TeX, we also 
make a TeX-compatible copy. Thus there may be three copies of a graphic 
in some cases. The final PDF is also contained in the zipped package.
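
The rule for deciding how many copies of a graphic to keep can be sketched in Python as follows. This is only an illustration under stated assumptions, not the logic of the actual program: the two format lists, the conversion targets (.svg for the browser, .pdf for TeX) and the role names are all invented for the example.

```python
from pathlib import PurePosixPath

# Assumed format lists, for illustration only.
BROWSER_OK = {".png", ".jpg", ".jpeg", ".gif", ".svg"}
TEX_OK = {".pdf", ".png", ".jpg", ".jpeg"}

# Hypothetical conversion targets.
BROWSER_TARGET = ".svg"
TEX_TARGET = ".pdf"


def copies_needed(original_name):
    """Return (role, file name) pairs for the copies kept for one graphic."""
    path = PurePosixPath(original_name)
    ext = path.suffix.lower()
    copies = [("original", path.name)]
    available = {ext}

    # A browser-displayable copy is made only if the original cannot be shown.
    if ext not in BROWSER_OK:
        copies.append(("browser", path.stem + BROWSER_TARGET))
        available.add(BROWSER_TARGET)

    # A TeX-compatible copy is made only if no existing copy can be imported.
    if not (available & TEX_OK):
        copies.append(("tex", path.stem + TEX_TARGET))

    return copies
```

Under these assumptions a PNG needs one copy, a PDF needs two (the original plus a browser copy), and a WMF needs all three.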

We make no attempt to include any LaTeX package files or utility 
programs (we install some, but these are part of the installation and 
not part of the document). So the only guarantee is that the document 
can be compiled on an installation of our program on OSX or Windows, 
with a complete TeXLive installation. Smaller TeX installations usually 
work, but there is no guarantee that you won’t have to install a bit 
more.

There is a command in our program to trim the fat in the .sci file. It 
removes temporary files, and depending on preferences, it might also 
remove the generated PDF and the conversions of the graphics and even 
the TeX file. I.e., it can go back to the minimum required to regenerate 
everything.
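
A trimming step of this kind might look like the Python sketch below. It is a guess at the general shape, not the command in the program: the list of temporary suffixes, the member names in the demo archive, and the also_drop parameter are assumptions. Because the zipfile module cannot delete a member in place, the sketch writes a new archive and skips the unwanted entries.

```python
import io
import zipfile

# Assumed suffixes of temporary files produced by a TeX run.
TEMP_SUFFIXES = (".aux", ".log", ".toc", ".out")


def trim_archive(data, also_drop=()):
    """Return a new zip (as bytes) without temporary files.

    `also_drop` is a collection of exact member names to remove as well,
    for example the generated PDF, depending on user preferences.
    """
    out_buffer = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as source, \
            zipfile.ZipFile(out_buffer, "w", zipfile.ZIP_DEFLATED) as target:
        for info in source.infolist():
            if info.filename.lower().endswith(TEMP_SUFFIXES):
                continue
            if info.filename in also_drop:
                continue
            target.writestr(info, source.read(info.filename))
    return out_buffer.getvalue()


def make_demo():
    """Build a small in-memory archive with hypothetical member names."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("main.xhtml", "<html/>")
        archive.writestr("tex/main.tex", "\\documentclass{article}")
        archive.writestr("tex/main.aux", "")
        archive.writestr("tex/main.log", "")
        archive.writestr("tex/main.pdf", "%PDF")
    return buffer.getvalue()


def member_names(data):
    """Return the sorted member names of a zip held in memory."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return sorted(archive.namelist())
```

Calling trim_archive(make_demo()) keeps the xhtml, the TeX file and the PDF; passing also_drop={"tex/main.pdf"} removes the PDF as well.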

The .sci file is agnostic about the TeX compiler used and so works with 
PDFLaTeX, LuaLaTeX, and XeLaTeX.

The Mac version could easily have used OSX packages instead of zip 
files, but we decided it was more important to be able to drag a file 
from Windows to a Mac and have everything work.

—Barry MacKichan

On 29 Dec 2019, at 3:25, Mojca Miklavec wrote:

> Dear Simon,
>
> On Sat, 28 Dec 2019 at 21:29, Simon Heisterkamp wrote:
>>
>> I'm the author of the MikteX feature suggestion 424: 
>> https://github.com/MiKTeX/miktex/issues/424
> ...
>> I am open to contributing to the effort but would likely need 
>> substantial support from experienced developers of tex-live.
> ...
>> The MS Word .docx format is actually a zip archive with xml files and 
>> folders for resources such as pictures. The idea here is to do the 
>> same for .tex sources.
>> Call the archive something like .pta (portable Tex archive), and 
>> support creating, opening editing and typesetting this file directly 
>> from MikTeX (and tex-live).
>
> My guess is that this could be much easier to write if you only target
> a single engine (I have LuaTeX in mind, including any of its recent
> variants) than if you need to support the gazillion of other programs
> (some of which have been barely touched or changed since the
> eighties).
>
> LuaTeX already has a built-in support for zip files, and ConTeXt
> already supports providing a single zip with the full TEXMF tree with
> resource files as an alternative to having the standard extracted
> installation, so I could imagine that supporting something like
> "context document.pta" to magically compile everything might be almost
> straightforward.
>
> But then you would need *A LOT* more work to also:
> - properly port it to LaTeX
> - support "all" the major viewers and text editors
> - write a really nice GUI with drag-and-drop functionality to
> manipulate the files (add and remove pictures & other packages)
> - make sure that Overleaf etc. can import and export it
> - potentially ensure that google drive / gmail know how to display the 
> contents
> to make it really useful, else it becomes just another file that users
> have no idea what to do with it.
>
> While I find the idea nice in principle (I share the concerns raised
> by Johannes), this can only work if you are really willing to spend a
> lot of effort yourself, and even if you get help (not impossible, but
> not if you would need handholding on each step), you would still need
> to figure out the huge majority yourself. This can only help users if
> you really create great user experience and take care of all the
> details, including platform-specific ones (like for example: having
> extracted structure like something.app which "behaves like a single
> file" on different operating systems and can be double-clicked to be
> opened with a GUI editor).
>
>> Proposed properties:
>>
>> a zip archive with all sources necessary to typeset a TeX document.
>> renamed to something other than .zip to discourage non-technical 
>> users from editing directly. Suggestion: .pta
>> MikTeX (and tex-live) support for working with this file directly. 
>> (like MS Word works with .docx archives) In particular guarantee that 
>> the archive can always be typeset again after moving and copying to 
>> another installation of MikTeX (and tex-live).
>> Use a somewhat strict structure for names and locations inside the 
>> archive. This will simplify development of the feature. Unsupported 
>> uses can always fall back to ordinary .tex source files.
>> a “make-file” (inside the archive) for standardized typesetting 
>> with one click - no setup required at all by users who only want to 
>> make minor changes to a document.
>> archive can contain non-standard dependencies, i.e. packages, 
>> pictures, styles.
>> archive can contain the typeset pdf document. Costs extra file size, 
>> but gains accessibility. This makes it very easy to write 
>> “viewer” applications for every conceivable system out there - 
>> they simply pull the pdf out of the archive.
>>
>> Use cases:
>>
>> non-technical persons are comfortable with a document being one file.
>> file can be copied and moved around while maintaining a guarantee 
>> that it can be typeset when needed.
>
> Not necessarily.
> Unless you also bundle the binary and all the packages and fonts,
> there is no guarantee that the document can be typeset again, at least
> not necessarily in the fully reproducible way.
> It may always happen that a package gets removed from TeX Live, that
> fonts get renamed, or that two packages start conflicting with each
> other.
> (I mean: it's not any worse nor any better than what you have now.)
>
>> this could greatly assist in spreading the use of latex outside of 
>> technical academic fields.
>>
>> Aspects that need more thought:
>>
>> when editing creates typesetting errors, the archive could maintain a 
>> "last successfully typeset" pdf for the viewers, alternatively it 
>> could present a "corrupted" pdf file. Another approach could be to 
>> forbid saving to the archive format if the document cannot be 
>> typeset.
>>
>> Feel free to use and share this idea however you want.
>> I don’t have the time to develop this myself,
>
> That's contrary to what you said at the top?
>
>> but would love to use the feature at the engineering company where I 
>> work.
>
> How many man-years would the company be willing to fund to implement
> such a feature?
>
>> Best regards,
>> Simon
>
> Mojca
>
> PS: this is not really something that concerns MikTeX or TeX Live
> alone. It's something that would need to be an independent
> development.

