Filter for uncompressed storage of zipped document formats like docx (http://stackoverflow.com/questions/3298525/version-control-for-docx-and-pdf)
Andreas Gobell
andreasgobell at gmx.de
Wed May 4 11:23:18 UTC 2011
Dear Mercurial team,
I am setting up Mercurial as my Version Control System. In my case it is not only meant to manage source code but also Microsoft Word documents in the docx format and some binary files.
I already wrote some scripts to handle diffing and merging of docx files in Word. Another goal was to improve the delta compression in the repository. To achieve this, I first tried putting directories containing the extracted docx contents under version control. This worked fine for the repository, but it was cumbersome to use because of the necessary conversion between the directories and the docx files. I then stumbled upon the thread http://stackoverflow.com/questions/3298525/version-control-for-docx-and-pdf where Martin Geisler mentions Mercurial's filter system. This seemed like a good solution, as it is completely transparent to the user.
Since Martin stated that he is interested in a solution to this problem, and I haven't found an existing extension on the internet, I am sending the filter extension that I've written. I've done some tests comparing the repository space required for three variants: standard compressed docx files, docx files manually rewritten without compression before each commit, and docx files processed by my filter. The results show clear space savings for the filter version (and, of course, for the manually uncompressed docx). I also tested odt files created in LibreOffice with the filter, and they work as well.
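The core idea, rewriting a zipped document so that its members are stored uncompressed, can be sketched roughly as follows. This is only an illustration using Python's standard zipfile module and is not the attached doczip.py; the function name rezip is hypothetical, and how such a function would be hooked into Mercurial's filter system is left out.

```python
import io
import zipfile


def rezip(data, compression=zipfile.ZIP_STORED):
    """Return the bytes of a zip archive whose members are all
    rewritten with the given compression method (hypothetical helper)."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as src, \
            zipfile.ZipFile(out, "w", compression) as dst:
        for info in src.infolist():
            # Keep the member name, timestamp and attributes so that
            # only the compression method changes.
            new_info = zipfile.ZipInfo(info.filename, date_time=info.date_time)
            new_info.compress_type = compression
            new_info.external_attr = info.external_attr
            dst.writestr(new_info, src.read(info))
    return out.getvalue()
```

Under this assumption, the repository side would see rezip(data) (members stored, so the XML inside is visible to delta compression), while the working copy could get rezip(data, zipfile.ZIP_DEFLATED) to keep the files small on disk.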
I am new to Mercurial and I haven't written Python for a few years, so I would be very glad to hear about improvements and comments.
Cheers
Andreas Gobell
-------------- next part --------------
A non-text attachment was scrubbed...
Name: doczip.py
Type: text/x-python-script
Size: 5312 bytes
Desc: not available
URL: <http://lists.mercurial-scm.org/pipermail/mercurial/attachments/20110504/6a473e36/attachment-0002.bin>