Mercurial for not-software projects

Eric Siegerman pub08-hg at davor.org
Tue Feb 18 17:07:31 UTC 2014


On 02/18/2014 10:09 AM, Pietro Moras wrote:
 > I wonder if there is any significant example of Mercurial used as VCS
 > for not-software projects.
 >
 > In other words, I wonder if this Version Control technology has been
 > successfully conceived for a class of projects wider than the "mere"
 > s/w. Thanks.

I'm not personally aware of any, but here are some thoughts in general 
terms.  Why is it you're asking?  Do you have a particular kind of 
project in mind?


There's no intrinsic reason not to use Mercurial (or any DVCS) for 
versioning of non-software projects.  It's all files, after all.  My 
main concerns would be:
   - Are the files easily and meaningfully mergeable?
   - Are they very large?  If so, how well does the VCS in question 
handle such large files?
   - Who are the anticipated users?  More to the point, how technically 
minded are they?  Can they wrap their heads around (a) the DVCS concept, 
and (b) the details of the particular DVCS you choose?

In more detail...

*Mergeability*

A DVCS's natural mode of operation is to allow concurrent modifications 
to a given file, and to resolve these by merging after the fact.  That 
presupposes a way to perform a meaningful merge. Text files often meet 
this requirement; in particular, line-based text like a typical 
programming language's source code does.  Binary files usually don't, 
unless there's a special-purpose tool available for merging files with 
that particular binary format.

This is often discussed in terms of "text vs. binary", but that isn't 
really the point.  Some text files can be a problem to merge using 
Mercurial's line-based algorithm; for example, documents written in full 
paragraphs (like this email).  That's because you end up merging at 
paragraph granularity, which is inconveniently coarse.  That's true, in 
different ways, whether the paragraphs are line-wrapped or not.
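
To make the granularity point concrete, here is a minimal sketch in Python. It uses the standard difflib module as a stand-in for a line-based comparison; it is not Mercurial's own merge code, and the sample strings are made up. It only covers the unwrapped case, where a paragraph is one long line.

```python
import difflib

def changed_lines(old: str, new: str) -> list[str]:
    """Return the lines of `new` that a line-based diff reports as added or replaced."""
    old_lines = old.splitlines()
    new_lines = new.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    changed = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "insert"):
            changed.extend(new_lines[j1:j2])
    return changed

# Source-code-like text: one short statement per line.
code_old = "a = 1\nb = 2\nc = 3\n"
code_new = "a = 1\nb = 20\nc = 3\n"

# Prose stored as one long line per paragraph (hypothetical sample).
prose_old = "The quick brown fox jumps over the lazy dog. It then naps.\n"
prose_new = "The quick red fox jumps over the lazy dog. It then naps.\n"

# One small statement is flagged in the code case ...
print(changed_lines(code_old, code_new))
# ... but a one-word edit flags the entire paragraph in the prose case.
print(changed_lines(prose_old, prose_new))
```

If two people each touch a different word of that paragraph, a line-based tool sees two edits to the same line, which is exactly the situation it reports as a conflict.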

XML can suffer from similar problems, especially if it's 
machine-generated.  For example, if, in a series of <foo> elements, the 
order doesn't matter, the application might feel free to reorder the 
list arbitrarily, leading to spurious conflicts. Same for order of 
attributes within an element, single vs. double-quotes around attribute 
values, values of id= attributes, whitespace, etc.
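
As a rough illustration of how spurious such differences are, the sketch below compares two made-up serializations of the same element. It assumes Python 3.8 or later, where xml.etree.ElementTree.canonicalize() is available. Canonicalization irons out attribute order and quoting; it does nothing about reordered sibling elements or regenerated id= values.

```python
import xml.etree.ElementTree as ET

# Two serializations of the same (hypothetical) element.
# They differ only in attribute order, quote style, and empty-element syntax.
xml_a = '<foo id="1" name="x"/>'
xml_b = "<foo name='x' id='1'></foo>"

# A textual comparison, which is all a line-based merge sees, says they differ.
print(xml_a == xml_b)

# After canonicalization (C14N 2.0) the two forms are identical.
canon_a = ET.canonicalize(xml_a)
canon_b = ET.canonicalize(xml_b)
print(canon_a == canon_b)
```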

Note that sometimes a merge tool can be available, but cumbersome. If 
Mercurial can't invoke the tool automatically, it has to fall back to 
just reporting the conflict and letting the user sort it out.
(MS Word has a Compare and Merge Documents feature, but (a) it might be 
a challenge to invoke it from hg, and (b) even if you could, it's only a 
two-way merge, so can't tell conflicting changes from non-conflicting 
ones.  Thus, *every* difference is going to need to be resolved manually 
-- a lot of extra, mind-numbing, error-prone work for the user.)
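
The two-way vs. three-way distinction can be sketched in a few lines of Python. This is a toy, assuming a document can be modelled as a dict of named sections; the function names and sample data are invented for illustration and have nothing to do with how Word or Mercurial actually work.

```python
def two_way_differences(ours: dict, theirs: dict) -> list:
    """Without a common ancestor, every difference must go to the user."""
    keys = set(ours) | set(theirs)
    return sorted(k for k in keys if ours.get(k) != theirs.get(k))

def three_way_merge(base: dict, ours: dict, theirs: dict):
    """Merge per key using the common ancestor; return (merged, conflicts).

    Assumes no section has the value None; None stands for "absent".
    """
    merged, conflicts = {}, []
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t:
            value = o               # both sides agree
        elif o == b:
            value = t               # only "theirs" changed it
        elif t == b:
            value = o               # only "ours" changed it
        else:
            conflicts.append(key)   # both changed it, differently
            continue
        if value is not None:
            merged[key] = value
    return merged, conflicts

base   = {"title": "Draft", "intro": "Hello", "body": "Text"}
ours   = {"title": "Final", "intro": "Hello", "body": "Text A"}
theirs = {"title": "Draft", "intro": "Hi",    "body": "Text B"}

print(two_way_differences(ours, theirs))   # all three sections need a human
print(three_way_merge(base, ours, theirs)) # only "body" is a real conflict
```

With the ancestor available, two of the three differences resolve themselves; without it, the user has to look at all three.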

If the files you're working with aren't mergeable, it might be better to 
go with one of the CVCSes that support reserved checkouts (aka locking 
-- only one person can edit a given file at a time). Yes, Mercurial has 
a couple of extensions for that, but they're fighting against 
Mercurial's basic design.  A VCS in which reserved checkouts are a more 
natural fit is likely the better choice ... almost certainly a 
centralized one.


*File size*

Mercurial doesn't deal well with huge files, especially ones that aren't 
very delta-compressible -- at each commit, the repository tends to grow 
proportionally to the size of the changed files, not to the size of the 
changes themselves.  Suppose you change one byte near the beginning of a 
10-MB file and commit.  If the file is plain text, the repo will grow by 
a small amount.  If it's compressed data, the repo will grow by close to 
the whole 10 MB.
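
Here is a small Python experiment along those lines. It does not touch Mercurial's storage at all; it just counts how many bytes differ between two versions of a file, as a crude proxy for how small a delta between them could be. The file content is made up, and zlib stands in for "some compressed format".

```python
import zlib

def differing_bytes(a: bytes, b: bytes) -> int:
    """Count positions where two byte strings differ, plus any length difference."""
    common = min(len(a), len(b))
    mismatches = sum(x != y for x, y in zip(a, b))
    return mismatches + (max(len(a), len(b)) - common)

# Made-up, text-like content, roughly 1 MB.
text_old = b"".join(b"line %d of an imaginary manuscript\n" % i for i in range(30000))
# Change a single byte at the very beginning ("l" becomes "L").
text_new = b"L" + text_old[1:]

# The same two versions, stored in compressed form.
packed_old = zlib.compress(text_old)
packed_new = zlib.compress(text_new)

# Uncompressed: exactly one byte differs.
print(differing_bytes(text_old, text_new))
# Compressed: compare the number of differing bytes with the total size.
print(differing_bytes(packed_old, packed_new), "of", len(packed_old))
```

How much of the compressed stream changes depends on the data and the compressor, so run it rather than trusting a number from me; the point is only that a one-byte edit no longer looks like a one-byte edit.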

This consideration applies to all VCSes of course, but for a DVCS it's 
worse, because all that redundant data is duplicated in every user's 
workspace, and has to be copied on every "hg clone".

Mercurial has the largefiles extension (and a couple of alternatives) to 
try to cope with this, but it can be slow and awkward -- at least, it 
was a year or so ago, when I last tried to use it; I believe things have 
improved since.


*Your users*

We techies tend to like tools that "think" the way we do.  Most 
non-techies don't think that way.  DVCSes, and many CVCSes, were 
designed by programmers for programmers; trying to teach non-programmers 
-- especially right-brained creative types -- how to use them, and more 
importantly, to use them well, is likely to be an exercise in 
frustration.  Better to find a tool that's designed for those folks.  It 
might not be as featureful as Mercurial (or especially git), but they'll 
take to it a lot more easily.

I haven't worked much with content-management systems, but I have the 
impression that they are basically VCSes for non-techies, possibly with 
workflow management as well.  Is that a reasonable assessment?

The one CMS that I did work a lot with, back in the 90s, was a thing 
called Quark Publishing System, which hooked into the QuarkXPress 
page-layout program, among other things.  Its target market was 
newspapers and magazines (strictly print; this was before web publishing 
got big enough to have specialized tools).  QPS did an astoundingly good 
job of making version-control concepts accessible to writers, editors, 
and graphic designers -- people who would be totally baffled by 
Mercurial, and challenged even by good old RCS. I do *not* mean that 
as an insult; designers in particular can do magic with Photoshop and 
Illustrator (or GIMP and Inkscape) that I can't even conceive of.  It's 
just that different sorts of people sometimes need different sorts of 
software, even to perform tasks that are, at their core, rather similar 
-- in this case, version control.

   - Eric