Status of lfs and lock extensions

Daniele Benegiamo danielebenegiamo at fastwebnet.it
Thu Jan 21 12:12:21 UTC 2021


On 2021/01/19 19:33, Matt Harbison wrote:
>>> [...]
>>> The LFS extension is shipped with Mercurial.  Don't let the fact that it
>>> is marked experimental scare you off- I use it in production.  The
>>> experimental tag is mostly so that we can change some of the
>>> fileset/revset/template functionality without worrying about backward
>>> compatibility.  I have no intention of changing the storage layout in an
>>> incompatible way.  There is a TODO list of future plans that I hope to
>>> get to some day:
>>>
>>>      https://www.mercurial-scm.org/repo/hg/file/5.6.1/hgext/lfs/TODO.rst

From a quick look at the history of that directory, it seems there
hasn't been much activity on the extension for about a year (at least on
that branch/repo). Are there any plans for if/when it will move out of
"experimental" status? [just to understand the long-term plan for this
feature in the Mercurial team's vision]

Many devs working on multimedia/interactive projects (including 
simulation, VR and video games of course - projects that often rely on 
game engines and proprietary binary files) are migrating from Subversion 
and Perforce to git+git-lfs because DVCSs have very pleasant advantages. 
It would be great to have Mercurial among the alternatives.


> [...]
>> I'll test it for sure. Can I follow the instructions in the extension
>> documentation to set it up?
>>
>>          https://www.mercurial-scm.org/repo/hg/file/5.6.1/hgext/lfs/__init__.py
> 
> Yes, that should work.  I'd ignore the `lfs.track` config, and track
> `.hglfs` instead.  You won't need `lfs.url` either if you use
> Mercurial as the server, but it sounds like you may need to point to a
> git server if you want locking.  (Unless you get the lock extension
> working, and transition to lfs locking in the future.  But IDK the
> state of that extension.)

Thanks! According to the docs, if I need to specify lfs.url to test
other git-lfs backends, I must set it in the global config file,
right?


>> [...]
>> Could you give some more clues about the "break commands in subtle ways"
>> about the "largefiles" extension? We just started testing such
>> extensions, so we still don't have complete and detailed pros/cons.
> 
> I don't have any offhand- it's been ~3-4 years since I converted away
> from it.  But basically the way largefiles works, if you tell it to
> track `foo.bin`, it intercepts that and tells Mercurial that you
> really want to track `.hglf/foo.bin` (which it creates).  And then it
> has to mind both files (if you change `foo.bin`, it needs to update
> the hash it stores in `.hglf/foo.bin`) without any real help from core
> Mercurial.  If you look at the largefiles extension code, you'll see
> there are tons of functions and commands that are wrapped.  Some of
> these are trivial (do a small thing and delegate to the core
> functionality), but some are non trivial or almost complete
> replacements.  Those things tend to get out of date, and/or missed
> when new functionality is added to core Mercurial.  Search the history
> from ~2013-2017 to see all of the whack-a-mole fixes.
> 
> The lfs extension uses a very low level toggle such that if you ask
> Mercurial to track `foo.bin`, that's the file that actually gets
> tracked, and the expected content is returned when you ask Mercurial
> to read that file.  The low level toggle allows the pointer data to be
> read or written only in the few cases that it is needed, instead of
> always.  If you look at the lfs extension, you'll see there are many
> fewer things wrapped (and some of the wrapping adds functionality not
> available in largefiles, so it does more with less).

Wow, thanks for the details! They're very clear. From your description 
it's evident that the approach followed by lfs is much better.
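
To check my understanding, here is a toy Python sketch of the difference
(not Mercurial code). The function names are mine, the SHA-1 standin
content is my assumption about largefiles, and the pointer layout follows
the git-lfs pointer spec; what Mercurial really stores may differ in the
details.

```python
import hashlib
from typing import Tuple


def largefiles_standin(path: str, data: bytes) -> Tuple[str, str]:
    """Toy model of largefiles: a *different* path is tracked.

    Assumption: the standin holds a SHA-1 hex digest of the real file.
    """
    return ".hglf/" + path, hashlib.sha1(data).hexdigest() + "\n"


def lfs_pointer(path: str, data: bytes) -> Tuple[str, str]:
    """Toy model of lfs: the requested path itself is tracked.

    The stored text is a small pointer in the git-lfs v1 pointer format.
    """
    pointer = (
        "version https://git-lfs.github.com/spec/v1\n"
        "oid sha256:" + hashlib.sha256(data).hexdigest() + "\n"
        "size " + str(len(data)) + "\n"
    )
    return path, pointer


if __name__ == "__main__":
    blob = b"pretend this is a large binary asset"
    print(largefiles_standin("foo.bin", blob))
    print(lfs_pointer("foo.bin", blob))
```

In the first case the tracked name changes, which is why every command
that deals with file names has to be wrapped. In the second case only
the stored content differs.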


>> If I understand correctly, the "convert" extension supports
>> bi-directional conversions: normal <-> largefiles and normal <-> lfs. Is
>> it right? Or there are limitations? (it would be very helpful to run our
>> tests)
> 
> Correct.  The important thing to understand is that since largefiles
> tracks `.hglf/foo.bin` in place of `foo.bin`, the commit hashes will
> *always* change when you convert between normal <--> largefiles.
> Since lfs tracks the file you want, you can freely convert between
> normal <--> lfs, and the hashes will stay the same (barring the
> convert extension edge cases I mentioned).  Because the hashes are the
> same, you can freely push and pull between the normal/lfs repos.  (You
> need to enable the extension for the normal repo IIRC, but you don't
> have to *commit* anything as LFS).

Very clear! Thanks.


>> [...]
>>
>> I don't know very well how git-lfs works. It seems you need a dedicated
>> server process that deals with the protocol and (I suppose) mediates
>> with the underlying git repository. Or it can run independently by git?
> 
> Correct.  If you use Mercurial for the server, you don't need a git
> repo.  It simply implements the protocol, and stores the files in the
> underlying Mercurial repo.  But you need locking, so 3rd party
> packages might expect a git repo, even if it is empty.  When I toyed
> with 3rd party stuff, I looked at gitbucket and SCM Manager (because I
> use the latter to host hg repos).

I'll have to spend some time looking at git servers :) Thanks for 
pointing me to some solutions!


> The protocol is basically a client request and reply sequence:
> 
> HTTP POST:
> C: "I'd like to upload these blobs, here are the hashes"
> S: "OK, here is a list of URLs, one per blob that needs to be
> uploaded, ignoring stuff I already have"
> 
> HTTP PUT:
> C: "Here's blob 1 content"
> S: success/failure status
> C: "Here's blob2 content"
> ....

It sounds easy. But the problems are usually in the details :)
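
For my own notes, here is a minimal Python sketch of the POST half of
that exchange, based on my reading of the git-lfs batch API document. It
only builds the request and reads the reply; no network code. The
function names and the sample reply are hypothetical.

```python
import json

# Media type used by the git-lfs batch API.
LFS_MEDIA_TYPE = "application/vnd.git-lfs+json"


def build_batch_request(operation, blobs):
    """Build headers and JSON body for a POST to <lfs url>/objects/batch.

    blobs is an iterable of (sha256_hex, size_in_bytes) pairs.
    """
    body = {
        "operation": operation,  # "upload" or "download"
        "transfers": ["basic"],
        "objects": [{"oid": oid, "size": size} for oid, size in blobs],
    }
    headers = {"Accept": LFS_MEDIA_TYPE, "Content-Type": LFS_MEDIA_TYPE}
    return headers, json.dumps(body)


def upload_targets(batch_response_text):
    """Map oid -> upload URL for the blobs the server still needs.

    Objects the server already has come back without an upload action,
    so they are skipped here.
    """
    response = json.loads(batch_response_text)
    targets = {}
    for obj in response.get("objects", []):
        action = obj.get("actions", {}).get("upload")
        if action:
            targets[obj["oid"]] = action["href"]
    return targets


if __name__ == "__main__":
    headers, body = build_batch_request("upload", [("aa" * 32, 10), ("bb" * 32, 20)])
    print(headers)
    print(body)
    # Hypothetical reply: the server already has the second blob.
    reply = json.dumps({"objects": [
        {"oid": "aa" * 32, "size": 10,
         "actions": {"upload": {"href": "https://lfs.example.invalid/aa"}}},
        {"oid": "bb" * 32, "size": 20},
    ]})
    print(upload_targets(reply))
```

Each URL returned by upload_targets() would then receive one HTTP PUT
with the blob content, as in the second half of the sequence.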


>> Just to be sure to have understood correctly, the current version of
>> "hgweb" can be used as a remote git-lfs endpoint, and so it can be used
>> directly with the "lfs" extension on the clients? Or we need other
>> specialized software on server to run the tests?
> 
> Correct.  Obviously you need to also enable the extension on the
> server also.  The only caveat is that it doesn't currently support the
> locking protocol.

If some existing git-lfs implementation can manage the lock protocol
separately from the data storage (I'm not sure this is possible at
all...), then transparently forwarding lock requests to it could provide
a quick implementation of the protocol on the Mercurial server (at
prototype level at least) [of course after adding the needed
functionality to the client].


>>> I didn't try any file locking (and it's not implemented in Mercurial
>>> client), but maybe if you've already got a 3rd party server and can send
>>> it the right commands to lock it on the server, it will (mostly) work? I
>>> didn't look at any of the lfs extensions or proposals when working on
>>> the server, because I needed the bare bones implementation.
>>
>> So the "lfs" extension could work with the current git-lfs v2 reference
>> implementation (https://git-lfs.github.com)?
> 
> I don't remember a v2 reference when I implemented the server support
> in early 2018.  The batch command and basic-transfers specs I used
> look basically unchanged since then:
> 
>      https://github.com/git-lfs/git-lfs/tree/master/docs/api
> 

According to the "Added: v2.0" note in their documents, I think
locking was the main addition to the v2 specs.


>> I see the locking API is documented here:
>> https://github.com/git-lfs/git-lfs/blob/master/docs/api/locking.md
>>
>> If I have understood, an hypothetical client "lfs-lock" extension on
>> Mercurial could use that API to manage locks on server (if supported by
>> the server of course). The server should then enforce the locks on push.
> 
> Yes, though the functionality should probably just be built into the
> lfs extension.
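
To make the idea concrete, here is a rough Python sketch of the client
side, based on my reading of the locking API document linked above. It
is not an existing Mercurial feature: the function names are
hypothetical, the sample reply is made up, and I left out the optional
"ref" field since I don't know what a Mercurial client would send there.

```python
import json


def build_lock_request(path):
    """Return (method, endpoint, body) for creating a lock.

    The endpoint is relative to the LFS URL of the repository.
    """
    return "POST", "/locks", json.dumps({"path": path})


def paths_locked_by_others(verify_response_text, changed_paths):
    """Return the changed paths that someone else holds a lock on.

    verify_response_text is the JSON reply of POST /locks/verify, which
    splits the locks into "ours" and "theirs".
    """
    reply = json.loads(verify_response_text)
    theirs = {lock["path"] for lock in reply.get("theirs", [])}
    return sorted(p for p in changed_paths if p in theirs)


if __name__ == "__main__":
    print(build_lock_request("assets/level1.bin"))
    # Hypothetical verify reply from the server.
    reply = json.dumps({
        "ours": [{"id": "1", "path": "assets/level1.bin"}],
        "theirs": [{"id": "2", "path": "assets/hero.psd"}],
    })
    print(paths_locked_by_others(reply, ["assets/hero.psd", "README"]))
```

A client could run such a check before a push and warn about the
conflicting files, while the server remains the one that really enforces
the locks.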

I don't know how Mercurial extensions work, but I suppose that an
extension can define distinct "hooks" for when it's running on the
client and on the server?


Thanks again for all the interesting and detailed information!
	Daniele.


