«large» files.

Matt Harbison mharbison72 at gmail.com
Tue Apr 26 02:10:23 UTC 2016


On Sat, 23 Apr 2016 04:12:50 -0400, Uwe Brauer <oub at mat.ucm.es> wrote:

>>>> "Matt" == Matt Harbison <mharbison72 at gmail.com> writes:
>
>     >> On Apr 22, 2016, at 3:42 AM, Uwe Brauer <oub at mat.ucm.es> wrote:
>     >>
>     >>
>>>> On Wed, 20 Apr 2016 09:03:42 -0400, Uwe Brauer <oub at mat.ucm.es> wrote:
>     >>>
>     >>> Be aware that you didn't add this as a largefile if you see this
>     >>> warning. You need to add a file with --large before any name or
>     >>> size based settings you may have setup will be honored.
>     >>
>     >> Thanks, this was very helpful. I was not aware that I need to add
>     >> the --large option. Now is there a hg command which allows me to find
>     >> added large files? Then I could remove and add them but this time
>     >> with the appropriate --large option.
>
>     > Remove and add may not be what you want, since there will be a
>     > revlog for the file in the history.
>
> Sorry for my ignorance, what is the problem with the revlog? Also I must say
> that if removing and adding is not enough, that gets cumbersome.

Nothing wrong with the revlog itself.  It's just that the revlog for this
large file stays in the history forever, even if you remove the file and
re-add it as a largefile, so every clone still carries it.  Why bother with
the largefiles extension if the file is already tracked as a normal file?

There may also be edge cases around removing a normal file and re-adding  
as a largefile.  e.g., would (ext)diff compare the two files, or the  
normal file and the standin for the largefile?

>     > You may want to rebuild the repo, at least starting from the point
>     > where you added the largefile.
>
> Suppose I decide (which was my case) to put an existing directory of
> some dozen (or hundred) files under hg. So I should then check first
> the size of all files. Add only those which are smaller than say 10m.
> And the rest with the --large option.

You can set up a size-based configuration, or pass --lfsize with add.
Certainly for the former, but I suspect also for the latter, you need to
`hg add --large` one file first.  See the last few paragraphs of `hg help
largefiles`.

You probably also want to consider how often you will be modifying these  
files.  I've got a 35MB or so file checked in as a normal file.  But it  
never changes, so making it a largefile doesn't seem worth it.  Starting  
out by determining the largest file in the directory, and deciding how to  
handle it seems reasonable.
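
For the "determine the largest file" step, a minimal sketch in Python,
assuming you just want a size-sorted listing of the directory before adding
anything.  The function name `largest_files` and the `limit` parameter are
made up for illustration; nothing here calls into Mercurial.

```python
import os

def largest_files(root, limit=10):
    """Return up to `limit` (size_in_bytes, path) pairs under `root`, largest first."""
    sizes = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip Mercurial's own metadata if the directory is already a repo.
        dirnames[:] = [d for d in dirnames if d != ".hg"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Only count regular files; symlinks are ignored.
            if os.path.isfile(path) and not os.path.islink(path):
                sizes.append((os.path.getsize(path), path))
    return sorted(sizes, reverse=True)[:limit]
```

Calling `largest_files(".", limit=5)` from the top of the directory gives the
five biggest candidates, which you can then decide to add normally or with
--large.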

> What would happen if I added all files with the --large option?

They would all be tracked as largefiles.  (Or do you mean files already  
tracked?  That should fail the add.  If not, file a bug.)

>
> Ok, I understand that Mercurial is meant for first initializing a repo and
> then creating and adding files step by step, but it is not made (primarily)
> to add a directory which already contains a lot of files.
>
> Is this correct?

`hg add $directory` should add all files in that directory without you  
having to name them.

>     > There's also 'convert' with a largefile option, but that isn't as
>     > feature rich as other conversions. E.g. It may not update .hgtags,
>     > and it definitely doesn't update hashes in commit messages. (This
>     > is from memory, I don't have the code or hg handy ATM.)
>
>     > Try 'hg files "set:size(>13m)"'. Some versions required the arg to
>     > size to be quoted IIRC. This also only works on one revision at a
>     > time- it won't see something you added and then moved or removed in
>     > the past. You can pass a -r <rev> to look at each revision to find
>     > any such files.
>
> Ok, my version 3.0.1 does not contain that command. Maybe after all I
> should upgrade (and abandon using tortoisehg).

Unfortunately, the manifest command doesn't let you specify a fileset.   
Maybe you could hack something together by using the archive command, and  
parsing the output of `tar -tvzf $archive`?
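
A rough sketch of that hack in Python, reading the archive with the standard
`tarfile` module instead of parsing `tar -tvzf` text.  It assumes you already
produced a gzipped tarball of one revision (for example with something like
`hg archive -t tgz`); the function name `large_members` and the threshold
argument are hypothetical.

```python
import tarfile

def large_members(archive_path, threshold_bytes):
    """Return (size, name) for regular files in the tarball above the threshold."""
    # "r:*" lets tarfile detect the compression (gzip, bzip2, none) itself.
    with tarfile.open(archive_path, "r:*") as tar:
        hits = [(m.size, m.name) for m in tar
                if m.isfile() and m.size > threshold_bytes]
    # Largest first.
    return sorted(hits, reverse=True)
```

Like the fileset approach, this only sees one revision at a time, so you
would archive and scan each revision of interest separately.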

BTW, if you do upgrade, the size() fileset isn't largefile-aware.  All
largefiles will be exactly 41 bytes, regardless of actual size.  The
archive output will contain the actual largefile.
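
To illustrate where the 41 bytes come from: as far as I recall, the standin
that Mercurial tracks in place of a largefile holds the SHA-1 hex digest of
the content plus a newline.  The snippet below is only a sketch of that
arithmetic under that assumption, not the extension's code, and
`standin_text` is a made-up name.

```python
import hashlib

def standin_text(data: bytes) -> str:
    # Assumption: a standin is the 40-character SHA-1 hex digest plus "\n".
    return hashlib.sha1(data).hexdigest() + "\n"

small = standin_text(b"x")
big = standin_text(b"x" * 10_000_000)
print(len(small), len(big))  # 41 41
```

So a size(>13m) test against the standin can never match, however big the
real file is.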

Why would updating mean you can't use thg?

>
> thanks
>
> Uwe Brauer