hg pull runs out of memory
Isaac Jurado
diptongo at gmail.com
Fri Apr 27 09:07:25 UTC 2012
On Fri, Apr 27, 2012 at 10:16 AM, Hertroys A.
<alban.hertroys at apollovredestein.com> wrote:
>
>> >> Because Mercurial is designed for dealing with _source code_
>> >> quickly. And it's massively more efficient to deal with small
>> >> files typical of source code by reading, writing, and calculating
>> >> deltas on them in memory.
>> >
>> > Of course, but isn't it kind of dumb to attempt the same with a
>> > large binary file?
>>
>> The engineering effort to change the situation is huge and the demand
>> is small.
>
> [...]
>
> For binary data you would compare some file-properties first (creation
> date, size and a CRC check seem a reasonable starting point - although
> the first two will differ between clones, I realise) and if they're
> different you create a new version of the file. That's how many source
> control systems do this.
>
> Yes, that way your repository grows fast if you change your binary
> files a lot and yes, implementing a binary diff that could prevent
> that growth would be a huge effort. ISTR reading about some open
> source tools capable of binary diffs, there's probably no need to
> duplicate the effort.
>
> It's easy to control which binary files need to stay under version
> control if the only effect they have is to increase your repository's
> size. You can see that coming, as it happens gradually, while more
> disk space is relatively easy to obtain (although of course still a
> pain if all your clones for all your developers grow beyond reasonable
> sizes). Compared to the current situation where a large file can
> suddenly drive you past the memory allocation limits of 32-bit systems
> (which is still the majority), I think that's an improvement.
>
> [...]
>
> I think this could be abused as a DoS vulnerability. It only requires
> one person pushing a largish binary file to the repo, and every
> developer with a clone on a 32-bit system is going to run into
> problems and it's kind of hard to fix the repo. Just saying...

I don't get it. Isn't this exactly the problem the largefiles extension
was designed for?
http://mercurial.selenic.com/wiki/LargefilesExtension
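
For reference, the size-plus-CRC comparison suggested in the quoted text
could be sketched roughly as below. This is only an illustration of that
idea, not how Mercurial or the largefiles extension is implemented; the
names file_fingerprint and needs_new_version are made up for the example.
The file is read in fixed-size chunks so that a large binary never has to
fit in memory at once.

```python
import os
import zlib


def file_fingerprint(path, chunk_size=1 << 20):
    """Return (size_in_bytes, crc32) for a file.

    Hypothetical helper: reads the file chunk by chunk so memory use
    stays bounded by chunk_size regardless of the file's size.
    """
    crc = 0
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            # zlib.crc32 accepts a running value, so the CRC can be
            # accumulated incrementally across chunks.
            crc = zlib.crc32(chunk, crc)
    return os.path.getsize(path), crc & 0xFFFFFFFF


def needs_new_version(path, stored_fingerprint):
    """True if the file's (size, crc32) differs from the stored pair."""
    return file_fingerprint(path) != stored_fingerprint
```

A CRC is cheap but not collision-resistant, so a real tool would more
likely use a cryptographic hash; the chunked-read structure stays the same.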
--
Isaac Jurado
"The noblest pleasure is the joy of understanding"
Leonardo da Vinci