Dealing with binary files

Jonathan Lucas jonathan.p.lucas at gmail.com
Thu Jul 28 18:35:49 UTC 2005


On 28/07/05, Stephen Darnell <sdarnell at esmertec.com> wrote:
> 
> > a) by file contents
> > b) by file extension
> > c) by per-file flag
> 

Or d) use the UNIX file utility? I don't know whether there's an
equivalent on Windows, however.
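
To make d) a bit more concrete, here is a minimal sketch of shelling out to
the utility from Python. It assumes a `file` binary that accepts `--brief`
and `--mime-encoding` and prints "binary" for data it does not recognise as
text; that holds for the implementations I know of, but treat it as an
assumption. Both function names are made up for illustration.

```python
import shutil
import subprocess


def encoding_means_binary(output):
    """Interpret the output of `file --brief --mime-encoding`.

    Assumption: the utility prints the single word "binary" when it
    cannot classify the data as text in some encoding.
    """
    return output.strip() == "binary"


def is_binary_by_file_utility(path):
    """Return True or False, or None when no `file` utility is on PATH."""
    exe = shutil.which("file")
    if exe is None:
        return None  # e.g. a stock Windows install; caller must fall back
    result = subprocess.run(
        [exe, "--brief", "--mime-encoding", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return encoding_means_binary(result.stdout)
```

The `None` return is the important part: on a machine without the utility
the caller still needs one of a), b) or c) to fall back on.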

> > Thoughts?
> 
> You may remember my previous email on this topic, and I think
> c) is by far the best option for users of the system, particularly
> for the sorts of projects I've worked on where there are lots of
> different types of files in the repository.
> 
> file contents are a strong but fallible indicator,
> file extension is also a strong but fallible indicator
> 
> I'd suggest that the best solution is a combination:
> - once a file is submitted the flag is the accurate value
> - the flag can be specified when adding a file or a new value
>   can be specified when changing a file
> - if no value is specified for the 'add' then the heuristics
>   kick in: look for matching extensions, then check the first part
>   to see if it smells like a binary
> 
> This is essentially how p4 handles it, and it works really well.
> 
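
The combination described above could be sketched roughly as follows. This
is only an illustration under stated assumptions, not p4's or hg's actual
logic: the extension tables, the 8192-byte sniff window and the "contains a
NUL byte" test are all hypothetical choices, as are the function names.

```python
import os

# Hypothetical extension tables; a real tool would make these configurable.
BINARY_EXTENSIONS = {".png", ".jpg", ".gif", ".zip", ".gz", ".pdf", ".exe"}
TEXT_EXTENSIONS = {".txt", ".c", ".h", ".py", ".html"}
SNIFF_BYTES = 8192  # what "the first part" of the file means in this sketch


def smells_binary(head):
    """Assumed heuristic: a NUL byte in the leading chunk means binary."""
    return b"\0" in head


def guess_is_binary(filename, head, explicit=None):
    """Pick the flag for a file being added.

    explicit: True/False if the user specified the flag, otherwise None.
    head: the leading bytes of the file's contents.
    """
    if explicit is not None:
        return explicit  # a user-supplied value always wins
    ext = os.path.splitext(filename)[1].lower()
    if ext in BINARY_EXTENSIONS:
        return True
    if ext in TEXT_EXTENSIONS:
        return False
    # No extension match: fall back to looking at the contents.
    return smells_binary(head[:SNIFF_BYTES])
```

The result would only be consulted at 'add' time; once the file is
committed, the stored flag is what counts and the heuristics are not run
again.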
> I also work in environments that have people sharing the same
> repositories on unix and windows, where a wide variety of tools
> and editors are being used. I'd really like hg to be a good
> SCM tool for any platform (esp. Unix and Windows), but without
> the option of distinguishing text files from binaries, and being
> able to perform eol transformations, it is going to cause a
> minority of people serious problems.  There'll also be a bunch
> of people for whom it is not a blocker, but a pain.
> 
> > So by doing c), we've made binary handling much more complicated and
> > fixed less than 50% of a problem that was very small to start with.
> 
> Much more complicated?  If you combine it with how the executable bit
> is stored (and a sym link flag), handling of the flag should not be
> more complicated?
> 
> As for the eol transformation, it should be possible to separate this
> from the normal (unix) code.
> 
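
One way to picture keeping the eol transformation separate is a pair of
small filters applied only at the working-directory boundary, with binary
files bypassing them entirely. This is a hedged sketch; the function names
and the idea of storing LF in the repository are assumptions made for the
example, not a description of existing hg code.

```python
def to_repo_eol(data):
    """Normalise CRLF to LF before storing a text file."""
    return data.replace(b"\r\n", b"\n")


def to_working_eol(data, native_eol):
    """Convert stored LF line endings to the platform's convention."""
    if native_eol == b"\n":
        return data
    # Normalise first so an existing CRLF does not turn into CR CR LF.
    return to_repo_eol(data).replace(b"\n", native_eol)


def encode_for_store(data, is_binary):
    """Applied on commit; binary data passes through untouched."""
    return data if is_binary else to_repo_eol(data)


def decode_for_checkout(data, is_binary, native_eol=b"\n"):
    """Applied on checkout; a no-op on Unix and for binary files."""
    return data if is_binary else to_working_eol(data, native_eol)
```

With `native_eol` left at its default, both directions return the data
unchanged for files that already use LF, so the Unix path does no
conversion work.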
> There is the issue of merging the state, but that's not much different
> to what is done now about the exec bit?
> 
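
Merging the flag could, in principle, follow the usual three-way rule. The
sketch below is an assumption about how such a merge might work, not a
description of how hg merges the exec bit today; `merge_flag` is a
hypothetical name.

```python
def merge_flag(base, local, other):
    """Three-way merge of a per-file flag.

    base is the ancestor's value, or None if the file did not exist there.
    Raises ValueError when both sides set different values independently.
    """
    if local == other:
        return local  # both sides agree
    if local == base:
        return other  # only the other side changed it
    if other == base:
        return local  # only the local side changed it
    raise ValueError("flag conflict: %r vs %r" % (local, other))
```

For a two-valued flag a conflict can only arise when there is no common
ancestor value, for instance when both sides added the same file with
different flags.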
> Regards,
>   Stephen



