[PATCH] Allow manipulating files with long names on Windows
Aaron Cohen
aaron at assonance.org
Thu Jan 20 22:06:39 UTC 2011
Thanks for the review.
On Thu, Jan 20, 2011 at 3:17 PM, Mads Kiilerich <mads at kiilerich.com> wrote:
> On 01/20/2011 06:02 PM, Aaron Cohen wrote:
>>
>> # Parent 9f707b297b0f52278acc6c4a4f7c6d801001acb7
>> Allow manipulating files with long names on Windows
>
> A minor detail from http://mercurial.selenic.com/wiki/ContributingChanges :
> "lowercase summary line"
>
> And start the summary line with the extension name.
There's some tension here between fitting in 80 characters for the
description, and having a good summary. ;)
I also wasn't sure whether it was supposed to start with a subsystem
prefix before the subsystem was accepted.
Duly noted though.
> Such history might be more appropriate in an introduction email than in the
> changeset description.
Yes, but I'm actually not sure how to do the introductory email
correctly at all. I'm new to the patchbomb extension. I guess I'm
supposed to use --intro and then compose the message, and end up with
a [0/1] email? I've been fruitlessly looking for a way to send just
the patch with introduction before "---". I also haven't been able to
get "versioned patches" going the way I'd like. I was expecting
--flagV3 to produce a subject like [PATCH][V3], but it does not.
I'll try to do better in the future, I guess using --intro.
>> +''Author: Aaron Cohen<aaron at assonance.org>''
>
> This should be in a standard copyright / license comment like in (most)
> other extensions.
OK
>> + [extensions]
>> + win32lfn=
>
> "lfn" is new abbreviation to learn. Would it make sense to call it "unc" instead?
That was actually what I originally named the extension, but I
reconsidered. UNC seems to be more of a technical detail to me, where
LFN (for long file names) is more descriptive of the extension's
purpose. I can switch back if you feel strongly about it.
>> +_win32 = False
>> +try:
>> + import win32api, win32file, winerror, pywintypes
>> +
>> + _win32 = True
>
> Beware of demandload that might delay the import error to winerror is used
> in the next lines.
So the _win32 = True needs to come after the first usage? OK. I
probably need to test more without pywin32 installed.
>> + _errmap = {
>> + winerror.ERROR_ALREADY_EXISTS: errno.EEXIST,
>> + winerror.ERROR_PATH_NOT_FOUND: errno.ENOENT
>> + }
>> +except ImportError:
>> + pass
>> +# UNC filenames require different normalization than mercurial and python
>> want
>> +def unc(path):
>> + global _suppressunc
>> + if not _suppressunc:
>> + _suppressunc += 1
>
> The trick here is that abspath is patched and might recurse back here? That
> might deserve a comment.
>
> Would it be possible to store the original abspath and call that instead?
Yes, that was the trick. Unfortunately, the unpatched abspath doesn't
behave correctly for long paths. However, the patched one no longer
recurses, this is now just an unneeded holdover from an earlier
version and I'll delete it.
>> +def wrap(method):
>
> "wrap1" would be more descriptive, especially compared to wrap2.
OK
>> +# vanilla os.listdir handles UNC ok, but breaks if they're longer than
>> MAX_PATH
>
> Use docstring instead of comment.
Won't that replace the docstring for listdir with mine? If so do I
need to copy all the original docstring too? Maybe I need to do that
anyway.
>> +# vanilla returns a relative path for filenames longer than MAX_PATH
>> +# os.path.abspath(30 * "123456789\\") -> 30 * "123456789\\"
>
> You could perhaps phrase this as a doctest in the docstring.
That's indicating the incorrect behavior of the original rather than
the desired behavior.
>> +# vanilla loses a trailing backslash:
>
> Why is that a problem?
If you do os.path.join(os.path.split("\\\\?\\C:\\"), "foo") you end up
with \\?\C:foo which is invalid. Same problem with dirname.
>> +# os.path.split('\\\\?\\C:\\') -> ('\\\\?\\C:', '')
>> +def wrapsplit(split):
>> + def lfnsplit(path):
>> + result = split(path)
>> +def wrapchdir(chdir):
>> + def lfnchdir(path):
>> + if len(os.path.abspath(path))>= 248:
>
> A magic number? Put it in a "constant" with a descriptive name or add a
> comment.
Will fix.
>> + path = unc(path)
>> + if os.path.exists(path):
>> + # Use an environment variable so subprocesses get the correct
>> cwd
>> + os.environ["CD"] = path
>
> Is this CD variable magic or used in other places? If it is for this use
> only we might want to make sure it doesn't collide with other uses - for
> example HGWIN32LFNCWD.
CD is the variable cmd.exe sets as you change directories. It's not
set when hg is running though, so I thought I'd repurpose it. I can
change it though.
> But it seems like a bad idea that chdir doesn't do a chdir. That will most
> likely have unexpected consequences. Wouldn't it be better to just fail if
> cwd is too long?
That would be possible I guess, there aren't that many places in hg
where os.chdir is used, and it's almost always to repo.root. The most
important usages seem to be in the record extension and the patch
command.
I originally just documented that the root of the repo shouldn't be
more than 244 characters long, but figured I'd give it my best shot to
fix it "right".
>> +def uisetup(ui):
>> + if not _win32:
>> + ui.warn(_("This extension requires the pywin32 extensions\n"))
>
> It will be hard for the user who get this message to figure which extensions
> "this" refers to.
Will fix
>> + if hasattr(util, "unlinkpath"):
>> + util.unlinkpath = wrap(util.unlinkpath)
>> + if hasattr(util, "unlink"):
>> + util.unlink = wrap(util.unlink)
>
> This invasive monkey patching looks a bit scary and fragile and might make
> it less easy for it to become an "official" extension.
This is actually pretty much identical to what win32mbcs patches, just
written out directly instead of with a fancy loop. I find this easier
to read, but could switch to a style like mbcs is using. In fact,
anywhere win32mbcs isn't patching a function I patched, I'd bet they
have a bug. (util.makedirs for instance).
>From a quick readthrough, I would expect the two extensions to
actually be able to coexist, but only if mine loads first.
>> +def list(ui, repo):
>> + for root, _ignored, files in os.walk(repo.root):
>> + for file in files:
>> + if len(root + file)>= 259:
>
> MAXPATH - 1?
>
> "contrib/check-code.py hgext/win32lfn.py" will complain here.
Hmm, not for me. I've been running that actually. The only complaint
it has is about one line that's too long, that I deliberately left in
because it's a URL. I can split that line if desired. I'll replace the
magic number though.
>> + c = ui.promptchoice(_("Delete %s? [N/y]") % path,
>> + (_("&No"), _("&Yes")), 0)
>
> promptchoice is usually used with prompts such as "lowercase...(yn)".
OK, I should have looked harder for an example.
>
> Mercurial don't use commands with multiple keywords. It is one keyword and a
> number of options.
>
> I suggest you use two completely different commands. Especially because the
> commands do something completely different.
OK
> Finally:
>
> Many fine extensions have their own life and isn't distributed with
> Mercurial. That has the advantage that they can support multiple Mercurial
> versions (if they can) and they can use their own release and bugfix
> schedule.
Yes, I hear you. I do consider it an unfortunate weakness of Mercurial
that it can have repositories that it's unable to checkout on Windows.
And it's a regression if someone is moving from SVN, because
TortoiseSVN at least is able to check out these same projects.
> Extensions might be accepted in Mercurial if they have proven that they are
> stable and widely used and actively maintained.
>
> I suggest you publish this extension somewhere (for example on bitbucket)
> and add it to http://mercurial.selenic.com/wiki/UsingExtensions . Time will
> tell if it would be better to distribute it with Mercurial.
Alright
> I don't know how much you have looked at the fixut8 and win32mbcs
> extensions. They solve similar problems in a similar way but do it very
> differently. Who should learn from who? Are they compatible? Could they
> share some infrastructure?
As I said, the list of patched functions is actually very similar
between my patch and win32mbcs. It is perhaps a little less startling
in win32mbcs because it's less obvious exactly how many functions are
being patched!
I don't have a shiftjis system to test on, but I would guess that my
extension would coexist well with mbcs if mine loads first. If mine
loads second, I can tell that some of the patches mbcs needs would be
lost.
--Aaron
More information about the Mercurial-devel
mailing list