win32text and excluding patterns
Mark Hammond
skippy.hammond at gmail.com
Tue Apr 14 00:49:02 UTC 2009
On 14/04/2009 8:43 AM, Mads Kiilerich wrote:
> Mark Hammond wrote, On 04/10/2009 06:16 AM:
>> Hi all,
>> I'm trying to get the win32text extension to ignore certain patterns
>> and I'm having trouble. What I want is something like the following:
>>
>> [encode]
>> **.dsp = !
>> **.dsw = !
>> ** = cleverencode:
>>
>> IOW, all files _except_ *.dsp and *.dsw should use clever encoding.
>> I've dug around the mailing lists and the sources and it seems some
>> attempt is indeed made to handle '!' - and it seems to have been
>> introduced with a similar motivation:
>>
>> http://markmail.org/message/g55ev2ka7yseaept
>>
>> But best I can tell, it's not working in my case as the '**' still
>> matches. IOW, using '**=!' is useful - it temporarily disables all
>> encodings - but it's not useful for any other extension when a '**' is
>> in force. Am I misunderstanding something?
>>
>> So I thought maybe a way forward was to define new "pass-through"
>> encoders called, eg, 'exact' - they just return exactly what was passed
>> to them. Then I could do something like:
>>
>> [encode]
>> **.dsp = exact:
>> **.dsw = exact:
>> ** = cleverencode:
>>
>> But this falls over for a similar reason; ordering in the sections is
>> not maintained, so the '**' may still match first.
>>
>> An easy solution that avoids trying to capture "full" ordering might be
>> to have the code classify the filters into 3 categories - those without
>> a wildcard, those with a wildcard, and '**'/'*', and ensure the filters
>> are applied in that order.
>>
>> The handling of '!' still seems suspect to me though - it acts more like
>> "pretend this filter line doesn't exist" than the expected "record that
>> this extension explicitly wants no filtering". Am I missing the intent
>> of '!' (and therefore my idea of a new 'exact' encoding makes sense), or
>> is the implementation of '!' suboptimal, meaning I could implement my
>> requirements by changing the handling of '!'?
>>
>> I'm happy to make a patch for this, but thought I'd check here first
>> that I'm not missing anything obvious and what the best way forward is.
>
> I think it "works as designed": The "!" notation is only for disabling
> all filtering for _a_specific_ pattern
Thanks for the reply.

My point is that it does *not* disable all filtering for a specific
pattern. If I have the configuration:

[encode]
**.dsp=!
** = cleverencode:

filtering is *not* disabled for the pattern '**.dsp' - it gets clever
encoding.

All '!' does is disable that specific rule - almost identical to
commenting out the line (although I understand it's not identical to
commenting, due to the merging of different config files). So what it
actually does is 'allow you to disable a previously configured rule for
a specific pattern'.
I'm not trying to nitpick, but 'disabling a previously defined rule' is
quite a bit different to the user than 'disabling filtering for a
pattern' - I'm after a way of disabling *all* filtering for a specific
pattern.
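
To make the difference concrete, here is a minimal sketch of the two readings of '!'. It is not Mercurial's code: fnmatchcase stands in for Mercurial's own matcher, and current_lookup and proposed_lookup are hypothetical names used only for illustration.

```python
from fnmatch import fnmatchcase

# Rules as (pattern, command) pairs, in the order the filter engine sees them.
RULES = [("**.dsp", "!"), ("**", "cleverencode:")]


def current_lookup(rules, filename):
    """Behaviour as described above: '!' rules are skipped, first match wins."""
    for pat, cmd in rules:
        if cmd == "!":
            continue  # the rule is dropped; nothing is remembered about it
        if fnmatchcase(filename, pat):
            return cmd
    return None


def proposed_lookup(rules, filename):
    """Hypothetical alternative: a matching '!' rule means 'no filtering'."""
    for pat, cmd in rules:
        if fnmatchcase(filename, pat):
            return None if cmd == "!" else cmd
    return None
```

With these rules, current_lookup(RULES, "project.dsp") returns "cleverencode:", while proposed_lookup(RULES, "project.dsp") returns None. Note that the second reading only helps if the more specific rule is consulted before '**'.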
> only be one filter specification for each pattern. I think ordering _is_
> preserved for the filters, but while an early "!" disables that pattern
> it doesn't stop the filter engine from continuing with the next filter
> and match and apply that.
Yes, I understand that - it disables a single rule.
Also, I've re-confirmed that ordering is *not* preserved. A simple
'print' statement, eg:
--- a/mercurial/localrepo.py
+++ b/mercurial/localrepo.py
@@ -539,6 +539,7 @@
         if filter not in self.filterpats:
             l = []
             for pat, cmd in self.ui.configitems(filter):
+                print "CHECK", pat, cmd
                 if cmd == '!':
                     continue
                 mf = util.matcher(self.root, "", [pat], [], [])[1]
shows that the order in which the patterns are printed bears no
relation to their order in the INI file. If you look at the
implementation of ConfigParser, you will note a dictionary is used,
which is why the ordering is lost.
Further, if you check the rest of the implementation of _filter() in
localrepo.py, it uses a dictionary to remember the filters it has
loaded, so even if the config kept the order for us, _filter()'s
current implementation would lose it.
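
Since the config order cannot be relied on, the three-category idea from my original mail could be sketched roughly as follows. This is only an illustration: specificity and order_filters are hypothetical helpers, not existing Mercurial functions, and the wildcard test is deliberately crude.

```python
def specificity(pattern):
    """Return 0 for literal patterns, 1 for wildcard patterns, 2 for catch-alls."""
    if pattern in ("*", "**"):
        return 2
    if any(ch in pattern for ch in "*?["):
        return 1
    return 0


def order_filters(items):
    """Sort (pattern, command) pairs so the most specific patterns come first.

    sorted() is stable, so pairs in the same category keep their input order.
    """
    return sorted(items, key=lambda item: specificity(item[0]))
```

Applied to the items returned by the config lookup, this would ensure a pattern such as '**.dsp' is always tried before a bare '**', whatever order the dictionary happens to yield.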
> What you ask for could be a nice feature, but it isn't obvious to me how
> it would fit what currently is implemented. And note that a patch must
> preserve the current behavior.
The current behaviour as described or as implemented <wink>? I can see
a use case for '!' meaning 'disable all filtering for this pattern', but
I *can't* see a use-case where people would want the INI file like I
posted above to use clever encoding on the .dsp file. Am I missing
something?
> Take a look at mercurial.localrepo.localrepository._filter
>
> Hmm. Perhaps you can do something close to what you ask for with
> something like:
>
> [encode]
> ** = cleverencode:
> **.dsp = cleverdecode:
> **.dsw = cleverdecode:
>
> But the simplest solution would perhaps be to create an extension with
> your own custom "smarterencode" which does exactly what you want.
Yes - that is exactly where I started the email from (although I called
it 'exact' instead of 'smarterencode'). But as I noted above, I'm stuck
here with the fact that ordering is lost, so the '**' may get used
before the more specific pattern.
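
For reference, the pass-through filter itself would be trivial. The sketch below assumes an extension can register data filters the way win32text does, through repo.adddatafilter(name, fn) in reposetup, and that filters are called as fn(data, cmd, **kwargs); both are assumptions to check against the Mercurial version in use. The name 'exact:' is my own.

```python
def exact(s, cmd, **kwargs):
    """Pass-through filter: return the data exactly as it was given."""
    return s


# Hypothetical filter table, modelled on the one in win32text.
_filters = {"exact:": exact}


def reposetup(ui, repo):
    # Assumption: local repos expose adddatafilter(name, fn), as used by win32text.
    if not repo.local():
        return
    for name, fn in _filters.items():
        repo.adddatafilter(name, fn)
```

On its own this does not solve the problem: without a guaranteed ordering, '**' can still be matched before '**.dsp = exact:'.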
I think I must be missing something though, as the current behaviour
doesn't really seem useful for people using Windows all day. My
situation is:
* I've many repos which use exclusively text; **=cleverencode seems a
perfectly good fit for me, as the docs suggest. I want this setting
'globally', not per-repo, so I don't forget to configure a repo and
accidentally create mixed line endings etc in repos I use day-to-day.
* I've a handful of repos with a few files with Windows line endings in
the repo - the .dsp and .dsw files are the obvious ones, but I also note
Mozilla has a fair number of .html, .js etc. files too, so entire
directory trees should be excluded there.
Every single time I work with one of the second class of repos, I get
loads of warnings about the mixed line endings, and 'hg diff' shows the
files as being changed.
All attempts to avoid this have been fruitless. The only solution seems
to *not* use **=cleverencode, but as mentioned, I believe there are good
reasons to ensure hg does the right thing *by default* and not rely on
me remembering some process each time I clone/create a repo.
What do others do here?
Cheers,
Mark