[PATCH 2 of 2 V2] filterlang: add a small language to filter files
Yuya Nishihara
yuya at tcha.org
Fri Jan 12 12:00:33 UTC 2018
On Thu, 11 Jan 2018 11:13:57 -0500, Matt Harbison wrote:
>
> > On Jan 11, 2018, at 10:16 AM, Yuya Nishihara <yuya at tcha.org> wrote:
> >
> >> On Thu, 11 Jan 2018 00:17:39 -0500, Matt Harbison wrote:
> >> # HG changeset patch
> >> # User Matt Harbison <matt_harbison at yahoo.com>
> >> # Date 1515641014 18000
> >> # Wed Jan 10 22:23:34 2018 -0500
> >> # Node ID 548e748cb3f4eea0aedb36a2b2e9fe3b77ffb263
> >> # Parent 962b2bdd70d094ce4bf9a8135495788166b04510
> >> filterlang: add a small language to filter files
> >
> >> I also made the 'always' token a
> >> predicate for consistency, and introduced 'never' to improve readability.
> >
> > Perhaps '**' or '.' could be an "always" symbol given patterns are relative
> > to the repository root in filterlang.
>
> I’m thinking ahead to a tracked file that could be converted to this language, and trying to make it readable. This construct seems weird to me:
>
> **.c = !**

Ah, okay. always()/never() or all()/none() make sense there. I slightly
prefer all()/none(), as fileset is the language for set operations and we
have all() in revset.

> >> diff --git a/mercurial/filterlang.py b/mercurial/filterlang.py
> >> new file mode 100644
> >> --- /dev/null
> >> +++ b/mercurial/filterlang.py
> >> @@ -0,0 +1,73 @@
> >> +# filterlang.py - a simple language to select files
> >
> > The module name seems too generic.
> > minifileset.py, ufileset.py, etc. or merge these functions into fileset.py?
>
> minifileset.py I guess? My concern with putting it in fileset.py is how to enforce the boundary clearly.
Seems fine.
> >> +def _compile(tree):
> >> +    op = tree[0]
> >> +    if op in ('symbol', 'string'):
> >> +        name = fileset.getstring(tree, 'invalid file pattern')
> >> +        op = name[0]
> >> +        if op == '*': # file extension test, ex. "*.tar.gz"
> >> +            return lambda n, s: n.endswith(name[1:])
> >
> > Better to make sure no metacharacters in name[1:].
>
> Aren’t meta characters allowed in a string, so as to not block certain file names? Does this mean symbol and string have to be handled separately?
I meant '*.*' shouldn't be translated to n.endswith('.*'), for example.
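
To illustrate the point, here is a minimal sketch of such a check. It is not
taken from the patch: the helper name compile_suffix, the set of characters
treated as metacharacters, and the choice to raise ValueError are all
assumptions made for the example.

```python
# Hypothetical helper, not part of the patch under review.
# Assumed set of glob metacharacters that a plain suffix must not contain.
_GLOB_META = set('*?[]{}\\')

def compile_suffix(pattern):
    """Compile a "*.ext" pattern into a (name, size) predicate."""
    if not pattern.startswith('*'):
        raise ValueError('not a suffix pattern: %r' % pattern)
    suffix = pattern[1:]
    # Refuse patterns such as "*.*", which would otherwise be turned
    # into a literal n.endswith('.*') test.
    if any(c in _GLOB_META for c in suffix):
        raise ValueError('unsupported file pattern: %r' % pattern)
    return lambda n, s: n.endswith(suffix)
```

With this, compile_suffix('*.tar.gz') gives a plain suffix test, while
compile_suffix('*.*') is rejected instead of being silently mistranslated.
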
> >> +    elif op in ['or', 'and']:
> >> +        funcs = [_compile(t) for t in tree[1:]]
> >> +        summary = {'or': any, 'and': all}[op]
> >> +        return lambda n, s: summary(f(n, s) for f in funcs)
> >
> > IIRC, ('or'/'and', x, y) isn't flattened in fileset.py, so the tree would have
> > exactly 2 operands.
>
> fileset.andset() calls getset(), which checks the arg, but maybe that’s an artifact of other uses.

That's probably for "()", an empty group.

Here I meant that any()/all() always get exactly two elements, since each
'or'/'and' node has two operands. Just for the record, that isn't a problem.
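
As a rough sketch of why the binary shape is harmless: the toy compiler below
stands in for the quoted _compile(), and the tree layout (nested two-operand
'or' nodes, 'symbol' leaves holding a "*.ext" pattern) is assumed for the
example rather than copied from fileset.py.

```python
# Toy stand-in for the quoted _compile(); the tree shapes are assumptions.
def compile_tree(tree):
    op = tree[0]
    if op == 'symbol':
        suffix = tree[1][1:]  # assumes a "*.ext" pattern
        return lambda n, s: n.endswith(suffix)
    if op in ('or', 'and'):
        # With an unflattened tree, funcs always has exactly two entries.
        funcs = [compile_tree(t) for t in tree[1:]]
        summary = {'or': any, 'and': all}[op]
        return lambda n, s: summary(f(n, s) for f in funcs)
    raise ValueError('unsupported op: %r' % (op,))

# "*.c or *.h or *.py" as nested binary nodes (assumed left-associative).
tree = ('or',
        ('or', ('symbol', '*.c'), ('symbol', '*.h')),
        ('symbol', '*.py'))
match = compile_tree(tree)
```

Nested nodes recurse through compile_tree(), so match() still accepts any of
the three extensions even though each any() call only sees two predicates.
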