How to name template filter to remove duplicates from list?

Yuya Nishihara yuya at tcha.org
Fri Sep 16 09:49:55 UTC 2022


On Fri, 16 Sep 2022 03:38:42 +0200, Manuel Jacob wrote:
> The Unix tool uniq removes only adjacent duplicates, which I found 
> surprising at first. I wouldn’t find "uniq" a good name for any variant 
> of the filter, since removing only adjacent duplicates might surprise 
> users not knowing the Unix tool, and removing all duplicates might 
> surprise users knowing the Unix tool.
> 
> The name "removeduplicates" is descriptive but quite long. It could take 
> an argument to choose whether all or only adjacent duplicates should be 
> removed. Examples:
> 
> "{'a\na\nb\na'|splitlines|removeduplicates}\n"
> -> "a b"  
> 
> "{removeduplicates('a\na\nb\na'|splitlines, onlyadjacent=False)}\n"
> -> "a b"  
> 
> "{removeduplicates('a\na\nb\na'|splitlines, onlyadjacent=True)}\n"
> -> "a b a"  

"uniq"/"unique" sounds good to me. Rust's itertools crate provides unique()
for onlyadjacent=False, and dedup() for onlyadjacent=True.

https://docs.rs/itertools/latest/itertools/trait.Itertools.html#method.unique

I think the uniq command only removes adjacent duplicates because it assumes
that the input is sorted. The command could be used to collapse repeated lines
in unsorted text, but I've never used uniq for that use case.
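
To make the difference concrete, here is a minimal Python sketch of the two
behaviours. The helper names unique() and dedup() are borrowed from the Rust
itertools methods mentioned above and are purely illustrative; they are not
existing Mercurial template filters, and unique() as written assumes the items
are hashable.

```python
from itertools import groupby


def unique(items):
    """Remove all duplicates, keeping the first occurrence of each item."""
    # dict preserves insertion order (Python 3.7+), so first occurrences win.
    return list(dict.fromkeys(items))


def dedup(items):
    """Remove only adjacent duplicates, like the Unix uniq command."""
    # groupby() groups runs of consecutive equal items; keep one key per run.
    return [key for key, _run in groupby(items)]


lines = "a\na\nb\na".splitlines()
print(unique(lines))         # ['a', 'b']
print(dedup(lines))          # ['a', 'b', 'a']
print(dedup(sorted(lines)))  # ['a', 'b']
```

On sorted input the two give the same set of items, which is presumably why
uniq gets away with looking only at adjacent lines.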

I don't know if there's any use case for onlyadjacent=True in hg templates.


More information about the Mercurial-devel mailing list