[PATCH] revset: fix generatorset race condition
Durham Goode
durham at fb.com
Wed Mar 26 00:27:44 UTC 2014
On 3/25/14 5:24 PM, "Gregory Szorc" <gregory.szorc at gmail.com> wrote:
>On 3/25/14, 5:13 PM, Durham Goode wrote:
>> # HG changeset patch
>> # User Durham Goode <durham at fb.com>
>> # Date 1395789007 25200
>> # Tue Mar 25 16:10:07 2014 -0700
>> # Node ID 719d5d962b0c8630c153453a48bfaeccbe9de2f7
>> # Parent 9a09a625bc93ab69248613abcbea041b785f1a0e
>> revset: fix generatorset race condition
>>
>> If two things were iterating over a generatorset at the same time, they
>>could
>> miss out on the things the other was generating, resulting in incomplete
>> results. This fixes it by making it possible for two things to iterate
>>at once,
>> by always checking the _genlist at the beginning of each iteration.
>>
>> I was only able to repro it with pending changes from my other commits,
>>but they
>> aren't ready yet. So I'm unable to add a test for now.
>>
>> diff --git a/mercurial/revset.py b/mercurial/revset.py
>> --- a/mercurial/revset.py
>> +++ b/mercurial/revset.py
>> @@ -2624,10 +2624,8 @@
>> gen: a generator producing the values for the generatorset.
>> """
>> self._gen = gen
>> - self._iter = iter(gen)
>> self._cache = {}
>> self._genlist = baseset([])
>> - self._iterated = False
>> self._finished = False
>>
>> def __contains__(self, x):
>> @@ -2639,28 +2637,29 @@
>> if l == x:
>> return True
>>
>> - self._finished = True
>> self._cache[x] = False
>> return False
>>
>> def __iter__(self):
>> - if self._iterated:
>> - # At least a part of the list should be cached if
>>iteration has
>> - # started over the generatorset.
>> - for l in self._genlist:
>> - yield l
>> -
>> - for item in self._consumegen():
>> - yield item
>> + if self._finished:
>> + for x in self._genlist:
>> + yield x
>
>Shouldn't we return here? Otherwise, don't we iterate over self._genlist
>twice if self._finished == True?
>
>Why do we even need this block? Won't the while loop below handle it?
You are correct, we need a return. I had this block because doing all
that extra stuff below might have real cost when run across hundreds of
thousands of revs.
>
>> +
>> + i = 0
>> + genlist = self._genlist
>> + consume = self._consumegen()
>> + while True:
>> + if i < len(genlist):
>> + yield genlist[i]
>> + else:
>> + yield consume.next()
>> + i += 1
>>
>> def _consumegen(self):
>> - self._iterated = True
>> -
>> for item in self._gen:
>> self._cache[item] = True
>> self._genlist.append(item)
>> yield item
>> -
>> self._finished = True
>>
>> def set(self):
>>
>
>
More information about the Mercurial-devel
mailing list