[Reviewers] Review Tooling (was: dropped a couple changesets and flushed the queue)

Pierre-Yves David pierre-yves.david at ens-lyon.org
Wed Dec 28 16:17:54 UTC 2016


I've split my reply in two because there are two very distinct topics 
going on here and the email was getting far too long.

Here is the part about our review flow.

On 12/15/2016 11:24 PM, Augie Fackler wrote:
>
>> On Dec 15, 2016, at 3:40 PM, Pierre-Yves David <pierre-yves.david at ens-lyon.org> wrote:
>> […]
>>>> Dropping a patches is not a big deal, it is easy to requeue it as draft
>>>> on the top of the start and resume the discussion. Right
>>>> now, the main pain around dropping is our tooling.
>>>
>>> Is dropping any more complicated than a prune, stabilize and push at this point? (With a warning and/or helping hand to the contributor so they don’t lose their change)?
>>
>> Yes, currently, the stabilization means changeset get rewritten and repushed. As a result:
>
> You misunderstood my question a bit: I meant “prune && stabilize --all && push” is the right set of hg commands? I’m aware of the (nasty) implications for our review tooling. :/

Simple answer: yes, you just need “prune && stabilize --all && push”.
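
Spelled out as a concrete sketch (DROPPED is a placeholder for the 
changeset to drop, hg-committed the usual review repo; "stabilize" is 
the older alias for what evolve now calls `hg evolve`):

  hg prune --rev DROPPED   # obsolete the changeset; its descendants become orphans
  hg evolve --all          # a.k.a. "stabilize": move the orphans onto DROPPED's parent
  hg push -r @ hg-committed   # repush the rewritten stack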

Complex answer: if the user is not available to sync, the dance is a bit 
more involved, as one wants to push a non-pruned successor of the pruned 
changeset somewhere. This involves a dance with the infamous and hacky 
"drop" extension to strip the markers on your side. The details look like this:

You want to drop the changeset with hash DROPPED:
1) hg touch DROPPED   # creates TOUCHED, a successor with a new hash
2) hg rebase -s 'only(@, DROPPED)' --dest DROPPED~1   # move the rest of the stack onto DROPPED's parent
3) hg prune --hidden DROPPED   # DROPPED is hidden after the touch; mark it as pruned
4) hg push --force --rev TOUCHED some-side-repo-for-contributor-to-pull-from
5) hg drop TOUCHED   # strip TOUCHED and its markers on your side
6) hg push -r @ hg-committed
7) send the "this was dropped" email with a pointer to 
some-side-repo-for-contributor-to-pull-from and the TOUCHED hash.

>> 1) dropping existing accept stamps on everything we touched,
>> 2) implicitly accept everything we repushed.
>>
>> The problems gets bigger as the stack above what we drop growth (in our case 92, but this become painful much sooner).
>
> I’ve been thinking about this some, and I don’t yet have any ideas that merit wider sharing.

The main issue here is that dropping a patch means losing other people's 
approval stamps and putting your own on every changeset rewritten as a 
side effect. If we got better at detecting these pure rebases without an 
actual patch rewrite, we would be much more agile here. Dropping a 
changeset would be a lighter-weight process, and we could even reorder 
long-validated patches or urgent bugfixes earlier in the stack to land 
them quickly.
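
To make that concrete, here is a rough sketch of the kind of check meant 
here, assuming we already know the predecessor/successor pair (OLD and 
NEW are placeholders): treat the rewrite as a pure rebase when the 
exported patch is identical once the volatile header lines are stripped. 
Real tooling has to be smarter about hunk offsets shifting after a 
rebase, so this is only an approximation:

  # export both versions; --hidden gives access to the obsoleted predecessor
  hg export --hidden -r OLD | grep -v '^# ' > old.body
  hg export          -r NEW | grep -v '^# ' > new.body
  if diff -q old.body new.body > /dev/null; then
      echo "pure rebase: accept stamps could be carried over"
  else
      echo "content changed: a fresh review is needed"
  fi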

Logilab has had good code to detect that since about 2012. I think I 
pointed Matt to it earlier this year when he was working on the accept 
tooling, but I don't think he got around to updating our tooling with 
it. I can dig up a pointer to that code if you want.

Just having that detection code ported and running would be a huge win 
in my opinion.

> The only thing I’ve come up with (and mostly discarded) is pushing lots of anonymous heads, and as things get automatically accepted a robot would do the merge and (absent any conflicts or test failures) auto-land the merge of the accepted series.

For the record, Logilab has a process built around that: many heads 
that get linearized and published when ready. The differences are:

- submissions are done by pushing to a non-publishing repo (see the 
config sketch after this list),

- review is web-based only (with some email notifications),

- final test validation and publishing are done manually by maintainers 
as the second approval stamp.
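
For reference, the "non-publishing repo" part is just standard Mercurial 
configuration on the server-side repository; a minimal sketch:

  # on the server-side submission repository: keep pushed changesets in
  # the draft phase so they stay rewritable during review
  cat >> .hg/hgrc << 'EOF'
  [phases]
  publish = False
  EOF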

> It’s not that much code, and it might be kind of nice, but the pile of anonymous heads could be really annoying.

<humor>
/me adds another mark to the FeatureBranchesStruggle idols. May they 
hear our prayer
</humor>

> Suggestions welcomed - I might be able to spend some time on this in Q1, as it’s starting to be a big problem for FB/GOOG/MOZ.

The branchy thing described above is nice and shiny, but I don't see it 
happening any time soon given the amount of effort we have historically 
put into tooling. I would prefer we start improving the tooling around 
the current process, as discussed in the past. That would allow big 
returns for a small amount of invested energy.

Some time ago I spent half an hour throwing together a basic Munin[1] 
probe with some patch-flow data (I was head down in Munin probes for 
something else[2]). That's not much, but it was quite easy to get. I'll 
try to get the data published somewhere actually visible when I find the 
time. For now it only shows up on my home server's monitoring…
You can see the script in action by running /home/marmoute/bin/patchflow
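
For anyone curious, a Munin probe for this kind of data is tiny. This is 
not the actual patchflow script, just a sketch of the plugin protocol, 
using the number of draft changesets in a local clone as a stand-in 
metric (REPO is a placeholder):

  #!/bin/sh
  # minimal munin plugin sketch
  REPO=/path/to/hg-committed
  if [ "$1" = "config" ]; then
      echo 'graph_title Mercurial patch flow'
      echo 'graph_vlabel changesets'
      echo 'draft.label draft changesets in flight'
      exit 0
  fi
  count=$(hg --cwd "$REPO" log -r 'draft()' -T '.' | wc -c)
  echo "draft.value $((count))"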


The process we currently use is not perfect, but it is well field-tested 
and has yielded better results in the past than it does now. For 
reference, here is the historical data on accepted patches per release 
(a sketch of how such counts can be reproduced follows the list):
- version 3.3: 857
- version 3.4: 905
- version 3.5: 995
- version 3.6: 951
- version 3.7: 1127
- version 3.8: 1083
- version 3.9: 602
- version 4.0: 599
- current @: 433 (¾ of the way through the cycle; 433 ÷ ¾ projects to ≈577 at the end of the cycle).
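
(For the record, such counts can be approximated directly from the 
repository with a revset; a sketch, which may not match exactly how the 
numbers above were computed:

  # non-merge changesets that are in the 3.8 release but not in 3.7
  hg log -r 'only(3.8, 3.7) and not merge()' -T '.' | wc -c
)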

We can see a significant drop of at least one third in throughput at the 
point where Augie and Yuya ended up handling review mostly alone (and 
thanks to both of them for that). Getting more people back into review 
should bring us back up to a better speed.

Regarding short-term improvements, I should be back to full activity in 
2017, which should help the rate significantly. The next obvious thing 
is to try to get Kevin and Martin a bit more into the game ;-) If the 
current latency is a problem for Google, they have a pretty simple card 
to play by putting Martin back into a place where he is happy doing 
review. The original idea behind having Facebook contract Kevin on infra 
stuff was to get him a foot back into the project so that he could 
become an active reviewer again. That never really happened. 
Kevin, what would help you become active again?

Taking a more medium-term view, I think minor improvements to our 
tooling could help get more people involved in pre-review and then, 
eventually, core review. That means (1) clear tooling and a clear 
process for contributors to mark things as pre-reviewed, and (2) 
statistics available to show that this pre-review helped. (3) Having 
more stats would also help us nudge some people into doing more review 
to balance their submission load, and spot people doing mostly 
successful pre-review. There is a small group of people putting effort 
into pre-review right now, and I have good hope of being able to suggest 
them as reviewers sometime in 2017 (e.g. junw is on a good progress slope).

To summarize, I would try to improve things in the following order:

1) get automated statistics about the current flow,
2) get current reviewers to be as active as they can/want to be,
3) improve the pre-review flow and usage,
4) use the statistics to check what is still on fire and focus on fixing 
that next.

Points (1) and (3) require someone spending actual time writing a 
minimal amount of tooling. I think we should try to find some small 
funding for that. We should be able to find someone to work on it on a 
paid basis, and that would unlock the current stall (e.g. Kevin, 
Logilab, av6?).

Cheers,

-- 
Pierre-Yves David

[1] http://munin-monitoring.org/
[2] I now have a graph of the particulate levels around my flat 
(disclaimer: it is usually scary).


