Topic on Talk:GitLab/2020 consultation

Lowering the barrier to contributing

MusikAnimal (talkcontribs)

While Gerrit may have features that are arguably superior in terms of code review (depending on your workflow), to me, it poses too great of a barrier to contributing, and is a constant source of confusion. I've been using it for 4 years and I still find myself occasionally having to ask for help. I can't help but wonder just how many volunteer developers we've lost because of this. Let's say as a new developer I wanted to fix a simple typo, or add a new line to a config file -- why do I need to read a manual on how to do this? Unless our goal is to increase the barrier to contributing, I'd say there's really no contest here... GitLab/GitHub/BitBucket are all scores more user-friendly. Sure, once you are familiar with Gerrit, its powerful features start to shine, but I think we should do our best to foster open-source development by keeping the barrier to contributing as low as possible, just like we try to do on the wiki. It's for these reasons that I would never host my own Wikimedia tools on Gerrit.

That said, if we do stay with Gerrit, I think there are some small improvements we could make to improve the user experience. For instance, I had +2 rights when I first started using Gerrit. On my first attempt at reviewing code, I of course hit the pretty blue "Code Review +2" button, as it would seem that would 'start' the code review process. Two members of my team at WMF did the same thing when they first joined. I think the button should instead say "+2 Merge", and perhaps have a confirmation modal. Or, say the build gets stuck. You might see another pretty blue "Submit" button. I would have expected that to re-submit the jobs, or something, not merge and bypass CI entirely! Again, "Merge" might be the better wording. It's weird that all the buttons have tooltips except the one that actually can cause problems, and the problematic buttons are so easy and inviting to click on. These are just minor examples. I also struggle to navigate the codebase through the UI, can't ever remember how to follow projects, not to mention those secret commands to control CI via comments... the list goes on and on. Left to my own devices, I always use the GitHub mirrors to browse and share code.

I hope my wording does not come off as too strong. A lot of people have put immense work into Gerrit, and I know it works exceedingly well for some people. Perhaps GitLab seems like a toy to some. I suppose it's just a trade-off between power and usability, and I hope we don't neglect the usability aspect when making our final decision.

Nikerabbit (talkcontribs)

I fully agree that we should lower the barrier to contributing, but we should be conscious of the trade-offs. If we switch:

  • productivity of some developers, like me, would likely decrease temporarily as we learn and adapt.
  • productivity of some developers, like me, could possibly decrease permanently, if GitLab does not support certain kinds of workflows as fluently.

In addition, a lower barrier to entry has to be balanced with managing the incoming stream of contributions, not all of them valuable. We know from Wikipedia that it can only work if sufficient tooling and resourcing is present to filter out spam and vandalism and to improve contributions which do not quite meet the requirements. Are we prepared to fight the spam, vandalism and drive-by contributions that are not mergeable without further work? Do we have sufficient guidance for contributors so that they can work with us, and not (unknowingly) against us?

I don't have answers to any of these questions, but I hope that there will be by the end of this consultation. Personally, I will try to figure out the first part: how much my productivity would be affected by the switch.

WDoran (WMF) (talkcontribs)

From my limited experience here, managing the flow of inbound work is already a significant issue at least for our team. This involves making hard choices and trying to balance resources. On Platform Engineering, we've tried to adopt processes that give clear interfaces for other teams but the volume is already quite high.

I do not at all mean to discount this point, I think it's valuable and prescient but above all something we should already have impetus to address. Building up a better experience both for our internal teams and external contributors should absolutely be a focus.

I'm not sure if it's possible but it might be worth reviewing the practices of other large scale groups and seeing what we can adopt or if there is a willingness to knowledge share with us. I know our own team had an excellent experience working with Envoy recently to contribute upstream changes.

Hashar (talkcontribs)

I am pretty convinced it is a social problem rather than a tooling issue. We had the same problem in the CVS/Subversion era: new commits were sent to a mailing list and reviewed after the fact. In 2008, Brion sprinted the Extension:CodeReview (GitHub was just starting at that time), which at least made it easier to process the backlog. I came back as a volunteer in 2010 and went on a review frenzy, but we still had glitches.

Others may correct me, but the main incentive was to switch to git. Gerrit came with the nice addition of holding the flood of patches as pending changes, which nicely fitted MediaWiki: patches were on hold until reviewed, thus protecting production.

Gerrit surely has its flaws, but I don't think the review issue is a tooling issue; it is entirely social and related to our "bad" (but improving) development practices and community as a whole.

For the tooling consultation, we might be able to look at repositories maintained by Wikimedia on GitHub and see whether the reviews are better handled there. But the corpus of repositories is vastly different (in my experience, interactions for a given GitHub repository are mostly from a single WMF team).

MusikAnimal (talkcontribs)

Will GitLab login require a Wikimedia developer account, like Gerrit does? If so I think that alone would cut out a lot of drive-by garbage, at least spam and vandalism. I can't imagine it'd be much worse than what we see on Phabricator, no? Even if there was an approval process to get access, that might be okay... my issue is good-faith, competent developers (volunteer and staff alike) who already have access still struggle to use the software. It's not just about making patches, but participating in code review, and doing basic things like watching projects and navigating the code, or even finding the command to clone a repository (though downloading an individual patch I think is easy enough to figure out). Or say I click on a Change-Id, it forwards me to the patch, and all of a sudden my browser's history is polluted with redirects, making it hard to get back to the previous page. It's all the little things that, combined with the confusing CI system, can turn routine tasks into headaches. This all is of course just my opinion/experience. I am fairly confident these days with Gerrit, but it took a long time for me to get here.

BBearnes (WMF) (talkcontribs)

Will GitLab login require a Wikimedia developer account, like Gerrit does?

Yeah, that's the plan.


(Edit: Well, that's my assumption as to what the plan would be. Specifics will need work, but GitLab CE supports LDAP.)

Tgr (WMF) (talkcontribs)

Like others, I'm worried we are misidentifying the problem here. I agree in theory that we should prioritize a low barrier of entry and good learning curve above power-user-friendliness - both for pragmatic reasons (we can always use more hands, and the Wikimedia open source projects seem very far below the potential that being a top-10 website and the top free knowledge management tool should grant them) and because it fits well with our values of openness and equity.

In practice, though, I agree with Hashar that the main bottleneck is human. This is something the "why" section of the consultation doesn't engage with as well as it should - yes, surveys have shown code review to be the biggest pain point, but we don't have any good reason to think Gerrit was the main reason for that. Resoundingly, the biggest complaint is the lack of reviewer response; the WMF has so far chosen not to invest significant resources into fixing that. So I worry that 1) this will be a distraction (we feel good that we are now doing something about developer retention, so addressing the real problem is delayed even further); 2) maybe even harmful if GitLab is worse at supporting efficient code review (one thing Gerrit excels at is finding patches; as such it's reasonably okay at supporting our somewhat unusual situation of a huge pile of repos with unclear or lacking ownership, and some repos which are too large for repo-level ownership to be meaningful); 3) it will just lead to more churn (if you have a social system with a limited capacity for supporting newcomers which is already overloaded, and you make the technical means of joining that system easier, you'll end up with the same amount of successfully integrating users but much more deflected ones, who have negative experiences with the Wikimedia developer community and it will be harder to reach them later once we improved things).

To phrase things more actionably, I'd really like to see Gerrit and GitLab compared specifically in terms of their ability to support code review if it remains a largely voluntary activity, not incentivized or rewarded by management. Will it become easier or harder to find unreviewed patches across repos, by various criteria like "recently registered user" or "productive volunteer contributor"? Will it be easier or harder to track code review health on a global or repo level? Will code review take less or more time?

Tgr (WMF) (talkcontribs)

I'd add that CI is IMO the one area where tooling can efficiently support code reviewers - tests and linters basically provide automated code review, and they reduce the reviewer burden as long as they provide it in a comprehensible format. This is something our current system is really bad at - patch authors need to figure out what went wrong by parsing dozens of pages of console logs, a terrible experience for new developers (and an annoyance for experienced ones). I'm not sure how much that is an issue with Gerrit though. It had the ability for years to filter out bot noise from review conversations, for example, and we haven't bothered to make use of it until recently. Since recently it has the ability to assign test errors to specific lines and show them in context, and there is no organized, resourced effort to convert our test tooling. So again I don't know if the switch would address the real issue there. Does GitLab even support inline CI comments? From speed-skimming the docs, my impression is it does not (interactive CI debugging OTOH sounds like a really cool feature, but it is not for beginners). Making sure all of our major test/lint tools play nice with Gerrit features like inline comments and fix suggestions could IMO be more impactful for new developer retention while being a less ambitious (i.e. less risky) project.

Hashar (talkcontribs)
Tgr (WMF) (talkcontribs)

@Hashar yes, and it is not on any team's roadmap (much less on the annual plan) to do so. Kosta has done an amazing job with SonarCloud, and there is a working group doing great work, but it's mostly a personal effort that is happening due to the dedication of the participants, and to the extent they can find free time for it. Meanwhile we are considering this moonshot project to address a problem when there are bigger problems that could be addressed with far less effort.

I don't want to downplay Gerrit's UX weaknesses, it is certainly a serious problem for developer retention. I find the arguments that we should at some point migrate away from it convincing, and as a superficial first impression GitLab seems like a decent place to move to. But given there are problems which are more severe and can be addressed with less cost and less risk, it feels a bit like a prioritization fail.

ProcrastinatingReader (talkcontribs)

I have no comment on all the nuances described elsewhere on this talk, but I can say that Gerrit is a huge bar to contributing. I don't understand any of it (to be fair, I haven't tried, and don't intend to learn) -- I know two commands and I get by on them. So maybe it's not the biggest bar in practice, but it's a psychological / "can I really be bothered" bar. Versus just knowing what to do, and being able to spend your time on the code rather than on learning Gerrit. Most devs, especially volunteer ones, will not be exclusively contributing to MW. And I would hypothesise it's likely most other projects they contribute to are on GitHub, or using the GH flow. Hence it's more intuitive and a lower barrier to entry.


I think it would certainly help improve contributions. Admittedly, last I used GitLab I didn't have that much love for it (many years ago now), but it is certainly a big improvement, and I think it's better in the long term. I do not think Gerrit is sustainable if we think about the years ahead, when I think these kinds of tools will become more and more forgotten. My opinion: the quicker MediaWiki moves on from Gerrit, the better. And I hope one day something is done about phab too, although that is more a preference rather than a problem.

Btw, respect for everyone who has made Gerrit work this long and tried to abstract away the barrier to entry. Not trying to diminish that work, by any means. But I think there's only so far you can go.
