Talk:GitLab/Migration status
Github repos?
Thank you for the report and the decision. I'm quite happy about the results. I have a question about the third group of repos we have: GitHub repos. There are many canonical repos on github.com under the wikimedia/ org, and it's quite a pain to maintain the third system. Since GitLab workflows mirror GitHub's, can we do something about migrating them to GitLab or Gerrit? Thank you! Ladsgroup (talk) 18:54, 24 June 2024 (UTC)
- Folks should prefer our GitLab to GitHub. And we are keeping an eye on what's developed exclusively on GitHub (there are 186 as of this afternoon: Gerrit/GitHub#Projects_on_GitHub). That said, we have no specific plans to move projects from GitHub to our GitLab instance. Are there specific projects/tasks causing maintenance burden/toil for GitHub?
- I note that during the migration process, we finally got rid of the lingering projects in Phabricator: four systems down to three! TCipriani (WMF) (talk) 20:56, 24 June 2024 (UTC)
"until the end of the fiscal year"
At the end of GitLab/Migration status#GitLab vs Gerrit, recommendations for repos, it says "[migration] will be actively supported until the end of the fiscal year". Does this mean we have about a week to get everything migrated (until the end of the current 2023-2024 FY)? Or is it perhaps referring to the end of the next (2024-2025) FY? Taavi (talk!) 19:39, 24 June 2024 (UTC)
- Bah, that should have read "calendar year", updated now. Thank you for flagging this. TCipriani (WMF) (talk) 19:42, 24 June 2024 (UTC)
Mono repo for core and extensions?
I am wondering if a monorepo strategy for core and extensions would solve the mentioned problems and enable moving to GitLab sooner:
- Cross-repository dependent merge requests -- You can't have a cross-repository dependent merge request if it's all in one repo ;)
- Stacked patchsets -- similar to above, a monorepo would have all of the stakeholders working out of one repo, and there would be a lot of visibility on any change merged into core or extensions, which would ensure that core and e.g. an extension are updated in the same merge
- Dashboards, search, and visibility -- all of the MRs related to core and its extensions would live in one repo, and all MRs would be quickly visible; moreover, GitLab has a To Do feature, and To Dos are automatically added to an account under certain conditions, e.g. being @-mentioned in an issue description or comment
SDunlap-WMF (talk) 15:21, 25 June 2024 (UTC)
- We've talked about it in the context of GitLab, if not explored it in much depth. I tend to think the answer is "not really".
- I'll note that a monorepo has been considered in the past and rejected, although I don't have the history of those discussions in my head (they mostly predated my involvement). Maybe someone else can fill in some background. I don't mean to say that it's obviously a bad idea - it would likely have some benefits - but it introduces other serious complications, and I don't think it would resolve most of our core difficulties with GitLab as a platform.
- A few thoughts re: specific points:
Cross-repository dependent merge requests -- You can't have a cross-repository dependent merge request if it's all in one repo
- Some things might indeed get simpler here. I'm not sure whether this solves the problem unless the scope of the monorepo is "everything that makes up Wikimedia production", but it would probably at least account for things like the first example at Gerrit/Cross-repo_dependencies, of a change to core and a change to an extension that depends on it.
Stacked patchsets -- similar to above, a monorepo would have all of the stakeholders working out of one repo, and there would be a lot of visibility on any change merged into core or extensions, which would ensure that core and e.g. an extension are updated in the same merge
- I don't think stacked patchsets are really a problem of visibility. They're rather a problem of atomicity. It's possible in GitLab to model a stack of proposed changes with a dependency relationship, but it's very far from ergonomic. A monorepo or not doesn't really have much bearing on this part of the problem. I'd welcome thoughts here from people who use the stacked patchset workflow more than I do.
Dashboards, search, and visibility -- all of the MRs related to core and its extensions would live in one repo
- I do think this could help a bit, at least for the scope of the proposed monorepo. BBearnes (WMF) (talk) 16:54, 25 June 2024 (UTC)
- Historically, MediaWiki was a monorepo during the SVN days, and you could just commit and make changes across everything. There was an intentional decision to split it during the migration to Git/Gerrit; I don't have a link that explains the full rationale offhand.
- Conceptually I support and would have loved a MediaWiki monorepo once again, but I think it's just impractical given the amount of code we ship and maintain, and the number of diverse developers we support and want to recruit. My slightly outdated checkout of mediawiki/core + all extensions in Gerrit is ~9.6GB, which is a significant onboarding cost for new developers and casual contributors. Legoktm (talk) 22:00, 25 June 2024 (UTC)
- My friend Brad works full time at Automattic in a team of 4 just to keep their internal monorepo working. The Automattic monorepo would be about 1/10th the size of a similar MediaWiki repo, which would likely need at least as many folks working on the tooling to keep such a large monorepo with external publishing requirements functioning well. Don't forget that to completely replace the utility of Gerrit workflows with a monorepo on GitLab, we would need to bundle in services and libraries as well, in addition to all of mediawiki/core + mediawiki/extensions + mediawiki/skins.
- It is certainly not impossible, but it is a big pivot from the status quo that would need a non-trivial amount of planning and communication to implement. -- BDavis (WMF) (talk) 22:21, 25 June 2024 (UTC)
Using a more open-source code forge (e.g. Gitea)
From what I can tell, the root of the problem is that GitLab is a code forge sponsored by a for-profit company, with an open core, that is unreceptive to merge requests that offer new features.
This is another strong opinion held loosely. Could we consider, or did we consider, a community-managed code forge like Gitea? I've used it with other organisations, and found the feature set pretty similar to GitHub's (including support for a self-managed per-project CI pipeline). It is far more extensible than GitLab because it isn't competing with a premium offering. It is in the process of supporting federation with other federation-aware code forges. Lastly, the Gitea community would be more receptive to pull requests and modifications. SDunlap-WMF (talk) 16:36, 27 June 2024 (UTC)
- We answered this one in our FAQ, under the question "Have we considered GitHub or some other forge?" The answer is that no one has given any serious thought to any other code forges, and we're not planning to do any more evaluations at this time.
- I think that the missing features we flagged are due to more than GitLab being for-profit. It's true that some changes we'd like to make would necessarily compete with enterprise features. But other missing features are a matter of upstream design decisions: our deeply connected repositories have needs that many forges, FOSS or otherwise, don't have on their roadmap.
- Cross-repository dependencies, dependent merge requests/stacked patchsets, and deterministic atomic merges may be features misaligned with many forges' product direction. That said, I expect that awareness of the importance of these features for large projects will grow as time goes on. The evidence for this growing awareness is that features are slowly trickling into the big forges from systems like Gerrit and Phabricator: GitLab's merge trains and GitHub's merge queues, which coordinate merges, are one example. TCipriani (WMF) (talk) 18:54, 27 June 2024 (UTC)