Topic on Talk:Gerrit

Is Gitiles indexed by search engines?

Summary by Lectrician1

https://phabricator.wikimedia.org/T209456 disallowed indexing of Gitiles due to rogue bots.

Lectrician1 (talkcontribs)

Finding Wikimedia code repos hosted on Gitiles can be extremely annoying. Because they don't appear to be indexed by search engines, looking them up returns only their Gerrit changes pages. For example, to get to the MediaWiki core Gitiles repo, I search for it, click on the https://gerrit.wikimedia.org/r result because it's the only one, then click "VIEW CHANGES", and only then do I get the "Browse: gitiles" link.

I've faced even more steps when trying to find repos that have very few changes, are small, and are submodules of another repo. For example, it took quite a while to find the value-view Wikibase repo. A Google search doesn't even yield a link to its Gerrit changes page. I did end up finding a link to its JS documentation, but I have no idea how to get from that to Gitiles. I only found it by looking up changes that had "value-view" in them on Gerrit, clicking on one, clicking on its repo, and then clicking on Browse: gitiles.

It shouldn't take me this many steps. Gitiles repos should show up in search engine results.

Jdforrester (WMF) (talkcontribs)

In general you should use the Wikimedia Code Search tool to find Wikimedia code, as that is much more powerful than general search engines for this specialised use case.

To answer your question: Yes, at least to some extent. For example, searching on Google or Bing for 'mediawiki site:gerrit.wikimedia.org/g' gives me https://gerrit.wikimedia.org/g/mediawiki/core as a result, showing that at least that page is in the index there.
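If the goal is just to skip the click-through from Gerrit, the URL above suggests a shortcut. This is only a sketch: it assumes every repo follows the same `https://gerrit.wikimedia.org/g/<repo path>` pattern as the mediawiki/core example, which this thread does not confirm, and `gitiles_url` is a made-up helper name.

```python
from urllib.parse import quote

# Base taken from the mediawiki/core example URL in this thread.
GITILES_BASE = "https://gerrit.wikimedia.org/g/"


def gitiles_url(repo: str) -> str:
    """Build a Gitiles URL for a repo path such as "mediawiki/core".

    Assumption: all repos share the /g/<repo path> pattern of the one
    example URL quoted above.
    """
    # Drop stray leading/trailing slashes; quote() keeps "/" unescaped by default.
    return GITILES_BASE + quote(repo.strip("/"))


if __name__ == "__main__":
    print(gitiles_url("mediawiki/core"))
```

You still need to know the repo path for this to help, so it doesn't replace Code Search for discovery.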

Lectrician1 (talkcontribs)
Jdforrester (WMF) (talkcontribs)

Yes, I also use Code Search; however, sometimes it doesn't give useful results:

I don't understand what is not useful about that result?

Why, when I plug a Gitiles link into an SEO analyzer, does it say that indexing is not allowed?

Because it's blocked from spidering the non-root pages, apparently, since 2013.
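For anyone who wants to check this kind of block without an SEO analyzer, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules in `HYPOTHETICAL_ROBOTS_TXT` are made up for illustration and are not copied from the real robots.txt on gerrit.wikimedia.org; `is_crawl_allowed` is likewise a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; NOT the actual robots.txt
# served by gerrit.wikimedia.org.
HYPOTHETICAL_ROBOTS_TXT = """\
User-agent: *
Disallow: /g/
"""


def is_crawl_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://gerrit.wikimedia.org/",
        "https://gerrit.wikimedia.org/g/mediawiki/core",
    ):
        print(url, is_crawl_allowed(HYPOTHETICAL_ROBOTS_TXT, url))
```

To test a live site, `RobotFileParser.set_url()` followed by `read()` fetches the site's real robots.txt instead. Note that this parser only does simple path-prefix matching, so it may disagree with crawlers that support wildcard rules.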

Lectrician1 (talkcontribs)
I don't understand what is not useful about that result?

idk. I was expecting some other element to have "DvMonolingualTextValue". I guess I don't know jQuery.

Because it's blocked from spidering the non-root pages, apparently, since 2013.

That looks like it was for Gitblit though, which looks like it was replaced by Phabricator Diffusion and Gitiles. Was that policy just kept on the new systems, or is there another reason for blocking indexing on Gitiles?

Jdforrester (WMF) (talkcontribs)

That looks like it was for Gitblit though, which looks like it was replaced by Phabricator Diffusion and Gitiles. Was that policy just kept on the new systems, or is there another reason for blocking indexing on Gitiles?

Aha, yup, when Gitiles took over from Gitblit we apparently didn't block indexing initially, but the crawling caused a site outage for Gerrit, so per T209456 the block was added back.

Lectrician1 (talkcontribs)

Ok, that's understandable. Thank you for explaining and finding those tasks!