Manual:Search engine optimization
Search engine optimization (SEO) techniques generally attempt to improve the visibility and ranking of a wiki's pages in a search engine's "natural", unpaid ("organic") search results. The higher a page appears on the search engine results page (SERP), the more traffic it tends to get. For many websites this is the largest source of traffic, which makes SEO an important part of attracting visitors. Google is by far the largest search engine,[1] so webmasters should typically focus their energy on improving their rank on Google. Over the years, Google has adjusted its ranking algorithm to favor quality content and user experience. As such, the most important thing you can do is to write unique, high-quality content designed for your human visitors, not for Google's crawler or any other "robots".[2]
Enable short URLs
Humans and search engines prefer descriptive URLs.[3] Short URLs can be used for this purpose. For example, this changes
https://example.com/index.php?title=Water
into
https://example.com/wiki/Water
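The exact setup depends on your web server; see Manual:Short URL for complete instructions. As a minimal sketch, assuming the wiki lives at the domain root and the web server already rewrites /wiki/ paths to index.php, the relevant LocalSettings.php settings look like this:
$wgScriptPath = "";           // base path of the MediaWiki installation (adjust to your install)
$wgArticlePath = "/wiki/$1";  // the "pretty" URL pattern shown to users and search engines
$wgUsePathInfo = true;        // let MediaWiki read the page title from the path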
Enable canonical links
Sometimes a page can have multiple URLs that show essentially the same content. Google sees this as duplicate content, which is not great for ranking,[4] and it can lead to undesirable URLs being shown on the SERP instead of the short URLs. You can avoid this by turning on $wgEnableCanonicalServerLink in LocalSettings.php. This helps the search engine understand that
https://example.com/index.php?title=Water&mobileaction=toggle_view_desktop
(which can be linked even when Short URLs are enabled) is the same as
https://example.com/wiki/Water
and it will display the shorter (canonical) URL on the SERP.
To enable this, add the variable in LocalSettings.php:
$wgEnableCanonicalServerLink = true;
You can also consider setting $wgMainPageIsDomainRoot to true, if applicable.
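A sketch of both settings together in LocalSettings.php (only add $wgMainPageIsDomainRoot if you actually serve the main page from the domain root):
$wgEnableCanonicalServerLink = true;  // emit a <link rel="canonical"> element on every page
$wgMainPageIsDomainRoot = true;       // advertise https://example.com/ as the main page URL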
Create a robots.txt file
Search engine crawlers read a special file named robots.txt, which lets the site owner allow or disallow crawler access to certain pages or parts of the website. The robots.txt file can also hold other information, such as the location of a sitemap. Be aware that many crawlers will NOT respect the rules set in robots.txt, so it is not useful for blocking misbehaving bots (e.g. crawlers looking to harvest email addresses for spam).
Blocking pages generally does not improve SEO, so don't go overboard blocking tons of pages. Restricting access is typically helpful only for very large sites that need to reduce the amount of crawling for performance reasons. You can take a look at Wikipedia's own robots.txt to see the huge number of different bots it attempts to block; this would be overkill for most MediaWiki sites on the web.
Fortunately, MediaWiki already outputs <meta name="robots" content="noindex,nofollow" />
in the HTML of "Edit this page" and "History" pages, so those pages will (probably) never be indexed, particularly when canonical links are used as well.
This is a simple robots.txt to get started:
User-agent: *
Disallow: /wiki/Special:Random
This blocks crawlers from following the "random page" link, which could confuse a robot since the page is different each time it is crawled.
Generate a sitemap and add it to robots.txt
A sitemap shows search engine bots which pages are important to crawl and index, and it tells the bots which pages have been updated and should be re-crawled.[5] MediaWiki has a script to automatically generate sitemaps for whichever namespaces you prefer; see Manual:GenerateSitemap.php. Once generated, the sitemap(s) can be submitted in Google Search Console if you want to see how often Google checks the sitemap and which of its pages are indexed. However, this is not strictly necessary, because Google and other search engines will find the sitemap once it is listed in robots.txt.
Example: Add $wgSitemapNamespaces to LocalSettings.php to set the namespaces to be included in the sitemap.
$wgSitemapNamespaces = [ 0, 2, 4, 14, 502 ]; // Main, User, Project, Category, and (in this example) a custom namespace with ID 502
Set up a cron job to generate the sitemap every 24 hours (or as appropriate).
48 0 * * * /usr/local/bin/php /home/example/public_html/maintenance/run.php generateSitemap --fspath=/home/example/public_html/sitemap/ --urlpath=/sitemap/ --identifier=example --compress=no --skip-redirects
Add the sitemap to robots.txt:
Sitemap: https://example.com/sitemap/sitemap-index-example.xml
Create good links
Internal links: MediaWiki is built to easily allow linking to other pages in the wiki, which can help crawlers find pages and better understand their context and relative importance on the site.[6] Use this feature! See Help:Links, and when practical, use anchor text that fits the natural language of your writing.
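For example, standard wikitext link syntax lets you link with the page title itself or with more natural link text (the page names here are only illustrative):
Hard water contains more dissolved minerals than [[soft water]].
See the [[Water quality|water quality guidelines]] before publishing test results.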
External links: You are the proud owner of a wiki site, which hopefully means you're providing factual information. Turn on the included Extension:Cite and use it to cite your sources! This gives your readers and search engines more confidence in the trustworthiness of your information.[7] Similarly, you should set $wgNoFollowLinks to false to help the engines understand your relationship with the sites you link to; if you do, make sure you keep your wiki free from link spam. See Reference Tooltips for information about enabling the tooltip that appears when hovering over a citation link.
Add to LocalSettings.php:
$wgNoFollowLinks = false;
wfLoadExtension( 'Cite' );
wfLoadExtension( 'Gadgets' ); // for reference tooltips
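With Cite enabled, a source is added in wikitext with a <ref> tag and rendered by a references list at the bottom of the page (the citation below is only a placeholder):
At sea level, water boils at 100 °C.<ref>Example Chemistry Handbook, 3rd ed., p. 42.</ref>

== References ==
<references />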
Optimize your site for mobile devices
Google employs "mobile-first indexing", meaning the crawler will mostly see your site from the perspective of a smartphone.[8] The Extension:MobileFrontend automatically creates a different layout for your site when it detects the user is viewing the site on a mobile device (e.g. smartphone, tablet). This is what Wikipedia uses for its mobile site.
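A minimal sketch of the LocalSettings.php changes, assuming Extension:MobileFrontend and the MinervaNeue skin (the mobile skin Wikipedia uses) are installed:
wfLoadExtension( 'MobileFrontend' );
wfLoadSkin( 'MinervaNeue' );
$wgDefaultMobileSkin = 'minerva';  // skin served to visitors detected as mobile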
Accelerated Mobile Pages (AMP) may be something to consider if you aren't satisfied with the speed of the pages that MobileFrontend generates. The Extension:AcceleratedMobilePages can be used instead of MobileFrontend for the fastest experience possible for your mobile users.
If you prefer to use MobileFrontend over AcceleratedMobilePages but are struggling to hit your Core Web Vitals, consider using a CDN such as Cloudflare.
Install WikiSEO to optimize title and description
An extension is needed in order to control page title tags, add a meta description, and add OpenGraph data. The title and meta description are (usually) what people will see associated with your wiki on the SERP. Extension:WikiSEO allows you to customize these important details and more. Use it wisely to create optimized page titles and meta descriptions. Note that Google completely ignores the meta "keywords" tag, so don't bother using it.
Other SEO extensions are available, but they are not as feature-rich as WikiSEO.
Do not automatically generate meta descriptions. The search engines are much more capable of reading your page and generating a relevant meta description than any of the MediaWiki extensions that automatically generate them. Write good meta descriptions yourself for the best results.
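As a sketch, WikiSEO is loaded like any other extension and then used from wikitext with its #seo parser function; the title and description below are examples, and the extension's documentation lists the full set of parameters. In LocalSettings.php:
wfLoadExtension( 'WikiSEO' );
Then, in a page's wikitext:
{{#seo:
 |title=Water - Example Wiki
 |description=An overview of the properties, sources, and everyday uses of water.
}}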
Optimize images
Humans and search engines alike LOVE pictures! Add high quality and unique images to your wiki pages.
Tips:
- Upload images in high resolution. MediaWiki has $wgResponsiveImages enabled by default, which displays the appropriate resolution to users on devices with high resolution screens.
- Use WebP format (rather than PNG or JPEG) if it's supported by your web host, due to its improved compression (smaller file size).
- Use alt tags. Alt tags help search engines understand the content of the images on your website. Concisely describe what's in the image. You don't need to use the word "picture" or "image"; they already know it's an image.
Example of an alt tag:
[[File:example-image.webp|thumb|A wonderful picture of things|alt=Wonderful example]]
Prevent link rot
Users and search engines don't like to follow a link and end up on a 404 error page. Periodically use a tool to identify dead links on your site. The Extension:Replace Text can be helpful for fixing a link that appears multiple times on a page or on multiple pages.
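If Replace Text is not already available on your wiki, it is loaded like the other extensions above (a sketch, assuming the extension files are installed):
wfLoadExtension( 'ReplaceText' );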
You can add some directives to your robots.txt file to focus a crawler's attention on content pages; any reputable external crawling service should respect them. For example:
User-agent: SiteAuditBot
User-agent: brokenlinkcheck
Disallow: /path/honeypot
Disallow: /index.php?
Disallow: /wiki/Special:
External links
- Google. Search Engine Optimization Starter Guide.
- Blumstein, Aviva (26 September 2011). Tested: The Best Length for a Description Tag is Longer Than You Think.
References
- ↑ The Top 11 Search Engines, Ranked by Popularity
- ↑ Search Engine Optimization (SEO) Starter Guide
- ↑ URL structure best practices for Google
- ↑ Search Engine Optimization (SEO) Starter Guide: Reduce duplicate content
- ↑ Learn about sitemaps
- ↑ https://developers.google.com/search/docs/crawling-indexing/links-crawlable
- ↑ Link best practices for Google
- ↑ Mobile site and mobile-first indexing best practices