Manual:RefreshLinks.php
MediaWiki file: refreshLinks.php | |
---|---|
Location: | maintenance/ |
Source code: | master • 1.42.3 • 1.41.4 • 1.39.10 |
Classes: | RefreshLinks |
Details
refreshLinks.php is a maintenance script that (re)fills the pagelinks, categorylinks, and imagelinks tables. Run it if categories are empty or do not show all relevant pages, if "What links here" does not work well, or if you see other link-related trouble. Additionally, this script purges entries that belong to pages which no longer exist (compare --dfn-only below) from the following tables: pagelinks, categorylinks, imagelinks, templatelinks, externallinks, iwlinks, langlinks, redirect, page_props.
Usage
Basic
php maintenance/refreshLinks.php [starting_article]
For example, to start with the page whose page_id is 8000:
php maintenance/refreshLinks.php 8000
Advanced
php refreshLinks.php [--conf|--dbpass|--dbuser|--dfn-only|--e|--globals|--help|--m|--new-only|--old-redirects-only|--quiet|--redirects-only|--wiki] <start>
Parameters
Option/Parameter | Description |
---|---|
--dfn-only | Delete links from nonexistent articles only |
--new-only | Only affect articles with just a single edit |
--redirects-only | Only fix redirects, not all links |
--old-redirects-only | Only fix redirects with no redirect table entry |
--e <page_id> | Last page id to refresh |
--dfn-chunk-size | Maximum number of existent IDs to check per query, default 100,000 |
--namespace | Only fix pages in this namespace. The namespace should be the numeric ID. |
--category | Only fix pages in this category |
--tracking-category | Only fix pages in this tracking category |
--m <max_lag> | Maximum replication lag |
--wiki | For specifying the wiki ID |
--help | Show help text |
<start> | Article number (page_id) to start at |
no parameters | Will refresh all articles |
This script also supports the common options.
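The options in the table can be combined on one command line. The sketch below is a hypothetical wrapper, not part of MediaWiki: the function name `refresh_links`, the `DRY_RUN` switch, and the `MW_ROOT` path are assumptions made for illustration. It lets you preview a command before running it against a live wiki.

```shell
# Hypothetical helper, not shipped with MediaWiki.
# MW_ROOT is an assumed install path; adjust it for your wiki.
MW_ROOT="${MW_ROOT:-/path/to/mediawiki}"

refresh_links() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Preview only: print the command that would be executed.
        echo "php $MW_ROOT/maintenance/refreshLinks.php $*"
    else
        php "$MW_ROOT/maintenance/refreshLinks.php" "$@"
    fi
}

# Example calls, using only options listed in the table above:
#   refresh_links --namespace 0                      # only pages in namespace 0
#   refresh_links --dfn-only                         # only delete links from nonexistent articles
#   refresh_links --redirects-only --e 5000 -- 1000  # redirects for page_ids 1000 to 5000
```

Set `DRY_RUN=1` in the shell first to print the commands without executing them.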
Example output
me@server:/var/www/htdocs/mw/w/maintenance$ php refreshLinks.php
Refreshing redirects table.
Starting from page_id 1 of 309.
100
200
300
Refreshing links tables.
Starting from page_id 1 of 309.
100
200
300
Retrieving illegal entries from pagelinks... 0..0
Retrieving illegal entries from imagelinks... 0..0
Retrieving illegal entries from categorylinks... 0..0
Retrieving illegal entries from templatelinks... 0..0
Retrieving illegal entries from externallinks... 0..0
Retrieving illegal entries from iwlinks... 0..0
Retrieving illegal entries from langlinks... 0..0
Retrieving illegal entries from redirect... 0..0
Retrieving illegal entries from page_props... 0..0
Avoiding memory issues
This script may run into memory issues. To avoid them, limit each run by setting the last page_id to refresh:
php refreshLinks.php --e 1500
To process the next set of page_ids, run:
php refreshLinks.php --e 3000 -- 1500
Continue until all page_ids in your wiki have been refreshed.
If you forgot to set a last page_id to refresh and the script runs out of memory, simply rerun it with the last page_id it printed as the article to start at, e.g.
php refreshLinks.php -- 1600
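Finding the last printed page_id can be automated if the output was saved to a file. The following is a minimal sketch, assuming progress is reported as bare numbers on their own lines, as in the example output above. The helper name `last_progress_id` and the log file name `refresh.log` are hypothetical.

```shell
# Hypothetical helper: print the last line of a saved log that consists
# only of digits, i.e. the last progress number the script reported.
last_progress_id() {
    grep -E '^[0-9]+$' "$1" | tail -n 1
}

# Usage sketch (refresh.log is an assumed file name):
#   php refreshLinks.php 2>&1 | tee refresh.log
#   php refreshLinks.php -- "$(last_progress_id refresh.log)"
```

Both the redirect pass and the links pass print progress numbers, so check which pass the script was in when it stopped before relying on the value.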
Chunking refreshLinks.php to refresh all links without memory leak
Below is an example script that runs refreshLinks.php against all pages in fixed-size chunks to avoid memory issues.
# Estimate the upper bound from the "Total pages" count reported by showSiteStats.php.
# This is a page count, not the highest page_id, so it is only a guess.
num_pages=$(php /path/to/mediawiki/maintenance/showSiteStats.php | grep "Total pages" | sed 's/[^0-9]*//g')
end_id=0
delta=2000   # number of page_ids handled per run
echo "Beginning refreshLinks.php script"
echo " Total pages = $num_pages"
echo " Doing it in $delta-page chunks to avoid memory leak"
while [ "$end_id" -lt "$num_pages" ]; do
    # Advance the window to the next chunk of page_ids
    start_id=$((end_id + 1))
    end_id=$((end_id + delta))
    echo "Running refreshLinks.php from $start_id to $end_id"
    php /path/to/mediawiki/maintenance/refreshLinks.php --e "$end_id" -- "$start_id"
done
# Just in case there are more IDs beyond the guess we made with showSiteStats, run
# one more unbounded refreshLinks.php starting at the last ID previously done
start_id=$((end_id + 1))
echo "Running final refreshLinks.php in case there are more pages beyond $num_pages"
php /path/to/mediawiki/maintenance/refreshLinks.php "$start_id"