Manual:refreshLinks.php
MediaWiki file: refreshLinks.php | |
---|---|
Location: | maintenance/ |
Source code: | master • 1.42.4 • 1.41.5 • 1.39.11 |
Classes: | RefreshLinks |
Details
refreshLinks.php is a maintenance script that fills or refills the pagelinks, categorylinks, and imagelinks tables. Run it if categories are empty or do not list all relevant pages, if "What links here" returns incomplete results, or if you see other link-related problems. The script also deletes links that belong to nonexistent pages from the following tables: pagelinks, categorylinks, imagelinks, templatelinks, externallinks, iwlinks, langlinks, redirect, and page_props.
Usage
In MediaWiki 1.39 and earlier, invoke maintenance scripts using php maintenance/scriptName.php instead of php maintenance/run.php scriptName.
Basic
php maintenance/run.php refreshLinks [starting_article]
For example, to start the script at the page with ID 8000:
php maintenance/run.php refreshLinks 8000
Advanced
php maintenance/run.php refreshLinks [--conf|--dbpass|--dbuser|--dfn-only|--e|--globals|--help|--m|--new-only|--old-redirects-only|--quiet|--redirects-only|--wiki] <start>
Parameters
Option/Parameter | Description |
---|---|
--dfn-only | Delete links from nonexistent articles only |
--new-only | Only affect articles with just a single edit |
--redirects-only | Only fix redirects, not all links |
--old-redirects-only | Only fix redirects with no redirect table entry |
--e <page_id> | Last page id to refresh |
--dfn-chunk-size | Maximum number of existent IDs to check per query, default 100,000 |
--namespace | Only fix pages in this namespace. The namespace should be the numeric ID. |
--category | Only fix pages in this category |
--tracking-category | Only fix pages in this tracking category |
--m <max_lag> | Maximum replication lag |
--wiki | For specifying the wiki ID |
--help | Show help text |
<start> | Article number (page_id) to start at |
no parameters | Will refresh all articles |
The script also supports the common options.
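As a rough illustration of how the options in the table above combine, the sketch below assembles a refreshLinks command line and prints it instead of running it. The function name build_refresh_cmd and its three positional arguments are hypothetical, invented for this example; only the flags come from the table, and the sketch assumes that --namespace, --e, and the starting page_id can be used together.

```shell
# Hypothetical helper (not part of MediaWiki): build a refreshLinks command
# from an optional namespace ID, last page_id, and starting page_id.
# The body runs in a subshell, so its variables do not leak into the caller.
build_refresh_cmd() (
    namespace="$1"
    end_id="$2"
    start_id="$3"
    cmd="php maintenance/run.php refreshLinks"

    # --namespace takes the numeric namespace ID (0 is the main namespace).
    if [ -n "$namespace" ]; then
        cmd="$cmd --namespace $namespace"
    fi
    # --e sets the last page_id to refresh.
    if [ -n "$end_id" ]; then
        cmd="$cmd --e $end_id"
    fi
    # The starting page_id is positional; "--" separates it from the options.
    if [ -n "$start_id" ]; then
        cmd="$cmd -- $start_id"
    fi

    printf '%s\n' "$cmd"
)

# Print the command for main-namespace pages with IDs 1500 through 3000.
build_refresh_cmd 0 3000 1500
```

Printing the command first makes it easy to review before running it against a live wiki.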
Example output
me@server:/var/www/htdocs/mw/w/maintenance$ php run.php refreshLinks
Refreshing redirects table.
Starting from page_id 1 of 309.
100
200
300
Refreshing links tables.
Starting from page_id 1 of 309.
100
200
300
Retrieving illegal entries from pagelinks... 0..0
Retrieving illegal entries from imagelinks... 0..0
Retrieving illegal entries from categorylinks... 0..0
Retrieving illegal entries from templatelinks... 0..0
Retrieving illegal entries from externallinks... 0..0
Retrieving illegal entries from iwlinks... 0..0
Retrieving illegal entries from langlinks... 0..0
Retrieving illegal entries from redirect... 0..0
Retrieving illegal entries from page_props... 0..0
Avoiding memory issues
This script may run into memory issues. To avoid them, set the last page_id to refresh with --e:
php maintenance/run.php refreshLinks --e 1500
To process the next range of page IDs, pass the previous end as the starting page_id:
php maintenance/run.php refreshLinks --e 3000 -- 1500
Continue until all page IDs in your wiki have been refreshed.
If you did not set a last page_id and the script runs out of memory, rerun it with the last page_id it printed as the article to start at, e.g.
php maintenance/run.php refreshLinks -- 1600
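Finding the last printed page_id by hand is error-prone on long runs. The sketch below reads it from a saved log instead and prints the matching resume command. It assumes progress markers appear as lines containing only digits, as in the example output on this page; the helper names last_page_id and resume_cmd and the log file name are hypothetical.

```shell
# Hypothetical helper: print the last progress marker found in a saved log.
# Assumes progress lines consist of digits only (e.g. "100", "200").
last_page_id() {
    grep -E '^[0-9]+$' "$1" | tail -n 1
}

# Hypothetical helper: print the command that resumes from that marker.
# The body runs in a subshell, so "exit" only leaves the function.
resume_cmd() (
    last_id=$(last_page_id "$1")
    if [ -z "$last_id" ]; then
        echo "No progress marker found in $1" >&2
        exit 1
    fi
    printf '%s\n' "php maintenance/run.php refreshLinks -- $last_id"
)
```

One way to use it is to capture the output of a run with `php maintenance/run.php refreshLinks 2>&1 | tee refreshLinks.log` and, if the run dies, call `resume_cmd refreshLinks.log` to see the command to rerun.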
Chunking refreshLinks.php to refresh all links without memory leak
Below is an example script that runs refreshLinks.php against all pages in chunks to avoid memory issues.
#!/bin/bash
# Estimate the number of pages from the site statistics, keeping digits only
num_pages=$(php /path/to/mediawiki/maintenance/showSiteStats.php | grep "Total pages" | sed 's/[^0-9]*//g')
# Stop early if the count could not be read; the loop test below needs a number
if [ -z "$num_pages" ]; then
    echo "Could not determine the total number of pages" >&2
    exit 1
fi
end_id=0
delta=2000
echo "Beginning refreshLinks.php script"
echo "    Total pages = $num_pages"
echo "    Doing it in $delta-page chunks to avoid memory leak"
# Refresh one range of page IDs per invocation so memory is released between chunks
while [ "$end_id" -lt "$num_pages" ]; do
    start_id=$((end_id + 1))
    end_id=$((end_id + delta))
    echo "Running refreshLinks.php from $start_id to $end_id"
    php /path/to/mediawiki/maintenance/run.php refreshLinks --e "$end_id" -- "$start_id"
done
# Just in case there are more IDs beyond the guess we made with showSiteStats, run
# one more unbounded refreshLinks.php starting at the last ID previously done
start_id=$((end_id + 1))
echo "Running final refreshLinks.php in case there are more pages beyond $num_pages"
php /path/to/mediawiki/maintenance/run.php refreshLinks "$start_id"
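Before launching a long run, it can help to preview which page_id ranges the chunking loop would cover. The following sketch reproduces only the range arithmetic of the script above and prints each range without calling refreshLinks. The function name plan_chunks and the sample values 4500 and 2000 are hypothetical.

```shell
# Hypothetical helper: print "start end" pairs that cover page IDs 1..total
# in steps of delta, using the same arithmetic as the chunking loop.
# The body runs in a subshell, so its variables do not leak into the caller.
plan_chunks() (
    total="$1"
    delta="$2"
    # A non-positive step would make the loop run forever.
    if [ "$delta" -le 0 ]; then
        echo "delta must be greater than 0" >&2
        exit 1
    fi
    end_id=0
    while [ "$end_id" -lt "$total" ]; do
        start_id=$((end_id + 1))
        end_id=$((end_id + delta))
        echo "$start_id $end_id"
    done
)

# Preview the ranges for an estimated 4500 pages in 2000-page chunks.
plan_chunks 4500 2000
# Prints:
# 1 2000
# 2001 4000
# 4001 6000
```

The last range overshoots the estimate, which is harmless because --e is only an upper bound on the page_id to refresh.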