Exporting all the files of a wiki

Exporting all the files of a wiki can be done in a few different ways:

  1. If you have FTP access to the wiki, then you can move the files by following the procedure at Manual:Moving a wiki.
  2. If you lack such access, as can happen for instance if a wiki is abandoned by its site owner, then you will probably need to use workarounds.
    • The procedure below can semi-automate the task of downloading all the files, but you will still have to find a way to upload them to your wiki.

Step 1

  • Follow the procedure at Help:Export#1._Get_the_names_of_pages_to_export, which uses a Python script, to get the names of all the files on the wiki. When you go to Special:AllPages, select the File namespace.
  • Select the names that the Python script outputs, and copy and paste them into Column A of your favorite spreadsheet application (e.g. Microsoft Excel, or OpenOffice.org Calc if you need free software). Each cell should now contain a file page name, e.g. File:AynRand.png
  • In Cell B1, put this formula: ="*[["&A1&"]]"
  • Copy that formula and paste it into the rest of Column B. (Do not paste cell B1 onto itself, or you will get an error like "You are pasting data into cells that already contain data".) Each cell in Column B should now say something like *[[File:AynRand.png]]
  • Go to the wiki you got the file names from and create a new page, e.g. User:JoeSchmoe/All files. Copy and paste Column B into that page, and save.
  • The page will now load; it may take a while, since it loads every file. You should see a listing that looks like this:
... (etc.)...
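
If you would rather skip the spreadsheet, the Column B formula can be reproduced in a few lines of Python. This is only a minimal sketch: the function name to_wikitext_list is made up for illustration, and it assumes you already have the file page names (e.g. File:AynRand.png) as a list of strings.

```python
def to_wikitext_list(names):
    """Return one '*[[...]]' bullet per non-empty name, like ="*[["&A1&"]]"."""
    lines = []
    for name in names:
        name = name.strip()  # drop stray whitespace and line endings
        if name:
            lines.append("*[[" + name + "]]")
    return "\n".join(lines)
```

Printing the result of to_wikitext_list(["File:AynRand.png"]) gives a line you can paste straight into the new wiki page.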

Step 2

  • Now use a Perl program to extract the file names from that page. It prints one quoted, comma-terminated entry per file, ready to paste into the file list of the next script:

    use strict;
    use warnings;
    use LWP::UserAgent;
    use HTTP::Request;

    # The page created in Step 1 (spaces in the title become underscores).
    my $url="http://libertarianwiki.org/User:Joe_Schmoe/All_files_2";
    my $agentName="User:Tisane (http://www.mediawiki.org/wiki/User:Tisane) grabbing some data using FileNameExtract.pl";
    my $browser = LWP::UserAgent->new();
    $browser->agent($agentName);
    $browser->timeout(500);
    my $request = HTTP::Request->new(GET => $url);
    my $response = $browser->request($request);
    if ($response->is_error()) {die $response->status_line."\n";}
    my $contents = $response->content();
    my $delimiter="\n";

    # Each file link in the rendered page carries title="File:<name>".
    my $string='title="File:';
    my $endString='"';
    my $position=0;
    my %seen;

    # Test index() for -1 before adding the marker length;
    # otherwise "not found" is masked and the loop never ends cleanly.
    while ((my $found=index($contents,$string,$position))!=-1){
        $position=$found+length($string);
        my $endPosition=index($contents,$endString,$position);
        last if $endPosition==-1;
        my $fileName=substr($contents,$position,$endPosition-$position);
        $position=$endPosition;
        next if $seen{$fileName}++;    # skip repeated links to the same file
        print '    "'.$fileName.'",'.$delimiter;
    }
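
The same extraction can also be sketched with Python's standard HTML parser, which decodes entities such as &amp;amp; in file names for you. This is an illustrative alternative, not part of the original procedure; the names FileTitleParser and extract_file_names are made up, and it assumes the page HTML has already been downloaded into a string and that file links carry a title="File:..." attribute, as the Perl script expects.

```python
from html.parser import HTMLParser


class FileTitleParser(HTMLParser):
    """Collect the file name from every title="File:..." attribute."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        for key, value in attrs:
            if key == "title" and value and value.startswith("File:"):
                name = value[len("File:"):]
                if name not in self.names:  # skip repeated links to the same file
                    self.names.append(name)


def extract_file_names(html):
    """Return the file names found in the given HTML, in page order."""
    parser = FileTitleParser()
    parser.feed(html)
    return parser.names
```

Calling extract_file_names on the saved HTML of the listing page returns a plain list of names that you can print in whatever format the next step needs.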

Step 3

  • This should generate a list that you can incorporate into another script, which fetches each file description page and prints the file's path under images/:

    use strict;
    use warnings;

    use LWP::UserAgent;
    use HTTP::Request;

    # Files to export from the wiki.
    my @exportFiles = (
        "01-gold-bar.jpg",
        "100px-Massachusetts state flag.png",
        "100px-New York state flag.png",
        "128px-Padlock-red.svg.png",
        # ...and so on...
    );

    # Configuration variables
    my $string      = 'images/';
    my $endString   = '"';
    my $delimiter   = "\n";
    # Matches to ignore because they are not the uploaded file itself.
    my $reject1     = 'LibertarianWiki.gif);';
    my $reject2     = 'icons/fileicon-pdf.png';

    # Initialize the browser
    my $browser = LWP::UserAgent->new();
    $browser->timeout(500);

    # Loop over the indexes, not the values, so $idx can index the array.
    for my $idx (0 .. $#exportFiles){
        my $exportFile = $exportFiles[$idx];

        # Page titles use underscores in place of spaces.
        (my $pageName = $exportFile) =~ tr/ /_/;
        my $url = "http://libertarianwiki.org/File:$pageName";
        my $request = HTTP::Request->new(GET => $url);
        my $response = $browser->request($request);
        if (!$response->is_success) {
            printf STDERR "%s: %s\n", $url, $response->status_line;
            next;
        }

        my $contents = $response->content();

        # Test index() for -1 before adding the marker length.
        my $found = index($contents, $string, 0);
        next if $found == -1;
        my $position    = $found + length($string);
        my $endPosition = index($contents, $endString, $position);
        next if $endPosition == -1;
        my $filename    = substr($contents, $position, $endPosition-$position);
        if ($filename ne $reject1 && $filename ne $reject2){
            # Emit assignments in the form the download script in Step 4 uses.
            print qq{\$myFileName[$idx]="$filename";$delimiter};
        }
    }
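
Step 3 scrapes each file description page to learn the directory prefix (such as 7/78/) in front of the file name. On wikis that use MediaWiki's default hashed upload directories, that prefix is taken from the MD5 hash of the file name with spaces replaced by underscores, so it can also be computed locally as a cross-check. The sketch below assumes the default two-level layout and that the name is spelled exactly as the wiki stores it; the function name hashed_upload_path is made up for illustration. Compare its output with a few paths obtained in Step 3 before relying on it, since a wiki with a custom upload layout will not match.

```python
import hashlib


def hashed_upload_path(file_name):
    """Return 'x/xy/Name' for a file name, assuming default two-level MD5 hashing."""
    key = file_name.replace(" ", "_")  # stored names use underscores
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    # First directory: first hex digit; second directory: first two hex digits.
    return digest[0] + "/" + digest[:2] + "/" + key
```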

Step 4

This in turn will generate a list that you can load into yet another script, e.g.:

    use strict;
    use warnings;
    use LWP::UserAgent;

    my @myFileName;
    $myFileName[0]="7/78/01-gold-bar.jpg";
    $myFileName[1]="5/53/100px-New_York_state_flag.png";
    $myFileName[2]="8/81/128px-Padlock-red.svg.png";
    # ...
    # ...
    # ...
    $myFileName[349]="a/a6/WilliamGodwin.jpg";
    $myFileName[350]="b/b1/Wirtland_Coat_of_Arms.png";
    $myFileName[351]="f/f5/Wirtland_crane.png";

    my $agentName="User:Tisane (http://www.mediawiki.org/wiki/User:Tisane) grabbing some data using DownloadImages.pl";
    my $browser = LWP::UserAgent->new();
    $browser->agent($agentName);
    $browser->timeout(500);
    my $delimiter="\n";

    for my $count (0 .. $#myFileName){
        # Entries skipped in the previous step leave gaps in the array.
        next unless defined $myFileName[$count];
        my $url="http://libertarianwiki.org/wiki/images/".$myFileName[$count];
        my $response = $browser->get($url);
        if ($response->is_error()) {
            printf STDERR "%s: %s\n", $url, $response->status_line;
            next;
        }

        # Drop the 5-character directory prefix (e.g. "7/78/") to get the local name.
        my $newFileName=substr($myFileName[$count],5);
        print $url.$delimiter;
        print $newFileName.$delimiter;

        # Write in binary mode so the image data is not altered.
        open(my $fh, '>', $newFileName) or die "Cannot write $newFileName: $!\n";
        binmode($fh);
        print $fh $response->content();
        close($fh);
    }
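
For comparison, the download loop can be sketched in Python using only the standard library. Two details are worth noting: the local file name is taken as everything after the last slash rather than by skipping a fixed five characters, and the file is opened in binary mode. The base URL is the one used in the Perl script; the USER_AGENT text and the function names build_url, local_name and download are placeholders invented for this sketch.

```python
import os
import urllib.parse
import urllib.request

# Base URL as used in the Perl script; the User-Agent text is a placeholder.
BASE_URL = "http://libertarianwiki.org/wiki/images/"
USER_AGENT = "ExampleFileExporter/0.1 (hypothetical; replace with your own contact details)"


def build_url(relative_path):
    """Join the base URL and a path such as '7/78/01-gold-bar.jpg'."""
    # Keep '/' and any existing %XX escapes; escape other unsafe characters.
    return BASE_URL + urllib.parse.quote(relative_path, safe="/%")


def local_name(relative_path):
    """Return the part after the last '/', i.e. the name to save the file under."""
    return relative_path.rsplit("/", 1)[-1]


def download(relative_path, dest_dir="."):
    """Fetch one file and write it to dest_dir in binary mode; return the local path."""
    request = urllib.request.Request(
        build_url(relative_path), headers={"User-Agent": USER_AGENT}
    )
    target = os.path.join(dest_dir, local_name(relative_path))
    with urllib.request.urlopen(request, timeout=500) as response, open(target, "wb") as out:
        out.write(response.read())
    return target
```

To fetch everything, call download once for each path produced in Step 3, e.g. download("7/78/01-gold-bar.jpg").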

Step 5

  • You should now have all the files downloaded. Uploading them is another issue; perhaps try Extension:MultiUpload if you can get it to work.
