Extension:Data Transfer/ur

MediaWiki extensions manual
Data Transfer
Release status: stable
Implementation Special page
Description Allows for importing and exporting the contents of a wiki's pages in XML and CSV form, using template calls to define the fields
Author(s) Yaron Koren <yaron57@gmail.com>
Latest version 1.7 (April 2025)
Compatibility policy Master maintains backward compatibility.
MediaWiki 1.40+
Database changes No
Composer mediawiki/data-transfer
License GNU General Public License 2.0 or later
Example The "view XML" page for Discourse DB

Parameters
  • $wgDataTransferViewXMLParseFields
  • $wgDataTransferViewXMLParseFreeText
Added rights
  • datatransferimport

Data Transfer is a MediaWiki extension that lets users export data from the wiki and import data into it. Export is done in XML format; import is possible from XML, CSV, and some spreadsheet formats.

Data Transfer is not an ideal solution for backing up a wiki, or for transferring wiki pages from one MediaWiki site to another; for those tasks, MediaWiki's built-in "Special:Export" and "Special:Import" pages are a much better choice.

You can download the Data Transfer code, in .zip format, here.

Or download the code via Git from the MediaWiki source code repository by running this command from the extensions directory:

git clone https://gerrit.wikimedia.org/r/mediawiki/extensions/DataTransfer

To view the code online, including version history for each file, you can go here.

Installation

After you've obtained a DataTransfer directory (either by extracting a compressed file or downloading via Git), place this directory within the main MediaWiki 'extensions' directory. Then, in the file LocalSettings.php in the main MediaWiki directory, add the following line:

wfLoadExtension( 'DataTransfer' );

By default, only administrators/sysops are allowed to import files. If you want other groups to be able to import files, you can add additional lines to LocalSettings.php to allow that. This line, for example, will allow all registered (logged-in) users to import files:

$wgGroupPermissions['user']['datatransferimport'] = true;

To allow anyone reading the wiki to import files, you could add the following (though it's not usually recommended):

$wgGroupPermissions['*']['datatransferimport'] = true;

Usage

Exporting data

Data Transfer defines a special page, "Special:ViewXML", that lets users view (and thus save) the pages in any combination of the wiki's categories and namespaces in XML form. The fields and values in the XML are taken from the fields and values in any template calls contained in the page; any non-template text is put into one or more "free text" tags. In addition, an "ID" field is also displayed for every page, using MediaWiki's internal "article ID" for that page; this is done so that outside systems can track a page with a more fixed identifier than its name (which can change often). The XML contains only the current state of any page: information on authors and dates modified, and information on previous versions of each page, are not recorded.

Two formats for export are supported: the first, or standard one, contains tags of the form <Template name="template-name"> and <Field name="field-name">. The second, or "simplified" one, contains tags of simply the form <template_name> and <field_name>.
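As an illustration of how an outside system might consume the simplified format, here is a minimal Python sketch. It assumes the export looks like the simplified sample shown later on this page (a <pages> root with <page> children holding <id>, <title> and <Free_Text>); the function name read_simplified_export is hypothetical, and template fields, which would appear as further child tags, are ignored here.

```python
import xml.etree.ElementTree as ET

# Sample copied from the "XML simplified output" example on this page.
SAMPLE = """<pages>
  <page>
    <id>28</id>
    <title>Limburger</title>
    <Free_Text id="1">
       <p><b>Limburger</b> is a cheese that originated in the Herve area of the historical Duchy of Limburg.</p>
    </Free_Text>
  </page>
</pages>"""


def read_simplified_export(xml_text):
    """Return a list of dicts with the id, title and free text of each page."""
    pages = []
    for page in ET.fromstring(xml_text).findall("page"):
        free_text = page.find("Free_Text")
        pages.append({
            # The article ID is the stable identifier; the title can change.
            "id": int(page.findtext("id")),
            "title": page.findtext("title"),
            # Flatten any parsed HTML inside Free_Text down to plain text.
            "free_text": "".join(free_text.itertext()).strip()
            if free_text is not None else "",
        })
    return pages


for entry in read_simplified_export(SAMPLE):
    print(entry["id"], entry["title"])
```

Keying on the "id" value rather than the title follows the reasoning given above: the article ID stays fixed when a page is renamed.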

Special:ViewXML can also be used to generate XML for individual pages, by adding a &titles= parameter to the URL, like &titles=Page1|Page2|Page3.
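A script that needs such a URL could assemble it as in the sketch below. The wiki address https://wiki.example.org/w/index.php and the helper name view_xml_url are placeholders, and the sketch assumes the wiki is reached through index.php with a title= query parameter; only the titles= parameter itself comes from the description above.

```python
from urllib.parse import urlencode


def view_xml_url(index_php_url, titles):
    """Build a Special:ViewXML URL restricted to the given page titles."""
    query = urlencode(
        {
            "title": "Special:ViewXML",
            # Page titles are joined with "|", which is percent-encoded as %7C.
            "titles": "|".join(titles),
        },
        safe=":",  # keep the colon in "Special:ViewXML" readable
    )
    return f"{index_php_url}?{query}"


print(view_xml_url("https://wiki.example.org/w/index.php",
                   ["Page1", "Page2", "Page3"]))
```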

By default, the "free text" (non-template) part of a page is parsed by the MediaWiki parser, so that wikitext gets converted into HTML; whereas the values within template calls are not. To disable parsing for the free text, add the following to LocalSettings.php:

$wgDataTransferViewXMLParseFreeText = false;

Conversely, to add parsing for template field values, add the following:

$wgDataTransferViewXMLParseFields = true;

Importing data

Data Transfer defines three special pages, Special:ImportXML, Special:ImportCSV and Special:ImportSpreadsheet, that let users with the import permission (by default, administrators) upload XML, CSV and assorted spreadsheet files, respectively. Once uploaded, the data is turned into pages in the wiki (or, if pages with those names already exist in the wiki, into new versions of those pages).

Importing XML files

The XML import requires the standard, i.e. non-simplified, XML format that "ViewXML" produces, although with several differences: the "ID" attribute for each page should not be present, and tags called "Category" or "Namespace" (in whatever language the wiki is in) should not be present.

XML simplified output
<pages>
  <page>
    <id>28</id>
    <title>Limburger</title>
    <Free_Text id="1">
       <p><b>Limburger</b> is a cheese that originated in the Herve area of the historical Duchy of Limburg.</p>
    </Free_Text>
  </page>
</pages>

XML standard format (input and output)

This is both an output format and the format that is needed for importing data in XML format.

<Pages>
<Page Title="Limburger">
  <Free_Text>
<p><b>Limburger</b> is a cheese that originated in the Herve area of the historical Duchy of Limburg.</p>
  </Free_Text>
</Page>
</Pages>

You can also import content into page "slots" other than the main one, using MediaWiki's Multi-Content Revisions feature, by adding the Slot attribute, like so:

<Pages>
<Page Title="Limburger" Slot="text-notes">
  <Free_Text>
<p><b>Limburger</b> is a cheese that originated in the Herve area of the historical Duchy of Limburg.</p>
  </Free_Text>
</Page>
</Pages>

The text within the Free_Text field must not be indented; leading whitespace there has the same effect as indenting text in an article. If the Free_Text content is HTML and it is indented, all of the records will fail to import. If the HTML text is not indented, the records will import fine.
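If the import file is generated by a script, the indentation rule can be enforced at generation time. The following Python sketch is one possible approach, not part of Data Transfer itself: build_import_xml is a hypothetical helper, it emits only the Page, Title and Free_Text parts shown in the example above, and it assumes the free text is plain text or wikitext (ElementTree escapes any < or & characters rather than emitting nested HTML tags).

```python
import xml.etree.ElementTree as ET


def build_import_xml(pages):
    """Build a <Pages> document from (title, free_text) pairs."""
    root = ET.Element("Pages")
    for title, free_text in pages:
        page = ET.SubElement(root, "Page", {"Title": title})
        body = ET.SubElement(page, "Free_Text")
        # Remove leading whitespace from every line so that nothing
        # inside Free_Text is indented.
        body.text = "\n".join(line.lstrip() for line in free_text.splitlines())
    # No pretty-printing: that would add indentation around the elements.
    return ET.tostring(root, encoding="unicode")


xml_out = build_import_xml([
    ("Limburger", "   Limburger is a cheese.\n   It is soft."),
])
print(xml_out)
```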

Importing CSV files

For CSV import to work:

  • The file must be in standard CSV format (i.e., separated by commas, as opposed to semicolons or anything else)
  • If the file contains non-ASCII characters it must be encoded in either UTF-8 or UTF-16 (the latter being simply called "Unicode" in some Windows programs)
  • The file's line breaks should contain "line feeds" (\n) as opposed to just "carriage returns" (\r). This is especially relevant if you're using Mac OS
  • The top row must contain the name of each column
    • One of the columns must contain the title of each page, and so its column name must be Title (in whatever language the wiki is in)
    • Another column can contain all the free, non-template text in the page: the title of this column must be Free Text (again, in the language of the wiki)
    • All other columns must represent the contents of a single field of a single template call; the name of such a column should be of the form template-name[field-name] (whitespace allowed). There is no need to separately specify the names of the template(s) called in the page.

A brief tutorial on the CSV format: if a value contains a comma, you must enclose it in double quotes. If a value that is enclosed in double quotes itself contains double quotes, each of those must be escaped by doubling it (""). An empty field can either be left empty, or be written as a pair of double quotes (""). You can see here for the full CSV specification.

Here is an example of a CSV file that can be parsed by Data Transfer:

Title,Cheese[Country],Cheese[Texture],Free Text
Mozzarella,Italy,Semi-soft,It's good on pizzas!
Cheddar,England,Hard/semi-hard,"Often sharp, but not always."
Gorgonzola,Italy,"buttery or firm, crumbly","salty, with a ""bite"" from its blue veining"
Stilton,,"",needs more data

You can also import content into page "slots" other than the main one, using MediaWiki's Multi-Content Revisions feature, by adding the Slot column, like so:

Title,Cheese[Country],Cheese[Texture],Free Text,Slot
Mozzarella,Italy,Semi-soft,It's good on pizzas!,text-notes
Cheddar,England,Hard/semi-hard,"Often sharp, but not always.",text-notes
Gorgonzola,Italy,"buttery or firm, crumbly","salty, with a ""bite"" from its blue veining",main
Stilton,,"",needs more data,main
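Rather than applying the quoting rules by hand, a file like the ones above can be produced with a CSV library. The sketch below uses Python's standard csv module, which quotes values containing commas and doubles embedded double quotes automatically; the helper name make_import_csv is hypothetical, and the rows are taken from the first example above.

```python
import csv
import io


def make_import_csv(rows):
    """Serialize rows (lists of strings) as comma-separated CSV text."""
    buffer = io.StringIO()
    # lineterminator="\n" gives plain line feeds between rows.
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    ["Title", "Cheese[Country]", "Cheese[Texture]", "Free Text"],
    ["Gorgonzola", "Italy", "buttery or firm, crumbly",
     'salty, with a "bite" from its blue veining'],
    ["Stilton", "", "", "needs more data"],
]
print(make_import_csv(rows))
```

When saving the result to disk, opening the file with encoding="utf-8" and newline="" keeps the encoding and line breaks in line with the requirements listed above.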

Importing spreadsheet files

For the spreadsheet import, Data Transfer requires the presence of the PhpSpreadsheet library, which does the actual spreadsheet processing. PhpSpreadsheet can handle spreadsheet files in formats including .xls, .xlsx, .ods, Gnumeric, and even PDF and HTML. The titles of the columns should be the same as for CSV files.

Authors

Data Transfer was mostly written by Yaron Koren, reachable at yaron57@gmail.com. Important functionality was also added by Stephan Gambke and Sahaj Khandelwal.

Version history

Data Transfer is currently at version 1.7. See the entire version history.

Common problems

  • The import of each page is a MediaWiki background "job". This means that the pages will not be created immediately; it may take minutes, hours or even longer for all of them to appear. Normally, jobs get run every time a page is viewed on the wiki; to speed up the process (or slow it down), you can change the number of jobs run per page view by setting $wgJobRunRate; the default is 1. A job run rate that is too high can conceivably cause some jobs not to run. To have the wiki run all jobs immediately, execute the script runJobs.php from the operating system command line.

Customizing the export XML

You can specify that any specific page not be included in the XML produced, by adding the category tag [[Category:Excluded from XML]] to that page. You can also add this tag to a template, to exclude any page that uses that template from the XML.

Contributing to the project

Bugs and feature requests

You should use the MediaWiki mailing list, mediawiki-l, for any questions, suggestions or bug reports about Data Transfer.

Contributing patches to the project

If you found a bug and fixed it, or if you wrote code for a new feature, please either do a Git commit of it (if you have a developer account) or create a patch by going to your local "DataTransfer" directory and typing:

git diff > descriptivename.patch

If you create a patch, please send it, with a description, to Yaron Koren.

Translating

Translation of Data Transfer is done through translatewiki.net. The translation for this extension can be found here. To add language values or change existing ones, you should create an account on translatewiki.net, then request permission from the administrators to translate a certain language or languages on this page (this is a very simple process). Once you have permission for a given language, you can log in and add or edit whatever messages you want in that language.

See also

  • Page import - overview of all page import tools