Manual:Pywikibot/PAWS
- See Wikitech:PAWS for more information.
This document provides a quick interactive overview of Pywikibot, using a notebook hosted in the Wikimedia Cloud Services environment through PAWS (PAWS: A Web Shell).
Create a Wikimedia account
To follow this walk-through, you only need a Wikipedia/Wikimedia account. Use Special:CreateAccount to create one.
Once you have created an account, please visit https://test.wikipedia.org/ and check that your username appears in the top right corner (this works around task T120327).
If you are a new user on Wikimedia, log in with your account on Meta-Wiki, Wikipedia, Wikidata, and Commons. On each of them, read and clear any pending messages shown at the top of the page.
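Run the notebook
To start a hosted notebook, visit https://hub-paws.wmcloud.org/hub
Click "Log in with MediaWiki", then click "Allow" when asked to approve "Authentication with OAuth". On your first visit to PAWS, a server needs to be created. Click the green "Start My Server" button. It is normal for a new server to take a few minutes to start.
Once this completes, you will be redirected to a link such as https://paws.wmflabs.org/paws/user/<username>/tree
Run a terminal
To start a new interactive terminal:
- Go to your PAWS home
- Click File > New > Terminal
This opens a new window with a Linux '$' prompt.
This terminal is not an emulator. It is a real bash shell, running as part of a real Linux installation inside a Docker container, so you can use any bash command and any other command available on the installed Linux system.
To see some of the available commands, use ls /bin/.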
$ ls /bin/
bash cat domainname journalctl mkdir pwd stty tar zcmp
unzip2 chacl echo kill mknod rbash su tempfile zdiff
../..
$ ls /usr/bin/
2to3-3.4 dvipdf lcf printf systemd-path
X11 dwp ld prlimit systemd-run
../..
To see them all, press TAB twice.
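Log in to the wiki
This sets up your account on the server and allows you to log in from the command line. The following command should confirm that you can log in to testwiki. It uses OAuth, so no password needs to be entered.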
$ pwb.py login
Logging in to wikipedia:test as <username>
Logged in on wikipedia:test as <username>.
You can change the default wiki by creating a file named user-config.py in the $HOME directory (/home/paws) and adding the mylang and family variables:
mylang = 'test'
family = 'wikipedia'
You can type vim user-config.py in the terminal, press I to enter insert mode, add the text, press Esc to exit insert mode, then type :wq and press Enter to finish editing.
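As an alternative to vim, the same two settings could be written from a Python session. The following is only a minimal sketch and not part of Pywikibot: the helper name write_user_config is hypothetical, and it assumes the target directory is writable and that overwriting an existing file is acceptable.

```python
from pathlib import Path


def write_user_config(directory, lang="test", family="wikipedia"):
    """Write a two-line user-config.py into `directory` and return its path.

    Hypothetical helper for illustration; it overwrites any existing file.
    """
    config_path = Path(directory) / "user-config.py"
    config_path.write_text(
        f"mylang = {lang!r}\nfamily = {family!r}\n", encoding="utf-8"
    )
    return config_path


# On PAWS this could be called as write_user_config("/home/paws").
```

The function is not called automatically, so an existing configuration is never replaced by accident.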
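Create a page
To create a page, enter the following command in the terminal, replacing "<username>" with your username, and press "Y" when prompted to accept the changes: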
$ pwb.py add_text -up -talk -page:"User talk:<username>" -text:"Hello. ~~~~"
Loading User talk:<username>...
>>> User talk:<username> <<<
@@ -0,0 +1 @@
+ Hello. ~~~~
Do you want to accept these changes? ([Y]es, [N]o, [a]ll, open in [b]rowser): Y
Page [[User talk:<username>]] saved
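You have now completed an edit. Open https://test.wikipedia.org/wiki/User_talk:<username> in a web browser to see the change.
You can read more about each command-line script by using the '-help' command-line option.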
$ pwb.py add_text -help
...
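Fetch a page
Many pages can be fetched using the "listpages" command.
To fetch the content of the page you created in the previous section, enter the following command: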
$ pwb.py listpages -page:"User talk:<username>" -save
1 <username>
Saving User talk:<username> to /home/paws/User_talk_<username>
1 page(s) found
Now, if you run ls, you should find the saved page.
A real script example
When a website used on Wikipedia changes its URLs, the links on Wikipedia become outdated, and possibly also dead if the website does not redirect from the old URLs to the new ones. For example, Encyclopedia Britannica (EB) has changed its links, such as moving pages from http://www.britannica.com/EBchecked/media/ to http://www.britannica.com/topic/[topic name]/images-videos/*. A list of uses of the old URL on English Wikipedia can be found at https://en.wikipedia.org/wiki/Special:LinkSearch/http://www.britannica.com/EBchecked/media. Updating all of these links manually would be very time-consuming. Thankfully, EB has maintained redirects from its old URLs to the new URLs, so this does not need to be fixed immediately.
For a simpler example, English Wikipedia currently contains links to http://britannica.com/EBchecked/ instead of http://www.britannica.com/EBchecked/; i.e. the 'www.' subdomain is missing from the URL.
English Wikipedia currently has 14 cases: https://en.wikipedia.org/wiki/Special:LinkSearch/http://britannica.com/EBchecked/
Wikipedias in other languages also have this problem; for example, there is one case on German Wikipedia: w:de:Spezial:Weblinksuche/http://britannica.com/EBchecked/
To fix those links, we can use the Pywikibot replace.py script. In this demo we will use the '-simulate' argument to avoid writing to the wiki, as there are strict rules about automated editing of English Wikipedia.
First, let's list all of the pages which link to http://britannica.com/EBchecked/.
$ pwb.py listpages -lang:en -weblink:"britannica.com/EBchecked/"
1 Bhatner fort
2 Mohammad Ishaq Khan
3 Fringe theories/Noticeboard/Archive 7
4 El Riego phase
5 Catalonia/Archive 4
6 Stephen I of Hungary
7 Stephen I of Hungary/Archive 1
8 Väinö Tanner
9 Tokaji
10 Transylvania/Archive5
11 Hungarians in Romania
12 Transylvania
13 Uttarakhand
14 Françoise Giroud
14 page(s) found
Now we check that those pages actually contain the literal URL in their wikitext, i.e. that the link is not produced by a template.
$ pwb.py listpages -lang:en -weblink:"britannica.com/EBchecked/" -grep:"britannica.com\/EBchecked"
1 Bhatner fort
2 Mohammad Ishaq Khan
3 Fringe theories/Noticeboard/Archive 7
4 El Riego phase
5 Catalonia/Archive 4
6 Stephen I of Hungary
7 Stephen I of Hungary/Archive 1
8 Väinö Tanner
9 Tokaji
10 Transylvania/Archive5
11 Hungarians in Romania
12 Transylvania
13 Uttarakhand
14 Françoise Giroud
14 page(s) found
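The check that -grep performs can be pictured with Python's re module. This is a rough sketch of the idea, not Pywikibot's implementation: the function name and both sample strings are made up for illustration, and the case-insensitive, dot-matches-newline flags are an assumption based on how the option is documented.

```python
import re

# Same pattern as passed to -grep above; the escaped slash matches a literal "/".
# The flags are an assumption about how the option treats its pattern.
PATTERN = re.compile(r"britannica.com\/EBchecked", re.IGNORECASE | re.DOTALL)


def contains_literal_url(wikitext):
    """Return True if the wikitext itself contains the URL fragment."""
    return PATTERN.search(wikitext) is not None


literal = "<ref>http://britannica.com/EBchecked/topic/565415/Stephen-I</ref>"
templated = "{{Britannica|565415|Stephen I}}"  # hypothetical template call

print(contains_literal_url(literal))    # True
print(contains_literal_url(templated))  # False
```

A page whose link comes only from a template expansion would fail this kind of check, which is why such a page would not be a candidate for a plain text replacement.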
Now use replace to add the missing "www.".
$ pwb.py replace -lang:en -simulate -weblink:"britannica.com/EBchecked/" -grep:"britannica.com\/EBchecked" "http://britannica.com/EBchecked/" "http://www.britannica.com/EBchecked/"
The summary message for the command line replacements will be something like: Bot: Automated text replacement (-http://britannica.com/EBchecked/ +http://www.britannica.com/EBchecked/)
Press Enter to use this automatic message, or enter a description of the
changes your bot will make:
Logging in to wikipedia:en as <username>
Retrieving 14 pages from wikipedia:en.
Retrieving 14 pages from wikipedia:en.
>>> Stephen I of Hungary <<<
@@ -47 +47 @@
- Stephen's birth date is uncertain because it was not recorded in contemporaneous documents.{{sfn|Györffy|1994|p=64}} Hungarian and Polish chronicles written centuries later give three different years: 967, 969 and 975.{{sfn|Kristó|2001|p=15}} The unanimous testimony of his three late 11th-century or early 12th-century [[hagiographies]] and other Hungarian sources, which state that Stephen was "still an adolescent" in 997,<ref>''Hartvic, Life of King Stephen of Hungary'' (ch. 5), p. 381.</ref> substantiate the reliability of the later year (975).{{sfn|Györffy|1994|p=64}}{{sfn|Kristó|2001|p=15}} Stephen's ''[[Life of Saint Stephen, King of Hungary (Vita minor)|Lesser Legend]]'' adds that he was born in [[Esztergom]],{{sfn|Györffy|1994|p=64}}{{sfn|Kristó|2001|p=15}}<ref name=Britannica>{{cite encyclopedia|title=Stephen I|url=http://britannica.com/EBchecked/topic/565415/Stephen-I|encyclopedia=[[Encyclopædia Britannica]]|publisher=Encyclopædia Britannica, Inc.|year=2008|accessdate=2008-07-29}}</ref> which implies that he was born after 972 because his father, [[Géza, Grand Prince of the Hungarians]], chose Esztergom as royal residence around that year.{{sfn|Györffy|1994|p=64}} Géza promoted the spread of Christianity among his subjects by force, but never ceased worshipping pagan gods.{{sfn|Kontler|1999|p=51}}{{sfn|Berend|Laszlovszky|Szakács|2007|p=331}} Both his son's ''[[Life of Saint Stephen, King of Hungary (Vita maior)|Greater Legend]]'' and the nearly contemporaneous [[Thietmar of Merseburg]] described Géza as a cruel monarch, suggesting that he was a despot who mercilessly consolidated his authority over the rebellious Hungarian lords.{{sfn|Berend|Laszlovszky|Szakács|2007|p=331}}{{sfn|Bakay|1999|p=547}}
+ Stephen's birth date is uncertain because it was not recorded in contemporaneous documents.{{sfn|Györffy|1994|p=64}} Hungarian and Polish chronicles written centuries later give three different years: 967, 969 and 975.{{sfn|Kristó|2001|p=15}} The unanimous testimony of his three late 11th-century or early 12th-century [[hagiographies]] and other Hungarian sources, which state that Stephen was "still an adolescent" in 997,<ref>''Hartvic, Life of King Stephen of Hungary'' (ch. 5), p. 381.</ref> substantiate the reliability of the later year (975).{{sfn|Györffy|1994|p=64}}{{sfn|Kristó|2001|p=15}} Stephen's ''[[Life of Saint Stephen, King of Hungary (Vita minor)|Lesser Legend]]'' adds that he was born in [[Esztergom]],{{sfn|Györffy|1994|p=64}}{{sfn|Kristó|2001|p=15}}<ref name=Britannica>{{cite encyclopedia|title=Stephen I|url=http://www.britannica.com/EBchecked/topic/565415/Stephen-I|encyclopedia=[[Encyclopædia Britannica]]|publisher=Encyclopædia Britannica, Inc.|year=2008|accessdate=2008-07-29}}</ref> which implies that he was born after 972 because his father, [[Géza, Grand Prince of the Hungarians]], chose Esztergom as royal residence around that year.{{sfn|Györffy|1994|p=64}} Géza promoted the spread of Christianity among his subjects by force, but never ceased worshipping pagan gods.{{sfn|Kontler|1999|p=51}}{{sfn|Berend|Laszlovszky|Szakács|2007|p=331}} Both his son's ''[[Life of Saint Stephen, King of Hungary (Vita maior)|Greater Legend]]'' and the nearly contemporaneous [[Thietmar of Merseburg]] described Géza as a cruel monarch, suggesting that he was a despot who mercilessly consolidated his authority over the rebellious Hungarian lords.{{sfn|Berend|Laszlovszky|Szakács|2007|p=331}}{{sfn|Bakay|1999|p=547}}
Do you want to accept these changes? ([y]es, [N]o, [e]dit, open in [b]rowser, [a]ll, [q]uit): N
...
In PAWS, and in any terminal that supports color, the diff shows the added "www." in green, making the proposed change easier to spot.
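The substitution requested from replace above is a plain text replacement, and its effect on a single reference can be previewed offline. The sketch below uses only the standard library and is not how replace.py is implemented; the function name add_www and the shortened sample string are illustrative assumptions.

```python
import difflib

OLD = "http://britannica.com/EBchecked/"
NEW = "http://www.britannica.com/EBchecked/"


def add_www(wikitext):
    """Return (new_text, count), where count is the number of URLs rewritten."""
    count = wikitext.count(OLD)
    return wikitext.replace(OLD, NEW), count


# Shortened sample based on the citation shown in the diff above.
before = "{{cite encyclopedia|title=Stephen I|url=http://britannica.com/EBchecked/topic/565415/Stephen-I}}"
after, replaced = add_www(before)

# Print a unified diff of the proposed change; nothing is saved anywhere.
for line in difflib.unified_diff([before], [after], "before", "after", lineterm=""):
    print(line)
print(replaced)
```

Because the old URL is not a substring of the new one, running the function a second time on its own output changes nothing.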
Install Pywikibot
Next we will use a PAWS Python session.
- Go to your PAWS home,
- click 'New' on the right hand side, and
- select 'Python 3'.
This opens a new window.
In the text box, enter the following, then select "Run" from the "Cell" menu (or press Shift+Enter to run).
import pywikibot
A new text box appears below. Run the following command to create an APISite object connected to https://test.wikipedia.org/:
site = pywikibot.Site('test', 'wikipedia')
Inspect "site" by entering it into the new text box and selecting "Run".
site
It should display
Out[3]: APISite("test", "wikipedia")
Create a page object:
page = pywikibot.Page(site, 'test')
Check whether it exists by running:
page.exists()
It should output
VERBOSE:pywiki:Found 1 wikipedia:test processes running, including this one.
Out[5]: True
Display the text of the page:
page.text
Change the page text in the object:
page.text = 'Hello world'
Save the page to the wiki:
page.save()
The response should be:
Page [[Test]] saved
INFO:pywiki:Page [[Test]] saved
The interactive Python 3 notebook allows many lines to be run together. The steps above can be placed in a single text box and run:
import pywikibot
site = pywikibot.Site('test', 'wikipedia')
page = pywikibot.Page(site, 'test')
page.text = 'Hello world!'
page.save()
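When the same cell is run repeatedly, it can be useful to skip the save if nothing would change. The following is a minimal sketch of that idea rather than an official Pywikibot recipe: update_page is a hypothetical helper, and it only assumes that the object passed in has a text attribute and a save() method accepting a summary keyword, as the page object above does.

```python
def update_page(page, new_text, summary):
    """Save new_text to the page only when it differs from the current text.

    `page` is expected to behave like the page object created above: it has
    a `text` attribute and a `save(summary=...)` method.
    Returns True if a save was attempted, False if the text was unchanged.
    """
    if page.text == new_text:
        return False  # nothing to change, so skip the edit
    page.text = new_text
    page.save(summary=summary)
    return True


# In a PAWS notebook this could be called as (not run here):
#   import pywikibot
#   site = pywikibot.Site('test', 'wikipedia')
#   page = pywikibot.Page(site, 'test')
#   update_page(page, 'Hello world!', 'Testing Pywikibot on PAWS')
```

Passing an explicit summary also makes the edit easier to recognise in the page history.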
The log of an interactive Python session can be saved or downloaded for future reference.
Access online documentation in PAWS
Pywikibot documentation may be found at wmdoc:pywikibot. It is primarily sourced from docstrings, which can be loaded in the interactive Python 3 notebook using the Python built-in function help().
For example, to see the parameters of the save method used above, run either of the following:
help(page.save)
or
help(pywikibot.Page.save)
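The full help() output can be long. For a one-line overview, the same docstrings can be read with the standard inspect module. This is only a sketch: doc_summary is a hypothetical helper, and here it is demonstrated on itself so that it runs without a wiki connection.

```python
import inspect


def doc_summary(obj):
    """Return the signature (when available) and first docstring line of obj."""
    try:
        signature = str(inspect.signature(obj))
    except (TypeError, ValueError):
        signature = "(...)"  # some built-ins do not expose a signature
    doc = inspect.getdoc(obj) or ""
    first_line = doc.splitlines()[0] if doc else ""
    return f"{getattr(obj, '__name__', repr(obj))}{signature}: {first_line}"


print(doc_summary(doc_summary))
# In the notebook above, doc_summary(page.save) would summarise the save method.
```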
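Edit Pywikibot scripts
The Pywikibot library and scripts are located in /srv/paws and are read-only. The installed Pywikibot library cannot be modified in PAWS.
A script can be modified once it has been copied to your PAWS home.
For example, to run a modified "checkimages.py":
- In the terminal, enter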
cp /srv/paws/pwb/scripts/checkimages.py ~
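- In a browser, go to your PAWS home and click on the file checkimages.py.
- In the browser, you can edit the file. For example, after the start = time.time() statement on line 1775, add a new line 1776 that prints your name: print("MYNAME's version.")
- In the editing interface, open the "File" menu and click "Save" to save your changes.
- In the terminal, enter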
pwb.py ~/checkimages.py -simulate
(If no '-limit:x' is given, the program runs until all images have been checked, which may take a long time.)
See also
- wikitech:PAWS/PAWS and Pywikibot
- Using Pywikibot with PAWS tutorial - A tutorial that helps users get started with using Pywikibot and PAWS
- Example notebooks using Pywikibot - A list of notebooks hosted on PAWS that use Pywikibot
- PAWS cheat sheet provided by a user (for example, covering API and database access)
- Source code on GitHub
- Small wiki toolkits workshop about running basic Pywikibot scripts
- Self-study materials based on the small wiki toolkits workshop
- Workshop handbook based on the small wiki toolkits workshop
- If you need more help setting up Pywikibot, visit the #pywikibot IRC channel (connect) or the pywikibot@ mailing list.