Manual talk:Pywikibot/Use on third-party wikis

The following discussion has been transferred from Meta-Wiki.
Any user names refer to users of that site, who are not necessarily users of MediaWiki.org (even if they share the same username).

NOTICE: Please use the mailing list, pywikipedia-l, or IRC (#pywikipediabot at irc.freenode.net) for support. This talk page is not read (often) by developers.

Working LDAP config

I fought with this for a while. You need to set self.ldapDomain to your LDAP domain in the family file:

# -*- coding: utf-8  -*-

import family

# The official ABC Wiki.

class Family(family.Family):

    def __init__(self):

        family.Family.__init__(self)

        self.name          = 'WikiNational'
        self.langs         = { 'en':         'WikiNational', }
        self.namespaces[4] = { '_default':  u'WikiNational',       }
        self.namespaces[5] = { '_default':  u'WikiNational Talk',  }
        self.ldapDomain        = 'yourdomain.local'

    def version(self, code):
        return "1.6.1"

    def path(self, code):
        return '/wiki/index.php'

--Mellerbeck 21:43, 21 May 2009 (UTC)
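
A rough sketch of how the family's ldapDomain could end up in the login form, for anyone wondering what the setting does. It rests on the field names wpName, wpPassword and wpDomain mentioned further down this page; ExampleFamily and build_login_form are made-up names for illustration, the real login.py sends more fields than this, and the sketch uses the Python 3 standard library rather than the bot's own code.

```python
from urllib.parse import urlencode


class ExampleFamily(object):
    """Stand-in for a family object; only ldapDomain matters here (hypothetical)."""
    ldapDomain = 'yourdomain.local'


def build_login_form(username, password, fam):
    """Return URL-encoded login fields, adding wpDomain when the family defines ldapDomain."""
    fields = [("wpName", username), ("wpPassword", password)]
    domain = getattr(fam, "ldapDomain", None)
    if domain:
        # Assumption: the LDAP login form expects the domain in a field called wpDomain.
        fields.append(("wpDomain", domain))
    return urlencode(fields)


print(build_login_form("MyNameBot", "secret", ExampleFamily()))
```

If the family object has no ldapDomain attribute, the wpDomain field is simply left out, which is the behaviour a non-LDAP wiki would need.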

Problem

I'm not sure whether this is down to some setting somewhere, but I'm not able to submit the form correctly.

The response I get from response = conn.getresponse() is the entire web page. Is that correct?

whats wrong

It won't work for me, for my project at www.pflegewiki.de.

I created a user-config.py containing:

mylang = 'de'
username = 'ElektroZivi'
family = 'pflegewiki'


and a "pflegewiki_family.py" containing:

# -*- coding: utf-8  -*-

import family

# The meta family

class Family(family.Family):
    name = 'PflegeWiki'
    def __init__(self):
        self._addlang('de',
                        location = 'www.pflegewiki.de',
                        namespaces = { 4: u'PflegeWiki',
                                      5: u'Diskussion' })

If I run login.py from the command line (python login.py), it returns the message: Login failed. Wrong password?

But the password and username are correct (I checked this)!

What could help? -Produnis 15:35, 2 Mar 2005 (UTC)


answer

Modify your "pflegewiki_family.py" like this, adding a path method:

# -*- coding: utf-8  -*-

import family

# The meta family

class Family(family.Family):
    name = 'PflegeWiki'
    def __init__(self):
        self._addlang('de',
                        location = 'www.pflegewiki.de',
                        namespaces = { 4: u'PflegeWiki',
                                      5: u'Diskussion' })

    def path(self, code):
        return '/index.php'

It should work then.

wont work for me neither

The script claims that my password is wrong, but when I use the URL (login and password) from the config in a browser everything works. The family looks like this:

# -*- coding: utf-8  -*-

import family

class Family(family.Family):
    name = 'fridewiki' #Set the family name; this should be the same as in the filename.

    def __init__(self):
        family.Family.__init__(self)
	
        self.langs = {
            'de':'vierzig4.dyndns.org',
        }


    def version(self, code):
        return "1.4.2"  #The MediaWiki version used. Not very important in most cases.

    def path(self, code):
        return '/mediawiki/index.php' #The path of index.php

Did I forget something?

I used to have the same problem and later solved it. It was caused by a wrong path(self, code) setting. Make sure http://yoursitename.com/mediawiki/index.php points to the real index.php file. --Farm 11:01, 14 February 2006 (UTC)

doesn't work

I just downloaded the latest CVS snapshot and it just doesn't work. I've got the user configuration file there and the family file in the subdirectory, and it throws this when I try python login.py:

Checked for running processes. 1 processes currently running, including the current process.
Traceback (most recent call last):
  File "login.py", line 218, in ?
    main()
  File "login.py", line 213, in main
    loginMan = LoginManager(password, sysop = sysop)
  File "login.py", line 79, in __init__
    raise wikipedia.NoUsername(u'ERROR: Username for %s:%s is undefined.\n
If you have an account for that site, please add such a line to user-config.py:\n\nusernames[\'%s\'][\'%s\'] = \'myUsername\'' % 
(self.site.family.name, self.site.lang, self.site.family.name, self.site.lang))
wikipedia.NoUsername: ERROR: Username for None:en is undefined.
If you have an account for that site, please add such a line to user-config.py:

usernames['None']['en'] = 'myUsername'

What's annoying is that my installation was working, but now it throws errors about character sets, so I downloaded the latest files to see if that fixed it.

Probably not everybody else is as stupid as I am, but check that all your cases are consistent - i.e. if you've used lowercase in the filename, use lowercase when you refer to the family. Also, check what case your login uses. Just a tip that would have saved me about an hour... :$
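
The case-consistency tip can be turned into a quick check. The sketch below is not part of pywikipedia; check_family_naming is a hypothetical helper that compares the three places the family name shows up on this page: the family value in user-config.py, the name_family.py filename, and the name set inside the family class. It is written for Python 3 and only compares strings, so it can be run without the bot.

```python
import os


def check_family_naming(config_family, family_file, class_name):
    """Return a list of mismatches between the three spellings of a family name.

    Hypothetical helper for illustration; pywikipedia does not ship it.
    """
    problems = []
    base = os.path.basename(family_file)
    suffix = "_family.py"
    if not base.endswith(suffix):
        problems.append("family file %r does not end in %r" % (base, suffix))
    elif base[:-len(suffix)] != config_family:
        problems.append("file name %r does not match family = %r" % (base, config_family))
    if class_name != config_family:
        # The comparison is case-sensitive on purpose: 'PflegeWiki' != 'pflegewiki'.
        problems.append("name %r in the class does not match family = %r" % (class_name, config_family))
    return problems


for line in check_family_naming("pflegewiki", "families/pflegewiki_family.py", "PflegeWiki"):
    print(line)
```

An empty list means the three spellings agree; it says nothing about whether the rest of the family file is correct.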

Or Me

I created the user-config.py with the 3 lines needed to work with a non-Wikimedia project. I created the family file and tried to log in, and I get:

Please create a file user-config.py, and put in there:

One line saying "mylang='language'"
One line saying "usernames['wikipedia']['language']='yy'"

...filling in your username and the language code of the wiki you want to work

If I take the family line out of the user-config.py file then I get a password error.

This is using snapshot-20051221 from SourceForge.


user-config.py Not Found

I am having the same problem. Please post solution as soon as anyone finds it. I am using snapshot 20060312. ---20:01, 12 June 2006 (UTC)

Answer

I had the same trouble. Finally found the answer: Make sure you put the statement name = 'mywiki' AFTER this statement: family.Family.__init__(self) -- Barrylb 22:45, 17 July 2006 (UTC)
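
A possible explanation for why the order matters, as a minimal sketch. It assumes that family.Family.__init__ assigns self.name itself (the "None:en" in the tracebacks on this page suggests it does); BaseFamily is a stand-in written for this example, not the real base class.

```python
class BaseFamily(object):
    """Stand-in for family.Family; assumed to reset name in __init__."""

    def __init__(self):
        self.name = None  # assumption: the real base class does something similar


class BrokenFamily(BaseFamily):
    name = 'gtrwiki'  # class attribute, hidden by the instance attribute set in __init__

    def __init__(self):
        BaseFamily.__init__(self)


class WorkingFamily(BaseFamily):
    def __init__(self):
        BaseFamily.__init__(self)
        self.name = 'gtrwiki'  # assigned after the base initializer, so it sticks


print(BrokenFamily().name)   # the class-level name is shadowed
print(WorkingFamily().name)  # the instance-level name survives
```

Under that assumption, a name defined at class level is shadowed by the instance attribute the base initializer creates, which would produce exactly the "Username for None:en is undefined" message.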

XML dumps

It seems that for some functions of pywikipedia it is recommended that I have a recent XML dump of my database. How vital is that, and, assuming it's critical, how do I obtain an XML dump of my DB? Thanks. Edit: Never mind, I think Help:Export answered my question. JoeyDay 05:49, 16 September 2005 (UTC) - edited 07:13, 16 September 2005 (UTC)

Pywikipediabot is having trouble retrieving files that start with lower case letters on the Homestar Runner Wiki because we have our $wgCapitalLinks flag set to false. I realize I may be the first to encounter this issue and don't want to seem needy, but I'm curious as to whether developers are aware of this and/or if there are any plans to address it in future versions of the framework. Thanks. JoeyDay 00:23, 18 September 2005 (UTC)

Uploaded images blank or corrupt

I'm having trouble using upload.py to upload images correctly to our MediaWiki 1.4-based site. I've been able to log in to the server with the bot account and run the upload script, and the script reports a correct upload. However, when I look at the images on the site, they have the correct dimensions, but are either: a) blank, or b) blank with a few lines of corrupted pixels at the top of the image.

Here are some of the things I've tried:

  • Uploading with the standard web browser user interface using the bot account (works fine).
  • Tried using my standard administrator user account with the script (same bad result).
  • Tried using other image formats; .png, .gif, .jpg (same result).
  • Tried different image files (same result).
  • Moved image location to the same location as the script on disk (same result).

I don't think it's a permissions problem, since I'm able to upload the files manually using the same account, though I admit I don't know much about MediaWiki permissions. Any ideas?

Pagefromfile.py

I've tried to load a file into my own MediaWiki, but I get the messages 'DBG> page may be locked?' and 'Page <title> already exists, not adding!' However, the page doesn't exist in my MediaWiki. What's wrong? This is the text file:

Start 
'''title'''
==subtitle1==
text
==subtitle2==
End

BB70 11:20, 25 January 2006 (CET)


This only happens with a MediaWiki installed on localhost; when I use the MediaWiki on my website, it works! --81.58.24.250 05:00, 28 April 2006 (UTC)
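
For readers unsure how a file like the one above breaks down into pages, here is one way it can be split. This is an illustration only: parse_pages is a hypothetical function, and the marker defaults (Start, End, and a title in ''' quotes) are taken from the sample file in this thread rather than from the script's source. It does not explain the "already exists" message.

```python
import re


def parse_pages(text, start="Start", end="End", title_open="'''", title_close="'''"):
    """Yield (title, body) pairs for each block found between the start and end markers."""
    page_re = re.compile(re.escape(start) + r"(.*?)" + re.escape(end), re.DOTALL)
    title_re = re.compile(re.escape(title_open) + r"(.*?)" + re.escape(title_close))
    for match in page_re.finditer(text):
        body = match.group(1).strip()
        found = title_re.search(body)
        if found:  # blocks without a quoted title are skipped
            yield found.group(1), body


sample = "Start \n'''title'''\n==subtitle1==\ntext\n==subtitle2==\nEnd\n"
for title, body in parse_pages(sample):
    print(title)
    print(body)
```

One caveat of plain-word markers such as Start and End is that the same words can occur inside the page text, which would cut a page short in this sketch.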

Transferring pages from Wikipedia to my MediaWiki project

I need to copy a set of it.wiki pages to my custom MediaWiki project.

Is there a bot that could do this job, or which bot should I adapt?

Any hint is appreciated!!!

196.3.84.214 19:38, 2 April 2006 (UTC)

Compatibility between Mediawiki 1.6.1 and Pywikipedia "snapshot-20060312" ?

Hello,

I'm having problems logging in with login.py on my non-Wikimedia project, so I wonder if there are any incompatibilities between MediaWiki 1.6.1 and Pywikipedia "snapshot-20060312"?

I think so because of the different POST URLs generated by the normal login method and by login.py.

In /var/log/apache2/access.log :

If I try to log in using the login.py script, it fails:
127.0.0.1 - - [03/May/2006:10:50:08 +0200] "POST /index.php?title=Special:Userlogin&action=submit HTTP/1.1" 200 13696 "-" "RobHooftWikiRobot/1.0"

If I try to log in using my browser, it succeeds:
172.21.161.66 - - [03/May/2006:10:51:26 +0200] "POST /index.php?title=Special:Userlogin&action=submitlogin&type=login&returnto=Main_Page HTTP/1.1" 200 2025 "http://victoria/index.php?title=Special:Userlogin&returnto=Main_Page" "Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.8.0.3) Gecko/20060426 Firefox/1.5.0.3"

If it fails:    URL = Special:Userlogin&action=submit
If it succeeds: URL = Special:Userlogin&action=submitlogin&type=login
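
To make the comparison between the two access-log entries less error-prone, the request targets can be parsed and compared mechanically. This is a small sketch using only the Python 3 standard library; query_params is a name chosen for this example, and the two strings are copied from the log lines above.

```python
from urllib.parse import parse_qs, urlsplit


def query_params(request_target):
    """Return the query parameters of a request target as a plain dict."""
    query = urlsplit(request_target).query
    return {key: values[0] for key, values in parse_qs(query).items()}


bot = query_params("/index.php?title=Special:Userlogin&action=submit")
browser = query_params(
    "/index.php?title=Special:Userlogin&action=submitlogin&type=login&returnto=Main_Page"
)

# Collect every parameter whose value differs between the two requests.
differing = [key for key in sorted(set(bot) | set(browser)) if bot.get(key) != browser.get(key)]
for key in differing:
    print(key, "bot:", bot.get(key), "browser:", browser.get(key))
```

The output lists action, returnto and type as the differing parameters; whether the action value alone explains the failed login is not something the logs can prove.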


Here is the output when I try to log in :

srvjavalive:/var/www/victoria/bin/pywikipedia# ./login.py
Checked for running processes. 1 processes currently running, including the current process.
Password for user releasebot on victoria:en:
Logging in to victoria:en as releasebot
Login failed. Wrong password?

Note that the wiki is physically located on the same server where I installed pywikipedia.

Following the doc m:Pywikipedia bot on non-wikimedia projects, this is my config :

srvjavalive:/var/www/victoria/bin/pywikipedia# cat user-config.py
family                 =  'victoria'
mylang                 =  'en'
usernames['victoria']['en'] =  'releasebot'


srvjavalive:/var/www/victoria/bin/pywikipedia# cat families/victoria_family.py
# -*- coding: utf-8  -*-

import family

# The official Victoria Wiki.

class Family(family.Family):

    def __init__(self):

        family.Family.__init__(self)

        self.name          = 'victoria'
        self.langs         = { 'en': 'victoria', }
        self.namespaces[4] = { '_default': u'Victoria',       }
        self.namespaces[5] = { '_default': u'Victoria talk',  }

    def version(self, code):
        return "1.6.1"

    def path(self, code):
        return '/index.php'


Should I use the CVS version to solve the problem? --Effco 10:41, 3 May 2006 (UTC)

Uncyclopedia sv

The Swedish version of Uncyclopedia is called Psyklopedin, not Psykelopedia. Won't you guys ever learn? – Smiddle (85.30.155.6) 16:46, 2 October 2006 (UTC)

Solution to a problem connecting to a wiki more than one directory under the URL domain

When trying to connect pywikipedia bot to a wiki located at www.example.com/folder/folder/wiki/index.php, the failure displayed below occurred. Basically, it seemed the URL in the family file was not recognised.

In the family file I had specified:

self.langs         = { 'en':         'www.example.com/folder/folder/wiki/', }

The solution was to specify only the domain in self.langs, and the subfolders in the return value of "path":

   def path(self, code):
       return '/folder/folder/wiki/index.php'

In full:

# -*- coding: utf-8  -*- 
import family
# The Wikispectus Concept Test wiki
class Family(family.Family):
    def __init__(self):
        family.Family.__init__(self)
        self.name          = 'wikispectuspoc'
        self.langs         = { 'en':         'www.example.com', }
        self.namespaces[4] = { '_default':  u'Smw',       }
        self.namespaces[5] = { '_default':  u'Smw talk',  }
    def version(self, code):
        return "1.7.1"
    def path(self, code):
        return '/folder/folder/wiki/index.php'
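
If you are unsure how to divide your wiki's address between self.langs and path, the split can be done mechanically. The sketch below uses the Python 3 standard library; split_wiki_url is a hypothetical helper, not a pywikipedia function.

```python
from urllib.parse import urlsplit


def split_wiki_url(index_url):
    """Split a full index.php URL into (hostname, path) for use in a family file."""
    parts = urlsplit(index_url)
    if not parts.netloc:
        raise ValueError("expected a full URL including http:// or https://")
    return parts.netloc, parts.path


host, path = split_wiki_url("http://www.example.com/folder/folder/wiki/index.php")
print(host)  # goes into self.langs
print(path)  # goes into the return value of path()
```

The first value contains no slashes, which is what avoids the name-resolution failure shown below; the second is what path should return.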


The original failure:

[rich@wuf bot2]$ python login.py
Checked for running processes. 1 processes currently running, including the current process.
Password for user RickBot on wikispectuspoc:en:
Logging in to wikispectuspoc:en as RickBot
Traceback (most recent call last):
 File "login.py", line 220, in ?
   main()
 File "login.py", line 216, in main
   loginMan.login()
 File "login.py", line 169, in login
   cookiedata = self.getCookie()
 File "login.py", line 120, in getCookie
   conn.request("POST", pagename, data, headers)
 File "/usr/local/lib/python2.4/httplib.py", line 804, in request
   self._send_request(method, url, body, headers)
 File "/usr/local/lib/python2.4/httplib.py", line 827, in _send_request
   self.endheaders()
 File "/usr/local/lib/python2.4/httplib.py", line 798, in endheaders
   self._send_output()
 File "/usr/local/lib/python2.4/httplib.py", line 679, in _send_output
   self.send(msg)
 File "/usr/local/lib/python2.4/httplib.py", line 646, in send
   self.connect()
 File "/usr/local/lib/python2.4/httplib.py", line 614, in connect
   socket.SOCK_STREAM):
socket.gaierror: (4, 'Non-recoverable failure in name resolution')
[rich@wuf bot2]$ 

Thanks to bin in #pywikibot for the answer! --mw:User:Rick 01:07, 8 October 2006 (UTC)

Not working for me 1.7.1

user-config.py

family                 =  'gtrwiki'
mylang                 =  'en'
usernames['gtrwiki']['en'] =  'Toykilla-bot'

gtrwiki_family.py

# -*- coding: utf-8  -*-

import family

class Family(family.Family):


    def __init__(self):
        family.Family.__init__(self)
        
        self.langs = {
            'en':'www.gtr-tech.com',
        }

    name = 'gtrwiki' #Set the family name; this should be the same as in the filename.

    def version(self, code):
        return "1.7.1"  #The MediaWiki version used. Not very important in most cases.

    def path(self, code):
        return '/w/index.php' #The path of index.php

After running login.py I get this output:

Checked for running processes. 1 processes currently running, including the current process.
Traceback (most recent call last):
  File "login.py", line 220, in ?
    main()
  File "login.py", line 215, in main
    loginMan = LoginManager(password, sysop = sysop)
  File "login.py", line 79, in __init__
    raise wikipedia.NoUsername(u'ERROR: Username for %s:%s is undefined.\nIf you have an account for that site, please add such a line to user-config.py:\n\nusernames[\'%s\'][\'%s\'] = \'myUsername\'' % (self.site.family.name, self.site.lang, self.site.family.name, self.site.lang))
wikipedia.NoUsername: ERROR: Username for None:en is undefined.
If you have an account for that site, please add such a line to user-config.py:

usernames['None']['en'] = 'myUsername'

Move to talk page

Note: some of the examples on this page have been shown not to work with recent pywikipediabot releases. Please see Talk:Pywikipedia_bot_on_non-Wikimedia_projects for further discussion. -- Tyagi 11:32, 11 April 2006 (UTC).

There is no mention of this on the talk page. Could someone in the know fix (or remove) the broken examples to avoid clogging up the mailing list? 217.117.47.110 19:13, 12 June 2006 (UTC)

To the developer(s): please have the URL query and POST information printed on a failure, and/or provide more information to help diagnose what may be misconfigured. If the bot never connects in its attempt to enter the password, it would be nice to know. -- Adam Katz 23:09, 31 May 2005 (UTC)


Example

This is an example of a user-config.py and a families/ file that have been shown to work with newer releases of pywikipediabot.

user-config.py

family                 =  'abc'
mylang                 =  'en'
usernames['abc']['en'] =  'MyNameBot'

families/abc_family.py

# -*- coding: utf-8  -*-

import family

# The official ABC Wiki.

class Family(family.Family):

    def __init__(self):

        family.Family.__init__(self)

        self.name          = 'abc'
        self.langs         = { 'en':         'www.abc.com', }
        self.namespaces[4] = { '_default':  u'ABC',       }
        self.namespaces[5] = { '_default':  u'ABC talk',  }

    def version(self, code):
        return "1.6.1"

    def path(self, code):
        return '/wiki/index.php'


This example is confusing, I am not even sure what wiki this is about... Odessaukrain 22:03, 7 January 2007 (UTC)

Working with LDAP authentication?

Does anyone know if there's a way to get this bot working with LDAP Authentication? I had no problems running it before I went to LDAP, but now it doesn't seem to work, any ideas?

HotMonkeyAC 20:28, 13 June 2007 (UTC)

In login.py, there's a block of code that sets wpName, wpPassword, etc. To get LDAP working, add this to it:

"wpDomain" : "MYDOMAIN",

This doesn't work

I did this, replacing MYDOMAIN as above with my LDAP domain name, and no change in behavior - it still fails, complaining of either a bad password or bad CAPTCHA data.


I got this to work with LDAP by putting self.domain = 'domain here' in wiki_family.py, where domain here is the LDAP domain. It did not work with self.name = 'domain here' as it says it should in the documentation README-family.txt -Funkmaster 801 18:45, 9 September 2008 (UTC)

This also does not work - when I added the self.domain = 'domain here' line, (substituting my actual domain, of course) it simply generates a syntax error. Can you post a family file that works so I can see what I might be doing wrong?

--Gene Turnbow 20:47, 10 September 2008 (UTC)

In Login.py replace line 121 which is {"wpDomain": self.site.family.ldapDomain,} with {"wpDomain": 'domain here',}

--AdrianArcher 14:34, 12 September 2008 (UTC)

With the embedded colon?? Not sure how that's supposed to map to a MediaWiki variable that way, but I'll try it.

Okay - this doesn't work, just tested it. I didn't expect it to, with the colon embedded in the key name like that. Removing the colon and trying again - okay, once I checked my path override in my family file, it finally worked. Now I'll test it with the "plain vanilla" login.py file from the latest version and see if that works - it may have been my family file the entire time.

My conclusion is that a lot of people are confusing the rendered path that MediaWiki generates with the raw URL you need to reach the login page itself. You have to get there without the use of the MediaWiki-massaged path, which will not succeed.

Also, the new login.py does not work. Hard-coding the LDAP domain as AdrianArcher identified, however, did work.

--Gene Turnbow 17:06, 12 September 2008 (UTC)

You're right, no colon, I've fixed it now. --AdrianArcher 15:34, 16 September 2008 (UTC)

Errors

C:\pywikipedia>login.py
Checked for running processes. 1 processes currently running, including the current process.
Traceback (most recent call last):
  File "C:\pywikipedia\login.py", line 277, in <module>
    main()
  File "C:\pywikipedia\login.py", line 272, in main
    loginMan = LoginManager(password, sysop = sysop)
  File "C:\pywikipedia\login.py", line 97, in __init__
    raise wikipedia.NoUsername(u'ERROR: Username for %s:%s is undefined.\nIf you have an account for that site, please add such a line to user-config.py:\n\nusernames[\'%s\'][\'%s\'] = \'myUsername\'' % (self.site.family.name, self.site.lang, self.site.family.name, self.site.lang))
wikipedia.NoUsername: ERROR: Username for wikipedia:en is undefined.
If you have an account for that site, please add such a line to user-config.py:

usernames['wikipedia']['en'] = 'myUsername'

The problem was caused because I had not added the "family" line to user-config.py.

C:\pywikipedia>login.py
Traceback (most recent call last):
  File "C:\pywikipedia\login.py", line 49, in <module>
    import wikipedia, config
  File "C:\pywikipedia\wikipedia.py", line 4226, in <module>
    getSite()
  File "C:\pywikipedia\wikipedia.py", line 4134, in getSite
    _sites[key] = Site(code=code, fam=fam, user=user)
  File "C:\pywikipedia\wikipedia.py", line 3084, in __init__
    self.family = Family(fam, fatal = False)
  File "C:\pywikipedia\wikipedia.py", line 3062, in Family
    exec "import %s_family as myfamily" % fam
  File "<string>", line 1, in <module>
  File ".\families\odessa_family.py", line 3
    import family
    ^
IndentationError: unexpected indent

Fixed by following Pywikipedia_bot_on_non-Wikimedia_projects#Example:_Mozilla_wiki, so that the two lines:

import family (line 3)

and

class Family(family.Family):

have no space before them.

Odessaukrain 21:13, 24 May 2008 (UTC)
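
An indentation slip like this can be caught before starting the bot by compiling the family file on its own. The sketch below uses Python's built-in compile(); find_syntax_error is a name made up for this example. Since the syntax rules depend on the interpreter, run the check with the same Python version the bot uses.

```python
def find_syntax_error(source, filename="<family file>"):
    """Return a short description of the first syntax error in source, or None."""
    try:
        compile(source, filename, "exec")
    except SyntaxError as err:  # IndentationError is a subclass of SyntaxError
        return "%s: %s (line %s)" % (type(err).__name__, err.msg, err.lineno)
    return None


broken = "\n\n import family\n"  # stray space before the import, as in the traceback
fixed = "\n\nimport family\n"

print(find_syntax_error(broken))
print(find_syntax_error(fixed))
```

Compiling only checks the syntax; it does not import the family module or contact the wiki.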

Not Working

It is not working for my project at: http://mercs.wikia.com/Mercs Wiki

My user-config.py:

mylang = 'en'
username = 'patxBot'
family = 'mercs'

My family file:


# -*- coding: utf-8  -*-

import family

# The mercs wiki

class Family(family.Family):
    name = 'mercswiki'
    def __init__(self):
        self._addlang('en',
                        location = 'http://mercs.wikia.com/wiki/Mercs_Wiki',
                        namespaces = { 4: u'mercsWiki',
                                      5: u'talk' })

    def path(self, code):
        return '/index.php'

Then, when I run the script from the command prompt, this happens:


C:\Documents and Settings\Heeman9@bellsouth.ne>cd C:\\pywikipedia

C:\pywikipedia>welcome.py
WARNING: Configuration variable 'username' is defined but unknown. Misspelled?
Traceback (most recent call last):
  File "C:\pywikipedia\welcome.py", line 173, in <module>
    import wikipedia, config, string, locale
  File "C:\pywikipedia\wikipedia.py", line 5951, in <module>
    getSite()
  File "C:\pywikipedia\wikipedia.py", line 5841, in getSite
    persistent_http=persistent_http)
  File "C:\pywikipedia\wikipedia.py", line 4069, in __init__
    self.family = Family(fam, fatal = False)
  File "C:\pywikipedia\wikipedia.py", line 3893, in Family
    family = myfamily.Family()
  File "C:\pywikipedia\families\mercs_family.py", line 13, in __init__
    5: u'Mercs Wiki Talk' })
  File "C:\pywikipedia\family.py", line 2674, in _addlang
    self.langs[code] = location
AttributeError: Family instance has no attribute 'langs'

C:\pywikipedia>

What should I do?

-- PATX 13:07, 9 August 2008 (UTC)

The name="blah" of the family file must match the family="blah" in the user-config.py file. Also, this same name has to be applied to the name of the family file, so it is "blah_family.py" (mercs, presumably, in this case).
Further, it looks like your indentation is kind of messed up (hard to be sure once you've copied and pasted it into the webpage). name= should be self.name=, and it should be after def __init__. Also, I don't know what _addlang() is; it might be a valid function, but it's not mentioned in the README file for families (which I recommend you read). Same thing with path(): it's not in the readme, scriptpath() is. It doesn't even look like the Mercs Wiki uses a script path...
I think, though I am not certain, that you would use this:
# -*- coding: utf-8  -*-

import family

# The mercs wiki

class Family(family.Family):
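    def __init__(self):
        # Call the base initializer first; the traceback above ("no attribute
        # 'langs'") is what happens when it is skipped.
        family.Family.__init__(self)
        self.name = 'mercs'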
        self.langs = {
            'en': 'mercs.wikia.com',
        }

        self.namespaces[4]['en'] = u'Mercs Wiki'
        self.namespaces[5]['en'] = u'Mercs Wiki talk'

        self.namespaces[110] = { 'en': u'Forum' }
        self.namespaces[111] = { 'en': u'Forum talk' }

    def scriptpath(self, code):
        return ''
I made this using Mercs Wiki's API, and the README-family.txt file in the Family folder. I'm somewhat new at this, but I think that will work.
DragoonWraith 22:24, 2 November 2008 (UTC)

Custom Namespaces

I added a bunch of information that I could not find anywhere when I was trying to set up my installation of PyWikipedia to work on a Wiki with custom namespaces (much thanks to the people on #pywikipediabot for helping me out with this in the first place). It's a bit wordy and some of it probably ought to go elsewhere, but I didn't know where and I figured it'd be easier for people to move what I wrote to someplace appropriate. If you have any questions/comments on the text, let me know. --DragoonWraith 22:09, 2 November 2008 (UTC)

I don't understand anything

I've got the hang of 1.1 and 1.2, but after that I don't understand anything. Are there other files to modify in order to use a bot? Thanks.

Re: Custom User Groups & Permissions

As a suggestion, you can try to add:

sysopnames['wikipedia']['en'] = 'adminname'

to user-config.py, where ['wikipedia'] is the name of the family you are working on, and 'adminname' is the username (as seen on Pywikipediabot/delete.py).

I've tried this and it works; it solved my problems with "redirect.py broken".

I don't have an account on this wiki, but if this is OK please add it to the main page.


Login failed. Wrong password or CAPTCHA answer?

I have not used the bot in about 18 months. I returned to it yesterday, and having got as far as the login.py prompt everything seemed good. It asks me for the password for the relevant user. I supply it. It tells me it is logging in to the wiki as the relevant user name. It then says "Opening CAPTCHA in your web browser...". It asks me for the solution. I supply it. It returns the message: "Login failed. Wrong password or CAPTCHA answer?". I cannot remember having this problem 18 months ago, and nothing has changed in my set-up files. I went for the obvious, i.e. wrong passwords. I checked five times and ensured caps lock was off; I changed my password; I then shifted to an alternate user name... every time the same message. Can anyone help? Is there something obvious I am missing? 88.108.210.61 09:56, 5 May 2010 (UTC)

Summary: Download http://svn.wikimedia.org/viewvc/pywikipedia?revision=8071&view=revision
My bot has had the same problem since 4 April. It turns out there was a modification, w:en:Wikipedia:Bot owners' noticeboard/Archive 5#Bots and Logging In - breaking change, in effect since 7 April. Related links: [1], mailarchive:mediawiki-announce/2010-April/000090.html, bugzilla:23076. Hope it helps. Cheers. Bennylin 15:09, 31 May 2010 (UTC)
  • I have the same problem with my bot. I downloaded up-to-date versions, including login.py, and I am still getting "Login failed. Wrong password or CAPTCHA answer?". I would be very grateful for any help. --Ghaly 21:59, 2 June 2010 (UTC)

No JSON object could be decoded

I'm getting this error while trying to run login.py on a 1.17alpha wiki:

Error downloading data: No JSON object could be decoded
Request en:/wiki/api.php?
Retrying in 1 minutes...

I've checked the user, password and scriptpath and everything is all right. The strange thing is that in Apache's access.log I've seen this:

"POST /wiki/api.php HTTP/1.1" 200 3930 "-" "PythonWikipediaBot/1.0"

So it is trying to connect, but it doesn't work.

Does anyone know what is happening? What else can I check?

Same error; see [[#Error downloading data: No JSON object could be decoded [SOLVED]]] directly below. Adamtheclown 16:11, 11 December 2010 (UTC)

Error downloading data: No JSON object could be decoded [SOLVED]

Main possible causes:

  1. byte order mark (BOM) issue
  2. an HTTP 301 redirect from the domain to another domain, in which case updating your family file could be enough
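
For the first cause, here is a short sketch of why a response that looks like valid JSON can still be rejected, and how a leading UTF-8 BOM can be detected. It uses the Python 3 standard library, not the JSON code inside the bot, so it illustrates the principle rather than the bot's exact behaviour; decode_api_response is a name made up for this example, and the sample body is the one from the dumpfile quoted in this section.

```python
import codecs
import json


def decode_api_response(raw):
    """Decode API response bytes, tolerating a leading UTF-8 BOM.

    Returns (had_bom, parsed_json).
    """
    had_bom = raw.startswith(codecs.BOM_UTF8)
    text = raw.decode("utf-8-sig")  # drops one leading BOM if present
    return had_bom, json.loads(text)


body = b'{"login":{"result":"NeedToken","token":"f8ff543dea8a19c853b64d714eb580e8"}}'
had_bom, data = decode_api_response(codecs.BOM_UTF8 + body)
print(had_bom, data["login"]["result"])

# Without BOM handling, the same bytes are rejected by the parser.
try:
    json.loads((codecs.BOM_UTF8 + body).decode("utf-8"))
except ValueError as err:
    print("rejected:", err)
```

The BOM is invisible in most editors and terminals, which is why the dumped response can look correct while the parser still refuses it.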

Introduction

Message returned on the command line from python login.py: Error downloading data: No JSON object could be decoded Request en:/wikiscriptpath/api.php?

From the dumpfile:

Error reported: No JSON object could be decoded
127.0.0.1
/wikiscriptpath/api.php?
{"login":{"result":"NeedToken","token":"f8ff543dea8a19c853b64d714eb580e8"}}

I see this issue from time to time and it is EXTREMELY EXTREMELY EXTREMELY frustrating. It happened to me in March, then I downloaded a new pywikipedia snapshot, and it magically worked.

I am running MediaWiki 1.15.3 on Mac OS 10.6 with Python 2.6.1, and pywikipediabot works. I am using short URLs, but via httpd.conf aliases. I have not "blacklisted" api.php on that setup.

When I use an IDENTICAL setup on Ubuntu 9.04, but with Python 2.6.2, I get this ... error (sorry, I am straining to hold back the profanity).

My access logs:

127.0.0.1 - - [29/May/2010:08:09:34 -0400] "POST /correctwikiscriptpath/api.php HTTP/1.1" 200 95 "-" "PythonWikipediaBot/1.0"

The annoying bit: It was working just 3 or 4 days ago before I upgraded my Ubuntu python from 2.6.1 to 2.6.2. I have made NO other changes to my system.


Any suggestions? --AttemptedUser 12:18, 29 May 2010 (UTC)

Still having this MAJOR problem

Really? No response after a month? I can't be the only person having this problem, can I? Do any of the developers actually look at this page? Anyone? Bueller, Bueller?

--AttemptedUser 09:03, 22 June 2010 (UTC)
Same problem, identically. Been waiting for somebody to look at this for months now, and I'm at a complete standstill. --13:20, 22 June 2010 (UTC)
These talk pages are not very well overseen. I think we should maybe merge all of them into one. Adamtheclown 19:11, 17 November 2010 (UTC)

Solution!! (Well, almost; more of a workaround)

Hi, this is AttemptedUser again, now not so frustrated and posting under my usual handle.

The problem was that the DynamicPageList extension had BOMs (byte order marks) at the beginning of its interface file. Because this is a "require_once" extension, it seems that the BOM was getting inserted into the headers, and Ubuntu's version of PHP or Apache (not sure which) does not sanitize those, whereas the Mac (and seemingly everyone else's installation) DOES sanitize the BOMs before parsing. I am not sure why BeautifulSoup.py doesn't catch this, but for whatever reason it doesn't. Unless you're using UTF-16 files, you really shouldn't have a BOM anyway...

To check if you have any stray BOMs lying around, MediaWiki actually includes a handy script in the t/maint directory called "bom.t". If you're curious, go to your main MediaWiki directory, then run "perl t/maint/bom.t", and it will tell you which files are problematic.

If you just want to blast away and fix the problem, a combination of two handy scripts took care of the problem for me. Put one or both in an executable path, but be sure to modify the shell script to refer to the absolute path to the Perl script:

This one I call "RecursiveBOMdefuse.sh"

#!/bin/sh
# Run BOMdefuse.plx on every regular file below the given directory.
if [ "$1" = "" ] ; then
 echo "Usage: $0 directory"
 exit 1
fi
# Get list of files in the directory
find "$1" -type f |
while IFS= read -r Name ; do
 # Strip a leading BOM from the file in place
 /absolute/path/to/BOMdefuse.plx "$Name"
 # alternatively, you could probably use: perl /absolute/path/to/BOMdefuse.plx "$Name"
done

The next, I call BOMdefuse.plx, which is a perl script I found at W3C's website - I'm really not sure why they haven't made this operate recursively, but the shell takes care of that. If I had the time, I'd fix the Perl script to handle everything, but I'm just so happy about getting the bot working again that I'm going back to work on editing/cleaning up content.

#!/usr/bin/perl
# program to remove a leading UTF-8 BOM from a file
# works both STDIN -> STDOUT and on the spot (with filename as argument)
# adapted from http://people.w3.org/rishida/blog/?p=102
# 

if ($#ARGV > 0) {
   print STDERR "Too many arguments!\n";
   exit;
   }

my @file;   # file content
my $lineno = 0;
 
my $filename = $ARGV[0];
if ($filename) {
   open( BOMFILE, $filename ) || die "Could not open source file for reading.";
   while (<BOMFILE>) {
       if ($lineno++ == 0) {
           if ( index( $_, "\xEF\xBB\xBF" ) == 0 ) {
               s/^\xEF\xBB\xBF//;
               print "BOM found and removed from $filename.\n";
               }
           else { print "No BOM found in $filename.\n"; }
           }
       push @file, $_ ;
       }
   close (BOMFILE)  || die "Can't close source file after reading.";
   open (NOBOMFILE, ">$filename") || die "Could not open source file for writing.";
   foreach $line (@file) {
       print NOBOMFILE $line;
       }
   close (NOBOMFILE)  || die "Can't close source file after writing.";
   }
 else {  # STDIN -> STDOUT
   while (<>) {
   if (!$lineno++) {
       s/^\xEF\xBB\xBF//;
       }
   push @file, $_ ;
   }

   foreach $line (@file) {
       print $line;
       }
   }

Run a chmod +x on both of these.

Then go to your main MediaWiki directory and run

RecursiveBOMdefuse.sh .

It may take a minute or two, but it works!

Note: If you use symlinks anywhere in your installation, the script above does not seem to follow them, so you have to run the script from the actual directory. Although slightly annoying, this is probably a good thing, as a bad set of symlinks could send this script off to run through your entire drive (or if you're on a system with NFS mounts, the whole network/cluster!!!). Incidentally, the scripts found more BOMs than the

I hope this helps others. Ubuntu and Pywikipediabot folks, please take a look at your PHP/Apache and BeautifulSoup.py: stray BOMs should not be getting through. (Of course, extension authors should sanitize their extensions first, but talk about herding cats.) --Fungiblename 06:20, 6 August 2010 (UTC)

Is there any easier workaround for non-shell users? I have the same problem, please advise... 83.168.82.227
Talk about an extremely complex solution.
Here are some Google hits on the same issue: [2] Adamtheclown 03:37, 12 December 2010 (UTC)
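For non-shell users, the same clean-up can probably be done with a few lines of Python, which has to be installed anyway to run the bot. The following is only a sketch: strip_bom and strip_bom_recursive are made-up helper names, the restriction to .php files is an assumption, and the files are rewritten in place, so try it on a copy of the wiki directory first.

```python
import codecs
import os


def strip_bom(path):
    """Remove a leading UTF-8 BOM from one file; return True if one was removed."""
    with open(path, "rb") as handle:
        data = handle.read()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    with open(path, "wb") as handle:
        handle.write(data[len(codecs.BOM_UTF8):])
    return True


def strip_bom_recursive(root, suffix=".php"):
    """Strip BOMs from every file under root whose name ends with suffix."""
    cleaned = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffix):
                path = os.path.join(dirpath, name)
                if strip_bom(path):
                    cleaned.append(path)
    return cleaned
```

strip_bom_recursive returns the list of files it changed, so you can see which extension was responsible.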

Solution

Older version

# -*- coding: utf-8  -*-

import family

# The dead Wiki.
class Family(family.Family):

    def __init__(self):

        family.Family.__init__(self)
        self.name = 'dead' #Set the family name; this should be the same as in the filename.

        self.langs = {
            'en': 'www.deadrisingwiki.com', #Put the hostname here.
        }

    def version(self, code):
        return "1.15.5"  #The MediaWiki version used. Not very important in most cases.

    def scriptpath(self, code):
        return '' #The value of {{SCRIPTPATH}} on this wiki (empty when the wiki sits at the web root)

    def apipath(self, code):
        return '/api.php' #The path of api.php

Newer version

# -*- coding: utf-8  -*-

import family

# The dead Wiki.
class Family(family.Family):

    def __init__(self):

        family.Family.__init__(self)
        self.name = 'dead' #Set the family name; this should be the same as in the filename.

        self.langs = {
            'en': None,
        }

    def version(self, code):
        return "1.15.5"  #The MediaWiki version used. Not very important in most cases.

    def apipath(self, code):
        return '/api.php' #The path of api.php

    def hostname(self, code):
        return 'deadrisingwiki.com'

This may help also:

Most likely, your text editor added a byte order mark (BOM) while you edited MediaWiki's PHP files, but any other content before the opening <?php causes the same problem. This usually happens with LocalSettings.php - but see error message for exact file. Note that BOMs are invisible in most text editors. To remove the BOM, edit the file with something better than Windows Notepad, but if you don't really have time - open the file with it and choose Save as..., then choose "Unicode (UTF-8 Without signature) - Codepage 65001" as file type.[3] Suggested at: [4]

Adamtheclown 04:09, 12 December 2010 (UTC)

Another possible cause

In my case the "No JSON object could be decoded" error was caused by a 301 redirect that I had added. After I updated my family file to use the new URL directly, pywikipediabot worked again as expected! Guaka (talk) 15:17, 7 February 2013 (UTC)
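The causes reported in this thread (a stray BOM, a redirect, a path that does not lead to api.php) all end the same way: the bot receives something that is not plain JSON. A small helper along these lines, fed with the status code and raw body that your api.php returns, may help narrow it down. The function name and the wording of the messages are my own; nothing here is part of the bot framework.

```python
import codecs
import json


def explain_api_response(status, body):
    """Guess why an api.php response (HTTP status, raw bytes) is not usable JSON."""
    if status in (301, 302, 303, 307, 308):
        return "redirect: point the family file at the new URL"
    if body.startswith(codecs.BOM_UTF8):
        return "BOM: some PHP file emits a byte order mark before the JSON"
    text = body.decode("utf-8", "replace").lstrip()
    if text.startswith("<"):
        return "HTML: the path probably does not lead to api.php"
    try:
        json.loads(text)
    except ValueError as error:
        return "invalid JSON: %s" % error
    return "ok"
```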

pywikibot.exceptions.PageNotFound

Attempting to extract images using imageharvest.py from the (imaginary) Wikia page:

http://dead.wikia.com/wiki/Tape

fails with pywikibot.exceptions.PageNotFound.

Here is what I entered:

Cmd screen
C:\Users\t\Desktop\pywikipedia>imageharvest.py http://dead.wikia.com/wiki/Tape
unicode test: triggers problem #3081100
What text should be added at the end of the description of each image from this
url?
Include image http://dead.wikia.com/wiki/http://images.wikia.com/dead/images/b/bc/wiki.png?
 ([y]es, [N]o, [s]top) y
Give the description of this image:
Specify a category (or press enter to end adding categories)
Traceback (most recent call last):
  File "C:\Users\t\Desktop\pywikipedia\imageharvest.py", line 135, in <module>
    main(url, image_url, desc)
  File "C:\Users\t\Desktop\pywikipedia\imageharvest.py", line 110, in main
    uploadBot = upload.UploadRobot(image, description = desc)
  File "C:\Users\t\Desktop\pywikipedia\upload.py", line 98, in __init__
    self.targetSite.forceLogin()
  File "C:\Users\t\Desktop\pywikipedia\wikipedia.py", line 4861, in forceLogin
    if not self.loggedInAs(sysop = sysop):
  File "C:\Users\t\Desktop\pywikipedia\wikipedia.py", line 4853, in loggedInAs
    self._load(sysop = sysop)
  File "C:\Users\t\Desktop\pywikipedia\wikipedia.py", line 5932, in _load
    text = self.getUrl(url, sysop = sysop)
  File "C:\Users\t\Desktop\pywikipedia\wikipedia.py", line 5365, in getUrl
    % url)
pywikibot.exceptions.PageNotFound: Page http://www.dead-wiki.com/w/index.
php/index.php?title=Non-existing_page&action=edit&useskin=monobook could not be
retrieved. Check your family file.

C:\Users\t\Desktop\pywikipedia>


It appears to be requesting index.php twice:

http://www.dead-wiki.com/w/index.php/index.php?title=Non-existing_page&action=edit&useskin=monobook

I made the changes below to fix this, but ran into a new problem. Adamtheclown 20:55, 17 November 2010 (UTC)
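The doubled index.php is consistent with the framework appending the script name to whatever scriptpath() returns, at least in the version that produced this traceback; the commented-out default for apipath() quoted further down this page, '%s/api.php' % self.scriptpath(code), follows the same pattern. The sketch below only imitates that string formatting (build_paths is a hypothetical stand-in, not the real family.py code) to show why scriptpath would normally name the directory and not the script.

```python
def build_paths(scriptpath):
    """Imitate how index.php and api.php paths are derived from scriptpath."""
    return {
        "index": "%s/index.php" % scriptpath,
        "api": "%s/api.php" % scriptpath,
    }


# scriptpath that already names the script: index.php shows up twice
print(build_paths("/w/index.php")["index"])  # /w/index.php/index.php
# scriptpath that names only the directory
print(build_paths("/w")["index"])            # /w/index.php
```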

500 error message and problem #3081100

MediaWiki 1.16.0

LINUX,
PHP 5.1,
CGI 2.0,
PERL 5.0,
PYTHON 2.4,
RUBY 1.8.4,
MYSQL 4.1, 5.0

Site http://dead-rising-wiki.com

(written as dead-wiki.com below)

Attempting to extract using imageharvest.py from the (imaginary) wikia page:

http://dead.wikia.com/wiki/Tape

user-config.py

I added console_encoding = 'utf-8'; because it would not work without this, although I have no idea what it does.

mylang='en'
family = 'dead';
usernames['dead']['en']=u'bobtheperv';
console_encoding = 'utf-8';


Family file

Here is my family file:

# -*- coding: utf-8  -*-

import family

# dead Wiki.
class Family(family.Family):

    def __init__(self):

        family.Family.__init__(self)
        self.name = 'dead' #Set the family name; this should be the same as in the filename.

        self.langs = {
            'en': 'www.dead-wiki.com', #Put the hostname here.
        }

    def version(self, code):
        return "1.6.10"  #The MediaWiki version used. Not very important in most cases.

    def scriptpath(self, code):
        return '/w/index.php' #The value of {{SCRIPTPATH}} on this wiki

    def apipath(self, code):
        return '/w/api.php' #The path of api.php

HTTP Error 500: Internal Server Error

Because of the pywikibot.exceptions.PageNotFound error above, I changed my family file to:

    def scriptpath(self, code):
        return '/w' #The value of {{SCRIPTPATH}} on this wiki

    def apipath(self, code):
        return '/w' #The path of api.php

I now get:

HTTP Error 500: Internal Server Error
unicode test: triggers problem #3081100

I continue to get the following message when I log in:

unicode test: triggers problem #3081100[5]

I think it is the line in my user-config.py file:

console_encoding = 'utf-8';

Again, nothing worked until I added this line, but I have no idea what it does.

Adamtheclown 12:47, 15 November 2010 (UTC)

The following worked

Family file:

# -*- coding: utf-8  -*-

import family

class Family(family.Family):

    def __init__(self):

        family.Family.__init__(self)
        self.name = 'dead' #Set the family name; this should be the same as in the filename.

        self.langs = {
            'en': 'www.dead-rising-wiki.com', #Put the hostname here.
        }

    def version(self, code):
        return "1.16.0"  #The MediaWiki version used. Not very important in most cases.

    def scriptpath(self, code):
        return '/w/index.php' #The value of {{SCRIPTPATH}} on this wiki

    def apipath(self, code):
        return '/w/api.php' #The path of api.php

user-config:

mylang='en'
family = 'dead';
usernames['dead']['en']=u'bobtheperv';
console_encoding = 'utf-8';

Adamtheclown 08:16, 19 November 2010 (UTC)

Error downloading data: Extra data: line 2 column 1 - line 6 column 1 (char 158 - 458)

Hello.

I've installed pywikipediabot as instructed, created a new family file and edited it as described. When I run the login.py script and log in as a user, I get the error message in the title: "Error downloading data: Extra data: line 2 column 1 - line 6 column 1 (char 158 - 458)".

Where is this coming from? How can I fix it?

I am using short URLs on my wiki, and I haven't blacklisted api.php in my .htaccess, because I don't know how to. If this might be the problem, please teach me how to blacklist a certain file :-)

Kind regards, Dennis

(85.80.227.205 21:13, 11 December 2010 (UTC))
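For what it is worth, "Extra data" is the message Python's json module gives when a valid JSON value is followed by more non-whitespace text, so the server is probably sending something after the API answer. A minimal illustration with a made-up response body (the PHP notice is invented for the example); json.JSONDecoder.raw_decode is a standard-library call that reports where the first value ends:

```python
import json

# Made-up body: one JSON object followed by unexpected trailing output.
body = '{"login": {"result": "NeedToken"}}\n<b>Notice</b>: something printed by PHP'

try:
    json.loads(body)
    message = None
except ValueError as error:
    message = str(error)

# raw_decode parses the first JSON value and returns the index where it stopped.
value, end = json.JSONDecoder().raw_decode(body)
trailing = body[end:].strip()

print(message)   # the "Extra data" error text
print(trailing)  # whatever followed the JSON
```

Looking at the trailing part of the real response (for example by opening the api.php URL in a browser and viewing the source) should show what is being appended.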

This is a finicky, finicky beast; hopefully tamed by now

A previously working family and user-config file on MW 1.15.4 broke when upgrading to MW 1.16.1. First, it was JSON object errors, then some odd XML warnings.

Here is what works for me in the family file (excerpt):

    def protocol(self, code):
        """
        Can be overridden to return 'https'. Other protocols are not supported.
        """
        #return 'http'
        return 'https' # My server uses https

    def scriptpath(self, code):
        """The prefix used to locate scripts on this wiki.

        This is the value displayed when you enter {{SCRIPTPATH}} on a
        wiki page (often displayed at [[Help:Variables]] if the wiki has
        copied the master help page correctly).

        The default value is the one used on Wikimedia Foundation wikis,
        but needs to be overridden in the family file for any wiki that
        uses a different value."""
        # return '/w'
        return '/scriptpath' # changed here to be more applicable to other people. I use Apache aliases - this is the alias, not the target.

    # IMPORTANT: if your wiki does not support the api.php interface,
    # you must uncomment the second line of this method:
    def apipath(self, code):
        # raise NotImplementedError, "%s wiki family does not support api.php" % self.name
        #return '%s/api.php' % self.scriptpath(code)
        return '/scriptpath/api.php' # The scriptpath + api.php; again, if using Apache aliases, use the alias, not the target.

I hope this helps someone else. It also looks like some extensions introduced BOMs into their files on recent updates for some stupid reason; I used the BOM fixes above to get rid of them (and thus, fix the JSON error).

--196.46.120.254 12:30, 8 February 2011 (UTC)

Merged content

This section, without the section header and the italic comments, was here on mw.org before the import of the above content from Meta.

I followed everything step by step with the recommended Python 2.4 and I get all these errors:

D:\>d:\development\python\python.exe d:\development\pymw\login.py
Traceback (most recent call last):
  File "d:\development\pymw\login.py", line 58, in <module>
    import re, os, query
  File "d:\development\pymw\query.py", line 28, in <module>
    import wikipedia, time
  File "d:\development\pymw\wikipedia.py", line 4618, in <module>
    class Site(object):
  File "d:\development\pymw\wikipedia.py", line 4789, in Site
    def __init__(self, code, fam=ncdb, user=root, persistent_http = False ):
NameError: name 'ncdb' is not defined

The preceding unsigned comment was added by 188.126.88.241 (talk • contribs) 01:25, 22 December 2010. Please sign your posts with ~~~~!

End merged content

incorrect api.php path in command line

When I try to log in to my wiki with the bot, I see this error:

No handlers could be found for logger "pywiki"
Logging in to westeros:fa as SiteBot via API.
Traceback (most recent call last):
  File "C:\pywikipedia\login.py", line 436, in <module>
    main()
  File "C:\pywikipedia\login.py", line 432, in main
    loginMan.login()
  File "C:\pywikipedia\login.py", line 319, in login
    cookiedata = self.getCookie(api)
  File "C:\pywikipedia\login.py", line 181, in getCookie
    response, data = query.GetData(predata, self.site, sysop=self.sysop, back_response = True)
  File "C:\pywikipedia\pywikibot\support.py", line 121, in wrapper
    return method(*__args, **__kw)
  File "C:\pywikipedia\query.py", line 143, in GetData
    res, jsontext = site.postForm(path, params, sysop, site.cookies(sysop = sysop) )
  File "C:\pywikipedia\wikipedia.py", line 6460, in postForm
    cookies=cookies)
  File "C:\pywikipedia\wikipedia.py", line 6514, in postData
    raise PageNotFound(u'Page %s could not be retrieved. Check your family file ?' % url)
pywikibot.exceptions.PageNotFound: Page http://www.westeros.ir/w/api.php could not be retrieved. Check your family file?

My family file is this:

# -*- coding: utf-8  -*-

import family

# westeros

class Family(family.Family):
    def __init__(self):
        family.Family.__init__(self)

        self.name = 'westeros'

        self.langs = {
                'fa': 'www.westeros.ir',
        }
        
    def version(self, code):
        return "1.19.3"
	def scriptpath(self, code):
		return '/wiki'
	def apipath(self, code):
		return '/wiki'

api.php is in the '/wiki' folder, so why does it say "Page http://www.westeros.ir/w/api.php could not be retrieved"? Where should I change "/w" to "/wiki" other than in the family.py file?
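One thing that stands out in the family file above: version() is indented with spaces, while scriptpath() and apipath() appear to be indented with tabs. Python 2 expands a leading tab to eight columns, so those two def lines would line up with the return statement and end up inside version(), after the return, where they are never executed. The class would then fall back to the default path, which would explain the /w in the error. The sketch below reproduces the effect using spaces only; BaseFamily is a simplified stand-in for the framework's family.Family, not the real class. Separately, apipath() presumably needs to return the full '/wiki/api.php' rather than just the folder.

```python
class BaseFamily(object):
    """Simplified stand-in for family.Family with the default paths."""

    def scriptpath(self, code):
        return '/w'

    def apipath(self, code):
        return '%s/api.php' % self.scriptpath(code)


class BrokenFamily(BaseFamily):
    def version(self, code):
        return "1.19.3"
        # Equivalent of the tab-indented lines: nested in version() and unreachable.
        def scriptpath(self, code):
            return '/wiki'


class FixedFamily(BaseFamily):
    def version(self, code):
        return "1.19.3"

    def scriptpath(self, code):
        return '/wiki'


print(BrokenFamily().apipath('fa'))  # /w/api.php (default still used)
print(FixedFamily().apipath('fa'))   # /wiki/api.php (override takes effect)
```

Re-indenting the whole family file with four spaces per level, and no tabs, should be enough to check whether this is the cause.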

Uncyclopedia

Do we really need a screenful on the history of Uncyclopedia? I doubt it. Nemo 15:47, 17 May 2020 (UTC)

Wikibase

The manual states:

Once the family module exists for the Wikibase repository, it needs to be modified so that the Family subclass tells Pywikibot that it supports Wikibase.

There is no explanation or example of the actual parameters to set. Can anyone improve the documentation by adding a practical explanation of what to do? --Luca Mauri (talk) 14:01, 13 August 2021 (UTC)

Parameter ns / namespace not working

Hi all, I can't get my bot working on specific namespaces. I can use -start:File:! to have it run through that namespace. But neither

  • -ns:1
  • -namespace:1
  • -ns:Talk
  • -namespace:Talk

will work. I just get: ERROR: Unable to execute script because no generator was defined. Use -help for further information.

What is wrong? Do I have to set up namespaces in the family file? How do I do this? Thanks a lot in advance! --Plasmarelais (talk) 12:40, 12 November 2021 (UTC)

The -ns parameter is not a generator, but a filter. Are you using -ns:1 with the -start parameter? Ciencia Al Poder (talk) 12:14, 14 November 2021 (UTC)
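To illustrate the distinction with plain Python (this is a toy model, not Pywikibot code; all_pages and namespace_filter are invented names): a generator produces pages, a filter only narrows what a generator produces, so a filter on its own has nothing to work on.

```python
def all_pages():
    """Toy generator: yields (namespace, title) pairs, like a page source would."""
    yield (0, "Main Page")
    yield (1, "Talk:Main Page")
    yield (6, "File:Example.png")


def namespace_filter(pages, namespace):
    """Toy filter: passes through only the pages in the given namespace."""
    return (page for page in pages if page[0] == namespace)


# Generator plus filter: only the namespace-1 page is left.
talk_pages = list(namespace_filter(all_pages(), 1))
print(talk_pages)
```

On the command line this corresponds to giving a generator option, such as the -start parameter mentioned above, together with -ns:1.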