Topic on User talk:APaskulin (WMF)

Klein Muçi (talkcontribs)

Hello! :) I'm an admin of the Albanian Wikipedia and Wikiquote. I was trying to follow the instructions mentioned here for my pywikibot, which operates on the projects mentioned above, and they generally work fine (I'm using curl), but there are no examples of how to log in. I saw the page history, noticed that you were one of the main contributors, and was wondering if you could help me with that. How can I make a request to log in?

APaskulin (WMF) (talkcontribs)

Hi @Klein Muçi, The MediaWiki REST API is designed to be used with OAuth for authorization. OAuth/For Developers has some helpful information about the different OAuth flows, depending on the type of login you're doing. For example, in this request:

# Create a user sandbox page on English Wikipedia
$ curl -X POST https://en.wikipedia.org/w/rest.php/v1/page -H "Content-Type: application/json" -H "Authorization: Bearer $TOKEN" --data '{"source": "Hello, world!", "title": "User:<my username>/Sandbox", "comment": "Creating a test page with the REST API"}'

You'd want to replace $TOKEN with your OAuth access token. Let me know if this helps!
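
In case a concrete shape helps: if I remember correctly, the client credentials flow (one of the flows described on that page) exchanges your client id and secret for an access token with a request roughly like this (a hedged sketch; the id and secret are placeholders):

# Request an OAuth 2.0 access token using the client credentials grant
curl -X POST https://meta.wikimedia.org/w/rest.php/oauth2/access_token --data "grant_type=client_credentials" --data "client_id=<your client id>" --data "client_secret=<your client secret>"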

Klein Muçi (talkcontribs)

Yes, I believe I'm already past the OAuth part, even though I didn't follow all the steps mentioned there. I've submitted the form and got the 4 tokens. (I've used them in user-config.py to authenticate when applying fixes. My bot is a replace bot.)


I was trying to find something like the first method described here: API:Login. For some reason, maybe because of gaps in my knowledge, I can't make it work even though I think I followed the steps carefully. The response I get looks nothing like the correct or error responses I should be getting. Maybe it has to do with the fact that I'm using curl in cmd, and that page has no curl examples. On the other hand, I tried some of the examples mentioned here and they behave normally for me; the responses are generally what they should be. So I was wondering if there was a way to "adapt" the method described on the first page to work like the examples on the second page.


Basically, what I want to do is have my bot log in to SqWiki, edit a certain page in a certain manner and save the results to another page, then edit it again and save the results again; this cycle repeats 2-3 times and that's it. I already have examples of editing pages, but I'm confused about the login part.

Klein Muçi (talkcontribs)

Are you implying that I should sort of skip the login part and just proceed with the edit request, which would come together with the token, and that this would not only make the edit but also automatically authenticate my bot (and therefore put the edit under its contributions)?


I tried using the example above with EnWiki and SqWiki using the access token. I get this error: {"error":"rest-read-denied","httpCode":403,"httpReason":"Forbidden"}

APaskulin (WMF) (talkcontribs)

> Are you implying that I should sort of skip the login part and just proceed with the edit request, which would come together with the token, and that this would not only make the edit but also automatically authenticate my bot (and therefore put the edit under its contributions)?

Yes, exactly!


> I tried using the example above with EnWiki and SqWiki using the access token. I get this error: {"error":"rest-read-denied","httpCode":403,"httpReason":"Forbidden"}

This can be the result of an issue with the configuration of your OAuth client. Do you have a link to your client? (Example: meta:Special:OAuthListConsumers/view/b7f698ff5fcc830e98d1dae7eabffc53)

Klein Muçi (talkcontribs)
APaskulin (WMF) (talkcontribs)

I wasn't sure myself, but I tested it out just now, and it seems that the MediaWiki REST API only supports OAuth 2.0. (This is definitely something we can improve in the docs, so thanks for bringing this up!) The client you linked to is OAuth 1.0a. Unfortunately, you won't be able to change it, so you'll need to create a new client with the same configuration, except select OAuth 2.0 as the "OAuth protocol version". That should give you an access token that works!

Klein Muçi (talkcontribs)

Oh... Thank you! I'll try it out now and report on the results. :)

Klein Muçi (talkcontribs)

Yes, it works perfectly now. I did notice something, though: the first time I had tried it, I substituted only the "TOKEN" part, leaving the dollar symbol ($) intact. I did the same now and it didn't work again; it only worked after I removed that part as well. So I was thinking that maybe OAuth 1.0 is supported too, and it was just me doing it wrong (I never got the chance to try it correctly). Or did you try it personally and it didn't work?


The reason I ask, apart from not wanting to be the reason the documentation is changed in a wrong way, is that I apparently need OAuth 1.0 to be able to apply the replace fixes I mentioned earlier, according to this page: Manual:Pywikibot/OAuth. Both there and here it's mentioned that you need 4 tokens ('<consumer_key>', '<consumer_secret>', '<access_key>', '<access_secret>') to be able to use them in your user-config.py file, and those 4 tokens are generated with OAuth 1.0. (What I had until now.) I'm not sure if I need to have 2 different clients now because of this. Or maybe I can somehow use the 3 tokens of OAuth 2.0?

Klein Muçi (talkcontribs)

Also, a follow-up question: Do I need to include the access token in every request I make forever?

APaskulin (WMF) (talkcontribs)

> The first time I had tried it, I substituted only the "TOKEN" part, leaving the dollar symbol ($) intact.

$TOKEN is meant to be replaced as a whole, both the $ symbol and the token. I did test OAuth 1.0 personally, and it is not supported.


> those 4 tokens are generated with OAuth 1.0

That's correct: Pywikibot only supports OAuth 1.0a. Pywikibot and API:REST API provide similar functionality, so it's possible that you may not need to use both. To provide more guidance here, I'd need to understand more about your code.


> Do I need to include the access token in every request I make forever?

Yes, the token will need to be included in all the requests that your bot makes, but there are ways to make that easier depending on how your code is set up. Where are you running your code?

Klein Muçi (talkcontribs)

First of all, thank you so much for being available to answer questions about this. I've spent the past few months solving everything by trial and error because I couldn't find anyone to help on this subject. You're usually advised to go for help to IRC or the mailing lists, and unfortunately I haven't had good experiences with those.


As for the code... I run a replace pywikibot. It has a long list (around 40k lines) of regexes that help with some simple problems related to citations on SqWiki and SqQuote, for example removing deprecated CS1 parameters, updating some old ones to their new corresponding ones, or adapting language parameters to their ISO codes (language=Inglese -> language=en). Everything is done following the directions here. The set of regexes is set up as user fixes (Manual:Pywikibot/user-fixes.py), and at the moment I have 2 specific files (one for SqWiki and one for SqQuote), basically 2 scripts (made up of only the command to run the user fixes for the specific projects), which are run periodically on ToolForge as cronjobs. OAuth 1.0a was needed for this.


Things worked fine, but every 2-3 months I needed to manually update the source code (the list of regexes) to reflect the changes that MediaWiki may have made to languages, or to add the deprecated parameters. This is done by putting the language codes as values on a template on SqWiki, and a module created in Lua specifically for the bot auto-generates the list of regexes. Now I was hoping to automate this part as well. Ideally, the bot would log in, get the list of language codes, feed them to the template and copy the results, 3-5 times (because you can't do them all in one go given the vast number of lines, which hits a limit on MediaWiki), and by doing so complete the autoupdate. I started doing this with Elinks (the terminal web browser), but I quickly found out that curl was the right thing to use, and this brought me here when I started learning how to access the API with it. I'm running everything in cmd.exe on Windows, using it to ssh into ToolForge.


My guess is that I have to have both (OAuth 1.0a and OAuth 2.0), and I'll try to somehow make the process described above happen. If you have any ideas or advice on that, everything is appreciated, because I'm basically starting from scratch. Sorry for the long message; I wanted to explain myself as best as I could. If you're fine with the idea, I can send you a wiki email with the source code itself and other details that may be needed.

APaskulin (WMF) (talkcontribs)

> The bot would log in, get the list of language codes, feed them to the template and copy the results, 3-5 times (because you can't do them all in one go given the vast number of lines, which hits a limit on MediaWiki), and by doing so complete the autoupdate.

Although I don't have much experience with Pywikibot myself, I would assume that you'd be able to do this with Pywikibot and OAuth 1.0a. For example, Manual:Pywikibot/Create your own script shows how to read and edit pages using Pywikibot. It seems like having a third script running on ToolForge as a cronjob would be simpler than using both Pywikibot and the REST API. But I'm also assuming that you probably already tried this.

> I quickly found out that curl was the right thing to use

You can also use Python with the REST API. For example, in API:REST_API/Reference#Examples_4, there's a "Python" tab that brings up an example of using the endpoint in Python. In that example, you can see that the access token is saved to the "headers" variable that you can reuse in multiple requests.

If you do want to continue using curl, you can set your access token as an environment variable so you don't have to copy it every time. Here's a guide to doing that in Linux: https://www.digitalocean.com/community/tutorials/how-to-read-and-set-environmental-and-shell-variables-on-linux. (I'm assuming that you're working with Linux because you're ssh-ing into ToolForge.)
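
For example, a minimal sketch of that approach (the token value is a placeholder):

# Set the token once per shell session, then reuse it in every request
export TOKEN="<your OAuth 2.0 access token>"
curl -H "Authorization: Bearer $TOKEN" https://sq.wikipedia.org/w/rest.php/v1/page/P%C3%ABrdoruesi:Smallem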

I hope this is helpful! I don't have any firsthand experience with running bots, but feel free to continue asking questions and I'll do my best :)

Klein Muçi (talkcontribs)

Actually, I haven't tried that. I did check it now real fast (re-checked it, actually, because I've read everything related to pywikibot countless times in a frenzy when I had problems and couldn't find anyone to ask for help :P) and, if I'm not wrong, everything described there is done with Python in mind. Unfortunately, I have zero experience with coding in Python (I've only tried bash scripting and a bit of Javascript/Lua so far), so I don't feel comfortable dealing with it yet. I've actually been advised many times to start working with Python, but my plan was to finish and "secure" the autoupdate process before making the leap into experiments with Python.


I'm working with Linux, yes. I'll first try to fix the Pywikibot/OAuth 1.0a problem I created when I wrongly deleted the first client (thinking I wouldn't need it anymore), then deal with setting the access token as an environment variable, and finally, hopefully, start finding a way to make the autoupdate happen. I wanted to ask if I could count on your help in the not-so-distant (unfortunately) future regarding this, but you've already answered that, so... :P Thank you one more time! :))

Klein Muçi (talkcontribs)

Back quicker than I hoped for.


curl https://sq.wikipedia.org/w/rest.php/v1/page/P%C3%ABrdoruesi:Smallem gets me the source of https://sq.wikipedia.org/wiki/P%C3%ABrdoruesi:Smallem.


curl https://sq.wikipedia.org/w/rest.php/v1/page/P%C3%ABrdoruesi:Smallem/Livadhi_personal gets me an error and not the source of https://sq.wikipedia.org/wiki/P%C3%ABrdoruesi:Smallem/Livadhi_personal (which serves as a template for getting the language codes and generating the list of regexes).


Why is that so? I tried different combinations, and apparently I'm not able to make that command work with any kind of subpage.

APaskulin (WMF) (talkcontribs)
Klein Muçi (talkcontribs)

Oh, I was trying to escape that with quotes or a backslash but it didn't work. Thank you! :)

Klein Muçi (talkcontribs)

So, I have a file made up of x lines (303 now, may change in the future), one code per line. I want to use curl to put the lines (codes), in batches of 100 lines at a time, into the link above and copy the generated results into another file.

My idea was to copy 100 lines from File0 to File1, use curl to send a PUT request with the lines from File1, copy the generated results back into File3, delete 100 lines from File0, copy 100 lines from File0 to File1 (overwriting the first 100), use curl to send a PUT request with the lines from File1, copy the generated results back into File3 (append), ... and repeat until no lines are left in File0.

Now, unfortunately, I have some problems putting that idea into practice. First of all, as far as I understand, the revision ID on the PUT request needs to be manually updated with every edit made. Is there an easy way around this problem if I'm going to make it automatic (without me having to devise a script just to sort that out)? Basically, a way to just tell the PUT request to find out the latest revision ID and use that, whatever it is. Secondly, how would I go about getting lines from a certain file with curl? Basically, implementing the cat command as part of the source in the data option sent with the PUT request. And thirdly, I have to learn how to make loops happen in Linux, so any help you can provide in that direction would be greatly appreciated.

APaskulin (WMF) (talkcontribs)

> the revision ID on the PUT request needs to be manually updated with every edit made. Is there an easy way around this problem if I'm going to make it automatic (without me having to devise a script just to sort that out)? Basically, a way to just tell the PUT request to find out the latest revision ID and use that, whatever it is.

The get page endpoint (API:REST_API/Reference#Get_page) will always give you the latest revision ID for a page.
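
For example (a sketch against one of your pages), the bare variant of that endpoint returns a small JSON object whose "latest" field carries the current revision ID:

# The response includes something like "latest": { "id": <current revision ID>, ... }
curl https://sq.wikipedia.org/w/rest.php/v1/page/P%C3%ABrdoruesi:Smallem/bare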

> Secondly, how would I go about getting lines from a certain file with curl?

sed might be helpful here: https://www.digitalocean.com/community/tutorials/the-basics-of-using-the-sed-stream-editor-to-manipulate-text-in-linux
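
For example, a rough sketch of the copy-and-delete steps from your scheme (using the File0 and File1 names from above):

sed -n '1,100p' File0 > File1   # copy the first 100 lines of File0 into File1
sed -i '1,100d' File0           # then delete those lines from File0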

> thirdly, I have to learn how to make loops happen in Linux, so any help you can provide in that direction would be greatly appreciated.

The trick with programming is mostly knowing what to search for :) For this, helpful search phrases would be "loops in bash" and "loops in shell scripts". Here's a tutorial that might help: https://linuxize.com/post/bash-for-loop/
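
Just to sketch the shape such a loop can take (with File0 standing in for your input file):

# Repeat as long as File0 exists and is not empty (-s tests for a size greater than zero)
while [ -s File0 ]; do
  # ... send one batch and append the results somewhere ...
  sed -i '1,100d' File0
done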

Klein Muçi (talkcontribs)

> The get page endpoint (API:REST_API/Reference#Get_page) will always give you the latest revision ID for a page.


Yes, but if I plan to send a lot of curl PUT requests in a row automatically, that turns out to be a problem, because apparently I need the latest revision ID set in every request, and that adds an extra step or more overall.


I was hoping to have something like this:

curl -X PUT https://en.wikipedia.org/w/rest.php/v1/page/Wikipedia:Sandbox -H "Content-Type: application/json" -H "Authorization: Bearer $TOKEN" --data '{"source": "Hello, world!", "comment": "Testing out the REST API", "latest": { "id": get latest automatically }}'


This is sort of the same problem as with utilizing the cat/sed/etc. commands. I was again hoping for something like this:


curl -X PUT https://en.wikipedia.org/w/rest.php/v1/page/Wikipedia:Sandbox -H "Content-Type: application/json" -H "Authorization: Bearer $TOKEN" --data '{"source": "(cat File1)", "comment": "Testing out the REST API", "latest": { "id": 555555555 }}'


Can curl requests be used in any way close to that?

APaskulin (WMF) (talkcontribs)

I don't have much experience with this type of shell scripting; I usually use Python. But I did a quick search, and I'd try something like this https://stackoverflow.com/a/17032673

Klein Muçi (talkcontribs)

Thank you! I've been searching through StackOverflow all day today. So theoretically it is possible to put something other than plain strings in the data option. And how about the ID number? There's no way it can be resolved automatically, eh? Getting it with a GET request every time just doesn't work in the scheme I devised above. It's a shame, because that's the only missing step.


But thank you a lot anyway for your patience and help! :)

APaskulin (WMF) (talkcontribs)

That's correct: I don't believe there's an easier way than using a GET request to get the latest revision ID. For example, if you have jq installed, you can use a function like this:

# Fetch the page metadata and extract the latest revision ID with jq
getrevisionid()
{
  curl https://en.wikipedia.org/w/rest.php/v1/page/Jupiter/bare | jq -r '.latest.id'
}

(Except with your correct authorization.) Then you can use the revision ID in your PUT request using $(getrevisionid).
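
For example, a sketch along the lines of the earlier edit request (placeholder page and token, with getrevisionid pointed at the same page you're editing):

curl -X PUT https://en.wikipedia.org/w/rest.php/v1/page/Wikipedia:Sandbox -H "Content-Type: application/json" -H "Authorization: Bearer $TOKEN" --data '{"source": "Hello, world!", "comment": "Testing out the REST API", "latest": { "id": "'$(getrevisionid)'" }}'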

Klein Muçi (talkcontribs)

Oh... I didn't know about that command. Maybe not all hope is lost. LOL Strangely enough, curl https://en.wikipedia.org/w/rest.php/v1/page/Jupiter/bare | jq -r works, while if I add the last part of the command ('.latest.i$), it doesn't (same when trying to apply it as a function). Maybe there is a typo somewhere? Or maybe I'm doing something wrong.

APaskulin (WMF) (talkcontribs)

My mistake! I copied the function incorrectly from my terminal. It's correct now.

Klein Muçi (talkcontribs)

Thank you! It works now. I tried following your example to create the other variable I needed.


getcodevalues()
{
  echo "{{#invoke:Smallem|lang_lister|list=y}}{{#invoke:Smallem|lang_lister|lang=" ; head -n 50 TEST | tr "\n" " " ; echo "}}"
}


This produces:


{{#invoke:Smallem|lang_lister|list=y}}{{#invoke:Smallem|lang_lister|lang= aa, ab, ace, ady, af, ak, als, alt, am, an, ang, ar, arc, ary, arz, as, ast, atj, av, avk, awa, ay, az, azb, ba, ban, bar, bat-smg, bcl, be, be-tarask, be-x-old, bg, bh, bi, bjn, bm, bn, bo, bpy, br, bs, bug, bxr, ca, cbk-zam, cdo, ce, ceb, ch, }}


When I put the aforementioned string directly in the PUT request, it works fine. The request goes through and the whole string is interpreted as just that, a plain text string.

When I pass it through a variable, "'$(getcodevalues)'", I get this:


"httpCode":400, "httpReason": "Bad Request"

curl: (6) Could not resolve host: aa,

curl: (6) Could not resolve host: ab,

curl: (6) Could not resolve host: ace,

curl: (6) Could not resolve host: ady,

... (that goes on for all the codes)

curl: (3) [globbing] unmatched close brace/bracket in column 1


Any idea what may be confusing the request? I read that I can disable globbing, but I don't know how to fix the first part.

APaskulin (WMF) (talkcontribs)

If you can share your PUT request, I can take a look. Make sure to remove your authorization tokens before sharing :)

Klein Muçi (talkcontribs)

curl -X PUT https://sq.wikipedia.org/w/rest.php/v1/page/P%C3%ABrdoruesi%3ASmallem%2FLivadhi%20personal%2FRegexList -H "Content-Type: application/json" -H "Authorization: Bearer $(getaccesstoken)" --data '{"source": "'$(getcodevalues)'", "comment": "Testing out the REST API", "latest": { "id": "'$(getrevisionid)'" }}'

APaskulin (WMF) (talkcontribs)

I was able to re-create your error locally, and it definitely seems like the JSON is getting parsed incorrectly when it's pulled into the PUT request. I wasn't able to find a quick fix for this, but some things you might want to try are: 1) building the data payload in a function so that you're only passing one function to the data option in the API request (as they do in this example: https://stackoverflow.com/questions/17029902/using-curl-post-with-variables-defined-in-bash-script-functions/17032673#17032673), or 2) using jq (I saw some answers on StackOverflow for this, but nothing that I was able to quickly test out).
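
For example, here's an untested sketch of option 2, assuming your getcodevalues and getrevisionid functions: jq builds the payload and handles the JSON escaping, so the request only has to pass one quoted string:

getpayload()
{
  # jq -n builds JSON from scratch; --arg escapes the values safely
  jq -n --arg src "$(getcodevalues)" --arg id "$(getrevisionid)" '{source: $src, comment: "Testing out the REST API", latest: {id: ($id | tonumber)}}'
}

Then the request would use --data "$(getpayload)".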

Klein Muçi (talkcontribs)

I don't understand exactly what you mean. I did use the page you mentioned above to create a function for the source code, which is what brings up the problem we're discussing. And I'm not sure how I would utilize jq in this case; I mean, the general idea of using it.

Klein Muçi (talkcontribs)

Update: I was able to make the variable work by adding 1 extra set of quotes around it.


The final request is like this:


curl -X PUT https://sq.wikipedia.org/w/rest.php/v1/page/P%C3%ABrdoruesi%3ASmallem%2FLivadhi%20personal%2FRegexList -H "Content-Type: application/json" -H "Authorization: Bearer $(getaccesstoken)" --data '{"source": "'"$(getcodevalues)"'", "comment": "Testing out the REST API", "latest": { "id": "'$(getrevisionid)'" }}'


Now I'll work on sorting out the loop and then I'll be done. I hope I don't bother you too much during this phase. :P :)

Klein Muçi (talkcontribs)

Can you help me write the first part of this while loop statement:


While the filesize of myfile.txt is greater than 0, do...


I'm good with the last part, I believe. I'm just having difficulties extracting the file size as an integer to make the comparison with 0.

Klein Muçi (talkcontribs)

Update: I was able to make that work with while [ $(wc -l < myfile.txt) -gt 0 ]; do... :)

Klein Muçi (talkcontribs)

I'm running into a strange problem. The loop works fine, but the overall program gets confused by the revision ID variable every once in a while and throws an edit conflict error. Making the loop sleep after every iteration does make the edit conflicts appear more rarely, but I can't seem to get rid of them, even after setting the sleep time to 10 minutes per iteration. No one else is really editing anything on the page I'm working on.


Do you have any idea what's really going on? This used to happen even when I was sending requests manually (without loops), but then I could just resend the request and it would work fine. Now, every time I get an edit conflict, some part of the data goes missing; because each request is unique, even one error is enough to corrupt the whole data set, unfortunately.

APaskulin (WMF) (talkcontribs)

I'm surprised to hear that you're running into edit conflicts, even with the latest revision ID and using a sleep. If you're open to it, I'd suggest opening an issue on Phabricator to get some advice from the developers who implemented the API: How to report a bug

Klein Muçi (talkcontribs)

Yes. It's always the same pattern: 2 requests go through, 1 gets stuck in a conflict. Sleep time doesn't play a role if it's more than 30 seconds. Setting it to less than 30 seconds makes the edit conflicts appear more frequently and fewer requests go through, but once you get past the 30-second mark, you can make it wait even 15 minutes between requests and it will still produce the same pattern: 2 requests go through, 1 gets stuck in a conflict.


I'm a bit reluctant to go to Phabricator for this, fearing that I won't be able to explain myself clearly or that the issue is caused on my side. But if you think that would be a good step to take, I'll try it.

Klein Muçi (talkcontribs)

Update: If I add the revision ID manually, it works fine every time, without needing any sleep time. When I use the function you gave me, the 2:1 pattern starts manifesting itself. Could it be that I need to change something in regard to that? I've set all of the variables as globals, if that plays any role.

APaskulin (WMF) (talkcontribs)

Do you think the issue is that the GET request is returning the wrong latest revision ID or that the PUT request is failing even though it has the correct latest revision ID?

Klein Muçi (talkcontribs)

I think the GET request is returning an outdated (hence wrong) revision ID every once in a while (basically on every third request).

Klein Muçi (talkcontribs)

Update: It's definitely related to the GET request not getting updated in time. I ran an experiment to watch the revision ID in real time while the requests were going through. Strangely, this forces the revision ID to update in time, and every request starts going through; the edit conflicts completely disappear. I redid the test 10 times, and the same results were reproduced every time.


This gets into edit conflicts:

while [ $(wc -l < TEST) -gt 0 ]; do curl -X PUT https://sq.wikipedia.org/w/rest.php/v1/page/P%C3%ABrdoruesi%3ASmallem%2FLivadhi%20personal%2FRegexList -H "Content-Type: application/json" -H "Authorization: Bearer $(getaccesstoken)" --data '{"source": "'"$(getcodevalues)"'", "comment": "Testing out the REST API", "latest": { "id": "'$(getrevisionid)'" }}' && sed -i "1,100d" TEST && curl https://sq.wikipedia.org/w/rest.php/v1/page/P%C3%ABrdoruesi%3ASmallem%2FLivadhi%20personal%2FRegexList/html | grep -P -o '(?<=RegexListBegin:).*?(?=:RegexListEnd)' | fmt -1 >> TEST2 ; sleep 10s ; done


This never gets into edit conflicts (notice the added command):

while [ $(wc -l < TEST) -gt 0 ]; do curl -X PUT https://sq.wikipedia.org/w/rest.php/v1/page/P%C3%ABrdoruesi%3ASmallem%2FLivadhi%20personal%2FRegexList -H "Content-Type: application/json" -H "Authorization: Bearer $(getaccesstoken)" --data '{"source": "'"$(getcodevalues)"'", "comment": "Testing out the REST API", "latest": { "id": "'$(getrevisionid)'" }}' && sed -i "1,100d" TEST && curl https://sq.wikipedia.org/w/rest.php/v1/page/P%C3%ABrdoruesi%3ASmallem%2FLivadhi%20personal%2FRegexList/html | grep -P -o '(?<=RegexListBegin:).*?(?=:RegexListEnd)' | fmt -1 >> TEST2 ; curl https://sq.wikipedia.org/w/rest.php/v1/page/P%C3%ABrdoruesi%3ASmallem%2FLivadhi%20personal%2FRegexList | jq -r '.latest.id' ; sleep 10s ; done


I removed the added command and the 2:1 pattern reappears. I ran 5 tests like this and all of them showed the 2:1 pattern. So basically I now have a solution, but it's a crude hack. Why doesn't the function work on its own 1/3 of the time, and why does adding that line make it work? (Take the final question more as a brainstorming invitation than a request.) And I'm really sorry for the unaesthetic code; I've yet to apply cosmetic changes to it.

Klein Muçi (talkcontribs)

If I echo the $(getrevisionid) variable anywhere in the loop, that works too. I just can't understand why it malfunctions when the variable is only used on its own.

APaskulin (WMF) (talkcontribs)

Thanks for this extra info! I'll keep thinking about this and let you know if I have any ideas.

Klein Muçi (talkcontribs)

On another subject: when I send a GET request, am I always getting the most up-to-date version of the page, or a cached version of it? If it's the latter, is there an easy way to send a purge command?

APaskulin (WMF) (talkcontribs)

I was also thinking that it might be a caching issue, but I don't know much about the caching behavior of that endpoint. Here's an idea: there are a few headers related to cache behavior that we might be able to look at to see a difference between the requests. If you add -v to the curl requests (curl -v https...), it will output the headers for the request. Then we can see if there are any differences in the cache-related headers like x-cache-status.
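
For example, a quick sketch (curl -v prints the headers to stderr, so it needs redirecting before grep):

curl -v https://sq.wikipedia.org/w/rest.php/v1/page/P%C3%ABrdoruesi:Smallem/bare 2>&1 | grep -i cache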

Klein Muçi (talkcontribs)

Hey there! Thanks for the answer!


I'm a bit confused. My last question wasn't related to the problem we were discussing earlier but to another situation. The thing is that my bot needs to go to this page, get the codes from there, and start using them, in groups of 10 per request, on another page. The codes on that page are generated by the invocation of a certain module (check the source code), which generates them depending on the MediaWiki language codes. If those change, so do the module's results. The question was, in this specific context: if the codes were changed by MediaWiki, could I be sure that I was getting the most up-to-date version of them when sending a GET request to that page? Or would I need to somehow make a null edit to the page to be sure of that? I'm reluctant to make POST/PUT requests on other pages because I'd have the same problem we've discussed above (edit conflicts), multiplied.


I guess your answer is related to the edit conflict situation. In the past few days I did try activating the verbose mode as you suggested, but I couldn't understand much from it. What exactly would I be looking for if I redid it now?

Klein Muçi (talkcontribs)

Hey there! Today I had some free time and, out of curiosity, I looked at the verbose mode again to compare the requests when they went through and when they didn't. I compared them line by line and, strangely enough, you were right in your last suggestion: the cache part differs between them.


This is when the request goes through and this is when it doesn't. Please take a look at them whenever you have some free time.


As always, the pattern is 2 go, 1 doesn't. (If you don't do the trick I've mentioned before.)

APaskulin (WMF) (talkcontribs)

I did notice this difference:

This is when the request goes through:

< cache-control: private, max-age=0, s-maxage=0

and this is when it doesn't:

< cache-control: no-cache

I'm not sure what this means or how to fix it, but I'll try to ask around.

Klein Muçi (talkcontribs)

Yes, I noticed the same thing when sending it. Don't mind the question I asked before this; I was able to solve that. Thank you! :))

APaskulin (WMF) (talkcontribs)

Hi, my apologies that this is taking so long. Would it be possible to re-post the requests on Pastebin? It seems that they've expired, and I can no longer see them. No rush.

Klein Muçi (talkcontribs)

Hey there! No need to apologize at all. I've tried contacting some other Wikimedia users with developer profiles, and they haven't been able to help me at all. You trying to help even after literal months means a lot to me.


I re-uploaded my pasta, this time with a 6-month expiration period. This is when the request goes through and this is when it doesn't.


Also, if needed, I've uploaded the whole script for my bot, with a 6-month expiration date. Maybe you can check whether I'm doing something wrong somewhere else.

Link - Line 38 is the "unnecessary" line that normally shouldn't be there, but if it's not added, the request cycle starts the "2 go, 1 doesn't" pattern. Comments are in Albanian; you can translate them if you want (or ask me), but I think you won't need them much. I can also elaborate as much as you want on the purpose of the script if that's not clear.

APaskulin (WMF) (talkcontribs)

When I try to access those links, I get "This page is no longer available. It has either expired, been removed by its creator, or removed by one of the Pastebin staff." I don't have much experience with Pastebin; maybe they need to be re-uploaded?

Klein Muçi (talkcontribs)

I re-uploaded them all and updated the links above; they work now. My belief is that they may have been deleted by the administrators because they contain sensitive information. For the same reason, I would urge you to look at them as soon as possible and copy the needed info to some other place of your choice.

Klein Muçi (talkcontribs)

Apparently they still have problems. I used another service. The links should be working fine now.

Klein Muçi (talkcontribs)

Hello! :) I don't know if you have seen the "pastas" I sent above, but when you do, please save them somewhere of your choice and notify me so I can delete them from the links above, given that they contain confidential information.

APaskulin (WMF) (talkcontribs)

Thank you! I've saved them locally, so you can remove them from codepile.

APaskulin (WMF) (talkcontribs)

Hi @Klein Muçi, Apologies for taking so long to get back to you! I haven't gotten a chance to set up a test and open a Phabricator task for the issue you discovered, but hopefully I can get to it soon.

I wanted to reach out and see if you would be interested in participating in a user research study with the Wikimedia Foundation. I'm currently working on a project to make it easier to find technical documentation related to Wikimedia APIs and tools. My team is looking for people to help us by providing feedback, and I think your feedback would be really valuable. This would be a 60-minute interview in the next couple of weeks. If you're interested, please fill out this survey. Thank you!

Klein Muçi (talkcontribs)

Hello! No problem at all. Throughout this time my bot job has evolved in ways in which that part wouldn't be that helpful anyway, although it would be interesting to understand why I kept seeing that pattern of requests (2 go, 1 doesn't). I've asked a lot of people about it and no one was able to help me.

As for the interview, I'd be glad to help in written form if that's possible, but I don't really enjoy participating in real-time calls.
