Adding proper email bounce handling to MediaWiki (with VERP)

Public URL
(https://www.mediawiki.org/wiki/VERP)
Bugzilla report
(https://bugzilla.wikimedia.org/show_bug.cgi?id=46640)
Announcement
(http://lists.wikimedia.org/pipermail/wikitech-l/2014-March/074911.html)
Progress Report
(https://www.mediawiki.org/wiki/VERP/GSOC_Progress_Rerport)

Name and contact information

Name
Tony Thomas
Email
01tonythomas@gmail.com
IRC or IM networks/handle(s)
tonythomas
Location
Kerala, India
Timezone
Kolkata, INDIA, UTC+5:30
Typical working hours
5pm to 12:30am (workdays), 10:00am to 9:30pm (weekends)

Project Summary


It's likely that many Wikipedia accounts have a validated email address that once worked but is now out of date. Wikipedia does not currently unsubscribe users who trigger multiple non-transient failures, and some addresses might be 10+ years old. The wiki should not keep sending email that is just going to bounce: it is a waste of resources and might trigger spam heuristics. Two API calls need to be implemented:

  1. One to generate a VERP address to use when sending mail from MediaWiki.
  2. One that records a non-transient failure. That API call would record the current incident and, if some threshold has been met (e.g. at least 3 bounces with the oldest at least 7 days ago), un-confirm the user's address so that mail stops going to it.
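
The threshold rule in the second call can be sketched as follows. This is a minimal Python illustration, not MediaWiki code: the function name `should_unconfirm` and its parameters are hypothetical, and it assumes the caller passes only the timestamps of the bounces it counts as part of the current run of failures.

```python
from datetime import datetime, timedelta
from typing import Iterable


def should_unconfirm(
    bounce_times: Iterable[datetime],
    now: datetime,
    min_bounces: int = 3,                     # "at least 3 bounces"
    min_age: timedelta = timedelta(days=7),   # "oldest at least 7 days ago"
) -> bool:
    """Return True when the recorded bounces meet the un-confirm threshold."""
    times = sorted(bounce_times)
    if len(times) < min_bounces:
        return False
    # The oldest recorded bounce must be at least `min_age` old.
    return now - times[0] >= min_age
```

Requiring both a count and a minimum age means a short burst of bounces, for example during a brief outage at the remote server, does not un-confirm the address on its own.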

For the second call, authentication will be needed so that fake bounces are not a DoS vector or a mechanism for hiding password reset requests. The reason for the threshold is that some failure scenarios resolve themselves (e.g. a mailbox over quota), so we don't want to react to a single bounce. We want a history of consecutive mails bouncing. There would be a MediaWiki development component to this task, to build the API and to add VERP request calls wherever email is sent, and an Ops component to route VERP bounces to a script (taking the mail on stdin, and optionally e.g. the e-mail address as an argument), which can then call the (authenticated) MediaWiki API method to remove the mail address. Since the MediaWiki mail infrastructure is currently being moved to a new data center, this is the right time to implement VERP.
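
The script on the Ops side could start out roughly like the sketch below. It is a Python illustration under stated assumptions: the helper name `find_verp_recipient` is hypothetical, the address prefix `bounce-` is assumed for the VERP addresses, and the authenticated call to the MediaWiki API is deliberately left out.

```python
import sys
from email import message_from_string
from email.utils import getaddresses
from typing import Optional


def find_verp_recipient(raw_message: str, prefix: str = "bounce-") -> Optional[str]:
    """Return the first recipient address whose local part starts with `prefix`."""
    msg = message_from_string(raw_message)
    # A bounce is addressed to the envelope sender of the original mail,
    # so the VERP address should appear among the recipient headers.
    headers = msg.get_all("To", []) + msg.get_all("Delivered-To", [])
    for _name, addr in getaddresses(headers):
        if addr.lower().startswith(prefix):
            return addr
    return None


def main(stream=sys.stdin) -> int:
    """Read one bounce message from `stream` and report the VERP recipient."""
    recipient = find_verp_recipient(stream.read())
    if recipient is None:
        return 1  # not a VERP bounce; nothing to record
    print(recipient)  # a real script would call the authenticated API here
    return 0
```

An MTA would pipe each incoming bounce into such a script, which would call `main()` and let the exit status tell the MTA whether the message was recognised.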

VERP stands for Variable Envelope Return Path. When implemented, it alters the default envelope sender. For example, if an email needs to be sent to bob@example.com, VERP changes the envelope sender from wiki@wikimedia.org to a prefix/delim/hash form such as [bob][-][mdfkdjw6R4xGdiflfdfkQ]@wikimedia.org, so that a bounce identifies the recipient whose mail failed. The API would record the return address of the bounce and deduce that a mail to bob has failed. On consecutive failures, say at least 3 bounces with the oldest at least 7 days ago, the second API call un-confirms the user's address.

The return path address needs to be in prefix/delim/hash form to prevent fake bounces from DoSing a user. The VERP address will generally look like this:

bounce-{$key}@wikimedia.org

The prefix /^bounce-/ is used by the incoming MTA as a hook to route messages to the bounce processor, and $key is used by the bounce processor to figure out which wiki user is having delivery issues. An attacker needs to be prevented from spoofing bounce messages and causing mass unsubscribes. This can be accomplished by making $key secret, and not a simple hash that can be reversed or guessed. According to MediaWiki's security experts, the best option is to generate an HMAC, with a secret key, over a string containing the user's email address, a timestamp, and the list name. The HMAC can be generated with one of PHP's built-in functions, such as hash_hmac().
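
A minimal sketch of this scheme, written in Python for illustration rather than as the PHP that would go into MediaWiki. The function names, the `|` separator, the choice of SHA-256 and the truncation to 32 hex characters are all assumptions made for the example. How the bounce processor maps the key back to a user (for instance by storing it when the mail is sent) is not shown.

```python
import hashlib
import hmac

PREFIX = "bounce-"
# Keep the local part under the 64-octet limit that SMTP places on it.
MAC_HEX_CHARS = 32


def make_verp_address(secret: bytes, email: str, timestamp: int,
                      list_name: str, domain: str = "wikimedia.org") -> str:
    """Build bounce-{key}@domain, where key is an HMAC over the mail details."""
    payload = "|".join([email, str(timestamp), list_name]).encode("utf-8")
    mac = hmac.new(secret, payload, hashlib.sha256).hexdigest()[:MAC_HEX_CHARS]
    return f"{PREFIX}{mac}@{domain}"


def is_valid_bounce(secret: bytes, address: str, email: str, timestamp: int,
                    list_name: str, domain: str = "wikimedia.org") -> bool:
    """Recompute the expected address and compare it in constant time."""
    expected = make_verp_address(secret, email, timestamp, list_name, domain)
    return hmac.compare_digest(address.encode("utf-8"), expected.encode("utf-8"))
```

Because the key depends on a secret that only the wiki holds, an attacker who knows a user's email address still cannot construct a bounce address that passes `is_valid_bounce`.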

Possible mentors
Jeff Green, Kunal Mehta

Problem Background


When an email is sent, the wiki web server injects a message into the local MTA through a shell call, run as the user the MediaWiki web server daemon runs under. MediaWiki uses the config variable $wgPasswordSender to set the envelope sender, and all messages are sent as that user (for example 'wiki@wikimedia.org'). In WMF's environment, the web server's MTA is configured to route all messages through the organization's main mail server, which relays them to the destination/remote server as determined from DNS MX records. There are many points where the delivery can fail, for example:

  • DNS lookup failure (Permanent failure)
  • Network failure (Temporary failure)
  • Remote server could be overloaded (Temporary failure)
  • Remote server might have blacklisted wikimedia.org or wiki@wikimedia.org (Temporary failure)
  • Remote server could say example@gmail.com is a bad address (Permanent failure)
  • Remote server could say example@gmail.com is over quota (Temporary failure)

Each case can result in the mail server currently handling the transaction originating a bounce message. So a bounce can originate within the local system (i.e. the WMF environment) or the remote system (the recipient's environment). Bounce messages generally go back to the envelope sender. Currently, in the case of WMF's system, bounces coming back to wiki@wikimedia.org are sent to /dev/null.
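
The permanent/temporary distinction above usually shows up in the status code carried by the bounce: SMTP reply codes starting with 4 (and enhanced status codes of class 4) signal a transient failure, while those starting with 5 signal a permanent one. A hedged Python sketch of that mapping follows; the names `Failure` and `classify_status` are hypothetical, and real servers do not always pick the class one would expect for a given problem.

```python
from enum import Enum
from typing import Optional


class Failure(Enum):
    TRANSIENT = "transient"   # worth retrying, e.g. mailbox over quota
    PERMANENT = "permanent"   # retrying will not help, e.g. bad address


def classify_status(code: str) -> Optional[Failure]:
    """Classify an SMTP reply code ("550") or enhanced status code ("5.1.1")."""
    first = code.strip()[:1]
    if first == "4":
        return Failure.TRANSIENT
    if first == "5":
        return Failure.PERMANENT
    return None  # success (2xx) or not a recognisable failure code
```

Only failures classified as permanent would count towards un-confirming an address; transient ones are expected to resolve themselves.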

Deliverables


Since the WMF is moving its servers to a new data center and the mail infrastructure is being rebuilt, this is the right time to implement the functionality. The final results should be:

  • All emails for users of WMF-hosted wikis should have their default envelope sender changed from wiki@wikimedia.org to a VERP-generated envelope sender of the form prefix/delim/hash, say: [bob][_][mdfkdjw6R4xGdiflfdfkQ]@wikimedia.org
  • If mail delivery fails due to any of the problems discussed above, a return mail should reach the WMF mail servers with the recipient [bob][_][mdfkdjw6R4xGdiflfdfkQ]@wikimedia.org, and an API running there should record the failure, check the past history of bounces for this user in a database, and un-confirm the user if the threshold has been met.
  • The VERP-generated recipient address will be the output of an HMAC with a secret key, over a string containing the user's email address, a timestamp, and the list name.

Additional Deliverables

Replacing PHP MailUser with SwiftMailer: Swift_Mailer seems to be a lot more robust than PHP MailUser. Parent5446 writes: "PHPMailer has everything packed into a few classes, whereas Swift_Mailer actually has a separation of concerns, with classes for attachments, transport types, etc. A result of this is that PHPMailer has two different functions for embedding multimedia: addEmbeddedImage() for files and addStringEmbeddedImage() for strings. Another example is that PHPMailer supports only two bodies for multipart messages, whereas Swift_Mailer will add in as many bodies as you tell it to since a body is wrapped in its own object. In addition, PHPMailer only really supports SMTP, whereas Swift_Mailer has an extensible transport architecture, and multiple transport providers. (And there's also plugins, and monolog integration, etc.)"

Why SwiftMailer


SwiftMailer is a serious and well maintained library dedicated to sending email from PHP.

Adding it to core is not some new feature that would be better in an extension. It is a serious improvement to our core handling of email within PHP that replaces our crude UserMailer code.

Our current UserMailer code is ridiculous when you look at it.

By default our UserMailer code will just use PHP's creaky mail() function, which has some 'features' like:

  • Unpredictable message headers
  • Lack of feedback regarding delivery failures
  • And this beautiful comment in our codebase:
# PHP's mail() implementation under Windows is somewhat shite, and
# can't handle "Joe Bloggs <joe at bloggs.com>" format email addresses,
# so don't bother generating them

If you want to send email a different way, you could of course instead use $wgSMTP which:

  • First requires you to install PEAR Mail.
    • Who the hell uses PEAR anymore!
    • And don't forget, PEAR is a tool that installs modules globally and is difficult to impossible to use without shell and even admin access.
    • ;) And this is the hell we put tarball users through, not devs.
  • This PEAR Mail library we're relying on to handle all users who can't use mail() or don't want its features, here's the dev page:
    • http://pear.php.net/package/Mail
    • Current stable 1.2.0, released in March of 2010
    • Next release, 1.2.1 scheduled release date in January of 2011, never released.
    • ((In other words, the library we're entrusting all our SMTP handling to is practically dead and no longer maintained, Whoops.))
    • Admittedly if you hunt down PEAR Mail's github repo there has been a bit of activity:
      • https://github.com/pear/Mail/commits/trunk
      • But none of this will be installed by pear, it'll only install old code from 2010 missing fixes for a number of bugs in PEAR Mail
      • The majority of changes in 2014 have been minor/trivial things (adding travis builds, whitespace fixes)
      • And, ;) making PEAR Mail installable via Composer *snicker*

And to top this all off, because mail() and PEAR Mail are so different, half the code in UserMailer::send() is split into two different large code paths, which is a recipe for maintenance headaches.

Using Swift Mailer on the other hand:

  • The library has ample development and maintenance going on: https://github.com/swiftmailer/swiftmailer/commits/master
  • SMTP and mail() are abstracted away into transports, so instead of two large (unmaintained?) code paths we just craft and send an email in one code path, a well maintained library takes care of the difference between transports.
  • In the future we can move away from using mail() and add the ability to use sendmail as a transport directly, without the bugs (in theory I think it would even be possible to try swapping a different transport in place of mail() automatically).
  • All this gets bundled into the tarball directly, so $wgSMTP now works out of the box and doesn't require installation of something that in some situations is impossible to install.

Project Schedule

| Task | Timeline | Remarks | Status |
| --- | --- | --- | --- |
| Set up environment to replicate WMF mail servers | 04/03/14 - 20/03/14 | Created a client-server model including 2 virtual boxes, Box1 and Box2. Box1 has MW running -> sends the email -> intercepted by Box2 -> routed to Box2 /var/mail/root. Box2 has external connection via NAT. More info: picture[1]. Box2 rejects the mail, and Box1's exim produces the bounce to wiki@wikimedia.org in /var/mail/wiki. | Done |
| Move local environment to Labs | 20/03/14 - 10/04/14 | Shifted the above local instance to WikiTech labs (under project mediawiki-verp). box1verpnop sends the mail, having box2verpnop as the smarthost, which rejects all mails to wiki@wikimedia.org. The bounce is created by box1 in the /var/mail/root of box1verpnop. | Done |
| Automate and handle bounces on test server (make the bounce redirect to a script) | 20/04/14 - 30/04/14 | Successfully redirected mails to the default root@system_domain to a PHP script, which parses them. The script can be found at /var/www/script.php in Wikitech-I labs box1verpnop. | Done |
| Write API-I to create VERP address on sending mail | 30/04/14 - 15/05/14 | Done: https://gerrit.wikimedia.org/r/#/c/138655/, https://gerrit.wikimedia.org/r/#/c/141287/ and https://gerrit.wikimedia.org/r/#/c/140330/ | Done |
| Enhance API-I to make the from address generated using HMAC with a secret key | 15/05/14 - 30/05/14 | Done: https://gerrit.wikimedia.org/r/#/c/138655/ and https://gerrit.wikimedia.org/r/#/c/140330/ | Done |
| Write API-II to record bounces | 30/05/14 - 20/06/14 | Done: https://gerrit.wikimedia.org/r/#/c/140330/ | Done |
| Mid Term Evaluation | 20/06/14 - 23/06/14 | Done | Done |
| Enhance API-II to filter out only valid bounce messages | 23/06/14 - 15/07/14 | Done in https://gerrit.wikimedia.org/r/#/c/140330/ | Done |
| Enhance API-II to un-confirm users if a threshold level is met. Add a job queue handler to support multiple bounces calling the script. | 15/07/14 - 30/07/14 | Done: https://gerrit.wikimedia.org/r/#/c/142786/, https://gerrit.wikimedia.org/r/#/c/144656/ and https://gerrit.wikimedia.org/r/#/c/145881/ | Done |
| Writing documentation, deploying changes to WMF side | 30/07/14 - 10/08/14 | Done: https://gerrit.wikimedia.org/r/#/c/154342/ | Done |
| Final Report Submission | 20/08/2014 | | |

Workflow

 
MediaWiki Test Environment

As stated, my project involves work on the System Ops component to alter the behavior of McHenry, and a MediaWiki development component to build the APIs. After discussing with my mentors (Jeff Green and Legoktm), I came up with an environment model that automates the whole setup using multiple virtual boxes. The setup involves a Linux machine (Box1) hosting MediaWiki, communicating with a second Linux server (Box2) that has postfix, bind9 and mutt installed and acts as a router to the external world. Box2 acts as a DNS server for the internal network and has NAT enabled so that it can connect to the external world. It is configured to intercept any mails/packets passing its way and redirect them to the /var/www/mail/root folder. This makes sure that all emails sent from Box1 safely reach Box2 for inspection. The MediaWiki instance is made accessible from my host via a defined IP. This stage is done; next I will have to move it to a Wikimedia Labs instance to allow collaboration with my mentors and the community. Later, the API for VERP can be implemented on Box1, with bounce calls originating on Box2, to build the feature, which can then be implemented on the WMF side.

Participation


Personally, I am strict about the idea of sharing. It's always: think free, code better. This has always helped me get along well with the Wikimedia community. I find time to write about new things I learn on my blog, and it will be the central focus of all progress reports. All source code I write will be published daily to my GitHub repo and Gerrit to enable collaboration. I always try to stay live on IRC and am regular in replying to emails, which lets me keep pace with the community. Testing and documentation will be added to the Wikitech Mail page. I have always found IRC better than hangouts for getting things done, and will always use #mediawiki and #wikimedia-dev to the maximum.

About Me


I am a 19-year-old Computer Science Engineering student from Amrita Vishwa Vidyapeetham, Kerala, India. I am an active member of the FOSS community here, FOSS@Amrita. The FOSS club helps me work with code even late at night in my college lab. I have been a consistent user of Linux for the past three years. The feeling of Open Source is so compelling that you can never quit contributing to such projects, and I found the Wikimedia community to be one of them. My first contribution to Open Source was a bug fix to MediaWiki almost six months ago. Since then, I have been working with the codebase, fixing and looking for errors, and creating new ones. I was able to help and mentor many of my FOSS club mates to contribute to MediaWiki; you can see the full list here. Along with academics, I find time to work in the lab from 5:00 pm to 11:00 am on all working days and 10:00 am to 11:00 pm on weekends.

I have chosen adding functionality to the email component as my project because it involves both server-side and MediaWiki-side development. As one of my mentors mentioned, I am sure it's going to be a fun project, and it helps me apply all the networking lessons I learnt in the MCITP networking course I took earlier. My main aims are studying new things and understanding huge code bases, but this project involves a hardware element too, which excites me. Coming from a remote village in Kerala, I think this would be a boost for me to spread Open Source in a far-removed society that is still unaware of collaboration and thinking freely. You can check my blog here - Through my Pages.

How did you hear about this program?

Being part of the FOSS club here, I get updated on various events and competitions in the Open Source world. Many senior members of the club have already done GSoC for various projects, and they were my primary source of inspiration. Moreover, being active in #mediawiki gets me all the latest announcements.

Past open source experience


I have contributed to the Wikimedia Project by various bug fixes in MediaWiki Core, Extensions and Browsertests.