Extension:EventLogging/Programming

See Extension:EventLogging/Guide for a comprehensive introduction to EventLogging, developing and deploying EventLogging schemas, and more.

How it works

After you have created a schema, you must register it using the $wgEventLoggingSchemas configuration variable (or the EventLoggingRegisterSchemas hook if registration needs to happen dynamically). For an extension that supports extension registration, that means adding something like

"EventLoggingSchemas": {
    "GettingStarted": 5285779
}

to extension.json (in this example the schema is GettingStarted; 5285779 is the schema revision ID).

This automatically creates an mw.track topic you can send event data to; EventLogging will take care of logging them:

mw.track( 'event.GettingStarted', {
        experimentId: 'ob3-split-retest',
        bucketId: 'ob3a',
        action: 'gettingstarted-impression',
        userId: mw.user.getId()
} );

Registering the schema name and revision ID is required for client-side JavaScript event logging to work. Without it, the provided JavaScript library will not process the events.
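
To make the flow concrete, here is a minimal, self-contained sketch of how a topic-prefix subscriber could route event.<SchemaName> topics to a logger and skip schemas that were never registered. The track, trackSubscribe, registeredSchemas, and logged names are simplified stand-ins written for this example so that it runs outside MediaWiki; they are assumptions for illustration, not EventLogging's actual implementation.

```javascript
// Simplified stand-ins for mw.track / mw.trackSubscribe (assumption: the real
// functions live in MediaWiki and do more than this sketch).
const subscribers = [];

function trackSubscribe( prefix, handler ) {
    subscribers.push( { prefix, handler } );
}

function track( topic, data ) {
    subscribers.forEach( ( s ) => {
        if ( topic.startsWith( s.prefix ) ) {
            s.handler( topic, data );
        }
    } );
}

// Hypothetical registry mirroring the "EventLoggingSchemas" entry above.
const registeredSchemas = { GettingStarted: 5285779 };
const logged = [];

// Route every 'event.<SchemaName>' topic to the logger.
trackSubscribe( 'event.', ( topic, data ) => {
    const schema = topic.slice( 'event.'.length );
    if ( !Object.prototype.hasOwnProperty.call( registeredSchemas, schema ) ) {
        return; // unregistered schemas are not processed in this sketch
    }
    logged.push( { schema, revision: registeredSchemas[ schema ], event: data } );
} );

track( 'event.GettingStarted', { bucketId: 'ob3a', action: 'gettingstarted-impression' } );
track( 'event.NotRegistered', { foo: 'bar' } );

console.log( logged.length ); // 1
```

Only the GettingStarted event ends up in logged, because NotRegistered has no entry in the registry.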

If you only want to log data on the server side, there is no need to register the schema. You can log an event like this:

EventLogging::logEvent( 'GettingStarted', 5285779, [
        'experimentId' => 'ob3-split-retest',
        'bucketId' => 'ob3a',
        'action' => 'gettingstarted-impression',
        'userId' => User::newFromSession()->getId(),
] );

How to make a data model

  • Meet with a researcher to determine what you are going to log, and name the fields, reusing well-known field names where possible.
  • Create a JSON structure representing this data model in the Schema: namespace on Meta, and tweak it until it saves without errors.
    • Sample: m:Schema:OpenTask
    • Tip: http://jsonlint.com/ has better error reporting; copy and paste your JSON into it.
    • Tip: if you have a JSON file with the desired fields and values, http://www.jsonschema.net/ will guess at a schema for it that you can start from (it adds extra information such as "id" that we don't currently use).
  • Use the schema's talk page (sample) to link to experiments that use the schema, discuss details, etc.
    • Always document which code logs the event and under what circumstances.

Then:

  • Developers write code to log events that match the data model.
  • The data model tells analysts what information is in the logs.

Versioning

If code tries to log an event that doesn't match the data model that EventLogging retrieved, EventLogging will log the event anyway but flag it as invalid. Since you always give a schema revision, you can edit the schema as much as you want without affecting existing code.

It's OK to have different kinds of events (often called actions) share one data model. That way the events go into one table, which may simplify querying and multi-dimensional analysis. Only add "required":true to fields that apply to all events.

Data fields

Available data models

Implementation notes

JSON schema validation

Each data model JSON file on Meta-Wiki is a JSON schema. JSON schema is an evolving standard for specifying the format of JSON structures, in our case the logged event.

  • See the JSON schema draft.
  • When code attempts to log an event, EventLogging only pays attention to a subset of JSON schema features, including:
    • type: boolean, integer, number, string, array, object
    • required: true/false
    • enum values
  • For details, see schemaschema.json.
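
The subset listed above can be illustrated with a small validator. This is a sketch under stated assumptions, not EventLogging's validation code: validateEvent, matchesType, and exampleSchema are hypothetical names, "required" is treated as a per-property boolean as in the text above, and nested objects and array items are not checked. The example schema marks only the fields shared by all actions as required.

```javascript
// Returns true if `value` matches one of the JSON schema type names listed above.
function matchesType( value, type ) {
    switch ( type ) {
        case 'boolean': return typeof value === 'boolean';
        case 'integer': return Number.isInteger( value );
        case 'number': return typeof value === 'number' && Number.isFinite( value );
        case 'string': return typeof value === 'string';
        case 'array': return Array.isArray( value );
        case 'object': return value !== null && typeof value === 'object' && !Array.isArray( value );
        default: return true; // unknown type names are ignored in this sketch
    }
}

// Collects human-readable validation errors; an empty array means "valid".
function validateEvent( schema, event ) {
    const errors = [];
    const properties = schema.properties || {};
    Object.keys( properties ).forEach( ( name ) => {
        const rule = properties[ name ];
        if ( !Object.prototype.hasOwnProperty.call( event, name ) ) {
            if ( rule.required === true ) {
                errors.push( 'Missing required field: ' + name );
            }
            return;
        }
        const value = event[ name ];
        if ( rule.type && !matchesType( value, rule.type ) ) {
            errors.push( 'Wrong type for field ' + name + ': expected ' + rule.type );
        }
        if ( Array.isArray( rule.enum ) && !rule.enum.includes( value ) ) {
            errors.push( 'Value of field ' + name + ' is not in the enum' );
        }
    } );
    return errors;
}

// Hypothetical schema: two actions share one data model.
const exampleSchema = {
    properties: {
        action: { type: 'string', required: true, enum: [ 'impression', 'click' ] },
        userId: { type: 'integer', required: true },
        linkTarget: { type: 'string' } // only applies to 'click', so not required
    }
};

console.log( validateEvent( exampleSchema, { action: 'impression', userId: 42 } ).length ); // 0
console.log( validateEvent( exampleSchema, { action: 'hover', userId: '42' } ).length ); // 2
```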

Programming topics

Good starting code

  • The WikimediaEvents extension has working code to log server-side events in PHP in WikimediaEventsHooks.php.
  • The GettingStarted extension has setup code that declares and requires the "openTasks" schema resource and logs to it in JavaScript, although that code became more elaborate in 2013.

Client-side logging

  • Require your schema wherever you need to log events (this pulls in the ext.eventLogging module, which contains the mw.eventLog object).
  • See modules/ext.eventLogging/core.js for API documentation.

Tips

  • In JavaScript code, use mw.eventLog.setDefaults() to set common values for fields to log that don't change, such as version, the user's name, etc.
  • Extension:EventLogging/Guide#Data fields lists common field names already used in schemas and the JavaScript that fills them. Don't reinvent the wheel.
  • Adjust your sampling ratio so that you are not sending more than 5-10 events per second. Even at 3 events per second, three months of logging yields well over 2 million rows in your schema table (about 23 million at a sustained rate), and data at that volume is hard to query.
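
The arithmetic behind the sampling tip can be sketched as follows. The function names and the figure of 200 unsampled events per second are hypothetical and chosen only for illustration; a 90-day period stands in for "3 months".

```javascript
const SECONDS_PER_DAY = 24 * 60 * 60;

// Rows accumulated when events are logged at a steady rate.
function projectedRows( eventsPerSecond, days ) {
    return Math.round( eventsPerSecond * SECONDS_PER_DAY * days );
}

// Largest sampling ratio (between 0 and 1) that keeps the logged rate
// at or below the target rate.
function maxSamplingRatio( unsampledEventsPerSecond, targetEventsPerSecond ) {
    if ( unsampledEventsPerSecond <= 0 ) {
        return 1; // nothing to sample down
    }
    return Math.min( 1, targetEventsPerSecond / unsampledEventsPerSecond );
}

console.log( projectedRows( 3, 90 ) );     // 23328000
console.log( maxSamplingRatio( 200, 5 ) ); // 0.025
```

In this example, a feature that would emit 200 events per second unsampled needs a ratio of at most 0.025 (1 in 40) to stay at 5 events per second.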

Debugging

If code attempts to log an invalid event, EventLogging logs it anyway. To enable informational validation, which does not affect logging, see https://www.mediawiki.org/wiki/Extension:EventLogging/Guide#See_logging_in_your_browser. If the logged event has a revision of -1, you may not have registered your schema correctly.

Monitoring events

  • Client-side event logging works by sending a beacon request (falling back to a beacon image request) to $wgEventLoggingBaseUri with the JSON-encoded event capsule in its query string. To see the logged events, you can
    • watch for this request in your browser's network console,
    • look for it in your web server's access logs, or
    • run the toy web server server/bin/eventlogging-devserver in the EventLogging extension which pretty-prints the query string.
  • An alternative to the above is to enable the more user-friendly debugging UI introduced in Gerrit #I1ac4a5. Currently, the debugging UI is shipped to all users but is enabled via a hidden user preference, which can only be set by pasting the following into your browser's JavaScript console:
mw.loader.using('mediawiki.api.options')
    .then(
        () => new mw.Api().saveOption('eventlogging-display-web', '1')
    );
  • To monitor events after processing, you can append a then callback to a logEvent call, for example:
mw.eventLog.logEvent('MySchema', {foo: 'bar'}).then(
    () => {
        console.log('A MySchema event has been sent!');
        
        // All validation errors will have been tracked via the
        // 'eventlogging.error' topic. Since I0bf3bd91, however, there's no
        // easy way to detect if the event that was logged was valid.
    },
    () => console.warn('Couldn\'t log the MySchema event!')
);
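
When you inspect the beacon request in the network console or in access logs, as described in the first bullet above, the query string can be decoded back into the event capsule. The sketch below assumes the query string is a URL-encoded JSON object, optionally terminated by a ';'. The decodeBeaconUrl function, the example URL, and the capsule fields are hypothetical and used only for illustration.

```javascript
// Decode the event capsule from a beacon URL.
function decodeBeaconUrl( url ) {
    const queryStart = url.indexOf( '?' );
    if ( queryStart === -1 ) {
        throw new Error( 'No query string in URL: ' + url );
    }
    let query = url.slice( queryStart + 1 );
    // Assumption: a trailing ';' may terminate the encoded capsule.
    if ( query.endsWith( ';' ) ) {
        query = query.slice( 0, -1 );
    }
    return JSON.parse( decodeURIComponent( query ) );
}

// Build an illustrative URL the same way, then decode it again.
const exampleCapsule = { schema: 'MySchema', revision: 12345, event: { foo: 'bar' } };
const exampleUrl = 'https://logs.example.org/event.gif?' +
    encodeURIComponent( JSON.stringify( exampleCapsule ) ) + ';';

const decoded = decodeBeaconUrl( exampleUrl );
console.log( JSON.stringify( decoded, null, 2 ) );
```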

Logging clicks on links

Often you want to log clicks on links. If a link takes the user away from the current page, there's a chance that the browser will move to the new page before the request for the beacon image makes it onto the network, in which case the browser drops the request. The E3 team experimented with using deferred promises to deal with this, but that introduced known and unknown unknowns. Task T44815 is related to this issue.

There are significant performance concerns about logging before showing the next page, and our recommendation is not to do that until the new beacon API becomes available [1]. Details on the performance issues can be found here: https://bugzilla.wikimedia.org/show_bug.cgi?id=52287
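
As a rough illustration of a non-blocking approach, the sketch below queues the event with navigator.sendBeacon when the browser provides it, and otherwise skips logging rather than delaying navigation. The logLinkClick function, its nav parameter, and the example URI are hypothetical; the URL format (URL-encoded JSON in the query string) is an assumption based on the description under Monitoring events, not EventLogging's actual code.

```javascript
// `nav` is passed in (normally window.navigator) so the function can be
// exercised outside a browser.
function logLinkClick( nav, baseUri, capsule ) {
    if ( !nav || typeof nav.sendBeacon !== 'function' ) {
        return false; // no Beacon API: skip logging instead of blocking the click
    }
    const url = baseUri + '?' + encodeURIComponent( JSON.stringify( capsule ) );
    // sendBeacon queues the request and returns immediately with true or false,
    // so navigation to the next page is not delayed.
    return nav.sendBeacon( url );
}

// Usage with a fake navigator that records what would have been sent.
const sent = [];
const fakeNavigator = {
    sendBeacon: ( url ) => {
        sent.push( url );
        return true;
    }
};
const queued = logLinkClick(
    fakeNavigator,
    'https://logs.example.org/event.gif',
    { schema: 'MySchema', event: { action: 'click' } }
);
console.log( queued, sent.length ); // true 1
```

In a page, the call would sit in a click handler with window.navigator passed as the first argument, without calling preventDefault on the event.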

See also