EventLogging/OperationalSupport

Intro

The goal of this page is to describe the operational support the Analytics team provides for Event Logging.

Tier 2 support

We consider Event Logging a tier-2 system. Informally, a tier-1 system is one you need in order to operate: if it is down, you cannot function. A tier-2 system is one that helps you operate better. We believe Event Logging is tier-2 because it collects data that help us improve our applications, but we can certainly function without it.

Being tier-2 means that we provide support for Event Logging during business hours, and only when no tier-1 issue is affecting our infrastructure. Event Logging could go down and stay down for 48 hours (for example, over a weekend), so make sure your reporting can deal with gaps in the data.
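As a rough illustration of what dealing with gaps can mean on the reporting side, the following minimal sketch flags stretches of time with no events so that a report can mark them as missing data rather than as zero activity. The function name `find_gaps`, the one-hour threshold, and the sample timestamps are assumptions made for this example; they are not part of Event Logging.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, max_gap=timedelta(hours=1)):
    """Return (start, end) pairs where consecutive events are more than max_gap apart.

    `timestamps` is any iterable of datetime objects; `max_gap` is a
    hypothetical threshold that should be tuned to the expected event rate.
    """
    ordered = sorted(timestamps)
    gaps = []
    # Compare each event with the one that follows it.
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous > max_gap:
            gaps.append((previous, current))
    return gaps


# Made-up sample data: events stop on a Friday evening and resume on Monday morning.
events = [
    datetime(2015, 1, 9, 22, 0),
    datetime(2015, 1, 9, 22, 30),
    datetime(2015, 1, 12, 9, 0),
    datetime(2015, 1, 12, 9, 20),
]

for start, end in find_gaps(events):
    print(f"No events between {start} and {end}")
```

A report could exclude the flagged intervals from averages, or annotate them, instead of treating the missing events as a real drop in activity.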

Outages

Any outage that affects Event Logging will be tracked on wikitech:Incident documentation and announced on the eventlogging-alerts@lists.wikimedia.org and ops@lists.wikimedia.org mailing lists.

Alarms

At this time, alarms go to the Analytics team. We are working on being able to claim alarms in Icinga.

Contact

You can contact the Analytics team at analytics@lists.wikimedia.org.