Readers/Web/Metrics Platform Adoption/Search Recommendations Hypothesis Testing 2024
Hypothesis and Metrics Platform Setup for Session Tracking (2024 Q2 Sprint 4)
Hypothesis
“Users who interact with this feature in mobile web (Minerva) will have 5% longer sessions than those who do not.” See T378109
Session Tick Implementation for Hypothesis Testing
To measure session length accurately and verify this hypothesis, we set up a new Session Tick mechanism to track engagement in WikimediaEvents:
Initiate Tracking:
SessionTickMixin.start(streamName, schemaID);
This starts tracking user sessions tied to a specific data stream (streamName) and schema (schemaID), both configured for the Metrics Platform.
Tracking Events:
Each tick event includes:
{ "action": "tick", "action_context": $tickNumber }
action: Specifies the event type as “tick” to represent session continuity.
action_context: Tracks the tick count ($tickNumber), which increments to estimate session duration.
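As a minimal sketch of the event body described above, the helper below builds one tick event per tick number. The function name buildTickEvent is hypothetical, and serializing the tick number as a string is an assumption about the schema, not something these notes confirm.

```javascript
// Hypothetical helper (not part of WikimediaEvents): builds the body of one tick event.
// Assumption: action_context is sent as a string; adjust if the schema expects a number.
function buildTickEvent(tickNumber) {
  if (!Number.isInteger(tickNumber) || tickNumber < 0) {
    throw new RangeError('tickNumber must be a non-negative integer');
  }
  return { action: 'tick', action_context: String(tickNumber) };
}

// The first three events of a session: ticks 0, 1 and 2.
const firstTicks = [0, 1, 2].map(buildTickEvent);
```

The highest tick number seen in a session is what later stands in for the session's length.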
Metrics Platform Integration
These session ticks should feed directly into the Metrics Platform, providing data on session length and interaction frequency for users with and without feature engagement. Setting up a new session tick instrument ensures the platform can:
1. Differentiate Users by Feature Interaction:
- The current sessionTick instrument (modules/ext.wikimediaEvents/sessionTick.js) aggregates all users, so we need a different instrument to identify whether users are logged in, logged out, or using a temporary account.
- Add fields (such as performer.is_temp) or use extra events to track user state.
- If necessary, separate session length tracking by user type (still an open question). This would let us see whether logged-in, logged-out, or temporary users who interact with the search feature have longer sessions.
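One way to picture the user-state split is a small function that maps performer fields to a user-type label. This is only a sketch: performer.is_temp comes from the notes above, while the is_logged_in field, the function name userTypeOf, and the label values are assumptions made for illustration.

```javascript
// Hypothetical helper: derives a user-type label from performer fields.
// Assumptions: the performer object carries boolean is_temp and is_logged_in fields;
// the label strings are illustrative only.
function userTypeOf(performer) {
  // Check is_temp first so temporary accounts get their own bucket
  // regardless of how is_logged_in is set for them.
  if (performer.is_temp) {
    return 'temporary';
  }
  return performer.is_logged_in ? 'logged_in' : 'logged_out';
}

// Example: group session tick counts by user type for later comparison.
const sessions = [
  { performer: { is_logged_in: true, is_temp: false }, ticks: 5 },
  { performer: { is_logged_in: false, is_temp: false }, ticks: 2 },
  { performer: { is_logged_in: true, is_temp: true }, ticks: 3 }
];
const ticksByUserType = {};
for (const session of sessions) {
  const type = userTypeOf(session.performer);
  (ticksByUserType[type] = ticksByUserType[type] || []).push(session.ticks);
}
```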
2. Evaluate Hypothesis Based on Session Length Increase
The hypothesis predicts 5% longer sessions for users who interact with the search feature. By comparing tick counts (the action_context value) across sessions, we can determine whether feature interaction correlates with longer sessions. The hypothesis is supported if the average session length of users who interact with the feature exceeds that of users who do not by the predicted margin.
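The comparison can be sketched as a relative difference in mean tick count between the two groups. The function names and the sample numbers below are made up for illustration; this is a point estimate only and says nothing about statistical significance, which the real analysis would need to address.

```javascript
// Hypothetical helper: arithmetic mean of a non-empty array of numbers.
function mean(values) {
  if (values.length === 0) {
    throw new Error('mean() needs at least one value');
  }
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Relative difference in mean final tick count between sessions with feature
// interaction and sessions without it. 0.05 corresponds to the 5% in the hypothesis.
function relativeLift(interactedTicks, baselineTicks) {
  const baseline = mean(baselineTicks);
  if (baseline === 0) {
    throw new Error('baseline mean is zero; lift is undefined');
  }
  return (mean(interactedTicks) - baseline) / baseline;
}

// Made-up final tick counts per session, for illustration only.
const lift = relativeLift([4, 6, 5], [4, 4, 4]);
const meetsTarget = lift >= 0.05;
```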
3. Payload Management and Event Consistency
Using the { "action": "tick", "action_context": $tickNumber } fields standardizes event tracking across instruments. With small payloads and HTTP/2 compression, tick events are cheap to send, allowing frequent updates without overloading the platform.
Meeting Notes
Modularized Instrumentation for Wikimedia Events
Overview
We discussed an approach that simplifies tracking by putting all relevant events in a single instrument. No cross-code dependency headaches, fewer bugs. Inspired by Product Analytics, we’re making each instrument self-contained and responsible for sending its own events.
Key Goals & Terms
- Modular Mix-In: Create instruments that handle their own events. No cross-code complexity; each feature has its own “bubble.”
- Session Tick Instrument: Tracks active browsing sessions with regular “tick” events to monitor session length/depth.
- Events & Browser Optimization: Browser handles background data sending. Small payloads + HTTP/2 compression = low performance impact.
Implementation Steps
1. Create a New Instrument
- Where? Wikimedia Events folder (preferred for clarity, documentation).
- Goal: Instrument sends all required events (no dependency on other instruments).
2. Integrate Event Listeners
- Attach: Connect to search overlay’s events like “open,” “search start,” etc.
- Track: Add listeners to detect user actions and store session data.
Example Code:
$('#search-overlay')
    .on('open', () => sessionTick.recordEvent('search_open'))
    .on('searchStart', () => sessionTick.recordEvent('search_start'));
3. Session Tick Mix-In
- What It Does: Starts counting ticks (every minute by default) for active sessions. Runs independently, syncing with other events.
- Setup: Specify a stream name and schema ID to connect it to analytics.
Usage:
const sessionTick = new SessionTickMixIn(STREAM_NAME, SCHEMA_ID);
sessionTick.start();
4. Event Schema & Action Field
- Schema Design: Make sure your schema has an action field to store tick events.
- Tick Structure: Each tick event has action and action_context fields (e.g., tick count, active session time).
Example:
{ "action": "tick", "action_context": "tick_number_i" }
5. Tick Frequency
- Default: Every 1 minute, capturing active browsing.
- Adjustable: Can increase frequency if needed, but 1 minute works for general user actions.
- Zero Tick: An initial tick is sent immediately to capture very short sessions (e.g., quick mobile lookups).
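The tick schedule above (an immediate zero tick, then one tick per interval) could look roughly like the sketch below. createTicker and its options are hypothetical names, not the actual mix-in API; the timer functions are injectable only so the logic can be exercised without waiting a minute.

```javascript
// Hypothetical ticker: sends tick 0 immediately on first start, then one tick per interval.
// `send` is any function that accepts an event object (for example, one that submits it
// to a stream). The option names are assumptions made for this sketch.
function createTicker(send, options = {}) {
  const {
    intervalMs = 60000, // 1 minute, the default described above
    setIntervalFn = setInterval,
    clearIntervalFn = clearInterval
  } = options;

  let tickNumber = 0;
  let timerId = null;

  const tick = () => {
    send({ action: 'tick', action_context: String(tickNumber) });
    tickNumber += 1;
  };

  return {
    start() {
      if (timerId !== null) {
        return; // already running
      }
      if (tickNumber === 0) {
        tick(); // zero tick: captures sessions shorter than one interval
      }
      timerId = setIntervalFn(tick, intervalMs);
    },
    stop() {
      if (timerId === null) {
        return; // not running
      }
      clearIntervalFn(timerId);
      timerId = null;
    }
  };
}
```

Passing a smaller intervalMs raises the tick frequency at the cost of more events per session.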
6. Session Types
Three Key Types:
- Search Session: (Not tracked here)
- Browsing Session: Active engagement sessions only.
- Analytics Session: Internal data tracking.
Focus: This instrument tracks browsing sessions only.
7. Handling Backgrounded Tabs & Edge Cases
- Detect Backgrounding: Instrument pauses ticks when tab is backgrounded, resumes when in view.
- Local Storage Signaling: Manages session state across multiple tabs and tracks active/inactive status.
- Edge Cases: Tracks active time accurately (e.g., user away from the screen, backgrounded tabs).
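Backgrounding detection can be sketched with the browser's Page Visibility API (document.visibilityState and the visibilitychange event). The function pauseWhenHidden is a hypothetical name, and the ticker argument is assumed to be any object with start() and stop() methods; cross-tab signaling through local storage is not covered here.

```javascript
// Hypothetical glue code: pauses a ticker while the page is hidden and resumes it
// when the page becomes visible again.
// `doc` is the page's document (or any object exposing visibilityState,
// addEventListener and removeEventListener); `ticker` needs start() and stop().
function pauseWhenHidden(doc, ticker) {
  const sync = () => {
    if (doc.visibilityState === 'visible') {
      ticker.start();
    } else {
      ticker.stop();
    }
  };
  doc.addEventListener('visibilitychange', sync);
  sync(); // apply the current state right away
  // Return a function that detaches the listener again.
  return () => doc.removeEventListener('visibilitychange', sync);
}
```

In a browser this would be called as pauseWhenHidden(document, ticker) once the instrument is initialized.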
Example Instrument Code Structure
// Define constants
const STREAM_NAME = 'yourStreamName';
const SCHEMA_ID = 'yourSchemaID';
// Initialize Session Tick Instrument
const sessionTick = new SessionTickMixIn(STREAM_NAME, SCHEMA_ID);
sessionTick.start();
// Attach event listeners to search overlay
$('#search-overlay')
    .on('open', () => sessionTick.recordEvent('search_open'))
    .on('searchStart', () => sessionTick.recordEvent('search_start'));
// From here on, ticks are sent automatically every minute
Quick Notes & Optimization
- Session Length Approximation: We track active time in ~1-minute chunks. Good enough for most purposes but doesn’t capture exact session end.
- Low Impact: Background data sending keeps performance stable (small payloads, browser manages async).
- Schema Compatibility: Stick to the action field standard in schema for easy data handling.
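The approximation can be made concrete: if tick 0 is sent at session start and tick n is sent after n intervals of active time, then a session whose last tick was n had between n and n + 1 intervals of active time. The helper name below is hypothetical, and the bounds rest on those two assumptions.

```javascript
// Hypothetical helper: bounds on active time implied by the last tick seen in a session.
// Assumes tick 0 fires at session start and ticks only advance while the tab is active.
function activeTimeBoundsMs(lastTickNumber, intervalMs = 60000) {
  return {
    minMs: lastTickNumber * intervalMs,       // inclusive lower bound
    maxMs: (lastTickNumber + 1) * intervalMs  // exclusive upper bound
  };
}

// A session whose last tick was 3 had at least 3 and less than 4 minutes of active time.
const bounds = activeTimeBoundsMs(3);
```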
Things to Keep in Mind
- High Resolution (1-minute tick): Works well for most analytics needs; tests show it’s reliable even for fast browsing.
- Immediate Zero Tick: Short sessions are captured from the start; crucial for mobile interactions.
- Edge Case Handling: Instrument detects when tabs are backgrounded, so we only log active engagement.
FAQ
Q: Does this track users switching tabs?
A: Yes! It detects if a tab is backgrounded, pausing ticks until the tab is active again.
Q: What if the session ends under a minute?
A: The initial tick records instantly, so short sessions are captured.
Q: What about multiple tabs?
A: Local storage handles cross-tab syncing, ensuring consistent session tracking.
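A rough sketch of how tabs might share one tick counter through local storage follows. The storage key and the function claimNextTick are hypothetical, and the storage argument is assumed to behave like window.localStorage (getItem and setItem). Local storage offers no locking, so two tabs ticking at the same instant could claim the same number; a real instrument would need to tolerate or guard against that.

```javascript
// Hypothetical key under which the shared tick count is kept.
const TICK_COUNT_KEY = 'hypotheticalSessionTickCount';

// Returns the next tick number for the session and advances the shared counter.
// `storage` is any object with getItem/setItem, such as window.localStorage.
function claimNextTick(storage) {
  // getItem returns null for a missing key; Number(null) is 0, and the
  // `|| 0` also covers unparsable values (NaN).
  const current = Number(storage.getItem(TICK_COUNT_KEY)) || 0;
  storage.setItem(TICK_COUNT_KEY, String(current + 1));
  return current;
}
```

Because every tab reads and writes the same key, ticks from different tabs continue a single sequence rather than each starting again at zero.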