Product Analytics/Query Style Guide

Who should use this guideEdit

This document is a guide for anyone who is working with SQL/HiveQL queries in the product analytics team and the foundation, and would like to write clean and clear code that is meant to be shared and reused. For product analysts, these style conventions are mandatory for any publicly-available query.

General principlesEdit

  • Use spaces(2 space indent), or tab characters.
  • All SQL keywords in all caps.
  • Variable names are lowercase, with spaces represented by underscores (variable_name not variableName).

SELECT statementsEdit

Align all columns to the first column on their own line:

SELECT
  projects.name,
  users.email,
  projects.country,
  COUNT(backings.id) AS backings_count
FROM ...

SELECT goes on its own line:

SELECT
  name,
  ...

Always rename aggregates and function-wrapped columns:

SELECT
  name,
  SUM(amount) AS sum_amount
FROM ...

Always rename all columns when selecting with table aliases:

SELECT
  projects.name AS project_name,
  COUNT(backings.id) AS backings_count
FROM ksr.backings AS backings
INNER JOIN ksr.projects AS projects ON ...

Always use AS to rename columns:

SELECT
  projects.name AS project_name,
  COUNT(backings.id) AS backings_count
...

FROM statementsEdit

Only one table should be in the FROM. Never use FROM-joins:

SELECT
  projects.name AS project_name,
  COUNT(backings.id) AS backings_count
FROM ksr.projects AS projects
INNER JOIN ksr.backings AS backings ON backings.project_id = projects.id
...

FROM statementsEdit

Explicitly use INNER JOIN not just JOIN, making multiple lines of INNER JOINs easier to scan:

SELECT
  projects.name AS project_name,
  COUNT(backings.id) AS backings_count
FROM ksr.projects AS projects
INNER JOIN ksr.backings AS backings ON ...
INNER JOIN ...
LEFT JOIN ksr.backer_rewards AS backer_rewards ON ...
LEFT JOIN ...

The ON keyword and condition goes on the INNER JOIN line:

SELECT
  projects.name AS project_name,
  COUNT(backings.id) AS backings_count
FROM ksr.projects AS projects
INNER JOIN ksr.backings AS backings ON projects.id = backings.project_id
...

WHERE statementsEdit

Multiple WHERE clauses should go on different lines and begin with the SQL operator:

SELECT
  name,
  goal
FROM projects
WHERE
  country = 'US'
  AND deadline >= '2015-01-01'
...

CASE statementsEdit

CASE statements aren't always easy to format but try to align WHENand ELSE together inside CASE and END:

CASE 
    WHEN category = 'Art' THEN backer_id
    ELSE NULL
END

WITH clause subqueriesEdit

Multiple Common Table Expressions (CTEs) should be formatted accordingly:

WITH backings_per_category AS (
  SELECT
    ...
), backers AS (
  SELECT
    ...
), backers_and_creators AS (
  ...
)
SELECT * FROM backers;