Product Analytics/Dashboarding Guidelines
Publishing/sharing
Before publishing or sharing your Superset dashboard, please double-check that you have:
- contact information
- correct access and permissions for your data and your audience
Refer to the sections below for details.
Contact Info
Add the following template as Markdown at the top or bottom of the dashboard:
This dashboard is maintained by {NAME}, [Product Analytics](https://www.mediawiki.org/wiki/Product_Analytics). If you have questions or feedback, please email {name}@wikimedia.org or product-analytics@wikimedia.org.
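If you maintain several dashboards, filling in the template by hand can be error-prone. The following is a minimal sketch of a helper that renders the template; the function name `contact_footer` and the sample values "Jane Doe" and "jdoe" are hypothetical and only serve as an illustration.

```python
def contact_footer(name: str, email_user: str) -> str:
    """Fill in the contact template for a dashboard Markdown component.

    `name` is the maintainer's display name and `email_user` is the part of
    their address before @wikimedia.org (both are illustrative parameters).
    """
    return (
        f"This dashboard is maintained by {name}, "
        "[Product Analytics](https://www.mediawiki.org/wiki/Product_Analytics). "
        f"If you have questions or feedback, please email {email_user}@wikimedia.org "
        "or product-analytics@wikimedia.org."
    )


# Placeholder values; replace them with the actual maintainer's details.
print(contact_footer("Jane Doe", "jdoe"))
```

The printed text can be pasted into a Markdown component on the dashboard.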
Permissions
Virtual datasets
For Presto-based charts that rely on virtual datasets derived from event data, make sure the stakeholder has been added to the analytics-privatedata-access group.
If they have not, ask them to request access through Phabricator. Refer to T286746 as an example.
Physical datasets
For charts that rely on Hive tables added as physical datasets, make sure that users outside of your group have read access to the files in Hadoop:
hdfs dfs -chmod -R o+r <path to your table>
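To confirm that the change took effect, you can look at the permission string that `hdfs dfs -ls` prints in its first column (for example `drwxr-xr-x`). The sketch below shows one way to read the "other" read bit from such a string; the helper name `others_can_read` is hypothetical, and the sample strings are illustrative rather than taken from a real listing.

```python
def others_can_read(permissions: str) -> bool:
    """Return True if the 'other' read bit is set in an ls-style permission string.

    Layout of the string: one file-type character, followed by three rwx
    triplets for user, group, and other. The 'other' read bit is therefore
    the character at index 7.
    """
    if len(permissions) < 10:
        raise ValueError(f"not a permission string: {permissions!r}")
    return permissions[7] == "r"


# Illustrative permission strings, before and after running chmod o+r.
print(others_can_read("drwxr-x---"))  # other triplet is '---'
print(others_can_read("drwxr-xr-x"))  # other triplet is 'r-x'
```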
Example
Suppose you ran your ETL and created a countries.csv file that you then made available in Hive via:
import wmfdata as wmf
wmf.hive.load_csv(
"countries.csv",
field_spec="name string, iso_code string, economic_region string, maxmind_continent string",
db_name="canonical_data",
table_name="countries"
)
You add it as a physical dataset within Superset and create a chart that relies on it. To make sure that everyone can view that chart (and dashboard) you would update permissions with:
hdfs dfs -chmod -R o+r /user/hive/warehouse/canonical_data.db/countries
If you loaded the data into Hive manually and it is stored elsewhere, change the path accordingly.
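The path in the example follows the pattern `<warehouse root>/<db_name>.db/<table_name>`. As a minimal sketch, assuming the table lives under the same warehouse root as in the example above, the command can be built from the database and table names; `table_path` and `chmod_command` are hypothetical helper names, not part of wmfdata.

```python
import shlex

# Assumption: the warehouse root used in the example above.
WAREHOUSE_ROOT = "/user/hive/warehouse"


def table_path(db_name: str, table_name: str, warehouse_root: str = WAREHOUSE_ROOT) -> str:
    """Build the HDFS directory for a Hive table under the warehouse root."""
    return f"{warehouse_root}/{db_name}.db/{table_name}"


def chmod_command(path: str) -> list[str]:
    """Return the argument list that grants read access to other users."""
    return ["hdfs", "dfs", "-chmod", "-R", "o+r", path]


cmd = chmod_command(table_path("canonical_data", "countries"))
print(shlex.join(cmd))
# prints: hdfs dfs -chmod -R o+r /user/hive/warehouse/canonical_data.db/countries
```

On a host where the `hdfs` client is available, the same list can be passed to `subprocess.run(cmd, check=True)`. For data stored outside the warehouse root, pass the actual path to `chmod_command` directly.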