Product Analytics/Dashboarding Guidelines
Publishing/sharing
Before publishing or sharing your Superset dashboard, double-check that you have:
- added contact information
- set the correct access and permissions for your data and your audience
Refer to the sections below for details.
Contact Info
Add the following contact information, as Markdown, at the top or bottom of the dashboard:
This dashboard is maintained by {NAME}, [Product Analytics](https://www.mediawiki.org/wiki/Product_Analytics). If you have questions or feedback, please email {name}@wikimedia.org or product-analytics@wikimedia.org.
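If you maintain several dashboards, the template can be filled in programmatically. The following is a minimal sketch, not an established tool: the helper name `contact_markdown`, the placeholder names `full_name` and `username`, and the sample values are all hypothetical.

```python
# Hypothetical helper: fills in the contact template from this section.
# {full_name} stands in for {NAME}; {username} stands in for {name}.
TEMPLATE = (
    "This dashboard is maintained by {full_name}, "
    "[Product Analytics](https://www.mediawiki.org/wiki/Product_Analytics). "
    "If you have questions or feedback, please email "
    "{username}@wikimedia.org or product-analytics@wikimedia.org."
)


def contact_markdown(full_name: str, username: str) -> str:
    """Return the contact blurb as a Markdown string."""
    return TEMPLATE.format(full_name=full_name, username=username)


# Placeholder values for illustration only.
print(contact_markdown("Jane Doe", "jdoe"))
```

The printed string can be pasted into a Markdown component on the dashboard.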
Permissions
Virtual datasets
For Presto-based charts that rely on virtual datasets derived from event data, make sure the stakeholder has been added to the analytics-privatedata-access group.
If they have not, ask them to request access through Phabricator. Refer to T286746 as an example.
Physical datasets
For charts that rely on Hive tables added as physical datasets, make sure that users outside of your group have read access to the files in Hadoop:
hdfs dfs -chmod -R o+r <path to your table>
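If you set permissions from a notebook or an ETL script rather than a shell, the same command can be issued from Python. This is a sketch under assumptions: the helper names `chmod_command` and `grant_read_to_others` are hypothetical, the `hdfs` client is assumed to be on your PATH, and the call defaults to a dry run that only returns the command it would execute.

```python
import shlex
import subprocess
from typing import List


def chmod_command(table_path: str) -> List[str]:
    """Build the argument list for the recursive chmod shown above."""
    return ["hdfs", "dfs", "-chmod", "-R", "o+r", table_path]


def grant_read_to_others(table_path: str, dry_run: bool = True) -> str:
    """Return the command as a string; run it only when dry_run is False."""
    cmd = chmod_command(table_path)
    if not dry_run:
        # Assumes the `hdfs` CLI is installed; raises CalledProcessError on failure.
        subprocess.run(cmd, check=True)
    return shlex.join(cmd)


# Dry run: prints the command without touching HDFS.
print(grant_read_to_others("/user/hive/warehouse/canonical_data.db/countries"))
```

Passing the arguments as a list avoids shell quoting problems if a table path ever contains spaces or other special characters.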
Example
Suppose you did your ETL and created a countries.csv that you then make available in Hive via:
import wmfdata as wmf

wmf.hive.load_csv(
    "countries.csv",
    field_spec="name string, iso_code string, economic_region string, maxmind_continent string",
    db_name="canonical_data",
    table_name="countries"
)
You then add it as a physical dataset within Superset and create a chart that relies on it. To make sure that everyone can view that chart (and its dashboard), you would update permissions with:
hdfs dfs -chmod -R o+r /user/hive/warehouse/canonical_data.db/countries
If you loaded the data into Hive manually and the files are stored elsewhere, change the path accordingly.
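To confirm that the change took effect, you can inspect the permission string that `hdfs dfs -ls` prints in the first column of each entry. The sketch below only parses such a line; the function name `others_can_read` is hypothetical and the sample lines are made up for illustration, not real cluster output.

```python
def others_can_read(ls_line: str) -> bool:
    """Return True if the permission field of an ls-style line grants read to 'other'."""
    fields = ls_line.split()
    if not fields:
        return False
    perms = fields[0]  # e.g. "drwxr-x---"
    # Layout: one type character, then rwx triplets for user, group, other.
    # Index 7 is therefore the read bit for 'other'.
    return len(perms) >= 10 and perms[7] == "r"


# Illustrative lines only (owner, group, and dates are placeholders).
before = "drwxr-x---   - someuser somegroup 0 2021-01-01 00:00 /user/hive/warehouse/canonical_data.db/countries"
after = "drwxr-xr-x   - someuser somegroup 0 2021-01-01 00:00 /user/hive/warehouse/canonical_data.db/countries"

print(others_can_read(before))  # False
print(others_can_read(after))   # True
```

If the check returns False for any file under the table's directory, users outside your group will likely be unable to load charts built on that table.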