Analytics/Research

Whitepapers, blog posts, and other material of note and interest to us, the Analytics team.

Ideas

  • Create an Ambrose action for Oozie. (Docs for Custom Oozie Actions)
  • Job Generator Scripts
    • Job scripts & templates -- Instead of copy-pasting old jobs to create new ones, we could script the use of templates, with parameters and actions for job types and common tasks.
    • Script to use Pig's DESCRIBE on the target of the final STORE command to generate a Hive CREATE TABLE statement.
  • Aggregate and index task stats by generating code unit (bundle/coord/workflow/script), by spawning job, by user, etc.
    • Sum counters, run stats, make graphs
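
As a rough illustration of the job template idea above, here is a minimal sketch using Python's standard string.Template. The template text, the parameter names (job_name, input_path, output_path), and the render_job helper are all hypothetical placeholders, not an existing tool of ours.

```python
from string import Template

# Hypothetical template for a trivial Pig job; the parameter names are
# illustrative only and do not come from any existing job.
JOB_TEMPLATE = Template(
    "-- generated job: $job_name\n"
    "data = LOAD '$input_path' USING PigStorage('\\t');\n"
    "STORE data INTO '$output_path';\n"
)


def render_job(**params):
    """Fill in the template; substitute() raises KeyError if a parameter is missing."""
    return JOB_TEMPLATE.substitute(**params)
```

Using substitute() rather than safe_substitute() means a forgotten parameter fails loudly at generation time instead of producing a job with a stray placeholder in it.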
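
The DESCRIBE-to-CREATE-TABLE idea above could start from something like the sketch below. It assumes the DESCRIBE output has already been captured as a single line of the form alias: {field: type,...}, handles only flat schemas of scalar types, and uses a type mapping that is our assumption rather than an official Pig or Hive table. The function name hive_ddl_from_describe is hypothetical.

```python
import re

# Assumed mapping from Pig scalar types to Hive column types.
PIG_TO_HIVE = {
    "int": "INT",
    "long": "BIGINT",
    "float": "FLOAT",
    "double": "DOUBLE",
    "chararray": "STRING",
    "bytearray": "BINARY",
    "boolean": "BOOLEAN",
    "datetime": "TIMESTAMP",
}


def hive_ddl_from_describe(describe_line, table_name):
    """Turn one line of flat Pig DESCRIBE output into a Hive CREATE TABLE statement."""
    match = re.match(r"^\s*\w+:\s*\{(.*)\}\s*$", describe_line)
    if match is None:
        raise ValueError("unrecognized DESCRIBE output: %r" % describe_line)
    columns = []
    for field in match.group(1).split(","):
        name, _, pig_type = field.partition(":")
        name, pig_type = name.strip(), pig_type.strip()
        # Nested types (tuple, bag, map) are rejected here rather than guessed at.
        if pig_type not in PIG_TO_HIVE:
            raise ValueError("unsupported Pig type for %r: %r" % (name, pig_type))
        columns.append("  `%s` %s" % (name, PIG_TO_HIVE[pig_type]))
    return "CREATE TABLE %s (\n%s\n);" % (table_name, ",\n".join(columns))
```

For example, passing "events: {user: chararray,ts: long,score: double}" with the table name "events" yields a three-column statement. Storage format, partitioning, and location clauses are left out and would need to be added per job.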
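
For the stats aggregation idea above, summing counters per group might look like the following sketch. The record layout (a dict with "user", "workflow", and a "counters" mapping) is an assumption made for illustration, as is the sum_counters name; real task stats would need to be mapped into that shape first.

```python
from collections import Counter, defaultdict


def sum_counters(task_stats, group_by):
    """Sum each counter across all records that share the same group_by value."""
    totals = defaultdict(Counter)
    for record in task_stats:
        # Counter.update() adds the incoming values to the running totals.
        totals[record[group_by]].update(record["counters"])
    return {group: dict(counts) for group, counts in totals.items()}


# Made-up sample records, for illustration only.
sample_stats = [
    {"user": "alice", "workflow": "wf-a",
     "counters": {"HDFS_BYTES_READ": 100, "MAP_INPUT_RECORDS": 10}},
    {"user": "bob", "workflow": "wf-a",
     "counters": {"HDFS_BYTES_READ": 50}},
    {"user": "alice", "workflow": "wf-b",
     "counters": {"HDFS_BYTES_READ": 25}},
]
```

Calling sum_counters(sample_stats, "user") groups by user, and passing "workflow" instead groups by workflow, so the same function covers each of the grouping dimensions listed above as long as the records carry the corresponding key.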