Analytics/Hardware/Appendix
This page is obsolete. It is being retained for archival purposes. It may document extensions or features that are obsolete and/or no longer supported. Do not rely on the information here being up-to-date.
Comparisons
Several large consumers of big data have published detailed information about some of their data processing products and clusters. A few are presented here for reference.
Twitter Rainbird (Cassandra, 2011)
Analytics for the Promoted Tweets advertising platform.
- 100,000s of writes/sec
- 10,000s of reads/sec
- 100TB+ storage
- Extremely low latency: <100ms reads
- Events are batched for ~60 seconds
- Parsing and structuring performed by bundlers
- Clients submitting events are Rainbird-aware
Facebook Insights (HBase, 2011)
Social-plugin analytics for site owners.
- 20 billion events/day
- 200,000 events/second
- <30s average delay before an event surfaces in queries
- 100+ metrics, but stored only as counters
- Events are batched for ~1.5 seconds
- Each node handling 10k writes/sec
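As a rough cross-check of the figures above, the sketch below converts the daily event count into an average per-second rate and estimates how many nodes the quoted per-node write rate would imply. The one-write-per-event assumption and all variable names are illustrative and do not come from the cited sources, so treat the node count as a lower-bound estimate rather than Facebook's actual cluster size.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Figures quoted in the list above.
events_per_day = 20_000_000_000
quoted_events_per_sec = 200_000
writes_per_node_per_sec = 10_000

# Average rate implied by the daily total (about 231,000 events/sec),
# which is in the same range as the quoted 200,000 events/second.
avg_events_per_sec = events_per_day / SECONDS_PER_DAY

# ASSUMPTION: each event costs exactly one write, with no replication
# or headroom. Under that assumption the quoted rate needs ~20 nodes.
min_nodes = math.ceil(quoted_events_per_sec / writes_per_node_per_sec)

print(f"average rate: {avg_events_per_sec:,.0f} events/sec")
print(f"minimum nodes at quoted rate: {min_nodes}")
```

Replication, counter fan-out across the 100+ metrics, and peak-to-average headroom would all push the real node count higher than this estimate.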
References
- https://www.facebook.com/note.php?note_id=10150103900258920
- http://highscalability.com/blog/2011/3/22/facebooks-new-realtime-analytics-system-hbase-to-process-20.html
Facebook Messages (HBase, 2010)
The Facebook messaging system.
- 135 billion messages/month (~4.5B/day)
- 1.5M+ operations/second at peak
- ~55% reads: 825,000 reads/sec
- ~45% writes: 675,000 writes/sec
- 2PB+ (petabytes) data in storage
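The derived figures in this list follow from simple arithmetic on the monthly volume and the peak operation rate. A minimal sketch, assuming a 30-day month (an assumption made here to reproduce the ~4.5B/day figure) and using illustrative variable names:

```python
# Figures quoted in the list above.
messages_per_month = 135_000_000_000
peak_ops_per_sec = 1_500_000
read_percent = 55
write_percent = 45

# ASSUMPTION: a 30-day month, which reproduces the ~4.5B/day figure.
days_per_month = 30
messages_per_day = messages_per_month // days_per_month

# Integer arithmetic avoids floating-point rounding in the split.
reads_per_sec = peak_ops_per_sec * read_percent // 100
writes_per_sec = peak_ops_per_sec * write_percent // 100

print(f"messages/day: {messages_per_day:,}")
print(f"peak reads/sec: {reads_per_sec:,}")
print(f"peak writes/sec: {writes_per_sec:,}")
```

Note that the read and write rates are shares of the peak operation rate, so they describe peak load rather than sustained averages.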