Here lie Chad's first impressions of Solr 4.2, after a week of using it.
- I set up a SolrCloud cluster on labs
- Used 3 nodes acting as zookeeper (solr-zk[0-2])
- Used 4 nodes for solr (solr0-solr3), one collection in two shards
- The new SolrCloud stuff is (mostly) awesome
- It's super easy to add new replicas to a collection, so it scales out nicely.
- Each instance can act as a master (for writes) or a slave (for reads)
- This removes the SPOF of having a single indexer (lsearchd) or a single master (Solr 3.x and below)
- Has a GUI which makes it easy to look at the state of the "cloud"
- Zookeeper manages config & index state
- Can't re-shard a collection; changing the shard count requires an index rebuild. There are bugs reported for this, no ETA.
- Proper initial planning for the larger indices would make this less of a priority.
- Zookeeper was easy to set up, works with standard Ubuntu packages & minimal config
- Not really a SPOF since it requires multiple instances to run.
- Formula is "require a majority" (floor(N/2) + 1, i.e. more than 50%) to operate. So with 3 servers you need 2/3 operational, with 5 you need 3/5, etc.
- Even the "leader" isn't a SPOF--if the leader goes away then zookeeper elects a new leader.
- Zookeeper is already used by analytics, and they like it.
- Unknown how well it would work cross-DC (conflicting reports)
- Solr 4.x isn't packaged in Ubuntu yet, not even in Raring
- Installed by hand for the demo, but we'll want to look into real packages.
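
The Zookeeper quorum rule noted above (2 of 3, 3 of 5) can be sketched in a few lines. This is only an illustration of the majority arithmetic, not anything Zookeeper ships; the function names `quorum_size` and `tolerated_failures` are made up for this example.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of an ensemble with the given number of servers."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can be down while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    # Compare a few ensemble sizes, including the 3-node setup used on labs.
    for n in (3, 4, 5):
        print(f"{n} servers: need {quorum_size(n)} up, "
              f"can lose {tolerated_failures(n)}")
```

One thing the arithmetic makes visible: a 4-server ensemble tolerates only one failure, the same as 3 servers, which is why odd ensemble sizes are the usual choice.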