We are thinking about upgrading to Solr 4.0 early next year. The main reason is to use SolrCloud to solve our current single point of failure (only one update server) and our scalability issues. I would like your expert advice on how to proceed. Backward compatibility is our major concern. Do we have to recreate the entire index? (It would take a few days.) Are there any particular differences we need to pay attention to? Are there any known major issues in Solr 4.0?
You don't need to reindex, but reindexing is advised. I have done a migration without reindexing: Solr 4.0 will still read 3.x indexes and convert them to the 4.0 format on merges. Solr 4.0 also handles the routing for a single core just the same, even though the default core is packaged a little differently. You will want to revisit your configuration files (solrconfig.xml, solr.xml and schema.xml) because there are some changes there. Start from the default ones that ship with 4.0 and merge your schema, config, etc. into them, rather than the other way around. The best place to keep track of known issues is the JIRA page: https://issues.apache.org/jira/browse/SOLR/fixforversion/12322551
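As a rough aid for that merge step, here is a minimal Python sketch that lists which fields, dynamic fields and field types from an old schema.xml are not yet declared in the 4.0 default one. The function names are made up for illustration, and it only compares declared names, not attributes or analyzers, so treat it as a checklist generator rather than a merge tool.

```python
import xml.etree.ElementTree as ET

# Element kinds we compare, lower-cased because older schemas sometimes
# spell <fieldtype> without the capital T.
KINDS = {"field", "dynamicfield", "fieldtype"}


def declared_names(schema_xml):
    """Return the set of (kind, name) pairs declared in a schema.xml document."""
    root = ET.fromstring(schema_xml)
    names = set()
    for el in root.iter():
        kind = el.tag.lower()
        name = el.get("name")
        if kind in KINDS and name:
            names.add((kind, name))
    return names


def missing_from_new(old_schema_xml, new_schema_xml):
    """List declarations present in the old schema but absent from the new one."""
    return sorted(declared_names(old_schema_xml) - declared_names(new_schema_xml))


# Hypothetical usage (the paths are placeholders for your own layout):
# old = open("solr3/conf/schema.xml", "rb").read()
# new = open("solr4/collection1/conf/schema.xml", "rb").read()
# for kind, name in missing_from_new(old, new):
#     print(kind, name)
```

Anything it prints is something you still need to carry over by hand into the 4.0 default schema.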
From there, you have the new features of Solr 4.0 to think about, especially architecturally with SolrCloud. If you decide to use it, installing a separate ZooKeeper instance is highly advised (one might even say required, without it really being required). That is, don't use the one that comes with Solr, so that you have redundancy. You will bootstrap your configuration into ZooKeeper and point the rest of your servers at the ZooKeeper host and port. The easiest ways to make changes are either to use LucidWorks via the UI, or to change the files on the filesystem and restart your Solr instance so it can re-bootstrap into ZK.
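To make the bootstrap-then-point-at-ZooKeeper step concrete, here is a small Python sketch that assembles the Solr start command for each case. The system properties (zkHost, bootstrap_confdir, collection.configName) are the ones I recall from the Solr 4.0 SolrCloud wiki, so please verify them against the documentation for your exact release. The helper name, host names, paths and the "myconf" default are placeholders.

```python
def solr_start_command(zk_hosts, bootstrap_conf_dir=None, config_name=None):
    """Build the argument list for starting Solr 4.0 against external ZooKeeper.

    zk_hosts: list of "host:port" strings for the ZooKeeper ensemble.
    bootstrap_conf_dir: if given, the local conf directory to upload into ZK
        (only the first node needs this).
    config_name: name to store that configuration under in ZK
        (assumed default "myconf" if omitted).
    """
    args = ["java", "-DzkHost=" + ",".join(zk_hosts)]
    if bootstrap_conf_dir is not None:
        args.append("-Dbootstrap_confdir=" + bootstrap_conf_dir)
        args.append("-Dcollection.configName=" + (config_name or "myconf"))
    args += ["-jar", "start.jar"]
    return args


# Hypothetical ensemble; replace with your own ZooKeeper hosts.
ZK = ["zk1.example.com:2181", "zk2.example.com:2181", "zk3.example.com:2181"]

# First node: uploads the local configuration into ZooKeeper.
first_node = solr_start_command(ZK, "./solr/collection1/conf", "myconf")

# Remaining nodes: only need to know where ZooKeeper is.
other_node = solr_start_command(ZK)

print(" ".join(first_node))
print(" ".join(other_node))
```

Restarting the first node with the bootstrap properties after editing the files on disk is what pushes the changed configuration back into ZK.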