If I have a Solr core with a half-dozen small fields that’s loaded with 100 million documents, will adding a batch of 1 million documents run in a reasonable amount of time? How about 10 million? By reasonable, I’m thinking hours, rather than days. I’ve been told that this will take a long time to run. Is this really an issue? What are known strategies to improve performance? The fields are typically small, that is, 5-50 characters.
This is a very “tricky” question whose answer differs from schema to schema.
It’s up to you to decide whether a long-running data post is an issue. If your application is user-intensive, I suggest some kind of master-slave configuration, so that the data is posted to the master while queries are served by the slaves and users are not affected by the high CPU usage during indexing. One strategy I know of for improving performance is “sharding”:
http://carsabi.com/car-news/2012/03/23/step-by-step-solr-sharding/
Alternatively, if it is possible to demarcate the records by some field, put the different groups of documents onto different servers.
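
To make the idea of splitting records by a field more concrete, here is a minimal sketch in Python. It only decides which server each document would go to; actually posting the groups to Solr is left out. The shard URLs, the `shard_for` and `group_by_shard` helpers, and the choice of a CRC32 hash are all my own assumptions for illustration, not something Solr requires.

```python
import zlib
from collections import defaultdict

# Hypothetical shard URLs (assumption): replace with your own Solr cores.
SHARD_URLS = [
    "http://solr1.example.com:8983/solr/core0",
    "http://solr2.example.com:8983/solr/core0",
    "http://solr3.example.com:8983/solr/core0",
]


def shard_for(doc, field="id", num_shards=len(SHARD_URLS)):
    """Return a shard index from a stable hash of the routing field.

    CRC32 is used because it gives the same value on every run,
    unlike Python's built-in hash() for strings.
    """
    key = str(doc[field]).encode("utf-8")
    return zlib.crc32(key) % num_shards


def group_by_shard(docs, field="id", num_shards=len(SHARD_URLS)):
    """Group documents by target shard so each group can be posted separately."""
    groups = defaultdict(list)
    for doc in docs:
        groups[shard_for(doc, field, num_shards)].append(doc)
    return dict(groups)
```

Each group returned by `group_by_shard(docs, field="region")` could then be posted to the server at the matching index in `SHARD_URLS`, so every server only indexes its own share of the batch. Routing on a field such as a region or category keeps all documents with the same value on the same server.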