Scaling Solr horizontally (running multiple indexes) can be painful, and in some situations may even cause nightmares. There are several techniques available for doing it. The simplest is running multiple Solr webapps, but this falls apart because of the memory requirements. Next is Solr’s multiple core functionality, which scales much further but still runs into memory usage issues after a while. The final technique is to combine multiple cores with LRU (least recently used) management code that shuts down inactive cores. Let’s dive into each of these techniques in more detail.
Running multiple Solr webapps is the first technique and the easiest to set up. The problem is that it doesn’t scale very far at all. It doesn’t take many running instances of the Solr webapp before they start to eat up a ton of memory. Adding swap only helps for so long before your servers start to scream for mercy.
Multicore, on the other hand, allows Solr to scale much further, especially if you can share a common schema between cores. Setting up and stopping cores is easily accomplished via HTTP requests.
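As a rough illustration of what those HTTP requests can look like, here is a minimal sketch that builds URLs for Solr's CoreAdmin handler using its `CREATE` and `UNLOAD` actions. The base URL, the core name, and the helper function names are assumptions made for this example; they do not come from the talk, and the exact parameters your deployment needs may differ.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed location of the CoreAdmin handler; adjust host, port, and path.
SOLR_ADMIN = "http://localhost:8983/solr/admin/cores"


def core_admin_url(action, **params):
    """Build a CoreAdmin request URL for the given action and parameters."""
    query = urlencode({"action": action, **params})
    return f"{SOLR_ADMIN}?{query}"


def create_core_url(name, instance_dir):
    """URL that asks Solr to create (start) a core from an instance directory."""
    return core_admin_url("CREATE", name=name, instanceDir=instance_dir)


def unload_core_url(name):
    """URL that asks Solr to unload (stop) a running core."""
    return core_admin_url("UNLOAD", core=name)


def send(url, timeout=10):
    """Issue the request and return the HTTP status code."""
    with urlopen(url, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # "customer42" is a hypothetical core name used only for illustration.
    print(create_core_url("customer42", "customer42"))
    print(unload_core_url("customer42"))
```

Keeping URL construction separate from the network call makes the requests easy to inspect or log before they are sent to a live server.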
If you have many inactive cores in your cloud, you can scale Solr even further by managing the multiple core setup. If you have an application where the cores do not need to be active at all times, you can implement an LRU management layer. When an index is needed and its core is stopped, the core is started first. The other half is a management script that finds the least recently used cores and shuts them down.
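The two halves described here, starting a core on demand and shutting down the least recently used ones, can be sketched in a few lines. This is only one possible shape for such a layer, not the speaker's implementation: the `CoreLRU` class, its method names, and the `max_active` limit are hypothetical, and the `start_core` and `stop_core` callbacks stand in for whatever actually issues the HTTP requests to Solr.

```python
import time
from collections import OrderedDict


class CoreLRU:
    """Tracks core usage and stops the least recently used cores."""

    def __init__(self, max_active, start_core, stop_core):
        self.max_active = max_active   # assumed cap on simultaneously active cores
        self.start_core = start_core   # callable(name) that starts a core
        self.stop_core = stop_core     # callable(name) that stops a core
        self.active = OrderedDict()    # core name -> last access time, oldest first

    def touch(self, name):
        """Call before each query: start the core if needed, mark it as used."""
        if name not in self.active:
            self.start_core(name)
        self.active[name] = time.time()
        self.active.move_to_end(name)  # most recently used goes to the end

    def evict(self):
        """Management pass: stop cores until at most max_active remain."""
        stopped = []
        while len(self.active) > self.max_active:
            name, _last_used = self.active.popitem(last=False)  # oldest entry
            self.stop_core(name)
            stopped.append(name)
        return stopped


if __name__ == "__main__":
    # Stand-in callbacks that only print; real ones would call Solr over HTTP.
    lru = CoreLRU(
        max_active=2,
        start_core=lambda n: print("start", n),
        stop_core=lambda n: print("stop", n),
    )
    for core in ["a", "b", "c", "a"]:
        lru.touch(core)
    lru.evict()
```

In a real deployment the eviction pass would more likely run on a schedule (for example from cron) and could use an idle-time threshold instead of a fixed count; the count-based limit is used here only to keep the sketch short.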
Solr is a great platform to use for searching your application’s data, and knowing different techniques for scaling it horizontally will help you make the most of it within your cloud environment.
Andy is a husband, programmer, system administrator, entrepreneur, musician, private pilot, & optimist. He is the lead software engineer on the Barracuda Networks Backup product.