Yes, you can run a Splunk instance in an OS hosted on a hypervisor. We do this ourselves for testing and will support it if the OS and file system are otherwise supported.
However, there are serious performance implications to consider.
A search head (i.e. a Splunk instance that only services search / distributed search) is a good candidate for virtualization. It is generally CPU bound, but uses CPU in bursts (unless overloaded). So long as you expect there to be about one core available per active user, you should be fine. Think of this as a heavy-weight web server, or medium weight app server.
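As a rough illustration of the "about one core per active user" rule of thumb above, here is a minimal Python sketch. The function names and the `cores_per_user` parameter are made up for this example and are not part of Splunk; treat it as a sanity check on a VM allocation, not an official sizing tool.

```python
import math


def min_search_head_cores(active_users: int, cores_per_user: float = 1.0) -> int:
    """Estimate cores for a search head using ~one core per active user.

    cores_per_user is an illustrative knob (assumption); the rule of thumb
    in the answer above corresponds to the default of 1.0.
    """
    if active_users < 0:
        raise ValueError("active_users must be non-negative")
    # Always reserve at least one core, even with no active users.
    return max(1, math.ceil(active_users * cores_per_user))


def has_enough_cores(available_cores: int, active_users: int) -> bool:
    """True if the VM's cores cover the estimated need."""
    return available_cores >= min_search_head_cores(active_users)
```

For example, `has_enough_cores(4, 6)` is false: six concurrently active users on a four-core VM falls short of the rule of thumb.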
Indexers are troublesome for virtualization. They are like a relational DB, but backwards; instead of a steady background of reads with bursty writes, Splunk has a constant set of writes with bursty reads. As such they are usually disk I/O bound, and occasionally CPU bound (when servicing lots of searches).
For deployments larger than a few GB/day and a couple of users, the load is likely more than a fractional share of a bare-metal box can handle. We suggest more, cheaper full-box indexers, as that is by far the best value for money (both for your Splunk deployment and compared to anything else on the market).
However, if you must virtualize your indexers here is what you should know:
All that said, some of the largest deployments are on VMs, but with expert care and high-performance hardware. If you're the kind of shop that would virtualize an n-way DB cluster, you'll probably be OK. If that sounds daunting, we're sure Splunk can put a couple of commodity bare-metal servers to good (and full) use.
answered 01 Feb '10, 21:04
Running in a VM in principle is fine.
In practice, the most common problem with running Splunk in VM environments is that the storage assigned to a VM is often low-grade, high-latency pooled storage on some anonymous 7200 RPM RAID 5 array that's used to serve up non-IO-intensive apps. These will usually give you under 100 IOPS, occasionally bursting up to 200 IOPS.
If there is good dedicated storage, configured right with the right storage drivers and interfaces on the VM, and the disks are dedicated to the VM, then Splunk can perform very well.
On physical servers with local disk, it is fairly predictable what performance they will provide. For example, 4 x 10k RPM disks in RAID 10 will give us at least 800 IOPS, which is what we ask for to get good performance. With virtual machines, we have to perform much greater diligence on the storage that is assigned. It's not that you can't get good performance, it's just that in a VM environment you are much more likely to have poorly performing resources. (One of the motivations of virtualization is to remove excess physical capacity from data centers by sharing stuff, i.e., providing fewer resources to applications. Most applications don't need this excess physical capacity. But for Splunk, that isn't "excess", it's what we need.)
Bottom line, we want at least 800 to 1000 IOPS from the disk, however it is provisioned. (SAN, NAS, virtual, real, whatever.)
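To make those numbers concrete, here is a small Python sketch that classifies a measured IOPS figure against the thresholds mentioned in this answer (800 as the minimum, 1000 as the upper end of the ask, and roughly 200 as the burst ceiling of typical pooled VM storage). The function name, constants, and labels are invented for illustration; how you obtain the measurement (a disk benchmark tool, vendor figures, etc.) is left to you.

```python
# Thresholds taken from the figures quoted in this answer.
POOLED_BURST_IOPS = 200   # low-grade pooled storage rarely exceeds this
MIN_IOPS = 800            # minimum asked for good Splunk performance
TARGET_IOPS = 1000        # upper end of the "800 to 1000" ask


def assess_indexer_storage(measured_iops: float) -> str:
    """Return a rough label for storage backing a Splunk indexer.

    The labels are hypothetical names for this sketch, not Splunk terms.
    """
    if measured_iops < 0:
        raise ValueError("measured_iops must be non-negative")
    if measured_iops <= POOLED_BURST_IOPS:
        return "pooled-grade"    # looks like shared low-grade VM storage
    if measured_iops < MIN_IOPS:
        return "below-minimum"   # better, but under the 800 IOPS floor
    if measured_iops < TARGET_IOPS:
        return "meets-minimum"   # inside the 800-1000 band
    return "meets-target"        # at or above 1000 IOPS
```

The same check applies however the disk is provisioned: only the measured figure matters, not whether it came from SAN, NAS, or local disk.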
A decent way to let someone understand what Splunk requires is to tell them we need the same quality (or higher) of disk storage as they would have to provision for a production transactional Oracle or MSSQL server database. If they can do that in their VM environment (or on their NAS or SAN), then Splunk will probably also be fine with the same kind of storage.
answered 10 Nov '10, 17:57
Would it be OK to run an indexer on a VM with dedicated SSDs? Could an indexer VM like that handle up to 50GB?
answered 24 Sep '12, 10:37
There is also a whitepaper that talks about some tricks etc: http://www.splunk.com/web_assets/pdfs/secure/Splunk_and_VMware_VMs_Tech_Brief.pdf
answered 24 Sep '12, 11:27