Wondering if there are any best practices (or reference architectures) for running Splunk against an Azure (or other cloud) solution where there are, for example, multiple web servers and, in this case, a very large number of worker nodes. There could also be any number of these deployments, so essentially LOTS of cloud VM instances. All the logs are automatically transferred to Azure Table Storage.
We don't want to have to transfer all this data on-premises, as it could get a little unwieldy.
Would the best approach be to spin Splunk up on a VM in the cloud and have it download the logs to local storage? This could be problematic if the VM were recycled, since the local storage could (and eventually will) get wiped...
Appreciate any guidance.
asked 26 Apr '11, 17:40
There has definitely been some work done on this before, and I believe Splunk's SEs have used Amazon-based instances for demos from time to time.
The following would be a good starting point:
answered 28 Apr '11, 11:32
I just wrote an Azure Diagnostics App for Splunk and submitted it to Splunkbase yesterday for approval. I tested it on both Windows and Linux. What it does is pull the Azure diagnostics data from the Azure WAD tables and populate the Splunk indexes with it. Currently it doesn't do any grooming of the Azure tables, but that is something I plan on adding later. It can run on- or off-premises; some due diligence is needed to determine what makes the most sense in different scenarios (paying for instances vs. paying for data transfers). If you do decide to give it a try, do let me know, I'd love to hear some feedback.
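For anyone curious what the core transform looks like, here is a minimal sketch in Python of the kind of step such an app performs: flattening one WAD (Windows Azure Diagnostics) table entity into a key=value line that Splunk can index. The field names (`PartitionKey`, `RowKey`, `TIMESTAMP`, `Role`, `Message`, etc.) follow the standard WADLogsTable schema, but the helper name and exact output format here are my own assumptions, not the app's actual code.

```python
def wad_entity_to_event(entity):
    """Render one WAD table entity as a single Splunk-friendly event line.

    The timestamp is placed first so Splunk's default timestamp
    extraction picks it up; remaining fields become key="value" pairs.
    """
    ts = entity.get("TIMESTAMP", "")
    fields = " ".join(
        f'{k}="{v}"' for k, v in sorted(entity.items()) if k != "TIMESTAMP"
    )
    return f"{ts} {fields}"


# Example entity shaped like a WADLogsTable row (values are illustrative).
sample = {
    "PartitionKey": "0634629600000000000",
    "RowKey": "instance0___0000000001",
    "TIMESTAMP": "2012-01-16T17:29:00Z",
    "Role": "WebRole1",
    "RoleInstance": "WebRole1_IN_0",
    "Level": "4",
    "Message": "Request handled",
}

print(wad_entity_to_event(sample))
```

In a real deployment you would fetch the entities with the Azure Table Storage SDK (paging by PartitionKey, which WAD derives from the event time) and write the rendered lines to a Splunk input.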
answered 16 Jan '12, 17:29