Quick thoughts on Splunk

Here is the skinny on Splunk. It’s a generic analytics tool that most folks use for log consolidation and searching. I’ve spent about two weeks in the lab with it in between other projects. Here’s the executive summary:

  1. It is, indeed, useful. It enables you to find data you otherwise might not have found due to too-much-effort-required-so-why-try syndrome.
  2. With the right server, it’s very fast
  3. It’s quite pretty and the graphs are actually useful
  4. You can also log, search, and graph non-syslog data like Windows WMI logs, Cisco logs, Web proxy logs, and various security-oriented logs
  5. It’s dead simple to get it up and running
  6. It scales really well

Just so you don’t think I’m cheerleading, I’ll let you in on the things that are not so good but still bearable, in my opinion.

  1. If you have any significant log volume at all, you’d better get a beefy server to run the indexing and search servers (these can be one box or several). I tested it on a 16-way Nehalem box with 48 GB of RAM. It stood up well there, but on my original modest VMware guest it was dog-slow. I was doing about 5 GB of logs per day.
  2. The “lightweight forwarder” functionality is very useful, but not at all what I (a C programmer) would call lightweight. It’s often, by far, the most CPU- and RAM-hungry process on otherwise underutilized servers. It likes to chew up 1-2% of the CPU at times when I really have to wonder what the heck it’s doing (i.e., no new logs). Yes, I had indexing turned off on those nodes.
  3. If you rely on their lightweight forwarders rather than just sending remote syslog, you end up kind of in bed with them, in a way that would be a little painful should you wish to come back onto the “let’s just use what’s traditional” reservation. However, the lightweight forwarder does add some valuable features, like the ability to run arbitrary commands, log their output, then mine that data (think ‘ps’, ‘vmstat’, ‘netstat’, and ‘iostat’).
  4. Their product is moderately expensive
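To make the "run arbitrary commands, log them, then mine that data" idea from item 3 concrete, here is a rough sketch of the general pattern in Python. This is not how Splunk's forwarder works internally; the function names `stamp_lines` and `run_and_stamp` and the `source=` tag format are made up for illustration.

```python
import subprocess
from datetime import datetime, timezone


def stamp_lines(output, source, ts):
    """Prefix each non-empty line of command output with a timestamp and a source tag."""
    # The "source=<name>" tag is a hypothetical convention so the lines can be searched later.
    return [f"{ts} source={source} {line}" for line in output.splitlines() if line.strip()]


def run_and_stamp(argv, source):
    """Run a command, capture its stdout, and return log-friendly lines."""
    ts = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # check=False: a non-zero exit status still yields whatever output was produced.
    result = subprocess.run(argv, capture_output=True, text=True, check=False)
    return stamp_lines(result.stdout, source, ts)
```

Calling something like `run_and_stamp(["vmstat"], "vmstat")` from cron and appending the result to a file gives you periodic, timestamped snapshots to search through later, assuming `vmstat` is installed on the box.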

Yes, you could do almost everything Splunk does with egrep, gnuplot, graphviz, and a lot of custom scripts. However, it’s a great tool if you don’t have the time or just want a nice way to centralize the logs with a minimum of fuss. Overall, I can tell they listened to a few good sysadmins on how to design the tool and make it useful. That’s something very rare when dealing with commercial software these days.
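For a taste of what the "custom scripts" route looks like, here is a minimal sketch of one such script: an egrep-style match counted per hour and written out as rows a plotting tool could read. I've written it in Python for brevity. It assumes traditional syslog timestamps ("Feb  2 14:03:11 host ..."), and the function names and output row format are my own choices, not anything standard.

```python
import re
from collections import Counter

# Assumed line shape: "Feb  2 14:03:11 host prog[pid]: message"
SYSLOG_PREFIX = re.compile(r"^(\w{3})\s+(\d{1,2})\s+(\d{2}):\d{2}:\d{2}\s")
MONTHS = {name: i for i, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}


def count_matches_per_hour(lines, pattern):
    """Count lines matching `pattern`, bucketed by (month, day, hour)."""
    wanted = re.compile(pattern)
    counts = Counter()
    for line in lines:
        prefix = SYSLOG_PREFIX.match(line)
        if not prefix or prefix.group(1) not in MONTHS:
            continue  # skip lines that lack a recognizable syslog timestamp
        if wanted.search(line):
            month, day, hour = prefix.groups()
            counts[(MONTHS[month], int(day), int(hour))] += 1
    return counts


def to_plot_rows(counts):
    """Render buckets as "MM-DDTHH count" rows, in chronological order."""
    return [f"{mo:02d}-{d:02d}T{h:02d} {n}" for (mo, d, h), n in sorted(counts.items())]
```

That covers one search and one graph. Multiply it by every question you'd want to ask of your logs and you can see where the "too-much-effort-required-so-why-try" problem comes from.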

~ by aliver on February 2, 2010.

One Response to “Quick thoughts on Splunk”

  1. Thanks for the thoughts on Splunk. We do try to engage our customers as much as possible and always appreciate candid feedback, so keep it coming.

    A couple notes: 1) Splunk on a VM will generally be inferior because the indexing is I/O-bound (as opposed to CPU-bound), 2) we are planning on releasing a dedicated lightweight forwarder that is closer to what you’d expect.

    Johnvey Hwang
    Splunk Engineering
