Logs, Lies & Open Source: Building a Budget SIEM That Doesn’t Suck

Build a powerful, budget-friendly SIEM using open source tools like Wazuh, Logstash, and Elasticsearch. This expert guide walks through architecture, benefits, and pitfalls, giving you full control, deep visibility, and zero license fees. Security, your way.

Photo by La-Rel Easter / Unsplash

We’ve all seen the dashboards. Bright red alert boxes. Login attempts from Vietnam. 3am spikes in CPU usage. The illusion of control. The sweet, sweet dopamine hit of a clean log report. But behind that thin veneer of security lies the truth: most SIEMs are glorified log collectors wrapped in paywalls and overpromises.

What if you could build your own? Not the “Hello World” of cybersecurity—but a real, usable Security Information and Event Management system, without setting fire to your budget?

Let’s talk about how to do exactly that—with open source tools, real architecture, and just enough elbow grease to make it sing.


🧱 What Even Is a SIEM, Really?

Before we dive into YAML, Logstash filters, and self-inflicted suffering, let's align on what we're building.

A SIEM (Security Information and Event Management) platform has three main responsibilities:

  1. Collect log and event data from systems, devices, networks, and apps.
  2. Normalize & correlate that data into something meaningful.
  3. Alert & report on anomalies, threats, and trends.
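
To make those three responsibilities concrete, here is a toy sketch in Python. It is not how any of the tools below are implemented; the `Event` schema, the sshd line format, and the threshold of 3 are all assumptions chosen for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    """Normalized event: one shape no matter which source produced it."""
    source: str
    user: str
    src_ip: str
    action: str
    outcome: str


# Assumed sshd line format; real logs vary by distro and version.
SSH_FAIL = re.compile(
    r"Failed password for (?:invalid user )?(?P<user>\S+) from (?P<ip>\S+)"
)


def normalize(source: str, raw: str) -> Optional[Event]:
    """Step 2: turn a raw line into the common schema, or drop it."""
    m = SSH_FAIL.search(raw)
    if m is None:
        return None
    return Event(source, m["user"], m["ip"], "ssh_login", "failure")


def alert_on(events: list, threshold: int = 3) -> list:
    """Step 3: flag users with at least `threshold` failed logins."""
    counts = {}
    for e in events:
        if e.outcome == "failure":
            counts[e.user] = counts.get(e.user, 0) + 1
    return [user for user, n in counts.items() if n >= threshold]


# Step 1: "collection" is just a list of raw lines in this sketch.
raw_lines = [
    "sshd[101]: Failed password for root from 203.0.113.7 port 4242 ssh2",
    "sshd[102]: Failed password for root from 203.0.113.7 port 4243 ssh2",
    "sshd[103]: Failed password for invalid user admin from 203.0.113.7 port 4244 ssh2",
    "sshd[104]: Failed password for root from 203.0.113.7 port 4245 ssh2",
    "cron[55]: job finished",
]
events = [e for e in (normalize("web-01", line) for line in raw_lines) if e]
print(alert_on(events))  # ['root']
```

Everything that follows in this guide is a sturdier, scalable version of one of these three functions.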

Basically: it’s Splunk meets Sherlock Holmes.

And while Splunk is cool, the price tag could make your CISO cry. So we turn to open source.


🧰 The Core Stack: Open Source Power Tools

Let’s build an actual architecture—something you could show your CTO without embarrassment.

Here’s our battle-tested stack:

  • Log Shippers: Filebeat, Winlogbeat (Elastic Beats family)
  • Ingest & Transform: Logstash
  • Data Store: Elasticsearch
  • Dashboard & Visuals: Kibana
  • Security Intelligence: Wazuh (forked from OSSEC)
  • Log Archival: OpenSearch or Loki (optional)
  • Alerting: ElastAlert or Wazuh's built-in rules

These are not toys. These tools are running in Fortune 500 companies and critical infrastructure today.


📦 Let’s Ship Some Logs

Start at the edge. The log shippers live on your endpoints and servers:

  • Filebeat: Reads text-based logs (e.g., /var/log/auth.log)
  • Winlogbeat: Feeds Windows Event Logs into the system

They send structured events to Logstash, your central intake point, over the Beats protocol (with TLS enabled, ideally).

👉 Tip: Always use encryption on this leg. If an attacker gets MITM access to your logs, you're handing them your entire threat landscape.
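
Filebeat and Winlogbeat handle transport encryption through their own settings, so you won't write this yourself. Still, it helps to see what "encrypted and verified" means on the client side. The sketch below uses only Python's standard library; the event fields and the newline-delimited JSON framing are assumptions for illustration, not the Beats wire format.

```python
import json
import ssl


def make_tls_context(ca_file=None):
    """Client-side TLS context that verifies the server certificate and hostname.

    ca_file=None falls back to the system trust store; pass the path to your
    own CA bundle if the collector uses an internal CA.
    """
    ctx = ssl.create_default_context(cafile=ca_file)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse older protocol versions
    return ctx


def encode_event(event: dict) -> bytes:
    """Serialize one event as a compact, newline-terminated JSON record."""
    return (json.dumps(event, separators=(",", ":")) + "\n").encode("utf-8")


ctx = make_tls_context()
payload = encode_event({"host": "web-01", "message": "Failed password for root"})

# To actually send, wrap a connected socket and write the payload:
#   tls_sock = ctx.wrap_socket(sock, server_hostname="<your collector>")
#   tls_sock.sendall(payload)
```

The important part is what the default context refuses to do: it will not talk to a server whose certificate or hostname fails verification, which is what blocks the MITM scenario above.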


🔄 Logstash: The Log Alchemist

Logstash is your data refinery. It:

  • Accepts raw log events
  • Applies filters and grok patterns
  • Tags, parses, and enriches
  • Routes to Elasticsearch, or Wazuh for correlation

It’s the middleware glue that turns raw spaghetti logs into security linguine.

💡 Pro move: Build pipeline logic for log types. Treat logs like streams, not blobs. For example:

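filter {
  # Only touch events the shipper tagged as Apache access logs.
  if [type] == "apache-access" {
    # Parse the combined log format into named fields.
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
    # Add location data for the client address. "clientip" is the legacy field
    # name; with ECS compatibility enabled the address lands in [source][address].
    geoip { source => "clientip" }
  }
}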
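
If grok feels like magic, it helps to remember it is named regular expressions underneath. Here is a simplified Python stand-in for what COMBINEDAPACHELOG extracts. The regex is my own approximation, not the real pattern, and the sample line is invented; the field names mirror the legacy (non-ECS) grok names.

```python
import re

# Simplified approximation of the combined log format; the real grok
# pattern handles more edge cases than this does.
COMBINED = re.compile(
    r'(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<request>\S+) HTTP/(?P<httpversion>[\d.]+)" '
    r'(?P<response>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_access_line(line: str):
    """Return a dict of named fields, or None if the line does not match."""
    m = COMBINED.match(line)
    if m is None:
        return None
    fields = m.groupdict()
    fields["response"] = int(fields["response"])  # status code as a number
    return fields


sample = (
    '203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] '
    '"GET /admin HTTP/1.1" 401 512 "-" "curl/8.0"'
)
parsed = parse_access_line(sample)
print(parsed["clientip"], parsed["verb"], parsed["response"])  # 203.0.113.7 GET 401
```

Lines that fail to match return None here; in Logstash the equivalent is an event tagged _grokparsefailure, which is worth routing somewhere visible instead of silently dropping.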

📚 Wazuh: The Brains of the Operation

Wazuh takes your logs, matches them against rulesets, and tells you when something weird is happening.

Wazuh isn’t just a SIEM engine—it’s also:

  • Host intrusion detection (HIDS)
  • File integrity monitoring (FIM)
  • Vulnerability detection
  • Active response system

And yes—it integrates with Elastic and Kibana like a charm.

🤖 Security Use Case: Wazuh notices failed sudo attempts on a box, correlates them with an SSH brute force from a known bad IP range, and raises a high-priority alert.

That's actual SIEM intelligence—not just syslog trivia.
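
Wazuh expresses this kind of logic in its own rule format, which is not shown here. To illustrate just the idea of correlation, here is a Python sketch under loud assumptions: the event dictionaries, the threshold of 5, and the "bad" network (an RFC 5737 documentation range) are all hypothetical, and a real rule would also constrain the time window.

```python
import ipaddress
from collections import defaultdict

# Placeholder "known bad" range; in practice this comes from a threat feed.
BAD_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def is_known_bad(ip: str) -> bool:
    """True if the address falls inside any listed network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BAD_NETWORKS)


def correlate(events, ssh_threshold=5):
    """Return hosts with an SSH brute force from a bad range AND failed sudo."""
    ssh_failures = defaultdict(int)   # host -> failed SSH logins from bad IPs
    sudo_failures = defaultdict(int)  # host -> failed sudo attempts
    for e in events:
        if e["type"] == "ssh_failed" and is_known_bad(e["src_ip"]):
            ssh_failures[e["host"]] += 1
        elif e["type"] == "sudo_failed":
            sudo_failures[e["host"]] += 1
    return sorted(
        host for host, n in ssh_failures.items()
        if n >= ssh_threshold and sudo_failures[host] > 0
    )


events = (
    [{"type": "ssh_failed", "host": "db-01", "src_ip": "203.0.113.9"}] * 5
    + [{"type": "sudo_failed", "host": "db-01"}]
    + [{"type": "ssh_failed", "host": "web-01", "src_ip": "198.51.100.4"}] * 5
    + [{"type": "sudo_failed", "host": "web-01"}]
)
print(correlate(events))  # ['db-01']
```

Neither signal alone fires here. web-01 sees the same number of failures, but from an address outside the bad range, so it stays quiet. That is the difference between correlation and counting.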


📊 Kibana: Pretty Pictures for Terrifying Problems

You don’t want to be parsing raw JSON at 2am. Kibana turns logs into dashboards, charts, and time-series heatmaps that can instantly tell you:

  • Who failed to log in the most last week
  • Which IPs are scanning your network
  • What your server CPU looked like during that DDoS
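
Under the hood, a chart like "who failed to log in the most last week" is an aggregation query. Here is a sketch of the Elasticsearch query body as a Python dict, plus a local equivalent so you can see what it computes. The field names follow ECS (event.category, event.outcome, user.name) and are assumptions about your mapping; the sample events are invented.

```python
from collections import Counter

# Query DSL body: failed authentications in the last 7 days, bucketed by user.
failed_logins_by_user = {
    "size": 0,  # return only the aggregation, not the matching documents
    "query": {
        "bool": {
            "filter": [
                {"term": {"event.category": "authentication"}},
                {"term": {"event.outcome": "failure"}},
                {"range": {"@timestamp": {"gte": "now-7d"}}},
            ]
        }
    },
    "aggs": {"by_user": {"terms": {"field": "user.name", "size": 10}}},
}


def top_failed_users(events, n=3):
    """Local equivalent of the terms aggregation above."""
    counts = Counter(e["user"] for e in events if e["outcome"] == "failure")
    return counts.most_common(n)


sample = [
    {"user": "root", "outcome": "failure"},
    {"user": "root", "outcome": "failure"},
    {"user": "alice", "outcome": "failure"},
    {"user": "root", "outcome": "failure"},
    {"user": "bob", "outcome": "success"},
]
print(top_failed_users(sample, n=2))  # [('root', 3), ('alice', 1)]
```

Knowing the query behind a visualization pays off when a dashboard looks wrong: you can run the body by hand and check whether the data or the chart is lying.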

Build dashboards for:

  • Authentication patterns
  • Privilege escalation attempts
  • Geolocation of inbound connections
  • Anomalous process execution

📈 Only half joking: with the right Kibana visualizations, your intern can spot suspicious activity that would otherwise stay buried in raw events, no $500k EDR license required.


🚨 ElastAlert: Alerts That Don’t Suck

SIEMs are only useful if they tell you when bad stuff happens.

ElastAlert (originally by Yelp, now maintained as the community fork ElastAlert 2) gives you programmable, repeatable alerting logic:

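# Fire when 5 matching events arrive within 2 minutes.
name: Brute Force SSH
type: frequency
index: logstash-*
num_events: 5
timeframe:
  minutes: 2
# The field and value below depend on how your pipeline labels failed logins.
filter:
- term:
    event.action: "authentication_failure"
# Send the alert by email; SMTP settings live in the main ElastAlert config.
alert:
- "email"
email:
- "security@example.com"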

Now your inbox becomes the battlefield.

📣 Don’t email everything. Noise kills effectiveness. Use thresholds and meaningful correlations.
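
If you want to reason about what a threshold like "5 events in 2 minutes" will and won't catch, a sliding window is the mental model. The Python sketch below is my own illustration of that idea, not ElastAlert's implementation; the class name and the reset-after-alert behavior are assumptions, and it expects timestamps to arrive in order.

```python
from collections import deque
from datetime import datetime, timedelta


class FrequencyRule:
    """Alert when `num_events` matching events fall inside `timeframe`."""

    def __init__(self, num_events=5, timeframe=timedelta(minutes=2)):
        self.num_events = num_events
        self.timeframe = timeframe
        self.window = deque()  # timestamps of recent matching events

    def observe(self, ts: datetime) -> bool:
        """Record one matching event; return True if the threshold is hit."""
        self.window.append(ts)
        # Drop events that have aged out of the window.
        while self.window and ts - self.window[0] > self.timeframe:
            self.window.popleft()
        if len(self.window) >= self.num_events:
            self.window.clear()  # one burst yields one alert, not one per event
            return True
        return False


start = datetime(2024, 1, 1, 3, 0, 0)

# Five failures ten seconds apart: the fifth one trips the rule.
rule = FrequencyRule()
burst = [rule.observe(start + timedelta(seconds=10 * i)) for i in range(5)]
print(burst)  # [False, False, False, False, True]

# Five failures one minute apart never fit in a two-minute window.
slow_rule = FrequencyRule()
slow = [slow_rule.observe(start + timedelta(minutes=i)) for i in range(5)]
print(any(slow))  # False
```

The second case is the trade-off to keep in mind: a patient attacker who stays under the rate slips past a pure frequency rule, which is why thresholds work best alongside correlation.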


💡 Architectural Considerations (a.k.a. The Things That Will Bite You)

Here’s where experience hits the fan:

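  • Storage: Elastic chews disk like a late-night stress eater. Use hot/warm/cold tiers with index lifecycle management, or offload older data (say, after 30 days) to cheaper storage such as S3 snapshots, OpenSearch, or Loki.
  • Scalability: Shipping straight from Filebeat to Logstash is fine for 10 servers. 100? Put a Kafka or Redis buffer between the shippers and Logstash so bursts get queued instead of dropped.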
  • Security: Your SIEM is a juicy target. Lock down endpoints, use TLS everywhere, and watch for lateral movement inside your log ingestion pipeline.
  • Data Retention Laws: Depending on your industry, you might have to keep logs for 90 days, 6 months, or 7 years. Plan accordingly.
  • Multi-Tenancy: Wazuh now supports it—but you’ll have to architect it yourself.
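
Before buying disks, do the arithmetic. The back-of-the-envelope estimator below is a rough sketch: the event rate, average event size, replica count, and especially the 1.2 indexing-overhead factor are placeholder assumptions, so measure your own numbers before trusting the result.

```python
def estimate_storage_gb(events_per_sec, avg_event_bytes, retention_days,
                        replicas=1, overhead=1.2):
    """Rough disk estimate in GB.

    raw volume x (primary + replica copies) x indexing overhead.
    `overhead` is a guessed multiplier for index structures, not a measured one.
    """
    seconds = retention_days * 86_400
    raw_bytes = events_per_sec * avg_event_bytes * seconds
    total_bytes = raw_bytes * (1 + replicas) * overhead
    return total_bytes / 1e9


# Assumed workload: 500 events/s at about 800 bytes each.
hot_30d = estimate_storage_gb(500, 800, retention_days=30)
archive_1y = estimate_storage_gb(500, 800, retention_days=365, replicas=0)

print(round(hot_30d), "GB for 30 days with one replica")
print(round(archive_1y), "GB for one year with no replicas")
```

With these assumed inputs, 30 days of replicated hot storage lands around 2.5 TB. Run it again with your retention requirement plugged in and the case for tiering or offloading tends to make itself.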

⚖️ Benefits vs. Caveats: Know What You’re Building

Feature              | Open Source SIEM              | Commercial SIEM
---------------------+-------------------------------+----------------------
Cost                 | Free (mostly)                 | $$$$$$$
Customizability      | Full control                  | Limited, often obscure
Setup Time           | High (weeks)                  | Low (days)
Alert Tuning         | Manual (but flexible)         | Managed (but rigid)
Compliance Features  | Add-on (Wazuh, Falco, etc.)   | Built-in
Support              | Forums, Discord, Docs         | 24/7 Phone Support

🔍 Translation: You’re trading time and expertise for dollars. But if you have time and expertise? You win.


🎯 Who Should Actually Build This?

✅ You, if:

  • You’re a small/mid org with basic compliance needs
  • You want to train a junior SOC team on fundamentals
  • You enjoy building security systems more than paying for them

❌ Not you, if:

  • You need GDPR/SOX/HIPAA audit trails tomorrow
  • You’re already drowning in unparsed log data
  • You expect plug-and-play security

This is the deep end. Don’t swim without floaties unless you know the water.


🧩 Bonus Integrations for the Ambitious

  • MITRE ATT&CK Mapping: Enrich Wazuh alerts using Sigma rules and MITRE tags.
  • SOAR Tools: TheHive, Shuffle, or custom PowerShell/Python responders.
  • EDR+SIEM: Use open source EDRs like Velociraptor or OpenEDR to feed into Wazuh for richer context.
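
As a taste of the ATT&CK mapping idea, here is a minimal enrichment sketch in Python. The rule names and alert shape are hypothetical; the technique IDs are real ATT&CK identifiers (T1110 is Brute Force, T1548.003 is Sudo and Sudo Caching), but which techniques your rules should map to is a judgment call for your environment.

```python
# Illustrative mapping from hypothetical internal rule names to ATT&CK IDs.
ATTACK_MAP = {
    "ssh_brute_force": ["T1110"],   # Brute Force
    "sudo_failed": ["T1548.003"],   # Abuse Elevation Control Mechanism: Sudo and Sudo Caching
}


def enrich(alert: dict) -> dict:
    """Return a copy of the alert with ATT&CK technique IDs attached."""
    enriched = dict(alert)  # shallow copy so the original alert is untouched
    enriched["mitre_techniques"] = ATTACK_MAP.get(alert.get("rule"), [])
    return enriched


alert = {"rule": "ssh_brute_force", "host": "db-01"}
print(enrich(alert)["mitre_techniques"])  # ['T1110']
```

Once alerts carry technique IDs, you can group dashboards by tactic and see which parts of the matrix you have no coverage for at all.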

🚀 Conclusion: Build It, Learn It, Own It

Building your own SIEM isn’t just an exercise in cost-saving. It’s a masterclass in understanding your own infrastructure, attack surfaces, and visibility gaps.

Yes, it’s work. Yes, it’s complex. But you’ll come out of it with something no vendor can sell you: real understanding.

🧠 Because in cybersecurity, visibility isn’t just power—it’s survival.


If this gave you ideas, nightmares, or the itch to finally replace that Excel spreadsheet you’ve been calling a threat model—go forth and build. And maybe drop me a log line or two.
