v1.0
Security Apache Log Project

Apache Suspicious Log Scanner

Intelligent threat detection and automatic blocking for Apache web servers

Python-based • nftables/iptables integration • Real-time monitoring • Zero dependencies

⚡ Quick Start Guide

Get up and running in under 5 minutes

1. Download & Extract

wget https://www.ruthner.at/security_scanner/apaches_secscanner.tgz
tar -xzf apaches_secscanner.tgz
cd apaches_secscanner

2. Run Initial Scan (Dry-Run)

python3 apache_suspicious_scan_multi.py --days 7 /var/log/apache2/access.log*

Reviews last 7 days without making firewall changes

3. Create Whitelist (Optional but Recommended)

sudo nano /etc/apache_scan.whitelist
# Add trusted IPs:
192.168.1.0/24
10.0.0.0/8

4. Enable Auto-Blocking

sudo python3 apache_suspicious_scan_multi.py \
  --days 7 --block --score-threshold 5 \
  --whitelist /etc/apache_scan.whitelist \
  /var/log/apache2/access.log*

Blocks IPs with score > 5, respects whitelist

📋 Requirements

🛡️ Overview & Key Features

The Apache Suspicious Log Scanner is a production-ready security tool that analyzes Apache HTTP access logs to identify and automatically block malicious activity. Built with Python's standard library, it requires no external dependencies and integrates seamlessly with modern Linux firewall systems.

🔍

Intelligent Detection

Multi-heuristic analysis detects SQL injection, path traversal, brute force, and scanning attempts

🚫

Auto-Blocking

Automatic firewall integration with nftables (preferred) and iptables fallback

📊

Real-Time Dashboard

Live HTML status page with packet counters and colored threat indicators

Live Mode

Tail active logs and block threats as they happen with rate limiting

🎯

Smart Scoring

Per-IP risk scoring combines multiple behavioral indicators

🔒

Flexible Whitelist

CIDR, wildcards, ranges — protect trusted networks with ease

✅ Production Ready

Zero external dependencies • Handles rotated & gzipped logs • Idempotent firewall setup • Tested on Raspberry Pi to enterprise servers

⚙️ How It Works

The scanner processes Apache Combined Log Format entries and builds a behavioral profile for each IP address:

  1. Log Parsing: Reads access.log files including rotated .gz archives
  2. Pattern Matching: Identifies SQL injection, LFI/RFI, sensitive file access, binary protocols
  3. Behavioral Analysis: Tracks request rates, 4xx ratios, unique paths, User-Agent diversity
  4. Risk Scoring: Combines heuristics into a per-IP threat score (0-20+)
  5. Automatic Response: Blocks high-risk IPs via nftables/iptables when score exceeds threshold
  6. Continuous Monitoring: Live mode tracks new threats and updates dashboard in real-time
🎓 Note

All detection is based on observable behavior in logs. The scanner does NOT inspect actual application code or database queries — it identifies attack attempts based on HTTP request patterns.
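
The parsing step can be pictured with a short sketch. This is not the scanner's own code; it is a minimal illustration that assumes the standard Apache Combined Log Format and uses hypothetical names (`COMBINED_RE`, `parse_line`). Requests containing escaped quotes are not handled here.

```python
import re
from datetime import datetime

# Apache Combined Log Format:
#   %h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-Agent}i"
# The trailing referer/agent pair is optional so Common Log Format also matches.
COMBINED_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def parse_line(line):
    """Return a dict for one access-log line, or None if it does not match."""
    m = COMBINED_RE.match(line)
    if m is None:
        return None
    entry = m.groupdict()
    entry["status"] = int(entry["status"])
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    # A normal request is "METHOD PATH PROTOCOL"; binary garbage may not split.
    parts = entry["request"].split(" ")
    entry["path"] = parts[1] if len(parts) == 3 else entry["request"]
    return entry


sample = ('203.0.113.57 - - [10/Oct/2024:13:55:36 +0000] '
          '"GET /../../etc/passwd HTTP/1.1" 404 196 "-" "curl/8.0"')
entry = parse_line(sample)
print(entry["ip"], entry["status"], entry["path"])
```

Lines that do not match return None, which would account for a gap between lines attempted and entries processed.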

🔎 Detection Methods

The scanner analyzes multiple behavioral indicators to identify malicious activity:

Behavioral Heuristics

Indicator            | Threshold                  | What It Means
PHP File Scanning    | > 20 unique .php files     | Vulnerability scanner probing for known exploits
4xx Error Ratio      | > 50% (min 20 requests)    | Path enumeration, forced browsing attacks
Request Rate         | > 0.5 req/sec (min 10 req) | Aggressive scanning, DoS attempts
User-Agent Diversity | ≥ 5 unique UAs per IP      | Botnet activity, proxy rotation
Path Discovery       | > 200 unique paths         | Extensive reconnaissance, directory brute force
Auth Failures        | > 50 (401/403 codes)       | Credential stuffing, brute force login
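
As an illustration of how these indicators could be collected, the sketch below aggregates parsed requests into one profile per IP. The function names, the tuple layout and the profile keys are assumptions made for this example and are not taken from the scanner's source.

```python
from collections import defaultdict


def build_profiles(entries):
    """Aggregate (ip, epoch_seconds, path, status, user_agent) tuples per IP."""
    profiles = defaultdict(lambda: {
        "reqs": 0, "err4xx": 0, "auth_fail": 0,
        "paths": set(), "php": set(), "agents": set(),
        "first": None, "last": None,
    })
    for ip, ts, path, status, agent in entries:
        p = profiles[ip]
        p["reqs"] += 1
        if 400 <= status < 500:
            p["err4xx"] += 1
        if status in (401, 403):
            p["auth_fail"] += 1
        bare = path.split("?", 1)[0]          # ignore the query string
        p["paths"].add(bare)
        if bare.lower().endswith(".php"):
            p["php"].add(bare)
        p["agents"].add(agent)
        p["first"] = ts if p["first"] is None else min(p["first"], ts)
        p["last"] = ts if p["last"] is None else max(p["last"], ts)
    return profiles


def request_rate(p):
    """Requests per second over the observed span (at least one second)."""
    return p["reqs"] / max(p["last"] - p["first"], 1)


entries = [
    ("203.0.113.57", 0, "/a.php?x=1", 404, "curl"),
    ("203.0.113.57", 2, "/b.php", 403, "curl"),
    ("203.0.113.57", 4, "/index.html", 200, "wget"),
]
profile = build_profiles(entries)["203.0.113.57"]
print(profile["reqs"], profile["err4xx"], len(profile["php"]), request_rate(profile))
```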

🚩 Per-Request Suspicion Flags

Attack Type       | Detection Patterns            | Examples
SQL Injection     | UNION SELECT, OR 1=1, sleep() | Database enumeration, blind SQLi
LFI/RFI/Traversal | ../, %2e%2e, php://           | File inclusion, path traversal attacks
Sensitive Files   | .env, .git, wp-login          | Configuration file access, admin panel probes
Binary/TLS        | \x16\x03 (TLS handshake)      | Protocol confusion, HTTPS to HTTP port
Unprintable Chars | Non-ASCII bytes in request    | Binary data, malformed requests
Overlong URI      | > 1000 characters             | Buffer overflow attempts, large payloads
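
A rough sketch of per-request flagging along the lines of the table above. The regular expressions are simplified stand-ins chosen for this example, not the scanner's actual pattern set, and `flag_request` is a hypothetical name. The request is URL-decoded first so that encoded forms such as %2e%2e are caught by the plain ../ pattern.

```python
import re
from urllib.parse import unquote_plus

# Simplified illustrative patterns; a real rule set would be broader.
FLAG_PATTERNS = [
    ("SQLi", re.compile(r"union\s+select|\bor\s+1\s*=\s*1\b|sleep\s*\(", re.I)),
    ("LFI", re.compile(r"\.\./|php://", re.I)),
    ("Sensitive", re.compile(r"/\.env|/\.git|wp-login", re.I)),
]


def flag_request(raw_request):
    """Return the list of short tags triggered by one logged request line."""
    flags = []
    decoded = unquote_plus(raw_request)
    for tag, pattern in FLAG_PATTERNS:
        if pattern.search(decoded):
            flags.append(tag)
    # Apache writes unprintable bytes as \xHH, so a TLS ClientHello sent to a
    # plain HTTP port shows up as the literal text \x16\x03 in the log.
    if raw_request.startswith("\\x16\\x03"):
        flags.append("Binary/TLS")
    parts = raw_request.split(" ")
    uri = parts[1] if len(parts) >= 2 else raw_request
    if len(uri) > 1000:
        flags.append("Overlong URI")
    return flags


print(flag_request("GET /index.php?id=1+UNION+SELECT+1 HTTP/1.1"))
print(flag_request("GET /%2e%2e/%2e%2e/etc/passwd HTTP/1.1"))
```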

📊 Scoring & Blocking Logic

Each IP receives a cumulative risk score based on observed behavior. By default, an IP is blocked when its score exceeds 5; adjust this with --score-threshold.

Score Calculation

Condition                               | Points   | Severity
> 20 unique PHP files accessed          | +5       | High
4xx error ratio > 50% (≥20 requests)    | +4       | High
Request rate > 0.5/sec (≥10 requests)   | +4       | High
≥ 5 unique User-Agents                  | +2       | Medium
> 200 unique paths accessed             | +2       | Medium
> 50 authentication failures (401/403)  | +1       | Medium
SQLi/LFI/Sensitive/Binary patterns      | +1 to +5 | Variable

⚠️ Threshold Tuning

Start with threshold 5 and monitor for false positives. High-traffic sites may need 7-10. Use --days 1 initially to test scoring accuracy.
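
To make the arithmetic concrete, here is a minimal sketch of the behavioral part of the score calculation. The point values come from the table above; the function name, the profile keys and the sample numbers are invented for illustration, and the variable +1 to +5 pattern points are left out because their exact weights are not specified here.

```python
def score_ip(p):
    """Apply the behavioral scoring rules to one per-IP profile dict."""
    score, reasons = 0, []
    if p["php_files"] > 20:
        score += 5
        reasons.append("many_php_files=%d" % p["php_files"])
    if p["reqs"] >= 20 and p["err4xx"] / p["reqs"] > 0.5:
        score += 4
        reasons.append("4xx_ratio=%.2f" % (p["err4xx"] / p["reqs"]))
    if p["reqs"] >= 10 and p["rate"] > 0.5:
        score += 4
        reasons.append("req_rate=%.2f" % p["rate"])
    if p["agents"] >= 5:
        score += 2
        reasons.append("many_user_agents=%d" % p["agents"])
    if p["unique_paths"] > 200:
        score += 2
        reasons.append("many_paths=%d" % p["unique_paths"])
    if p["auth_fail"] > 50:
        score += 1
        reasons.append("auth_failures=%d" % p["auth_fail"])
    return score, reasons


# Hypothetical scanner-like client: many PHP probes, mostly 404 responses.
probe = {"php_files": 67, "reqs": 412, "err4xx": 354, "rate": 0.1,
         "agents": 1, "unique_paths": 120, "auth_fail": 3}
score, reasons = score_ip(probe)
print(score, reasons)   # 9 ['many_php_files=67', '4xx_ratio=0.86']
```

With the default threshold of 5, this client would be blocked on behavioral points alone (5 + 4 = 9).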

Example Score Scenarios

🔴 Live Mode & Real-Time Blocking

Monitor active logs and respond to threats instantly with live mode capabilities:

Live Mode Features

Live Mode Command

sudo python3 apache_suspicious_scan_live.py \
  --days 7 --live --block --score-threshold 4 \
  --whitelist /etc/apache_scan.whitelist \
  --whitelist-mode ignore \
  --status-page /var/www/html/security_scanner/secscanstat.html \
  --status-interval 60 \
  --rate-limit 2 --rate-window 15 \
  /var/log/apache2/access.log

Rate Limiting Parameters

Parameter             | Default | Description
--rate-limit          | None    | Max requests per second before auto-block
--rate-window         | 10s     | Time window for rate calculation
--block-on-first-sqli | Off     | Immediate block on first SQL injection attempt
--block-on-sensitive  | Off     | Immediate block on .env, .git access

💡 Production Tip

Run live mode in a screen/tmux session or as a systemd service. Combine with logrotate to handle Apache log rotation seamlessly.
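
The rate-limiting parameters can be read as a sliding window per IP. The sketch below assumes that --rate-limit 2 --rate-window 15 means "more than 2 requests per second, averaged over the last 15 seconds"; the scanner's exact semantics may differ, and the class name `RateWindow` is hypothetical.

```python
from collections import defaultdict, deque


class RateWindow:
    """Track request timestamps per IP inside a sliding time window."""

    def __init__(self, limit, window):
        self.limit = limit      # allowed requests per second
        self.window = window    # window length in seconds
        self.hits = defaultdict(deque)

    def hit(self, ip, now):
        """Record one request; return True if the IP now exceeds the limit."""
        q = self.hits[ip]
        q.append(now)
        # Drop timestamps that have fallen out of the window.
        while q and q[0] <= now - self.window:
            q.popleft()
        return len(q) / self.window > self.limit


limiter = RateWindow(limit=2, window=15)
# 31 requests within 12 seconds: only the 31st pushes the average above 2/sec.
results = [limiter.hit("203.0.113.57", i * 0.4) for i in range(31)]
print(results.count(True), results[-1])
```

A deque keeps eviction cheap, since old timestamps are only ever removed from the left.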

📈 Status Dashboard

The scanner generates a modern, auto-refreshing HTML dashboard showing real-time security status.

Dashboard Sections

🚫 Blocked IPs

Currently blocked threats with scores, request counts, Hits (firewall packet counter), and colored reason tags

⚠️ Yellow Zone

Near-threshold IPs (score 3-5) that warrant monitoring

👀 Top Observed

High-activity IPs below blocking threshold — normal traffic or early reconnaissance

🚩 Recent Flags

Last 50 suspicious events with timestamps and short tags (SQLi, LFI, Sensitive)

Firewall Hits Counter

The dashboard displays Hits for each blocked IP: the number of packets dropped by the firewall since blocking began.

🔄 Refresh Interval

Default: 60 seconds. Adjust with --status-interval. Hits are polled via --fw-poll-interval (if supported in your version).

✅ Whitelist Configuration

Protect trusted networks and services from accidental blocking with flexible whitelist formats.

Whitelist File Format

# /etc/apache_scan.whitelist
# Lines starting with # are comments

# Full CIDR notation
192.168.1.0/24
10.0.0.0/8
172.16.0.0/12

# Shorthand notation
192.168.1
10

# Wildcards
203.0.113.*
198.51.100.1*

# IP ranges
203.0.113.10-203.0.113.20

# Single IPs
8.8.8.8
1.1.1.1
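
One way such a file could be interpreted is sketched below with Python's standard `ipaddress` and `fnmatch` modules. The interpretation of the shorthand entries (192.168.1 as 192.168.1.0/24, 10 as 10.0.0.0/8) and all function names are assumptions made for this example, not a description of the scanner's parser.

```python
import fnmatch
import ipaddress


def parse_whitelist_entry(entry):
    """Return a predicate (ip string -> bool) for one whitelist entry."""
    if "*" in entry:                      # wildcard, e.g. 203.0.113.*
        return lambda ip: fnmatch.fnmatchcase(ip, entry)
    if "-" in entry:                      # inclusive range, e.g. a.b.c.d-a.b.c.e
        lo, hi = (ipaddress.IPv4Address(x.strip()) for x in entry.split("-", 1))
        return lambda ip: lo <= ipaddress.IPv4Address(ip) <= hi
    if "/" not in entry:
        octets = entry.split(".")
        if len(octets) < 4:               # shorthand: pad with zeros, 8 bits per octet given
            entry = "%s/%d" % (".".join(octets + ["0"] * (4 - len(octets))),
                               8 * len(octets))
    net = ipaddress.ip_network(entry, strict=False)   # a single IP becomes a /32
    return lambda ip: ipaddress.ip_address(ip) in net


def load_whitelist(lines):
    """Build predicates from whitelist lines, skipping comments and blanks."""
    rules = []
    for line in lines:
        line = line.split("#", 1)[0].strip()
        if line:
            rules.append(parse_whitelist_entry(line))
    return rules


def is_whitelisted(ip, rules):
    return any(rule(ip) for rule in rules)


rules = load_whitelist([
    "# trusted networks", "172.16.0.0/12", "192.168.1", "10",
    "198.51.100.1*", "203.0.113.10-203.0.113.20", "8.8.8.8",
])
print(is_whitelisted("10.200.3.4", rules), is_whitelisted("8.8.4.4", rules))
```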

Whitelist Modes

Mode       | Behavior                                                      | Use Case
block-only | IPs are scored and visible, but never auto-blocked            | Default. Monitor internal networks for anomalies
ignore     | IPs are fully excluded from scoring, display, and blocking    | Monitoring systems, load balancers, trusted services

⚠️ Security Recommendation

Always whitelist your own management IPs, monitoring systems, and known partners. Test with block-only mode first to verify no legitimate traffic is flagged.

🔥 Firewall Integration

The scanner automatically configures and manages firewall rules with graceful fallback between backends.

Backend Selection

  1. nftables (preferred): Modern, efficient, with per-IP packet counters
  2. iptables (fallback): Legacy compatibility for older systems

nftables Configuration

The scanner creates an idempotent nftables infrastructure:

# Table and set
Table: inet apache_scan
Set: bad_ips (type ipv4_addr, interval flag)

# Chains
input_drop (base chain, priority -5)
per_ip_drop (for individual IP counters)

# Rules
ip saddr @bad_ips drop        # Main blocking rule
ip saddr 1.2.3.4 counter drop # Per-IP stats
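
The preference for nftables with an iptables fallback could look roughly like the following. This is a sketch under assumptions: the helper names are invented, backend detection is reduced to checking which binary is on the PATH, and the iptables rule shape simply mirrors the comment-tagged DROP rule used in the unblock example in the FAQ. The functions only build argument lists; nothing is executed.

```python
import ipaddress
import shutil


def pick_backend(which=shutil.which):
    """Prefer nftables; fall back to iptables; None if neither is installed."""
    if which("nft"):
        return "nftables"
    if which("iptables"):
        return "iptables"
    return None


def block_command(backend, ip):
    """Return the argv list that would block one IPv4 address."""
    ip = str(ipaddress.IPv4Address(ip))   # raises ValueError on anything else
    if backend == "nftables":
        return ["nft", "add", "element", "inet", "apache_scan", "bad_ips",
                "{ %s }" % ip]
    if backend == "iptables":
        return ["iptables", "-I", "INPUT", "-s", ip,
                "-m", "comment", "--comment", "apache-scan", "-j", "DROP"]
    raise ValueError("no supported firewall backend")


print(pick_backend())
print(block_command("nftables", "203.0.113.57"))
```

Validating the address before it reaches the command line keeps log-derived strings from being passed to the firewall tool unchecked.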

Manual nftables Setup (if needed)

# Create infrastructure (idempotent)
sudo nft add table inet apache_scan 2>/dev/null || true
sudo nft add set inet apache_scan bad_ips '{ type ipv4_addr; flags interval; }' 2>/dev/null || true
sudo nft add chain inet apache_scan per_ip_drop {} 2>/dev/null || true

# Recreate base chain
sudo nft delete chain inet apache_scan input_drop 2>/dev/null || true
sudo nft add chain inet apache_scan input_drop '{ type filter hook input priority -5; policy accept; }'
sudo nft add rule inet apache_scan input_drop jump per_ip_drop
sudo nft add rule inet apache_scan input_drop ip saddr @bad_ips drop
🚨 Important

The scanner requires root/sudo privileges to modify firewall rules. Always test with --days 1 in dry-run mode (without --block) before enabling --block in production.

🔧 Troubleshooting

Common Issues & Solutions

Issue: "nft: Could not process rule: No such file or directory"

Solution: The nftables infrastructure doesn't exist. Run the manual setup commands above, or let the scanner create it automatically (requires root).

Issue: Whitelist not working

Solution: Check file path and format. Ensure no trailing spaces. Use --whitelist-mode ignore for complete exclusion.

Issue: No logs processed

Solution: Verify the log path with ls -lh /var/log/apache2/access.log*. Check that the --days filter isn't too restrictive.

Issue: False positives blocking legitimate traffic

Solution: Increase --score-threshold to 7-10. Add legitimate bots to whitelist.

Reset Everything

# Remove all blocks
sudo python3 apache_suspicious_scan_multi.py --unblock

# Delete nftables infrastructure
sudo nft delete table inet apache_scan

📄 JSON Report Structure

Every scan generates a machine-readable JSON report for integration with SIEM, dashboards, or custom tools.

Structure Example

{
  "total_lines_attempted": 14523,
  "processed_entries": 14231,
  "unique_ips": 158,
  "top_suspicious": [
    {
      "ip": "203.0.113.57",
      "score": 14,
      "reqs": 412,
      "reasons": ["many_php_files=67", "4xx_ratio=0.86"]
    }
  ],
  "flagged_events": [
    {
      "ip": "203.0.113.57",
      "reason": "Possible LFI/RFI attempt",
      "detail": "/../../etc/passwd"
    }
  ]
}
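
A short sketch of consuming such a report from another tool. It relies only on the fields shown in the structure example; the helper name `ips_over_threshold` is made up for this illustration, and the report is embedded as a string so the snippet runs on its own (in practice it would be loaded from the report file).

```python
import json

REPORT_TEXT = """
{
  "total_lines_attempted": 14523,
  "processed_entries": 14231,
  "unique_ips": 158,
  "top_suspicious": [
    {"ip": "203.0.113.57", "score": 14, "reqs": 412,
     "reasons": ["many_php_files=67", "4xx_ratio=0.86"]}
  ],
  "flagged_events": [
    {"ip": "203.0.113.57", "reason": "Possible LFI/RFI attempt",
     "detail": "/../../etc/passwd"}
  ]
}
"""


def ips_over_threshold(report, threshold=5):
    """List (ip, score, reasons) for entries whose score exceeds the threshold."""
    return [(e["ip"], e["score"], e["reasons"])
            for e in report.get("top_suspicious", [])
            if e["score"] > threshold]


report = json.loads(REPORT_TEXT)
for ip, score, reasons in ips_over_threshold(report):
    print(ip, score, ", ".join(reasons))
```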

📖 Command Examples

Basic Analysis

# Dry-run: analyze last 7 days, no firewall changes
python3 apache_suspicious_scan_multi.py --days 7 /var/log/apache2/access.log*

# Include all rotated logs (automatic .gz support)
python3 apache_suspicious_scan_multi.py --days 30 /var/log/apache2/access.log*

# Custom output file
python3 apache_suspicious_scan_multi.py --days 7 --out /tmp/scan_report.json /var/log/apache2/access.log*

Production Deployment

# Standard blocking with whitelist (recommended)
sudo python3 apache_suspicious_scan_multi.py \
  --days 7 --block --score-threshold 5 \
  --whitelist /etc/apache_scan.whitelist \
  /var/log/apache2/access.log*

# Aggressive blocking (lower threshold)
sudo python3 apache_suspicious_scan_multi.py \
  --days 3 --block --score-threshold 3 \
  --whitelist /etc/apache_scan.whitelist \
  /var/log/apache2/access.log*

# Conservative blocking (higher threshold)
sudo python3 apache_suspicious_scan_multi.py \
  --days 14 --block --score-threshold 8 \
  --whitelist /etc/apache_scan.whitelist \
  /var/log/apache2/access.log*

Live Monitoring

# Basic live mode with status page
sudo python3 apache_suspicious_scan_live.py \
  --live --block --score-threshold 5 \
  --status-page /var/www/html/security_scanner/secscanstat.html \
  /var/log/apache2/access.log

# Advanced live mode with rate limiting
sudo python3 apache_suspicious_scan_live.py \
  --days 7 --live --block --score-threshold 4 \
  --whitelist /etc/apache_scan.whitelist \
  --whitelist-mode ignore \
  --status-page /var/www/html/security_scanner/secscanstat.html \
  --status-interval 60 \
  --rate-limit 2 --rate-window 15 \
  --block-on-first-sqli \
  /var/log/apache2/access.log

# Run in screen/tmux (recommended for production)
sudo screen -dmS apache-scanner python3 apache_suspicious_scan_live.py \
  --live --block --score-threshold 5 \
  --whitelist /etc/apache_scan.whitelist \
  --status-page /var/www/html/security_scanner/secscanstat.html \
  /var/log/apache2/access.log

Maintenance

# Remove all firewall blocks
sudo python3 apache_suspicious_scan_multi.py --unblock

# Analyze specific IP from logs
# (-h drops filename prefixes, -F matches the dots literally, zgrep also reads .gz files)
zgrep -hF "1.2.3.4" /var/log/apache2/access.log* | python3 apache_suspicious_scan_multi.py --days 999 -

# Check what would be blocked (dry-run)
python3 apache_suspicious_scan_multi.py --days 1 --score-threshold 3 /var/log/apache2/access.log*

Cron Job (Hourly Scan)

# /etc/cron.d/apache-scanner
# cron does not support backslash line continuation, so the job must stay on one line
0 * * * * root /usr/bin/python3 /home/pi/sec_scanner/apache_suspicious_scan_multi.py --days 1 --block --score-threshold 5 --whitelist /etc/apache_scan.whitelist /var/log/apache2/access.log* > /var/log/apache_scanner.log 2>&1

📁 File Locations

Path                          | Purpose                                 | Permissions
/var/log/apache2/access.log*  | Apache log files (incl. .gz rotations)  | 644 (readable)
apache_suspicious_report.json | Scan results and statistics             | 644
/etc/apache_scan.whitelist    | Trusted IP whitelist                    | 644
apache_suspicious_scan_*.py   | Scanner scripts                         | 755 (executable)
secscanstat.html              | Live status dashboard                   | 644 (web-accessible)

❓ Frequently Asked Questions

Does this work with Nginx or other web servers?

The scanner currently targets the Apache Combined and Common log formats. Nginx support should be possible with minor modifications to the regex patterns.

Will it block search engine bots like GoogleBot?

Legitimate bots typically have low scores (< 3) unless they trigger specific attack patterns. Add known bot IPs to whitelist if needed.

How much CPU/memory does live mode use?

Minimal — typically < 50MB RAM and < 5% CPU on a Raspberry Pi 4. Scales well even with high-traffic sites.

Can I run multiple instances simultaneously?

Yes, but only one should have --block enabled to avoid firewall conflicts. Use different --out paths.

How do I unblock a specific IP?

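# nftables: remove the address from the blocking set
sudo nft delete element inet apache_scan bad_ips '{ 1.2.3.4 }'

# nftables: if a per-IP counter rule exists in per_ip_drop, list it with its handle and delete it too
sudo nft -a list chain inet apache_scan per_ip_drop
sudo nft delete rule inet apache_scan per_ip_drop handle HANDLE  # replace HANDLE with the number from the listing

# iptables
sudo iptables -D INPUT -s 1.2.3.4 -m comment --comment apache-scan -j DROP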

Does it handle log rotation automatically?

Yes — reads all matching files including .gz archives. Live mode gracefully handles Apache log rotation.
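
Reading plain and gzipped rotations through one code path is straightforward with the standard library. The following is a minimal sketch with hypothetical helper names (`open_log`, `iter_lines`), not the scanner's own reader; files are visited in name order, which is not necessarily chronological.

```python
import glob
import gzip


def open_log(path):
    """Open a log file as text, transparently handling .gz rotations."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8", errors="replace")
    return open(path, "r", encoding="utf-8", errors="replace")


def iter_lines(pattern):
    """Yield every line from all files matching a glob such as access.log*."""
    for path in sorted(glob.glob(pattern)):
        with open_log(path) as fh:
            for line in fh:
                yield line.rstrip("\n")
```

Decoding with errors="replace" keeps a single malformed byte sequence in a request from aborting the whole scan.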

Can I integrate with Fail2Ban?

Yes — both can coexist. Fail2Ban handles authentication failures; this scanner handles broader attack patterns.

What about IPv6 support?

Currently IPv4 only. IPv6 support is planned for a future release.

📦 Download

Apache Suspicious Log Scanner v1.0

Complete package with scanner scripts, whitelist templates, and documentation. Ready for production deployment.

📥 Download apaches_secscanner.tgz

Size: ~50KB • Python 3.6+ • No dependencies • Open Source

Package Contents

Installation

# Download and extract
wget https://www.ruthner.at/security_scanner/apaches_secscanner.tgz
tar -xzf apaches_secscanner.tgz
cd apaches_secscanner

# Make executable
chmod +x apache_suspicious_scan_*.py

# Test installation
python3 apache_suspicious_scan_multi.py --help

# Copy whitelist template
sudo cp apache_scan.whitelist.example /etc/apache_scan.whitelist
sudo nano /etc/apache_scan.whitelist  # Add your trusted IPs

Recommended Deployment

# Create scanner directory
sudo mkdir -p /home/pi/sec_scanner
sudo mv apache_suspicious_scan_*.py /home/pi/sec_scanner/

# Create web-accessible status directory
sudo mkdir -p /var/www/html/security_scanner
sudo chown www-data:www-data /var/www/html/security_scanner

# Initial scan (dry-run)
python3 /home/pi/sec_scanner/apache_suspicious_scan_multi.py \
  --days 7 /var/log/apache2/access.log*

# Deploy live monitoring
sudo screen -dmS apache-scanner python3 \
  /home/pi/sec_scanner/apache_suspicious_scan_live.py \
  --live --block --score-threshold 5 \
  --whitelist /etc/apache_scan.whitelist \
  --status-page /var/www/html/security_scanner/secscanstat.html \
  /var/log/apache2/access.log

⚖️ License & Disclaimer

✅ Free to Use

This software is provided free of charge for personal, educational, and commercial use.

🚨 Disclaimer

This software is provided AS-IS, without warranty or guarantees of any kind, express or implied. You are solely responsible for:

  • Correct configuration and testing before production use
  • Monitoring for false positives that may block legitimate traffic
  • Compliance with applicable laws and regulations
  • Any damages, data loss, or service disruption resulting from use

The author assumes no liability for misconfiguration, data loss, service interruption, or any other damages arising from the use of this software.

Use at your own risk. Always test thoroughly in a non-production environment before deploying to production systems.