Overview
The Apache Suspicious Log Scanner is a Python-based security tool designed to analyze Apache HTTP access logs for potential malicious activity. It detects suspicious IP addresses, rates their behavior with a score, and can automatically block high-risk IPs via the system firewall (nftables preferred, iptables fallback).
Typical use cases:
- Detect mass scans, brute-force attempts, or exploit probes
- Identify malformed or binary requests to HTTP ports
- Track potential SQL injection, LFI/RFI, or path traversal attempts
- Auto-block bad actors from your network
Key Features
- Multi-log support: reads access.log, rotated files, and .gz archives.
- Date filtering: --days N limits analysis to the last N days.
- Scoring system: combines heuristics into a single risk score per IP.
- Automatic firewall blocking: optional with --block; nftables preferred, iptables fallback.
- Whitelist support: never block trusted IPs (plain text file, one IP per line).
- Readable console report & machine-readable JSON export.
- Follow mode: real-time monitoring with --follow.
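Multi-log support comes down to picking the right opener per file. The following is a minimal sketch, not the scanner's actual code; the function name iter_log_lines is hypothetical, and it assumes rotated archives are recognizable by a .gz suffix.

```python
import gzip
from pathlib import Path
from typing import Iterator


def iter_log_lines(path: str) -> Iterator[str]:
    """Yield lines from a plain or gzip-compressed log file (hypothetical helper)."""
    # Assumption: compressed rotations end in ".gz"; everything else is plain text.
    opener = gzip.open if Path(path).suffix == ".gz" else open
    # errors="replace" keeps stray binary bytes in a request from aborting the read.
    with opener(path, "rt", encoding="utf-8", errors="replace") as fh:
        for line in fh:
            yield line.rstrip("\n")
```

Because both openers are used in text mode, callers can treat access.log and access.log.2.gz identically.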
What the Scanner Checks
The scanner parses the Combined Log Format:
IP IDENT AUTH [DATE] "METHOD PATH HTTP/1.1" STATUS BYTES "REFERER" "USER_AGENT"
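A line in this format can be split with a single regular expression. The sketch below is one possible approach and not necessarily how the scanner parses; LOG_RE and parse_line are hypothetical names, and the timestamp layout (for example 10/Oct/2025:13:55:36 +0200) is assumed to be Apache's default.

```python
import re
from datetime import datetime

# One Combined Log Format entry. Quoted fields may contain backslash-escaped
# quotes, so they are matched as "any non-quote char or an escaped char".
LOG_RE = re.compile(
    r'^(?P<ip>\S+) (?P<ident>\S+) (?P<auth>\S+) '
    r'\[(?P<date>[^\]]+)\] '
    r'"(?P<request>(?:[^"\\]|\\.)*)" '
    r'(?P<status>\d{3}) (?P<bytes>\S+) '
    r'"(?P<referer>(?:[^"\\]|\\.)*)" '
    r'"(?P<agent>(?:[^"\\]|\\.)*)"'
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not match (hypothetical helper)."""
    m = LOG_RE.match(line)
    if m is None:
        return None
    entry = m.groupdict()
    entry["status"] = int(entry["status"])
    try:
        # Assumed default Apache timestamp, e.g. 10/Oct/2025:13:55:36 +0200
        entry["time"] = datetime.strptime(entry["date"], "%d/%b/%Y:%H:%M:%S %z")
    except ValueError:
        return None
    parts = entry["request"].split(" ")
    if len(parts) == 3:
        entry["method"], entry["path"], entry["protocol"] = parts
    else:
        # Malformed or binary request: keep the raw text as the path.
        entry["method"], entry["path"], entry["protocol"] = None, entry["request"], None
    return entry
```

Returning None for unparseable lines would account for the gap between attempted lines and processed entries in the JSON report.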
For each IP, it tracks patterns and behavior:
- High number of distinct PHP file requests: scanning for vulnerabilities.
- Frequent 4xx errors: indicates enumeration or probing.
- High request rate: requests per second over observed span.
- Many unique User-Agents per IP: UA-rotation to evade filters.
- Too many unique paths: aggressive directory/resource discovery.
- Excessive auth failures: repeated 401/403 results.
Per-Request Suspicion Flags
| Type | Example Pattern | Description |
|---|---|---|
| SQL Injection | UNION SELECT, sleep(5), or 1=1 | Attempts to inject SQL commands |
| LFI / RFI / Traversal | ../, php://, /etc/passwd, %2e%2e | Access to local/remote resources via file inclusion |
| Sensitive filenames | .env, .git, wp-login.php, phpmyadmin, config.php | Known sensitive entry points |
| Binary/TLS bytes | \x16\x03\x01 | Encrypted handshake sent to HTTP port |
| Unprintable characters | (binary garbage) | Protocol misuse or scanning |
| Overlong URI | >1000 characters | Exploit payload / buffer overflow attempt |
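The table above could be expressed as a small set of regular expressions plus two length and byte checks. This is an illustrative sketch only: the patterns are built from the table's examples, not from the scanner's source, and every name here (FLAG_PATTERNS, flag_request, the reason labels other than "Possible LFI/RFI attempt") is an assumption.

```python
import re
from typing import List
from urllib.parse import unquote

MAX_URI_LEN = 1000  # "Overlong URI" limit taken from the table

# Patterns derived from the table's examples (assumed, not the scanner's own).
FLAG_PATTERNS = [
    ("Possible SQL injection",
     re.compile(r"union\s+select|sleep\(\d+\)|\bor\s+1=1", re.I)),
    ("Possible LFI/RFI attempt",
     re.compile(r"\.\./|php://|/etc/passwd|%2e%2e", re.I)),
    ("Sensitive filename",
     re.compile(r"\.env|\.git|wp-login\.php|phpmyadmin|config\.php", re.I)),
]


def flag_request(request: str) -> List[str]:
    """Return the suspicion labels that apply to one logged request string."""
    reasons = []
    # Apache writes raw bytes as literal \xNN escapes; a TLS handshake
    # therefore shows up in the log as the text \x16\x03...
    if request.startswith("\\x16\\x03"):
        reasons.append("Binary/TLS bytes")
    elif "\\x" in request or any(not c.isprintable() for c in request):
        reasons.append("Unprintable characters")
    decoded = unquote(request)  # also check the percent-decoded form
    for label, pattern in FLAG_PATTERNS:
        if pattern.search(request) or pattern.search(decoded):
            reasons.append(label)
    if len(request) > MAX_URI_LEN:
        reasons.append("Overlong URI")
    return reasons
```

Checking both the raw and the percent-decoded request catches encoded traversal such as %2e%2e as well as the plain ../ form.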
Scoring & Blocking Logic
| Condition | Score |
|---|---|
| >20 unique .php requests | +5 |
| 4xx ratio > 0.5 (≥20 requests) | +4 |
| >0.5 requests/sec (≥10 requests) | +4 |
| ≥5 unique User-Agents | +2 |
| >200 unique paths | +2 |
| >50 401/403 responses | +1 |
IPs are sorted by total score. By default, IPs with a score > 5 are considered malicious (configurable via --score-threshold).
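The scoring table translates almost directly into code. The sketch below assumes per-IP counters have already been collected; the IpStats class, its field names, and the reason strings other than many_php_files and 4xx_ratio (which appear in the JSON report) are hypothetical. Treating a span shorter than one second as one second is also an assumption, made to avoid dividing by zero.

```python
from dataclasses import dataclass


@dataclass
class IpStats:
    """Per-IP counters (hypothetical structure)."""
    requests: int = 0
    php_files: int = 0        # distinct .php paths requested
    status_4xx: int = 0       # responses with a 4xx status
    span_seconds: float = 0.0 # time between first and last request
    user_agents: int = 0      # distinct User-Agent strings
    unique_paths: int = 0
    auth_failures: int = 0    # 401/403 responses


def score_ip(s: IpStats):
    """Apply the scoring table; return (score, reasons)."""
    score, reasons = 0, []
    if s.php_files > 20:
        score += 5
        reasons.append(f"many_php_files={s.php_files}")
    if s.requests >= 20 and s.status_4xx / s.requests > 0.5:
        score += 4
        reasons.append(f"4xx_ratio={s.status_4xx / s.requests:.2f}")
    rate = s.requests / max(s.span_seconds, 1.0)  # assumed 1 s floor
    if s.requests >= 10 and rate > 0.5:
        score += 4
        reasons.append(f"req_rate={rate:.2f}")
    if s.user_agents >= 5:
        score += 2
        reasons.append(f"user_agents={s.user_agents}")
    if s.unique_paths > 200:
        score += 2
        reasons.append(f"unique_paths={s.unique_paths}")
    if s.auth_failures > 50:
        score += 1
        reasons.append(f"auth_failures={s.auth_failures}")
    return score, reasons


def is_malicious(score: int, threshold: int = 5) -> bool:
    """Default rule from the text: strictly greater than the threshold."""
    return score > threshold
```

With these weights the maximum possible score is 18, and an IP needs at least two triggered conditions, or the .php rule plus any other, to exceed the default threshold of 5.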
Firewall Integration
nftables (preferred)
- Table: inet apache_scan
- Set: bad_ips
- Chain: input_drop with rule ip saddr @bad_ips drop
- Drops packets early (priority -5)
iptables (fallback)
iptables -I INPUT -s <IP> -m comment --comment apache-scan -j DROP
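For illustration, the command for either backend can be built as an argument list and handed to subprocess.run without a shell. This sketch only constructs the command and does not execute it; block_command is a hypothetical name, the nft add element form is standard nft syntax applied to the table and set named above, and restricting input to IPv4 is an assumption based on the ip saddr rule.

```python
import ipaddress
import shutil


def block_command(ip, backend=None):
    """Build the argv that would block one IPv4 address (hypothetical helper)."""
    # Validation doubles as injection protection: anything that is not a
    # plain IPv4 address raises ValueError before a command is built.
    addr = str(ipaddress.IPv4Address(ip))
    if backend is None:
        # Prefer nftables when the nft binary is on PATH, else fall back.
        backend = "nft" if shutil.which("nft") else "iptables"
    if backend == "nft":
        return ["nft", "add", "element", "inet", "apache_scan", "bad_ips",
                "{", addr, "}"]
    return ["iptables", "-I", "INPUT", "-s", addr,
            "-m", "comment", "--comment", "apache-scan", "-j", "DROP"]
```

Passing the resulting list to subprocess.run(cmd, check=True) as root would apply the block; the apache-scan comment on the iptables rule is what lets a later unblock pass find scanner-added rules.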
Whitelist
Provide trusted IPs in a file (one per line). Lines starting with # are ignored.
# /etc/apache_scan.whitelist
127.0.0.1
192.168.0.0/16
203.0.113.42
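A whitelist in this form can be loaded with Python's ipaddress module. The following is a sketch under the assumption, suggested by the 192.168.0.0/16 entry above, that CIDR ranges are accepted alongside single addresses; load_whitelist and is_whitelisted are hypothetical names.

```python
import ipaddress


def load_whitelist(lines):
    """Parse whitelist lines into network objects, skipping blanks and # comments."""
    networks = []
    for raw in lines:
        entry = raw.strip()
        if not entry or entry.startswith("#"):
            continue
        # A bare address becomes a /32 (or /128) network; strict=False
        # tolerates entries with host bits set, such as 192.168.1.5/16.
        networks.append(ipaddress.ip_network(entry, strict=False))
    return networks


def is_whitelisted(ip, networks):
    """True if the address falls inside any whitelisted network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)
```

Usage would look like `nets = load_whitelist(open("/etc/apache_scan.whitelist"))`, followed by an `is_whitelisted(ip, nets)` check before any block is issued.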
Unblocking
sudo python3 apache_suspicious_scan_multi.py --unblock
JSON Report Structure
{
"total_lines_attempted": 14523,
"processed_entries": 14231,
"unique_ips": 158,
"top_suspicious": [
{ "ip": "203.0.113.57", "score": 14, "reqs": 412, "reasons": ["many_php_files=67","4xx_ratio=0.86"] }
],
"flagged_events": [
{ "ip": "203.0.113.57", "reason": "Possible LFI/RFI attempt", "detail": "/../../etc/passwd" }
]
}
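Since the report is plain JSON, it is easy to post-process before deciding whether to block. The sketch below relies only on the keys shown in the structure above; the function names are hypothetical.

```python
import json


def load_report(path="apache_suspicious_report.json"):
    """Read the scanner's JSON report from disk."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def ips_over_threshold(report, threshold=5):
    """Return IPs from top_suspicious whose score exceeds the threshold."""
    return [item["ip"] for item in report.get("top_suspicious", [])
            if item["score"] > threshold]


def events_for(report, ip):
    """Return the flagged events recorded for one IP."""
    return [e for e in report.get("flagged_events", []) if e["ip"] == ip]
```

Listing the flagged events for each high-scoring IP is a quick way to spot legitimate crawlers before re-running with --block.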
Command Examples
# 1) Analyze last 7 days (dry-run)
python3 apache_suspicious_scan_multi.py --days 7 /var/log/apache2/access.log*
# 2) Analyze & block (score > 5) with whitelist
sudo python3 apache_suspicious_scan_multi.py \
--days 7 --block --score-threshold 5 \
--whitelist /etc/apache_scan.whitelist \
/var/log/apache2/access.log*
# 3) Remove all firewall blocks
sudo python3 apache_suspicious_scan_multi.py --unblock
# 4) Follow current log file in real time
sudo python3 apache_suspicious_scan_multi.py --follow /var/log/apache2/access.log
File Locations & Outputs
| Path | Purpose |
|---|---|
| /var/log/apache2/access.log* | Log sources (including .gz rotations) |
| apache_suspicious_report.json | Summary of suspicious activity (JSON) |
| /etc/apache_scan.whitelist | Trusted IPs (never block) |
| /home/pi/sec_scanner/apache_suspicious_scan_multi.py | Scanner script |
| /home/pi/sec_scanner/info.html | This documentation (HTML) |
Security Notes
- Run as root only when necessary (firewall actions).
- Test without --block first so false positives are caught before anything is blocked.
- Review the JSON report for legitimate bots (Google, Bing) before blocking.
- Maintain a whitelist of internal IPs and monitoring systems.
- For persistence across reboots, add nftables rules to /etc/nftables.conf.
Typical Workflow
1) Run dry (e.g., --days 7)
2) Review top IPs & flagged events
3) Adjust thresholds / update whitelist
4) Re-run with --block to enforce
5) (Optional) Schedule via cron
Feature Summary
| Feature | Description |
|---|---|
| Multi-file parsing | Handles rotated & gzipped logs |
| Date filtering | --days N |
| Suspicious pattern detection | SQLi, LFI/RFI, brute-force, traversal |
| Intelligent scoring | Combines heuristics into a total risk score |
| JSON output | Machine-readable summary |
| Auto firewall blocking | nftables preferred, iptables fallback |
| Whitelist | Prevents blocking trusted sources |
| Unblock mode | Removes scanner-added rules safely |
| Real-time mode | --follow for live watching |
Author
Wolfgang Ruthner — 2025
Security Apache Log Project — Austria 🇦🇹
Website: ruthner.at