Lab 12 – Log Analysis & Anomaly Detection
Course: SCIA-120 · Introduction to Secure Computing
Topic: Security Practices – Log Analysis, SIEM Fundamentals & Incident Detection
Difficulty: ★★ Beginner–Intermediate
Estimated Time: 60–75 minutes
Related Reading: Chapter 14 – Security Practices, Risk Management, and Compliance
Overview
"You can't defend what you can't see." Logs are a security professional's primary forensic tool: they record what happened, when, and from where. In this lab you will generate and analyze web server logs, detect brute-force attack patterns, identify suspicious activity using command-line tools, and build a basic anomaly detection script, all using Docker.
Learning Objectives
- Understand what information web server access logs contain.
- Use `grep`, `awk`, `sort`, and `uniq` to analyze log data.
- Identify brute-force attack patterns in authentication logs.
- Detect unusual activity such as high-frequency requests and 404 scan patterns.
- Write a basic Python script to automate log anomaly detection.
Prerequisites
- Docker Desktop installed and running.
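Part 1 – Generate a Web Server with Real Logs
Step 1.1 – Start an Nginx Server
Start an Nginx container named `log-server` with its port 80 published on host port 8080. The remaining steps assume that container name and port.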
Step 1.2 – Generate Realistic Traffic
```bash
# Note: inside the curl container, localhost refers to that container, not the host.
# If no requests reach Nginx, try http://host.docker.internal:8080 (Docker Desktop).

# Normal user browsing
for path in "/" "/about" "/contact" "/products" "/faq"; do
  docker run --rm curlimages/curl curl -s "http://localhost:8080$path" -o /dev/null
done

# 404 errors (scanner behavior: probing for common files)
for path in "/admin" "/.env" "/wp-login.php" "/phpmyadmin" "/.git/config" "/backup.zip"; do
  docker run --rm curlimages/curl curl -s "http://localhost:8080$path" -o /dev/null
done

# Simulated brute-force (many requests from same IP)
for i in $(seq 1 20); do
  docker run --rm curlimages/curl curl -s "http://localhost:8080/login?attempt=$i" -o /dev/null
done
```
Step 1.3 – Extract the Logs
The official Nginx image writes access logs to stdout, so `docker logs` captures them. Filter out startup messages by keeping only lines that contain `HTTP/`:
```bash
docker logs log-server 2>&1 | grep 'HTTP/' > /tmp/nginx_access.log
wc -l /tmp/nginx_access.log
cat /tmp/nginx_access.log
```
📸 Screenshot checkpoint: Take a screenshot showing the log file created with a line count.
Part 2 – Understanding Log Format
Step 2.1 – View Raw Log Entries
Nginx Combined Log Format:
| Field | Meaning |
|---|---|
| `127.0.0.1` | Client IP address |
| `[date]` | Timestamp |
| `"GET / HTTP/1.1"` | HTTP method, path, protocol |
| `200` | Response code (200=OK, 404=Not Found, 401=Unauthorized) |
| `615` | Response size in bytes |
| `"-"` | Referrer |
| `"curl/7.x"` | User agent |
📸 Screenshot checkpoint: Take a screenshot of several raw log lines and annotate each field in your submission.
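The field layout in the table above can also be expressed as a regular expression. The following is a minimal sketch, not part of the lab's required steps: the names `COMBINED_RE` and `parse_line` and the sample line are made up for illustration. It assumes the standard combined format, which places two extra fields (ident and remote user, usually `-`) between the client IP and the timestamp.

```python
import re

# Regex for the Nginx "combined" log format. The ident and remote-user
# fields sit between the client IP and the timestamp and are usually "-".
COMBINED_RE = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    m = COMBINED_RE.match(line)
    return m.groupdict() if m else None

# Hypothetical sample line, not taken from a real server.
sample = ('172.17.0.1 - - [17/Apr/2025:10:00:01 +0000] '
          '"GET /about HTTP/1.1" 404 153 "-" "curl/8.5.0"')
fields = parse_line(sample)
print(fields["ip"], fields["status"], fields["request"])
```

Named groups keep the field-to-meaning mapping visible in the code, which can help when annotating raw lines by hand.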
Part 3 – Basic Log Analysis Commands
Run a container with the log file mounted for analysis, then work through the steps below.
Step 3.1 – Count Requests by HTTP Status Code
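One way to produce or cross-check this distribution is a short Python sketch. It assumes the combined log format from Part 2, where the status code is the ninth whitespace-separated field (the same position `awk` would call `$9`). The function name `status_counts` and the sample lines are hypothetical.

```python
from collections import Counter

def status_counts(lines):
    """Count responses per HTTP status code (9th whitespace-separated field)."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        # Skip lines that are too short or whose 9th field is not numeric.
        if len(parts) >= 9 and parts[8].isdigit():
            counts[parts[8]] += 1
    return counts

# Hypothetical sample lines for illustration only.
sample = [
    '172.17.0.1 - - [17/Apr/2025:10:00:01 +0000] "GET / HTTP/1.1" 200 615 "-" "curl/8.5.0"',
    '172.17.0.1 - - [17/Apr/2025:10:00:02 +0000] "GET /.env HTTP/1.1" 404 153 "-" "curl/8.5.0"',
    '172.17.0.1 - - [17/Apr/2025:10:00:03 +0000] "GET / HTTP/1.1" 200 615 "-" "curl/8.5.0"',
]
for code, n in status_counts(sample).most_common():
    print(n, code)
```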
📸 Screenshot checkpoint: Take a screenshot showing the HTTP status code distribution.
Step 3.2 – Find the Most Requested URLs
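When ranking URLs, the simulated `/login?attempt=N` requests from Part 1 each count as a distinct URL unless the query string is removed. The sketch below groups requests by path only. It assumes the request path is the seventh whitespace-separated field of a combined-format line; `top_paths` and the sample data are hypothetical names chosen for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit

def top_paths(lines, n=5):
    """Return the n most requested paths, ignoring query strings."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) >= 7:
            # urlsplit separates "/login?attempt=1" into path and query.
            counts[urlsplit(parts[6]).path] += 1
    return counts.most_common(n)

# Hypothetical sample lines for illustration only.
sample = [
    '172.17.0.1 - - [17/Apr/2025:10:00:01 +0000] "GET /login?attempt=1 HTTP/1.1" 404 153 "-" "curl/8.5.0"',
    '172.17.0.1 - - [17/Apr/2025:10:00:02 +0000] "GET /login?attempt=2 HTTP/1.1" 404 153 "-" "curl/8.5.0"',
    '172.17.0.1 - - [17/Apr/2025:10:00:03 +0000] "GET /about HTTP/1.1" 404 153 "-" "curl/8.5.0"',
]
print(top_paths(sample))
```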
📸 Screenshot checkpoint: Take a screenshot showing the top requested URLs.
Step 3.3 – Count Requests by IP Address
Step 3.4 – Find All 404 Errors (Scanner Behavior)
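As a rough illustration of this step, the following Python sketch collects the distinct paths that returned 404, in the order they were first probed. It assumes combined-format lines (path in the seventh field, status in the ninth); `not_found_paths` and the sample lines are hypothetical.

```python
def not_found_paths(lines):
    """Return distinct paths that produced a 404, in order of first appearance."""
    seen = []
    for line in lines:
        parts = line.split()
        if len(parts) >= 9 and parts[8] == "404" and parts[6] not in seen:
            seen.append(parts[6])
    return seen

# Hypothetical sample lines for illustration only.
sample = [
    '172.17.0.1 - - [17/Apr/2025:10:00:01 +0000] "GET /.env HTTP/1.1" 404 153 "-" "curl/8.5.0"',
    '172.17.0.1 - - [17/Apr/2025:10:00:02 +0000] "GET / HTTP/1.1" 200 615 "-" "curl/8.5.0"',
    '172.17.0.1 - - [17/Apr/2025:10:00:03 +0000] "GET /admin HTTP/1.1" 404 153 "-" "curl/8.5.0"',
    '172.17.0.1 - - [17/Apr/2025:10:00:04 +0000] "GET /.env HTTP/1.1" 404 153 "-" "curl/8.5.0"',
]
for path in not_found_paths(sample):
    print(path)
```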
📸 Screenshot checkpoint: Take a screenshot showing the 404 entries. These are paths a scanner probed.
Step 3.5 – Find High-Frequency Requesters (Potential Brute-Force)
For this lab, treat any IP with more than 5 requests as potentially suspicious. In a real environment the threshold would be much higher and rate-based, for example 100 requests per minute.
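A rate-based threshold needs requests grouped by time as well as by IP. The sketch below is one possible approach, not the lab's prescribed method: it buckets requests per IP per minute and flags buckets above a threshold. It assumes combined-format timestamps such as `[17/Apr/2025:10:00:01 +0000]` and an English month abbreviation; `requests_per_minute`, `flag`, and the sample data are hypothetical.

```python
from collections import Counter
from datetime import datetime

def requests_per_minute(lines):
    """Count requests per (ip, minute) bucket."""
    buckets = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) < 5:
            continue
        try:
            # parts[3] looks like "[17/Apr/2025:10:00:01"; drop the "[".
            ts = datetime.strptime(parts[3].lstrip("["), "%d/%b/%Y:%H:%M:%S")
        except ValueError:
            continue  # not a log line with a parsable timestamp
        buckets[(parts[0], ts.strftime("%Y-%m-%d %H:%M"))] += 1
    return buckets

def flag(buckets, threshold):
    """Keep only buckets whose request count exceeds the threshold."""
    return {key: n for key, n in buckets.items() if n > threshold}

# Hypothetical sample: 7 requests from one IP in one minute, 1 from another.
sample = [
    f'203.0.113.15 - - [17/Apr/2025:10:00:{s:02d} +0000] '
    f'"GET /login?attempt={s} HTTP/1.1" 404 153 "-" "curl/8.5.0"'
    for s in range(7)
] + [
    '10.0.0.5 - - [17/Apr/2025:10:00:30 +0000] "GET / HTTP/1.1" 200 615 "-" "curl/8.5.0"',
]
flagged = flag(requests_per_minute(sample), threshold=5)
print(flagged)
```

The threshold of 5 matches the lab's value. A production threshold would be tuned to normal traffic for the site.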
Part 4 – Simulated SSH Brute-Force Log Analysis
Step 4.1 – Create a Simulated Auth Log
```bash
cat > /tmp/auth.log << 'EOF'
Apr 17 10:00:01 server sshd[1234]: Failed password for invalid user admin from 203.0.113.15 port 54321 ssh2
Apr 17 10:00:02 server sshd[1234]: Failed password for invalid user admin from 203.0.113.15 port 54322 ssh2
Apr 17 10:00:03 server sshd[1234]: Failed password for invalid user root from 203.0.113.15 port 54323 ssh2
Apr 17 10:00:04 server sshd[1234]: Failed password for invalid user oracle from 203.0.113.15 port 54324 ssh2
Apr 17 10:00:05 server sshd[1234]: Failed password for invalid user postgres from 203.0.113.15 port 54325 ssh2
Apr 17 10:00:06 server sshd[1234]: Failed password for invalid user user from 203.0.113.15 port 54326 ssh2
Apr 17 10:00:07 server sshd[1234]: Failed password for invalid user test from 203.0.113.15 port 54327 ssh2
Apr 17 10:00:08 server sshd[1234]: Failed password for invalid user ubuntu from 203.0.113.15 port 54328 ssh2
Apr 17 10:15:01 server sshd[5678]: Accepted publickey for alice from 10.0.0.5 port 44444 ssh2
Apr 17 10:30:00 server sshd[9012]: Failed password for bob from 198.51.100.99 port 12345 ssh2
Apr 17 10:30:01 server sshd[9012]: Failed password for bob from 198.51.100.99 port 12346 ssh2
Apr 17 10:30:02 server sshd[9013]: Accepted password for alice from 10.0.0.5 port 44445 ssh2
Apr 17 11:00:00 server sshd[1235]: Invalid user pi from 203.0.113.15 port 54329
Apr 17 11:00:01 server sshd[1235]: Failed password for invalid user pi from 203.0.113.15 port 54329 ssh2
EOF
```
Step 4.2 – Analyze the Auth Log
```bash
docker run --rm \
  -v /tmp/auth.log:/logs/auth.log \
  ubuntu:22.04 bash -c "
echo '=== Failed login attempts by IP ==='
# The IP is the 4th field from the end of each line
grep 'Failed password' /logs/auth.log | awk '{print \$(NF-3)}' | sort | uniq -c | sort -rn
echo ''
echo '=== Usernames being targeted ==='
# The username is the 6th field from the end, with or without the invalid user prefix
grep 'Failed password' /logs/auth.log | awk '{print \$(NF-5)}' | sort | uniq -c | sort -rn
echo ''
echo '=== Successful logins ==='
grep 'Accepted' /logs/auth.log
"
```
📸 Screenshot checkpoint: Take a screenshot showing the brute-force analysis results: which IP, which usernames, and successful logins.
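Counting failures per IP does not show whether an address hammered one account or cycled through many. The sketch below groups the usernames tried by each source IP. It assumes the sshd message shape used in the simulated auth log above; `FAILED_RE` and `users_tried_by_ip` are hypothetical names, and the sample lines are copied from that log.

```python
import re
from collections import defaultdict

# Matches both "Failed password for bob from ..." and
# "Failed password for invalid user admin from ...".
FAILED_RE = re.compile(
    r"Failed password for (?:invalid user )?(\S+) from (\d+\.\d+\.\d+\.\d+)"
)

def users_tried_by_ip(lines):
    """Map each source IP to the set of usernames it failed to log in as."""
    tried = defaultdict(set)
    for line in lines:
        m = FAILED_RE.search(line)
        if m:
            user, ip = m.groups()
            tried[ip].add(user)
    return tried

# A few lines from the simulated auth log in Step 4.1.
sample = [
    "Apr 17 10:00:01 server sshd[1234]: Failed password for invalid user admin from 203.0.113.15 port 54321 ssh2",
    "Apr 17 10:00:03 server sshd[1234]: Failed password for invalid user root from 203.0.113.15 port 54323 ssh2",
    "Apr 17 10:30:00 server sshd[9012]: Failed password for bob from 198.51.100.99 port 12345 ssh2",
    "Apr 17 10:30:01 server sshd[9012]: Failed password for bob from 198.51.100.99 port 12346 ssh2",
    "Apr 17 10:15:01 server sshd[5678]: Accepted publickey for alice from 10.0.0.5 port 44444 ssh2",
]
tried = users_tried_by_ip(sample)
for ip, users in tried.items():
    print(ip, len(users), sorted(users))
```

An IP with many distinct usernames suggests a different pattern from one that repeats a single known account.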
Part 5 – Automated Anomaly Detection Script
Step 5.1 – Write a Python Anomaly Detector
```bash
docker run --rm -v /tmp:/data python:3.11-slim bash -c "
cat > /data/detect_anomalies.py << 'PYEOF'
import re
from collections import defaultdict

BRUTE_FORCE_THRESHOLD = 3  # flag IPs with 3 or more failed attempts
# The 404 settings below are defined for a web-log check that this minimal version does not implement
SCAN_404_THRESHOLD = 2     # flag IPs with >2 404 errors
failed_logins = defaultdict(int)
errors_404 = defaultdict(list)
suspicious_paths = ['/.env', '/wp-login.php', '/.git', '/phpmyadmin', '/admin', '/backup']

print('=== SECURITY ANOMALY REPORT ===\n')

# Analyze auth log: count failed password attempts per source IP
print('--- Brute Force Detection (auth.log) ---')
try:
    with open('/data/auth.log') as f:
        for line in f:
            if 'Failed password' in line:
                m = re.search(r'from (\d+\.\d+\.\d+\.\d+)', line)
                if m:
                    failed_logins[m.group(1)] += 1
    # Report IPs from most to fewest failures
    for ip, count in sorted(failed_logins.items(), key=lambda x: -x[1]):
        if count >= BRUTE_FORCE_THRESHOLD:
            print(f'  ⚠️ ALERT: {ip} had {count} failed login attempts - possible brute force')
        else:
            print(f'  ℹ️ {ip}: {count} failed attempt(s)')
except FileNotFoundError:
    print('  (auth.log not found)')

# Summary
print('\n--- Summary ---')
print(f'  IPs flagged for brute force: {sum(1 for c in failed_logins.values() if c >= BRUTE_FORCE_THRESHOLD)}')
print(f'  Total failed login sources: {len(failed_logins)}')
print('\n=== END OF REPORT ===')
PYEOF
python3 /data/detect_anomalies.py
"
```
📸 Screenshot checkpoint: Take a screenshot of the automated anomaly detection report output.
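The detector defines a 404 threshold and a list of suspicious paths but only analyzes the auth log. As a hedged sketch of how those settings might be applied to the web access log, the code below flags IPs with more than `SCAN_404_THRESHOLD` 404 responses and picks out probes that match the suspicious path list. The functions `scan_report` and `suspicious_hits` and the sample lines are hypothetical, and the code assumes combined-format lines.

```python
from collections import defaultdict

SCAN_404_THRESHOLD = 2  # flag IPs with more than 2 404 errors
SUSPICIOUS_PATHS = ['/.env', '/wp-login.php', '/.git', '/phpmyadmin', '/admin', '/backup']

def scan_report(lines):
    """Map each IP over the 404 threshold to the paths it probed."""
    errors_404 = defaultdict(list)
    for line in lines:
        parts = line.split()
        # Field 9 is the status code, field 7 the path, field 1 the client IP.
        if len(parts) >= 9 and parts[8] == '404':
            errors_404[parts[0]].append(parts[6])
    return {ip: paths for ip, paths in errors_404.items()
            if len(paths) > SCAN_404_THRESHOLD}

def suspicious_hits(paths):
    """Return the paths that start with a known sensitive prefix."""
    return [p for p in paths if any(p.startswith(s) for s in SUSPICIOUS_PATHS)]

# Hypothetical sample lines for illustration only.
sample = [
    '172.17.0.1 - - [17/Apr/2025:10:00:01 +0000] "GET /.env HTTP/1.1" 404 153 "-" "curl/8.5.0"',
    '172.17.0.1 - - [17/Apr/2025:10:00:02 +0000] "GET /.git/config HTTP/1.1" 404 153 "-" "curl/8.5.0"',
    '172.17.0.1 - - [17/Apr/2025:10:00:03 +0000] "GET /nope HTTP/1.1" 404 153 "-" "curl/8.5.0"',
    '172.17.0.1 - - [17/Apr/2025:10:00:04 +0000] "GET / HTTP/1.1" 200 615 "-" "curl/8.5.0"',
]
report = scan_report(sample)
for ip, paths in report.items():
    print(ip, len(paths), suspicious_hits(paths))
```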
If you are still inside the analysis container from Part 3, type `exit` to leave it.
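Cleanup
```bash
docker stop log-server && docker rm log-server
rm -f /tmp/nginx_access.log /tmp/auth.log /tmp/detect_anomalies.py
# Caution: this also removes every other stopped container, unused network, and dangling image on your machine
docker system prune -f
```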
Lab Assessment
Screenshot Submission Checklist
- [ ] `screenshot-12a` – Log file created with line count
- [ ] `screenshot-12b` – Raw log lines with fields annotated
- [ ] `screenshot-12c` – HTTP status code distribution
- [ ] `screenshot-12d` – Top requested URLs
- [ ] `screenshot-12e` – 404 entries showing scanner probing
- [ ] `screenshot-12f` – Brute-force IP and username analysis from auth.log
- [ ] `screenshot-12g` – Automated anomaly detection script output
Reflection Questions
- Looking at the 404 patterns in Part 3, what is an attacker likely trying to do by probing paths like `/.env`, `/wp-login.php`, and `/.git/config`?
- In the auth log analysis, IP `203.0.113.15` tried multiple different usernames. What type of attack is this? How is it different from a brute-force attack on a known username?
- Why is it important to correlate logs from multiple sources (web server + authentication + firewall)? What might you miss if you only looked at one log?
- A company is breached on a Tuesday. The attackers are discovered on Friday. Why are logs critically important to the incident response team investigating the breach?
Grading Rubric
- Screenshots complete and clearly labeled: 40 points
- Log pattern analysis notes accompanying each screenshot: 20 points
- Reflection questions answered thoughtfully: 40 points
- Total: 100 points