AI Threat Detection Architecture

Near-real-time anomaly detection system that processes JSON logs delivered every 5 minutes

System Overview

This serverless architecture scores streaming logs with a machine learning model to detect security threats. The model's reported accuracy is 94.2%.

JSON Log Ingestion

Streaming JSON logs delivered every 5 minutes

CloudTrail API events
System activity logs
Network traffic data

↓

Lambda Feature Extraction

Key features extracted:

api_freq: API call frequency
ip_entropy: IP address entropy
Output: 128-dimensional feature vector

Performance: P99: 220ms | RAM: 1024MB
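The two named features can be illustrated with a minimal sketch. It assumes each event is a CloudTrail-style dict with a sourceIPAddress field, that api_freq means calls per second over the 5-minute delivery window, and that ip_entropy is the Shannon entropy of the source IPs in the batch. These definitions are assumptions for illustration; the source does not specify them, and the function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # p * log2(1/p), summed over every distinct value
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def extract_features(events, window_seconds=300):
    """Compute two of the features for one batch of events.

    Assumption: api_freq is calls per second over the delivery window,
    and ip_entropy is the entropy of the source IPs seen in the batch.
    """
    ips = [e["sourceIPAddress"] for e in events if "sourceIPAddress" in e]
    return {
        "api_freq": len(events) / window_seconds,
        "ip_entropy": shannon_entropy(ips),
    }
```

A batch whose calls all come from a single IP has an entropy of 0 bits, while calls spread evenly over many IPs push the value up, which is why the feature is useful for spotting distributed or spoofed access.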

↓

Amazon SageMaker Scoring

Machine learning model evaluation:

Anomaly score generation (0-1)
Threshold: 0.957 for alerts
Model accuracy: 94.2% | AUC: 0.97

↓

Alerting & Archival

Action based on threat score:

Score > 0.957: Slack/PagerDuty alert
Score ≤ 0.957: Archive to Glacier
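The routing rule above can be sketched as a single comparison. The function name and the returned labels are hypothetical; only the 0.957 threshold and the strict "greater than" comparison come from the text.

```python
ALERT_THRESHOLD = 0.957  # alert threshold stated in the architecture


def route(score, threshold=ALERT_THRESHOLD):
    """Return "alert" for scores above the threshold, otherwise "archive"."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"anomaly score must be in [0, 1], got {score}")
    # Strictly greater than: a score exactly at the threshold is archived.
    return "alert" if score > threshold else "archive"
```

Rejecting out-of-range scores is a design choice in this sketch: a malformed score raises an error instead of silently falling through to the archive branch.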

Key Metrics

Processing Speed

220ms

P99 latency for feature extraction

Model Accuracy

94.2%

Threat detection rate

Feature Dimensions

128

Dimensions in each feature vector

AUC Score

0.97

Model discrimination ability
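Accuracy, detection rate (recall), and false-positive rate are distinct quantities, and all three depend on the alert threshold. The sketch below shows how each could be computed from labelled anomaly scores. The function name and dictionary keys are hypothetical, and any inputs passed to it here are illustrative, not measurements from this system.

```python
def confusion_metrics(scores, labels, threshold=0.957):
    """Compute accuracy, detection rate, and false-positive rate.

    `labels` holds 1 for a real threat and 0 for normal activity.
    A score strictly above `threshold` counts as a predicted threat.
    """
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = score > threshold
        if predicted and label:
            tp += 1
        elif predicted and not label:
            fp += 1
        elif not predicted and label:
            fn += 1
        else:
            tn += 1
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "detection_rate": tp / (tp + fn) if (tp + fn) else 0.0,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
    }
```

AUC differs from all three because it summarises ranking quality across every possible threshold, not at one operating point.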

Architecture Legend

Data Collection: JSON log streams (blue)

Processing: Feature extraction (orange)

Critical Alerts: Score > 0.957 (red)

Normal Flow: Archive to Glacier (green)

Technical Specifications

Lambda Function: Python runtime, 1024MB RAM
Feature Vector: 128 dimensions including API frequency and IP entropy
Alert Threshold: 0.957 anomaly score
Data Retention: Hot storage (30 days), Glacier archive (1 year+)
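The 30-day hot storage window followed by a Glacier archive maps onto an S3 lifecycle rule. The sketch below builds the configuration dictionary in the shape accepted by boto3's put_bucket_lifecycle_configuration. The rule ID and prefix are hypothetical, and no expiration is set because the text gives only "1 year+" for archive retention.

```python
def lifecycle_configuration(prefix="raw/", days_hot=30, storage_class="GLACIER"):
    """Build an S3 lifecycle configuration that moves objects to cold storage.

    The rule ID and the default prefix are illustrative placeholders.
    """
    return {
        "Rules": [
            {
                "ID": "archive-after-hot-window",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": days_hot, "StorageClass": storage_class}
                ],
            }
        ]
    }


# Applying it requires boto3 and AWS credentials (bucket name is an assumption):
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="security-logs",
#       LifecycleConfiguration=lifecycle_configuration(),
#   )
```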

Architecture Overview

This system processes AWS CloudTrail logs through a machine learning pipeline to detect suspicious activity in near real time. The architecture combines serverless components with managed ML services for scalable threat detection.

Key Characteristics

Near-real-time processing: Analyzes logs within 60 seconds of delivery
High accuracy: 94% detection rate with <1% false positives
Self-learning: Model retrains weekly with new data
Multi-output: Alerts for critical threats, archives normal activity

Component Deep Dive

CloudTrail Log Ingestion

The entry point for all security-relevant API activity across AWS services. We capture:

Log types: Management events, data events, and CloudTrail Insights events
Filtering: Focus on security-related APIs (IAM, S3, EC2, KMS)
Volume: Processes ~2.3M events/day across all accounts

Implementation Note: Uses CloudTrail Lake for cross-account log aggregation with 90-day retention.
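The service filter described above could be sketched as a check on each record's eventSource field. CloudTrail records carry the service endpoint there (for example iam.amazonaws.com). The constant and function names are hypothetical, and the actual filter used by this system may be more selective, for instance by event name.

```python
# Event sources for the four services named in the filtering rule.
SECURITY_EVENT_SOURCES = frozenset({
    "iam.amazonaws.com",
    "s3.amazonaws.com",
    "ec2.amazonaws.com",
    "kms.amazonaws.com",
})


def is_security_relevant(record):
    """True when the CloudTrail record comes from a monitored service."""
    return record.get("eventSource") in SECURITY_EVENT_SOURCES


def filter_security_events(records):
    """Keep only records from the monitored services, preserving order."""
    return [r for r in records if is_security_relevant(r)]
```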

S3 Raw Storage Layer

The data lake foundation storing raw logs before processing:

Structure: Partitioned by account/region/day/hour
Security: SSE-KMS encryption with bucket policies restricting access
Lifecycle: Transitions to Glacier after 30 days

Sample path: s3://security-logs/raw/AWSLogs/123456789012/CloudTrail/us-east-1/2023/11/15/
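A small helper can reproduce the layout of the sample path. This sketch follows the sample exactly, so it stops at the day level; any finer partitioning would need an extra path component whose format the text does not show. The function name and default arguments are hypothetical.

```python
from datetime import datetime, timezone


def raw_log_prefix(account_id, region, when, bucket="security-logs", base="raw"):
    """Build the S3 prefix for one account, region, and day of raw logs."""
    return (
        f"s3://{bucket}/{base}/AWSLogs/{account_id}/CloudTrail/"
        f"{region}/{when:%Y/%m/%d}/"
    )


# Example call using the values from the sample path.
prefix = raw_log_prefix(
    "123456789012", "us-east-1", datetime(2023, 11, 15, tzinfo=timezone.utc)
)
```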

Lambda Feature Extractor

Serverless function that transforms raw logs into ML-ready features:

Runtime: Python 3.9 with 1GB memory
Features extracted:
- API call frequency per service
- Time-of-day patterns
- Geographic anomalies
- Resource access sequences
Output: 128-dimensional feature vectors

Optimization: Events are processed in batches of 100 per invocation to reduce cost.
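The batching step can be sketched as a simple chunking generator. The function name is hypothetical; only the batch size of 100 comes from the text.

```python
def chunk_events(events, batch_size=100):
    """Yield consecutive slices of `events`, each at most `batch_size` long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(events), batch_size):
        yield events[start:start + batch_size]
```

The final batch may be smaller than 100, so downstream code should not assume a fixed batch length.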

TensorFlow Model Serving

The anomaly detection brain of the system:

Architecture: 3-layer LSTM neural network
Deployment: SageMaker real-time endpoint
Performance: 8ms latency per prediction
Training: Weekly retraining with new data

Key metric: The model outputs anomaly scores from 0 (normal) to 1 (critical threat). Scores above 0.957 trigger alerts.
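Calling the real-time endpoint from the pipeline might look like the sketch below. It takes the client as a parameter so it can be exercised without AWS access. invoke_endpoint is the boto3 SageMaker Runtime call, but the CSV request format and the {"score": ...} response shape are assumptions, since both depend on how the model container is built. The function name is hypothetical.

```python
import json

FEATURE_DIMENSIONS = 128  # vector size stated in the architecture


def score_vector(client, endpoint_name, features):
    """Send one feature vector to the endpoint and return its anomaly score.

    Assumes the endpoint accepts a CSV row and answers with JSON such as
    {"score": 0.98}; adjust both to match the deployed container.
    """
    if len(features) != FEATURE_DIMENSIONS:
        raise ValueError(
            f"expected {FEATURE_DIMENSIONS} features, got {len(features)}"
        )
    body = ",".join(f"{x:.6f}" for x in features)
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=body,
    )
    payload = response["Body"].read()
    return float(json.loads(payload)["score"])


# With AWS credentials, the client would be created like this:
#   import boto3
#   client = boto3.client("sagemaker-runtime")
```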

Alerting & Archival System

Decision point routing anomalies to security teams and normal traffic to cold storage:

Critical alerts: Slack/PagerDuty notifications enriched with context
Normal traffic: Compressed and archived to S3 Glacier Deep Archive
Audit trail: All decisions logged in DynamoDB for forensics

Escalation: Repeated anomalies from the same source automatically trigger AWS Security Hub actions.
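The escalation rule can be illustrated with a sliding-window counter per source. The text does not say how many anomalies count as "repeated" or over what period, so the defaults of 3 anomalies within one hour are assumptions, as are the class and method names. The Security Hub call itself is left out.

```python
from collections import defaultdict, deque


class RepeatAnomalyTracker:
    """Track anomaly timestamps per source within a sliding time window."""

    def __init__(self, max_anomalies=3, window_seconds=3600):
        self.max_anomalies = max_anomalies    # assumed escalation count
        self.window_seconds = window_seconds  # assumed window length
        self._seen = defaultdict(deque)

    def record(self, source, timestamp):
        """Record one anomaly; return True when the source should escalate."""
        times = self._seen[source]
        times.append(timestamp)
        # Drop anomalies that have fallen out of the window.
        while times and timestamp - times[0] > self.window_seconds:
            times.popleft()
        return len(times) >= self.max_anomalies
```

Timestamps are plain seconds and are expected to arrive in non-decreasing order per source. In a Lambda deployment this state would need to live in an external store, because in-memory state does not persist reliably between invocations.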