Statistical Anomaly Detection at Scale: NPDS Surveillance Case Study
Building seasonality-aware baselines for 9M+ records and detecting signals 3 weeks early.
The Challenge
The National Poison Data System (NPDS) receives millions of poison exposure reports annually. Traditional surveillance methods rely on threshold-based alerts that trigger only after a substance has already become a significant problem.
Our goal: detect emerging substance signals weeks earlier than traditional methods, while minimizing false positives.
Understanding the Data
We worked with 9M+ records spanning several years. The data exhibited complex patterns:
- Daily seasonality: Call volumes peak mid-afternoon
- Weekly seasonality: Weekends differ from weekdays
- Monthly/Annual seasonality: Flu season, holidays, school schedules
- Trend components: Gradual shifts in substance categories
The Baseline Problem
A naive approach, flagging values above a fixed threshold, fails spectacularly:
- Winter flu season triggers constant false positives
- Weekend dips mask Monday spikes
- Holiday periods look anomalous when they're actually predictable
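To make the failure mode concrete, here is a minimal sketch on synthetic, noise-free data. The counts, the threshold of 120, and the tolerance of 10 are illustrative assumptions, not NPDS figures. Nothing anomalous happens in this series, yet the fixed threshold fires throughout the winter peak, while a comparison against the seasonal expectation stays quiet.

```python
import math

DAYS = 365

# Hypothetical daily counts: a smooth winter peak around day 0, no real anomalies.
expected = [100 + 30 * math.cos(2 * math.pi * d / DAYS) for d in range(DAYS)]
observed = list(expected)  # observed matches the seasonal expectation exactly

# Naive rule: alert whenever the raw count exceeds a fixed level.
FIXED_THRESHOLD = 120  # illustrative value
fixed_alerts = [d for d, y in enumerate(observed) if y > FIXED_THRESHOLD]

# Seasonality-aware rule: alert only on deviation from the expected value.
TOLERANCE = 10  # illustrative value
seasonal_alerts = [
    d for d, (y, e) in enumerate(zip(observed, expected)) if y - e > TOLERANCE
]

print(f"fixed-threshold alerts: {len(fixed_alerts)}")
print(f"seasonal-baseline alerts: {len(seasonal_alerts)}")
```

Every fixed-threshold alert here falls in the winter months, which is exactly the predictable pattern a baseline should absorb.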
Our Approach
1. Decomposition-Based Baselines
We decomposed each substance's time series into components:
    from statsmodels.tsa.seasonal import STL

    def build_baseline(series, period=7):
        stl = STL(series, period=period, robust=True)
        result = stl.fit()
        # Baseline = trend + seasonal
        baseline = result.trend + result.seasonal
        # Residuals are what we monitor for anomalies
        residuals = result.resid
        return baseline, residuals
2. Multi-Scale Seasonality
A single-period (weekly) decomposition missed longer seasonal patterns. We implemented a hierarchical decomposition:
    from statsmodels.tsa.seasonal import STL

    def multi_scale_baseline(series):
        # Remove the weekly pattern first
        weekly_stl = STL(series, period=7).fit()
        deseasonalized = series - weekly_stl.seasonal
        # Then remove the (approximately) monthly pattern
        monthly_stl = STL(deseasonalized, period=30).fit()
        # Combined baseline: both seasonal components plus the trend
        baseline = weekly_stl.seasonal + monthly_stl.seasonal + monthly_stl.trend
        return baseline
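Written out, the two-stage decomposition amounts to an additive model. The notation below is ours, not the source's: $y_t$ is the daily count, $S^{(7)}_t$ and $S^{(30)}_t$ are the weekly and monthly seasonal components, $T_t$ is the trend from the second fit, and $R_t$ is the residual that gets monitored.

```latex
y_t = T_t + S^{(7)}_t + S^{(30)}_t + R_t,
\qquad
\hat{y}_t = T_t + S^{(7)}_t + S^{(30)}_t,
\qquad
R_t = y_t - \hat{y}_t
```

Here $\hat{y}_t$ corresponds to the `baseline` returned by `multi_scale_baseline`; the function does not return $R_t$ itself, so the residual would be computed as `series - baseline`.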
3. Adaptive Thresholds
Fixed standard deviation thresholds don't account for heteroscedasticity (variance changing over time). We used rolling statistics:
    def adaptive_threshold(residuals, window=28):
        # Rolling statistics; the first window - 1 values are NaN
        rolling_std = residuals.rolling(window=window).std()
        rolling_mean = residuals.rolling(window=window).mean()
        # Z-score with local statistics
        z_scores = (residuals - rolling_mean) / rolling_std
        # Flag only if the z-score exceeds the threshold on consecutive days
        return detect_consecutive_anomalies(z_scores, threshold=2.5, min_consecutive=3)
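One thing to watch with a rolling z-score: a window that ends on the current day includes the spike being scored, which inflates the local standard deviation and dampens the score. The sketch below is a variation under our own assumptions, not the authors' implementation. It scores each day against the preceding `window` days only. The name `trailing_z_scores` is hypothetical, and the function is pure Python, so a pandas Series would be passed as `series.tolist()`.

```python
import math
from statistics import mean, stdev


def trailing_z_scores(residuals, window=28):
    """Z-score of each value against the `window` values before it.

    The current value is excluded from its own reference window.
    Positions without a full history yield NaN.
    """
    z_scores = []
    for i, value in enumerate(residuals):
        history = residuals[max(0, i - window):i]
        if len(history) < window:
            z_scores.append(math.nan)  # warm-up period
            continue
        sigma = stdev(history)
        # Guard against a flat history, where the z-score is undefined.
        z_scores.append((value - mean(history)) / sigma if sigma > 0 else math.nan)
    return z_scores


# Illustrative data: 28 quiet days followed by one large residual.
example = [1.0, -1.0] * 14 + [10.0]
scores = trailing_z_scores(example, window=28)
print(scores[-1])
```

Whether excluding the current day is preferable depends on how quickly the baseline is meant to adapt; it is a design choice rather than a correction to the approach above.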
The Critical Insight: Consecutive Anomalies
Single-day spikes are often noise. Real emerging signals show sustained deviation. We required 3+ consecutive days above threshold before alerting.
This simple heuristic reduced false positives by 60% while maintaining sensitivity to true signals.
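The helper `detect_consecutive_anomalies` is called in the thresholding code but not shown. A minimal sketch of what such a helper might look like follows, assuming a one-sided test (scores above the threshold) and an alert that fires from the third consecutive day onward. The body is our assumption; only the name and parameters come from the call above.

```python
def detect_consecutive_anomalies(z_scores, threshold=2.5, min_consecutive=3):
    """Return one boolean per day: True once the z-score has exceeded
    `threshold` for at least `min_consecutive` days in a row."""
    flags = []
    run_length = 0
    for z in z_scores:
        # NaN > threshold evaluates to False, so warm-up NaNs reset the run.
        run_length = run_length + 1 if z > threshold else 0
        flags.append(run_length >= min_consecutive)
    return flags


# Illustrative scores: an isolated spike, then a sustained run.
example_scores = [0.1, 3.0, 0.2, 2.6, 2.7, 3.1, 2.9, 1.0]
example_flags = detect_consecutive_anomalies(example_scores)
print(example_flags)
```

In this example the isolated spike on the second day never alerts, while the sustained run alerts on its third and fourth days. The function accepts any iterable of floats and returns a plain list, so the result would need to be re-wrapped if a pandas Series is wanted.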
AWS SageMaker Deployment
Processing 9M+ records required scalable infrastructure:
Architecture
S3 (raw data)
→ SageMaker Processing (daily ETL)
→ Feature Store (baseline/residuals)
→ SageMaker Endpoint (anomaly scoring)
→ SNS (alerts)
→ QuickSight (dashboards)
Key Decisions
- Processing Jobs over Lambda: Long-running decomposition exceeded Lambda limits
- Feature Store for baselines: Pre-computed baselines reduced inference latency
- Batch inference: Daily runs were sufficient; real-time wasn't needed
Results
Detection Performance
| Metric | Traditional | Our System |
|--------|-------------|------------|
| Average detection lead time | Baseline | +3.2 weeks |
| False positive rate | 12% | 4% |
| Missed emerging signals | 23% | 8% |
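For readers who want the table's rates as relative changes, the arithmetic can be sketched as below. The inputs are the figures from the table; the helper name `relative_reduction` is ours.

```python
def relative_reduction(before, after):
    """Fractional reduction going from `before` to `after`."""
    return (before - after) / before


# Rates taken from the results table.
fp_reduction = relative_reduction(0.12, 0.04)      # false positive rate
missed_reduction = relative_reduction(0.23, 0.08)  # missed emerging signals

print(f"False positive rate: {fp_reduction:.0%} relative reduction")
print(f"Missed emerging signals: {missed_reduction:.0%} relative reduction")
```

These are end-to-end figures for the whole system. They are distinct from the 60% reduction quoted earlier, which is attributed to the consecutive-day rule alone.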
Operational Impact
- Detection roughly 3 weeks earlier on average (3.2 weeks) for emerging substance signals
- Reduced analyst triage time by focusing on high-confidence alerts
- Quarterly reports now include forward-looking signal analysis
Lessons Learned
- Domain knowledge matters: Understanding seasonal patterns required epidemiologist input
- Consecutive > single: Requiring sustained deviation dramatically improved precision
- Hierarchical decomposition: Multiple seasonal periods are common in real data
- Infrastructure scales with data: SageMaker handled growth gracefully
What's Next
We're exploring:
- Prophet integration: Meta's (formerly Facebook's) open-source forecasting library handles holidays and special events natively
- Regional decomposition: Geographic patterns may reveal localized outbreaks earlier
- Multivariate signals: Cross-substance correlations might predict emerging trends
The core insight continues to guide our approach: for this problem, statistical baselines with domain-informed thresholds outperform complex ML.