MLOps · September 10, 2024 · 10 min read

Statistical Anomaly Detection at Scale: NPDS Surveillance Case Study

Building seasonality-aware baselines for 9M+ records and detecting signals 3 weeks early.

The Challenge

The National Poison Data System (NPDS) receives millions of poison exposure reports annually. Traditional surveillance methods rely on threshold-based alerts that trigger only after a substance has already become a significant problem.

Our goal: detect emerging substance signals weeks earlier than traditional methods, while minimizing false positives.

Understanding the Data

We worked with 9M+ records spanning several years. The data exhibited complex patterns:

  • Daily seasonality: Call volumes peak mid-afternoon
  • Weekly seasonality: Weekends differ from weekdays
  • Monthly/Annual seasonality: Flu season, holidays, school schedules
  • Trend components: Gradual shifts in substance categories

The Baseline Problem

A naive approach—flagging values above a fixed threshold—fails spectacularly:

  • Winter flu season triggers constant false positives
  • Weekend dips mask Monday spikes
  • Holiday periods look anomalous when they're actually predictable
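To make the failure mode concrete, here's a minimal sketch on a hypothetical synthetic series (not NPDS data): a sinusoidal "flu season" plus noise, flagged against a fixed threshold calibrated on annual statistics. The routine seasonal peak alone trips the alarm, while the off-season never can.

```python
import numpy as np

# Hypothetical synthetic series: a winter flu-season peak plus noise
# (illustrative only -- not NPDS data).
rng = np.random.default_rng(42)
days = np.arange(365)
seasonal = 100 + 40 * np.sin(2 * np.pi * days / 365)  # peaks near day 91
calls = seasonal + rng.normal(0, 5, size=365)

# Naive fixed threshold calibrated on annual mean and std
threshold = calls.mean() + calls.std()
flags = calls > threshold

winter_flags = flags[:183].sum()   # half-year containing the seasonal peak
summer_flags = flags[183:].sum()   # off-season half-year
print(winter_flags, summer_flags)  # dozens of "anomalies" in winter, none in summer
```

Every winter flag here is a false positive: it's the predictable seasonal component, not an emerging signal, crossing the line.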

Our Approach

1. Decomposition-Based Baselines

We decomposed each substance's time series into components:

from statsmodels.tsa.seasonal import STL

def build_baseline(series, period=7):
    stl = STL(series, period=period, robust=True)
    result = stl.fit()

    # Baseline = trend + seasonal
    baseline = result.trend + result.seasonal
    # Residuals are what we monitor for anomalies
    residuals = result.resid

    return baseline, residuals

2. Multi-Scale Seasonality

Single-period decomposition left longer cycles in the residuals. We implemented hierarchical decomposition:

def multi_scale_baseline(series):
    # Remove weekly pattern first
    weekly_stl = STL(series, period=7).fit()
    deseasonalized = series - weekly_stl.seasonal

    # Then remove monthly pattern
    monthly_stl = STL(deseasonalized, period=30).fit()

    # Combined baseline
    baseline = weekly_stl.seasonal + monthly_stl.seasonal + monthly_stl.trend
    return baseline

3. Adaptive Thresholds

Fixed standard deviation thresholds don't account for heteroscedasticity (variance changing over time). We used rolling statistics:

def adaptive_threshold(residuals, window=28):
    rolling_std = residuals.rolling(window=window).std()
    rolling_mean = residuals.rolling(window=window).mean()

    # Z-score with local statistics
    z_scores = (residuals - rolling_mean) / rolling_std

    # Flag if z-score exceeds threshold for consecutive days
    return detect_consecutive_anomalies(z_scores, threshold=2.5, min_consecutive=3)

The Critical Insight: Consecutive Anomalies

Single-day spikes are often noise. Real emerging signals show sustained deviation. We required 3+ consecutive days above threshold before alerting.

This simple heuristic reduced false positives by 60% while maintaining sensitivity to true signals.
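The `detect_consecutive_anomalies` helper referenced above isn't shown in full; one possible implementation of the run-length logic, using a standard pandas group-by-cumsum idiom (hypothetical — the production version may differ):

```python
import pandas as pd

def detect_consecutive_anomalies(z_scores, threshold=2.5, min_consecutive=3):
    """Flag days where |z| has exceeded `threshold` for at least
    `min_consecutive` days in a row (hypothetical sketch of the
    helper used above)."""
    above = z_scores.abs() > threshold
    # Length of the current above-threshold run; (~above).cumsum()
    # assigns a new group id every time the run is broken.
    run = above.astype(int).groupby((~above).cumsum()).cumsum()
    return run >= min_consecutive

# A lone spike is ignored; the alert fires on day 3 of a sustained run.
z = pd.Series([0.1, 3.0, 0.2, 2.6, 2.8, 3.1, 0.0])
print(detect_consecutive_anomalies(z).tolist())
# → [False, False, False, False, False, True, False]
```

Note the alert fires on the third consecutive day, not retroactively across the whole run — which matches how a daily batch job would surface it.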

AWS SageMaker Deployment

Processing 9M+ records required scalable infrastructure:

Architecture

S3 (raw data)
    → SageMaker Processing (daily ETL)
    → Feature Store (baseline/residuals)
    → SageMaker Endpoint (anomaly scoring)
    → SNS (alerts)
    → QuickSight (dashboards)

Key Decisions

  1. Processing Jobs over Lambda: Long-running decomposition exceeded Lambda limits
  2. Feature Store for baselines: Pre-computed baselines reduced inference latency
  3. Batch inference: Daily runs were sufficient; real-time wasn't needed

Results

Detection Performance

| Metric | Traditional | Our System |
|--------|-------------|------------|
| Average detection lead time | Baseline | +3.2 weeks |
| False positive rate | 12% | 4% |
| Missed emerging signals | 23% | 8% |

Operational Impact

  • 3 weeks earlier detection on average for emerging substance signals
  • Reduced analyst triage time by focusing on high-confidence alerts
  • Quarterly reports now include forward-looking signal analysis

Lessons Learned

  1. Domain knowledge matters: Understanding seasonal patterns required epidemiologist input
  2. Consecutive > single: Requiring sustained deviation dramatically improved precision
  3. Hierarchical decomposition: Multiple seasonal periods are common in real data
  4. Infrastructure scales with data: SageMaker handled growth gracefully

What's Next

We're exploring:

  • Prophet integration: Facebook's library handles holidays and special events natively
  • Regional decomposition: Geographic patterns may reveal localized outbreaks earlier
  • Multivariate signals: Cross-substance correlations might predict emerging trends

The core insight—that statistical baselines with domain-informed thresholds outperform complex ML for this problem—continues to guide our approach.