Methodology (Metrics & Freshness)
This page documents the calculation logic behind PortPulse’s operational metrics and the freshness SLO. It is designed to be auditable and reproducible.
What you get
- Daily trend time series per port (JSON/CSV)
- Snapshot/Overview with aggregated indicators
- Dwell/Wait breakdown (at‑anchor / at‑berth when available)
- Alerts (percentile thresholds + change‑point detection)
- Freshness telemetry:
as_ofandlast_updatedattached to outputs
- We aggregate public/authorized sources (port notices, movement events, AIS‑derived signals) and apply de‑duplication + robust smoothing.
- Congestion score ∈ [0,1] is a normalized blend of queue length, average wait and terminal utilization proxies.
- Freshness SLO: p95 ≤ 2h. 30‑day replay with no gaps on core ports.
Data sources
- Movement events: arrivals/departures, alongside/anchor events (from public schedules, vessel traffic reports, AIS‑derived signals).
- Port notices & bulletins: closures, weather suspensions, labor updates.
- Reference catalogs: UN/LOCODE, terminal aliases, public holiday calendars.
Access & licensing. Only public or duly authorized datasets are used. We maintain a provenance trail per record.
Ingestion & normalization
- Polling cadence: hourly (burst faster during active windows).
- Parsing & standardization: ISO‑8601 timestamps in UTC, canonical
UNLOCODE(e.g.,USLAX), consistent units (hours). - De‑duplication: keep latest event of the day per
(port, ship, event_type); collapse near‑duplicates within a tolerance window (e.g., ≤ 10 minutes). - Outlier control: winsorize extreme wait values at
[p1, p99], clip physically impossible durations. - Gap filling: forward‑fill short gaps ≤ 3 hours; longer gaps remain missing (not imputed).
- Confidence tagging: each point carries a
confidence∈ low driven by source consistency.
Metrics definitions
Average wait hours
Mean waiting time for vessels arriving during the day, measured from port limits / anchor on‑scene to all fast/berth (when berth signals are not available, we approximate using first alongside or departure with heuristics).
Edge cases
- Ballast shifts inside harbor excluded.
- Aborted calls removed (arrival without subsequent berth).
- Multi‑terminal ports: weighted by call counts.
Queue length (proxy)
Count of vessels in waiting state within port’s AOR at snapshot time, filtered by commercial type (container/general cargo as applicable).
Congestion score (0–1)
A bounded, unit‑free index combining normalized queue and wait:
let W = winsorized_avg_wait_hours;
let Q = queue_length_per_capacity; # queue divided by rolling capacity proxy
let Z = 0.5 * zscore(W) + 0.5 * zscore(Q); # standardized blend
# squash to [0,1]
congestion_score = sigmoid(Z) # 1 / (1 + e^-Z)
Notes
- Capacity proxy derives from a rolling baseline of weekly handled calls.
- We also publish raw components where available.
Trend & snapshot generation
We emit one daily point per port (UTC end‑of‑day), plus an intra‑day snapshot:
for each port, hourly:
fetch latest events + notices
clean + dedupe + winsorize + fill short gaps
compute wait/queue + congestion_score
write snapshot {as_of, last_updated}
end
for each port, daily (UTC 23:59):
aggregate daily metrics
append to trend series (30d+ retention)
publish CSV/JSON (ETag, Cache-Control)
end
as_of: timestamp the metric represents.last_updated: ETL completion time for that record.
Freshness SLO & monitoring
- SLO: p95 of
(now - last_updated)≤ 2 hours per port. - Dashboards: freshness percentiles
p50/p95/maxtracked by port and region. - Alerting: breach at
p95 > 2hfor > 2 consecutive hours triggers incident. - Backfill policy: missed windows retried automatically; next‑day backfill if needed (records keep original
as_of).
See also: SLA & Status.
Quality controls
- Schema checks: required fields present, units consistent.
- Statistical guards: day‑to‑day delta bounds; structural break detection.
- Replay audits: 30‑day series must be gapless on core ports before release.
- Manual review hooks: anomalies with
confidence=lowsurface to triage.
Reproducibility & auditing
- OpenAPI describes every field and shape:
/openapi(Redoc). - CSV parity: every trend endpoint supports
?format=csv. - Caching:
Cache-Control: public, max-age=300+ strongETagon CSV for304support (CSV & ETag). - Traceability: each response carries
x-request-id; errors use the unified body (Errors).
Quick check (cURL):
curl -H "X-API-Key: dev_demo_123" \
"https://api.useportpulse.com/v1/ports/USLAX/trend?days=30&format=csv" \
-i
Limitations & caveats
- Some ports do not publish berth events; we approximate with robust heuristics.
- Extreme weather / labor events can distort baselines; we mark such windows.
- AIS coverage variance can affect queue detection; confidence will reflect sources.
Changelog & versioning
- Contract:
/v1is frozen for P1; breaking changes go to/v1beta. - Deprecation window: ≥ 90 days; changes announced on /docs/changelog and Versioning.
Field reference
See the curated Field Dictionary for types, units, and nullability.