
Observability Overview

The API provides structured logging, metrics, log shipping, tracing, and optional analytics through a small set of ports and adapters, the same pattern used by the Caching System.

Architecture

Ports and adapters

| Pillar | Purpose | Port / contract | Adapters / backends | Config prefix | Wired in |
| --- | --- | --- | --- | --- | --- |
| Logging | Structured JSON logs | Config only (no port) | Pino → stdout | LOG_* | lib/logger |
| Metrics | HTTP request metrics | GET /metrics scrape | Prometheus | METRICS_* | lib/metrics |
| Telemetry | Request-summary log shipping | ITelemetryAdapter | noop, CloudWatch | TELEMETRY_* | lib/telemetry, request-logging middleware |
| Tracing | Distributed traces | OpenTelemetry SDK | Jaeger, OTLP | TRACING_*, JAEGER_*, OTLP_* | lib/tracing (first import in server) |
| Analytics | Optional event tracking | IAnalyticsAdapter | noop, Umami | ANALYTICS_* | lib/analytics |

Runbook and Prometheus config: observability/README.md, observability/prometheus.yml. First dashboard: Grafana dashboards.

Quick start

```bash
LOG_LEVEL=info
LOG_PRETTY_PRINT=false
METRICS_ENABLED=true
# Optional: TRACING_ENABLED=true, TRACING_BACKEND=jaeger, JAEGER_ENDPOINT=http://localhost:14268/api/traces
```

Logging

Structured JSON via Pino. Level and pretty-printing are defined in LOGGING_CONFIG (env.config.ts) and applied at bootstrap in lib/logger.

| Variable | Default (prod / dev) | Description |
| --- | --- | --- |
| LOG_LEVEL | info / debug | trace, debug, info, warn, error, fatal |
| LOG_PRETTY_PRINT | false / true | Human-readable in dev |
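
As a rough illustration of how the prod / dev defaults in the table could be resolved, here is a minimal sketch. It is not the actual LOGGING_CONFIG from env.config.ts: resolveLoggingConfig, LoggingConfig, and the use of NODE_ENV to tell production from development are assumptions made for the example.

```typescript
// Hypothetical sketch; names and the NODE_ENV check are assumptions, not the real LOGGING_CONFIG.
type LogLevel = 'trace' | 'debug' | 'info' | 'warn' | 'error' | 'fatal';

const LOG_LEVELS: readonly LogLevel[] = ['trace', 'debug', 'info', 'warn', 'error', 'fatal'];

interface LoggingConfig {
  level: LogLevel;
  prettyPrint: boolean;
}

// Type guard so an arbitrary env string can be narrowed to a known level.
function isLogLevel(value: string): value is LogLevel {
  return (LOG_LEVELS as readonly string[]).includes(value);
}

// The environment is passed in explicitly so the function stays easy to test.
function resolveLoggingConfig(env: Record<string, string | undefined>): LoggingConfig {
  const isProd = env.NODE_ENV === 'production';
  const rawLevel = env.LOG_LEVEL?.toLowerCase();

  // Unknown or missing levels fall back to the documented default for the environment.
  const level: LogLevel = rawLevel && isLogLevel(rawLevel) ? rawLevel : isProd ? 'info' : 'debug';

  // Pretty-printing defaults to off in prod and on in dev unless set explicitly.
  const prettyPrint =
    env.LOG_PRETTY_PRINT !== undefined ? env.LOG_PRETTY_PRINT === 'true' : !isProd;

  return { level, prettyPrint };
}
```

Passing the environment as a plain object rather than reading it globally keeps the defaults easy to check in isolation.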

Wired in: Request middleware attaches requestId and a child logger. The context built for each request includes requestLogger (see Structured Logging).

Request-scoped logger: In both REST routes and GraphQL resolvers, use context.requestLogger (they share the same context). For code that only has req, use getRequestLogger(req) from @/middleware/request-logging.middleware. Every log written through that logger includes requestId. Handlers that log (e.g. on error) accept an optional last parameter requestLogger?: ILogger; pass context.requestLogger from routes and resolvers so that handler-originated logs are correlated by requestId.

```typescript
context.requestLogger.info({ msg: 'Organization created', organizationId: org.id });
// In a handler: (requestLogger ?? this.logger).error({ msg: 'Send failed', err: error });
```

Log objects with a msg field; avoid string interpolation. In apps/api, import createLogger and createModuleLogger from @/lib/logger. Never import @grantjs/logger directly.
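
The optional requestLogger parameter described above can be sketched as follows. This is an illustration only: the ILogger shape shown here is a reduced stand-in, and createMemoryLogger and NotificationHandler are hypothetical names that replace Pino and a real handler so the example runs on its own.

```typescript
// Reduced stand-in for the logger contract; the real ILogger may expose more methods.
interface ILogger {
  info(entry: Record<string, unknown>): void;
  error(entry: Record<string, unknown>): void;
}

// Hypothetical in-memory logger: merges fixed bindings into every entry, like a child logger.
function createMemoryLogger(
  bindings: Record<string, unknown>,
  sink: Record<string, unknown>[],
): ILogger {
  const write = (level: string) => (entry: Record<string, unknown>): void => {
    sink.push({ level, ...bindings, ...entry });
  };
  return { info: write('info'), error: write('error') };
}

// Hypothetical handler that accepts the request logger as an optional last parameter.
class NotificationHandler {
  private readonly logger: ILogger;

  constructor(logger: ILogger) {
    this.logger = logger;
  }

  send(recipient: string, requestLogger?: ILogger): boolean {
    // Prefer the request-scoped logger so the entry carries requestId.
    const log = requestLogger ?? this.logger;
    if (!recipient.includes('@')) {
      log.error({ msg: 'Send failed', recipient });
      return false;
    }
    log.info({ msg: 'Notification sent', recipient });
    return true;
  }
}

const entries: Record<string, unknown>[] = [];
const moduleLogger = createMemoryLogger({ module: 'notifications' }, entries);
const requestLogger = createMemoryLogger({ module: 'notifications', requestId: 'req-123' }, entries);

const handler = new NotificationHandler(moduleLogger);
handler.send('not-an-address', requestLogger); // logged with requestId
handler.send('not-an-address'); // falls back to the module logger, no requestId
```

The fallback keeps the handler usable from background jobs or tests where no request context exists, at the cost of losing correlation for those calls.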

| Level | Use |
| --- | --- |
| trace | Very noisy, dev only |
| debug | Diagnostic |
| info | Business events |
| warn | Handled anomalies |
| error | Failures |
| fatal | Unrecoverable |

Metrics

When enabled, the API exposes HTTP request duration and count metrics for Prometheus to scrape. Labels: method, route, status_code.

| Variable | Default | Description |
| --- | --- | --- |
| METRICS_ENABLED | false | Expose GET /metrics |
| METRICS_ENDPOINT | /metrics | Path |
| METRICS_COLLECT_DEFAULTS | true | CPU, memory, event loop |

Wired in: lib/metrics. Middleware records on response finish; handler serves at METRICS_ENDPOINT.
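
To make the "record on response finish" step concrete, here is a minimal sketch that uses no Prometheus client library. RequestMetrics, metricsMiddleware, and createFakeResponse are hypothetical names, and the in-memory maps stand in for a real counter and histogram; the actual lib/metrics code may be structured differently.

```typescript
// Label set named in the text above.
type Labels = { method: string; route: string; status_code: string };

// Hypothetical in-memory recorder standing in for a Prometheus counter and histogram.
class RequestMetrics {
  private readonly counts = new Map<string, number>();
  private readonly durations = new Map<string, number[]>();

  private key(labels: Labels): string {
    return `${labels.method} ${labels.route} ${labels.status_code}`;
  }

  observe(labels: Labels, seconds: number): void {
    const key = this.key(labels);
    this.counts.set(key, (this.counts.get(key) ?? 0) + 1);
    this.durations.set(key, [...(this.durations.get(key) ?? []), seconds]);
  }

  count(labels: Labels): number {
    return this.counts.get(this.key(labels)) ?? 0;
  }

  durationsFor(labels: Labels): number[] {
    return this.durations.get(this.key(labels)) ?? [];
  }
}

// Reduced request/response shapes so the sketch does not depend on a web framework.
interface RequestLike { method: string; route: string }
interface ResponseLike { statusCode: number; on(event: 'finish', listener: () => void): void }

// The clock is injected (milliseconds) so timing is deterministic in the example.
function metricsMiddleware(metrics: RequestMetrics, now: () => number) {
  return (req: RequestLike, res: ResponseLike, next: () => void): void => {
    const startedAt = now();
    // Record only once the response has finished, when the status code is final.
    res.on('finish', () => {
      const labels = { method: req.method, route: req.route, status_code: String(res.statusCode) };
      metrics.observe(labels, (now() - startedAt) / 1000);
    });
    next();
  };
}

// Fake response that lets the example trigger 'finish' by hand.
function createFakeResponse(statusCode: number) {
  const listeners: Array<() => void> = [];
  return {
    statusCode,
    on(_event: 'finish', listener: () => void): void { listeners.push(listener); },
    finish(): void { listeners.forEach((listener) => listener()); },
  };
}

let clock = 0;
const metrics = new RequestMetrics();
const middleware = metricsMiddleware(metrics, () => clock);
const response = createFakeResponse(200);
middleware({ method: 'GET', route: '/organizations/:id' }, response, () => { clock += 250; });
response.finish();
```

Labelling by route template (for example /organizations/:id) rather than the raw URL keeps the number of label combinations bounded.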

Runbook: Start the API with METRICS_ENABLED=true, run docker compose up -d prometheus grafana, then add a Prometheus data source in Grafana pointing at http://prometheus:9090. See Grafana dashboards for first dashboard steps.

PromQL: Request rate rate(http_requests_total[5m]). P95 duration by route: histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le, route)).


Telemetry (log shipping)

Optional adapter to send request-summary logs to a backend (e.g. CloudWatch Logs). Port: ITelemetryAdapter (@grantjs/core); adapters in @grantjs/telemetry (noop, CloudWatch).

| Variable | Default | Description |
| --- | --- | --- |
| TELEMETRY_PROVIDER | none | none or cloudwatch |
| TELEMETRY_CLOUDWATCH_REGION | us-east-1 | AWS region |
| TELEMETRY_CLOUDWATCH_LOG_GROUP | | Log group name |
| TELEMETRY_CLOUDWATCH_LOG_STREAM_PREFIX | grant-api | Stream prefix |

Wired in: lib/telemetry. Request-logging middleware calls getTelemetryAdapter().sendLog(...) on response finish (fire-and-forget).
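
The fire-and-forget call can be sketched roughly as below. The ITelemetryAdapter signature, the RequestSummary fields, and shipRequestSummary are assumptions for illustration; the real port in @grantjs/core may take different arguments.

```typescript
// Assumed shape of a request summary; the real payload may differ.
interface RequestSummary {
  requestId: string;
  method: string;
  route: string;
  statusCode: number;
  durationMs: number;
}

// Assumed port signature, reduced to the one method mentioned above.
interface ITelemetryAdapter {
  sendLog(entry: RequestSummary): Promise<void>;
}

// Noop adapter: accepts every entry and does nothing.
const noopTelemetryAdapter: ITelemetryAdapter = {
  async sendLog(): Promise<void> {},
};

// Hypothetical helper: starts the send, never awaits it, and never lets a failure escape.
function shipRequestSummary(
  adapter: ITelemetryAdapter,
  entry: RequestSummary,
  onError: (error: unknown) => void = () => {},
): void {
  try {
    // Not awaited on purpose: the response must not wait for the telemetry backend.
    void adapter.sendLog(entry).catch(onError);
  } catch (error) {
    // Covers adapters that throw synchronously instead of rejecting.
    onError(error);
  }
}

const sent: RequestSummary[] = [];
const recordingAdapter: ITelemetryAdapter = {
  async sendLog(entry: RequestSummary): Promise<void> { sent.push(entry); },
};
const rejectingAdapter: ITelemetryAdapter = {
  sendLog: () => Promise.reject(new Error('backend unavailable')),
};

const summary: RequestSummary = {
  requestId: 'req-123', method: 'GET', route: '/health', statusCode: 200, durationMs: 12,
};

shipRequestSummary(noopTelemetryAdapter, summary);
shipRequestSummary(recordingAdapter, summary);
shipRequestSummary(rejectingAdapter, summary); // rejection is swallowed by onError
```

Attaching the catch handler at the call site is what makes dropping the promise safe; without it, a failing backend would surface as an unhandled rejection.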


Tracing

Distributed tracing via OpenTelemetry (implemented). Spans include http.request_id and, optionally, http.user_id for correlation with logs.

| Variable | Default | Description |
| --- | --- | --- |
| TRACING_ENABLED | false | Enable OTel SDK |
| TRACING_BACKEND | jaeger | jaeger, otlp, xray |
| JAEGER_ENDPOINT | http://localhost:14268/api/traces | Jaeger collector |
| OTLP_ENDPOINT | http://localhost:4318/v1/traces | OTLP endpoint |
| TRACING_SAMPLING_RATE | 1.0 | Sampling 0–1 |
| TRACING_SERVICE_NAME | grant-api | Service name in traces |

Wired in: lib/tracing is the first import in server.ts so the SDK patches http/express before they load. Request-logging middleware sets http.request_id and http.user_id on the active span. shutdownTracing() is called during graceful shutdown, before the DB and cache connections close.
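
The span annotation done by the request-logging middleware might look something like this sketch. SpanLike mirrors only the setAttribute call, and annotateActiveSpan and RecordingSpan are hypothetical names; the real middleware obtains the active span from the OpenTelemetry API rather than receiving it as an argument.

```typescript
// Reduced span shape: only the attribute setter used here.
interface SpanLike {
  setAttribute(key: string, value: string): unknown;
}

// Hypothetical helper: copies correlation IDs onto the span if there is one.
function annotateActiveSpan(span: SpanLike | undefined, requestId: string, userId?: string): void {
  // With tracing disabled there is no active span, so this is a no-op.
  if (!span) return;
  span.setAttribute('http.request_id', requestId);
  // http.user_id is optional: unauthenticated requests simply omit it.
  if (userId !== undefined) {
    span.setAttribute('http.user_id', userId);
  }
}

// Test double that records attributes instead of exporting them.
class RecordingSpan implements SpanLike {
  readonly attributes: Record<string, string> = {};

  setAttribute(key: string, value: string): this {
    this.attributes[key] = value;
    return this;
  }
}

const authenticatedSpan = new RecordingSpan();
annotateActiveSpan(authenticatedSpan, 'req-123', 'user-42');

const anonymousSpan = new RecordingSpan();
annotateActiveSpan(anonymousSpan, 'req-124');

annotateActiveSpan(undefined, 'req-125'); // tracing disabled: nothing happens
```

Because the same requestId is bound to the request-scoped logger, a trace found in Jaeger can be matched to its log lines by searching for that value.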

Runbook: Start Jaeger (docker compose up -d jaeger); set TRACING_ENABLED=true, TRACING_BACKEND=jaeger, JAEGER_ENDPOINT; restart API; open Jaeger UI (http://localhost:16686), service grant-api. Full reference: Tracing.


Analytics

Optional event tracking. Port: IAnalyticsAdapter (@grantjs/core); adapters in @grantjs/analytics (noop, Umami). Grant does not store events; adapters forward to the backend of your choice.

| Variable | Default | Description |
| --- | --- | --- |
| ANALYTICS_ENABLED | false | Enable tracking |
| ANALYTICS_PROVIDER | none | none or umami |
| ANALYTICS_UMAMI_API_URL | | Umami API base URL |
| ANALYTICS_UMAMI_WEBSITE_ID | | Website ID from Umami |
| ANALYTICS_UMAMI_HOSTNAME | grant-api | Hostname per event |

Wired in: lib/analytics. Handlers call getAnalyticsAdapter().trackEvent(...) (fire-and-forget). Usage: Analytics. First dashboard: Umami dashboards.
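
One way the noop / Umami selection could work is sketched below. Everything here is illustrative: the IAnalyticsAdapter signature, AnalyticsConfig, createAnalyticsAdapter, and the payload field names are assumptions, and the injected transport stands in for the HTTP call so no real Umami API is invoked.

```typescript
// Assumed event and port shapes; the real contracts in @grantjs/core may differ.
interface AnalyticsEvent {
  name: string;
  properties?: Record<string, unknown>;
}

interface IAnalyticsAdapter {
  trackEvent(event: AnalyticsEvent): Promise<void>;
}

// Hypothetical config mirroring the ANALYTICS_* variables.
interface AnalyticsConfig {
  enabled: boolean;
  provider: string;
  websiteId?: string;
  hostname?: string;
}

// Stand-in for the HTTP request to the analytics backend.
type Transport = (payload: Record<string, unknown>) => Promise<void>;

class NoopAnalyticsAdapter implements IAnalyticsAdapter {
  async trackEvent(): Promise<void> {}
}

// Forwards events through the transport; nothing is stored locally.
class ForwardingAnalyticsAdapter implements IAnalyticsAdapter {
  private readonly transport: Transport;
  private readonly base: Record<string, unknown>;

  constructor(transport: Transport, base: Record<string, unknown>) {
    this.transport = transport;
    this.base = base;
  }

  trackEvent(event: AnalyticsEvent): Promise<void> {
    // Payload field names are hypothetical, not the Umami wire format.
    return this.transport({ ...this.base, name: event.name, data: event.properties ?? {} });
  }
}

// Falls back to the noop adapter unless tracking is enabled and fully configured.
function createAnalyticsAdapter(config: AnalyticsConfig, transport: Transport): IAnalyticsAdapter {
  const usable = config.enabled && config.provider === 'umami' && Boolean(config.websiteId);
  if (!usable) return new NoopAnalyticsAdapter();
  return new ForwardingAnalyticsAdapter(transport, {
    website: config.websiteId,
    hostname: config.hostname ?? 'grant-api',
  });
}

const forwarded: Record<string, unknown>[] = [];
const recordingTransport: Transport = async (payload) => { forwarded.push(payload); };

const disabledAdapter = createAnalyticsAdapter(
  { enabled: false, provider: 'umami', websiteId: 'site-1' }, recordingTransport);
const umamiAdapter = createAnalyticsAdapter(
  { enabled: true, provider: 'umami', websiteId: 'site-1' }, recordingTransport);

void disabledAdapter.trackEvent({ name: 'organization_created' }); // dropped
void umamiAdapter.trackEvent({ name: 'organization_created', properties: { plan: 'free' } });
```

Returning a noop adapter instead of undefined means handlers can call trackEvent unconditionally, whatever the configuration.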


Practices

  • Use context.requestLogger (or getRequestLogger(req)) in request handlers so logs include requestId.
  • Log structured objects with msg; avoid string interpolation.
  • Set LOG_LEVEL by environment (e.g. debug in dev, info in prod).
  • Correlation: requestId is automatic via request-scoped logger and is set on trace spans.


Released under the MIT License.