
Diagnostics Mode

The server includes a built-in diagnostics system that periodically captures system and application metrics, writing JSON snapshot files for offline analysis of performance, scalability, and memory behavior.

Activation

There are two ways to enable diagnostics:

CLI flag (for temporary debugging sessions)

```bash
./safecall --debug
```

The --debug flag forces server_mode to "debug" at startup without requiring any config file changes. This is the preferred method for one-off debugging sessions.

Config-based (for persistent debug deployments)

Set server_mode to "debug" in config/main.config.json:

```json
{
  "server_mode": "debug"
}
```

Both methods activate the same diagnostics system. The CLI flag takes precedence over the config file.
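The precedence rule can be sketched as follows (the function name and the "production" fallback mode are assumptions for illustration, not the server's actual startup code):

```typescript
import * as fs from "fs";

interface MainConfig {
  server_mode?: string;
}

// Resolve the effective server mode: the --debug CLI flag wins over the
// config file, and a missing/unreadable config falls back to a normal mode
// (named "production" here as an assumption).
function resolveServerMode(argv: string[], configPath: string): string {
  if (argv.includes("--debug")) return "debug"; // CLI flag takes precedence
  try {
    const cfg: MainConfig = JSON.parse(fs.readFileSync(configPath, "utf8"));
    return cfg.server_mode ?? "production";
  } catch {
    return "production";
  }
}
```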

Configuration Options

| Key | Default | Description |
| --- | --- | --- |
| `server_debug_snapshot_interval` | `300` | Seconds between diagnostic snapshots |
| `server_debug_log_dir` | `"logs/diagnostics"` | Directory for snapshot files (relative to the binary) |
| `server_debug_max_files` | `288` | Maximum snapshot files to retain (288 = 24 h at 5-minute intervals) |
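Putting these keys together, a persistent debug deployment's config/main.config.json might look like the following (the non-default values are illustrative; 1440 files at a 60-second interval also covers 24 hours):

```json
{
  "server_mode": "debug",
  "server_debug_snapshot_interval": 60,
  "server_debug_log_dir": "logs/diagnostics",
  "server_debug_max_files": 1440
}
```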

Output

Snapshots are written as JSON files to the configured directory:

```
logs/diagnostics/
  diag-2026-02-18T14-30-00-000Z.json
  diag-2026-02-18T14-35-00-000Z.json
  ...
```

Each file is a self-contained snapshot. Old files are automatically rotated when server_debug_max_files is exceeded. The first snapshot is taken 5 seconds after startup to capture the initial state.
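Because the filenames embed an ISO-style timestamp, rotation can be as simple as sorting the names and deleting the oldest. A sketch with a hypothetical helper name (not the server's actual rotation code):

```typescript
import * as fs from "fs";
import * as path from "path";

// Delete the oldest diag-*.json files until at most maxFiles remain.
// ISO-style timestamps in the filenames sort oldest-first lexicographically.
function rotateSnapshots(dir: string, maxFiles: number): void {
  const files = fs
    .readdirSync(dir)
    .filter((f) => f.startsWith("diag-") && f.endsWith(".json"))
    .sort();
  while (files.length > maxFiles) {
    fs.unlinkSync(path.join(dir, files.shift()!)); // remove oldest first
  }
}
```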

Snapshot Contents

Server Info

Basic identification: version, server ID, and mode.

System Metrics

  • Memory: RSS, heap used, heap total, external, array buffers (raw bytes and formatted)
  • CPU: User/system microseconds (cumulative and delta since last snapshot), CPU percentage
  • Event loop lag: Measured in milliseconds via timer scheduling
  • Platform: OS platform and architecture
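The event-loop-lag figure above can be approximated with a self-scheduling timer: a callback due after a fixed interval that fires late indicates the loop was blocked, and the delay is the lag. A sketch under assumed names (not the server's actual implementation):

```typescript
// Lag is how late a timer fired relative to when it was due (never negative).
function computeLagMs(dueAtMs: number, firedAtMs: number): number {
  return Math.max(0, firedAtMs - dueAtMs);
}

// Repeatedly schedule a timer and report how late each tick arrives.
function startLagProbe(
  intervalMs: number,
  report: (lagMs: number) => void
): ReturnType<typeof setInterval> {
  let dueAt = Date.now() + intervalMs;
  return setInterval(() => {
    report(computeLagMs(dueAt, Date.now()));
    dueAt = Date.now() + intervalMs;
  }, intervalMs);
}
```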

MQTT Metrics

  • Connection status (connected, reconnecting, disconnected)
  • Total messages received (cumulative and since last snapshot)
  • Message rate per second
  • Breakdown by message type (adv1, adv4, adv8, alive)
  • Invalid messages dropped
  • Total bytes received
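The delta and rate figures above follow from the cumulative counter and the snapshot interval. A minimal sketch (the function name is hypothetical; the real collector may structure this differently):

```typescript
// Derive per-snapshot MQTT deltas from two cumulative totals and the
// snapshot interval in seconds.
function mqttDelta(
  prevTotal: number,
  currentTotal: number,
  intervalSeconds: number
): { since_last: number; rate_per_second: number } {
  const since_last = currentTotal - prevTotal;
  return { since_last, rate_per_second: since_last / intervalSeconds };
}
```

With the example snapshot's numbers (150000 total, 2500 since the last 300-second snapshot), this yields the 8.3 messages/second shown there.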

Device Metrics

  • Gateways, beacons, and sightings in memory
  • Active beacons (from DB cache)
  • KISS beacons
  • Pending inactivity timers

Tracking / RTLS Metrics

  • Current algorithm name
  • Beacons being tracked and predictions cached
  • Zones, gateways, and maps loaded from DB
  • Total RSSI entries and gateway-beacon pairs
  • Last algorithm cycle: duration in ms, beacons processed, beacons skipped, predictions changed, average RSSI samples per beacon

Event Metrics

  • Trigger cache, panic event IDs, sensor entries, gateway cache sizes

WebSocket Metrics

  • Active and total connections
  • Messages received and published
  • Subscriptions by type (track, debug, overview, events, test_beacon, gatify)

Memory Estimates

Estimated memory consumption for the largest in-memory caches:

  • Tracking RSSI cache (based on entry count and structure)
  • Device sightings cache
  • Combined total
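The estimates above can be produced from entry counts and an assumed per-entry size. In this sketch the byte sizes are placeholder assumptions, not measurements from the real server:

```typescript
// Assumed per-entry footprints (illustrative only).
const BYTES_PER_RSSI_ENTRY = 64;
const BYTES_PER_SIGHTING = 128;

// Rough cache-size estimate from entry counts.
function estimateCacheBytes(
  rssiEntries: number,
  sightings: number
): { rssi: number; sightings: number; total: number } {
  const rssi = rssiEntries * BYTES_PER_RSSI_ENTRY;
  const sight = sightings * BYTES_PER_SIGHTING;
  return { rssi, sightings: sight, total: rssi + sight };
}

// Format bytes the way the snapshot's *_formatted fields do.
function formatMB(bytes: number): string {
  return `${(bytes / (1024 * 1024)).toFixed(2)} MB`;
}
```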

Example Snapshot

```json
{
  "timestamp": "2026-02-18T14:30:00.000Z",
  "uptime_seconds": 3600,
  "server": { "version": "5.7.4", "id": "TST000", "mode": "debug" },
  "system": {
    "memory": {
      "rss": 104857600,
      "heap_used": 52428800,
      "heap_total": 67108864,
      "rss_formatted": "100.00 MB",
      "heap_used_formatted": "50.00 MB"
    },
    "cpu": {
      "percent_since_last": 2.5,
      "delta_user_us": 500000,
      "delta_system_us": 100000
    },
    "event_loop_lag_ms": 1.2,
    "platform": "linux",
    "arch": "arm64"
  },
  "mqtt": {
    "status": "connected",
    "messages_total": 150000,
    "messages_since_last": 2500,
    "rate_per_second": 8.3,
    "by_type": { "adv1": 500, "adv4": 1800, "adv8": 100, "alive": 100 }
  },
  "tracking": {
    "algorithm": "zone_voting",
    "beacons_tracked": 150,
    "last_cycle": {
      "duration_ms": 12,
      "beacons_processed": 150,
      "predictions_changed": 8
    }
  },
  "websocket": {
    "connections_active": 4,
    "messages_published": 8200
  },
  "memory_estimates": {
    "total_estimated_formatted": "1.00 MB"
  }
}
```

Use Cases

Identifying scaling bottlenecks

Compare tracking.last_cycle.duration_ms across snapshots as the number of tracking.beacons_tracked grows. If cycle duration increases non-linearly with beacon count, the algorithm may need optimization.
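One way to run that comparison offline is a small script over the snapshot files. A sketch with hypothetical names; a rising ms-per-beacon figure as beacon count grows points at super-linear scaling:

```typescript
import * as fs from "fs";

interface Snapshot {
  tracking: { beacons_tracked: number; last_cycle: { duration_ms: number } };
}

// For each snapshot file, compute cycle duration normalized by beacon count.
function msPerBeacon(
  files: string[]
): { beacons: number; msPerBeacon: number }[] {
  return files
    .map((f) => JSON.parse(fs.readFileSync(f, "utf8")) as Snapshot)
    .map((s) => ({
      beacons: s.tracking.beacons_tracked,
      msPerBeacon:
        s.tracking.last_cycle.duration_ms / s.tracking.beacons_tracked,
    }));
}
```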

Detecting memory leaks

Monitor system.memory.rss and system.memory.heap_used across snapshots taken over hours or days. Steadily increasing values without a corresponding increase in tracked devices indicate a leak.

MQTT throughput analysis

Use mqtt.rate_per_second and mqtt.by_type to understand message volume. High mqtt.invalid_dropped values may indicate gateway firmware issues.

Resource planning

Use memory_estimates to project memory requirements for larger deployments. The tracking RSSI cache grows linearly with beacon count and gateway density.

Architecture

The diagnostics system consists of two modules:

  • server/src/diagnostics.ts — Owns the snapshot timer, collects metrics from all subsystems, writes JSON files, and manages file rotation.
  • server/src/diagnostics_counters.ts — Lightweight counters that other modules (MQTT, WebSocket, tracking algorithms) increment during normal operation. When diagnostics is disabled, these are effectively no-ops.

Counters are incremented inline in the hot path (MQTT message processing, WS publish, algorithm cycles) with minimal overhead. The snapshot collection runs on a separate timer and reads all counters plus process.memoryUsage() and process.cpuUsage().
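A minimal sketch of that counter pattern (the shape, key names, and enable flag are assumptions for illustration; the real diagnostics_counters.ts will differ):

```typescript
// Shared counter store read by the snapshot collector.
export const counters = {
  mqtt_messages_total: 0,
  mqtt_invalid_dropped: 0,
  ws_messages_published: 0,
};

let enabled = false;

// Toggled once at startup based on server_mode.
export function setDiagnosticsEnabled(on: boolean): void {
  enabled = on;
}

// Hot-path increment: a flag check plus an addition, effectively a
// no-op when diagnostics is disabled.
export function incr(key: keyof typeof counters, by = 1): void {
  if (!enabled) return;
  counters[key] += by;
}
```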