Monitoring System 0.1.0
System resource monitoring with pluggable collectors and alerting
Derive from metric_base, implement collect(), and register the type with the factory.
See Metrics Tutorial for a full example.
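As a rough illustration of the derive, implement, register flow, here is a minimal sketch. The metric_base interface and the metric_factory class shown here are hypothetical stand-ins written for this example; the real library's signatures may differ, so treat the Metrics Tutorial as the authority.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// ASSUMPTION: stand-in for the library's metric_base; the real interface may differ.
class metric_base {
public:
    virtual ~metric_base() = default;
    virtual std::string name() const = 0;
    virtual double collect() = 0;  // returns the current sample
};

// Custom metric that reports the depth of an application queue.
// The probe callback is a hypothetical way to read that depth.
class queue_depth_metric : public metric_base {
public:
    explicit queue_depth_metric(std::function<std::size_t()> probe)
        : probe_(std::move(probe)) {}

    std::string name() const override { return "queue_depth"; }
    double collect() override { return static_cast<double>(probe_()); }

private:
    std::function<std::size_t()> probe_;
};

// ASSUMPTION: stand-in factory that maps a type name to a creator function.
class metric_factory {
public:
    using creator = std::function<std::unique_ptr<metric_base>()>;

    void register_type(const std::string& type, creator make) {
        creators_[type] = std::move(make);
    }

    // Returns nullptr when the type name was never registered.
    std::unique_ptr<metric_base> create(const std::string& type) const {
        auto it = creators_.find(type);
        if (it == creators_.end()) {
            return nullptr;
        }
        return it->second();
    }

private:
    std::map<std::string, creator> creators_;
};
```

Registering a creator function rather than an instance lets the factory build a fresh metric object each time one is requested.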
It depends on the metric's collection cost and how quickly the signal changes. Polling system metrics (CPU, RSS) every 5–10 s is usually fine. High-rate counters such as request counts should use event-driven updates rather than polling.
Create an otlp_exporter_config, set the endpoint and service name, construct an otlp_exporter, and register it with the tracer.
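The four steps above might look roughly like the sketch below. Only the type names otlp_exporter_config and otlp_exporter come from this page; the field names (endpoint, service_name), the tracer class, its register_exporter() method, and the attach_otlp() helper are assumptions made up for illustration and are not the library's confirmed API.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// ASSUMPTION: field names are illustrative; check the real header.
struct otlp_exporter_config {
    std::string endpoint;
    std::string service_name;
};

// ASSUMPTION: stand-in exporter that only stores its configuration.
class otlp_exporter {
public:
    explicit otlp_exporter(otlp_exporter_config cfg) : cfg_(std::move(cfg)) {}
    const otlp_exporter_config& config() const { return cfg_; }

private:
    otlp_exporter_config cfg_;
};

// ASSUMPTION: stand-in tracer with a hypothetical registration method.
class tracer {
public:
    void register_exporter(std::shared_ptr<otlp_exporter> exporter) {
        exporters_.push_back(std::move(exporter));
    }
    std::size_t exporter_count() const { return exporters_.size(); }

private:
    std::vector<std::shared_ptr<otlp_exporter>> exporters_;
};

// Hypothetical helper: configure, construct, register, in that order.
std::shared_ptr<otlp_exporter> attach_otlp(tracer& t,
                                           const std::string& endpoint,
                                           const std::string& service_name) {
    otlp_exporter_config cfg;
    cfg.endpoint = endpoint;          // e.g. a collector address
    cfg.service_name = service_name;  // how this process is labeled in traces
    auto exporter = std::make_shared<otlp_exporter>(std::move(cfg));
    t.register_exporter(exporter);
    return exporter;
}
```

A call such as attach_otlp(my_tracer, "http://localhost:4317", "checkout-service") would point the exporter at a collector on the conventional OTLP gRPC port; both values are placeholders.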
Yes. At scale, exporting every span is expensive and mostly noise. Start with probabilistic sampling (e.g., 1%), then add rules that always sample errors and slow requests.
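To make the sampling policy concrete, here is a small self-contained sketch of a sampler that always keeps error spans and slow spans and keeps the rest with a fixed probability. The span_summary and span_sampler types are hypothetical and are not part of the library; the real sampler configuration may look quite different.

```cpp
#include <chrono>
#include <random>

// ASSUMPTION: a minimal view of a finished span, invented for this sketch.
struct span_summary {
    bool is_error;
    std::chrono::milliseconds duration;
};

// Keeps every error span and every slow span; samples the remainder
// with the given probability (0.01 corresponds to 1%).
class span_sampler {
public:
    span_sampler(double probability,
                 std::chrono::milliseconds slow_threshold,
                 unsigned seed = std::random_device{}())
        : probability_(probability),
          slow_threshold_(slow_threshold),
          rng_(seed) {}

    bool should_export(const span_summary& span) {
        if (span.is_error) {
            return true;  // rule: always sample errors
        }
        if (span.duration >= slow_threshold_) {
            return true;  // rule: always sample slow requests
        }
        return dist_(rng_) < probability_;  // probabilistic fallback
    }

private:
    double probability_;
    std::chrono::milliseconds slow_threshold_;
    std::mt19937 rng_;
    std::uniform_real_distribution<double> dist_{0.0, 1.0};
};
```

This sketch is not thread-safe because the random engine is shared mutable state; a per-thread sampler or a lock would be needed if it were called concurrently.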
Use both. The circuit breaker keeps the system running; the alert tells you to fix the root cause.
Set a sustained duration on threshold_trigger so transient spikes don't fire, and combine it with suppression_window to prevent repeat alerts for the same condition within a cooldown period.
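The interaction between a sustained duration and a suppression window can be sketched as a small state machine. The class below is a hypothetical illustration of the idea, not the library's threshold_trigger; the constructor parameters and the update() method are assumptions.

```cpp
#include <chrono>
#include <optional>

using clock_type = std::chrono::steady_clock;

// Fires only after the value has stayed above the threshold for `sustain`,
// and then stays quiet for `suppression` after each alert.
class sustained_threshold {
public:
    sustained_threshold(double threshold,
                        std::chrono::seconds sustain,
                        std::chrono::seconds suppression)
        : threshold_(threshold), sustain_(sustain), suppression_(suppression) {}

    // Feed one sample with its timestamp; returns true when an alert should fire.
    bool update(double value, clock_type::time_point now) {
        if (value <= threshold_) {
            breach_start_.reset();  // condition cleared, so a spike is forgotten
            return false;
        }
        if (!breach_start_) {
            breach_start_ = now;    // first sample of a new breach
        }
        if (now - *breach_start_ < sustain_) {
            return false;           // not sustained long enough yet
        }
        if (last_fired_ && now - *last_fired_ < suppression_) {
            return false;           // still inside the cooldown period
        }
        last_fired_ = now;
        return true;
    }

private:
    double threshold_;
    std::chrono::seconds sustain_;
    std::chrono::seconds suppression_;
    std::optional<clock_type::time_point> breach_start_;
    std::optional<clock_type::time_point> last_fired_;
};
```

Passing the timestamp in explicitly, rather than reading the clock inside update(), keeps the logic deterministic and easy to test.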
Register the logger as a notification sink, and the alert pipeline will write alert events to the log in addition to external notifiers. See examples/logger_di_integration_example.cpp.
monitoring_system can be both a service that other systems consume (they push metrics into it) and a consumer of other services (it pulls context from logger_system, etc.). The DI container supports both directions without circular dependency. See examples/bidirectional_di_example.cpp.
Well-tuned span creation costs about 200 ns, and a metric sample push about 50 ns. The dominant cost is exporter I/O, which runs asynchronously. Budget for 1–3% overhead in production workloads, and more if you over-instrument.
Yes. Collectors, metrics registration, and trigger evaluation are all thread-safe. Registration during hot paths is discouraged for performance, not correctness.
Yes — the plugin system supports dynamic loading. See examples/plugin_collector_example.cpp and examples/plugin_example/. Dynamic plugins are useful for optional telemetry that shouldn't link into the main binary.
In-memory ring buffer (default), file-backed ring buffer, and OTLP export. For long-term retention, export to a proper time-series database via OTLP and let the backend handle storage.
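For readers unfamiliar with the default backend's data structure, the following is a generic fixed-capacity ring buffer sketch. It is not the library's implementation; the class name and methods are invented for illustration, and it assumes a capacity greater than zero.

```cpp
#include <cstddef>
#include <vector>

// Fixed-capacity buffer that overwrites the oldest sample once full.
// ASSUMPTION: capacity must be greater than zero.
template <typename T>
class ring_buffer {
public:
    explicit ring_buffer(std::size_t capacity) : data_(capacity) {}

    void push(const T& value) {
        data_[head_] = value;                // overwrite the oldest slot
        head_ = (head_ + 1) % data_.size();  // advance and wrap around
        if (size_ < data_.size()) {
            ++size_;
        }
    }

    std::size_t size() const { return size_; }

    // Returns the retained samples ordered from oldest to newest.
    std::vector<T> snapshot() const {
        std::vector<T> out;
        out.reserve(size_);
        const std::size_t start = (head_ + data_.size() - size_) % data_.size();
        for (std::size_t i = 0; i < size_; ++i) {
            out.push_back(data_[(start + i) % data_.size()]);
        }
        return out;
    }

private:
    std::vector<T> data_;
    std::size_t head_ = 0;  // next slot to write
    std::size_t size_ = 0;  // number of valid samples
};
```

Memory use stays bounded no matter how long the process runs, which is why older samples must be exported elsewhere for long-term retention.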