Common System 0.2.0
Common interfaces and patterns for system integration
kcenon::common::interfaces::health_monitor Class Reference

Central health monitoring system. More...

#include <health_monitor.h>


Public Member Functions

 health_monitor ()=default
 
 ~health_monitor ()
 
 health_monitor (const health_monitor &)=delete
 
health_monitor & operator= (const health_monitor &)=delete
 
 health_monitor (health_monitor &&)=delete
 
health_monitor & operator= (health_monitor &&)=delete
 
Result< bool > register_check (const std::string &name, std::shared_ptr< health_check > check)
 Register a health check.
 
Result< bool > unregister_check (const std::string &name)
 Unregister a health check.
 
Result< health_check_result > check (const std::string &name)
 Execute a specific health check.
 
Result< bool > add_dependency (const std::string &dependent, const std::string &dependency)
 Add a dependency between health checks.
 
VoidResult start ()
 Start the health monitoring.
 
VoidResult stop ()
 Stop the health monitoring.
 
bool is_running () const
 Check if health monitoring is running.
 
void refresh ()
 Refresh all health checks.
 
void register_recovery_handler (const std::string &name, recovery_handler handler)
 Register a recovery handler for a health check.
 
health_monitor_stats get_stats () const
 Get monitoring statistics.
 
std::string get_health_report () const
 Get a formatted health report.
 
health_status get_overall_status () const
 Get the overall health status.
 
bool has_check (const std::string &name) const
 Check if a health check is registered.
 
std::vector< std::string > get_check_names () const
 Get all registered check names.
 

Private Member Functions

void update_stats_after_check (const health_check_result &result)
 
void attempt_recovery (const std::string &name)
 
std::string get_overall_status_string () const
 
health_status get_overall_status_internal () const
 

Private Attributes

health_dependency_graph graph_
 
std::unordered_map< std::string, recovery_handler > recovery_handlers_
 
std::unordered_map< std::string, health_check_result > last_results_
 
health_monitor_stats stats_
 
std::atomic< bool > running_ {false}
 
std::mutex mutex_
 

Detailed Description

Central health monitoring system.

This class provides a complete health monitoring solution with:

  • Health check registration and management
  • Dependency tracking between checks
  • Automatic health check execution
  • Recovery handler support
  • Statistics and reporting

Example usage:

// Create the monitor; db_check, cache_check and api_check are
// std::shared_ptr<health_check> instances defined elsewhere
health_monitor monitor;

// Register health checks
monitor.register_check("database", db_check);
monitor.register_check("cache", cache_check);
monitor.register_check("api", api_check);

// Define dependencies: "api" depends on "database" and "cache"
monitor.add_dependency("api", "database");
monitor.add_dependency("api", "cache");

// Register recovery handlers
monitor.register_recovery_handler("database", []() {
    // Attempt to reconnect
    return true;
});

// Start monitoring
monitor.start();

// Get health report
std::cout << monitor.get_health_report();

Definition at line 95 of file health_monitor.h.

Constructor & Destructor Documentation

◆ health_monitor() [1/3]

kcenon::common::interfaces::health_monitor::health_monitor ( )
default

◆ ~health_monitor()

kcenon::common::interfaces::health_monitor::~health_monitor ( )
inline

Definition at line 99 of file health_monitor.h.

~health_monitor() { stop().value_or(std::monostate{}); }

References stop(), and kcenon::common::Result< T >::value_or().


◆ health_monitor() [2/3]

kcenon::common::interfaces::health_monitor::health_monitor ( const health_monitor & )
delete

◆ health_monitor() [3/3]

kcenon::common::interfaces::health_monitor::health_monitor ( health_monitor && )
delete

Member Function Documentation

◆ add_dependency()

Result< bool > kcenon::common::interfaces::health_monitor::add_dependency ( const std::string & dependent,
const std::string & dependency )
inline

Add a dependency between health checks.

Parameters
dependent    The check that depends on another
dependency   The check being depended upon
Returns
Result indicating success or failure

Definition at line 175 of file health_monitor.h.

Result<bool> add_dependency(const std::string& dependent, const std::string& dependency) {
    std::lock_guard<std::mutex> lock(mutex_);
    return graph_.add_dependency(dependent, dependency);
}

References kcenon::common::interfaces::health_dependency_graph::add_dependency(), kcenon::common::interfaces::dependency, graph_, and mutex_.
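
To make the dependent/dependency direction concrete, here is a minimal standalone sketch of a dependency-aware check. It is not the library's health_dependency_graph: `tiny_dependency_graph`, its members, and the rule that a node counts as healthy only when all of its dependencies are healthy are assumptions made purely for illustration.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical stand-in for a dependency graph of health checks.
class tiny_dependency_graph {
public:
    void add_node(const std::string& name, std::function<bool()> check) {
        checks_[name] = std::move(check);
    }

    // Returns false if either node is unknown or the edge would be a self-loop.
    bool add_dependency(const std::string& dependent, const std::string& dependency) {
        if (dependent == dependency || !checks_.count(dependent) || !checks_.count(dependency)) {
            return false;
        }
        edges_[dependent].push_back(dependency);
        return true;
    }

    // A node counts as healthy only if all of its dependencies and its own check pass.
    bool healthy(const std::string& name) const {
        std::unordered_set<std::string> visiting;
        return healthy_impl(name, visiting);
    }

private:
    bool healthy_impl(const std::string& name, std::unordered_set<std::string>& visiting) const {
        auto check = checks_.find(name);
        if (check == checks_.end() || !visiting.insert(name).second) {
            return false;  // unknown node, or a cycle was reached
        }
        bool ok = true;
        if (auto deps = edges_.find(name); deps != edges_.end()) {
            for (const auto& dep : deps->second) {
                ok = healthy_impl(dep, visiting) && ok;
            }
        }
        visiting.erase(name);
        return ok && check->second();  // own check runs only when dependencies passed
    }

    std::unordered_map<std::string, std::function<bool()>> checks_;
    std::unordered_map<std::string, std::vector<std::string>> edges_;
};

// Usage: "api" depends on "database" and "cache"; a failing cache makes api unhealthy.
inline void dependency_example() {
    tiny_dependency_graph g;
    g.add_node("database", [] { return true; });
    g.add_node("cache", [] { return false; });
    g.add_node("api", [] { return true; });
    const bool added = g.add_dependency("api", "database") && g.add_dependency("api", "cache");
    assert(added);
    assert(g.healthy("database"));
    assert(!g.healthy("api"));
}
```

The first argument is always the node that needs the second one, matching the parameter order documented above.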


◆ attempt_recovery()

void kcenon::common::interfaces::health_monitor::attempt_recovery ( const std::string & name)
inline private

Definition at line 350 of file health_monitor.h.

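void attempt_recovery(const std::string& name) {
    auto it = recovery_handlers_.find(name);
    if (it == recovery_handlers_.end()) {
        return;
    }

    // (line 356 of health_monitor.h is missing from this listing; the
    // References list below names stats_.recovery_attempts)
    if (it->second()) {
        // (line 358 is missing from this listing; the References list
        // below names stats_.successful_recoveries)
    }
}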

References kcenon::common::interfaces::health_monitor_stats::recovery_attempts, recovery_handlers_, stats_, and kcenon::common::interfaces::health_monitor_stats::successful_recoveries.

Referenced by check(), and refresh().
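
The lookup-then-invoke pattern used by attempt_recovery() can be sketched in isolation as follows. This is a simplified illustration under stated assumptions: `recovery_fn` is assumed to be a `std::function<bool()>` (the page only shows that a handler is callable and its result is tested in an `if`), and `recovery_registry` and `recovery_counters` are hypothetical names, not library types.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Assumed handler shape: a callable that reports whether recovery worked.
using recovery_fn = std::function<bool()>;

// Hypothetical counters, loosely modelled on the recovery fields of the stats struct.
struct recovery_counters {
    std::size_t attempts = 0;
    std::size_t successes = 0;
};

class recovery_registry {
public:
    void add(const std::string& name, recovery_fn fn) {
        handlers_[name] = std::move(fn);
    }

    // Runs the handler for `name` if one exists; a missing handler is a no-op.
    void attempt(const std::string& name) {
        auto it = handlers_.find(name);
        if (it == handlers_.end()) {
            return;
        }
        ++counters_.attempts;
        if (it->second()) {
            ++counters_.successes;
        }
    }

    const recovery_counters& counters() const { return counters_; }

private:
    std::unordered_map<std::string, recovery_fn> handlers_;
    recovery_counters counters_;
};

// Usage: only checks with a registered handler contribute to the counters.
inline void recovery_example() {
    recovery_registry registry;
    registry.add("database", [] { return true; });
    registry.attempt("database");
    registry.attempt("cache");  // no handler registered: ignored
    assert(registry.counters().attempts == 1);
    assert(registry.counters().successes == 1);
}
```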


◆ check()

Result< health_check_result > kcenon::common::interfaces::health_monitor::check ( const std::string & name)
inline

Execute a specific health check.

Parameters
name   Name of the check to execute
Returns
Result containing health check result or error

Definition at line 144 of file health_monitor.h.

Result<health_check_result> check(const std::string& name) {
    std::lock_guard<std::mutex> lock(mutex_);

    auto start_time = std::chrono::steady_clock::now();
    auto result = graph_.check_with_dependencies(name);
    auto end_time = std::chrono::steady_clock::now();

    if (result.is_ok()) {
        last_results_[name] = result.value();
        update_stats_after_check(result.value());

        stats_.last_check_time = std::chrono::system_clock::now();
        stats_.last_check_duration = std::chrono::duration_cast<std::chrono::milliseconds>(
            end_time - start_time);
        // (line 158 of health_monitor.h is missing from this listing; the
        // References list below names stats_.check_executions)

        // Trigger recovery if unhealthy
        if (result.value().status == health_status::unhealthy) {
            attempt_recovery(name);
        }
    }

    return result;
}

References attempt_recovery(), kcenon::common::interfaces::health_monitor_stats::check_executions, kcenon::common::interfaces::health_dependency_graph::check_with_dependencies(), graph_, kcenon::common::interfaces::health_monitor_stats::last_check_duration, kcenon::common::interfaces::health_monitor_stats::last_check_time, last_results_, mutex_, stats_, kcenon::common::interfaces::unhealthy, and update_stats_after_check().

Referenced by register_check().
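
check() measures elapsed time with `std::chrono::steady_clock` and records the wall-clock timestamp separately with `system_clock`. The interval measurement can be sketched on its own like this; `timed_ms` is a hypothetical helper introduced for illustration, not a function of the library.

```cpp
#include <cassert>
#include <chrono>
#include <utility>

// Hypothetical helper: runs `fn` and returns how long it took, in whole milliseconds.
// steady_clock is monotonic, so the interval is unaffected by wall-clock adjustments.
template <typename Fn>
std::chrono::milliseconds timed_ms(Fn&& fn) {
    const auto start_time = std::chrono::steady_clock::now();
    std::forward<Fn>(fn)();
    const auto end_time = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(end_time - start_time);
}

// Usage: the callable runs exactly once and the measured duration is never negative.
inline void timing_example() {
    int calls = 0;
    const auto elapsed = timed_ms([&] { ++calls; });
    assert(calls == 1);
    assert(elapsed.count() >= 0);
}
```

Note that `duration_cast` to milliseconds truncates, so a check that finishes in under a millisecond is reported as 0 ms.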


◆ get_check_names()

std::vector< std::string > kcenon::common::interfaces::health_monitor::get_check_names ( ) const
inline nodiscard

Get all registered check names.

Returns
Vector of check names

Definition at line 327 of file health_monitor.h.

std::vector<std::string> get_check_names() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return graph_.get_all_nodes();
}

References kcenon::common::interfaces::health_dependency_graph::get_all_nodes(), graph_, and mutex_.


◆ get_health_report()

std::string kcenon::common::interfaces::health_monitor::get_health_report ( ) const
inline nodiscard

Get a formatted health report.

Returns
Human-readable health report

Definition at line 267 of file health_monitor.h.

std::string get_health_report() const {
    std::lock_guard<std::mutex> lock(mutex_);

    std::ostringstream report;
    report << "=== Health Report ===\n";
    report << "Status: " << get_overall_status_string() << "\n";
    report << "Total Checks: " << stats_.total_checks << "\n";
    report << "Healthy: " << stats_.healthy_count << "\n";
    report << "Degraded: " << stats_.degraded_count << "\n";
    report << "Unhealthy: " << stats_.unhealthy_count << "\n";
    report << "Unknown: " << stats_.unknown_count << "\n";
    report << "\n--- Individual Checks ---\n";

    for (const auto& [name, result] : last_results_) {
        report << name << ": " << to_string(result.status);
        if (!result.message.empty()) {
            report << " - " << result.message;
        }
        report << "\n";
    }

    return report.str();
}

References kcenon::common::interfaces::health_monitor_stats::degraded_count, get_overall_status_string(), kcenon::common::interfaces::health_monitor_stats::healthy_count, last_results_, mutex_, stats_, kcenon::common::interfaces::to_string(), kcenon::common::interfaces::health_monitor_stats::total_checks, kcenon::common::interfaces::health_monitor_stats::unhealthy_count, and kcenon::common::interfaces::health_monitor_stats::unknown_count.
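
The per-check section of the report is a plain `std::ostringstream` loop. The sketch below reproduces that formatting in a self-contained form; `check_entry` and `format_checks` are hypothetical names, the status is simplified to a string, and `std::map` replaces the class's `unordered_map` only so that the output order is deterministic.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical, simplified result record; the library's health_check_result has more fields.
struct check_entry {
    std::string status;   // e.g. "healthy", "unhealthy"
    std::string message;  // optional detail; empty means none
};

// Builds the per-check section of a report, one line per check,
// appending " - <message>" only when a message is present.
std::string format_checks(const std::map<std::string, check_entry>& results) {
    std::ostringstream report;
    report << "--- Individual Checks ---\n";
    for (const auto& [name, entry] : results) {
        report << name << ": " << entry.status;
        if (!entry.message.empty()) {
            report << " - " << entry.message;
        }
        report << "\n";
    }
    return report.str();
}

// Usage: entries are emitted in key order because std::map is sorted.
inline void report_example() {
    const std::map<std::string, check_entry> results{
        {"api", {"healthy", ""}},
        {"database", {"unhealthy", "connection timed out"}},
    };
    assert(format_checks(results) ==
           "--- Individual Checks ---\n"
           "api: healthy\n"
           "database: unhealthy - connection timed out\n");
}
```

Because the class stores results in an `unordered_map`, the order of the individual check lines in a real report should not be relied on.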


◆ get_overall_status()

◆ get_overall_status_internal()

◆ get_overall_status_string()

std::string kcenon::common::interfaces::health_monitor::get_overall_status_string ( ) const
inline nodiscard private

Definition at line 362 of file health_monitor.h.

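std::string get_overall_status_string() const {
    // (line 363 of health_monitor.h is missing from this listing; the
    // References list below names get_overall_status_internal() and to_string())
}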

References get_overall_status_internal(), and kcenon::common::interfaces::to_string().

Referenced by get_health_report().


◆ get_stats()

health_monitor_stats kcenon::common::interfaces::health_monitor::get_stats ( ) const
inline nodiscard

Get monitoring statistics.

Returns
Current statistics

Definition at line 258 of file health_monitor.h.

health_monitor_stats get_stats() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return stats_;
}

References mutex_, and stats_.
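
get_stats() returns the statistics by value while holding the lock, so the caller receives a consistent copy that can be read after the mutex is released. A minimal sketch of that snapshot pattern follows; `guarded_counters` and `counters` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical stand-in for a statistics struct.
struct counters {
    std::size_t total = 0;
};

class guarded_counters {
public:
    void increment() {
        std::lock_guard<std::mutex> lock(mutex_);
        ++data_.total;
    }

    // Returns a copy made under the lock; later increments do not affect it.
    counters snapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return data_;
    }

private:
    counters data_;
    mutable std::mutex mutex_;  // mutable so const members can lock it
};

// Usage: a snapshot is unaffected by updates made after it was taken.
inline void snapshot_example() {
    guarded_counters c;
    c.increment();
    const counters before = c.snapshot();
    c.increment();
    assert(before.total == 1);
    assert(c.snapshot().total == 2);
}
```

The `mutable` qualifier on the mutex mirrors the `mutable private` label on mutex_ in this class, which is what allows the const accessors to lock it.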

◆ has_check()

bool kcenon::common::interfaces::health_monitor::has_check ( const std::string & name) const
inline nodiscard

Check if a health check is registered.

Parameters
name   Name of the health check
Returns
true if registered

Definition at line 318 of file health_monitor.h.

bool has_check(const std::string& name) const {
    std::lock_guard<std::mutex> lock(mutex_);
    return graph_.has_node(name);
}

References graph_, kcenon::common::interfaces::health_dependency_graph::has_node(), and mutex_.


◆ is_running()

bool kcenon::common::interfaces::health_monitor::is_running ( ) const
inline nodiscard

Check if health monitoring is running.

Returns
true if running

Definition at line 206 of file health_monitor.h.

bool is_running() const { return running_.load(); }

References running_.

◆ operator=() [1/2]

health_monitor & kcenon::common::interfaces::health_monitor::operator= ( const health_monitor & )
delete

◆ operator=() [2/2]

health_monitor & kcenon::common::interfaces::health_monitor::operator= ( health_monitor && )
delete

◆ refresh()

void kcenon::common::interfaces::health_monitor::refresh ( )
inline

Refresh all health checks.

Executes all registered health checks and updates statistics.

Definition at line 213 of file health_monitor.h.

void refresh() {
    std::lock_guard<std::mutex> lock(mutex_);

    auto start_time = std::chrono::steady_clock::now();

    // Reset counts
    // (lines 219-222 of health_monitor.h are missing from this listing; the
    // References list below names stats_.healthy_count, degraded_count,
    // unhealthy_count and unknown_count)

    auto nodes = graph_.get_all_nodes();
    for (const auto& name : nodes) {
        auto result = graph_.check_with_dependencies(name);
        if (result.is_ok()) {
            last_results_[name] = result.value();
            update_stats_after_check(result.value());

            if (result.value().status == health_status::unhealthy) {
                attempt_recovery(name);
            }
        }
    }

    auto end_time = std::chrono::steady_clock::now();
    stats_.last_check_time = std::chrono::system_clock::now();
    stats_.last_check_duration = std::chrono::duration_cast<std::chrono::milliseconds>(
        end_time - start_time);
    // (line 241 is missing from this listing; the References list below
    // names stats_.check_executions)
}

References attempt_recovery(), kcenon::common::interfaces::health_monitor_stats::check_executions, kcenon::common::interfaces::health_dependency_graph::check_with_dependencies(), kcenon::common::interfaces::health_monitor_stats::degraded_count, kcenon::common::interfaces::health_dependency_graph::get_all_nodes(), graph_, kcenon::common::interfaces::health_monitor_stats::healthy_count, kcenon::common::interfaces::health_monitor_stats::last_check_duration, kcenon::common::interfaces::health_monitor_stats::last_check_time, last_results_, mutex_, stats_, kcenon::common::interfaces::unhealthy, kcenon::common::interfaces::health_monitor_stats::unhealthy_count, kcenon::common::interfaces::health_monitor_stats::unknown_count, and update_stats_after_check().
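
The "Reset counts" step matters because the per-status counters describe the latest full pass rather than a running total. The reset-then-tally idea can be sketched as below; the `status` enum and `status_tally` struct are hypothetical mirrors of the four counters named in the statistics, not the library's own types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical mirror of the four status buckets named in the statistics fields.
enum class status { healthy, degraded, unhealthy, unknown };

struct status_tally {
    std::size_t healthy = 0;
    std::size_t degraded = 0;
    std::size_t unhealthy = 0;
    std::size_t unknown = 0;
};

// Starts from zeroed counters and classifies every result exactly once,
// so the four buckets always sum to the number of results.
status_tally tally(const std::vector<status>& results) {
    status_tally t;
    for (status s : results) {
        switch (s) {
            case status::healthy:   ++t.healthy;   break;
            case status::degraded:  ++t.degraded;  break;
            case status::unhealthy: ++t.unhealthy; break;
            case status::unknown:   ++t.unknown;   break;
        }
    }
    return t;
}

// Usage: two passes over different results produce independent tallies.
inline void tally_example() {
    const status_tally first = tally({status::healthy, status::unhealthy});
    const status_tally second = tally({status::healthy, status::healthy, status::degraded});
    assert(first.healthy == 1 && first.unhealthy == 1);
    assert(second.healthy == 2 && second.degraded == 1 && second.unhealthy == 0);
}
```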


◆ register_check()

Result< bool > kcenon::common::interfaces::health_monitor::register_check ( const std::string & name,
std::shared_ptr< health_check > check )
inline

Register a health check.

Parameters
name    Unique name for this check
check   Health check implementation
Returns
Result indicating success or failure

Definition at line 112 of file health_monitor.h.

Result<bool> register_check(const std::string& name, std::shared_ptr<health_check> check) {
    std::lock_guard<std::mutex> lock(mutex_);

    auto result = graph_.add_node(name, std::move(check));
    if (result.is_ok()) {
        // (line 117 of health_monitor.h is missing from this listing; the
        // References list below names stats_.total_checks)
    }
    return result;
}

References kcenon::common::interfaces::health_dependency_graph::add_node(), check(), graph_, mutex_, stats_, and kcenon::common::interfaces::health_monitor_stats::total_checks.


◆ register_recovery_handler()

void kcenon::common::interfaces::health_monitor::register_recovery_handler ( const std::string & name,
recovery_handler handler )
inline

Register a recovery handler for a health check.

Parameters
name      Name of the health check
handler   Recovery function to execute on failure

Definition at line 249 of file health_monitor.h.

void register_recovery_handler(const std::string& name, recovery_handler handler) {
    std::lock_guard<std::mutex> lock(mutex_);
    recovery_handlers_[name] = std::move(handler);
}

References mutex_, and recovery_handlers_.

◆ start()

VoidResult kcenon::common::interfaces::health_monitor::start ( )
inline

Start the health monitoring.

Returns
Result indicating success or failure

Definition at line 184 of file health_monitor.h.

VoidResult start() {
    if (running_.exchange(true)) {
        return {error_info{1, "Health monitor is already running", "health_monitor"}};
    }
    return ok(std::monostate{});
}

References kcenon::common::ok(), and running_.


◆ stop()

VoidResult kcenon::common::interfaces::health_monitor::stop ( )
inline

Stop the health monitoring.

Returns
Result indicating success or failure

Definition at line 195 of file health_monitor.h.

VoidResult stop() {
    if (!running_.exchange(false)) {
        return {error_info{1, "Health monitor is not running", "health_monitor"}};
    }
    return ok(std::monostate{});
}

References kcenon::common::ok(), and running_.

Referenced by ~health_monitor().
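
Both start() and stop() rely on `std::atomic<bool>::exchange` returning the previous value, so flipping the flag and asking "was it already in that state?" happen in a single atomic step. A minimal standalone sketch of the pattern follows; `toggle_guard`, `try_start` and `try_stop` are hypothetical names, and a plain `bool` stands in for VoidResult.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the running_ flag logic; not the library's class.
class toggle_guard {
public:
    // Returns true only for the caller that flips the flag from false to true.
    bool try_start() { return !running_.exchange(true); }

    // Returns true only for the caller that flips the flag from true to false.
    bool try_stop() { return running_.exchange(false); }

    bool is_running() const { return running_.load(); }

private:
    std::atomic<bool> running_{false};
};

// Usage: a second start (or stop) in a row is reported as a failure.
inline void toggle_example() {
    toggle_guard g;
    const bool first_start = g.try_start();
    const bool second_start = g.try_start();
    assert(first_start && !second_start);
    assert(g.is_running());
    const bool first_stop = g.try_stop();
    const bool second_stop = g.try_stop();
    assert(first_stop && !second_stop);
    assert(!g.is_running());
}
```

This is also why the destructor can call stop() unconditionally: stopping a monitor that is not running yields an error result, which the destructor discards through value_or().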


◆ unregister_check()

Result< bool > kcenon::common::interfaces::health_monitor::unregister_check ( const std::string & name)
inline

Unregister a health check.

Parameters
name   Name of the check to remove
Returns
Result indicating success or failure

Definition at line 127 of file health_monitor.h.

Result<bool> unregister_check(const std::string& name) {
    std::lock_guard<std::mutex> lock(mutex_);

    auto result = graph_.remove_node(name);
    if (result.is_ok()) {
        // (line 132 of health_monitor.h is missing from this listing; the
        // References list below names stats_.total_checks)
        recovery_handlers_.erase(name);
        last_results_.erase(name);
    }
    return result;
}

References graph_, last_results_, mutex_, recovery_handlers_, kcenon::common::interfaces::health_dependency_graph::remove_node(), stats_, and kcenon::common::interfaces::health_monitor_stats::total_checks.


◆ update_stats_after_check()

Member Data Documentation

◆ graph_

health_dependency_graph kcenon::common::interfaces::health_monitor::graph_
private

◆ last_results_

std::unordered_map<std::string, health_check_result> kcenon::common::interfaces::health_monitor::last_results_
private

Definition at line 384 of file health_monitor.h.

Referenced by check(), get_health_report(), refresh(), and unregister_check().

◆ mutex_

std::mutex kcenon::common::interfaces::health_monitor::mutex_
mutable private

◆ recovery_handlers_

std::unordered_map<std::string, recovery_handler> kcenon::common::interfaces::health_monitor::recovery_handlers_
private

Definition at line 383 of file health_monitor.h.

Referenced by attempt_recovery(), register_recovery_handler(), and unregister_check().

◆ running_

std::atomic<bool> kcenon::common::interfaces::health_monitor::running_ {false}
private

Definition at line 386 of file health_monitor.h.

std::atomic<bool> running_{false};

Referenced by is_running(), start(), and stop().

◆ stats_


The documentation for this class was generated from the following file: