Thread System 0.3.1
High-performance C++20 thread pool with work stealing and DAG scheduling
kcenon::thread::diagnostics::health_status Struct Reference

Comprehensive health status of the thread pool.

#include <health_status.h>

Public Member Functions

auto is_operational () const -> bool
 Checks if the thread pool is operational.
 
auto is_healthy () const -> bool
 Checks if the thread pool is fully healthy.
 
auto http_status_code () const -> int
 Gets HTTP status code for this health status.
 
auto find_component (const std::string &name) const -> const component_health *
 Finds a component by name.
 
auto calculate_overall_status () -> void
 Calculates overall status from component states.
 
auto to_json () const -> std::string
 Converts health status to JSON string.
 
auto to_string () const -> std::string
 Converts health status to human-readable string.
 
auto to_prometheus (const std::string &pool_name="default") const -> std::string
 Converts health status to Prometheus-compatible metrics format.
 

Public Attributes

health_state overall_status {health_state::unknown}
 Overall health state of the thread pool.
 
std::string status_message
 Human-readable message about overall status.
 
std::chrono::steady_clock::time_point check_time
 Time when this health check was performed.
 
std::vector< component_health > components
 Health status of individual components.
 
double uptime_seconds {0.0}
 Time since the thread pool was started (seconds).
 
std::uint64_t total_jobs_processed {0}
 Total number of jobs processed since startup.
 
double success_rate {1.0}
 Job success rate (0.0 to 1.0).
 
double avg_latency_ms {0.0}
 Average job latency in milliseconds.
 
std::size_t active_workers {0}
 Number of active workers.
 
std::size_t total_workers {0}
 Total number of workers.
 
std::size_t queue_depth {0}
 Current queue depth.
 
std::size_t queue_capacity {0}
 Queue capacity (if bounded).
 

Detailed Description

Comprehensive health status of the thread pool.

Contains overall health status, individual component health, and summary metrics. Designed to be compatible with standard health check frameworks and easily serializable to JSON.

Health Check Integration

This structure is designed to integrate with:

  • Kubernetes liveness/readiness probes
  • Spring Boot Actuator style health endpoints
  • Prometheus health metrics

Usage Example

auto health = pool->diagnostics().health_check();
if (health.overall_status == health_state::healthy) {
    return http_response(200, health.to_json());
} else {
    return http_response(health_state_to_http_code(health.overall_status),
                         health.to_json());
}

Definition at line 205 of file health_status.h.

Member Function Documentation

◆ calculate_overall_status()

auto kcenon::thread::diagnostics::health_status::calculate_overall_status ( ) -> void
inline

Calculates overall status from component states.

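Updates overall_status and status_message based on component health. The checks are applied in this order, so the most severe state wins:

  • If empty → unknown
  • If any unhealthy → unhealthy
  • If any degraded → degraded
  • If any unknown → unknown
  • Otherwise (all healthy) → healthy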

Definition at line 334 of file health_status.h.

{
    if (components.empty())
    {
        overall_status = health_state::unknown;
        status_message = "No components registered";
        return;
    }

    bool has_unhealthy = false;
    bool has_degraded = false;
    bool has_unknown = false;

    for (const auto& comp : components)
    {
        switch (comp.state)
        {
            case health_state::unhealthy:
                has_unhealthy = true;
                break;
            case health_state::degraded:
                has_degraded = true;
                break;
            case health_state::unknown:
                has_unknown = true;
                break;
            default:
                break;
        }
    }

    if (has_unhealthy)
    {
        overall_status = health_state::unhealthy;
        status_message = "One or more components are unhealthy";
    }
    else if (has_degraded)
    {
        overall_status = health_state::degraded;
        status_message = "One or more components are degraded";
    }
    else if (has_unknown)
    {
        overall_status = health_state::unknown;
        status_message = "One or more components have unknown status";
    }
    else
    {
        overall_status = health_state::healthy;
        status_message = "All components are healthy";
    }
}

References components, kcenon::thread::diagnostics::degraded, kcenon::thread::diagnostics::healthy, overall_status, status_message, kcenon::thread::diagnostics::unhealthy, and kcenon::thread::diagnostics::unknown.

Referenced by kcenon::thread::diagnostics::thread_pool_diagnostics::health_check().
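
The precedence described for this function can be sketched as a standalone free function. This is a minimal sketch rather than the library's code: the `health_state` enum below is a stand-in that reuses the enumerator names from this page, and `aggregate` is a hypothetical name introduced only for illustration.

```cpp
#include <vector>

// Stand-in for the library's health_state enum (enumerator names taken from this page).
enum class health_state { healthy, degraded, unhealthy, unknown };

// Hypothetical helper: reduces component states to one overall state.
// Precedence: unhealthy > degraded > unknown > healthy; an empty input yields unknown.
inline auto aggregate(const std::vector<health_state>& states) -> health_state {
    if (states.empty()) {
        return health_state::unknown;
    }
    bool has_degraded = false;
    bool has_unknown = false;
    for (const auto state : states) {
        if (state == health_state::unhealthy) {
            return health_state::unhealthy;  // most severe state, no need to look further
        }
        if (state == health_state::degraded) {
            has_degraded = true;
        } else if (state == health_state::unknown) {
            has_unknown = true;
        }
    }
    if (has_degraded) {
        return health_state::degraded;
    }
    return has_unknown ? health_state::unknown : health_state::healthy;
}
```

Note that a degraded component outranks an unknown one, so a pool with one degraded and one unknown component reports degraded.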

◆ find_component()

auto kcenon::thread::diagnostics::health_status::find_component ( const std::string & name) const -> const component_health*
inline nodiscard

Finds a component by name.

Parameters
name    Component name to search for.
Returns
Pointer to component health, or nullptr if not found.

Definition at line 312 of file health_status.h.

{
    for (const auto& comp : components)
    {
        if (comp.name == name)
        {
            return &comp;
        }
    }
    return nullptr;
}

References components.
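
Because the function returns a raw pointer that may be null, callers need to check it before dereferencing. The sketch below shows that pattern with stand-ins: `component_health` here carries only the two fields used, the free function `find_component` mirrors the linear search in the listing, and `describe_component` is a hypothetical helper.

```cpp
#include <string>
#include <vector>

// Stand-in for the library's component_health, reduced to the fields used here.
struct component_health {
    std::string name;
    std::string message;
};

// Free-function mirror of the member listing: linear search by name.
inline auto find_component(const std::vector<component_health>& components,
                           const std::string& name) -> const component_health* {
    for (const auto& comp : components) {
        if (comp.name == name) {
            return &comp;
        }
    }
    return nullptr;
}

// Hypothetical helper: formats a component, or reports that it is missing.
inline auto describe_component(const std::vector<component_health>& components,
                               const std::string& name) -> std::string {
    if (const auto* comp = find_component(components, name)) {
        return comp->name + ": " + comp->message;
    }
    return name + ": not registered";
}
```

The returned pointer refers to an element of the `components` vector, so it should not be kept across any operation that modifies that vector.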

◆ http_status_code()

auto kcenon::thread::diagnostics::health_status::http_status_code ( ) const -> int
inline nodiscard

Gets HTTP status code for this health status.

Returns
Appropriate HTTP status code.

Definition at line 302 of file health_status.h.

{
    return health_state_to_http_code(overall_status);
}

References kcenon::thread::diagnostics::health_state_to_http_code(), and overall_status.

Referenced by to_json(), and to_string().

◆ is_healthy()

auto kcenon::thread::diagnostics::health_status::is_healthy ( ) const -> bool
inline nodiscard

Checks if the thread pool is fully healthy.

Returns
true if overall status is healthy.

Definition at line 293 of file health_status.h.

{
    return overall_status == health_state::healthy;
}

References kcenon::thread::diagnostics::healthy, and overall_status.

◆ is_operational()

auto kcenon::thread::diagnostics::health_status::is_operational ( ) const -> bool
inline nodiscard

Checks if the thread pool is operational.

Returns
true if overall status is healthy or degraded.

Definition at line 283 of file health_status.h.

References kcenon::thread::diagnostics::degraded, kcenon::thread::diagnostics::healthy, and overall_status.
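
One way to use the two predicates is to back a liveness probe with the looser check (healthy or degraded) and a readiness probe with the stricter one (healthy only). The sketch below illustrates that mapping under assumptions: `health_state` is a stand-in enum, and `probe_result` and `evaluate_probes` are hypothetical names that are not part of the library. Whether a degraded pool should keep receiving traffic is a deployment decision.

```cpp
// Stand-in for the library's health_state enum (enumerator names taken from this page).
enum class health_state { healthy, degraded, unhealthy, unknown };

// Hypothetical probe outcome, introduced for illustration only.
struct probe_result {
    bool alive;  // liveness: the process is worth keeping
    bool ready;  // readiness: the pool should receive new work
};

// alive follows the is_operational() rule (healthy or degraded);
// ready follows the is_healthy() rule (healthy only).
inline auto evaluate_probes(health_state overall) -> probe_result {
    const bool operational = overall == health_state::healthy ||
                             overall == health_state::degraded;
    const bool fully_healthy = overall == health_state::healthy;
    return probe_result{operational, fully_healthy};
}
```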

◆ to_json()

auto kcenon::thread::diagnostics::health_status::to_json ( ) const -> std::string
inline nodiscard

Converts health status to JSON string.

Output format is compatible with standard health check endpoints and monitoring tools like Kubernetes, Spring Boot Actuator, etc.

Returns
JSON string representation of the health status.

Definition at line 399 of file health_status.h.

{
    std::ostringstream oss;
    oss << std::fixed;

    oss << "{\n";
    oss << " \"status\": \"" << health_state_to_string(overall_status) << "\",\n";
    oss << " \"message\": \"" << status_message << "\",\n";
    oss << " \"http_code\": " << http_status_code() << ",\n";

    // Metrics
    oss << " \"metrics\": {\n";
    oss << " \"uptime_seconds\": " << std::setprecision(2) << uptime_seconds << ",\n";
    oss << " \"total_jobs_processed\": " << total_jobs_processed << ",\n";
    oss << " \"success_rate\": " << std::setprecision(4) << success_rate << ",\n";
    oss << " \"avg_latency_ms\": " << std::setprecision(3) << avg_latency_ms << "\n";
    oss << " },\n";

    // Workers
    oss << " \"workers\": {\n";
    oss << " \"total\": " << total_workers << ",\n";
    oss << " \"active\": " << active_workers << ",\n";
    oss << " \"idle\": " << (total_workers - active_workers) << "\n";
    oss << " },\n";

    // Queue
    oss << " \"queue\": {\n";
    oss << " \"depth\": " << queue_depth << ",\n";
    oss << " \"capacity\": " << queue_capacity << "\n";
    oss << " },\n";

    // Components
    oss << " \"components\": [\n";
    for (std::size_t i = 0; i < components.size(); ++i)
    {
        const auto& comp = components[i];
        oss << " {\n";
        oss << " \"name\": \"" << comp.name << "\",\n";
        oss << " \"status\": \"" << health_state_to_string(comp.state) << "\",\n";
        oss << " \"message\": \"" << comp.message << "\"";

        if (!comp.details.empty())
        {
            oss << ",\n \"details\": {\n";
            std::size_t detail_idx = 0;
            for (const auto& [key, value] : comp.details)
            {
                oss << " \"" << key << "\": \"" << value << "\"";
                if (++detail_idx < comp.details.size())
                {
                    oss << ",";
                }
                oss << "\n";
            }
            oss << " }\n";
        }
        else
        {
            oss << "\n";
        }

        oss << " }";
        if (i < components.size() - 1)
        {
            oss << ",";
        }
        oss << "\n";
    }
    oss << " ]\n";
    oss << "}";

    return oss.str();
}

References active_workers, avg_latency_ms, components, kcenon::thread::diagnostics::health_state_to_string(), http_status_code(), overall_status, queue_capacity, queue_depth, status_message, success_rate, total_jobs_processed, total_workers, and uptime_seconds.
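
The listing streams `status_message`, component names, messages, and detail values into the output without escaping, so a double quote, backslash, or control character in any of them would produce invalid JSON. A caller who cannot guarantee clean strings could sanitize them before storing them in the struct. The helper below is a minimal sketch of such escaping; `json_escape` is a hypothetical name and is not part of the library.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: escapes a string for use inside a JSON string literal.
// Handles the double quote, the backslash, and control characters below 0x20.
inline auto json_escape(const std::string& in) -> std::string {
    std::string out;
    out.reserve(in.size());
    for (const unsigned char ch : in) {
        switch (ch) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:
                if (ch < 0x20) {
                    // Remaining control characters use the \u00XX form.
                    char buf[7];
                    std::snprintf(buf, sizeof buf, "\\u%04x", static_cast<unsigned>(ch));
                    out += buf;
                } else {
                    out += static_cast<char>(ch);
                }
                break;
        }
    }
    return out;
}
```

For example, a component message could be assigned as `comp.message = json_escape(raw_message);` before `to_json()` is called, assuming the message is not also needed in unescaped form.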

◆ to_prometheus()

auto kcenon::thread::diagnostics::health_status::to_prometheus ( const std::string & pool_name = "default") const -> std::string
inline nodiscard

Converts health status to Prometheus-compatible metrics format.

Produces metrics in Prometheus exposition format suitable for scraping by Prometheus or compatible monitoring systems.

Parameters
pool_name    Name of the thread pool for metric labels.
Returns
Prometheus-formatted metrics string.

Output format:

# HELP thread_pool_health_status Health status (1=healthy, 0.5=degraded, 0=unhealthy)
# TYPE thread_pool_health_status gauge
thread_pool_health_status{pool="MyPool"} 1
# HELP thread_pool_uptime_seconds Total uptime in seconds
# TYPE thread_pool_uptime_seconds counter
thread_pool_uptime_seconds{pool="MyPool"} 3600.5

Definition at line 542 of file health_status.h.

{
    std::ostringstream oss;
    oss << std::fixed;

    // Health status (1 = healthy, 0.5 = degraded, 0 = unhealthy/unknown)
    double health_value = 0.0;
    switch (overall_status)
    {
        case health_state::healthy: health_value = 1.0; break;
        case health_state::degraded: health_value = 0.5; break;
        default: health_value = 0.0; break;
    }
    oss << "# HELP thread_pool_health_status Health status (1=healthy, 0.5=degraded, 0=unhealthy)\n";
    oss << "# TYPE thread_pool_health_status gauge\n";
    oss << "thread_pool_health_status{pool=\"" << pool_name << "\"} "
        << std::setprecision(1) << health_value << "\n\n";

    // Uptime
    oss << "# HELP thread_pool_uptime_seconds Total uptime in seconds\n";
    oss << "# TYPE thread_pool_uptime_seconds counter\n";
    oss << "thread_pool_uptime_seconds{pool=\"" << pool_name << "\"} "
        << std::setprecision(2) << uptime_seconds << "\n\n";

    // Jobs processed
    oss << "# HELP thread_pool_jobs_total Total number of jobs processed\n";
    oss << "# TYPE thread_pool_jobs_total counter\n";
    oss << "thread_pool_jobs_total{pool=\"" << pool_name << "\"} "
        << total_jobs_processed << "\n\n";

    // Success rate
    oss << "# HELP thread_pool_success_rate Ratio of successful jobs (0.0 to 1.0)\n";
    oss << "# TYPE thread_pool_success_rate gauge\n";
    oss << "thread_pool_success_rate{pool=\"" << pool_name << "\"} "
        << std::setprecision(4) << success_rate << "\n\n";

    // Average latency
    oss << "# HELP thread_pool_latency_avg_ms Average job latency in milliseconds\n";
    oss << "# TYPE thread_pool_latency_avg_ms gauge\n";
    oss << "thread_pool_latency_avg_ms{pool=\"" << pool_name << "\"} "
        << std::setprecision(3) << avg_latency_ms << "\n\n";

    // Workers
    oss << "# HELP thread_pool_workers_total Total number of workers\n";
    oss << "# TYPE thread_pool_workers_total gauge\n";
    oss << "thread_pool_workers_total{pool=\"" << pool_name << "\"} "
        << total_workers << "\n\n";

    oss << "# HELP thread_pool_workers_active Number of active workers\n";
    oss << "# TYPE thread_pool_workers_active gauge\n";
    oss << "thread_pool_workers_active{pool=\"" << pool_name << "\"} "
        << active_workers << "\n\n";

    oss << "# HELP thread_pool_workers_idle Number of idle workers\n";
    oss << "# TYPE thread_pool_workers_idle gauge\n";
    oss << "thread_pool_workers_idle{pool=\"" << pool_name << "\"} "
        << (total_workers - active_workers) << "\n\n";

    // Queue
    oss << "# HELP thread_pool_queue_depth Current queue depth\n";
    oss << "# TYPE thread_pool_queue_depth gauge\n";
    oss << "thread_pool_queue_depth{pool=\"" << pool_name << "\"} "
        << queue_depth << "\n\n";

    if (queue_capacity > 0)
    {
        oss << "# HELP thread_pool_queue_capacity Maximum queue capacity\n";
        oss << "# TYPE thread_pool_queue_capacity gauge\n";
        oss << "thread_pool_queue_capacity{pool=\"" << pool_name << "\"} "
            << queue_capacity << "\n\n";

        double saturation = static_cast<double>(queue_depth) /
                            static_cast<double>(queue_capacity);
        oss << "# HELP thread_pool_queue_saturation Queue saturation ratio (0.0 to 1.0)\n";
        oss << "# TYPE thread_pool_queue_saturation gauge\n";
        oss << "thread_pool_queue_saturation{pool=\"" << pool_name << "\"} "
            << std::setprecision(4) << saturation << "\n\n";
    }

    // Component health
    for (const auto& comp : components)
    {
        double comp_health = 0.0;
        switch (comp.state)
        {
            case health_state::healthy: comp_health = 1.0; break;
            case health_state::degraded: comp_health = 0.5; break;
            default: comp_health = 0.0; break;
        }
        oss << "# HELP thread_pool_component_health Component health status\n";
        oss << "# TYPE thread_pool_component_health gauge\n";
        oss << "thread_pool_component_health{pool=\"" << pool_name
            << "\",component=\"" << comp.name << "\"} "
            << std::setprecision(1) << comp_health << "\n";
    }

    return oss.str();
}

References active_workers, avg_latency_ms, components, kcenon::thread::diagnostics::degraded, kcenon::thread::diagnostics::healthy, overall_status, queue_capacity, queue_depth, success_rate, total_jobs_processed, total_workers, and uptime_seconds.
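
The listing inserts `pool_name` and component names into label values as they are. The Prometheus text exposition format requires backslash, double quote, and line feed to be escaped inside label values, so names containing those characters would need escaping first. The sketch below shows one way to do that and to format a single sample line; `escape_label_value` and `format_sample` are hypothetical helpers, not library functions.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: escapes a label value for the Prometheus text format
// (backslash, double quote, and line feed are the characters that need escaping).
inline auto escape_label_value(const std::string& in) -> std::string {
    std::string out;
    out.reserve(in.size());
    for (const char ch : in) {
        switch (ch) {
            case '\\': out += "\\\\"; break;
            case '"':  out += "\\\""; break;
            case '\n': out += "\\n";  break;
            default:   out += ch;     break;
        }
    }
    return out;
}

// Hypothetical helper: formats one sample line carrying a single `pool` label,
// using fixed notation with the given number of decimals, as the listing does.
inline auto format_sample(const std::string& metric, const std::string& pool_name,
                          double value, int precision) -> std::string {
    std::ostringstream oss;
    oss << std::fixed << std::setprecision(precision)
        << metric << "{pool=\"" << escape_label_value(pool_name) << "\"} "
        << value << "\n";
    return oss.str();
}
```

A simpler alternative is to restrict pool and component names to characters that never need escaping, such as letters, digits, and underscores.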

◆ to_string()

auto kcenon::thread::diagnostics::health_status::to_string ( ) const -> std::string
inline nodiscard

Converts health status to human-readable string.

Provides a formatted text representation suitable for logging or console output.

Returns
Human-readable string representation.

Definition at line 481 of file health_status.h.

{
    std::ostringstream oss;
    oss << std::fixed;

    oss << "=== Health Status: " << health_state_to_string(overall_status)
        << " (HTTP " << http_status_code() << ") ===\n";
    oss << "Message: " << status_message << "\n\n";

    oss << "Metrics:\n";
    oss << " Uptime: " << std::setprecision(1) << uptime_seconds << " seconds\n";
    oss << " Jobs processed: " << total_jobs_processed << "\n";
    oss << " Success rate: " << std::setprecision(1) << (success_rate * 100.0) << "%\n";
    oss << " Avg latency: " << std::setprecision(2) << avg_latency_ms << " ms\n\n";

    oss << "Workers: " << active_workers << "/" << total_workers << " active";
    if (total_workers > 0)
    {
        oss << " (" << (total_workers - active_workers) << " idle)";
    }
    oss << "\n";

    oss << "Queue: " << queue_depth;
    if (queue_capacity > 0)
    {
        double saturation = static_cast<double>(queue_depth) /
                            static_cast<double>(queue_capacity) * 100.0;
        oss << "/" << queue_capacity << " (" << std::setprecision(1)
            << saturation << "% full)";
    }
    oss << "\n\n";

    oss << "Components:\n";
    for (const auto& comp : components)
    {
        oss << " [" << health_state_to_string(comp.state) << "] "
            << comp.name << ": " << comp.message << "\n";
    }

    return oss.str();
}

References active_workers, avg_latency_ms, components, kcenon::thread::diagnostics::health_state_to_string(), http_status_code(), overall_status, queue_capacity, queue_depth, status_message, success_rate, total_jobs_processed, total_workers, and uptime_seconds.
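
The idle-worker count in the listing is computed as `total_workers - active_workers` on `std::size_t`, which wraps around to a very large value if `active_workers` ever exceeds `total_workers`. Whether that can happen depends on how the two fields are populated, which this page does not show. The sketch below is a defensive variant of the two derived values, using the hypothetical helper names `idle_workers` and `saturation_percent`.

```cpp
#include <cstddef>

// Hypothetical helper: idle workers, clamped at zero instead of wrapping around.
inline auto idle_workers(std::size_t total, std::size_t active) -> std::size_t {
    return active < total ? total - active : 0;
}

// Hypothetical helper: queue fill level in percent.
// A capacity of 0 is treated as "unbounded" and reported as 0.0, mirroring the
// listing's choice to skip saturation output when queue_capacity is 0.
inline auto saturation_percent(std::size_t depth, std::size_t capacity) -> double {
    if (capacity == 0) {
        return 0.0;
    }
    return static_cast<double>(depth) / static_cast<double>(capacity) * 100.0;
}
```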

Member Data Documentation

◆ active_workers

std::size_t kcenon::thread::diagnostics::health_status::active_workers {0}

Number of active workers.

Definition at line 258 of file health_status.h.

Referenced by kcenon::thread::diagnostics::thread_pool_diagnostics::health_check(), to_json(), to_prometheus(), and to_string().

◆ avg_latency_ms

double kcenon::thread::diagnostics::health_status::avg_latency_ms {0.0}

Average job latency in milliseconds.

Definition at line 253 of file health_status.h.

Referenced by kcenon::thread::diagnostics::thread_pool_diagnostics::health_check(), to_json(), to_prometheus(), and to_string().

◆ check_time

std::chrono::steady_clock::time_point kcenon::thread::diagnostics::health_status::check_time

Time when this health check was performed.

Definition at line 224 of file health_status.h.

Referenced by kcenon::thread::diagnostics::thread_pool_diagnostics::health_check().

◆ components

std::vector<component_health> kcenon::thread::diagnostics::health_status::components

Health status of individual components.

◆ overall_status

health_state kcenon::thread::diagnostics::health_status::overall_status {health_state::unknown}

Overall health state of the thread pool.

Aggregated from all component health states. If any component is unhealthy, the overall status is unhealthy. Otherwise, if any component is degraded, the overall status is degraded.

Definition at line 214 of file health_status.h.

Referenced by calculate_overall_status(), http_status_code(), is_healthy(), is_operational(), to_json(), to_prometheus(), and to_string().

◆ queue_capacity

std::size_t kcenon::thread::diagnostics::health_status::queue_capacity {0}

Queue capacity (if bounded).

Definition at line 273 of file health_status.h.

Referenced by kcenon::thread::diagnostics::thread_pool_diagnostics::health_check(), to_json(), to_prometheus(), and to_string().

◆ queue_depth

std::size_t kcenon::thread::diagnostics::health_status::queue_depth {0}

Current queue depth.

Definition at line 268 of file health_status.h.

Referenced by kcenon::thread::diagnostics::thread_pool_diagnostics::health_check(), to_json(), to_prometheus(), and to_string().

◆ status_message

std::string kcenon::thread::diagnostics::health_status::status_message

Human-readable message about overall status.

Definition at line 219 of file health_status.h.

Referenced by calculate_overall_status(), to_json(), and to_string().

◆ success_rate

double kcenon::thread::diagnostics::health_status::success_rate {1.0}

Job success rate (0.0 to 1.0).

Definition at line 248 of file health_status.h.

Referenced by kcenon::thread::diagnostics::thread_pool_diagnostics::health_check(), to_json(), to_prometheus(), and to_string().

◆ total_jobs_processed

std::uint64_t kcenon::thread::diagnostics::health_status::total_jobs_processed {0}

Total number of jobs processed since startup.

Definition at line 243 of file health_status.h.

Referenced by kcenon::thread::diagnostics::thread_pool_diagnostics::health_check(), to_json(), to_prometheus(), and to_string().

◆ total_workers

std::size_t kcenon::thread::diagnostics::health_status::total_workers {0}

Total number of workers.

Definition at line 263 of file health_status.h.

Referenced by kcenon::thread::diagnostics::thread_pool_diagnostics::health_check(), to_json(), to_prometheus(), and to_string().

◆ uptime_seconds

double kcenon::thread::diagnostics::health_status::uptime_seconds {0.0}

Time since the thread pool was started (seconds).

Definition at line 238 of file health_status.h.

Referenced by kcenon::thread::diagnostics::thread_pool_diagnostics::health_check(), to_json(), to_prometheus(), and to_string().


The documentation for this struct was generated from the following file: