APM Tools Investigation

Overview

Klaudia's APM Tools enable intelligent investigation and troubleshooting by correlating Kubernetes cluster state with application performance metrics from your observability platforms. When applications experience performance degradation, errors, or anomalies, Klaudia can cross-reference Kubernetes events, pod health, and deployment changes with APM telemetry to provide comprehensive root cause analysis.

Modern applications generate vast amounts of telemetry data. Klaudia bridges the gap between Kubernetes infrastructure insights and application-level observability by automatically querying your APM platforms during investigations. This correlation helps identify whether issues originate from infrastructure (Kubernetes, nodes, networking) or application code (bugs, performance regressions, dependency failures).

Supported Tools

Tool        Description
Datadog     Application performance monitoring, distributed tracing, and log analysis
New Relic   Full-stack observability including APM, infrastructure, and more

What Klaudia Can Do

General Capabilities

  • Error Rate Correlation: Link Kubernetes events with application error spikes
  • Latency Analysis: Correlate performance degradation with infrastructure changes
  • Trace Investigation: Examine distributed traces for failing requests
  • Log Correlation: Query application logs in context of Kubernetes events
  • Deployment Impact: Assess application metrics before/after deployments
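
To make the Deployment Impact idea concrete, here is a minimal sketch that compares the average error rate in a window before and after a deployment. It is an illustration only, not Klaudia's implementation: the function name `deployment_impact`, the `(timestamp, error_rate)` sample format, and the 10-minute window are all assumptions.

```python
from datetime import datetime, timedelta
from statistics import mean


def deployment_impact(samples, deployed_at, window=timedelta(minutes=10)):
    """Compare mean error rate before and after a deployment.

    samples: iterable of (datetime, error_rate) pairs (hypothetical format).
    Returns (before_avg, after_avg, delta), or None if either window is empty.
    """
    before = [rate for ts, rate in samples
              if deployed_at - window <= ts < deployed_at]
    after = [rate for ts, rate in samples
             if deployed_at <= ts < deployed_at + window]
    if not before or not after:
        return None  # not enough data to compare
    before_avg, after_avg = mean(before), mean(after)
    return before_avg, after_avg, after_avg - before_avg


# Illustrative data: error rate as a fraction of requests.
deploy = datetime(2024, 1, 1, 12, 0)
samples = [
    (deploy - timedelta(minutes=5), 0.01),
    (deploy - timedelta(minutes=2), 0.03),
    (deploy + timedelta(minutes=1), 0.10),
    (deploy + timedelta(minutes=4), 0.14),
]
result = deployment_impact(samples, deploy)
```

A positive delta suggests the error rate rose after the deployment. In practice, the samples would come from the APM platform and the window would be tuned to the service's traffic pattern.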

Datadog

  • Query APM traces for services affected by Kubernetes issues
  • Analyze service maps and dependencies
  • Coming soon: Retrieve error logs correlated with pod failures
  • Coming soon: Correlate deployment events with error rate changes

New Relic

  • Query APM traces for services affected by Kubernetes issues
  • Analyze service maps and dependencies
  • Coming soon: Retrieve error logs correlated with pod failures
  • Coming soon: Correlate deployment events with error rate changes

When Klaudia Uses APM Tools

Klaudia automatically engages APM investigation tools in the following scenarios:

Root Cause Analysis (RCA)

When Kubernetes issues may be affecting application performance:

  • Pod restarts correlating with error rate increases
  • Deployment changes coinciding with latency spikes
  • Resource constraints potentially causing timeouts
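
The kind of time-based correlation described above can be sketched as follows. This is a simplified illustration under stated assumptions, not how Klaudia performs the analysis: the function `restarts_near_spikes`, the tuple formats, the pod and service names, and the 2-minute tolerance are all hypothetical.

```python
from datetime import datetime, timedelta


def restarts_near_spikes(restarts, spikes, tolerance=timedelta(minutes=2)):
    """Pair each pod restart with error spikes that occur within `tolerance` of it.

    restarts: list of (pod_name, restarted_at) pairs (hypothetical format).
    spikes:   list of (service_name, spike_at) pairs (hypothetical format).
    Returns a list of (pod_name, service_name, offset) tuples, where offset
    is the spike time minus the restart time.
    """
    matches = []
    for pod, restarted_at in restarts:
        for service, spike_at in spikes:
            if abs(spike_at - restarted_at) <= tolerance:
                matches.append((pod, service, spike_at - restarted_at))
    return matches


# Illustrative data only.
t0 = datetime(2024, 1, 1, 9, 0)
restarts = [
    ("checkout-7d9f", t0),
    ("cart-5c2a", t0 + timedelta(minutes=30)),
]
spikes = [
    ("checkout", t0 + timedelta(seconds=45)),
    ("search", t0 + timedelta(minutes=10)),
]
matches = restarts_near_spikes(restarts, spikes)
```

Here only the `checkout-7d9f` restart is paired with a spike, because the `checkout` spike begins 45 seconds after it. Temporal proximity alone does not prove causation; it only narrows down which events deserve a closer look.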

Troubleshooting Unhealthy Resources

When application-level data can help diagnose Kubernetes issues:

  • Health check failures that may be application-related
  • Services showing degraded availability
  • Pods terminated due to application errors

Chat Sessions

When you ask Klaudia questions requiring APM context:

  • "Is this pod restart causing user-facing errors?"
  • "Did the last deployment cause performance regression?"
  • "What application errors are correlating with this incident?"
