Overview
Klaudia's Kubernetes Extensions Tools enable deep integration with the ecosystem of tools that extend Kubernetes functionality. Beyond core Kubernetes resources, modern clusters rely on cert-manager for TLS, ExternalDNS for DNS automation, Karpenter for node provisioning, Helm for package management, ArgoCD for GitOps, and more. Klaudia can investigate issues across all these extensions, providing unified troubleshooting that spans your entire Kubernetes stack.
By enabling K8s Extensions Tools, Klaudia can:
- Investigate certificate issues and TLS failures with cert-manager
- Troubleshoot DNS record propagation and ExternalDNS problems
- Analyze node scaling decisions and Karpenter provisioner behavior
- Examine Helm release history and chart deployment issues
- Investigate ArgoCD sync failures and GitOps drift
- Debug Argo Workflows execution and pipeline failures
- Analyze Cluster API provisioning and infrastructure issues
Supported Kubernetes Extensions
| Extension | Description |
| --- | --- |
| cert-manager | Certificate management, TLS issuance, and renewal automation |
| ExternalDNS | Automatic DNS record management for Kubernetes services |
| Karpenter / Autoscalers | Node provisioning, cluster autoscaling, and capacity management |
| Helm | Package management, release history, and chart deployments |
| ArgoCD | GitOps continuous delivery, application sync, and drift detection |
| Argo Workflows | Workflow orchestration, pipeline execution, and job management |
| Cluster API | Kubernetes cluster lifecycle management and infrastructure provisioning |
Prerequisites
Each extension must be deployed and running in your cluster, along with any Custom Resource Definitions (CRDs) it defines, before Klaudia can investigate it.
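One way to check this prerequisite yourself is to compare the CRD names installed in the cluster against one representative CRD per extension. The sketch below is illustrative only and is not how Klaudia performs detection. The `EXTENSION_CRDS` mapping and the `detect_extensions` helper are hypothetical names, and the list is not exhaustive. Helm is left out because Helm 3 keeps release state in Secrets by default rather than in custom resources.

```python
# Representative CRD names, one per extension (assumed sufficient for a quick check).
# Helm is omitted: Helm 3 stores releases in Secrets by default, not custom resources.
EXTENSION_CRDS = {
    "cert-manager": "certificates.cert-manager.io",
    "Karpenter": "nodepools.karpenter.sh",
    "ArgoCD": "applications.argoproj.io",
    "Argo Workflows": "workflows.argoproj.io",
    "Cluster API": "clusters.cluster.x-k8s.io",
}


def detect_extensions(installed_crds):
    """Return {extension: bool} given an iterable of installed CRD names."""
    installed = set(installed_crds)
    return {ext: crd in installed for ext, crd in EXTENSION_CRDS.items()}


if __name__ == "__main__":
    # Example input; in practice the names could come from `kubectl get crds -o name`
    # with the resource-type prefix removed.
    print(detect_extensions(["certificates.cert-manager.io", "workflows.argoproj.io"]))
```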
What Kubernetes Extensions Tools Can Do
General Capabilities
When investigating issues, Klaudia can:
- Query extension resources: Read the spec and status of the custom resources each extension defines
- Analyze extension events: Correlate Kubernetes events with extension behavior
- Check controller logs: Access logs from extension controllers for detailed diagnostics
- Track resource dependencies: Understand relationships between extensions and workloads
- Correlate timelines: Match extension changes with application issues
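To illustrate what timeline correlation means in practice, the sketch below selects extension events that occurred shortly before an incident. It is a simplified illustration, not Klaudia's implementation. The `events_near` helper, the event dictionary shape (`time`, `what`), and the 10-minute default window are all assumptions made for this example.

```python
from datetime import datetime, timedelta


def events_near(events, incident_time, window=timedelta(minutes=10)):
    """Return events that happened within `window` before the incident, oldest first."""
    return sorted(
        (e for e in events if timedelta(0) <= incident_time - e["time"] <= window),
        key=lambda e: e["time"],
    )


if __name__ == "__main__":
    incident = datetime(2024, 1, 1, 12, 0)
    sample = [
        {"time": incident - timedelta(minutes=3), "what": "certificate renewed"},
        {"time": incident - timedelta(minutes=30), "what": "helm upgrade"},
    ]
    for event in events_near(sample, incident):
        print(event["time"].isoformat(), event["what"])
```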
Extension-Specific Capabilities
cert-manager
Klaudia can investigate:
- Certificate status: Check if certificates are ready, pending, or failed
- Issuance failures: Analyze why certificate requests fail (ACME challenges, issuer configuration)
- Renewal issues: Identify certificates at risk of expiration
- Challenge debugging: Investigate HTTP01 or DNS01 challenge failures
- Issuer health: Verify ClusterIssuers and Issuers are properly configured
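As a rough illustration of the certificate status and renewal checks above, the sketch below classifies a cert-manager `Certificate` resource from its `status.conditions` and `status.notAfter` fields. The `certificate_state` helper and the 14-day warning threshold are assumptions for this example, not Klaudia's actual logic.

```python
from datetime import datetime, timedelta, timezone


def certificate_state(cert, now, warn_within=timedelta(days=14)):
    """Classify a Certificate dict as 'ready', 'expiring-soon', or 'not-ready (...)'."""
    status = cert.get("status", {})
    ready = next(
        (c for c in status.get("conditions", []) if c.get("type") == "Ready"),
        None,
    )
    if ready is None or ready.get("status") != "True":
        reason = ready.get("reason", "Unknown") if ready else "NoReadyCondition"
        return f"not-ready ({reason})"
    not_after = status.get("notAfter")
    if not_after:
        # Kubernetes timestamps are RFC 3339 in UTC, e.g. 2024-03-01T00:00:00Z.
        expires = datetime.strptime(not_after, "%Y-%m-%dT%H:%M:%SZ")
        expires = expires.replace(tzinfo=timezone.utc)
        if expires - now <= warn_within:
            return "expiring-soon"
    return "ready"
```

The input is the parsed JSON of a single Certificate, for example from `kubectl get certificate <name> -o json`.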
ExternalDNS
Klaudia can investigate:
- DNS record sync status: Check whether records are being created or updated
- Provider errors: Identify authentication or API failures with DNS providers
- Record propagation: Analyze why DNS changes have not taken effect
- Source filtering: Understand which services ExternalDNS is watching
- Ownership conflicts: Detect TXT record ownership issues
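To make the ownership-conflict case concrete: with the TXT registry, ExternalDNS writes a TXT record of the form `heritage=external-dns,external-dns/owner=<id>,...` next to each record it manages. The sketch below parses such a value and flags records owned by a different owner ID. The helper names are hypothetical, and the example assumes a plain-text (unencrypted) TXT value.

```python
def parse_ownership(txt_value):
    """Parse an ExternalDNS TXT registry value into a dict of its key=value parts."""
    fields = {}
    for part in txt_value.strip('"').split(","):
        key, sep, value = part.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def ownership_conflict(txt_value, expected_owner):
    """Return True when the record is ExternalDNS-managed but by a different owner ID."""
    fields = parse_ownership(txt_value)
    if fields.get("heritage") != "external-dns":
        return False  # not an ExternalDNS ownership record
    return fields.get("external-dns/owner") != expected_owner
```

A conflict of this kind typically appears when two ExternalDNS instances with different owner IDs target the same hosted zone.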
Karpenter / Autoscalers
Klaudia can investigate:
- Node provisioning decisions: Understand why nodes were or weren't provisioned
- Capacity issues: Identify scheduling failures due to resource constraints
- Consolidation events: Analyze node consolidation and disruption
- NodePool configuration: Review provisioner settings and constraints
- Scaling delays: Investigate slow scale-up or scale-down behavior
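Capacity investigations usually start from the pods the scheduler could not place. As a hedged sketch, the function below scans parsed Pod objects for the `PodScheduled=False` condition with reason `Unschedulable` and returns the scheduler's message, which often names the unmet constraint. The `unschedulable_pods` helper is a hypothetical name for this example.

```python
def unschedulable_pods(pods):
    """Return (namespace/name, scheduler message) for Pending pods that could not be placed."""
    results = []
    for pod in pods:
        status = pod.get("status", {})
        if status.get("phase") != "Pending":
            continue
        for cond in status.get("conditions", []):
            if (
                cond.get("type") == "PodScheduled"
                and cond.get("status") == "False"
                and cond.get("reason") == "Unschedulable"
            ):
                meta = pod.get("metadata", {})
                name = f'{meta.get("namespace", "default")}/{meta.get("name", "?")}'
                results.append((name, cond.get("message", "")))
    return results
```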
Helm
Klaudia can investigate:
- Release history: Review deployment history and rollback options
- Failed deployments: Analyze why Helm releases failed
- Value differences: Compare current values with previous releases
- Hook failures: Identify failures in pre- and post-install or upgrade hooks
- Resource drift: Detect manual changes to Helm-managed resources
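The value comparison above amounts to a recursive diff of two values trees. The sketch below is a minimal illustration, assuming both revisions have been loaded as nested dictionaries with string keys (for example from `helm get values <release> --revision <n> -o json`). The `diff_values` helper is a hypothetical name.

```python
def diff_values(old, new, prefix=""):
    """Yield (dotted.path, old_value, new_value) for every leaf that differs."""
    for key in sorted(set(old) | set(new)):
        path = f"{prefix}.{key}" if prefix else str(key)
        a, b = old.get(key), new.get(key)
        if isinstance(a, dict) and isinstance(b, dict):
            # Recurse into nested sections such as `image` or `resources`.
            yield from diff_values(a, b, path)
        elif a != b:
            yield (path, a, b)


if __name__ == "__main__":
    previous = {"image": {"tag": "1.0"}, "replicas": 2}
    current = {"image": {"tag": "1.1"}, "replicas": 2}
    for path, before, after in diff_values(previous, current):
        print(f"{path}: {before!r} -> {after!r}")
```

A key that exists in only one revision is reported with `None` on the missing side.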
ArgoCD
Klaudia can investigate:
- Sync status: Check whether applications are Synced, OutOfSync, or Unknown
- Sync failures: Analyze why sync operations fail
- Health status: Review application and resource health
- Drift detection: Identify differences between Git and cluster state
- Refresh issues: Investigate repository access or manifest generation failures
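Sync status and health status are separate fields on an ArgoCD `Application`: `status.sync.status` and `status.health.status`. The sketch below reads both and lists the managed resources reported as `OutOfSync`. It is an illustration only, and the `summarize_application` helper is a hypothetical name.

```python
def summarize_application(app):
    """Summarize sync state, health, and drifted resources of an Application dict."""
    status = app.get("status", {})
    sync = status.get("sync", {}).get("status", "Unknown")
    health = status.get("health", {}).get("status", "Unknown")
    drifted = [
        f'{r.get("kind")}/{r.get("name")}'
        for r in status.get("resources", [])
        if r.get("status") == "OutOfSync"
    ]
    return {"sync": sync, "health": health, "drifted": drifted}
```

An application can be `Synced` yet `Degraded`, for example when the manifests applied cleanly but the pods are crash-looping, which is why the two fields are reported separately.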
Argo Workflows
Klaudia can investigate:
- Workflow status: Check whether workflows are pending, running, succeeded, or failed
- Step failures: Identify which workflow steps failed and why
- Resource issues: Analyze pod scheduling or resource constraint problems
- Template errors: Debug workflow template syntax or reference issues
- Artifact problems: Investigate artifact storage or retrieval failures
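Step-level failures are recorded in the `status.nodes` map of a `Workflow` resource, where each node carries a `phase` and, on failure, a `message`. The following sketch extracts the failed pod-level steps. It is a simplified illustration, and the `failed_steps` helper is a hypothetical name.

```python
def failed_steps(workflow):
    """Return sorted (step name, phase, message) tuples for pod nodes that failed."""
    nodes = workflow.get("status", {}).get("nodes") or {}
    failed = [
        (node.get("displayName", node_id), node.get("phase"), node.get("message", ""))
        for node_id, node in nodes.items()
        # Restrict to Pod nodes so parent DAG/Steps nodes are not double-counted.
        if node.get("type") == "Pod" and node.get("phase") in ("Failed", "Error")
    ]
    return sorted(failed)
```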
Cluster API
Klaudia can investigate:
- Cluster provisioning: Analyze cluster creation or upgrade failures
- Machine health: Check machine status and remediation events
- Infrastructure issues: Identify cloud provider API failures
- Control plane status: Verify control plane machine health
- Scaling operations: Investigate MachineDeployment scaling issues
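A quick way to see machine health at a glance is to group Cluster API `Machine` resources by `status.phase`. The sketch below does that and lists the machines that are not in the `Running` phase. It is illustrative only, the `machine_phase_summary` helper is a hypothetical name, and a machine outside `Running` may simply be mid-provisioning rather than broken.

```python
from collections import Counter


def machine_phase_summary(machines):
    """Return (count per phase, sorted names of machines not in the Running phase)."""
    def phase_of(machine):
        return machine.get("status", {}).get("phase", "Unknown")

    counts = Counter(phase_of(m) for m in machines)
    not_running = sorted(
        m.get("metadata", {}).get("name", "?")
        for m in machines
        if phase_of(m) != "Running"
    )
    return dict(counts), not_running
```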
When Klaudia Uses K8s Extensions Tools
Klaudia automatically leverages extension tools in the following scenarios:
Root Cause Analysis (RCA) Investigations
When investigating unhealthy resources, Klaudia:
- Checks cert-manager if TLS/certificate errors are detected
- Queries ExternalDNS if DNS resolution issues are suspected
- Examines Karpenter/autoscaler logs if pods are pending due to node capacity
- Reviews Helm release history if deployment issues correlate with chart upgrades
- Checks ArgoCD sync status if GitOps-managed applications are degraded
- Analyzes Argo Workflow executions if pipeline failures impact deployments
Troubleshooting Unhealthy Resources
Based on the symptoms detected:
- 503 errors or TLS failures → cert-manager certificate investigation
- DNS resolution failures → ExternalDNS record sync check
- Pods stuck in Pending → Karpenter provisioning analysis
- Deployment rollout issues → Helm release history review
- Application drift or config mismatch → ArgoCD sync status check
- CI/CD pipeline failures → Argo Workflows step analysis
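The symptom-to-investigation mapping above can be pictured as a simple lookup table. The sketch below is a deliberately naive keyword matcher meant only to illustrate the idea. The keywords, the `SYMPTOM_ROUTES` table, and the `route_symptom` helper are assumptions for this example and do not describe how Klaudia actually selects tools.

```python
# (keywords, investigation) pairs mirroring the list above; the first match wins.
SYMPTOM_ROUTES = [
    (("503", "tls", "certificate"), "cert-manager certificate investigation"),
    (("dns",), "ExternalDNS record sync check"),
    (("pending",), "Karpenter provisioning analysis"),
    (("rollout",), "Helm release history review"),
    (("drift", "mismatch"), "ArgoCD sync status check"),
    (("pipeline",), "Argo Workflows step analysis"),
]


def route_symptom(description):
    """Return the matching investigation for a symptom description, or None."""
    text = description.lower()
    for keywords, investigation in SYMPTOM_ROUTES:
        if any(keyword in text for keyword in keywords):
            return investigation
    return None
```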
Chat Sessions with Klaudia
You can explicitly ask Klaudia to investigate extensions:
- "Check if the TLS certificate for api.example.com is valid"
- "Why isn't ExternalDNS creating the DNS record for my LoadBalancer?"
- "Why aren't new nodes being provisioned by Karpenter?"
- "Show me the Helm release history for the payments chart"
- "Is the frontend application synced in ArgoCD?"
- "Why did the nightly-batch workflow fail last night?"