You've deployed applications with ArgoCD and automated their sync strategies. But deployments can fail silently: a Pod crashes, the sync stalls, and you don't find out until a customer reports the issue. This chapter teaches you to monitor health status and get alerts when things go wrong.
ArgoCD continuously watches your running applications. It assesses whether each resource is healthy, degraded, or unknown. When status changes, it can notify Slack, send webhooks, or trigger other integrations. By the end of this chapter, you'll understand how ArgoCD evaluates health and how to configure notifications for critical events.
Every resource in ArgoCD has a health status. ArgoCD evaluates health by examining resource state, checking conditions, and running custom health checks.
ArgoCD includes out-of-the-box health logic for standard Kubernetes resources. Let's see how health is determined for each type:
Deployment Health: ArgoCD checks if desired replicas match ready replicas
Output: Deployment health status
When 3 of 3 replicas are ready, status is Healthy. If only 1 of 3 is ready (others starting or crashed), it's Progressing. If the rollout still hasn't completed when the Deployment's progress deadline expires (progressDeadlineSeconds, 10 minutes by default), it's Degraded.
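To make the Progressing case concrete, here is a sketch of the status block Kubernetes might report for a three-replica Deployment in mid-rollout. The values are illustrative rather than captured from a real cluster; ArgoCD's built-in check reads fields like these when it decides between Healthy, Progressing, and Degraded.

```yaml
# Illustrative Deployment status (all values are hypothetical).
status:
  replicas: 3
  updatedReplicas: 3
  readyReplicas: 1          # only 1 of 3 Pods is ready
  availableReplicas: 1
  unavailableReplicas: 2
  conditions:
    - type: Available
      status: "False"
      reason: MinimumReplicasUnavailable
    - type: Progressing
      status: "True"
      reason: ReplicaSetUpdated   # rollout is still making progress
```

If the Progressing condition later reports the reason ProgressDeadlineExceeded, the rollout has stalled, which is the situation ArgoCD surfaces as Degraded.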
Pod Health: ArgoCD checks if all containers are running and conditions are true
Output: Pod with all conditions healthy
StatefulSet Health: ArgoCD verifies all replicas are ready and updated
Output: StatefulSet health when rolling out
Service Health: Most Services are considered healthy as soon as they exist; a Service of type LoadBalancer stays Progressing until it has been assigned an external address
Output: Service health status
Job Health: ArgoCD checks if the job completed successfully
Output: Job health during execution
PersistentVolumeClaim Health: ArgoCD checks if the claim is bound to actual storage
Output: PVC health when storage unavailable
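As a small illustration of the PVC case, the excerpt below shows the status of a claim that has not yet been matched with storage. As far as I know, ArgoCD's built-in check treats a Pending claim as Progressing, a Bound claim as Healthy, and a Lost claim as Degraded, but verify this against the version you run.

```yaml
# Illustrative PersistentVolumeClaim status while no volume is available.
status:
  phase: Pending   # changes to Bound once a PersistentVolume is attached
```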
ArgoCD assigns every resource one of these states: Healthy, Progressing, Degraded, Suspended, Missing, or Unknown.
ArgoCD doesn't just report per-resource health. It also aggregates across all resources in an Application, and the Application takes on the worst status found among them.
Example: You deploy a web app (Deployment) and database (StatefulSet). If the database Pod is Degraded but the web app Deployment is Healthy, the whole Application is Degraded. This forces you to fix the root cause.
Output: Application marked Degraded due to one failed resource
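The excerpt below sketches how this scenario might appear in the Application's own status. The resource names and namespace are hypothetical; the field layout follows the Application resource's status section, where each managed resource carries its own health entry alongside the aggregate.

```yaml
# Illustrative excerpt of an Application's status (names are hypothetical).
status:
  health:
    status: Degraded          # aggregate health for the whole Application
  resources:
    - group: apps
      kind: Deployment
      name: web
      namespace: demo
      status: Synced
      health:
        status: Healthy
    - group: apps
      kind: StatefulSet
      name: db
      namespace: demo
      status: Synced
      health:
        status: Degraded      # this resource drives the aggregate status
```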
Built-in health checks work for standard Kubernetes resources. But if you use custom resources, Operators, or other non-standard resources, you need custom health logic. ArgoCD lets you define health checks in Lua, a lightweight scripting language.
Scenario 1: Operator-Managed Resources
You install an ArgoCD Operator that manages ArgoCD instances. The CRD looks like:
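As a rough illustration, an instance of the operator's custom resource might look like the following. The name and namespace are hypothetical, and the API version is an assumption: recent operator releases serve argoproj.io/v1beta1, while older ones use v1alpha1.

```yaml
# Minimal ArgoCD custom resource (name and namespace are hypothetical).
apiVersion: argoproj.io/v1beta1   # assumption: older releases serve v1alpha1
kind: ArgoCD
metadata:
  name: example-argocd
  namespace: argocd
spec: {}                          # operator defaults apply
```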
Output: Custom resource without built-in health understanding
To fix this, you define a custom Lua health check.
Custom health checks live in the argocd-cm ConfigMap, which ArgoCD reads for its settings:
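A minimal sketch of such a health check follows, assuming the operator reports readiness in a status.phase field with the value Available. That field name and value are assumptions to check against your operator's documentation. The Lua script receives the live resource as obj and returns a table with status and message.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
  labels:
    app.kubernetes.io/part-of: argocd
data:
  resource.customizations.health.argoproj.io_ArgoCD: |
    hs = {}
    -- Assumption: the operator sets status.phase to "Available" when ready.
    if obj.status ~= nil and obj.status.phase == "Available" then
      hs.status = "Healthy"
      hs.message = "ArgoCD instance is available"
      return hs
    end
    -- Anything else is treated as still rolling out.
    hs.status = "Progressing"
    hs.message = "Waiting for the ArgoCD instance to become available"
    return hs
```

In practice you would merge this key into your existing argocd-cm rather than replace the whole ConfigMap, since it holds other ArgoCD settings.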
Output: Custom health check applied to ArgoCD CRD
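Key pattern: resource.customizations.health.{GROUP}_{KIND}, where {GROUP} is the resource's API group (for example, argoproj.io) and {KIND} is its kind (for example, ArgoCD).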
Imagine you use a PostgreSQL Operator that creates database instances:
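For illustration only, such a database instance could look like the sketch below. The API group, kind, and status fields are invented for this example and will differ between operators.

```yaml
# Hypothetical custom resource; group, kind, and status fields are
# invented for illustration and do not match any specific operator.
apiVersion: postgres.example.com/v1
kind: PostgresCluster
metadata:
  name: orders-db
  namespace: demo
spec:
  instances: 3
status:
  conditions:
    - type: Ready
      status: "True"
  lastBackupTime: "2024-01-15T02:00:00Z"
```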
Define health as "Ready condition true AND last backup within 24 hours":
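The sketch below covers only the Ready-condition half of that rule, using a hypothetical postgres.example.com group and PostgresCluster kind. The backup-age half is left out: it needs time functions from Lua's os library, which ArgoCD exposes to a health script only when the matching resource.customizations.useOpenLibs.{GROUP}_{KIND} key is set to true.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
data:
  # Hypothetical group and kind; substitute your operator's values.
  resource.customizations.health.postgres.example.com_PostgresCluster: |
    hs = {}
    -- Default until the operator publishes a Ready condition.
    hs.status = "Progressing"
    hs.message = "Waiting for Ready condition"
    if obj.status ~= nil and obj.status.conditions ~= nil then
      for _, condition in ipairs(obj.status.conditions) do
        if condition.type == "Ready" then
          if condition.status == "True" then
            hs.status = "Healthy"
            hs.message = "Cluster is ready"
          else
            hs.status = "Degraded"
            hs.message = condition.message or "Cluster is not ready"
          end
        end
      end
    end
    return hs
```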
Output: Custom health check for PostgreSQL cluster
ArgoCD fires notifications when certain events occur. Understanding these triggers helps you configure the right notifications for your needs.
On Sync Success: When a sync completes without errors
Output: ArgoCD logs sync success
On Sync Failure: When a sync fails (manifest invalid, image pull fails, etc.)
Output: ArgoCD logs sync failure
On Sync Started: When a sync begins (triggered manually or by auto-sync)
Output: ArgoCD logs sync initiation
On Health Change to Degraded: When a resource's health changes from Healthy to Degraded
Output: Health status change in ArgoCD
On Application Health Degraded: When overall Application health becomes Degraded
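These events map onto trigger definitions in the argocd-notifications-cm ConfigMap. The fragment below is a sketch modeled on the triggers in ArgoCD's notification catalog. The template names on the send lines are assumptions and must match templates you define separately under template.<name> keys.

```yaml
# Fragment of argocd-notifications-cm: one trigger per event.
data:
  trigger.on-sync-succeeded: |
    - when: app.status.operationState.phase in ['Succeeded']
      send: [app-sync-succeeded]
  trigger.on-sync-failed: |
    - when: app.status.operationState.phase in ['Error', 'Failed']
      send: [app-sync-failed]
  trigger.on-sync-running: |
    - when: app.status.operationState.phase in ['Running']
      send: [app-sync-running]
  trigger.on-health-degraded: |
    - when: app.status.health.status == 'Degraded'
      send: [app-health-degraded]
```

Each when expression is evaluated against the Application object, so these triggers react to Application-level sync and health state.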
Slack is the most common notification destination. ArgoCD integrates with Slack using the argocd-notifications application.
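ArgoCD releases before 2.3 don't bundle notifications; from 2.3 onward, the notifications controller ships with the standard install manifests. If your installation doesn't have it, install the controller: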
Output: Notification controller running
Before configuring ArgoCD, create an Incoming Webhook in your Slack workspace:
Example webhook URL format:
Create the argocd-notifications-cm ConfigMap in the namespace where the notifications controller runs (argocd in a default install):
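One way to wire up an Incoming Webhook is through ArgoCD's generic webhook service, sketched below. The service name slack_webhook, the secret key slack-webhook-url, and the argocd namespace are assumptions chosen for this example. The $slack-webhook-url reference is resolved from the argocd-notifications-secret Secret.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-notifications-cm
  namespace: argocd   # assumption: adjust to where your controller runs
data:
  # Webhook service named "slack_webhook"; the URL comes from the
  # "slack-webhook-url" key of argocd-notifications-secret.
  service.webhook.slack_webhook: |
    url: $slack-webhook-url
    headers:
      - name: Content-Type
        value: application/json
  # Message body posted to Slack when the trigger fires.
  template.app-sync-failed: |
    webhook:
      slack_webhook:
        method: POST
        body: |
          {"text": "Sync failed for {{.app.metadata.name}}"}
  trigger.on-sync-failed: |
    - when: app.status.operationState.phase in ['Error', 'Failed']
      send: [app-sync-failed]
```

ArgoCD also offers a dedicated Slack service that authenticates with a bot token instead of a webhook URL; the webhook form is used here because it matches the Incoming Webhook created earlier.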
Output: ConfigMap created and applied
The Slack webhook URL must be in a Secret:
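A minimal sketch of that Secret is shown below. The key name slack-webhook-url is an assumption and must match whatever $-reference your notification ConfigMap uses. The URL value is a placeholder to replace with your own.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: argocd-notifications-secret
  namespace: argocd   # assumption: same namespace as the controller
type: Opaque
stringData:
  # Placeholder: paste the Incoming Webhook URL from your Slack workspace.
  slack-webhook-url: "<your-incoming-webhook-url>"
```

Using stringData lets you supply the value in plain text; Kubernetes stores it base64-encoded under data.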
Output: Secret created
Add annotations to your Application to subscribe to notifications:
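ArgoCD expresses notification subscriptions as metadata annotations on the Application. The sketch below assumes a trigger named on-sync-failed and a webhook service named slack_webhook are already defined in argocd-notifications-cm. The Application name is hypothetical and the spec is omitted for brevity.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app            # hypothetical
  namespace: argocd
  annotations:
    # Format: notifications.argoproj.io/subscribe.<trigger>.<service>: <recipients>
    notifications.argoproj.io/subscribe.on-sync-failed.slack_webhook: ""
# spec omitted: keep your Application's existing source and destination
```

Webhook services take an empty recipient value; with the token-based Slack service, the value would be a channel name instead.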
Output: Application receives notifications
Slack is common, but some teams use other systems: PagerDuty, Opsgenie, Datadog, or internal endpoints. Webhooks let you send HTTP POST requests to any endpoint.
In argocd-notifications-cm, add a webhook service and a template:
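A sketch of what those two entries might contain follows. The service name alerts, the endpoint URL, and the JSON payload shape are all hypothetical; shape the body to whatever your receiving system expects.

```yaml
# Fragment of argocd-notifications-cm.
data:
  # Webhook service named "alerts" (hypothetical endpoint).
  service.webhook.alerts: |
    url: https://alerts.example.com/argocd
    headers:
      - name: Content-Type
        value: application/json
  # Template that posts the Application name and health as JSON.
  template.app-health-degraded: |
    webhook:
      alerts:
        method: POST
        body: |
          {
            "app": "{{.app.metadata.name}}",
            "health": "{{.app.status.health.status}}"
          }
```

The key under webhook in the template must match the service name, here alerts, so ArgoCD knows which endpoint the template targets.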
Output: Webhook request sent to custom endpoint
Add a webhook subscription to your Application:
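The subscription uses the same annotation format as any other service. This sketch assumes a trigger named on-health-degraded and a webhook service named alerts exist in argocd-notifications-cm; both names are illustrative.

```yaml
# Fragment of an Application manifest: metadata only.
metadata:
  annotations:
    # <trigger> = on-health-degraded, <service> = alerts (both assumed)
    notifications.argoproj.io/subscribe.on-health-degraded.alerts: ""
```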
Output: Notification sent to webhook endpoint
Let's deploy a real example: an agent application with health checks and Slack notifications.
Output: Custom health check registered
Output: Notification controller ready
Output: Secret stored
Output: Notifications configured
Output: Application created with notification subscriptions
Output: Application healthy and notifications active
Now you understand how ArgoCD monitors application health and sends notifications when deployment events occur. In Chapter 17, you'll collaborate with Claude to design sophisticated notification systems and troubleshoot health issues.
For now, explore your understanding with these prompts:
Setup: If you have an ArgoCD Application deployed from Chapter 8, you can test these scenarios:
Prompt 1: Diagnose Current Health
Ask Claude to analyze your application's health status:
Prompt 2: Design Custom Health Check
If you use a custom resource, ask Claude to generate a health check:
Prompt 3: Configure Notifications
Ask Claude to build your notification ConfigMap:
You built a gitops-deployment skill in Chapter 0. Test and improve it based on what you learned.
Ask yourself:
If you found gaps: