This guide covers common troubleshooting scenarios and tools for diagnosing and resolving Argo CD issues.

Troubleshooting Tools

Argo CD provides argocd admin subcommands to validate settings and troubleshoot connectivity issues.

Settings Validation

Validate Argo CD configuration before applying to production:
argocd admin settings validate
This command performs basic validation of:
  • ConfigMap settings (argocd-cm)
  • RBAC policies (argocd-rbac-cm)
  • Resource customizations
  • Repository credentials

Common Issues

Application Sync Failures

Symptoms: Application shows OutOfSync but the sync operation fails or doesn’t start.

Diagnosis:
# Check application status
argocd app get <app-name>

# View detailed sync status
kubectl get application <app-name> -n argocd -o yaml

# Check application controller logs
kubectl logs -n argocd -l app.kubernetes.io/name=argocd-application-controller
Common causes:
  • Invalid manifests in Git repository
  • Resource quota exceeded in target cluster
  • RBAC permissions preventing resource creation
  • Cluster connectivity issues
Solutions:
# Validate manifests locally
kubectl apply --dry-run=client -f manifest.yaml

# Check resource quotas
kubectl describe resourcequota -n <namespace>

# Test cluster connectivity
argocd cluster get <cluster-url>

# Force refresh and sync
argocd app sync <app-name> --force
Symptoms: Application reconciliation fails with Context deadline exceeded.

Root cause: Manifest generation is taking too long, exceeding the controller timeout.

Solutions:
1. Increase the repo server timeout:

containers:
- name: argocd-application-controller
  command:
  - argocd-application-controller
  - --repo-server-timeout-seconds=300  # Increase from default 60s
2. Scale the repo server:

kubectl scale deployment argocd-repo-server -n argocd --replicas=3
3. Optimize the repository:

  • Use shallow clones for large repositories
  • Enable manifest path annotations for monorepos
  • Reduce parallelism limit if resource-constrained
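For the last point, the repo server exposes a `--parallelismlimit` flag that caps concurrent manifest generations. A sketch of the Deployment patch, with an illustrative limit you should tune to your hardware:

```yaml
# Patch for the argocd-repo-server Deployment (limit value is illustrative)
containers:
- name: argocd-repo-server
  command:
  - argocd-repo-server
  - --parallelismlimit=10  # cap concurrent manifest generations
```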
Symptoms: Sync fails with “insufficient permissions” or RBAC errors.

Diagnosis:
# Check AppProject permissions
kubectl get appproject <project-name> -n argocd -o yaml

# Verify cluster RBAC
kubectl auth can-i create deployment -n <namespace> \
  --as=system:serviceaccount:argocd:argocd-application-controller
Solution: Update AppProject to allow resources:
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: my-project
spec:
  clusterResourceWhitelist:
  - group: '*'
    kind: '*'
  destinations:
  - namespace: '*'
    server: '*'

Git Repository Issues

Symptoms: Applications can’t connect to Git repositories.

Diagnosis:
# Test repository connection
argocd repo get <repo-url>

# Check repo server logs
kubectl logs -n argocd -l app.kubernetes.io/name=argocd-repo-server
Common causes:
  • Invalid credentials
  • Network connectivity issues
  • SSH key not configured
  • Certificate validation failures
Solutions:

For HTTPS repositories:
# Update credentials
argocd repo add https://github.com/org/repo \
  --username <username> \
  --password <token>

# Skip TLS verification (not recommended for production)
argocd repo add https://github.com/org/repo \
  --insecure-skip-server-verification
For SSH repositories:
# Add SSH private key
argocd repo add [email protected]:org/repo.git \
  --ssh-private-key-path ~/.ssh/id_rsa

# Add SSH known hosts
kubectl edit configmap argocd-ssh-known-hosts-cm -n argocd
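If you prefer to manage known-hosts entries declaratively rather than editing the ConfigMap in place, the resource looks roughly like this (the host key below is a placeholder, not a real key — populate it from `ssh-keyscan`):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-ssh-known-hosts-cm
  namespace: argocd
data:
  ssh_known_hosts: |
    # Paste the output of `ssh-keyscan github.com` here (placeholder below)
    github.com ssh-ed25519 AAAA...placeholder...
```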
Symptoms: Intermittent failures resolving Git references (branches, tags).

Solution: Increase the retry count for Git operations:
containers:
- name: argocd-repo-server
  env:
  - name: ARGOCD_GIT_ATTEMPTS_COUNT
    value: "5"  # Retry failed Git operations

Cluster Connectivity Issues

Symptoms: Managed cluster shows as “Unreachable” or “Unknown” in the UI.

Diagnosis:
1. Exec into the application controller pod:

kubectl exec -n argocd -it \
  $(kubectl get pods -n argocd \
    -l app.kubernetes.io/name=argocd-application-controller \
    -o jsonpath='{.items[0].metadata.name}') -- bash
2. Export the kubeconfig from the cluster secret:

argocd admin cluster kubeconfig https://<api-server-url> \
  /tmp/kubeconfig --namespace argocd
3. Test connectivity:

export KUBECONFIG=/tmp/kubeconfig
kubectl get pods -v 9
Common issues:
  • Expired certificates
  • Invalid bearer tokens
  • Network policies blocking traffic
  • API server URL changed
Solution: Update cluster credentials:
argocd cluster add <context-name> --name <cluster-name> --upsert

Resource Customization Issues

Custom health checks can be tested before applying to production:
argocd admin settings resource-overrides health \
  ./deployment.yaml \
  --argocd-cm-path ./argocd-cm.yaml
Example health check (Lua):
resource.customizations: |
  argoproj.io/Rollout:
    health.lua: |
      hs = {}
      if obj.status ~= nil then
        if obj.status.phase == "Healthy" then
          hs.status = "Healthy"
          hs.message = "Rollout is healthy"
          return hs
        end
      end
      hs.status = "Progressing"
      hs.message = "Waiting for rollout to complete"
      return hs
Test ignore differences configurations:
argocd admin settings resource-overrides ignore-differences \
  ./deployment.yaml \
  --argocd-cm-path ./argocd-cm.yaml
Shows which fields will be ignored during diff operations.
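For reference, an ignore-differences override in argocd-cm looks roughly like this; ignoring the replica count of Deployments (e.g. when an HPA manages it) is shown as an illustrative example, not a recommendation for every setup:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
data:
  # Ignore a field that an HPA or manual scaling may change at runtime
  resource.customizations.ignoreDifferences.apps_Deployment: |
    jsonPointers:
    - /spec/replicas
```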
Execute custom resource actions:
# List available actions
argocd admin settings resource-overrides list-actions \
  /tmp/deployment.yaml \
  --argocd-cm-path /tmp/argocd-cm.yaml

# Run action
argocd admin settings resource-overrides run-action \
  /tmp/deployment.yaml restart \
  --argocd-cm-path /tmp/argocd-cm.yaml

Performance Issues

Symptoms: Applications take a long time to reconcile and sync.

Diagnosis:
# Check reconciliation duration (Prometheus query)
histogram_quantile(0.95, 
  rate(argocd_app_reconcile_bucket[5m])
)

# Check for high K8s API requests
rate(argocd_app_k8s_request_total[5m])
Solutions:
1. Increase controller processors:

containers:
- name: argocd-application-controller
  command:
  - argocd-application-controller
  - --status-processors=50
  - --operation-processors=25
2. Enable controller sharding:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: argocd-application-controller
spec:
  replicas: 2
  template:
    spec:
      containers:
      - name: argocd-application-controller
        env:
        - name: ARGOCD_CONTROLLER_REPLICAS
          value: "2"
3. Optimize monorepo performance.

Use manifest path annotations:
metadata:
  annotations:
    argocd.argoproj.io/manifest-generate-paths: .
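The annotation also accepts multiple semicolon-separated paths, with relative paths resolved against the application's spec.source.path (the second directory name here is hypothetical):

```yaml
metadata:
  annotations:
    argocd.argoproj.io/manifest-generate-paths: .;../shared-manifests
```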
Symptoms: Argo CD components are OOMKilled or use excessive memory.

Common causes:
  • Too many cached resources
  • Large repositories
  • Too many applications per controller
Solutions:

For the repo server:
spec:
  template:
    spec:
      containers:
      - name: argocd-repo-server
        resources:
          requests:
            memory: 1Gi
          limits:
            memory: 2Gi
        env:
        - name: ARGOCD_EXEC_TIMEOUT
          value: "180s"
For application controller:
env:
- name: ARGOCD_CONTROLLER_REPLICAS
  value: "3"  # Shard applications across replicas
Mount persistent volume for repo server:
volumeMounts:
- mountPath: /tmp
  name: tmp-dir
volumes:
- name: tmp-dir
  persistentVolumeClaim:
    claimName: argocd-repo-server-pvc
Symptoms: High argocd_repo_pending_request_total metric.

Cause: Multiple applications in the same repository are processed sequentially.

Solutions:
  • Enable concurrent processing (create .argocd-allow-concurrency file)
  • Scale repo server horizontally
  • Split applications into separate repositories
  • Use manifest path annotations
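For the first point, the marker is just an empty file at the repository root; committing and pushing it follows the usual Git workflow:

```shell
# Create the concurrency marker at the root of the manifest repository.
# The repo server allows concurrent manifest generation for a repository
# only when this file is present.
touch .argocd-allow-concurrency

# Then commit and push it, e.g.:
#   git add .argocd-allow-concurrency
#   git commit -m "allow concurrent manifest generation"
```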

Application Health Issues

Cause: No health check is defined for the resource type.

Solution: Add a custom health check:
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
data:
  resource.customizations: |
    my.custom.resource/MyKind:
      health.lua: |
        hs = {}
        if obj.status ~= nil and obj.status.ready then
          hs.status = "Healthy"
        else
          hs.status = "Progressing"
        end
        return hs
Diagnosis:
# Check resource status
argocd app get <app-name> --show-operation

# Check individual resource health
kubectl get <resource> -n <namespace>
kubectl describe <resource> <name> -n <namespace>
Common causes:
  • Pods stuck in ImagePullBackOff
  • Insufficient resources (CPU/memory)
  • Failing health checks
  • Init containers not completing

Debugging Commands

Log Collection

# Application controller logs
kubectl logs -n argocd -l app.kubernetes.io/name=argocd-application-controller --tail=100

# Repo server logs
kubectl logs -n argocd -l app.kubernetes.io/name=argocd-repo-server --tail=100

# API server logs
kubectl logs -n argocd -l app.kubernetes.io/name=argocd-server --tail=100

# Follow logs in real-time
kubectl logs -n argocd -l app.kubernetes.io/name=argocd-application-controller -f

Resource Inspection

# Get application details
argocd app get <app-name>

# Get application as YAML
kubectl get application <app-name> -n argocd -o yaml

# Get application history
argocd app history <app-name>

# Get application events
kubectl get events -n argocd --field-selector involvedObject.name=<app-name>

# Get all applications
argocd app list

# Get application resources
argocd app resources <app-name>

Configuration Verification

# Check ConfigMaps
kubectl get configmap -n argocd argocd-cm -o yaml
kubectl get configmap -n argocd argocd-rbac-cm -o yaml
kubectl get configmap -n argocd argocd-cmd-params-cm -o yaml

# Check secrets
kubectl get secret -n argocd argocd-secret -o yaml

# Validate settings
argocd admin settings validate

Getting Help

GitHub Issues

Search existing issues or create new ones: argoproj/argo-cd

Slack Community

Join the Argo CD community: CNCF Slack #argo-cd

Documentation

Official Argo CD docs: argo-cd.readthedocs.io

Stack Overflow

Ask questions with the argocd tag: stackoverflow.com
