This guide covers common troubleshooting scenarios and tools for diagnosing and resolving Argo CD issues.

Troubleshooting Tools

Argo CD provides argocd admin subcommands to validate settings and troubleshoot connectivity issues.

Settings Validation

Validate Argo CD configuration before applying to production:
argocd admin settings validate
This command performs basic validation of:
  • ConfigMap settings (argocd-cm)
  • RBAC policies (argocd-rbac-cm)
  • Resource customizations
  • Repository credentials

Common Issues

Application Sync Failures

Symptoms: Application shows OutOfSync but the sync operation fails or doesn’t start.

Diagnosis:
# Check application status
argocd app get <app-name>

# View detailed sync status
kubectl get application <app-name> -n argocd -o yaml

# Check application controller logs
kubectl logs -n argocd -l app.kubernetes.io/name=argocd-application-controller
Common causes:
  • Invalid manifests in Git repository
  • Resource quota exceeded in target cluster
  • RBAC permissions preventing resource creation
  • Cluster connectivity issues
Solutions:
# Validate manifests locally
kubectl apply --dry-run=client -f manifest.yaml

# Check resource quotas
kubectl describe resourcequota -n <namespace>

# Test cluster connectivity
argocd cluster get <cluster-url>

# Force refresh and sync
argocd app sync <app-name> --force
Symptoms: Application reconciliation fails with Context deadline exceeded.

Root cause: Manifest generation is taking too long, exceeding the controller timeout.

Solutions:
1. Increase the repo server timeout:

containers:
- name: argocd-application-controller
  command:
  - argocd-application-controller
  - --repo-server-timeout-seconds=300  # Increase from default 60s
2. Scale the repo server:

kubectl scale deployment argocd-repo-server -n argocd --replicas=3
3. Optimize the repository:

  • Use shallow clones for large repositories
  • Enable manifest path annotations for monorepos
  • Reduce parallelism limit if resource-constrained
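For the last point, the repo server exposes a `--parallelismlimit` flag that caps concurrent manifest generations. A sketch of the Deployment patch, with an illustrative limit you should tune to your hardware:

```yaml
# Patch for the argocd-repo-server Deployment (limit value is illustrative)
containers:
- name: argocd-repo-server
  command:
  - argocd-repo-server
  - --parallelismlimit=10  # cap concurrent manifest generations
```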
Symptoms: Sync fails with “insufficient permissions” or RBAC errors.

Diagnosis:
# Check AppProject permissions
kubectl get appproject <project-name> -n argocd -o yaml

# Verify cluster RBAC
kubectl auth can-i create deployment -n <namespace> \
  --as=system:serviceaccount:argocd:argocd-application-controller
Solution: Update AppProject to allow resources:
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: my-project
spec:
  clusterResourceWhitelist:
  - group: '*'
    kind: '*'
  destinations:
  - namespace: '*'
    server: '*'

Git Repository Issues

Symptoms: Applications can’t connect to Git repositories.

Diagnosis:
# Test repository connection
argocd repo get <repo-url>

# Check repo server logs
kubectl logs -n argocd -l app.kubernetes.io/name=argocd-repo-server
Common causes:
  • Invalid credentials
  • Network connectivity issues
  • SSH key not configured
  • Certificate validation failures
Solutions:

For HTTPS repositories:
# Update credentials
argocd repo add https://github.com/org/repo \
  --username <username> \
  --password <token>

# Skip TLS verification (not recommended for production)
argocd repo add https://github.com/org/repo \
  --insecure-skip-server-verification
For SSH repositories:
# Add SSH private key
argocd repo add [email protected]:org/repo.git \
  --ssh-private-key-path ~/.ssh/id_rsa

# Add SSH known hosts
kubectl edit configmap argocd-ssh-known-hosts-cm -n argocd
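If you prefer to manage known-hosts entries declaratively rather than editing the ConfigMap in place, the resource looks roughly like this (the host key below is a placeholder, not a real key — populate it from `ssh-keyscan`):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-ssh-known-hosts-cm
  namespace: argocd
data:
  ssh_known_hosts: |
    # Paste the output of `ssh-keyscan github.com` here (placeholder below)
    github.com ssh-ed25519 AAAA...placeholder...
```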
Symptoms: Intermittent failures resolving Git references (branches, tags).

Solution: Increase the retry count for Git operations:
containers:
- name: argocd-repo-server
  env:
  - name: ARGOCD_GIT_ATTEMPTS_COUNT
    value: "5"  # Retry failed Git operations

Cluster Connectivity Issues

Symptoms: Managed cluster shows as “Unreachable” or “Unknown” in the UI.

Diagnosis:
1. Exec into the application controller pod:

kubectl exec -n argocd -it \
  $(kubectl get pods -n argocd \
    -l app.kubernetes.io/name=argocd-application-controller \
    -o jsonpath='{.items[0].metadata.name}') -- bash
2. Export the kubeconfig from the cluster secret:

argocd admin cluster kubeconfig https://<api-server-url> \
  /tmp/kubeconfig --namespace argocd
3. Test connectivity:

export KUBECONFIG=/tmp/kubeconfig
kubectl get pods -v 9
Common issues:
  • Expired certificates
  • Invalid bearer tokens
  • Network policies blocking traffic
  • API server URL changed
Solution: Update cluster credentials:
argocd cluster add <context-name> --name <cluster-name> --upsert

Resource Customization Issues

Custom health checks can be tested before applying to production:
argocd admin settings resource-overrides health \
  ./deployment.yaml \
  --argocd-cm-path ./argocd-cm.yaml
Example health check (Lua):
resource.customizations: |
  argoproj.io/Rollout:
    health.lua: |
      hs = {}
      if obj.status ~= nil then
        if obj.status.phase == "Healthy" then
          hs.status = "Healthy"
          hs.message = "Rollout is healthy"
          return hs
        end
      end
      hs.status = "Progressing"
      hs.message = "Waiting for rollout to complete"
      return hs
Test ignore differences configurations:
argocd admin settings resource-overrides ignore-differences \
  ./deployment.yaml \
  --argocd-cm-path ./argocd-cm.yaml
Shows which fields will be ignored during diff operations.
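For reference, an ignore-differences override in argocd-cm looks roughly like this; ignoring the replica count of Deployments (e.g. when an HPA manages it) is shown as an illustrative example, not a recommendation for every setup:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
data:
  # Ignore a field that an HPA or manual scaling may change at runtime
  resource.customizations.ignoreDifferences.apps_Deployment: |
    jsonPointers:
    - /spec/replicas
```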
Execute custom resource actions:
# List available actions
argocd admin settings resource-overrides list-actions \
  /tmp/deployment.yaml \
  --argocd-cm-path /tmp/argocd-cm.yaml

# Run action
argocd admin settings resource-overrides run-action \
  /tmp/deployment.yaml restart \
  --argocd-cm-path /tmp/argocd-cm.yaml

Performance Issues

Symptoms: Applications take a long time to reconcile and sync.

Diagnosis:
# Check reconciliation duration (Prometheus query)
histogram_quantile(0.95, 
  rate(argocd_app_reconcile_bucket[5m])
)

# Check for high K8s API requests
rate(argocd_app_k8s_request_total[5m])
Solutions:
1. Increase controller processors:

containers:
- name: argocd-application-controller
  command:
  - argocd-application-controller
  - --status-processors=50
  - --operation-processors=25
2. Enable controller sharding:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: argocd-application-controller
spec:
  replicas: 2
  template:
    spec:
      containers:
      - name: argocd-application-controller
        env:
        - name: ARGOCD_CONTROLLER_REPLICAS
          value: "2"
3. Optimize monorepo performance.

Use manifest path annotations:
metadata:
  annotations:
    argocd.argoproj.io/manifest-generate-paths: .
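The annotation also accepts multiple semicolon-separated paths, with relative paths resolved against the application's spec.source.path (the second directory name here is hypothetical):

```yaml
metadata:
  annotations:
    argocd.argoproj.io/manifest-generate-paths: .;../shared-manifests
```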
Symptoms: Argo CD components are OOMKilled or use excessive memory.

Common causes:
  • Too many cached resources
  • Large repositories
  • Too many applications per controller
Solutions:

For the repo server:
spec:
  template:
    spec:
      containers:
      - name: argocd-repo-server
        resources:
          requests:
            memory: 1Gi
          limits:
            memory: 2Gi
        env:
        - name: ARGOCD_EXEC_TIMEOUT
          value: "180s"
For application controller:
env:
- name: ARGOCD_CONTROLLER_REPLICAS
  value: "3"  # Shard applications across replicas
Mount persistent volume for repo server:
volumeMounts:
- mountPath: /tmp
  name: tmp-dir
volumes:
- name: tmp-dir
  persistentVolumeClaim:
    claimName: argocd-repo-server-pvc
Symptoms: High argocd_repo_pending_request_total metric.

Cause: Multiple applications in the same repository are processed sequentially.

Solutions:
  • Enable concurrent processing (create .argocd-allow-concurrency file)
  • Scale repo server horizontally
  • Split applications into separate repositories
  • Use manifest path annotations
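For the first point, the marker is just an empty file at the repository root; committing and pushing it follows the usual Git workflow:

```shell
# Create the concurrency marker at the root of the manifest repository.
# The repo server allows concurrent manifest generation for a repository
# only when this file is present.
touch .argocd-allow-concurrency

# Then commit and push it, e.g.:
#   git add .argocd-allow-concurrency
#   git commit -m "allow concurrent manifest generation"
```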

Application Health Issues

Cause: No health check is defined for the resource type.

Solution: Add a custom health check:
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
data:
  resource.customizations: |
    my.custom.resource/MyKind:
      health.lua: |
        hs = {}
        if obj.status ~= nil and obj.status.ready then
          hs.status = "Healthy"
        else
          hs.status = "Progressing"
        end
        return hs
Diagnosis:
# Check resource status
argocd app get <app-name> --show-operation

# Check individual resource health
kubectl get <resource> -n <namespace>
kubectl describe <resource> <name> -n <namespace>
Common causes:
  • Pods stuck in ImagePullBackOff
  • Insufficient resources (CPU/memory)
  • Failing health checks
  • Init containers not completing

Debugging Commands

Log Collection

# Application controller logs
kubectl logs -n argocd -l app.kubernetes.io/name=argocd-application-controller --tail=100

# Repo server logs
kubectl logs -n argocd -l app.kubernetes.io/name=argocd-repo-server --tail=100

# API server logs
kubectl logs -n argocd -l app.kubernetes.io/name=argocd-server --tail=100

# Follow logs in real-time
kubectl logs -n argocd -l app.kubernetes.io/name=argocd-application-controller -f

Resource Inspection

# Get application details
argocd app get <app-name>

# Get application as YAML
kubectl get application <app-name> -n argocd -o yaml

# Get application history
argocd app history <app-name>

# Get application events
kubectl get events -n argocd --field-selector involvedObject.name=<app-name>

# Get all applications
argocd app list

# Get application resources
argocd app resources <app-name>

Configuration Verification

# Check ConfigMaps
kubectl get configmap -n argocd argocd-cm -o yaml
kubectl get configmap -n argocd argocd-rbac-cm -o yaml
kubectl get configmap -n argocd argocd-cmd-params-cm -o yaml

# Check secrets
kubectl get secret -n argocd argocd-secret -o yaml

# Validate settings
argocd admin settings validate

Getting Help

GitHub Issues

Search existing issues or create new ones: argoproj/argo-cd

Slack Community

Join the Argo CD community: CNCF Slack #argo-cd

Documentation

Official Argo CD docs: argo-cd.readthedocs.io

Stack Overflow

Ask questions with the argocd tag: stackoverflow.com
