Kubernetes Core Concepts
What is a Headless Service in Kubernetes, and when do you use it?
- Cassandra
- MongoDB
- Kafka
With a StatefulSet, a headless Service gives each Pod its own stable DNS name (pod-0, pod-1, etc.).
How can a Pod talk to a Service in another namespace?
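For example, a Pod in the dev namespace reaches a Service in the prod namespace through the Service's fully qualified DNS name, <service>.prod.svc.cluster.local (assuming the default cluster.local cluster domain).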
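Cross-namespace access comes down to building the right DNS name. A minimal sketch, assuming the default cluster domain cluster.local and a hypothetical Service called api:

```python
def service_fqdn(service: str, namespace: str, cluster_domain: str = "cluster.local") -> str:
    """Build the DNS name a Pod in any namespace can use to reach a Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


# Hypothetical Service "api" in the prod namespace, called from a Pod in dev.
print(service_fqdn("api", "prod"))
```

Within the same namespace the short name (api) is enough; the namespace suffix is only needed when crossing namespaces.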
What is a Deployment in Kubernetes?
- Declarative updates for Pods and ReplicaSets
- Rolling updates with zero downtime
- Easy rollback to previous versions
- Scaling up/down replicas
- Self-healing (recreates failed Pods)
Explain a Kubernetes Pod to a 5-year-old
- A Pod is the smallest deployable unit in Kubernetes
- It can contain one or more containers
- Containers in a Pod share network and storage
- They’re scheduled together on the same node
What is a StatefulSet?
- Databases (MySQL, PostgreSQL, MongoDB)
- Distributed systems (Kafka, Cassandra, Elasticsearch)
- Applications requiring stable hostnames
- Ordered, graceful deployment and scaling
- Stable, persistent storage
- Stable network identifiers (pod-0, pod-1, pod-2)
- Ordered, automated rolling updates
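The stable network identifiers follow a predictable pattern: Pods are named <statefulset>-<ordinal>, and with a headless Service each one gets its own DNS record. A small sketch of that naming, where the StatefulSet, Service, and namespace names are hypothetical and the cluster domain is assumed to be cluster.local:

```python
def statefulset_pod_dns(name, service, namespace, replicas, cluster_domain="cluster.local"):
    """List the per-Pod DNS names for a StatefulSet behind a headless Service."""
    return [
        f"{name}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]


# Hypothetical 3-replica StatefulSet "kafka" governed by headless Service "kafka-headless".
for host in statefulset_pod_dns("kafka", "kafka-headless", "data", 3):
    print(host)
```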
What is a DaemonSet?
- Monitoring agents (Prometheus Node Exporter, Datadog)
- Log collectors (Fluentd, Filebeat)
- Storage daemons (Ceph, GlusterFS)
- Network proxies (kube-proxy)
What are Kubernetes Operators?
- Deployment
- Upgrades
- Healing
- Backup/restore
- Prometheus Operator
- MySQL Operator
- Kafka Operator
What are Admission Controllers?
- Validating → Validate requests (allow/deny)
- Mutating → Modify requests (inject sidecars, set defaults)
- Enforce security policies
- Inject sidecars automatically
- Deny privileged pods
- Resource quotas
- Limit ranges
- Image scanning enforcement
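To make the mutating/validating split concrete, here is a toy model of the two phases operating on a Pod represented as a plain dict. It is not the real admission webhook API: the function names, the log-agent sidecar, and its image are all hypothetical. It only illustrates that mutation runs first and validation then allows or denies the result.

```python
import copy


def mutate(pod):
    """Mutating phase: return a copy of the Pod with a (hypothetical) sidecar injected."""
    patched = copy.deepcopy(pod)
    containers = patched["spec"]["containers"]
    if not any(c["name"] == "log-agent" for c in containers):
        containers.append({"name": "log-agent", "image": "example.com/log-agent:1.0"})
    return patched


def validate(pod):
    """Validating phase: deny Pods that contain a privileged container."""
    for c in pod["spec"]["containers"]:
        if c.get("securityContext", {}).get("privileged", False):
            return False, f"container {c['name']} is privileged"
    return True, "ok"


def admit(pod):
    """Run mutation first, then validate the mutated object."""
    patched = mutate(pod)
    allowed, reason = validate(patched)
    return allowed, reason, patched


pod = {"spec": {"containers": [{"name": "app", "image": "nginx"}]}}
print(admit(pod)[:2])
```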
Services & Networking
What is a Service in Kubernetes?
- ClusterIP (default) - Internal cluster access only
- NodePort - Exposes service on each node’s IP at a static port
- LoadBalancer - Creates external load balancer (cloud providers)
- ExternalName - Maps service to external DNS name
LoadBalancer vs Ingress Controller
- LoadBalancer Service
  - Layer 4 traffic distribution (TCP/UDP)
  - Allocates a cloud load balancer (AWS ELB, GCP LB)
  - One LB per service → expensive
  - Simple setup, but costly at scale
- Ingress Controller
  - Layer 7 routing (HTTP/HTTPS) by host and path
  - One entry point can route to many Services
Why do we need an Ingress Controller?
- NGINX
- Traefik
- HAProxy
- AWS ALB
- Istio Gateway
What are network policies in Kubernetes?
Troubleshooting
Pod stuck in CrashLoopBackOff — how do you troubleshoot?
- Incorrect image or tag
- Missing ConfigMap/Secret
- Application runtime error
- Failing liveness probe
- Insufficient resources
- Wrong command/args in Pod spec
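Whatever the cause, the BackOff part of the status means the kubelet waits longer before each restart: the documented delays are 10s, 20s, 40s and so on, capped at five minutes. A sketch of that schedule, assuming the first restart is index 0 (the indexing is an illustration, not kubelet code):

```python
def restart_backoff_seconds(restart_index, base=10, cap=300):
    """Delay before the given restart: doubles each time, capped at `cap` seconds."""
    return min(base * 2 ** restart_index, cap)


print([restart_backoff_seconds(n) for n in range(7)])
```

This is why a crashing Pod can look idle for minutes between attempts; check the previous container's logs rather than waiting for the next restart.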
ConfigMap changes not showing in Pods — why?
- If mounted as a volume → updates auto-refresh (with small delay), except for subPath mounts, which do not receive updates
- If used as environment variables → Pod restart required
Can you create a Pod without a Deployment?
- Scaling capabilities
- Self-healing (automatic restart)
- Rolling updates
- Rollback functionality
- Declarative management
Scheduling & Affinity
What is Node Affinity and when to use it?
- requiredDuringSchedulingIgnoredDuringExecution → hard rule (must match)
- preferredDuringSchedulingIgnoredDuringExecution → soft rule (prefer but not required)
- Schedule GPU workloads on GPU nodes
- Cost optimization using spot vs on-demand nodes
- Separate production and development workloads
- Place workloads in specific availability zones
Node Selector vs Node Affinity
| Feature | Node Selector | Node Affinity |
|---|---|---|
| Complexity | Simple | Advanced |
| Operators | Exact match only | In, NotIn, Exists, DoesNotExist, Gt, Lt |
| Soft Rules | No | Yes (preferred) |
| Flexibility | Low | High |
| Multiple conditions | AND of exact matches only | Yes (AND within a term, OR across terms) |
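The operator difference is easiest to see by evaluating expressions against a node's labels. The sketch below models only In, NotIn, Exists and DoesNotExist, with all expressions in one term ANDed together. The function names and labels are hypothetical; this is an illustration of the matching rules, not scheduler code.

```python
def matches_expression(labels, expr):
    """Evaluate one matchExpression-style dict against a node's labels."""
    key, op, values = expr["key"], expr["operator"], expr.get("values", [])
    if op == "In":
        return labels.get(key) in values
    if op == "NotIn":
        # A node without the label also satisfies NotIn.
        return labels.get(key) not in values
    if op == "Exists":
        return key in labels
    if op == "DoesNotExist":
        return key not in labels
    raise ValueError(f"unsupported operator: {op}")


def node_matches(labels, match_expressions):
    """All expressions in a single term must match (logical AND)."""
    return all(matches_expression(labels, e) for e in match_expressions)


# Hypothetical labels on a GPU node and a rule for GPU workloads on non-spot capacity.
gpu_node = {"accelerator": "nvidia", "capacity-type": "on-demand"}
rule = [
    {"key": "accelerator", "operator": "Exists"},
    {"key": "capacity-type", "operator": "NotIn", "values": ["spot"]},
]
print(node_matches(gpu_node, rule))
```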
When to use Pod Anti-Affinity?
- High Availability: spread replicas across nodes to avoid single-node failure
- Performance Isolation: prevent two heavy workloads from competing on the same node
- Security: keep sensitive workloads separate from less trusted workloads
Example: a required anti-affinity rule can ensure that no two Pods labeled app: myapp run on the same node.
Deployment Strategies
Types of Kubernetes Deployments
- Recreate
  - Terminates all old Pods before creating new ones
  - Downtime occurs
  - Simplest strategy
  - Use when downtime is acceptable
- Rolling Update
- Canary
- Blue-Green
How do you handle rollbacks in Kubernetes?
- Kubernetes keeps previous ReplicaSets
- Tools like ArgoCD, Helm, Spinnaker can handle rollbacks
- By default, keeps last 10 revisions (configurable via revisionHistoryLimit)
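Because old ReplicaSets are retained, a rollback is a single kubectl rollout undo call. The sketch below builds that command and models which old revisions survive pruning. The deployment name is hypothetical, and the pruning function is a simplification that looks only at old revision numbers.

```python
def rollback_command(deployment, revision=None, namespace="default"):
    """Build a `kubectl rollout undo` command; omit revision to go back one step."""
    cmd = ["kubectl", "rollout", "undo", f"deployment/{deployment}", "-n", namespace]
    if revision is not None:
        cmd.append(f"--to-revision={revision}")
    return cmd


def retained_revisions(old_revisions, limit=10):
    """Return the old revisions that remain after pruning down to `limit` entries."""
    ordered = sorted(old_revisions)
    return ordered[len(ordered) - limit:] if limit < len(ordered) else ordered


print(" ".join(rollback_command("myapp", revision=3)))
print(retained_revisions(range(1, 13)))
```

A revision that has been pruned can no longer be a rollback target, which is the trade-off when lowering revisionHistoryLimit.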
AWS & Cloud
How would you isolate a network within an AWS VPC?
- Use separate public/private subnets with different route tables
- Configure NACLs (Network Access Control Lists) at subnet level
- Use Security Groups at instance level
- Optionally use AWS Network Firewall for advanced filtering
- Use Transit Gateway for centralized network management
- Implement VPC Peering or PrivateLink for controlled cross-VPC communication
Architectural differences between GCP VPC and AWS VPC
| Feature | AWS VPC | GCP VPC |
|---|---|---|
| Scope | Region-scoped | Global |
| Subnets | AZ-specific | Regional (span zones) |
| Cross-region | Requires VPC Peering | Built-in |
| Default routing | Explicit | Automatic cross-region |
| Peering | Explicit setup required | Simpler setup |
How do you connect two VPCs in AWS?
- VPC Peering
  - Direct network connection between two VPCs
  - Simple 1-to-1 connection
  - Non-transitive (A↔B, B↔C does not mean A↔C)
- Transit Gateway
  - Hub-and-spoke model
  - Connects multiple VPCs and on-premises networks
  - Scalable, centralized management
  - Best for complex network topologies
- PrivateLink
  - Expose services privately
  - No VPC peering required
  - Service-level access, not network-level
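The non-transitive property of VPC Peering is the one that most often surprises people. A tiny model of it, with hypothetical VPC names, where traffic is routable only across a direct peering connection:

```python
def can_route(src, dst, peerings):
    """With VPC Peering, two VPCs can talk only if they are directly peered."""
    return frozenset({src, dst}) in peerings


# A is peered with B, and B with C, but A and C have no peering of their own.
peerings = {frozenset({"A", "B"}), frozenset({"B", "C"})}

print(can_route("A", "B", peerings))
print(can_route("A", "C", peerings))
```

Connecting A to C needs either a third peering or a hub such as Transit Gateway.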
How to set up Kubernetes on AWS using EKS?
- EKS Control Plane
- Worker node groups (EC2 instances or Fargate)
- VPC with public/private subnets
- IAM roles for cluster and nodes
- Security groups
Infrastructure as Code
How do you bring an existing resource into Terraform state?
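Use terraform import with the resource address from your configuration and the provider-specific ID of the existing resource. Terraform 1.5 and later also support declarative import blocks.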
What is Terraform state drift? How can you detect and solve it?
- Apply Terraform changes (bring infrastructure back to desired state)
- Import manual changes (accept manual changes into state)
- Prevention:
  - Use Terraform Cloud/Enterprise Sentinel policies
  - Implement proper RBAC
  - Use cloud provider service control policies
  - Regular terraform plan in CI/CD
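A scheduled plan in CI can detect drift through its exit code: with -detailed-exitcode, terraform plan exits 0 when nothing would change, 1 on error, and 2 when there is a diff. A minimal sketch of a CI check built on that; the function names and status strings are hypothetical, and exit code 2 also covers configuration changes that have not been applied yet, not only manual drift.

```python
import subprocess


def classify_plan_exit_code(code):
    """Map `terraform plan -detailed-exitcode` exit codes to a status string."""
    if code == 0:
        return "in-sync"
    if code == 2:
        return "changes-detected"
    return "error"


def check_drift(workdir):
    """Run a non-interactive plan in `workdir` and classify the result."""
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=workdir,
    )
    return classify_plan_exit_code(result.returncode)


# check_drift("./infra") would need Terraform installed, so only the mapping is shown.
print(classify_plan_exit_code(2))
```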
Backup & Disaster Recovery
How would you back up a cluster? What tools would you use?
- Velero - Most popular K8s backup tool
  - Backs up cluster resources and persistent volumes
  - Supports disaster recovery and cluster migration
- AWS Backup - Centralized backup service
- Database-specific tools (pg_dump, mysqldump)
- Cloud provider snapshots (EBS, RDS automated backups)
- PVC snapshots using CSI drivers
- Volume snapshots via cloud provider APIs
Cluster Management
How would you upgrade a Kubernetes cluster?
- Test in staging first
- Upgrade control plane → then nodes
- Cordon & drain each node before upgrading it (kubectl cordon, then kubectl drain)
- Upgrade one minor version at a time (1.26 → 1.27 → 1.28)
- On managed services such as EKS, the cloud provider performs the control plane upgrade
- Node groups still have to be upgraded separately
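The one-minor-version-at-a-time rule means a jump across several versions becomes a sequence of upgrades. A small helper that expands a current and target version into the intermediate steps (a sketch that assumes plain major.minor strings):

```python
def upgrade_path(current, target):
    """List each minor version to step through when going from current to target."""
    cur_major, cur_minor = (int(part) for part in current.split("."))
    tgt_major, tgt_minor = (int(part) for part in target.split("."))
    if cur_major != tgt_major or tgt_minor < cur_minor:
        raise ValueError("target must be the same major version and not older")
    return [f"{cur_major}.{minor}" for minor in range(cur_minor + 1, tgt_minor + 1)]


print(upgrade_path("1.26", "1.28"))
```

Each step in the list is a full cycle: control plane first, then cordon, drain, and upgrade the nodes.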
Cost Optimization
What practices would you propose to reduce compute costs across the org?
- Horizontal Pod Autoscaler (HPA)
- Cluster Autoscaler
- Vertical Pod Autoscaler (VPA)
- Reserved Instances for predictable workloads (save 30-70%)
- Spot Instances for fault-tolerant workloads (save up to 90%)
- Savings Plans for flexible commitments
- Analyze resource utilization
- Right-size Pods and instances
- Remove resource limits where appropriate
- Use tools like Kubecost, Goldilocks
- AWS Lambda for event-driven workloads
- Fargate for containerized workloads without managing nodes
- Pay only for actual usage
- Delete unused resources (old snapshots, unattached volumes)
- Shut down dev/test environments off-hours
- Use cloud provider cost explorer and recommendations
- Implement tagging strategy for cost allocation
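For the Horizontal Pod Autoscaler listed above, the core scaling rule is documented as desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric). A sketch of that calculation with min/max clamping; it leaves out the tolerance band and stabilization windows the real controller applies, and the default bounds are arbitrary.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Core HPA ratio: scale replicas by current/target metric, then clamp to bounds."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# 4 replicas at 90% average CPU against a 60% target.
print(desired_replicas(4, 90, 60))
# The same 4 replicas at 30% average CPU scale down.
print(desired_replicas(4, 30, 60))
```

Scaling down during quiet periods is where the savings come from, especially when paired with the Cluster Autoscaler removing the emptied nodes.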
Security
What would you do if you accidentally pushed credentials to a remote repo?
- Revoke/rotate keys immediately
  - AWS: Deactivate and delete access keys
  - Generate new credentials
- Clean Git history
- Force push
- Prevention:
  - Add .env, credentials.json to .gitignore
  - Use git-secrets or pre-commit hooks
  - Implement secret scanning in CI/CD (GitHub Advanced Security, GitGuardian)
  - Use secret management (AWS Secrets Manager, HashiCorp Vault)
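A pre-commit hook can catch the most common leak before it is pushed. The sketch below scans text for strings shaped like AWS access key IDs (AKIA followed by 16 uppercase letters or digits). It is a simplified illustration of what tools like git-secrets do, not a replacement for them. The function name is hypothetical, and the sample value is the placeholder key from AWS documentation.

```python
import re

# Long-term AWS access key IDs start with "AKIA" followed by 16 uppercase alphanumerics.
AWS_ACCESS_KEY_ID = re.compile(r"\b(AKIA[0-9A-Z]{16})\b")


def find_aws_key_ids(text):
    """Return every substring that looks like an AWS access key ID."""
    return AWS_ACCESS_KEY_ID.findall(text)


sample = "aws_access_key_id = AKIAIOSFODNN7EXAMPLE"
print(find_aws_key_ids(sample))
```

A hook would run this over the staged diff and exit non-zero on any match, blocking the commit.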
Docker & Containerization
What is Containerization?
- Consistent environments (dev, staging, prod)
- Faster deployments
- Resource efficiency (compared to VMs)
- Isolation and security
- Portability across platforms
- Containers share OS kernel → lighter weight
- VMs include full OS → heavier but more isolated
What is a Dockerfile?
- FROM - Base image
- WORKDIR - Set working directory
- COPY - Copy files from host to image
- RUN - Execute commands during build
- EXPOSE - Document ports (doesn’t publish)
- CMD - Default command when container starts
- ENTRYPOINT - Configures container as executable
What is a Docker Network?
- Other containers
- The host system
- External systems
- bridge (default) - Private network for containers on same host
- host - Container uses host’s network directly
- overlay - Multi-host networking for Swarm
- macvlan - Assign MAC address to container
- none - Disable networking