DockerHub is a cornerstone of containerized development, but its rate-limiting policies often create bottlenecks in workflows. Teams frequently encounter issues pulling container images, especially in CI/CD pipelines, where frequent requests can exceed DockerHub’s pull limits.
In this article, we’ll explore how to set up Harbor as a proxy cache in a Kubernetes environment to address DockerHub rate limits. By using Harbor, a secure, open-source container registry, we created a scalable caching solution to reduce dependency on DockerHub and improve workflow efficiency.
Challenge: DockerHub Rate Limiting
At the time of writing, DockerHub enforced the following pull rate limits:
- Anonymous users: 100 pulls per 6 hours per IP address
- Authenticated users: 200 pulls per 6 hours per user
These limits are restrictive in Kubernetes environments where:
- Multiple pods often pull the same image in parallel
- CI/CD pipelines involve frequent pulls of updated images
- Teams share a common IP address, exacerbating the rate limit problem
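To see how quickly a shared IP can run into the anonymous limit, here is a rough back-of-the-envelope sketch. The node count and images-per-node figures are illustrative assumptions, not measurements from our cluster, and the function names are made up for this example.

```shell
#!/bin/sh
# Rough estimate of DockerHub pulls triggered by a single rollout.
# All numbers below are illustrative assumptions.

# estimate_pulls NODES IMAGES_PER_NODE -> prints the total number of pulls
estimate_pulls() {
    echo $(( $1 * $2 ))
}

# exceeds_limit PULLS LIMIT -> exit status 0 when PULLS is above LIMIT
exceeds_limit() {
    [ "$1" -gt "$2" ]
}

ANON_LIMIT=100                 # anonymous pulls per 6 hours per IP (figure quoted above)
pulls=$(estimate_pulls 30 4)   # assumed: 30 nodes behind one IP, 4 uncached images each

if exceeds_limit "$pulls" "$ANON_LIMIT"; then
    echo "$pulls pulls would exceed the limit of $ANON_LIMIT"
else
    echo "$pulls pulls fit within the limit of $ANON_LIMIT"
fi
```

With these assumed numbers, one rollout produces 120 pulls, which is already above the anonymous limit before any CI/CD traffic is counted.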
Common errors include:
toomanyrequests: Too Many Requests. Please wait and try again later
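If you want to confirm how much of the limit is left from a given IP, Docker documents a check against a special ratelimitpreview/test repository that returns ratelimit-limit and ratelimit-remaining headers. The sketch below wraps that check in a function and adds two small helpers for reading a header value such as 100;w=21600. It assumes curl and jq are installed; the function names are hypothetical.

```shell
#!/bin/sh
# Sketch: inspect DockerHub rate-limit headers for anonymous pulls.
# Assumes curl and jq are available; function names are hypothetical.

# fetch_ratelimit_headers -> prints the ratelimit-* response headers
fetch_ratelimit_headers() {
    token=$(curl -s "https://auth.docker.io/token?service=registry.docker.io&scope=repository:ratelimitpreview/test:pull" | jq -r .token)
    curl -s --head -H "Authorization: Bearer $token" \
        https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest \
        | tr -d '\r' | grep -i '^ratelimit-'
}

# ratelimit_value "100;w=21600" -> prints the count before the semicolon
ratelimit_value() {
    echo "${1%%;*}"
}

# ratelimit_window_hours "100;w=21600" -> prints the window converted to hours
ratelimit_window_hours() {
    w="${1##*w=}"
    echo $(( w / 3600 ))
}
```

Calling fetch_ratelimit_headers from a node or CI runner shows the limit and remaining pulls for that IP; the two helpers split a header value into its count and its window in hours.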
Our goal was to create a self-sufficient ecosystem that avoided DockerHub rate limits while maintaining seamless operations for developers and CI/CD systems.
The Solution: Harbor Proxy Cache in Kubernetes
Harbor provides a proxy cache feature that caches remote registry images locally. By deploying Harbor in Kubernetes, we were able to:
- Cache frequently pulled images so that repeat pulls are served locally and do not count against DockerHub rate limits
- Improve image pull speed by avoiding external network calls
- Centralize image management with robust security features
Step-by-Step Setup of Harbor Proxy Cache in Kubernetes
1. Prerequisites
Ensure you have:
- A Kubernetes cluster.
- Helm installed on your local machine.
- Access to a DockerHub account (optional, for authenticated pulls).
2. Deploy Harbor in Kubernetes
Harbor can be deployed in Kubernetes using its official Helm chart.
a. Add the Harbor Helm Repository
helm repo add harbor https://helm.goharbor.io
helm repo update
b. Create a Namespace for Harbor
kubectl create namespace harbor
c. Install Harbor Using Helm
Customize the installation with a values.yaml file. Below is a sample configuration for a minimal setup. Note that the proxy block configures an outbound HTTP proxy for Harbor's own network traffic and can be omitted if your cluster reaches DockerHub directly; the proxy cache itself is configured later in the Harbor UI (step 3).
expose:
  type: ingress
  ingress:
    hosts:
      core: harbor.example.com
      # Notary was removed in Harbor v2.9; omit this host on newer chart versions
      notary: notary.harbor.example.com
    annotations:
      nginx.ingress.kubernetes.io/proxy-body-size: "100m"
externalURL: https://harbor.example.com
harborAdminPassword: yourpassword
proxy:
  httpProxy: http://proxy.example.com:3128
  httpsProxy: https://proxy.example.com:3128
  noProxy: 127.0.0.1,localhost,.example.com
persistence:
  persistentVolumeClaim:
    registry:
      storageClass: "default"
      size: 20Gi
Install Harbor using this configuration:
helm install harbor harbor/harbor -n harbor -f values.yaml
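Before moving on, it helps to check that the Harbor pods actually came up. The sketch below is one way to do that, assuming kubectl is pointed at the right cluster; the function name is hypothetical, and it only looks at the STATUS column, so a pod that is Running but not yet ready will still pass.

```shell
#!/bin/sh
# Sketch: list pods in a namespace that are neither Running nor Completed.
# Assumes kubectl is configured for the target cluster.

# pods_not_running NAMESPACE -> prints the names of pods in any other state
pods_not_running() {
    kubectl get pods -n "$1" --no-headers \
        | awk '$3 != "Running" && $3 != "Completed" { print $1 }'
}
```

Running pods_not_running harbor a few minutes after the install should print nothing once every component has started.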
3. Configure Harbor Proxy Cache for DockerHub
After deployment, access Harbor using the URL configured in values.yaml (e.g., https://harbor.example.com).
a. Log into the Harbor UI
- Open the Harbor web interface
- Log in with the admin credentials:
- Username: admin
- Password: (the harborAdminPassword value from values.yaml, yourpassword in this example)
b. Add a Proxy Cache Endpoint for DockerHub
- Go to Administration > Registries.
- Click + New Endpoint
- Configure the endpoint as follows:
- Registry Type: Docker Hub
- Name: dockerhub-proxy
- Registry URL: https://registry-1.docker.io
- Access ID and Secret: (Optional, for authenticated pulls)
- Save the configuration
c. Enable Proxy Cache for a Project
- Navigate to Projects in the Harbor UI
- Create a new project (Harbor only lets you enable proxy cache when the project is created, not on an existing project)
- Enable Proxy Cache for the project, linking it to the dockerhub-proxy endpoint
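The same two UI steps can likely be scripted against Harbor's REST API, which is handy when the setup has to be repeatable. The sketch below only builds the JSON request bodies. The /api/v2.0 paths and the field names (type, registry_id, project_name) reflect our reading of the Harbor v2 API and should be treated as assumptions to verify against the API explorer of your Harbor version. The project name dockerhub, the registry ID 1, and the HARBOR_PASSWORD variable are hypothetical.

```shell
#!/bin/sh
# Sketch: JSON bodies for creating a DockerHub endpoint and a proxy cache project.
# Paths and field names are assumptions; check them against your Harbor version.

# registry_payload NAME URL -> body for POST /api/v2.0/registries
registry_payload() {
    printf '{"name":"%s","type":"docker-hub","url":"%s"}\n' "$1" "$2"
}

# project_payload PROJECT REGISTRY_ID -> body for POST /api/v2.0/projects
project_payload() {
    printf '{"project_name":"%s","registry_id":%s}\n' "$1" "$2"
}

# Example calls (not executed here):
#   curl -u "admin:$HARBOR_PASSWORD" -H 'Content-Type: application/json' \
#        -X POST https://harbor.example.com/api/v2.0/registries \
#        -d "$(registry_payload dockerhub-proxy https://registry-1.docker.io)"
#   curl -u "admin:$HARBOR_PASSWORD" -H 'Content-Type: application/json' \
#        -X POST https://harbor.example.com/api/v2.0/projects \
#        -d "$(project_payload dockerhub 1)"
```

The registry ID passed to project_payload would come from listing the registries after the first call; 1 is used here only as a placeholder.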
4. Configure Kubernetes Nodes to Use Harbor
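If your nodes run Docker Engine as the container runtime, you can point the Docker daemon on each node at Harbor. Nodes that run containerd or CRI-O do not read /etc/docker/daemon.json and need the equivalent mirror setting in their own runtime configuration. Because Harbor serves proxied images under a project path, referencing the full Harbor path in your manifests (step 5) remains the most predictable way to route pulls through the cache.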
a. Add Harbor as a Registry Mirror
Edit the Docker daemon configuration (/etc/docker/daemon.json) on each node:
{
  "registry-mirrors": ["https://harbor.example.com"]
}
b. Restart Docker
sudo systemctl restart docker
5. Test the Setup
Pull a DockerHub image through Harbor:
- Log in to Harbor from your Docker client:
docker login harbor.example.com
- Pull an image:
docker pull harbor.example.com/<project-name>/library/nginx:latest
- Check Harbor to confirm the image is cached
In Kubernetes, update your deployment manifests to use images from Harbor:
image: harbor.example.com/<project-name>/library/nginx:latest
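Rewriting image names by hand is error-prone, mainly because official images such as nginx need the library/ prefix while namespaced images do not. The helper below is a minimal sketch of that mapping for DockerHub references only. The project name dockerhub and the function name are hypothetical; substitute your own proxy cache project.

```shell
#!/bin/sh
# Sketch: map a DockerHub image reference to its Harbor proxy cache path.
# Only handles DockerHub references; images from other registries are out of scope.

HARBOR_HOST="harbor.example.com"
HARBOR_PROJECT="dockerhub"   # hypothetical proxy cache project name

# to_harbor_ref IMAGE -> prints the equivalent reference behind Harbor
to_harbor_ref() {
    ref="${1#docker.io/}"            # drop an explicit docker.io/ prefix if present
    case "$ref" in
        */*) ;;                      # already namespaced, e.g. bitnami/redis
        *)   ref="library/$ref" ;;   # official images live under library/
    esac
    echo "$HARBOR_HOST/$HARBOR_PROJECT/$ref"
}

to_harbor_ref nginx:latest        # prints harbor.example.com/dockerhub/library/nginx:latest
to_harbor_ref bitnami/redis:7.2   # prints harbor.example.com/dockerhub/bitnami/redis:7.2
```

A helper like this can be used in a CI step or a templating script so that manifests keep their familiar image names and only the rendered output points at Harbor.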
Additional Tips for Kubernetes Deployments
Use ImagePullSecrets for Authentication
If Harbor requires authentication, create a Kubernetes secret for the credentials:
kubectl create secret docker-registry harbor-secret \
  --docker-server=harbor.example.com \
  --docker-username=<username> \
  --docker-password=<password> \
  --docker-email=<email>
Add the secret to your pods:
imagePullSecrets:
  - name: harbor-secret
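Adding imagePullSecrets to every pod spec gets repetitive. One alternative, sketched below, is to attach the secret to the namespace's default service account so that pods using that account pick it up automatically. The function name and the apps namespace in the usage note are hypothetical, and the secret must already exist in the same namespace.

```shell
#!/bin/sh
# Sketch: attach an image pull secret to the default service account of a namespace.
# Assumes kubectl is configured and the secret exists in that namespace.

# attach_pull_secret NAMESPACE SECRET_NAME
attach_pull_secret() {
    kubectl patch serviceaccount default -n "$1" \
        -p "{\"imagePullSecrets\": [{\"name\": \"$2\"}]}"
}
```

For example, attach_pull_secret apps harbor-secret would apply the secret to new pods in the apps namespace that use the default service account; existing pods are not changed.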
Automate Proxy Cache Cleanup
Harbor allows you to configure retention policies to manage cached images. Set up a policy to remove unused images periodically to conserve storage.
Results After Implementation
After deploying Harbor as a proxy cache in Kubernetes, we achieved the following:
- Bypassed DockerHub Rate Limits: Cached images significantly reduced external requests to DockerHub
- Faster Image Pulls: Local caching improved speed, especially in CI/CD pipelines
- Centralized Image Management: Developers could access and manage images through a single, secure interface