Kubernetes
Deploy chmonitor on Kubernetes with the vendored Helm chart or kustomize overlays, with health probes, autoscaling, and secrets management.
Run chmonitor on Kubernetes with the vendored Helm chart or raw kustomize manifests. Both use the same image (ghcr.io/chmonitor/chmonitor:vX.Y.Z), expose port 3000, run as the non-root app user (uid/gid 1001), and wire the same health probes.
Prerequisites
- A Kubernetes cluster and kubectl context.
- Helm 3 (for the chart) or kubectl with kustomize (for raw manifests).
- A reachable ClickHouse endpoint with a monitoring user.
The chart is published in two registries:
| Registry | Install command |
|---|---|
| Helm repo (Cloudflare Pages) | helm repo add chmonitor https://charts.chmonitor.dev |
| OCI (GHCR) | helm install my-chm oci://ghcr.io/chmonitor/chmonitor --version X.Y.Z |
Setup
Add the repo and install
helm repo add chmonitor https://charts.chmonitor.dev
helm repo update
helm install my-chm chmonitor/chmonitor \
  --set clickhouse.host="https://clickhouse.example.com:8443" \
  --set clickhouse.user="monitoring" \
  --set clickhouse.password="change-me"
Install with a values file (optional)
helm install my-chm chmonitor/chmonitor -f values.yaml
Example values.yaml:
image:
  tag: "vX.Y.Z"  # use the latest release tag from https://github.com/chmonitor/chmonitor/releases
clickhouse:
  host: "https://clickhouse.example.com:8443"
  user: "monitoring"
  password: "change-me"
ingress:
  enabled: true
  className: nginx
  hosts:
    - host: chmonitor.example.com
      paths:
        - path: /
          pathType: Prefix
resources:
  requests:
    cpu: 100m
    memory: 256Mi
  limits:
    cpu: 500m
    memory: 512Mi
Upgrade and uninstall:
helm upgrade my-chm chmonitor/chmonitor -f values.yaml
helm uninstall my-chm
Replace vX.Y.Z with the latest release tag from GitHub Releases.
To install from the OCI registry (GHCR) instead:
helm install my-chm oci://ghcr.io/chmonitor/chmonitor --version vX.Y.Z \
  --set clickhouse.host="https://clickhouse.example.com:8443" \
  --set clickhouse.user="monitoring" \
  --set clickhouse.password="change-me"
Pull and inspect the chart before installing:
helm pull oci://ghcr.io/chmonitor/chmonitor --version vX.Y.Z --untar
helm show values ./chmonitor
Clone the repo and install the chart directly, which is useful when you want to patch the chart before installing:
git clone https://github.com/chmonitor/chmonitor.git
cd chmonitor
helm install my-chm ./deploy/helm/chmonitor \
  --set clickhouse.host="https://clickhouse.example.com:8443" \
  --set clickhouse.user="monitoring" \
  --set clickhouse.password="change-me"
With kustomize, render the base manifests to preview them, then apply:
# Preview
kubectl kustomize deploy/kubernetes/base
# Apply
kubectl apply -k deploy/kubernetes/base
kubectl port-forward svc/chmonitor 3000:3000
Keep environment differences in an overlay:
# deploy/kubernetes/overlays/prod/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: monitoring
resources:
  - ../../base
images:
  - name: ghcr.io/chmonitor/chmonitor
    newTag: vX.Y.Z
replicas:
  - name: chmonitor
    count: 2
Verify
kubectl port-forward svc/my-chm-chmonitor 3000:3000
# open http://localhost:3000
Configure
ClickHouse connection
Store credentials in a Secret, not a ConfigMap
ClickHouse credentials contain passwords. Always use a Kubernetes Secret, not a ConfigMap.
kubectl create secret generic chmonitor-clickhouse \
  --from-literal=CLICKHOUSE_HOST='https://clickhouse.example.com:8443' \
  --from-literal=CLICKHOUSE_USER='monitoring' \
  --from-literal=CLICKHOUSE_PASSWORD='change-me'
Reference it in your Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: chmonitor
spec:
  replicas: 1
  selector:
    matchLabels:
      app: chmonitor
  template:
    metadata:
      labels:
        app: chmonitor
    spec:
      containers:
        - name: chmonitor
          image: ghcr.io/chmonitor/chmonitor:vX.Y.Z
          ports:
            - containerPort: 3000
          envFrom:
            - secretRef:
                name: chmonitor-clickhouse
          livenessProbe:
            httpGet:
              path: /healthz
              port: 3000
            initialDelaySeconds: 10
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /api/healthz
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
Multiple hosts
CLICKHOUSE_HOST defines the host count. CLICKHOUSE_USER and CLICKHOUSE_PASSWORD may be a single value (applied to all hosts) or one value per host position. CLICKHOUSE_NAME is optional. Position N maps to host N.
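The positional rule above can be sketched in a few lines of Python. This is only an illustration of the mapping as described here, not chmonitor's actual parser: the function names are hypothetical, and both the error on a length mismatch and the broadcast of a single CLICKHOUSE_NAME are assumptions.

```python
from typing import Dict, List, Optional


def _split(raw: str) -> List[str]:
    """Split a comma-separated env value and trim whitespace."""
    return [part.strip() for part in raw.split(",")]


def _per_host(raw: str, count: int, key: str) -> List[str]:
    """Broadcast a single value to every host, or require one value per host."""
    values = _split(raw)
    if len(values) == 1:
        return values * count
    if len(values) != count:
        # Assumption: a length mismatch is treated as a configuration error.
        raise ValueError(f"{key} has {len(values)} values for {count} hosts")
    return values


def expand_hosts(env: Dict[str, str]) -> List[Dict[str, Optional[str]]]:
    """Map position N of each variable to host N (hypothetical helper)."""
    hosts = _split(env["CLICKHOUSE_HOST"])  # the host list defines the count
    users = _per_host(env["CLICKHOUSE_USER"], len(hosts), "CLICKHOUSE_USER")
    passwords = _per_host(
        env["CLICKHOUSE_PASSWORD"], len(hosts), "CLICKHOUSE_PASSWORD"
    )
    names: List[Optional[str]] = [None] * len(hosts)  # CLICKHOUSE_NAME is optional
    if env.get("CLICKHOUSE_NAME"):
        names = list(_per_host(env["CLICKHOUSE_NAME"], len(hosts), "CLICKHOUSE_NAME"))
    return [
        {"host": h, "user": u, "password": p, "name": n}
        for h, u, p, n in zip(hosts, users, passwords, names)
    ]
```

Running a check like this over the values you plan to put in the Secret catches a missing or extra comma before the pod starts. The example that follows uses two hosts with one value per position.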
kubectl create secret generic chmonitor-clickhouse \
  --from-literal=CLICKHOUSE_HOST='https://ch1:8443,https://ch2:8443' \
  --from-literal=CLICKHOUSE_USER='monitoring,monitoring' \
  --from-literal=CLICKHOUSE_PASSWORD='pass1,pass2' \
  --from-literal=CLICKHOUSE_NAME='shard-1,shard-2'
Query / pool tuning
Add these to a ConfigMap (non-secret values):
apiVersion: v1
kind: ConfigMap
metadata:
  name: chmonitor-config
data:
  CLICKHOUSE_MAX_EXECUTION_TIME: "30"
  CLICKHOUSE_TZ: "UTC"
  CLICKHOUSE_DATABASE: "system"
  CLICKHOUSE_POOL_SIZE: "10"
  CLICKHOUSE_POOL_TIMEOUT: "300000"
  CLICKHOUSE_POOL_CLEANUP_INTERVAL: "60000"
Reference both in the Deployment:
envFrom:
  - secretRef:
      name: chmonitor-clickhouse
  - configMapRef:
      name: chmonitor-config
Feature permissions
# chmonitor-config ConfigMap additions
CHM_DISABLED_FEATURES: "peerdb,actions"
CHM_AUTH_REQUIRED_FEATURES: "agent,settings,mcp"
CHM_FEATURE_AGENT_ACCESS: "authenticated"
A config file is recommended for complex rules. Create a ConfigMap with a TOML file:
apiVersion: v1
kind: ConfigMap
metadata:
  name: chmonitor-features
data:
  chmonitor.toml: |
    [features.agent]
    access = "authenticated"
    [features.settings]
    enabled = false
    [features.mcp]
    access = "authenticated"
    [features.actions]
    enabled = false
Mount it and point CHM_CONFIG_FILE at it:
containers:
  - name: chmonitor
    image: ghcr.io/chmonitor/chmonitor:vX.Y.Z
    env:
      - name: CHM_CONFIG_FILE
        value: /config/chmonitor.toml
    volumeMounts:
      - name: features-config
        mountPath: /config
        readOnly: true
volumes:
  - name: features-config
    configMap:
      name: chmonitor-features
Feature ids: overview, agent, insights, health, queries, tables, metrics, dashboard, security, logs, settings, cluster, operations, actions, mcp, docs, about.
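A typo in a comma-separated feature list is easy to miss in a ConfigMap. The sketch below checks a value against the ids listed above before you apply it. The helper name is hypothetical, and the id set is copied from this page and may lag behind the release you run, so treat a reported id as something to double-check rather than a definite error.

```python
from typing import List

# Feature ids as listed on this page (may not be exhaustive for every release).
KNOWN_FEATURE_IDS = {
    "overview", "agent", "insights", "health", "queries", "tables",
    "metrics", "dashboard", "security", "logs", "settings", "cluster",
    "operations", "actions", "mcp", "docs", "about",
}


def unknown_features(csv_value: str) -> List[str]:
    """Return the ids in a comma-separated list that are not in the known set."""
    ids = [item.strip() for item in csv_value.split(",") if item.strip()]
    return sorted(set(ids) - KNOWN_FEATURE_IDS)


if __name__ == "__main__":
    # Example: validate the value intended for CHM_AUTH_REQUIRED_FEATURES.
    print(unknown_features("agent,settings,mcp"))
```

An empty list means every id in the value appears in the list above.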
Authentication
Open access:
# in ConfigMap
CHM_AUTH_PROVIDER: "none"
API key (can combine with any provider):
kubectl create secret generic chmonitor-auth \
  --from-literal=CHM_API_KEY_SECRET='a-long-random-secret'
Clerk:
kubectl create secret generic chmonitor-clerk \
  --from-literal=CLERK_SECRET_KEY='sk_live_...'
# in ConfigMap: set the canonical CHM_* names once
CHM_AUTH_PROVIDER: "clerk"
CHM_CLERK_PUBLISHABLE_KEY: "pk_live_..."  # public publishable key
The client-side VITE_AUTH_PROVIDER / VITE_CLERK_PUBLISHABLE_KEY values are derived from these at build time.
Clerk needs a custom image
The client half of these settings is inlined at build time. The pre-built GHCR image is built with auth none, so enabling Clerk means building a custom image with CHM_AUTH_PROVIDER / CHM_CLERK_PUBLISHABLE_KEY set at build time, then setting them (plus the CLERK_SECRET_KEY Secret) at runtime. See Authentication.
Cloudflare Access:
# in ConfigMap
CHM_AUTH_PROVIDER: "proxy"
CHM_CF_ACCESS_TEAM_DOMAIN: "https://yourteam.cloudflareaccess.com"
CHM_CF_ACCESS_AUD: "<audience-tag>"
nginx ingress / sidecar:
kubectl create secret generic chmonitor-proxy \
  --from-literal=CHM_PROXY_AUTH_SECRET='a-long-random-secret'
# in ConfigMap
CHM_AUTH_PROVIDER: "proxy"
CHM_PROXY_AUTH_HEADER: "X-Forwarded-User"
CHM_PROXY_SHARED_SECRET_HEADER: "X-Chm-Proxy-Secret"
Without CHM_PROXY_AUTH_SECRET, the trusted-header provider is disabled. Configure your ingress to set the header and the secret. See Authentication.
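To make the role of the shared secret concrete, here is a minimal sketch of the general trusted-header pattern: the user header is honored only when the request also carries the expected secret. This is an assumption about how such a check typically works, not chmonitor's actual implementation. The function name and its parameters are hypothetical, and the header names are the example values from the ConfigMap above.

```python
import hmac
from typing import Mapping, Optional


def trusted_user(
    headers: Mapping[str, str],
    shared_secret: str,
    user_header: str = "X-Forwarded-User",
    secret_header: str = "X-Chm-Proxy-Secret",
) -> Optional[str]:
    """Return the proxied user name, or None if the request is not trusted."""
    if not shared_secret:
        # No secret configured: the provider is treated as disabled.
        return None
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    presented = lowered.get(secret_header.lower(), "")
    # Constant-time comparison avoids leaking the secret through timing.
    if not hmac.compare_digest(presented.encode(), shared_secret.encode()):
        return None
    return lowered.get(user_header.lower()) or None
```

The point of the pattern is that a client reaching the pod directly can forge the user header but cannot supply the secret, so the ingress should set both headers on every request and drop any copies sent by the client.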
AI agent
kubectl create secret generic chmonitor-agent \
  --from-literal=LLM_API_KEY='sk-...' \
  --from-literal=AGENT_API_TOKEN='bearer-token-for-agent-api'
# in ConfigMap
LLM_API_BASE: "https://openrouter.ai/api/v1"
LLM_MODEL: "openrouter/free"
AGENT_ENABLE_CONTROL_TOOLS: "false"
Set CHM_FEATURE_AGENT_ACCESS=authenticated in the ConfigMap to require login. Keep LLM_API_KEY in the Secret, never in a VITE_* var or ConfigMap.
Conversation store
Default: browser localStorage — no server config needed.
Server-side persistence requires CHM_FEATURE_CONVERSATION_DB=true baked into the image at build time (its client VITE_FEATURE_CONVERSATION_DB is derived at build). Build a custom image with this flag to enable it. At runtime, set the backend via CONVERSATION_STORE_BACKEND in a ConfigMap.
On Kubernetes, use postgres or agentstate for the conversation store. D1 and Durable Object stores are Cloudflare-only.
kubectl create secret generic chmonitor-postgres \
  --from-literal=DATABASE_URL='postgresql://user:pass@host:5432/dbname'
# in ConfigMap
CONVERSATION_STORE_BACKEND: "postgres"
kubectl create secret generic chmonitor-agentstate \
  --from-literal=AGENTSTATE_API_KEY='as_live_...'
# in ConfigMap
CONVERSATION_STORE_BACKEND: "agentstate"
Health alerting
The health sweep runs at GET /api/cron/health-sweep. Trigger it from a Kubernetes CronJob:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: chmonitor-health-sweep
spec:
  schedule: "*/5 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: sweep
              image: curlimages/curl:latest
              env:
                - name: CRON_SECRET
                  valueFrom:
                    secretKeyRef:
                      name: chmonitor-cron
                      key: CRON_SECRET
              command:
                - sh
                - -c
                - 'curl -sf -H "Authorization: Bearer $CRON_SECRET" http://chmonitor:3000/api/cron/health-sweep'
          restartPolicy: OnFailure
Set the webhook and secret:
# in ConfigMap
HEALTH_ALERT_ENABLED: "true"
HEALTH_ALERT_MIN_SEVERITY: "warning"
# Health-alerting credentials live in their own secret so re-running this
# never touches chmonitor-auth (which holds CHM_API_KEY_SECRET).
kubectl create secret generic chmonitor-cron \
  --from-literal=CRON_SECRET='a-random-secret' \
  --from-literal=HEALTH_ALERT_WEBHOOK_URL='https://hooks.slack.com/services/...' \
  --dry-run=client -o yaml | kubectl apply -f -
Reference HEALTH_ALERT_WEBHOOK_URL from the chmonitor-cron secret in the Deployment using an explicit secretKeyRef (do not add the whole cron secret via envFrom, which would also expose CRON_SECRET to the app container):
env:
  - name: HEALTH_ALERT_WEBHOOK_URL
    valueFrom:
      secretKeyRef:
        name: chmonitor-cron
        key: HEALTH_ALERT_WEBHOOK_URL
Branding
Branding vars are inlined at build time, so they have no effect as runtime env vars on the pre-built image. To customize branding, build your own image with these set, for example in CI:
VITE_TITLE_SHORT=MyCluster
VITE_LOGO=/logo.png
Health probes
- Liveness: GET /healthz, always 200 while the process runs.
- Readiness: GET /api/healthz, returns 503 when no ClickHouse host is reachable.
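When a pod stays unready, it helps to hit both endpoints by hand and see which one fails. The sketch below does that over a port-forward and maps the two status codes to a state. The helper names and the state labels are hypothetical, and the default base URL assumes you forwarded port 3000 to localhost as in the Verify step.

```python
import urllib.error
import urllib.request


def probe(url: str, timeout: float = 3.0) -> int:
    """GET a URL and return its HTTP status, or 0 if the connection fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # non-2xx responses, e.g. 503 from readiness
    except (urllib.error.URLError, OSError):
        return 0  # refused, DNS failure, timeout


def pod_state(live_status: int, ready_status: int) -> str:
    """Interpret the two probe results the way the kubelet would act on them."""
    if live_status != 200:
        return "not-live"  # kubelet would restart the container
    if ready_status != 200:
        return "live-not-ready"  # running, but removed from Service endpoints
    return "ready"


if __name__ == "__main__":
    base = "http://localhost:3000"  # assumes an active kubectl port-forward
    print(pod_state(probe(base + "/healthz"), probe(base + "/api/healthz")))
```

A result of live-not-ready points at the ClickHouse connection (host, credentials, network policy) rather than at the chmonitor process itself.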
Autoscaling
autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80
The dashboard is stateless, so scaling out is safe. The readiness probe keeps traffic off pods until ClickHouse is reachable.
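To get a feel for what these numbers do, the sketch below applies the core HorizontalPodAutoscaler formula, desired = ceil(current replicas × current utilization / target utilization), clamped to the min and max. It assumes the chart renders these values into a standard HPA, and it deliberately ignores the controller's tolerance band and stabilization windows, so real scaling decisions will be somewhat more conservative. The function name is hypothetical and the defaults mirror the values above.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_cpu_pct: float,
    target_cpu_pct: float = 80,
    min_replicas: int = 2,
    max_replicas: int = 10,
) -> int:
    """Core HPA scaling rule, clamped to the configured replica bounds."""
    raw = math.ceil(current_replicas * current_cpu_pct / target_cpu_pct)
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # Two pods averaging 120% of their CPU request against an 80% target.
    print(desired_replicas(2, 120))
```

CPU utilization here is measured relative to the container's CPU request, so autoscaling only behaves sensibly when resources.requests.cpu is set, as in the example values.yaml.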
Secrets management
For GitOps workflows, do not commit real passwords. Use:
- External Secrets — sync from AWS Secrets Manager, GCP Secret Manager, Vault, etc.
- SOPS — encrypt secrets in Git.
- Sealed Secrets — encrypt for a specific cluster.
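The reason a plain Secret manifest must not be committed is that its data values are only base64-encoded, not encrypted. The sketch below builds a Secret manifest and then recovers the plaintext from it, which is exactly what anyone with read access to the repository could do. The helper names are hypothetical and the secret name and value are the placeholders used earlier on this page.

```python
import base64
import json
from typing import Dict


def secret_manifest(name: str, values: Dict[str, str]) -> dict:
    """Build an Opaque Secret manifest with base64-encoded data values."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        "data": {
            key: base64.b64encode(value.encode()).decode()
            for key, value in values.items()
        },
    }


def decode_secret(manifest: dict) -> Dict[str, str]:
    """Recover the plaintext values: no key is needed, only base64 decoding."""
    return {
        key: base64.b64decode(value).decode()
        for key, value in manifest["data"].items()
    }


if __name__ == "__main__":
    manifest = secret_manifest(
        "chmonitor-clickhouse", {"CLICKHOUSE_PASSWORD": "change-me"}
    )
    print(json.dumps(manifest, indent=2))
    print(decode_secret(manifest))
```

The tools listed above address this in different ways: they either keep the plaintext out of Git entirely or store only an encrypted form that the cluster decrypts.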
Upgrading
Update the image tag
Update the image tag in your values.yaml or kustomize overlay.
Apply the change
# Helm
helm upgrade my-chm ./deploy/helm/chmonitor -f values.yaml
# kustomize
kubectl apply -k deploy/kubernetes/overlays/prod
Verify the rollout
kubectl rollout status deployment/chmonitor
For breaking changes between major versions, see Migrating to v0.3.
Troubleshooting
Validate the chart and manifests before applying:
helm lint ./deploy/helm/chmonitor
helm template release ./deploy/helm/chmonitor | kubeconform -strict -summary
kubectl kustomize deploy/kubernetes/base | kubeconform -strict -summary