chmonitor

Grafana bridge

Query ClickHouse system tables from Grafana panels alongside chmonitor — copy-paste SQL for running queries, slow queries, merges, parts, and replication lag.

Run chmonitor and Grafana side by side on the same ClickHouse cluster. There is no official integration and no special bridge process — both tools query ClickHouse system tables directly using the ClickHouse HTTP or native protocol. This page walks through the setup and gives you copy-paste panel queries derived from the same SQL chmonitor uses internally.

Community recipe

This is a community starting point, not an officially maintained integration. The queries are adapted from chmonitor's query configs and target ClickHouse v23.8+. Use them as a baseline and adjust to your schema version.

How it works

Both tools connect as separate clients. chmonitor uses the @clickhouse/client library; Grafana uses its ClickHouse data source plugin. Neither has any awareness of the other — they share nothing except the ClickHouse endpoint and credentials.

You can give each tool the same credentials, or create separate ClickHouse users with GRANT SELECT ON system.* (see Least-privilege connection presets).

Setup

Install the ClickHouse data source

Install the official plugin from Grafana Labs:

grafana-cli plugins install grafana-clickhouse-datasource

Or via the Grafana UI: Configuration → Plugins → Search "ClickHouse".

Point it at your ClickHouse host

Configure the data source to point at the same ClickHouse host that chmonitor uses:

FieldValue
Server addressYour CLICKHOUSE_HOST value (without protocol)
Server port8123 (HTTP) or 9000 (native)
UsernameSame as CLICKHOUSE_USER
PasswordSame as CLICKHOUSE_PASSWORD
Default databasesystem
TLSMatch your ClickHouse TLS settings

Add the panel queries

Paste the panel queries below into a Grafana panel's "Query" field with the ClickHouse data source selected, or import the starter dashboard JSON.

Panel queries

The five queries below are adapted directly from chmonitor's query configs. They are plain SQL — paste them into a Grafana panel's "Query" field with the ClickHouse data source selected.

1. Running queries (system.processes)

Shows queries currently executing on the server. Equivalent to chmonitor's Running Queries page.

SELECT
    query_id,
    user,
    elapsed,
    formatReadableTimeDelta(elapsed) AS readable_elapsed,
    read_rows,
    formatReadableQuantity(read_rows) AS readable_read_rows,
    formatReadableSize(read_bytes) AS readable_read_bytes,
    formatReadableSize(memory_usage) AS readable_memory_usage,
    length(thread_ids) AS thread_count,
    multiIf(
        interface = 1, 'TCP',
        interface = 2, 'HTTP',
        interface = 3, 'gRPC',
        interface = 4, 'MySQL',
        interface = 5, 'PostgreSQL',
        interface = 6, 'Local',
        interface = 7, 'Interserver',
        toString(interface)
    ) AS interface_label,
    substr(query, 1, 200) AS query_preview
FROM system.processes
WHERE is_cancelled = 0
ORDER BY elapsed DESC

Suggested panel type: Table. Sort by elapsed descending. Color elapsed with thresholds at 5 s (yellow) and 30 s (red).

2. Slow queries (system.query_log)

Top slowest completed queries from the last 24 hours. Equivalent to chmonitor's Slow Queries page.

SELECT
    query_id,
    query_start_time,
    query_duration_ms,
    query_duration_ms / 1000 AS query_duration_s,
    user,
    read_rows,
    formatReadableQuantity(read_rows) AS readable_read_rows,
    formatReadableSize(read_bytes) AS readable_read_bytes,
    formatReadableSize(memory_usage) AS readable_memory_usage,
    substr(replace(query, '\n', ' '), 1, 300) AS query_preview
FROM system.query_log
WHERE
    type = 'QueryFinish'
    AND query_duration_ms >= 5000
    AND event_time > now() - INTERVAL 24 HOUR
ORDER BY query_duration_ms DESC
LIMIT 50

Requires system.query_log

system.query_log must be enabled in your ClickHouse server config. The query_cache_usage column is available from ClickHouse v24.1+; omit it on older versions.

Suggested panel type: Table. Use a bar gauge overlay on query_duration_s.

3. Active merges (system.merges)

Merges and mutations currently in progress. Equivalent to chmonitor's Merges page.

SELECT
    database,
    table,
    partition_id,
    elapsed,
    round(progress * 100, 1) AS pct_progress,
    num_parts,
    formatReadableQuantity(rows_read) AS readable_rows_read,
    formatReadableQuantity(rows_written) AS readable_rows_written,
    formatReadableSize(memory_usage) AS readable_memory_usage,
    is_mutation,
    merge_type,
    merge_algorithm
FROM system.merges
ORDER BY progress DESC

Suggested panel type: Table. Add a stat panel with COUNT(*) for an at-a-glance merge count.

4. Parts per table (system.parts)

Row and byte counts per active part, aggregated to the table level. Derived from chmonitor's part-info query.

SELECT
    database,
    table,
    count() AS part_count,
    sum(rows) AS total_rows,
    formatReadableQuantity(sum(rows)) AS readable_rows,
    formatReadableSize(sum(data_compressed_bytes)) AS readable_compressed,
    formatReadableSize(sum(data_uncompressed_bytes)) AS readable_uncompressed,
    round(
        sum(data_uncompressed_bytes) / nullIf(sum(data_compressed_bytes), 0),
        2
    ) AS compression_ratio
FROM system.parts
WHERE active = 1
GROUP BY database, table
ORDER BY sum(data_compressed_bytes) DESC
LIMIT 50

Suggested panel type: Table. Highlight tables where part_count is unusually high (many small parts indicate a merge backlog).

5. Replication lag (system.replicas)

Replication queue depth and delay per replicated table. Equivalent to chmonitor's Replicas page.

SELECT
    database,
    table,
    engine,
    is_leader,
    is_readonly,
    absolute_delay,
    queue_size,
    inserts_in_queue,
    merges_in_queue,
    future_parts,
    total_replicas,
    active_replicas
FROM system.replicas
ORDER BY absolute_delay DESC, queue_size DESC

Replicated tables only

system.replicas is only populated for tables using a Replicated*MergeTree engine. Returns empty on standalone (non-replicated) ClickHouse instances.

Suggested panel type: Table. Add a stat panel with max(absolute_delay) and a threshold alert at 60 s.

Starter dashboard JSON

A minimal Grafana dashboard JSON is available at docs/assets/grafana/chmonitor-overview.json in this repository.

Import it via Dashboards → Import → Upload JSON file. Select your ClickHouse data source when prompted.

The starter includes all five panels above with suggested layouts and thresholds. It is a community baseline — not an official release artifact.

Credentials and access control

chmonitor and Grafana can share a ClickHouse user or use separate users. For production, separate least-privilege users are recommended so each tool's query traffic is identifiable in system.query_log.

Example: create a read-only Grafana user:

CREATE USER grafana_ro IDENTIFIED BY 'your-password';
GRANT SELECT ON system.* TO grafana_ro;

See Least-privilege connection presets for a complete example including network restrictions.

Limitations

  • No real-time streaming. Grafana polls ClickHouse on a configurable interval (minimum ~5 s). chmonitor's running-queries view also polls every 5 s. Neither tool provides push-based streaming from ClickHouse.
  • system.query_log retention. Slow query history depends on your query_log TTL. Default retention is 30 days; adjust via system.query_log flush_interval_milliseconds and partition TTL in your ClickHouse config.
  • system.replicas requires ZooKeeper/Keeper. If your cluster does not use replication, the replicas panel returns no rows.
  • ClickHouse version. Queries target v23.8+. Some columns (query_cache_usage, peak_threads_usage) are only available on newer versions — the queries fall back gracefully or note the version requirement inline.

On this page