chmonitor
Getting Started

Enable system tables

Enable and configure ClickHouse system log tables for full chmonitor feature coverage, including query_log, metric_log, and optional thread logs.

chmonitor reads from ClickHouse system.* log tables. Most are enabled by default in a fresh installation, but some deployments disable them with <... remove="1"/> to save disk. This page lists which tables are needed and how to enable them.

Required tables

These tables are needed for core monitoring features. Missing any of them degrades the corresponding pages.

TableWhat it powersDefault
system.query_logQuery history, performance analysis, history pagesEnabled
system.metric_logHistorical CPU, memory, network metricsEnabled
system.asynchronous_metric_logPeriodic async metrics chartsEnabled
system.part_logPart/merge operation historyEnabled
system.error_logServer error history (v23.8+)Enabled

Optional tables

These unlock specific features. chmonitor handles their absence gracefully — the related page shows an empty state or a notice instead of crashing.

TableFeatureDefaultNotes
system.query_thread_logThread analysis, parallelization statsDisabledNeeds log_query_threads=1 in profile
system.processors_profile_logQuery profilerDisabledNeeds explicit config entry
system.query_views_logQuery views logDisabledNeeds explicit config entry
system.text_logServer text log viewerDisabledNeeds explicit config entry
system.session_logLogin and session historyDisabledNeeds explicit config entry
system.crash_logCrash log viewerDisabledNeeds explicit config entry
system.trace_logTrace-level profilingDisabledNeeds explicit config entry
system.backup_logBackup/restore historyNot presentOnly exists when backup is configured
system.zookeeperZooKeeper / Keeper statusNot presentOnly available when Keeper/ZooKeeper is configured

Enable the tables

Add the config block

Create or edit /etc/clickhouse-server/config.d/system-logs.xml:

<clickhouse>
  <!-- Required: core monitoring tables -->
  <query_log>
    <database>system</database>
    <table>query_log</table>
    <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    <max_size_rows>1048576</max_size_rows>
    <reserved_size_rows>8192</reserved_size_rows>
    <buffer_size_rows_flush_threshold>524288</buffer_size_rows_flush_threshold>
    <flush_on_crash>true</flush_on_crash>
  </query_log>

  <metric_log>
    <database>system</database>
    <table>metric_log</table>
    <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    <collect_interval_milliseconds>60000</collect_interval_milliseconds>
    <max_size_rows>1048576</max_size_rows>
    <reserved_size_rows>8192</reserved_size_rows>
    <buffer_size_rows_flush_threshold>524288</buffer_size_rows_flush_threshold>
    <flush_on_crash>true</flush_on_crash>
  </metric_log>

  <asynchronous_metric_log>
    <database>system</database>
    <table>asynchronous_metric_log</table>
    <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    <collect_interval_milliseconds>60000</collect_interval_milliseconds>
    <max_size_rows>1048576</max_size_rows>
    <reserved_size_rows>8192</reserved_size_rows>
    <buffer_size_rows_flush_threshold>524288</buffer_size_rows_flush_threshold>
    <flush_on_crash>true</flush_on_crash>
  </asynchronous_metric_log>

  <part_log>
    <database>system</database>
    <table>part_log</table>
    <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    <max_size_rows>1048576</max_size_rows>
    <reserved_size_rows>8192</reserved_size_rows>
    <buffer_size_rows_flush_threshold>524288</buffer_size_rows_flush_threshold>
    <flush_on_crash>true</flush_on_crash>
  </part_log>

  <error_log>
    <database>system</database>
    <table>error_log</table>
    <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    <collect_interval_milliseconds>1000</collect_interval_milliseconds>
    <max_size_rows>1048576</max_size_rows>
    <reserved_size_rows>8192</reserved_size_rows>
    <buffer_size_rows_flush_threshold>524288</buffer_size_rows_flush_threshold>
    <flush_on_crash>true</flush_on_crash>
  </error_log>

  <!-- Optional: thread analysis -->
  <query_thread_log>
    <database>system</database>
    <table>query_thread_log</table>
    <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    <max_size_rows>1048576</max_size_rows>
    <reserved_size_rows>8192</reserved_size_rows>
    <buffer_size_rows_flush_threshold>524288</buffer_size_rows_flush_threshold>
    <flush_on_crash>true</flush_on_crash>
  </query_thread_log>

  <!-- Optional: query profiler -->
  <processors_profile_log>
    <database>system</database>
    <table>processors_profile_log</table>
    <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    <max_size_rows>1048576</max_size_rows>
    <reserved_size_rows>8192</reserved_size_rows>
    <buffer_size_rows_flush_threshold>524288</buffer_size_rows_flush_threshold>
    <flush_on_crash>true</flush_on_crash>
  </processors_profile_log>

  <!-- Optional: server text log viewer -->
  <text_log>
    <database>system</database>
    <table>text_log</table>
    <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    <max_size_rows>1048576</max_size_rows>
    <reserved_size_rows>8192</reserved_size_rows>
    <buffer_size_rows_flush_threshold>524288</buffer_size_rows_flush_threshold>
    <flush_on_crash>true</flush_on_crash>
  </text_log>

  <!-- Optional: session/login history -->
  <session_log>
    <database>system</database>
    <table>session_log</table>
    <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    <max_size_rows>1048576</max_size_rows>
    <reserved_size_rows>8192</reserved_size_rows>
    <buffer_size_rows_flush_threshold>524288</buffer_size_rows_flush_threshold>
    <flush_on_crash>true</flush_on_crash>
  </session_log>

  <!-- Optional: trace log -->
  <trace_log>
    <database>system</database>
    <table>trace_log</table>
    <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    <max_size_rows>1048576</max_size_rows>
    <reserved_size_rows>8192</reserved_size_rows>
    <buffer_size_rows_flush_threshold>524288</buffer_size_rows_flush_threshold>
    <flush_on_crash>true</flush_on_crash>
  </trace_log>

  <!-- Optional: crash log -->
  <crash_log>
    <database>system</database>
    <table>crash_log</table>
    <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    <max_size_rows>1024</max_size_rows>
    <reserved_size_rows>1024</reserved_size_rows>
    <buffer_size_rows_flush_threshold>512</buffer_size_rows_flush_threshold>
    <flush_on_crash>true</flush_on_crash>
  </crash_log>
</clickhouse>

Enable log_query_threads

For query_thread_log to collect data, set log_query_threads=1 in the profile used by the monitoring user:

<!-- /etc/clickhouse-server/users.d/profiles.xml -->
<clickhouse>
  <profiles>
    <monitoring_profile>
      <log_query_threads>1</log_query_threads>
    </monitoring_profile>
  </profiles>
</clickhouse>

Restart and verify

After applying the config, restart ClickHouse and check which log tables are present:

sudo systemctl restart clickhouse-server

clickhouse-client --query \
  "SELECT name FROM system.tables WHERE database = 'system' AND name LIKE '%_log' ORDER BY name"

Re-enabling disabled tables

If your config has entries like:

<metric_log remove="1"/>
<asynchronous_metric_log remove="1"/>
<text_log remove="1"/>
<query_thread_log remove="1"/>

Remove those lines and add the corresponding config block from the section above.

The remove attribute prevents table creation

The remove="1" attribute completely prevents the table from being created — removing the attribute alone is not enough if no config block exists.

Controlling disk usage

System log tables can grow large. Useful levers:

  • max_size_rows: Caps the table size (default ~1M rows). Rows are dropped when the limit is hit.
  • TTL: Configure a TTL to prune old rows automatically. This is set in config.xml (or a file in config.d/) under the relevant log block — for example, <ttl>event_date + INTERVAL 30 DAY DELETE</ttl>. Direct ALTER TABLE on system tables is generally restricted.
  • collect_interval_milliseconds: Higher values mean less frequent metric collection. 60000 (1 minute) is a reasonable default for most deployments.

On this page