Hologres Slow Query Analysis

Other

Hologres slow query log analysis and diagnosis skill. Use for analyzing slow queries, failed queries, query performance diagnosis, and log management in Alibaba Cloud Hologres. Triggers: "hologres slow query", "hg_query_log", "query diagnosis", "慢Query分析", "Hologres性能诊断"

Install

openclaw skills install hologres-slow-query-analysis

Prerequisites

This skill requires hologres-cli to be installed first:

pip install hologres-cli
export HOLOGRES_SKILL=hologres-slow-query-analysis

All SQL execution and GUC parameter operations depend on hologres-cli commands (hologres sql run, hologres guc set).

Hologres Slow Query Analysis

Diagnose and analyze slow/failed queries in Alibaba Cloud Hologres using the hologres.hg_query_log system table.

Version Requirements

Hologres VersionFeature
V0.10+Basic slow query log
V2.2+SQL fingerprint (digest)
V2.2.7+Default threshold 100ms
V3.0.2+Aggregated records for <100ms queries

Quick Start

1. Check Permissions

-- Superuser: view all DB logs
ALTER USER "cloud_account_id" SUPERUSER;

-- Or join pg_read_all_stats group
GRANT pg_read_all_stats TO "cloud_account_id";

-- For current DB only (SPM model)
CALL spm_grant('<db_name>_admin', 'cloud_account_id');

2. Basic Query Count

SELECT count(*) FROM hologres.hg_query_log;

3. Recent Slow Queries (10 min)

SELECT status AS "Status",
       duration AS "Duration(ms)",
       query_start AS "Start Time",
       (read_bytes/1048576)::text || ' MB' AS "Read",
       (memory_bytes/1048576)::text || ' MB' AS "Memory",
       (cpu_time_ms/1000)::text || ' s' AS "CPU",
       query_id AS "QueryID",
       query::char(50) AS "Query"
FROM hologres.hg_query_log
WHERE query_start >= now() - interval '10 min'
ORDER BY duration DESC
LIMIT 100;

Core Diagnostic Workflows

Workflow 1: Find Resource-Heavy Queries

Use when CPU/memory usage is high.

-- Top 10 CPU-consuming queries (past day)
SELECT digest, avg(cpu_time_ms), sum(cpu_time_ms)
FROM hologres.hg_query_log
WHERE query_start >= CURRENT_DATE - INTERVAL '1 day'
  AND digest IS NOT NULL AND usename != 'system'
GROUP BY 1 ORDER BY 3 DESC LIMIT 10;

Workflow 2: Find Failed Queries

SELECT status, message::char(100), duration, query_start, query_id, query::char(80)
FROM hologres.hg_query_log
WHERE query_start BETWEEN '2024-01-01 00:00:00'::timestamptz 
      AND '2024-01-01 01:00:00'::timestamptz
  AND status = 'FAILED'
ORDER BY query_start ASC LIMIT 100;

Workflow 3: Query Phase Analysis

Identify bottleneck phase (optimization/startup/execution).

SELECT status, duration AS "Total(ms)",
       optimization_cost AS "Optimize(ms)",
       start_query_cost AS "Startup(ms)",
       get_next_cost AS "Execute(ms)",
       duration - optimization_cost - start_query_cost - get_next_cost AS "Other(ms)",
       query_id, query::char(50)
FROM hologres.hg_query_log
WHERE query_start >= now() - interval '10 min'
ORDER BY duration DESC LIMIT 100;

Workflow 4: Compare with Yesterday

SELECT query_date, count(1), sum(read_bytes), sum(cpu_time_ms)
FROM hologres.hg_query_log
WHERE query_start >= now() - interval '3 h'
GROUP BY query_date
UNION ALL
SELECT query_date, count(1), sum(read_bytes), sum(cpu_time_ms)
FROM hologres.hg_query_log
WHERE query_start >= now() - interval '1d 3h' AND query_start <= now() - interval '1d'
GROUP BY query_date;

Key Fields Reference

FieldDescription
query_idUnique query identifier
digestSQL fingerprint (MD5 hash)
durationTotal query time (ms)
cpu_time_msCPU time consumed
memory_bytesPeak memory usage
read_bytesData read volume
engine_typeQuery engine (HQE/PQE/SDK/PG)
optimization_costPlan generation time
start_query_costQuery startup time
get_next_costExecution time

Configuration

-- Set slow query threshold (DB level, superuser only)
ALTER DATABASE dbname SET log_min_duration_statement = '250ms';

-- Session level
SET log_min_duration_statement = '250ms';

-- Set log retention (V3.0.27+, 3-30 days)
ALTER DATABASE dbname SET hg_query_log_retention_time_sec = 2592000;

Or use the CLI for database-level settings:

hologres guc set log_min_duration_statement '250ms'
hologres guc set hg_query_log_retention_time_sec 2592000

References

DocumentContent
diagnostic-queries.mdComplete diagnostic SQL collection
log-export.mdExport logs to internal/external tables
configuration.mdConfiguration parameters

Best Practices

  1. Always filter by query_start for better performance
  2. Use digest to group similar queries for pattern analysis
  3. Check engine_type - PQE queries may need optimization
  4. For start_query_cost high: check locks or resource contention
  5. For get_next_cost high: optimize SQL or add indexes
  6. Regular cleanup: set appropriate retention period