AI-Powered SQL Generation

LogChef uses Large Language Models (LLMs) to translate natural language questions about your logs directly into ClickHouse SQL queries. This feature lowers the barrier to entry for log exploration, helping users less familiar with SQL gain insights quickly, and speeds up query construction for experienced users.

Simply type your query in plain English, and LogChef will do its best to generate the appropriate SQL to fetch the data you need from your connected ClickHouse log sources.

Natural Language to SQL

Transform plain English questions into optimized ClickHouse SQL queries instantly.

Schema-Aware Generation

AI understands your log structure and generates queries using actual field names and types.

Multiple Provider Support

Works with OpenAI, OpenRouter, Azure OpenAI, and other OpenAI-compatible APIs.

Context-Aware Suggestions

AI considers your current query context to provide better suggestions and refinements.

How it Works

When you use the AI SQL generation feature for a specific log source, LogChef provides the underlying table schema and your natural language query to an AI model. The model then constructs a ClickHouse SQL query tailored to your request and the schema of your data.

Example Natural Language Queries

Below are examples of natural language queries you can use with LogChef, demonstrating the versatility of the AI SQL generation feature. These are tailored for common log exploration use cases.


🔍 Severity & Error Investigation

  1. “Show all error logs from the last 6 hours.”
  2. “List logs with severity higher than warning in the last day.”
  3. “Find critical logs from service auth-service today.”
  4. “Show logs with severity_text equal to ‘ERROR’ and body containing ‘timeout’.”
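As a rough illustration, query 4 above might generate SQL along these lines. This is a hypothetical sketch: it assumes an OpenTelemetry-style schema with a `logs` table and `timestamp`, `severity_text`, and `body` columns, while your actual table and field names may differ.

```sql
-- Hypothetical output for: "Show logs with severity_text equal to
-- 'ERROR' and body containing 'timeout'."
SELECT *
FROM logs
WHERE severity_text = 'ERROR'
  -- Case-insensitive substring match on the log body
  AND positionCaseInsensitive(body, 'timeout') > 0
ORDER BY timestamp DESC
LIMIT 100
```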

🕒 Time-based Exploration

  1. “Show the most recent 20 logs from the api-gateway service.”
  2. “List all logs between 2 PM and 3 PM yesterday for the syslog namespace.”
  3. “Count logs per hour for the last 12 hours.”
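The hourly count in query 3 typically maps to a `GROUP BY` over a time bucket. The sketch below assumes a `logs` table with a `timestamp` column; the generated query may use different names or bucket functions.

```sql
-- Hypothetical output for: "Count logs per hour for the last 12 hours."
SELECT toStartOfHour(timestamp) AS hour,
       count() AS log_count
FROM logs
WHERE timestamp >= now() - INTERVAL 12 HOUR
GROUP BY hour
ORDER BY hour
```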

🧩 Trace & Debugging

  1. “Find all logs with trace_id abc123.”
  2. “Show logs grouped by trace_id for the last 30 minutes.”
  3. “Get logs where trace_flags is not zero.”
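A trace lookup like query 1 usually becomes a simple equality filter. The column names here (`trace_id`, `service_name`) are assumptions based on a typical OpenTelemetry log schema.

```sql
-- Hypothetical output for: "Find all logs with trace_id abc123."
SELECT timestamp, service_name, body
FROM logs
WHERE trace_id = 'abc123'
ORDER BY timestamp ASC
```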

📦 Service/Namespace Filtering

  1. “List logs from the payments namespace with severity ERROR.”
  2. “Show logs for checkout-service in the default namespace.”
  3. “Get all log bodies from the syslog namespace for the last 12 hours.”

🏷️ Attribute & Body Search

  1. “Find logs where log_attributes contain key user_id and value 42.”
  2. “Search for logs where body mentions ‘database connection failed’.”
  3. “List logs where log_attributes include env=prod.”
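Attribute filters like query 1 are often expressed as map lookups. This sketch assumes `log_attributes` is a ClickHouse `Map(String, String)` column, which is common in OpenTelemetry log tables but may not match your schema.

```sql
-- Hypothetical output for: "Find logs where log_attributes contain
-- key user_id and value 42."
SELECT *
FROM logs
-- Map subscript returns '' when the key is absent, so this
-- matches only rows where user_id exists and equals '42'
WHERE log_attributes['user_id'] = '42'
ORDER BY timestamp DESC
LIMIT 100
```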

📊 Aggregation & Counting

  1. “Count logs by severity_text for the last 24 hours.”
  2. “Top 5 services by number of ERROR logs today.”
  3. “How many logs were generated by each namespace in the last hour?”
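A top-N aggregation like query 2 typically combines a filter, a `GROUP BY`, and a `LIMIT`. Table and column names below (`logs`, `service_name`, `severity_text`) are illustrative assumptions.

```sql
-- Hypothetical output for: "Top 5 services by number of ERROR logs today."
SELECT service_name,
       count() AS error_count
FROM logs
WHERE severity_text = 'ERROR'
  AND timestamp >= toStartOfDay(now())
GROUP BY service_name
ORDER BY error_count DESC
LIMIT 5
```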

🛠️ Miscellaneous Debug Scenarios

  1. “Show logs where span_id is missing or empty.”
  2. “Find logs that contain both retry and failed in the body.”
  3. “List logs where severity_number is greater than 13 and contains ‘exception’.”
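For a "missing or empty" check like query 1, the generated SQL depends on whether the column is nullable. A sketch, assuming `span_id` is a plain (non-Nullable) `String` column:

```sql
-- Hypothetical output for: "Show logs where span_id is missing or empty."
-- If span_id were Nullable(String), an additional "OR span_id IS NULL"
-- clause would be needed.
SELECT *
FROM logs
WHERE span_id = ''
ORDER BY timestamp DESC
LIMIT 100
```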

Tips for Effective Queries

  • Be Specific: The more context you provide, the better the AI can understand your intent. Mention specific services, namespaces, error messages, or timeframes.
  • Use Natural Language: Don’t try to force SQL-like syntax. Plain, clear English works best.
  • Iterate: If the first query doesn’t give you exactly what you want, try rephrasing it or adding more detail.

Configuration

To enable AI SQL generation, configure the following settings in your config.toml:

[ai]
# Enable AI features
enabled = true
# API key for your chosen provider (required)
api_key = "sk-your_api_key_here"
# Model to use (default: gpt-4o)
model = "gpt-4o"
# Optional: Custom API endpoint for OpenRouter, Azure OpenAI, etc.
# base_url = "https://openrouter.ai/api/v1"
# Model parameters
max_tokens = 1024
temperature = 0.1

Supported Providers

The AI integration supports OpenAI-compatible APIs including:

  • OpenAI: Standard OpenAI API (default)
  • OpenRouter: Access to multiple models via https://openrouter.ai/api/v1
  • Azure OpenAI: Microsoft’s OpenAI service
  • Local Models: Any OpenAI-compatible endpoint

Environment Variables

For production deployments, use environment variables for sensitive configuration:

export LOGCHEF_AI__ENABLED=true
export LOGCHEF_AI__API_KEY="sk-your_api_key_here"
export LOGCHEF_AI__MODEL="gpt-4o"

Using the AI Assistant

In the Query Editor

  1. Navigate to any log source in your team
  2. Look for the AI assistant button in the query editor
  3. Type your natural language question
  4. Click “Generate SQL” to get a ClickHouse query
  5. Review and execute the generated query

Example Workflow

Natural Language: "Show me all error logs from the last 6 hours"
Generated SQL:
SELECT *
FROM logs_table
WHERE severity_text = 'ERROR'
AND timestamp >= now() - INTERVAL 6 HOUR
ORDER BY timestamp DESC
LIMIT 100

Context-Aware Suggestions

If you already have a query in the editor, the AI will consider it as context:

Current Query: SELECT * FROM logs WHERE service = 'api-gateway'
Natural Language: "Add a filter for errors in the last hour"
Enhanced SQL:
SELECT *
FROM logs
WHERE service = 'api-gateway'
AND severity_text = 'ERROR'
AND timestamp >= now() - INTERVAL 1 HOUR
ORDER BY timestamp DESC

MCP Integration

For AI assistant integration outside the web interface, use the LogChef MCP server. This enables natural language log analysis through AI assistants like Claude Desktop:

  • Ask questions about your logs in plain English
  • AI assistants understand your log schema automatically
  • Query across multiple sources and teams
  • Create and manage saved queries through conversation

A Note on Accuracy

While powerful, AI-generated SQL should always be reviewed, especially for critical or complex queries. The feature is a tool to assist and accelerate your log exploration, not a complete replacement for understanding your data and query logic.

Best Practices

  • Review Generated Queries: Always check the SQL before execution
  • Start Simple: Begin with basic queries and add complexity gradually
  • Provide Context: Mention specific field names, time ranges, and conditions
  • Iterate: Refine your natural language prompts based on results