Introduction
If you're dealing with analytics workloads, log processing, or any scenario that requires fast aggregations over billions of rows, you've probably heard of ClickHouse. Originally developed at Yandex, ClickHouse is a column-oriented database management system that can process analytical queries at incredible speeds.
However, setting up a production-ready ClickHouse cluster isn't trivial. You need to consider:
- High availability: What happens when a node fails?
- Data replication: How do you prevent data loss?
- Coordination: How do nodes communicate and elect leaders?
- Monitoring: How do you know when something goes wrong?
- Security: How do you protect your data?
- Cost: Can you do all this without breaking the bank?
In this guide, I'll walk you through building a complete production-ready ClickHouse cluster on Hetzner Cloud for approximately €37/month. We'll use:
- Terraform for infrastructure provisioning
- Ansible for configuration management
- ClickHouse Keeper for distributed coordination
- Cloudflare for SSL and DDoS protection
- Grafana + Prometheus for monitoring
- GitHub Actions for CI/CD
By the end, you'll have a fully automated, scalable, and secure analytics infrastructure.
Why This Stack?
Why ClickHouse?
ClickHouse excels at:
- Speed: Queries that take minutes in traditional databases complete in seconds
- Compression: Column storage achieves 10-20x compression ratios
- Real-time ingestion: Handle millions of rows per second
- SQL compatibility: Familiar query language with powerful extensions
Real-world example: At scale, a query like "count all events from last month grouped by country" might scan 50 billion rows in under 2 seconds.
Why Hetzner Cloud?
While AWS, GCP, and Azure dominate the cloud market, Hetzner offers compelling advantages:
| Provider | 4 vCPU, 8GB RAM | Egress (1TB) | Total/Month |
|---|---|---|---|
| AWS (EC2) | ~$120 | $90 | ~$210 |
| GCP | ~$100 | $120 | ~$220 |
| Hetzner | €8.99 | Free (20TB) | ~€9 |
For a 3-node cluster, Hetzner saves you ~€500/month compared to hyperscalers. That's significant for startups and side projects.
Why Terraform + Ansible?
This combination follows the separation of concerns principle:
- Terraform: "What infrastructure exists" (declarative)
- Ansible: "How infrastructure is configured" (procedural)
Terraform creates servers, networks, and load balancers. Ansible installs software, manages configs, and handles services. Each tool does what it's best at.
Why ClickHouse Keeper instead of ZooKeeper?
Historically, ClickHouse used Apache ZooKeeper for coordination. ClickHouse Keeper is a drop-in replacement that:
- Runs as a single binary (simpler operations)
- Has the same protocol (compatible with existing configs)
- Uses less memory
- Is maintained by the ClickHouse team
Architecture Overview
Here's what we're building:
Internet
│
▼
┌───────────────────────┐
│ Cloudflare │
│ (SSL Termination) │
│ grafana.domain.com │
│ clickhouse.domain.com │
└───────────┬───────────┘
│ HTTPS
▼
┌───────────────────────────────────────────────────────────────────────────────┐
│ Hetzner Cloud │
│ │
│ ┌─────────────────────────────────────────────────────────────────────────┐ │
│ │ Hetzner Load Balancer (LB11) │ │
│ │ Health checks: /ping on port 8123 │ │
│ └─────────────────────────────────┬───────────────────────────────────────┘ │
│ │ │
│ ┌──────────────────────────┼──────────────────────────┐ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌───────────┐ ┌───────────┐ ┌───────────┐ │
│ │ CH Node 1 │ │ CH Node 2 │ │ CH Node 3 │ │
│ ├───────────┤ ├───────────┤ ├───────────┤ │
│ │ClickHouse │ │ClickHouse │ │ClickHouse │ │
│ │ Server │◄────────────►│ Server │◄────────────►│ Server │ │
│ ├───────────┤ Replication ├───────────┤ Replication ├───────────┤ │
│ │ Keeper │ │ Keeper │ │ Keeper │ │
│ │ (Raft) │◄────────────►│ (Raft) │◄────────────►│ (Raft) │ │
│ └─────┬─────┘ Consensus └─────┬─────┘ Consensus └─────┬─────┘ │
│ │ │ │ │
│ └──────────────────────────┼──────────────────────────┘ │
│ │ │
│ Private Network │
│ 10.0.1.0/24 │
│ │ │
│ ┌─────────┴─────────┐ │
│ │ Monitoring │ │
│ │ ┌─────────────┐ │ │
│ │ │ Grafana │ │ │
│ │ │ Prometheus │ │ │
│ │ │ Nginx │ │ │
│ │ └─────────────┘ │ │
│ └───────────────────┘ │
│ │
└───────────────────────────────────────────────────────────────────────────────┘
Component Breakdown
| Component | Purpose | Why We Need It |
|---|---|---|
| ClickHouse Server | Analytics database | The core of our system |
| ClickHouse Keeper | Distributed coordination | Leader election, distributed locks, replication coordination |
| Hetzner Load Balancer | Traffic distribution | Round-robin queries across healthy nodes |
| Private Network | Secure communication | Cluster traffic never leaves Hetzner's network |
| Prometheus | Metrics collection | Time-series data for monitoring |
| Grafana | Visualization | Dashboards and alerting |
| Nginx | Reverse proxy | SSL termination for Cloudflare |
| Cloudflare | DNS, SSL, DDoS protection | Free SSL and protection |
Data Flow
- Writes: Client → Load Balancer → ClickHouse Node → Replicated to other nodes via Keeper
- Reads: Client → Load Balancer → Any healthy ClickHouse Node
- Metrics: All nodes → Prometheus → Grafana
Why 3 Nodes?
The number 3 isn't arbitrary. For consensus algorithms like Raft (used by Keeper):
- 3 nodes: Tolerates 1 failure (need majority: 2)
- 5 nodes: Tolerates 2 failures (need majority: 3)
With 3 nodes, we get high availability while keeping costs reasonable.
Prerequisites
Before we begin, ensure you have:
Accounts
- Hetzner Cloud - Sign up here
- Cloudflare - Sign up here (free tier works)
- Terraform Cloud - Sign up here (free tier)
- GitHub - For CI/CD
Local Tools
# macOS
brew install terraform ansible
# Ubuntu/Debian (Terraform isn't in the default repos; add HashiCorp's apt repo first)
wget -O- https://apt.releases.hashicorp.com/gpg | sudo gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list
sudo apt update
sudo apt install -y terraform ansible
# Verify installations
terraform version # Should be 1.6+
ansible --version # Should be 2.15+
API Tokens
You'll need to generate:
| Token | Where | Permissions |
|---|---|---|
| Hetzner API Token | Console → Security → API Tokens | Read & Write |
| Cloudflare API Token | Profile → API Tokens | Zone:DNS:Edit, Zone:Settings:Edit |
| Terraform Cloud Token | User Settings → Tokens | N/A |
Domain Name
You need a domain managed by Cloudflare. We'll create:
- clickhouse.yourdomain.com
- grafana.yourdomain.com
Part 1: Infrastructure with Terraform
Let's start by creating the infrastructure. Create a new project directory:
mkdir clickhouse-cluster && cd clickhouse-cluster
mkdir -p terraform ansible secrets
Understanding the Terraform Structure
terraform/
├── main.tf # Core resources (servers, network, LB)
├── variables.tf # Input variables
├── outputs.tf # Output values
├── cloudflare.tf # DNS and SSL settings
├── backend.tf # Terraform Cloud configuration
└── templates/
└── inventory.tpl # Ansible inventory template
Step 1: Provider Configuration
Create terraform/main.tf:
# terraform/main.tf
terraform {
required_version = ">= 1.6.0"
required_providers {
hcloud = {
source = "hetznercloud/hcloud"
version = "~> 1.45"
}
cloudflare = {
source = "cloudflare/cloudflare"
version = "~> 4.0"
}
}
}
provider "hcloud" {
token = var.hcloud_token
}
provider "cloudflare" {
api_token = var.cloudflare_api_token
}
What's happening here?
- We declare our required Terraform version (1.6+)
- We specify two providers: hcloud for Hetzner and cloudflare for DNS
- Providers are authenticated via variables (we'll define those next)
Step 2: Variables
Create terraform/variables.tf:
# terraform/variables.tf
# =============================================================================
# REQUIRED VARIABLES
# =============================================================================
variable "hcloud_token" {
description = "Hetzner Cloud API token"
type = string
sensitive = true
}
variable "cloudflare_api_token" {
description = "Cloudflare API token"
type = string
sensitive = true
}
variable "cloudflare_zone_name" {
description = "Your domain name in Cloudflare"
type = string
}
# =============================================================================
# CLUSTER CONFIGURATION
# =============================================================================
variable "cluster_name" {
description = "Name prefix for all resources"
type = string
default = "clickhouse"
}
variable "location" {
description = "Hetzner datacenter location"
type = string
default = "nbg1" # Nuremberg, Germany
validation {
condition = contains(["fsn1", "nbg1", "hel1", "ash", "hil"], var.location)
error_message = "Location must be: fsn1, nbg1, hel1, ash, or hil"
}
}
variable "clickhouse_node_count" {
description = "Number of ClickHouse nodes"
type = number
default = 3
validation {
condition = var.clickhouse_node_count >= 1
error_message = "Must have at least 1 node"
}
}
variable "clickhouse_server_type" {
description = "Server type for ClickHouse nodes"
type = string
default = "cx33" # 4 vCPU, 8GB RAM
}
variable "monitoring_server_type" {
description = "Server type for monitoring"
type = string
default = "cx23" # 2 vCPU, 4GB RAM
}
variable "ssh_allowed_ips" {
description = "IPs allowed to SSH (CIDR notation)"
type = list(string)
default = ["0.0.0.0/0"] # Restrict in production!
}
Key decisions explained:
- Location: nbg1 (Nuremberg) is often the cheapest and has good availability
- Server types: cx33 gives us 4 vCPU and 8GB RAM, enough for moderate workloads
- SSH access: Defaults to open, but you should restrict it to your IP
Step 3: SSH Key Generation
We need SSH access to our servers. Terraform can generate keys for us:
# Add to terraform/main.tf
# =============================================================================
# SSH KEY
# =============================================================================
resource "tls_private_key" "cluster_key" {
algorithm = "ED25519"
}
resource "hcloud_ssh_key" "cluster" {
name = "${var.cluster_name}-key"
public_key = tls_private_key.cluster_key.public_key_openssh
}
resource "local_file" "private_key" {
content = tls_private_key.cluster_key.private_key_openssh
filename = "${path.module}/../secrets/${var.cluster_name}-key"
file_permission = "0600"
}
Why ED25519?
- Shorter keys (256-bit vs 4096-bit RSA)
- Faster operations
- Modern and secure
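One caveat of generating the key with the tls provider is that the private key ends up in Terraform state. If you'd rather keep it out of state, a minimal alternative is to generate the key yourself; the path below assumes you run this from the project root:
# Generate an ED25519 keypair locally instead of via the tls provider
ssh-keygen -t ed25519 -f secrets/clickhouse-key -C "clickhouse-cluster" -N ""
You'd then point the hcloud_ssh_key resource at file("${path.module}/../secrets/clickhouse-key.pub") and drop the tls_private_key and local_file resources.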
Step 4: Network Infrastructure
Now let's create the private network:
# Add to terraform/main.tf
# =============================================================================
# NETWORK
# =============================================================================
resource "hcloud_network" "cluster" {
name = "${var.cluster_name}-network"
ip_range = "10.0.0.0/16"
}
resource "hcloud_network_subnet" "cluster" {
network_id = hcloud_network.cluster.id
type = "cloud"
network_zone = "eu-central"
ip_range = "10.0.1.0/24"
}
Why a private network?
- Security: Inter-node traffic stays private
- Performance: Lower latency than public internet
- Cost: Private traffic is free
Step 5: Firewall Rules
Security first! Let's define what traffic is allowed:
# Add to terraform/main.tf
# =============================================================================
# FIREWALL
# =============================================================================
resource "hcloud_firewall" "cluster" {
name = "${var.cluster_name}-firewall"
# SSH access (restrict to your IP in production!)
dynamic "rule" {
for_each = var.ssh_allowed_ips
content {
direction = "in"
protocol = "tcp"
port = "22"
source_ips = [rule.value]
}
}
# HTTP for Cloudflare proxy
rule {
direction = "in"
protocol = "tcp"
port = "80"
source_ips = ["0.0.0.0/0", "::/0"]
}
# HTTPS for Cloudflare proxy
rule {
direction = "in"
protocol = "tcp"
port = "443"
source_ips = ["0.0.0.0/0", "::/0"]
}
# Grafana
rule {
direction = "in"
protocol = "tcp"
port = "3000"
source_ips = ["0.0.0.0/0", "::/0"]
}
# ClickHouse HTTP API
rule {
direction = "in"
protocol = "tcp"
port = "8123"
source_ips = ["0.0.0.0/0", "::/0"]
}
# ClickHouse Native Protocol
rule {
direction = "in"
protocol = "tcp"
port = "9000"
source_ips = ["0.0.0.0/0", "::/0"]
}
# Allow all outbound
rule {
direction = "out"
protocol = "tcp"
port = "1-65535"
destination_ips = ["0.0.0.0/0", "::/0"]
}
rule {
direction = "out"
protocol = "udp"
port = "1-65535"
destination_ips = ["0.0.0.0/0", "::/0"]
}
}
Step 6: ClickHouse Servers
Now the main event—creating our ClickHouse nodes:
# Add to terraform/main.tf
# =============================================================================
# CLICKHOUSE NODES
# =============================================================================
resource "hcloud_server" "clickhouse" {
count = var.clickhouse_node_count
name = "${var.cluster_name}-ch-${count.index + 1}"
image = "ubuntu-24.04"
server_type = var.clickhouse_server_type
location = var.location
ssh_keys = [hcloud_ssh_key.cluster.id]
labels = {
cluster = var.cluster_name
role = "clickhouse"
}
# Cloud-init: disable unattended-upgrades (prevents apt lock issues)
user_data = <<-EOF
#cloud-config
package_update: false
package_upgrade: false
runcmd:
- systemctl stop unattended-upgrades || true
- systemctl disable unattended-upgrades || true
- apt-get remove -y unattended-upgrades || true
- echo "vm.max_map_count=262144" >> /etc/sysctl.conf
- sysctl -p
EOF
lifecycle {
ignore_changes = [user_data]
}
}
# Attach to private network
resource "hcloud_server_network" "clickhouse" {
count = var.clickhouse_node_count
server_id = hcloud_server.clickhouse[count.index].id
subnet_id = hcloud_network_subnet.cluster.id
ip = "10.0.1.${count.index + 2}" # .2, .3, .4
}
# Attach firewall
resource "hcloud_firewall_attachment" "clickhouse" {
firewall_id = hcloud_firewall.cluster.id
server_ids = hcloud_server.clickhouse[*].id
}
Important cloud-init settings:
- Disable unattended-upgrades: Ubuntu's automatic updates can lock apt during Ansible runs, causing failures
- vm.max_map_count: ClickHouse needs this increased for memory-mapped files
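Once a node boots, you can spot-check that cloud-init applied these settings; the IP below is a placeholder for one of your nodes' public addresses, and the key path assumes you run this from the project root:
# Verify the sysctl tweak and that unattended-upgrades is gone
ssh -i secrets/clickhouse-key root@<NODE_PUBLIC_IP> \
  "sysctl vm.max_map_count; systemctl is-enabled unattended-upgrades || true"
Expect vm.max_map_count = 262144; the second command should report the unit as disabled or not found, since cloud-init removes the package.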
Step 7: Monitoring Server
# Add to terraform/main.tf
# =============================================================================
# MONITORING NODE
# =============================================================================
resource "hcloud_server" "monitoring" {
name = "${var.cluster_name}-monitoring"
image = "ubuntu-24.04"
server_type = var.monitoring_server_type
location = var.location
ssh_keys = [hcloud_ssh_key.cluster.id]
labels = {
cluster = var.cluster_name
role = "monitoring"
}
user_data = <<-EOF
#cloud-config
package_update: false
package_upgrade: false
runcmd:
- systemctl stop unattended-upgrades || true
- systemctl disable unattended-upgrades || true
- apt-get remove -y unattended-upgrades || true
EOF
lifecycle {
ignore_changes = [user_data]
}
}
resource "hcloud_server_network" "monitoring" {
server_id = hcloud_server.monitoring.id
subnet_id = hcloud_network_subnet.cluster.id
ip = "10.0.1.10"
}
resource "hcloud_firewall_attachment" "monitoring" {
firewall_id = hcloud_firewall.cluster.id
server_ids = [hcloud_server.monitoring.id]
}
Step 8: Load Balancer
The load balancer distributes traffic across healthy nodes:
# Add to terraform/main.tf
# =============================================================================
# LOAD BALANCER
# =============================================================================
resource "hcloud_load_balancer" "cluster" {
name = "${var.cluster_name}-lb"
load_balancer_type = "lb11"
location = var.location
}
resource "hcloud_load_balancer_network" "cluster" {
load_balancer_id = hcloud_load_balancer.cluster.id
subnet_id = hcloud_network_subnet.cluster.id
ip = "10.0.1.100"
}
# HTTP API service
resource "hcloud_load_balancer_service" "http" {
load_balancer_id = hcloud_load_balancer.cluster.id
protocol = "tcp"
listen_port = 8123
destination_port = 8123
health_check {
protocol = "http"
port = 8123
interval = 5
timeout = 3
retries = 3
http {
path = "/ping"
status_codes = ["200"]
}
}
}
# Native protocol service
resource "hcloud_load_balancer_service" "native" {
load_balancer_id = hcloud_load_balancer.cluster.id
protocol = "tcp"
listen_port = 9000
destination_port = 9000
health_check {
protocol = "tcp"
port = 9000
interval = 5
timeout = 3
retries = 3
}
}
# Attach ClickHouse nodes
resource "hcloud_load_balancer_target" "clickhouse" {
count = var.clickhouse_node_count
type = "server"
load_balancer_id = hcloud_load_balancer.cluster.id
server_id = hcloud_server.clickhouse[count.index].id
use_private_ip = true
depends_on = [
hcloud_load_balancer_network.cluster,
hcloud_server_network.clickhouse
]
}
Health check explained:
ClickHouse exposes a /ping endpoint that returns "Ok." when healthy. The load balancer checks it every 5 seconds and removes unhealthy nodes from rotation.
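Once ClickHouse is installed in Part 2, you can run the same probe yourself, either against a node directly or through the load balancer (IPs are placeholders):
# The same check the load balancer performs; a healthy node answers "Ok."
curl -s http://<NODE_PUBLIC_IP>:8123/ping
curl -s http://<LOAD_BALANCER_IP>:8123/ping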
Step 9: Cloudflare DNS
Create terraform/cloudflare.tf:
# terraform/cloudflare.tf
data "cloudflare_zone" "domain" {
name = var.cloudflare_zone_name
}
# Grafana DNS record
resource "cloudflare_record" "grafana" {
zone_id = data.cloudflare_zone.domain.id
name = "grafana"
content = hcloud_server.monitoring.ipv4_address
type = "A"
ttl = 1 # Auto TTL when proxied
proxied = true
}
# ClickHouse DNS record
resource "cloudflare_record" "clickhouse" {
zone_id = data.cloudflare_zone.domain.id
name = "clickhouse"
content = hcloud_server.monitoring.ipv4_address # Goes through Nginx
type = "A"
ttl = 1
proxied = true
}
# SSL settings
resource "cloudflare_zone_settings_override" "ssl_settings" {
zone_id = data.cloudflare_zone.domain.id
settings {
ssl = "full"
always_use_https = "on"
min_tls_version = "1.2"
automatic_https_rewrites = "on"
}
}
Why route through the monitoring node?
Cloudflare's proxy only covers standard HTTP(S) ports such as 80 and 443, not ClickHouse's 8123 or 9000. Our Nginx on the monitoring node:
- Terminates SSL
- Proxies to ClickHouse via private network
- Provides a single entry point
Step 10: Outputs
Create terraform/outputs.tf:
# terraform/outputs.tf
output "load_balancer_ip" {
description = "Public IP of the load balancer"
value = hcloud_load_balancer.cluster.ipv4
}
output "clickhouse_https_url" {
description = "ClickHouse HTTPS URL"
value = "https://clickhouse.${var.cloudflare_zone_name}"
}
output "grafana_https_url" {
description = "Grafana HTTPS URL"
value = "https://grafana.${var.cloudflare_zone_name}"
}
output "monitoring_public_ip" {
description = "Public IP of monitoring node"
value = hcloud_server.monitoring.ipv4_address
}
output "ansible_inventory" {
description = "Generated Ansible inventory"
value = templatefile("${path.module}/templates/inventory.tpl", {
cluster_name = var.cluster_name
clickhouse_nodes = hcloud_server.clickhouse
clickhouse_ips = hcloud_server_network.clickhouse[*].ip
monitoring_node = hcloud_server.monitoring
monitoring_ip = hcloud_server_network.monitoring.ip
})
}
Step 11: Inventory Template
Create terraform/templates/inventory.tpl:
---
# Auto-generated by Terraform
all:
vars:
ansible_user: root
ansible_ssh_common_args: '-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null'
cluster_name: ${cluster_name}
children:
clickhouse:
hosts:
%{ for i, node in clickhouse_nodes ~}
${node.name}:
ansible_host: ${node.ipv4_address}
private_ip: ${clickhouse_ips[i]}
keeper_id: ${i + 1}
shard: ${(i % 2) + 1}
replica: ${floor(i / 2) + 1}
%{ endfor ~}
keeper:
hosts:
%{ for i, node in clickhouse_nodes ~}
${node.name}:
ansible_host: ${node.ipv4_address}
private_ip: ${clickhouse_ips[i]}
keeper_id: ${i + 1}
%{ endfor ~}
monitoring:
hosts:
${monitoring_node.name}:
ansible_host: ${monitoring_node.ipv4_address}
private_ip: ${monitoring_ip}
Step 12: Backend Configuration
Create terraform/backend.tf:
# terraform/backend.tf
terraform {
cloud {
organization = "your-org-name" # Change this
workspaces {
name = "clickhouse-cluster"
}
}
}
Step 13: Deploy Infrastructure
Now let's deploy:
cd terraform
# Create your variables file
cat > terraform.tfvars << 'EOF'
hcloud_token = "your-hetzner-token"
cloudflare_api_token = "your-cloudflare-token"
cloudflare_zone_name = "yourdomain.com"
cluster_name = "clickhouse"
location = "nbg1"
clickhouse_server_type = "cx33"
monitoring_server_type = "cx23"
EOF
# Login to Terraform Cloud
terraform login
# Initialize
terraform init
# Plan (review changes)
terraform plan
# Apply
terraform apply
After ~2 minutes, you'll have:
- 3 ClickHouse servers
- 1 Monitoring server
- 1 Load balancer
- Private network
- Firewall rules
- DNS records
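A quick smoke test before moving on; the output names match outputs.tf above, and the key path assumes you're still in the terraform/ directory:
# Confirm outputs resolved and SSH works with the generated key
terraform output load_balancer_ip
ssh -i ../secrets/clickhouse-key \
  root@"$(terraform output -raw monitoring_public_ip)" hostname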
Generate the Ansible inventory:
mkdir -p ../ansible/inventory
terraform output -raw ansible_inventory > ../ansible/inventory/hosts.yml
Part 2: Configuration with Ansible
Now we configure our servers. Ansible will:
- Install system dependencies
- Configure ClickHouse Keeper (coordination)
- Configure ClickHouse Server
- Set up Prometheus + Grafana
- Configure Nginx for SSL
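Before writing any roles, it's worth confirming Ansible can parse the generated inventory and reach every host over SSH (run from the ansible/ directory; flags are explicit here because ansible.cfg doesn't exist yet):
# Show the host groups Terraform generated, then ping every host
ansible-inventory -i inventory/hosts.yml --graph
ansible -i inventory/hosts.yml all -m ping --private-key ../secrets/clickhouse-key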
Ansible Structure
ansible/
├── ansible.cfg # Configuration
├── site.yml # Main playbook
├── group_vars/
│ └── all.yml # Global variables
├── inventory/
│ └── hosts.yml # From Terraform
└── roles/
├── common/ # Base configuration
├── clickhouse-keeper/
├── clickhouse/
├── grafana/
└── nginx/
Step 1: Ansible Configuration
Create ansible/ansible.cfg:
[defaults]
inventory = inventory/hosts.yml
host_key_checking = False
private_key_file = ../secrets/clickhouse-key
retry_files_enabled = False
stdout_callback = yaml
deprecation_warnings = False
[ssh_connection]
pipelining = True
ssh_args = -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no
Step 2: Global Variables
Create ansible/group_vars/all.yml:
---
# Cluster configuration
cluster_name: production_cluster
# ClickHouse users
clickhouse_users:
- name: default
password: ""
networks:
- "127.0.0.1"
- "10.0.0.0/8"
profile: default
quota: default
- name: admin
password: "{{ lookup('env', 'CLICKHOUSE_ADMIN_PASSWORD') | default('ChangeMe123!', true) }}"
networks:
- "::/0"
profile: default
quota: default
access_management: true
- name: readonly
password: "{{ lookup('env', 'CLICKHOUSE_READONLY_PASSWORD') | default('ReadOnly123!', true) }}"
networks:
- "::/0"
profile: readonly
quota: default
access_management: false
# Keeper configuration
keeper_port: 9181
keeper_raft_port: 9234
# Monitoring
prometheus_retention: "15d"
grafana_admin_password: "{{ lookup('env', 'GRAFANA_ADMIN_PASSWORD') | default('admin', true) }}"
node_exporter_version: "1.7.0"
# System tuning
system_settings:
vm_max_map_count: 262144
net_core_somaxconn: 65535
fs_file_max: 2097152
# Domain
domain_name: yourdomain.com
# Python interpreter
ansible_python_interpreter: /usr/bin/python3
User roles explained:
| User | Purpose | Permissions |
|---|---|---|
| default | Internal cluster communication | No external access |
| admin | Application writes, DDL | Full access |
| readonly | Dashboards, reporting | SELECT only |
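After the cluster is deployed (end of Part 2), a quick way to confirm the roles behave as intended is the HTTP interface. This is a sketch: it assumes the readonly profile in your users.xml.j2 enforces read-only mode, uses the password variables exported later in Step 7, and the load balancer IP is a placeholder:
# readonly can read...
curl -s -u readonly:"$CLICKHOUSE_READONLY_PASSWORD" "http://<LOAD_BALANCER_IP>:8123/" -d "SELECT currentUser()"
# ...but writes should be rejected with a readonly error
curl -s -u readonly:"$CLICKHOUSE_READONLY_PASSWORD" "http://<LOAD_BALANCER_IP>:8123/" -d "CREATE TABLE t (x UInt8) ENGINE = Memory"
# admin has full access
curl -s -u admin:"$CLICKHOUSE_ADMIN_PASSWORD" "http://<LOAD_BALANCER_IP>:8123/" -d "SELECT version()"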
Step 3: Main Playbook
Create ansible/site.yml:
---
- name: Apply common configuration
hosts: all
become: true
roles:
- common
- name: Configure ClickHouse Keeper
hosts: keeper
become: true
roles:
- clickhouse-keeper
- name: Configure ClickHouse servers
hosts: clickhouse
become: true
roles:
- clickhouse
- name: Configure monitoring
hosts: monitoring
become: true
roles:
- grafana
- nginx
- name: Verify cluster
hosts: clickhouse[0]
become: true
tasks:
- name: Wait for ClickHouse
wait_for:
port: 8123
delay: 5
timeout: 60
- name: Check cluster status
shell: clickhouse-client --query "SELECT cluster, count() FROM system.clusters WHERE cluster='{{ cluster_name }}' GROUP BY cluster"
register: cluster_status
- name: Display status
debug:
msg: "{{ cluster_status.stdout }}"Step 4: Common Role
Create ansible/roles/common/tasks/main.yml:
---
- name: Wait for apt locks
shell: |
TIMEOUT=60
COUNTER=0
while fuser /var/lib/dpkg/lock-frontend >/dev/null 2>&1; do
if [ $COUNTER -ge $TIMEOUT ]; then
pkill -9 unattended-upgr || true
pkill -9 apt || true
rm -f /var/lib/dpkg/lock-frontend
break
fi
sleep 2
COUNTER=$((COUNTER + 2))
done
dpkg --configure -a || true
changed_when: false
- name: Disable unattended-upgrades
systemd:
name: unattended-upgrades
state: stopped
enabled: false
ignore_errors: true
- name: Install common packages
apt:
name:
- apt-transport-https
- ca-certificates
- curl
- gnupg
- htop
- net-tools
- vim
- wget
- jq
state: present
update_cache: true
- name: Configure sysctl
sysctl:
name: "{{ item.name }}"
value: "{{ item.value }}"
state: present
reload: true
loop:
- { name: "vm.max_map_count", value: "{{ system_settings.vm_max_map_count }}" }
- { name: "net.core.somaxconn", value: "{{ system_settings.net_core_somaxconn }}" }
- { name: "fs.file-max", value: "{{ system_settings.fs_file_max }}" }
- name: Configure /etc/hosts
lineinfile:
path: /etc/hosts
line: "{{ hostvars[item]['private_ip'] }} {{ item }}"
state: present
loop: "{{ groups['all'] }}"
- name: Install Node Exporter
include_tasks: node_exporter.yml
Step 5: ClickHouse Keeper Role
Create ansible/roles/clickhouse-keeper/tasks/main.yml:
---
- name: Add ClickHouse repository key
apt_key:
keyserver: keyserver.ubuntu.com
id: 8919F6BD2B48D754
- name: Add ClickHouse repository
apt_repository:
repo: "deb https://packages.clickhouse.com/deb lts main"
state: present
- name: Install ClickHouse Keeper
apt:
name: clickhouse-keeper
state: present
update_cache: true
- name: Create directories
file:
path: "{{ item }}"
state: directory
owner: clickhouse
group: clickhouse
mode: '0750'
loop:
- /var/lib/clickhouse-keeper
- /var/log/clickhouse-keeper
- name: Configure Keeper
template:
src: keeper_config.xml.j2
dest: /etc/clickhouse-keeper/keeper_config.xml
owner: clickhouse
group: clickhouse
notify: Restart clickhouse-keeper
- name: Start Keeper
systemd:
name: clickhouse-keeper
state: started
enabled: true
Create ansible/roles/clickhouse-keeper/templates/keeper_config.xml.j2:
<?xml version="1.0"?>
<clickhouse>
<logger>
<level>information</level>
<log>/var/log/clickhouse-keeper/clickhouse-keeper.log</log>
<errorlog>/var/log/clickhouse-keeper/clickhouse-keeper.err.log</errorlog>
<size>100M</size>
<count>3</count>
</logger>
<listen_host>0.0.0.0</listen_host>
<keeper_server>
<tcp_port>{{ keeper_port }}</tcp_port>
<server_id>{{ keeper_id }}</server_id>
<log_storage_path>/var/lib/clickhouse-keeper/log</log_storage_path>
<snapshot_storage_path>/var/lib/clickhouse-keeper/snapshots</snapshot_storage_path>
<coordination_settings>
<operation_timeout_ms>10000</operation_timeout_ms>
<session_timeout_ms>30000</session_timeout_ms>
<raft_logs_level>warning</raft_logs_level>
</coordination_settings>
<raft_configuration>
{% for host in groups['keeper'] %}
<server>
<id>{{ hostvars[host]['keeper_id'] }}</id>
<hostname>{{ hostvars[host]['private_ip'] }}</hostname>
<port>{{ keeper_raft_port }}</port>
</server>
{% endfor %}
</raft_configuration>
</keeper_server>
<prometheus>
<endpoint>/metrics</endpoint>
<port>9363</port>
<metrics>true</metrics>
</prometheus>
</clickhouse>
Key configuration points:
- server_id: Unique ID for each Keeper node (1, 2, 3)
- raft_configuration: Lists all Keeper nodes for consensus
- prometheus: Exposes metrics on port 9363
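After Ansible runs (Part 2, Step 7), each Keeper instance can be probed with ZooKeeper-style four-letter commands on port 9181. A minimal check, assuming nc is available on the node (install netcat-openbsd if it isn't):
# Liveness: a healthy Keeper answers "imok"
echo ruok | nc localhost 9181
# Role and quorum state: look for zk_server_state (leader or follower)
echo mntr | nc localhost 9181 | grep zk_server_state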
Step 6: ClickHouse Server Role
Create ansible/roles/clickhouse/tasks/main.yml:
---
- name: Add ClickHouse repository
apt_repository:
repo: "deb https://packages.clickhouse.com/deb lts main"
state: present
- name: Install ClickHouse
apt:
name:
- clickhouse-server
- clickhouse-client
state: present
update_cache: true
- name: Configure cluster
template:
src: config.xml.j2
dest: /etc/clickhouse-server/config.d/cluster.xml
owner: clickhouse
group: clickhouse
notify: Restart clickhouse-server
- name: Configure users
template:
src: users.xml.j2
dest: /etc/clickhouse-server/users.d/custom_users.xml
owner: clickhouse
group: clickhouse
notify: Restart clickhouse-server
- name: Configure macros
template:
src: macros.xml.j2
dest: /etc/clickhouse-server/config.d/macros.xml
owner: clickhouse
group: clickhouse
notify: Restart clickhouse-server
- name: Configure Keeper connection
template:
src: zookeeper.xml.j2
dest: /etc/clickhouse-server/config.d/zookeeper.xml
owner: clickhouse
group: clickhouse
notify: Restart clickhouse-server
- name: Enable ClickHouse
systemd:
name: clickhouse-server
enabled: true
- name: Start ClickHouse (async)
shell: systemctl start clickhouse-server &
async: 30
poll: 0
- name: Wait for ClickHouse
wait_for:
port: 8123
delay: 5
timeout: 120
Create ansible/roles/clickhouse/templates/config.xml.j2:
<?xml version="1.0"?>
<clickhouse>
<listen_host>0.0.0.0</listen_host>
<http_port>8123</http_port>
<tcp_port>9000</tcp_port>
<interserver_http_port>9009</interserver_http_port>
<remote_servers>
<{{ cluster_name }}>
<shard>
<internal_replication>true</internal_replication>
{% for host in groups['clickhouse'] if hostvars[host]['shard'] == 1 %}
<replica>
<host>{{ hostvars[host]['private_ip'] }}</host>
<port>9000</port>
</replica>
{% endfor %}
</shard>
<shard>
<internal_replication>true</internal_replication>
{% for host in groups['clickhouse'] if hostvars[host]['shard'] == 2 %}
<replica>
<host>{{ hostvars[host]['private_ip'] }}</host>
<port>9000</port>
</replica>
{% endfor %}
</shard>
</{{ cluster_name }}>
</remote_servers>
<prometheus>
<endpoint>/metrics</endpoint>
<port>9364</port>
<metrics>true</metrics>
</prometheus>
<distributed_ddl>
<path>/clickhouse/task_queue/ddl</path>
</distributed_ddl>
</clickhouse>
Cluster topology:
- 2 shards: Data is distributed across shards
- Replicas: Each shard can have multiple replicas for redundancy
- With 3 nodes: Shard 1 has 2 replicas, Shard 2 has 1 replica
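After Part 2 completes, you can confirm ClickHouse sees exactly this layout by querying system.clusters on any node:
# Shard and replica layout as ClickHouse sees it
clickhouse-client --query "
  SELECT cluster, shard_num, replica_num, host_address
  FROM system.clusters
  WHERE cluster = 'production_cluster'"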
Create ansible/roles/clickhouse/templates/macros.xml.j2:
<?xml version="1.0"?>
<clickhouse>
<macros>
<cluster>{{ cluster_name }}</cluster>
<shard>{{ shard }}</shard>
<replica>{{ inventory_hostname }}</replica>
</macros>
</clickhouse>
Macros explained:
Macros are variables used in ReplicatedMergeTree table definitions:
CREATE TABLE events ON CLUSTER 'production_cluster'
(
event_id UUID,
timestamp DateTime
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/events', '{replica}')
ORDER BY timestamp;
The {shard} and {replica} placeholders are automatically replaced with the values from the macros file.
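You can inspect the values each node will substitute by querying system.macros after deployment:
# Per-node macro substitutions used in ReplicatedMergeTree paths
clickhouse-client --query "SELECT macro, substitution FROM system.macros"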
Step 7: Run Ansible
Set your passwords and run:
cd ansible
# Create .env file
cat > .env << 'EOF'
export CLICKHOUSE_ADMIN_PASSWORD="YourSecureAdminPass123"
export CLICKHOUSE_READONLY_PASSWORD="YourSecureReadOnlyPass123"
export GRAFANA_ADMIN_PASSWORD="YourGrafanaPass123"
EOF
# Source it
source .env
# Run playbook
ansible-playbook site.yml
This takes about 5-10 minutes. You'll see each task execute across all hosts.
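A quick ad-hoc check that all three nodes came up (run from the ansible/ directory):
# Every ClickHouse node should answer with its version
ansible -i inventory/hosts.yml clickhouse -m shell \
  -a "clickhouse-client --query 'SELECT version()'"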
Part 3: SSL with Cloudflare
Cloudflare provides free SSL, but it only proxies ports 80/443. We need Nginx to:
- Listen on port 80/443
- Terminate SSL (with self-signed cert—Cloudflare handles public SSL)
- Proxy requests to internal services
Nginx Role
Create ansible/roles/nginx/tasks/main.yml:
---
- name: Install Nginx
apt:
name: nginx
state: present
update_cache: true
- name: Create SSL directory
file:
path: /etc/nginx/ssl
state: directory
mode: '0700'
- name: Generate self-signed certificate
command: >
openssl req -x509 -nodes -days 365 -newkey rsa:2048
-keyout /etc/nginx/ssl/private.key
-out /etc/nginx/ssl/certificate.crt
-subj "/C=US/ST=State/L=City/O=Org/CN={{ domain_name }}"
args:
creates: /etc/nginx/ssl/certificate.crt
- name: Configure Grafana proxy
template:
src: grafana.conf.j2
dest: /etc/nginx/sites-available/grafana.conf
notify: Reload nginx
- name: Configure ClickHouse proxy
template:
src: clickhouse.conf.j2
dest: /etc/nginx/sites-available/clickhouse.conf
notify: Reload nginx
- name: Enable sites
file:
src: "/etc/nginx/sites-available/{{ item }}"
dest: "/etc/nginx/sites-enabled/{{ item }}"
state: link
loop:
- grafana.conf
- clickhouse.conf
notify: Reload nginx
- name: Remove default site
file:
path: /etc/nginx/sites-enabled/default
state: absent
- name: Start Nginx
systemd:
name: nginx
state: started
enabled: true
Create ansible/roles/nginx/templates/clickhouse.conf.j2:
server {
listen 80;
server_name clickhouse.{{ domain_name }};
return 301 https://$host$request_uri;
}
server {
listen 443 ssl;
server_name clickhouse.{{ domain_name }};
ssl_certificate /etc/nginx/ssl/certificate.crt;
ssl_certificate_key /etc/nginx/ssl/private.key;
ssl_protocols TLSv1.2 TLSv1.3;
location / {
proxy_pass http://{{ hostvars[groups['clickhouse'][0]]['private_ip'] }}:8123;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-Proto https;
proxy_buffering off;
proxy_read_timeout 300s;
}
}
Why self-signed certificates?
Cloudflare's "Full" SSL mode:
User → Cloudflare (valid public SSL) → Your Server (self-signed OK)
Cloudflare encrypts traffic to your origin, but doesn't verify the certificate. This is secure enough for most use cases and avoids the complexity of Let's Encrypt.
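You can test the origin path end-to-end before relying on Cloudflare by pinning the hostname to the monitoring node's IP; -k skips verification of the self-signed certificate, which mirrors what Cloudflare's Full (non-strict) mode does. The IP is a placeholder:
# Talk to Nginx directly, bypassing Cloudflare
curl -sk --resolve clickhouse.yourdomain.com:443:<MONITORING_PUBLIC_IP> \
  https://clickhouse.yourdomain.com/ping
A healthy setup returns "Ok." from ClickHouse via the Nginx proxy.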
Part 4: CI/CD with GitHub Actions
Automate deployments with GitHub Actions:
Create .github/workflows/deploy.yml:
name: Deploy ClickHouse Cluster
on:
push:
branches: [main]
paths:
- 'terraform/**'
- 'ansible/**'
- '.github/workflows/**'
workflow_dispatch:
inputs:
action:
description: 'Action'
required: true
default: 'plan'
type: choice
options: [plan, apply, destroy]
env:
TF_VERSION: '1.6.0'
HCLOUD_TOKEN: ${{ secrets.HCLOUD_TOKEN }}
CLOUDFLARE_API_TOKEN: ${{ secrets.CLOUDFLARE_API_TOKEN }}
jobs:
terraform:
name: Terraform
runs-on: ubuntu-latest
environment: production
steps:
- uses: actions/checkout@v4
- name: Setup Terraform
uses: hashicorp/setup-terraform@v3
with:
terraform_version: ${{ env.TF_VERSION }}
cli_config_credentials_token: ${{ secrets.TF_API_TOKEN }}
terraform_wrapper: false
- name: Terraform Init
working-directory: terraform
run: terraform init
- name: Terraform Plan
working-directory: terraform
run: |
terraform plan \
-var="hcloud_token=$HCLOUD_TOKEN" \
-var="cloudflare_api_token=$CLOUDFLARE_API_TOKEN"
- name: Terraform Apply
if: github.event_name == 'push' || github.event.inputs.action == 'apply'
working-directory: terraform
run: |
terraform apply -auto-approve \
-var="hcloud_token=$HCLOUD_TOKEN" \
-var="cloudflare_api_token=$CLOUDFLARE_API_TOKEN"
- name: Generate Inventory
if: github.event_name == 'push' || github.event.inputs.action == 'apply'
working-directory: terraform
run: |
mkdir -p ../ansible/inventory
terraform output -raw ansible_inventory > ../ansible/inventory/hosts.yml
- name: Upload Inventory
if: github.event_name == 'push' || github.event.inputs.action == 'apply'
uses: actions/upload-artifact@v4
with:
name: inventory
path: ansible/inventory/hosts.yml
ansible:
name: Ansible
runs-on: ubuntu-latest
needs: terraform
if: github.event_name == 'push' || github.event.inputs.action == 'apply'
environment: production
steps:
- uses: actions/checkout@v4
- name: Setup Python
uses: actions/setup-python@v5
with:
python-version: '3.11'
- name: Install Ansible
run: pip install ansible
- name: Download Inventory
uses: actions/download-artifact@v4
with:
name: inventory
path: ansible/inventory
- name: Setup SSH
run: |
mkdir -p ~/.ssh
echo "${{ secrets.SSH_PRIVATE_KEY }}" > ~/.ssh/cluster_key
chmod 600 ~/.ssh/cluster_key
- name: Wait for servers
run: sleep 120
- name: Run Ansible
working-directory: ansible
env:
ANSIBLE_HOST_KEY_CHECKING: 'false'
CLICKHOUSE_ADMIN_PASSWORD: ${{ secrets.CLICKHOUSE_ADMIN_PASSWORD }}
CLICKHOUSE_READONLY_PASSWORD: ${{ secrets.CLICKHOUSE_READONLY_PASSWORD }}
GRAFANA_ADMIN_PASSWORD: ${{ secrets.GRAFANA_ADMIN_PASSWORD }}
run: |
ansible-playbook \
-i inventory/hosts.yml \
--private-key ~/.ssh/cluster_key \
site.yml
Required GitHub Secrets
| Secret | Description |
|---|---|
| HCLOUD_TOKEN | Hetzner API token |
| CLOUDFLARE_API_TOKEN | Cloudflare API token |
| TF_API_TOKEN | Terraform Cloud token |
| SSH_PRIVATE_KEY | Contents of secrets/clickhouse-key |
| CLICKHOUSE_ADMIN_PASSWORD | Admin password |
| CLICKHOUSE_READONLY_PASSWORD | Readonly password |
| GRAFANA_ADMIN_PASSWORD | Grafana password |
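If you use the GitHub CLI, the secrets can be loaded from the shell instead of the web UI; a sketch with placeholder values (add --env production if you store them as environment secrets, since the workflow above targets a production environment):
# Set repository secrets for the deploy workflow
gh secret set HCLOUD_TOKEN --body "your-hetzner-token"
gh secret set CLOUDFLARE_API_TOKEN --body "your-cloudflare-token"
gh secret set SSH_PRIVATE_KEY < secrets/clickhouse-key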
Part 5: Monitoring with Grafana
The monitoring stack includes:
- Prometheus: Scrapes metrics from all nodes
- Grafana: Visualizes metrics
- Node Exporter: System metrics (CPU, RAM, disk)
Prometheus Configuration
Create ansible/roles/grafana/templates/prometheus.yml.j2:
global:
scrape_interval: 15s
scrape_configs:
- job_name: 'prometheus'
static_configs:
- targets: ['localhost:9090']
- job_name: 'node'
static_configs:
- targets:
{% for host in groups['all'] %}
- '{{ hostvars[host]["private_ip"] }}:9100'
{% endfor %}
- job_name: 'clickhouse'
static_configs:
- targets:
{% for host in groups['clickhouse'] %}
- '{{ hostvars[host]["private_ip"] }}:9364'
{% endfor %}
- job_name: 'clickhouse-keeper'
static_configs:
- targets:
{% for host in groups['keeper'] %}
- '{{ hostvars[host]["private_ip"] }}:9363'
{% endfor %}
Key Metrics to Monitor
| Metric | Description | Alert Threshold |
|---|---|---|
| ClickHouseMetrics_Query | Active queries | > 100 |
| ClickHouseMetrics_MemoryTracking | Memory usage | > 80% |
| ClickHouseProfileEvents_FailedQuery | Failed queries | > 0 |
| node_cpu_seconds_total | CPU usage | > 80% |
| node_memory_MemAvailable_bytes | Available RAM | < 1GB |
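Once Prometheus is scraping, you can sanity-check any of these series from the monitoring node via the Prometheus HTTP API (metric name shown as an example; exact names depend on your ClickHouse version):
# Current number of running queries, per scraped node
curl -s 'http://localhost:9090/api/v1/query?query=ClickHouseMetrics_Query' | jq '.data.result'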
Security Hardening
1. Restrict SSH Access
# terraform.tfvars
ssh_allowed_ips = ["YOUR.IP.ADDRESS/32"]
2. Use Strong Passwords
# Generate secure passwords
openssl rand -base64 24
3. Network Segmentation
All inter-node traffic uses the private network (10.0.1.0/24). Only necessary ports are exposed publicly.
4. Firewall Rules
Only these ports are accessible from the internet:
- 22 (SSH) - restricted to your IP
- 80/443 (HTTP/HTTPS) - for Cloudflare proxy
- 8123/9000 (ClickHouse) - consider restricting to known IPs
5. User Permissions
The readonly user cannot modify data:
-- This will fail for readonly user
INSERT INTO events VALUES (...);
-- Error: readonly: Cannot execute query in readonly mode
Connecting to Your Cluster
HTTP API (via Cloudflare)
# Simple query
curl "https://clickhouse.yourdomain.com/" \
-u admin:YourAdminPassword \
-d "SELECT version()"
# JSON format
curl "https://clickhouse.yourdomain.com/?default_format=JSON" \
-u admin:YourAdminPassword \
-d "SELECT * FROM system.clusters"Native Protocol (direct)
clickhouse-client \
--host <LOAD_BALANCER_IP> \
--user admin \
--password 'YourAdminPassword'
Creating Tables
-- Create a replicated table
CREATE TABLE events ON CLUSTER 'production_cluster'
(
event_id UUID DEFAULT generateUUIDv4(),
timestamp DateTime DEFAULT now(),
user_id UInt64,
event_type String,
properties String
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/events', '{replica}')
PARTITION BY toYYYYMM(timestamp)
ORDER BY (timestamp, user_id);
-- Create distributed table for querying across shards
CREATE TABLE events_distributed ON CLUSTER 'production_cluster'
AS events
ENGINE = Distributed('production_cluster', 'default', 'events', rand());
-- Insert data (goes to one shard)
INSERT INTO events (user_id, event_type, properties)
VALUES (1, 'page_view', '{"page": "/home"}');
-- Query all shards
SELECT count() FROM events_distributed;
Troubleshooting Guide
Issue: Ansible hangs on apt install
Cause: Ubuntu's unattended-upgrades is locking apt.
Solution:
ansible -i inventory/hosts.yml all -m shell -a "pkill -9 unattended-upgr; systemctl disable unattended-upgrades"
Issue: ClickHouse won't start
Cause: Usually port conflict or config error.
Solution:
# Check logs
ssh root@<node-ip> "cat /var/log/clickhouse-server/clickhouse-server.err.log | tail -50"
# Common fix: port 9363 conflict between Keeper and Server
# Server should use 9364, Keeper uses 9363
Issue: Cloudflare 521 error
Cause: Origin server not responding on port 80/443.
Solution:
# Check Nginx
ansible -i inventory/hosts.yml monitoring -m shell -a "systemctl status nginx"
# Check port 80
ansible -i inventory/hosts.yml monitoring -m shell -a "ss -tlnp | grep :80"
Issue: Authentication failed
Cause: Passwords don't match or env vars weren't set.
Solution:
source .env
ansible-playbook site.yml --limit clickhouse
ansible -i inventory/hosts.yml clickhouse -m shell -a "systemctl restart clickhouse-server &"
Cost Analysis
Monthly Breakdown
| Resource | Type | Cost |
|---|---|---|
| 3x ClickHouse Nodes | CX33 (4 vCPU, 8GB) | €26.97 |
| 1x Monitoring | CX23 (2 vCPU, 4GB) | €4.49 |
| Load Balancer | LB11 | €5.39 |
| Network | Private | Free |
| Traffic | 20TB included | Free |
| Total | | €36.85 |
Scaling Costs
| Configuration | Nodes | Monthly Cost |
|---|---|---|
| Small (this guide) | 3x CX33 | €37 |
| Medium | 3x CX43 | €59 |
| Large | 5x CX43 | €95 |
| Enterprise | 5x CCX33 | €270 |
Conclusion
You now have a production-ready ClickHouse cluster with:
✅ High Availability: 3-node Keeper quorum
✅ Data Replication: Automatic via ReplicatedMergeTree
✅ Load Balancing: Health-checked traffic distribution
✅ SSL/TLS: Free via Cloudflare
✅ Monitoring: Prometheus + Grafana dashboards
✅ CI/CD: Automated deployments via GitHub Actions
✅ Security: Firewall, private network, user roles
✅ Cost: ~€37/month
Next Steps
- Add alerting: Configure Grafana alerts for critical metrics
- Backup strategy: Set up S3-compatible backup with Hetzner's Object Storage
- Scale horizontally: Add more nodes as data grows
- Tune performance: Adjust memory settings based on workload