ClickHouse Cloud Partner-Built Private Preview

The quickest and easiest way to get up and running with ClickHouse Cloud is to create a new service in ClickHouse Cloud.

This destination is partner-built. For any questions related to ClickHouse Cloud destination and its documentation, contact ClickHouse Cloud Support.

Supported deployment models

We support the SaaS Deployment model for the destination.

Setup guide

Follow our setup guide to configure your Fivetran destination for ClickHouse Cloud.

Type transformation mapping

The data types in ClickHouse Cloud follow Fivetran's standard data type storage.

We use the following data type conversions:

Fivetran type	ClickHouse Cloud type
BOOLEAN	Bool
SHORT	Int16
INT	Int32
LONG	Int64
BIGDECIMAL	Decimal(P, S)
FLOAT	Float32
DOUBLE	Float64
LOCALDATE	Date
LOCALDATETIME	DateTime
INSTANT	DateTime64(9, 'UTC')
STRING	String
BINARY	String *
XML	String *
JSON	String *

* NOTE: The ClickHouse Cloud String type can be used to represent an arbitrary set of bytes. The ClickHouse Cloud destination adds a column comment to the JSON, BINARY, and XML types to indicate the original data type. JSON data type is not used as it is marked as obsolete, and was never recommended for production usage.

Destination tables

The ClickHouse Cloud destination uses Replacing engine type of SharedMergeTree family (specifically, SharedReplacingMergeTree), versioned by the _fivetran_synced column.

Every column except primary (ordering) keys and Fivetran metadata columns is created as Nullable(T), where T is a ClickHouse Cloud type based on the data types mapping.

Single primary key in the source table

For example, source table users has a primary key column id (INT) and a regular column name (STRING). The destination table will be defined as follows:

CREATE TABLE `users`
(
    `id`                Int32,
    `name`              Nullable(String),
    `_fivetran_synced`  DateTime64(9, 'UTC'),
    `_fivetran_deleted` Bool
) ENGINE = SharedReplacingMergeTree('/clickhouse/tables/{uuid}/{shard}', '{replica}', _fivetran_synced)
ORDER BY id
SETTINGS index_granularity = 8192

In this case, the id column is chosen as a table sorting key.

Multiple primary keys in the source table

If the source table has multiple primary keys, they are used in order of their appearance in the Fivetran source table definition.

For example, there is a source table items with primary key columns id (INT) and name (STRING), plus an additional regular column description (STRING). The destination table will be defined as follows:

CREATE TABLE `items`
(
    `id`                Int32,
    `name`              String,
    `description`       Nullable(String),
    `_fivetran_synced`  DateTime64(9, 'UTC'),
    `_fivetran_deleted` Bool
) ENGINE = SharedReplacingMergeTree('/clickhouse/tables/{uuid}/{shard}', '{replica}', _fivetran_synced)
ORDER BY (id, name)
SETTINGS index_granularity = 8192

In this case, id and name columns are chosen as table sorting keys.

No primary keys in the source table

If the source table has no primary keys, a unique identifier will be added by Fivetran as a _fivetran_id column. Consider an events table that only has the event (STRING) and timestamp (LOCALDATETIME) columns in the source. The destination table in that case is as follows:

CREATE TABLE events
(
    `event`             Nullable(String),
    `timestamp`         Nullable(DateTime),
    `_fivetran_id`      String,
    `_fivetran_synced`  DateTime64(9, 'UTC'),
    `_fivetran_deleted` Bool
) ENGINE = SharedReplacingMergeTree('/clickhouse/tables/{uuid}/{shard}', '{replica}', _fivetran_synced)
ORDER BY _fivetran_id
SETTINGS index_granularity = 8192

Since _fivetran_id is unique and there are no other primary key options, it is used as a table sorting key.

Selecting the latest version of the data without duplicates

SharedReplacingMergeTree performs background data deduplication only during merges at an unknown time. However, selecting the latest version of the data without duplicates ad-hoc is possible with the FINAL keyword and select_sequential_consistency setting:

SELECT *
FROM example FINAL LIMIT 1000 
SETTINGS select_sequential_consistency = 1;

Retries on network failures

The ClickHouse Cloud destination retries transient network errors using the exponential backoff algorithm. This is safe even when the destination inserts the data, as any potential duplicates are handled by the SharedReplacingMergeTree table engine, either during background merges, or when querying the data with SELECT FINAL.

Limitations

Adding, removing or modifying primary key columns is not supported.
The custom ClickHouse Cloud settings configuration (for example, for the CREATE TABLE statements) is not supported.
History mode is not supported for ClickHouse destinations.