ClickHouse Cloud Private Preview
The quickest and easiest way to get up and running with ClickHouse Cloud is to create a new service in ClickHouse Cloud.
NOTE: This destination is partner-built. For any questions related to ClickHouse Cloud destination and its documentation, contact ClickHouse Cloud Support.
Setup guide
Follow our setup guide to configure your Fivetran destination for ClickHouse Cloud.
Type transformation mapping
The data types in ClickHouse Cloud follow Fivetran's standard data type storage.
We use the following data type conversions:
Fivetran type | ClickHouse Cloud type |
---|---|
BOOLEAN | Bool |
SHORT | Int16 |
INT | Int32 |
LONG | Int64 |
BIGDECIMAL | Decimal(P, S) |
FLOAT | Float32 |
DOUBLE | Float64 |
LOCALDATE | Date |
LOCALDATETIME | DateTime |
INSTANT | DateTime64(9, 'UTC') |
STRING | String |
BINARY | String * |
XML | String * |
JSON | String * |
* NOTE: The ClickHouse Cloud String type can be used to represent an arbitrary set of bytes. The ClickHouse Cloud destination adds a column comment to the
JSON
,BINARY
, andXML
types to indicate the original data type. JSON data type is not used as it is marked as obsolete, and was never recommended for production usage.
Destination tables
The ClickHouse Cloud destination uses Replacing engine type of SharedMergeTree family (specifically, SharedReplacingMergeTree
), versioned by the _fivetran_synced
column.
Every column except primary (ordering) keys and Fivetran metadata columns is created as Nullable(T), where T
is a ClickHouse Cloud type based on the data types mapping.
Single primary key in the source table
For example, source table users
has a primary key column id
(INT
) and a regular column name
(STRING
). The destination table will be defined as follows:
CREATE TABLE `users`
(
`id` Int32,
`name` Nullable(String),
`_fivetran_synced` DateTime64(9, 'UTC'),
`_fivetran_deleted` Bool
) ENGINE = SharedReplacingMergeTree('/clickhouse/tables/{uuid}/{shard}', '{replica}', _fivetran_synced)
ORDER BY id
SETTINGS index_granularity = 8192
In this case, the id
column is chosen as a table sorting key.
Multiple primary keys in the source table
If the source table has multiple primary keys, they are used in order of their appearance in the Fivetran source table definition.
For example, there is a source table items
with primary key columns id
(INT
) and name
(STRING
), plus an additional regular column description
(STRING
). The destination table will be defined as follows:
CREATE TABLE `items`
(
`id` Int32,
`name` String,
`description` Nullable(String),
`_fivetran_synced` DateTime64(9, 'UTC'),
`_fivetran_deleted` Bool
) ENGINE = SharedReplacingMergeTree('/clickhouse/tables/{uuid}/{shard}', '{replica}', _fivetran_synced)
ORDER BY (id, name)
SETTINGS index_granularity = 8192
In this case, id
and name
columns are chosen as table sorting keys.
No primary keys in the source table
If the source table has no primary keys, a unique identifier will be added by Fivetran as a _fivetran_id
column. Consider an events
table that only has the event
(STRING
) and timestamp
(LOCALDATETIME
) columns in the source. The destination table in that case is as follows:
CREATE TABLE events
(
`event` Nullable(String),
`timestamp` Nullable(DateTime),
`_fivetran_id` String,
`_fivetran_synced` DateTime64(9, 'UTC'),
`_fivetran_deleted` Bool
) ENGINE = SharedReplacingMergeTree('/clickhouse/tables/{uuid}/{shard}', '{replica}', _fivetran_synced)
ORDER BY _fivetran_id
SETTINGS index_granularity = 8192
Since _fivetran_id
is unique and there are no other primary key options, it is used as a table sorting key.
Selecting the latest version of the data without duplicates
SharedReplacingMergeTree
performs background data deduplication only during merges at an unknown time. However, selecting the latest version of the data without duplicates ad-hoc is possible with the FINAL
keyword and select_sequential_consistency setting:
SELECT *
FROM example FINAL LIMIT 1000
SETTINGS select_sequential_consistency = 1;
Retries on network failures
The ClickHouse Cloud destination retries transient network errors using the exponential backoff algorithm. This is safe even when the destination inserts the data, as any potential duplicates are handled by the SharedReplacingMergeTree
table engine, either during background merges, or when querying the data with SELECT FINAL
.
Limitations
- Adding, removing or modifying primary key columns is not supported.
- The custom ClickHouse Cloud settings configuration (for example, for the
CREATE TABLE
statements) is not supported. - History mode is not supported for ClickHouse destinations.