What Import Strategies Does Fivetran Use for PostgreSQL Connections?
Question
What import strategies does Fivetran use for PostgreSQL connections, and in which scenarios are they applied?
Environment
All PostgreSQL connectors
Answer
We employ the following three strategies for importing data from PostgreSQL tables:
- Block-based import
- Sharded import
- Regular import
We apply each strategy based on specific conditions to optimize performance and reliability.
Block-based import
We use the block-based import strategy for all tables hosted on PostgreSQL version 14 or later. This strategy identifies and processes table pages (blocks) in chunks using the following query:
SELECT * FROM <table> WHERE ctid >= '(k,0)'::tid AND ctid < '(k + <chunk_size>,0)'::tid
The default chunk size is 5 GB of data per query. If a query fails, we reduce the chunk size to 1 GB for retries. This strategy leverages table block metadata, available in newer PostgreSQL versions, to transfer data efficiently.
Sharded import
We use the sharded import strategy for tables with optimized indexes, such as those that support efficient filtering and sorting. We determine the best index to use by analyzing the table's metadata. This strategy divides the table into smaller chunks based on the index. We use the following query structure:
SELECT * FROM <table> WHERE ind_key <= k + <page_size> AND ind_key > k
The default chunk size is 1 million rows or 100 chunks, whichever is smaller. We default to the block-based import strategy if a table lacks a suitable index.
Regular import
We use the regular import strategy when we can't use the block-based or sharded import strategies, typically because table statistics or metadata are unavailable. This strategy processes data sequentially by consuming the table in a single cursor. We use the following query:
SELECT * FROM <table>
This approach is effective for tables smaller than 50 GB but unsuitable for larger tables due to potential performance limitations. For tables 50 GB or larger, we rely on other import strategies.