What Issues Might I Encounter If I Don’t Define a Primary Key?
Question
What issues might I encounter if my source table doesn’t have a primary key?
Environment
All database connectors
Answer
If your source table doesn't have a primary key, you may encounter the following issues in your destination:
- Duplicate records appear to represent the same source row
- Rows are regularly deleted and reinserted
- Updates are recorded as new rows instead of modifying the existing ones
Context
When a table lacks a primary key, Fivetran generates a _fivetran_id
value by hashing all column values in each row. This hash acts as a unique identifier. For more information, see our System columns documentation.
Because the _fivetran_id
value is based on the entire row, changes to any column value will result in a new hash. We interpret this as a new row and mark the original as deleted. In the destination, this results in two records:
- The old row marked with
_fivetran_deleted = TRUE
- The new row marked with
_fivetran_deleted = FALSE
This process ensures all data changes are captured, but can lead to apparent duplication and inefficient updates. When you define a primary key, we trigger a full re-sync and use the key for all future syncs. This allows us to track updates and deletes accurately, avoiding duplication.
Clean up duplicates after adding a primary key
When you add a primary key and perform a re-sync, any duplicates that existed before the key remain in your destination. To remove them, run DELETE
statements or drop the table and let us rebuild it during the next sync.