Supported Methods
The Fivetran Connector SDK supports two main methods: update() and schema(). The update() method is required for every connector. The schema() method is optional, but most production connectors need to implement it.
update()
Required method that defines how your connector retrieves and delivers data during sync and stores progress in state.
Signature
update(configuration: dict, state: dict)
Parameters
| Name | Type | Description |
|---|---|---|
| configuration | dict | Contains deployment secrets and payloads you specified in configuration.json as required by your connector. See our configuration example to learn how to use configuration values from this dictionary. |
| state | dict | Tracks incremental sync position. It is empty for the first sync or for any full re-sync. In all other cases, it contains whatever state you chose to checkpoint during the prior sync. Some of our more complex examples, e.g., weather, show how state is used to track the position of your connection and achieve efficient incremental syncs. |
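As a rough illustration of how values might be read from the configuration dictionary, the sketch below checks that the keys a connector depends on are present before the sync starts. The helper validate_configuration and the keys api_key and base_url are hypothetical and are not part of the SDK; substitute whatever keys your configuration.json actually defines.

```python
def validate_configuration(configuration: dict, required_keys: list) -> None:
    """Raise ValueError if any required key is missing or empty."""
    missing = [key for key in required_keys if not configuration.get(key)]
    if missing:
        raise ValueError(f"Missing configuration values: {', '.join(missing)}")


# Hypothetical contents of configuration.json, already parsed into a dict.
example_configuration = {
    "api_key": "my-secret",
    "base_url": "https://example.com/api",
}

# Passes silently because both assumed keys are present.
validate_configuration(example_configuration, ["api_key", "base_url"])
```

Calling a check like this at the top of update() makes a misconfigured deployment fail early with a readable message instead of partway through a sync.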
Example
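from fivetran_connector_sdk import Logging as log
from fivetran_connector_sdk import Operations as op

def update(configuration, state):
    log.info("Starting sync.")
    # get_data() is a placeholder for your own function that fetches source records.
    records = get_data(configuration)
    for record in records:
        # Write each record to the "data" table in the destination.
        op.upsert(table="data", data=record)
    # Save progress so that the next sync can resume from this cursor.
    op.checkpoint(state={"table1_cursor": "2024-08-14T02:01:00Z"})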
Notes
- Required for every connector.
- Called automatically at the start of each sync.
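To make the role of state more concrete, here is a minimal sketch of cursor-based incremental selection in plain Python. It deliberately leaves out the SDK calls so that it runs on its own. The helper select_new_records and the updated_at field are assumptions made for illustration; the cursor key table1_cursor is borrowed from the example above.

```python
def select_new_records(records: list, state: dict, cursor_key: str = "table1_cursor"):
    """Return records newer than the saved cursor, plus the state to checkpoint."""
    # An empty state means a first sync or a full re-sync: take everything.
    cursor = state.get(cursor_key)
    if cursor is None:
        new_records = list(records)
    else:
        # ISO 8601 UTC timestamps in the same format compare correctly as strings.
        new_records = [r for r in records if r["updated_at"] > cursor]

    # Advance the cursor only if something newer was seen.
    if new_records:
        cursor = max(r["updated_at"] for r in new_records)
    new_state = {cursor_key: cursor} if cursor is not None else {}
    return new_records, new_state


# Hypothetical source rows.
source_rows = [
    {"id": 1, "updated_at": "2024-08-13T10:00:00Z"},
    {"id": 2, "updated_at": "2024-08-14T02:01:00Z"},
]

# First sync: state is empty, so every row is selected.
first_batch, first_state = select_new_records(source_rows, state={})

# Next sync: the saved cursor filters out rows that were already delivered.
next_batch, next_state = select_new_records(source_rows, state=first_state)
```

Inside update(), each selected record would be passed to op.upsert() and the returned state to op.checkpoint(), as in the example above.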
schema()
Optional method that defines your connector’s output schema. Most connectors need this method. If you don't provide a primary key for a table, we create a surrogate primary key column named _fivetran_id. It is a hashed value generated from the full set of the row's values. See our system columns documentation for more details.
If schema() is not defined, Fivetran infers the schema from the data.
Signature
schema(configuration: dict)
Parameters
| Name | Type | Description |
|---|---|---|
| configuration | dict | Contains deployment secrets and payloads you specified in configuration.json as required by your connector. See our configuration example to learn how to use configuration values from this dictionary. |
Returns
Returns a list of table definition dictionaries as a JSON object.
Each dictionary should use the following keys:
- The `table` key is required and specifies the name of the table.
- The `primary_key` key is optional but recommended. The value is a list of one or more primary key column names. The content of the list is used as the table's primary key: a single entry means a simple primary key, while multiple entries are combined to create a composite primary key for the table. We recommend that you provide primary keys for your tables.
- The `columns` key is optional. Its value is a dictionary of key-value pairs where the key is the column name and the value is the data type.
Example
def schema(configuration: dict):
    return [{
        "table": "company",
        "primary_key": ["id"],
        "columns": {"id": "INT", "name": "STRING"}
    }]
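The key rules described above can be expressed as a small self-check that you might run locally before deploying. This is only a sketch: validate_schema is a hypothetical helper, not an SDK function, and the employee table with its composite key is invented for illustration.

```python
def validate_schema(tables: list) -> None:
    """Check that each table definition follows the key rules described above."""
    for entry in tables:
        # "table" is required and names the table.
        name = entry.get("table")
        if not isinstance(name, str) or not name:
            raise ValueError("Each table definition needs a non-empty 'table' name")

        # "primary_key" is optional; when present it is a list of column names.
        primary_key = entry.get("primary_key", [])
        if not isinstance(primary_key, list) or not all(isinstance(c, str) for c in primary_key):
            raise ValueError(f"'primary_key' for {name} must be a list of column names")

        # "columns" is optional; when present it maps column name to data type.
        columns = entry.get("columns", {})
        if not isinstance(columns, dict):
            raise ValueError(f"'columns' for {name} must be a dictionary")


def schema(configuration: dict):
    return [
        # Simple primary key with explicit column types.
        {"table": "company", "primary_key": ["id"], "columns": {"id": "INT", "name": "STRING"}},
        # Hypothetical table with a composite primary key; column types are left to inference.
        {"table": "employee", "primary_key": ["company_id", "employee_id"]},
    ]


validate_schema(schema({}))
```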
Notes
- We infer the schema for data you send us if you do not define it. However, if you want to set a primary key for a table or configure columns to have specific data types, use this method.
- Our Connector SDK GitHub Repo has many examples of how to use the schema() method.
  - For an example of defining a schema with one table, see our `weather` example.
  - For an example with multiple tables, see our `using_pd_dataframes` example.
  - For an example of a table with multiple keys, see our `records_with_no_created_at_timestamp` example.
  - For an example of how columns in a schema response can be defined with specific data types, see our `specified_types` example.
- If you don't provide the primary key to use in a table, Fivetran creates a surrogate primary key column named `_fivetran_id`, which is a hashed value generated based on the row's set of values. See our system columns documentation for more details.
- If a new row is received with a different set of columns, we calculate the hash from the new row's values, including values from any new columns. This leads to all rows being treated and written as distinct rows, which may be perceived as data integrity issues in the destination. In this case, you may have to drop and re-sync the connection to preserve data integrity. For this reason, we recommend defining primary keys for your tables to avoid unexpected behavior.
- The values you provide for the `table`, `primary_key`, and `columns` keys are renamed based on our renaming rules, so the corresponding names in the destination may differ from how they are set in your code. If you want the table and column names in the destination to exactly match the names you set in your code, we recommend choosing names that already follow the renaming rules, that is, names that match the pattern and character set of transformed names.
- If you need to change primary key selections for a table, drop the table in your destination and then select Resync all historical data on the connection's Setup tab in your dashboard. Doing so maintains data integrity across all records.
Example of undefined table or primary key
Assume Fivetran receives the following row for a table not defined in the schema or defined without a primary key:
| _id | foo | name | _fivetran_id |
|---|---|---|---|
| 1 | abc | John Doe | 96DE69AE1728658394E4EAE664431F1A4E7857E4 |
The hashed value is generated from the values of the three columns.
Suppose we then receive the same row with an additional column, bar:
| _id | foo | name | bar | _fivetran_id |
|---|---|---|---|---|
| 1 | abc | John Doe | | 96DE69AE1728658394E4EAE664431F1A4E7857E4 |
| 1 | abc | John Doe | xyz | 2AC47E18D9FCBC35B6DB94EA4FE4227A3A67A7F8 |
The generated hashed value differs from that of the first row because it is calculated from the values of all the columns, including bar. As a result, two distinct rows are written to the destination.
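The effect can be reproduced with a small stand-in. The exact hashing scheme Fivetran uses is not described here, so the sketch below assumes SHA-1 over a JSON encoding of the row purely for illustration; surrogate_id is a hypothetical helper and will not reproduce the hash values shown in the tables.

```python
import hashlib
import json


def surrogate_id(row: dict) -> str:
    """Illustrative stand-in for a hash computed over all of a row's values."""
    # Assumption: SHA-1 over a canonical JSON encoding; the real scheme may differ.
    encoded = json.dumps(row, sort_keys=True).encode("utf-8")
    return hashlib.sha1(encoded).hexdigest().upper()


original = {"_id": 1, "foo": "abc", "name": "John Doe"}
with_bar = {**original, "bar": "xyz"}

# The same values always produce the same identifier.
same = surrogate_id(original) == surrogate_id(dict(original))

# Adding a column changes the identifier, so the row is treated as a new row.
changed = surrogate_id(original) != surrogate_id(with_bar)
```

With a declared primary key such as _id, both versions of the row would map to the same destination row instead.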
Syncing empty tables and columns
Fivetran creates tables and columns in your destination for every column declared in the schema() method, even if no data is sent for that column.
For more information, see our Features documentation.