Supported Methods

The Fivetran Connector SDK supports two main methods: update() and schema(). The update() method is required for every connector. The schema() method is optional, but most production connectors need to implement it.

`update()`

Required method that defines how your connector retrieves and delivers data during a sync and stores its progress in state.

Signature

update(configuration: dict, state: dict)

Parameters

Name	Type	Description
`configuration`	dict	Contains deployment secrets and payloads you specified in `configuration.json` as required by your connector. See our configuration example to learn how to use configuration values from this dictionary.
`state`	dict	Tracks incremental sync position. It is empty for the first sync or for any full re-sync. In all other cases, it contains whatever state you have chosen to checkpoint during the prior sync. In some of our more complex examples, e.g., `weather`, you can see how this is used to track the state of your connection and achieve incremental syncs efficiently.

Example

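from fivetran_connector_sdk import Logging as log
from fivetran_connector_sdk import Operations as op

def update(configuration, state):
    log.info("Starting sync.")
    # get_data() is a placeholder for your own data-retrieval function.
    records = get_data(configuration)
    for record in records:
        op.upsert(table="data", data=record)
    # Save the sync position so the next sync can resume from it.
    op.checkpoint(state={"table1_cursor": "2024-08-14T02:01:00Z"})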

Notes

Required for every connector.
Called automatically at the start of each sync.
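
The way state carries the sync position from one sync to the next can be sketched as follows. This is a minimal illustration, not SDK code: FakeOperations, fetch_rows(), and the updated_at cursor field are hypothetical stand-ins so that the snippet runs without the SDK installed. In a real connector, you would use op.upsert() and op.checkpoint() from fivetran_connector_sdk instead.

```python
class FakeOperations:
    """Hypothetical stand-in for the SDK's Operations class; records calls in memory."""

    def __init__(self):
        self.upserts = []
        self.checkpoints = []

    def upsert(self, table, data):
        self.upserts.append((table, data))

    def checkpoint(self, state):
        self.checkpoints.append(dict(state))


def fetch_rows():
    # Hypothetical source data; a real connector would call an API or a database.
    return [
        {"id": 1, "updated_at": "2024-08-13T00:00:00Z"},
        {"id": 2, "updated_at": "2024-08-14T02:01:00Z"},
        {"id": 3, "updated_at": "2024-08-15T09:30:00Z"},
    ]


def update(configuration, state):
    # An empty state means a first sync or a full re-sync: start from the beginning.
    cursor = state.get("table1_cursor", "")
    for row in sorted(fetch_rows(), key=lambda r: r["updated_at"]):
        # UTC timestamps in this fixed ISO 8601 format compare correctly as strings.
        if row["updated_at"] > cursor:
            op.upsert(table="data", data=row)
            cursor = row["updated_at"]
    # Save the new position so the next sync can resume from it.
    op.checkpoint(state={"table1_cursor": cursor})


op = FakeOperations()
update({}, {})                     # first sync: state is empty
saved_state = op.checkpoints[-1]
first_sync_rows = len(op.upserts)

op = FakeOperations()
update({}, saved_state)            # next sync: resumes from the checkpoint
second_sync_rows = len(op.upserts)
```

In this sketch, the first call delivers every row because the state is empty, and the second call delivers nothing because no row is newer than the checkpointed cursor.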

`schema()`

Optional method that defines your connector's output schema. Most connectors need this method. If you don't provide a primary key for a table, we create a surrogate primary key column named _fivetran_id. It is a hashed value generated from the full set of the row's values. See our system columns documentation for more details.

If you do not define this method, Fivetran infers the schema from the data you send.

Signature

schema(configuration: dict)

Parameters

Name	Type	Description
`configuration`	dict	Contains deployment secrets and payloads you specified in `configuration.json` as required by your connector. See our configuration example to learn how to use configuration values from this dictionary.

Returns

Returns a list of table definition dictionaries as a JSON object.

Each dictionary should use the following keys:

The table key is required and specifies the name of the table.
The primary_key key is optional but recommended. Its value is a list of one or more primary key column names. A single entry defines a simple primary key, while multiple entries are combined into a composite primary key for the table.
The columns key is optional. Its value is a dictionary of key-value pairs where the key is the column name and the value is the data type.

Example

def schema(configuration: dict):
    return [{
        "table": "company",
        "primary_key": ["id"],
        "columns": {"id": "INT", "name": "STRING"}
    }]
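
As a further illustration of the keys described above, here is a sketch of a schema() that returns several tables, including one with a composite primary key and one that declares only its name. The employee and event_log tables and the check_schema() helper are hypothetical and are not part of the SDK. The helper checks only the rules stated in this section.

```python
def schema(configuration: dict):
    return [
        {
            "table": "company",
            "primary_key": ["id"],
            "columns": {"id": "INT", "name": "STRING"},
        },
        {
            # Composite primary key: both columns together identify a row.
            "table": "employee",
            "primary_key": ["company_id", "employee_id"],
            "columns": {"company_id": "INT", "employee_id": "INT"},
        },
        {
            # Only the required key is given; the primary key and columns are omitted.
            "table": "event_log",
        },
    ]


def check_schema(tables):
    """Hypothetical helper: list problems in a schema() result, per the key rules above."""
    problems = []
    for index, entry in enumerate(tables):
        name = entry.get("table")
        if not isinstance(name, str) or not name:
            problems.append(f"entry {index}: 'table' is required")
        if not isinstance(entry.get("primary_key", []), list):
            problems.append(f"entry {index}: 'primary_key' must be a list")
        if not isinstance(entry.get("columns", {}), dict):
            problems.append(f"entry {index}: 'columns' must be a dictionary")
    return problems


problems = check_schema(schema({}))
```

Because event_log declares no primary key, it would receive the surrogate _fivetran_id column described in this section.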

Notes

We infer the schema for data you send us if you do not define it. However, if you want to set a primary key for a table or configure columns to have specific data types, then use this method.
Our Connector SDK GitHub Repo has many examples of how to use the schema() method.
- For an example of defining a schema with one table, see our weather example.
- For an example with multiple tables, see our using_pd_dataframes example.
- For an example of a table with multiple keys, see our records_with_no_created_at_timestamp example.
- For an example of how columns in a schema response can be defined with specific data types, see our specified_types example.
If you don't provide the primary key to use in a table, Fivetran creates a surrogate primary key column named _fivetran_id, which is a hashed value generated based on the row's set of values. See our system columns documentation for more details.
If a new row is received with a different set of columns, we calculate the hash from the new row's values, including values from any new columns. This leads to all rows being treated and written as distinct rows, which may be perceived as data integrity issues in the destination. In this case, you may have to drop and re-sync the connection to preserve data integrity. Thus, we recommend defining primary keys for your tables to avoid unexpected behavior.
The values you provide for the table, primary_key, and columns keys are renamed based on our renaming rules, so the corresponding names in the destination may differ from how they are set in your code. If you want the table and column names in the destination to exactly match the names you set in your code, we recommend adhering to the renaming rules so that your names already follow the pattern and character set of transformed names.
If you need to change primary key selections for a table, drop the table in your destination and then select Resync all historical data on the connection's Setup tab in your dashboard. Doing so maintains data integrity across all records.
Fivetran maintains a list of system columns that are automatically created in your destination for every connector. We recommend that you do not use these reserved names for your own columns to avoid conflicts.

Example of undefined table or primary key

Assume Fivetran receives the following row for a table not defined in the schema or defined without a primary key:

_id	foo	name	_fivetran_id
1	`abc`	`John Doe`	`96DE69AE1728658394E4EAE664431F1A4E7857E4`

The hashed value is generated from the values of the three columns: _id, foo, and name.

Now suppose we receive the same row with an additional column, bar:

_id	foo	name	bar	_fivetran_id
1	`abc`	`John Doe`		`96DE69AE1728658394E4EAE664431F1A4E7857E4`
1	`abc`	`John Doe`	`xyz`	`2AC47E18D9FCBC35B6DB94EA4FE4227A3A67A7F8`

The generated hashed value differs from that of the first row because it is calculated from the values of all the columns, including bar. This results in two distinct rows written to the destination.
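
The effect can be reproduced with a short sketch. It assumes SHA-1 over a canonical JSON dump of the row, purely for illustration. The surrogate_key() function is hypothetical, and the actual algorithm Fivetran uses to compute _fivetran_id is not specified here and may differ.

```python
import hashlib
import json


def surrogate_key(row: dict) -> str:
    # Illustration only: hash a canonical JSON form of every value in the row.
    # This is an assumed scheme, not Fivetran's actual _fivetran_id algorithm.
    canonical = json.dumps(row, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest().upper()


row_v1 = {"_id": 1, "foo": "abc", "name": "John Doe"}
row_v2 = {"_id": 1, "foo": "abc", "name": "John Doe", "bar": "xyz"}

key_v1 = surrogate_key(row_v1)
key_v2 = surrogate_key(row_v2)

# The same logical record gets two different surrogate keys once bar is added,
# so it would be written to the destination as two separate rows.
keys_match = key_v1 == key_v2

# With a declared primary key such as _id, both versions identify the same row.
primary_keys_match = row_v1["_id"] == row_v2["_id"]
```

Here keys_match is false while primary_keys_match is true, which is why declaring a primary key avoids the duplicate rows.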

Syncing empty tables and columns

Fivetran creates tables and columns in your destination for any table or column declared in the schema() method, even if no data is sent for it.

For more information, see our Features documentation.