Working with Connector SDK
Need to get your connection up and running quickly?
Our team of Professional Services experts is available to provide free advisory services to help you build your first Connector SDK connection. This includes guidance on setup, troubleshooting, and best practices. To get started, simply file a support ticket.
Fivetran Connector SDK
The Fivetran Connector SDK is available as a GitHub repository that includes examples we recommend you review and try out.
Follow the Connector SDK setup guide to start using the SDK. This page covers several topics that will enable you to successfully use the SDK to build your custom data pipelines.
SDK runtime environment
The Connector SDK runtime environment executes your submitted Python script in an isolated environment. It installs the libraries listed in the requirements.txt file and then runs the provided Python script.
Python version support
We support the following Python versions:
- 3.12.8 (default; used if you do not explicitly specify a version with the --python or --python-version argument when deploying the connection)
- 3.11.11
- 3.10.16
- 3.9.21
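As a quick local sanity check, the sketch below compares the interpreter you are developing with against the major.minor versions listed above. The helper name `is_supported` is hypothetical and is not part of the SDK. The check deliberately ignores the patch version.

```python
import sys

# Major.minor versions taken from the supported list above.
SUPPORTED_VERSIONS = {(3, 9), (3, 10), (3, 11), (3, 12)}


def is_supported(version_info=sys.version_info):
    """Return True if the major.minor part of version_info is in the supported set."""
    return (version_info[0], version_info[1]) in SUPPORTED_VERSIONS


if __name__ == "__main__":
    major, minor = sys.version_info[0], sys.version_info[1]
    if is_supported():
        print(f"Python {major}.{minor} matches a supported runtime version.")
    else:
        print(f"Python {major}.{minor} is not in the supported list.")
```

Developing locally on the same major.minor version that you deploy with reduces the chance of version-specific surprises.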
Pre-installed DB drivers
We pre-install the msodbcsql17 and msodbcsql18 drivers for connecting to Microsoft SQL Server. We also pre-install libpq5 and libpq-dev, which are required to support psycopg and psycopg2 respectively.
Pre-installed packages
We pre-install the fivetran_connector_sdk:latest and requests:2.32.3 packages when running your code.
System resources
The environment running your code has:
- 1 GB of RAM
- Less than 0.5 vCPUs of an 8 Core N2 or equivalent machine
Our production infrastructure allows for some variation in resources available to your connections at any given moment, load balancing across multiple connections.
Working with requirements.txt file
Usually, your connector's code will need to import additional Python libraries. To ensure the correct libraries are installed before we run your code in Fivetran, you must include a standard pip requirements.txt file in the root directory of your connector project. In your requirements.txt file, you must list all the Python libraries your connector uses along with any required versions. See the weather example for a sample file.
You can use PyPI and Git sources to install packages. For example:
pandas
git+https://github.com/oracle/python-oracledb.git@main#egg=oracledb
You can also quickly install all the needed libraries when setting up your environment using the following command:
pip install -r requirements.txt
NOTE: fivetran_connector_sdk:latest and requests:2.32.3 are available when executing your code. Do not declare them in your requirements.txt to avoid dependency conflicts.
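To catch accidental declarations before deploying, a small local check can scan the file for the two pre-installed packages. This is only a sketch: the function name `find_preinstalled` is hypothetical, and the name parsing is intentionally simple rather than a full implementation of the pip requirements format.

```python
import re

# Packages that the note above says are already available at runtime.
PREINSTALLED = {"fivetran_connector_sdk", "requests"}


def find_preinstalled(requirements_text):
    """Return the pre-installed package names declared in the given requirements text."""
    found = []
    for raw_line in requirements_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Take everything before the first version specifier, extra, or marker.
        name = re.split(r"[<>=!~\[; ]", line, maxsplit=1)[0]
        # Treat '-' and '_' as equivalent when comparing names.
        if name.lower().replace("-", "_") in PREINSTALLED:
            found.append(name)
    return found


if __name__ == "__main__":
    sample = "pandas\nrequests==2.32.3\n"
    print(find_preinstalled(sample))
```

In practice you would pass the contents of your own requirements.txt and remove any names the function reports.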
Working with configuration.json file
When debugging or deploying a connector, you can pass specific configuration values to your code. Configuration values are stored securely and encrypted in the same way as other connectors' credentials. See our Data Credential Encryption documentation page for more details.
The easiest way to pass values for the initial deployment is as follows:
- Create a temporary configuration.json file in the root directory of your connector project.
- Use it for the initial deployment.
- Delete the temporary file. We do not store this file. We recommend that you do not check it into your code repository.
NOTE: Once you have deployed your connector, you can edit the configuration values in the dashboard.
Your JSON file should be valid JSON containing key-value pairs with only STRING values. The following is the required format:
{
"key1" : "value1",
"key2" : "value2"
}
NOTE: You can pass any file containing valid JSON key-value pairs to the configuration parameter using an absolute or relative path to the file, for example, --configuration Users/john.doe/configurations/test-config1.txt or --configuration ../../configuration. In the examples provided in this section, the configuration file is called configuration.json and is stored in the root project folder.
NOTE: Using empty strings as values in the configuration key-value pairs allows the connection to be deployed with empty configuration values. This, in turn, lets you avoid temporarily storing secrets locally.
You can enter or edit the configuration values after the connection deployment in the setup form of your connection. The latest version of the configuration values, whether set by editing the values in the setup form or by redeploying with configuration.json, is used for all connection syncs. Any new value in the configuration.json file, even an empty string, overwrites the existing value.
The configuration.json file can include a maximum of 100 key-value pairs.
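The format rules above (valid JSON, string values only, at most 100 key-value pairs) can be checked locally before you debug or deploy. The following is a minimal sketch; `validate_configuration` is a hypothetical helper, not an SDK function, and it checks only the rules stated in this section.

```python
import json

MAX_PAIRS = 100  # limit stated in this section


def validate_configuration(text):
    """Return a list of problems found in a configuration payload (empty if none)."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value must be a JSON object of key-value pairs"]
    # Every value must be a string, including numbers and booleans.
    problems = [
        f"value for {key!r} is {type(value).__name__}, expected a string"
        for key, value in data.items()
        if not isinstance(value, str)
    ]
    if len(data) > MAX_PAIRS:
        problems.append(f"{len(data)} key-value pairs exceeds the limit of {MAX_PAIRS}")
    return problems


if __name__ == "__main__":
    print(validate_configuration('{"key1": "value1", "key2": ""}'))
```

Empty strings pass the check, which matches the note above about deploying with empty values.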
Only configuration keys are visible in your Fivetran dashboard after your connector is deployed and running.
Configuration values are fully or partially obfuscated depending on their length as this process is designed for managing sensitive credentials.
The values are only ever available to your code at runtime. You can be confident using this configuration to pass any credentials required by your connector, such as a key needed to make an API call. See the configuration example for a working sample.
To test your code using a configuration.json file in your project's root directory, use the following command:
fivetran debug --configuration configuration.json
To deploy your connector.py using a configuration file in your project's root directory, use the following command:
fivetran deploy --api-key <FIVETRAN-API-KEY> --destination <DESTINATION-NAME> --connection <CONNECTION-NAME> --configuration configuration.json
NOTE: Any time you provide a --configuration argument and associated information, it completely replaces what was there before. If you omit --configuration, no changes are made to the previously provided configuration.
NOTE: You can deploy the same connector.py file multiple times using different configuration files to create multiple connections.
Editing configuration values after deployment
Once you have deployed your connector, you can edit the configuration values in the dashboard by doing the following:
1. Go to the Setup tab of your Connection Overview page.
2. Click Edit Connection.
3. Under Configuration(s), click Edit beside the relevant configuration parameter. The Edit modal window appears in the right part of the page.
4. Enter the Configuration Value.
5. Click Save.
6. Repeat steps 3 to 5 for each configuration parameter you want to edit.
Configuration options
This section covers the casting rules for configuration field values to the following data types:
- boolean
- list
- integer
- dict
Cast to boolean
For example, let's say you have a RESYNC_ON field in configuration.json with the value "SUNDAY". You can use it as a boolean condition in your Python code as follows:
from datetime import datetime
# True when today's weekday name matches the configured day
if configuration['RESYNC_ON'].lower() == datetime.now().strftime('%A').lower():
    ...  # do something
As another example, let's say you have an include_all field in configuration.json with the value "TRUE". You can use it as a boolean in your Python code as follows:
if configuration['include_all'] == 'TRUE':
    ...  # do something
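An exact comparison with 'TRUE' treats "true" or "True" as false. If you want to tolerate different spellings, a small helper such as the sketch below can normalize the string first. The helper name `to_bool` and the sets of accepted spellings are assumptions made for illustration; they are not defined by the SDK.

```python
# Accepted spellings are an assumption for this sketch; adjust them to your needs.
TRUE_VALUES = {"true", "1", "yes", "on"}
FALSE_VALUES = {"false", "0", "no", "off", ""}


def to_bool(value):
    """Convert a configuration string to a bool, ignoring case and surrounding spaces."""
    normalized = value.strip().lower()
    if normalized in TRUE_VALUES:
        return True
    if normalized in FALSE_VALUES:
        return False
    raise ValueError(f"cannot interpret {value!r} as a boolean")


# Stand-in for the configuration dict your connector code receives.
configuration = {"include_all": "TRUE"}

if to_bool(configuration["include_all"]):
    print("including all records")
```

Raising an error on unrecognized values makes a typo in the setup form fail loudly instead of silently being treated as false.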
Cast to list
For example, let's say you have a COUNTRIES field in configuration.json with the value "NAM,APAC,EMEA". You can use it as a list in your Python code as follows:
countries = configuration['COUNTRIES'].split(",")
# ... do something ...
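A plain split(",") keeps surrounding whitespace and produces empty items for trailing commas or an empty string. The sketch below shows one way to tidy the result. `to_list` is a hypothetical helper name, not an SDK function.

```python
def to_list(value, separator=","):
    """Split a configuration string into trimmed, non-empty items."""
    return [item.strip() for item in value.split(separator) if item.strip()]


# Stand-in for the configuration dict your connector code receives.
configuration = {"COUNTRIES": "NAM, APAC ,EMEA,"}

countries = to_list(configuration["COUNTRIES"])
print(countries)
```

With this helper, an empty configuration value yields an empty list rather than a list containing one empty string.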
Cast to integer
For example, let's say you have an API_QUOTA field in configuration.json with the value "12345". You can use it as an integer in your Python code as follows:
api_quota = int(configuration['API_QUOTA'])
# ... do something ...
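Because configuration values can be deployed as empty strings, a bare int() call may raise a ValueError with little context. The sketch below wraps the cast with a clearer error message and an optional default. `to_int` is a hypothetical helper name, and the fallback behaviour is an assumption you may not want in every connector.

```python
def to_int(configuration, key, default=None):
    """Read configuration[key] as an int, falling back to default when it is empty."""
    raw = configuration.get(key, "").strip()
    if raw == "":
        if default is None:
            raise ValueError(f"configuration value for {key!r} is missing or empty")
        return default
    try:
        return int(raw)
    except ValueError:
        raise ValueError(
            f"configuration value for {key!r} is not an integer: {raw!r}"
        ) from None


# Stand-in for the configuration dict your connector code receives.
configuration = {"API_QUOTA": "12345"}

api_quota = to_int(configuration, "API_QUOTA")
print(api_quota)
```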
Cast to dict
For example, let's say you have a CURRENCIES field in configuration.json with the value "[{\"From\": \"USD\",\"To\": \"EUR\"},{\"From\": \"USD\",\"To\": \"GBP\"}]". You can parse it into Python dicts (here, a list of dicts) in your Python code as follows:
import json
parsed_json = json.loads(configuration['CURRENCIES'])  # list of dicts
# ... do something ...
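To make the result concrete, the sketch below parses the same CURRENCIES value and iterates over it. For this value, json.loads returns a list whose items are dicts. The `configuration` dict here is a stand-in for the one your connector code receives.

```python
import json

# Stand-in configuration holding the same JSON string as the example above.
configuration = {
    "CURRENCIES": '[{"From": "USD","To": "EUR"},{"From": "USD","To": "GBP"}]'
}

# The top-level JSON value is an array, so the result is a list of dicts.
currency_pairs = json.loads(configuration["CURRENCIES"])

# Collect (from, to) tuples for later use, for example when building API requests.
pairs = [(pair["From"], pair["To"]) for pair in currency_pairs]

for source, target in pairs:
    print(f"{source} -> {target}")
```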
Working with state.json file
When running a connector locally that uses the state variable, a state.json file is created in <project_directory>/files/state.json. This file can be edited locally and is used as the starting state for the next local run of the connector. You can also manually create this file to start a connector Debug() run from a particular state.
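If you prefer to create the starting state from a script rather than by hand, something like the following sketch would write the file to the location described above. `write_state` is a hypothetical helper, and the `last_updated_at` key in the usage comment is only an illustration; use whatever keys your connector stores in its state.

```python
import json
from pathlib import Path


def write_state(project_dir, state):
    """Write state as JSON to <project_dir>/files/state.json and return the path."""
    state_path = Path(project_dir) / "files" / "state.json"
    # Create the files directory if a local run has not created it yet.
    state_path.parent.mkdir(parents=True, exist_ok=True)
    state_path.write_text(json.dumps(state, indent=2))
    return state_path


# Example usage (the state key is hypothetical):
# write_state(".", {"last_updated_at": "2024-01-01T00:00:00Z"})
```

The next local run then starts from the state you wrote.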
You can manage the state of Connector SDK connectors already deployed to Fivetran by using our API endpoints.
Code upload guidelines
When using fivetran deploy, we support only .py files and a requirements.txt file as part of the code upload. No other code files are supported or uploaded during the deployment process. Ensure that your code is structured accordingly and all dependencies are listed in requirements.txt.
Working with environment variables
When developing a connector, it is much more convenient to use environment variables for parameters you would otherwise have to enter repeatedly on the command line. We recommend using an environment variable for your FIVETRAN_API_KEY to avoid having to find and copy it every time you deploy your connector. To use an environment variable, create an .env file in the root directory of your connector project. Your file should be in the following format:
FIVETRAN_API_KEY=<your base64-encoded Fivetran API key>
Load your .env file for use in your terminal by running the following command:
export $(grep -v '^#' .env | xargs)
Check that your environment variables are now available for use by running the following command:
echo $FIVETRAN_API_KEY
You should use configuration files for any configuration used within your connector.py code so that it is stored securely within Fivetran and available when the code is deployed and running there.
NOTE: .env files are not deployed to Fivetran with the rest of your code.
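If you drive deployments from a Python script instead of a shell, you can read the same .env format without the export command. The following is a minimal sketch under that assumption. `parse_env` is a hypothetical helper; it handles only simple KEY=VALUE lines and does no quoting or variable expansion.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and lines that start with '#'."""
    variables = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so values containing '=' stay intact.
        key, value = line.split("=", 1)
        variables[key.strip()] = value.strip()
    return variables


if __name__ == "__main__":
    sample = "# local settings\nFIVETRAN_API_KEY=abc123==\n"
    print(parse_env(sample))
```

Splitting on the first equals sign matters here because encoded keys often end with one or more padding `=` characters.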
To deploy your connector.py using an .env file to set a local environment variable for your FIVETRAN_API_KEY, use the following command:
fivetran deploy --destination <DESTINATION-NAME> --connection <CONNECTION-NAME>
You can use .env files to set both your destination and connection names, which lets you quickly switch between different connections.
For example, create another test.env file that contains the parameters in the following format:
FIVETRAN_API_KEY=<your base64-encoded Fivetran API key>
FIVETRAN_DESTINATION_NAME=<your destination name>
FIVETRAN_CONNECTION_NAME=<your connection name>
Load it by running the following command:
export $(grep -v '^#' test.env | xargs)
Deploy this connector by running the following command:
fivetran deploy
or, alternatively, if your connector needs configuration, by running the following command:
fivetran deploy --configuration configuration.json
TIP: Make sure to include your .env files in your .gitignore file to ensure you don't accidentally check your FIVETRAN_API_KEY into your code repository.
Working with warehouse.db
Our testers support only basic operations. Functionalities like schema operations are not supported by fivetran debug. If you see failures while using fivetran debug and your code has schema definition changes, use the fivetran reset command to reset the warehouse.db and state files:
fivetran reset
If the error persists, then most likely there is an error in your code.
Working with yield and child functions
The yield keyword in a generator function is a standard Python code pattern. The Fivetran Connector SDK uses it to efficiently deliver data to our core platform.
There are two ways that yield can be used with a child function. See the examples below.
Let's assume we have a child function:
def child_function(data):
    for item in data:
        yield item
You can call the function from update() in two ways:
- Using yield from:
def update(data):
    yield from child_function(data)
- Using yield: In this case, you need to iterate over the generator returned from child_function:
def update(data):
    for item in child_function(data):
        yield item
Our pagination examples use the first approach.
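The two approaches deliver the same items in the same order. The short sketch below puts them side by side so you can confirm this locally. The function names `update_with_yield_from` and `update_with_loop` are used only to keep both variants in one file; in a connector you would have a single update() function.

```python
def child_function(data):
    # Yield each item from the input, one at a time.
    for item in data:
        yield item


def update_with_yield_from(data):
    # Delegate to the child generator directly.
    yield from child_function(data)


def update_with_loop(data):
    # Iterate over the child generator and re-yield each item.
    for item in child_function(data):
        yield item


rows = [{"id": 1}, {"id": 2}, {"id": 3}]
from_delegation = list(update_with_yield_from(rows))
from_loop = list(update_with_loop(rows))
print(from_delegation == from_loop)
```

Note that calling child_function(data) without yield from or a loop only creates a generator object, so none of its items would be delivered.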
Connecting to DuckDB
The Fivetran Local Tester saves the data sent to Fivetran by your connector code into a DuckDB warehouse.db file stored at the <project directory>/files/warehouse.db path.
Using DuckDB CLI
Run the DuckDB CLI:
duckdb files/warehouse.db
Once running, you can see the tables and their schema as well as query the warehouse with SQL:
.tables
.schema
SELECT * FROM default_schema.hello_world;
To stop DuckDB, run the following command:
.exit
Using DBeaver
- Connect to a database and select DuckDB.
- Click Open to navigate to and select the warehouse.db file created by running fivetran debug.
- DBeaver may need to download updated drivers. It should then show that it is connected to the local warehouse.db file and let you explore it using regular SQL. Your tables will be located in the tester schema of the warehouse catalog.
fivetran debug issue with open DBeaver connection
If you run fivetran debug while a connection is already open in DBeaver, the command does not work until the connection is closed. It fails with the following error message:
IO Error: Could not set lock on file "/Users/john.smith/connector/files/warehouse.db": Conflicting lock is held in /Applications/DBeaver.app/Contents/MacOS/dbeaver (PID 28552) by user john.smith.
This occurs because DuckDB does not support concurrent connections to the same database file, as explained in their concurrency documentation. To resolve this issue, close the connection in DBeaver before running fivetran debug.
Connector SDK logs
You have two options for managing your SDK connections' logs:
- Using the Fivetran Platform Connector. Your connections' logs are available in the CONNECTOR_SDK_LOG table within the associated Fivetran destination.
- Using external log services configured for the respective destination.
Connector SDK logs provide in-depth event data, including timestamps, log levels, and messages.