How to Build a Connector SDK Connector with Visual Studio Code and GitHub Copilot
The video tutorial demonstrates how to build a Fivetran Connector SDK custom connector using Copilot, an AI pair programmer. The demo showcases the end-to-end process of creating, testing, and deploying a connector for the FDA Tobacco API.
For prerequisites and best practices, see our Building a Custom Connector With VS Code and AI documentation.
The tutorial is based on our FDA Tobacco Problem API Connector example.
Prerequisites
- Connector SDK prerequisites
- Python virtual environment set up (see How to Create a Python Virtual Environment for detailed instructions)
- Connector SDK installed (see How to Install Fivetran Connector SDK for detailed instructions)
- Docker & Docker Compose
- VS Code with GitHub Copilot enabled
See the Tutorials for Fivetran Connector SDK to learn how to build custom connectors with Fivetran's Connector SDK.
Prepare the AI assistant using a special agents.md file. This file defines the goals, formatting rules, and behavior for Copilot. It helps ensure the AI follows consistent practices during code generation.
See our agents.md file example to learn more.
Create a new folder for the project (FDA_tobacco in this example) along with the following files:
- connector.py
- configuration.json
- requirements.txt
- README.md
This folder structure should ensure compatibility with Connector SDK. See our Connector SDK Setup Guide and Project Structure documentation for more details.
See our How to Create a Project Folder documentation if you need in-depth instructions.
To help the AI understand the target data source (for the purposes of this example, FDA Tobacco APIs), compile authentication info, endpoint details, and sample payloads into a notes.txt file.
Additionally, create a fields.yaml file for schema clarity.
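One way to collect a sample payload for notes.txt is to request a single record from the endpoint used in this tutorial. The following is a minimal sketch, not part of the video: the helper names `sample_url` and `fetch_sample` are hypothetical, and it assumes the endpoint accepts a `limit` query parameter and allows unauthenticated requests.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint used in this tutorial's prompt example.
BASE_URL = "https://api.fda.gov/tobacco/problem.json"


def sample_url(limit: int = 1) -> str:
    """Build a request URL that asks for a small number of records."""
    # Assumption: the API accepts a `limit` query parameter.
    return f"{BASE_URL}?{urlencode({'limit': limit})}"


def fetch_sample(limit: int = 1) -> dict:
    """Download and decode a small sample payload (performs a network call)."""
    with urlopen(sample_url(limit), timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    # Pretty-print one record so it can be pasted into notes.txt.
    print(json.dumps(fetch_sample(1), indent=2))
```

Keeping the sample small is enough to show the AI assistant the shape of the response without consuming much of the unauthenticated request allowance.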
To generate the connector files, we provide a structured three-part prompt to the AI assistant, using project-local context files.
The full prompt combines three logical sections outlined below:
The first prompt part references real API material from:
- notes.txt: authentication, endpoints, sample queries
- fields.yaml: field structure pulled from the API schema
This content gives the AI model deep, domain-specific context for code generation.
The second prompt part specifies the functional requirements:
- Dynamically create tables based on the available endpoints
- Flatten nested dictionaries
- Use the resulting key-value pairs as table columns
- Upsert records
- Define only the primary keys in the schema and let Fivetran infer the remaining columns
- Limit queries to the first 10 results per endpoint to avoid exceeding API limits (no API key is provided)
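The flattening and row-limit requirements above can be sketched in plain Python. This is an illustration only, not the code Copilot generates in the video: the helper names `flatten` and `first_n`, the underscore separator, the choice to store lists as JSON strings, and the sample record are all assumptions made for this example.

```python
import json
from itertools import islice
from typing import Any, Dict, Iterable, Iterator


def flatten(record: Dict[str, Any], parent_key: str = "", sep: str = "_") -> Dict[str, Any]:
    """Flatten nested dictionaries into a single level of column-like keys."""
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        column = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse so that {"a": {"b": 1}} becomes {"a_b": 1}.
            flat.update(flatten(value, column, sep))
        elif isinstance(value, list):
            # Assumption: lists are kept as JSON strings, not split into child tables.
            flat[column] = json.dumps(value)
        else:
            flat[column] = value
    return flat


def first_n(records: Iterable[Dict[str, Any]], limit: int = 10) -> Iterator[Dict[str, Any]]:
    """Yield at most `limit` flattened records, mirroring the 10-result cap."""
    for record in islice(records, limit):
        yield flatten(record)


# Hypothetical sample record; the field names are illustrative only.
sample = {
    "report_id": 1,
    "product": {"type": "cigarette", "details": {"count": 2}},
    "tags": ["a", "b"],
}
row = flatten(sample)
print(row)
```

In a real connector, each flattened row would then be passed to the Connector SDK's upsert operation for the table that corresponds to its endpoint.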
The third prompt part instructs the AI assistant to write code directly into the defined files in the FDA_tobacco project folder:
- connector.py
- configuration.json
- requirements.txt
- README.md
In this part, we also require adherence to Fivetran SDK best practices.
Prompt example
I need a Fivetran Connector SDK solution for https://api.fda.gov/tobacco/problem.json. I have some notes and example queries in #file:notes.txt and the fields documented in #file:fields.yaml. Have it dynamically create tables based on the endpoints available. Flatten the dictionaries and upsert the key:value pairs as the columns for the tables. Only define the Primary Key for the schema objects, let Fivetran infer the rest. Process the first 10 responses from each endpoint and then exit gracefully, we do not have an API key and do not want to exceed the limits. Create a Fivetran Connector SDK solution that follows Fivetran best practice outlined in #file:fivetran_connector_sdk.instructions.md. I have the files prepared in #file:FDA_tobacco.
The AI assistant generates and populates all required files live in VS Code:
- connector.py
- configuration.json
- requirements.txt
- README.md
This marks the completion of the prompt execution phase, after which the developer moves into the testing and validation stage using fivetran debug.
Watch this tutorial's segment on YouTube (05:32 - 06:09)
With the files generated, test the connector using the fivetran debug command. The tool validates the configuration and performs a limited sync, returning an upsert summary to verify correctness.
In the video tutorial, when an error occurs due to an incorrect working directory, the AI agent suggests a correction and re-runs the test. This highlights how well-structured prompts and AI agents can resolve issues dynamically.
To manually troubleshoot errors, refer to Fivetran’s SDK Troubleshooting Guide.
Watch this tutorial's segment on YouTube (06:09 - 06:40)
Use DuckDB to inspect the synced data and verify that the expected schema and values were loaded. See our Working with DuckDB documentation to learn more.
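The inspection step can also be scripted with DuckDB's Python package. The sketch below makes several assumptions that you should check against your own project: that the local test run wrote its output to files/warehouse.db, that tables live in a schema named tester, and that a table called problem exists. The helper names `preview_sql` and `preview` are hypothetical.

```python
def preview_sql(table: str, schema: str = "tester", limit: int = 5) -> str:
    """Build a SELECT statement for a quick look at one synced table."""
    # Assumption: "tester" is the schema used by the local test run.
    return f'SELECT * FROM "{schema}"."{table}" LIMIT {int(limit)}'


def preview(table: str, db_path: str = "files/warehouse.db") -> list:
    """Open the local DuckDB file read-only and return a few rows."""
    # Imported lazily so preview_sql() works even without DuckDB installed.
    import duckdb

    # Assumption: db_path points at the file produced by the local test run.
    con = duckdb.connect(db_path, read_only=True)
    try:
        return con.execute(preview_sql(table)).fetchall()
    finally:
        con.close()


if __name__ == "__main__":
    # "problem" is a hypothetical table name; replace it with one from your sync.
    for row in preview("problem"):
        print(row)
```

Opening the database read-only avoids accidentally modifying the data that the test run produced.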
Watch this tutorial's segment on YouTube (06:40 - 07:31)
Once validated, deploy the connector using the fivetran deploy command. The CLI should return the connection ID and confirm the success of the deployment.
Once deployed, start syncing data with the connection.