Databricks Setup Guide Private Preview
Follow our setup guide to replicate your Databricks catalog to your destination using Fivetran.
Prerequisites
To connect Databricks to Fivetran, you need the following:
- A Databricks account.
- At least one SQL warehouse to sync data from. Learn more about SQL warehouses in this connector's Limitations documentation.
Setup instructions
Create personal access token
Fivetran authenticates to Databricks with a personal access token. To create one, follow Databricks' token management guide.
NOTE: If we find a table that doesn't have change data feed enabled, we try to enable it. Because a personal access token inherits the permissions of the user who created it, make sure that user has the MODIFY permission on the table. The command to enable change data feed for a table is:
ALTER TABLE catalog_name.schema_name.table_name SET TBLPROPERTIES (delta.enableChangeDataFeed = true)
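If you prefer to enable change data feed yourself before the first sync, the command above can be generated for several tables at once. The following is a minimal sketch, not part of Fivetran: the helper names `quote_identifier` and `enable_cdf_statement` and the table names are hypothetical, and it only builds the SQL text, which you would then run in a Databricks SQL editor or client of your choice.

```python
def quote_identifier(name: str) -> str:
    """Wrap an identifier in backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def enable_cdf_statement(catalog: str, schema: str, table: str) -> str:
    """Build the ALTER TABLE statement that enables change data feed."""
    full_name = ".".join(quote_identifier(part) for part in (catalog, schema, table))
    return (
        f"ALTER TABLE {full_name} "
        "SET TBLPROPERTIES (delta.enableChangeDataFeed = true)"
    )


if __name__ == "__main__":
    # Hypothetical table list; replace with the tables you plan to sync.
    tables = [
        ("hive_metastore", "sales", "orders"),
        ("hive_metastore", "sales", "customers"),
    ]
    for catalog, schema, table in tables:
        print(enable_cdf_statement(catalog, schema, table) + ";")
```

Quoting each part of the name separately keeps the statement valid when a catalog, schema, or table name contains characters such as hyphens.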
Connect SQL warehouse
In the Databricks console, go to SQL > SQL warehouses > Create SQL warehouse. To use an existing SQL warehouse instead, select it and skip to the Connection details step.
In the New SQL warehouse window, enter a Name for your warehouse.
Choose your Cluster Size and configure the other warehouse options.
Click Create.
Go to the Connection details tab.
Make a note of the following values. You will need them to configure Fivetran.
- Server Hostname
- Port
- HTTP Path
Finish Fivetran configuration
- In the connector setup form, enter your chosen destination schema name.
- (Optional) Enter the catalog you want to sync data from. If you leave this field empty, we use the default hive_metastore catalog.
- Enter the server hostname of your Databricks cluster that you noted in the Connect SQL warehouse step.
- Enter the Port number that you noted in the Connect SQL warehouse step. The default value is 443.
- Enter the HTTP path of the SQL warehouse that you noted in the Connect SQL warehouse step.
- Enter your personal access token.
- Click Save & Test.
Fivetran tests and validates the Databricks connection. On successful completion of the setup tests, you can sync your data using your new Databricks connector.
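Before you paste the values into the setup form, it can help to collect them in one place and catch obvious copy-and-paste mistakes. The sketch below is an illustration only: the class name `DatabricksConnectionDetails` and its checks are assumptions, not rules that Fivetran enforces, and the sample values are placeholders. The defaults for the port (443) and catalog (hive_metastore) mirror the ones described in the form.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DatabricksConnectionDetails:
    """Values noted in the Connect SQL warehouse step (hypothetical container)."""

    server_hostname: str
    http_path: str
    access_token: str
    port: int = 443                  # default port from the setup form
    catalog: str = "hive_metastore"  # used when the catalog field is left empty

    def validate(self) -> List[str]:
        """Return a list of human-readable problems; empty means none found."""
        problems = []
        if not self.server_hostname or "://" in self.server_hostname:
            problems.append("server_hostname must be a bare host name without a URL scheme")
        if not self.http_path:
            problems.append("http_path must not be empty")
        if not 1 <= self.port <= 65535:
            problems.append("port must be between 1 and 65535")
        if not self.access_token:
            problems.append("access_token must not be empty")
        return problems


if __name__ == "__main__":
    # Placeholder values; substitute the ones from your Connection details tab.
    details = DatabricksConnectionDetails(
        server_hostname="your-workspace.example.com",
        http_path="/sql/1.0/warehouses/your-warehouse-id",
        access_token="your-personal-access-token",
    )
    for problem in details.validate():
        print("Problem:", problem)
```

The same server hostname, HTTP path, and token are also what the Databricks SQL Connector for Python takes in `databricks.sql.connect(server_hostname=..., http_path=..., access_token=...)`, so you can reuse them to check the connection outside Fivetran.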
Setup tests
Fivetran performs the following Databricks connection tests:
- The Databricks Connection test checks the accessibility of the Databricks project and validates the database credentials you provided in the setup form.
- The Permission test checks that we can connect to the database and get the details of tables and columns.
NOTE: The tests may take a couple of minutes to finish running.
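If the Permission test fails, you can check by hand whether the token's user can read table and column metadata. The sketch below builds two standard Databricks SQL statements for that purpose. It is an assumption about what a useful manual check looks like, not a description of the queries Fivetran runs, and the function name `metadata_check_statements` and the sample names are hypothetical.

```python
from typing import List


def quote_identifier(name: str) -> str:
    """Wrap an identifier in backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def metadata_check_statements(catalog: str, schema: str, table: str) -> List[str]:
    """Build statements that list a schema's tables and describe one table's columns."""
    schema_name = f"{quote_identifier(catalog)}.{quote_identifier(schema)}"
    return [
        f"SHOW TABLES IN {schema_name}",
        f"DESCRIBE TABLE {schema_name}.{quote_identifier(table)}",
    ]


if __name__ == "__main__":
    # Hypothetical names; run the printed statements with the same token Fivetran uses.
    for statement in metadata_check_statements("hive_metastore", "sales", "orders"):
        print(statement + ";")
```

If either statement fails with a permission error when run with the same personal access token, the Permission test is likely to fail for the same reason.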