SFTP Setup Guide
Follow our setup guide to sync files from your SSH server to your destination using SFTP.
Prerequisites
To set up the Fivetran SFTP connector, you need the following:
- An account on an SSH server containing files with supported filetypes and encodings
- The ability to log in to this account using either a password or key pair
Setup instructions
Begin Fivetran configuration
In the connector setup form, select the sync strategy: Magic Folder or Merge Mode.
Enter the Destination schema name of your choice.
If you selected Merge Mode as your sync strategy, enter the Destination table name.
Connect
(Not applicable to Hybrid Deployment) In the Connection Method drop-down menu, you can choose to Connect directly or Connect via SSH Tunnel:
Connect directly: Fivetran will connect directly to your SFTP Server. This is the simplest method. To connect directly, Fivetran's IP addresses should be safelisted in your firewall.
Connect via SSH Tunnel: Fivetran will connect to a separate server in your network which provides an SSH tunnel to your SFTP Server. You must choose this option if your SFTP Server is in an inaccessible subnet.
NOTE: For connectors configured for Hybrid Deployment, Connect directly is pre-selected in the Connection Method drop-down menu.
(Not applicable to Hybrid Deployment) If you use a keypair for logging into your SFTP server, set the Login with keypair? toggle to ON. Make a note of the SFTP Server Public Key and proceed to the next section.
If you select Connect via SSH Tunnel, make a note of the automatically generated SSH Tunnel Public Key.
NOTE: You can use either of the automatically generated public keys, SSH Tunnel Public Key or SFTP Server Public Key, for configuration as they are same.
If you use password for logging into your SFTP server and chose Connection Method as Connect directly, proceed to the Add login details section.
(Hybrid Deployment only) If your destination is configured for Hybrid Deployment, the Hybrid Deployment Agent associated with your destination is pre-selected in the Select an existing agent drop-down menu. To use a different agent, select the agent of your choice, and then select the same agent for your destination.
Configure your keypair in a text editor
If you want to connect via SSH Tunnel, this step is mandatory as we only support keypair based login to Tunnel Host. If you want to connect directly to your SFTP server, then this step is only required if your SSH user account uses key pairs to log in.
You need to create a group and SSH user, add the SSH user to the group, create the .ssh
directory and authorized_keys
file, and grant them permissions, unless they already exist and have the permissions granted. To do it, log in to your server and run the following commands:
Create group
fivetran
:sudo groupadd fivetran
Create an SSH user
fivetran
:sudo useradd -m -g fivetran fivetran
Switch to your
fivetran
user:sudo su - fivetran
Create the
.ssh
directory:mkdir ~/.ssh
Grant the
.ssh
directory permissions:chmod 700 ~/.ssh
Switch to the
.ssh
directory:cd ~/.ssh
Create the
authorized_keys
file:touch authorized_keys
Grant the
authorized_file
permissions:chmod 600 authorized_keys
Using your favorite text editor, add the SSH Tunnel public key from the Fivetran setup form to the authorized_keys
file. The key must be all on one line. Make sure no line breaks are introduced when cutting and pasting.
Add the Fivetran public key to the /.ssh/Authorized_keys file on any SSH account you wish to use.
IMPORTANT: If you choose both login with keypair and connect via SSH Tunnel, make sure you configure the SFTP Server Public Key and SSH Tunnel Public Key in their respective servers.
Add login details
In the connector setup form, enter the following details:
- SFTP Server Host Address
- SFTP Server Port
- SFTP Server Username
- SFTP Server Password (not required for keypair login)
If you selected Connect via an SSH tunnel as the Connection Method, enter the following details:
- SSH Tunnel Host Address
- SSH Tunnel Port
- SSH Tunnel Username
IMPORTANT: If you entered DNS instead of IP Address in Host Address of SFTP Server, you must have ip address to DNS mapping for SFTP Server in /etc/hosts file of tunnel Host Or the name should resolve to ip using Internal DNS server.
(Optional) Click Run connection test to validate the login credentials and connection to the SFTP server.
NOTE: You can skip this intermediate test and proceed to the next step. However, if you choose to skip, we will perform this test once you have finished your configuration.
Add SFTP configuration
Magic Folder
(Optional) In the setup form, enter your Folder Path from your SFTP server to specify the section of the file system in which you'd like Fivetran to look for files. We examine all files under the specified folder except the nested sub-folders. If you don't provide a prefix, we'll search the root folder for files we can sync.
Merge Mode
In the setup form, choose your configuration options. Using these configuration options, you can select subsets of your folders, specific types of files, and more to sync only the files you need in your destination. In addition, setting up multiple connectors targeted at the same file system but with different options allows you to slice and dice a file system any way you'd like.
Select files
(Optional) Folder Path - Use the folder path to specify a portion of the file system in which you'd like Fivetran to look for files. We examine files under the specified folder and all of its nested subfolders for files we can sync. If you don't provide a prefix, we'll look through the entire file system for files to sync.
(Optional) File Pattern - Use a regular expression as the file pattern to decide whether or not to sync specific files. The pattern applies to everything under the prefix (folder path). If you're unsure what regular expression to use, you can leave this field blank, and we'll sync everything under the prefix.
For example, if under the prefix you have a folder
data
, which has sub-folders,subFolder1
,subFolder2
, etc. These sub-folders have JSON files with the formatreport_03/12/2050.json
. Use the following regex patters to decide whether or not to sync specific files:data/.*
matches all the files in thedata
folder.data/subFolder1/.*\.json
matches all the JSON files in thesubFolder1
.data/subFolder2/report_.*\.json
matches all the JSON files in thesubFolder2
folder that has a name that starts with the prefixreport_.
. For example,report_file.json
.report_\d{2}/\d{2}/\d{4}\.json
matches all the JSON files that begin with the prefixreport_
and are followed by a date format ofDD/MM/YYYY
orMM/DD/YYYY
. For example,report_03/12/2050.json
.
TIP: We recommend that you test your regex.
(Optional) Click Preview Files to validate the file pattern.
NOTE: You can skip this intermediate test and proceed to the next step. However, if you choose to skip, we will perform this test once you have finished your configuration.
Compression - If your files are compressed but do not have extensions indicating the compression method, you can decompress them according to the selected compression algorithm. If all of your compressed files are correctly marked with a matching compression extension (.bz2, .gz, .gzip, .tar, or .zip), you can select infer. If you select uncompressed, we do not decompress the files and sync the uncompressed files. If you choose a compression format, we decompress every file using the format you select. For example, if you have an automated CSV output system that GZIPs files to save space but saves them without a .gzip extension, you can set this field to GZIP. We will decompress every file that we examine using GZIP.
(Optional) Archive Folder Pattern - Use a regular expression to filter and sync files from archived folders. We sync the files in compressed archives with filenames matching the specified pattern. If there are multiple files within archive (TAR or ZIP) folders, you can use the archive folder pattern to filter file types.
For example, if you specify the archive folder pattern as
.*json
, we will sync only the files that end in a .json file extension from the archive folder. > NOTE: This is only used to filter out the files inside the archived folder. All the file matching the File Pattern will be listed
Format
File Type - Note that all files are processed as the selected file type. Use the File Pattern field to select the file extensions you want to sync.
If your file type is CSV, TSV, or log, then enter the following details:
- (Optional) Delimiter - character used in csv files to separate one field from the next. Fivetran tries to infer the delimiter, but in some cases, this isn’t possible. If your files sync with the wrong number of columns or uses a unique delimiter, consider setting this value. For example, if you have tab-delimited files, you must enter
\t
, and if you have pipe-delimited files, enter|
. If you leave this field blank, we infer the delimiter for each file. You can store files of different types of delimiters in the same folder without any problems. For more information on the delimiter inference, see our documentation. - Quote character - Typically CSVs use double quotes
"
to enclose a value. Set the toggle to off if you don’t want to use an enclosing character. - Non-Standard escape character - Set the toggle to ON if your CSV generator uses non-standard ways of escaping characters like newline, delimiter, etc. Not standard in CSVs.
- (Optional) Null Sequence - Specify the value indicating null if your CSVs use a special value indicating null. Use this field only if you are sure your CSVs have a null sequence. Typically, CSVs have no native notion of a null character. However, some CSV generators have created one, using characters such as
\N
to represent null. - Skip Header Lines - Use this option to skip over a fixed number of header lines at the beginning of your CSV files. Set the toggle to ON, and then in the Number of skipped header lines field, specify the number of header lines you want to skip.
- Skip Footer Lines - Use this option to skip over a fixed number footer lines at the end of your CSV files. Set the toggle to ON, and then in the Number of skipped footer lines field, specify the number of footer lines you want to skip.
- Headerless files - Set the toggle to ON if your CSV-generating software doesn't provide a header line. Fivetran can generate generic column names and sync data rows with them.
- (Optional) Line Separator - Specify the custom line separator for your CSV files. The line separator is used in files to separate one row from the next. If you leave this field blank, we use the new line character
\n
as the line separator by default.
If your file type is JSON or JSONL, then select the following:
JSON Delivery Mode - Use this option to choose how Fivetran should handle your JSON data.
- If you select Packed, we load all your JSON data into the
_data
column without flattening it. - If you select Unpacked, we flatten one level of columns and infer their data types.
- (Optional) Delimiter - character used in csv files to separate one field from the next. Fivetran tries to infer the delimiter, but in some cases, this isn’t possible. If your files sync with the wrong number of columns or uses a unique delimiter, consider setting this value. For example, if you have tab-delimited files, you must enter
Beta Primary Key used for file process and load - Use this option to let Fivetran know how you'd like to update the files in your destination. When you modify a previously synced file, the option you select determines if we should replace the rows in the destination table or append new rows to the table:
- If you select Upsert file using file name and line number, we will upsert your data using the surrogate primary keys
_file
and_line
. If a file has a unique name, we will sync the data for that file as new data. - If you select Append file using file modified time, we will upsert your files using surrogate primary keys
_file
,_line
, and_modified
. You can track the full history of a file or set of files and your files will have a combination of old and new data or data that is updated periodically. - If you select Upsert file using custom primary key, you can keep the most recent version of every record and your files will have a combination of the old and new data or data that is updated periodically. You can choose the primary keys you want to use after you save and test. For more information, see our documentation.
NOTE: You can modify the primary keys only if your initial sync fails. If your initial sync is successful, the option to modify the primary keys is not available.
- If you select Upsert file using file name and line number, we will upsert your data using the surrogate primary keys
Additional options
Error Handling - Use the error handling option to choose how to handle errors in your files. If you know that your files contain some errors, you can choose to skip poorly formatted lines.
If you select skip, we ignore improperly formatted data within a file, allowing you to sync only valid data.
If you select fail, we fail the sync with an error on finding any improperly formatted data.
> TIP: We recommend that you select fail unless you are sure that you have undesirable, malformed data.
You will receive a notification on your Fivetran dashboard if we encounter errors.
(Optional) PGP Encryption Options - Use this option to sync PGP encrypted files. Set the toggle to ON and specify the following:
- PGP Private Key - Upload the PGP secret key as an attachment.
- (Optional) Passphrase - Enter the passphrase you used to generate the key.
- (Optional) Signer's Public Key - Upload the signer's public key as an attachment. This key is used for verifying the files.
NOTE: For PGP decryption processes, we strictly comply with the RFC4880 standard. We support syncing only base64 encoded files.
(Hybrid Deployment only) If your destination is configured for Hybrid Deployment, the Hybrid Deployment Agent associated with your destination is pre-selected in the Select an existing agent drop-down menu. To use a different agent, select the agent of your choice, and then select the same agent for your destination.
Finish Fivetran configuration
Click Save & Test. Fivetran will take it from here and sync data from your SFTP server.
Click Confirm to ensure that the server SSH key is trusted when the initial tests run.
Fivetran tests and validates the SFTP connection. On successful completion of the setup tests, you can sync your SFTP data to your destination.
Setup tests
Depending on your sync strategy, Fivetran performs the following SFTP connection tests:
(Both Magic Folder Mode and Merge Mode) The Validating Connection Parameters test validates the username, password, host address, and port number you specified in the setup form.
(Both Magic Folder Mode and Merge Mode) The Connecting to SFTP Server test validates the server credentials you specified in the setup form and checks the accessibility of your SFTP server.
(Merge Mode) The Validating File Pattern Regex test validates the file pattern regex you specified in the setup form. We perform this test only if you specify a regex in the File Pattern field.
(Merge Mode) The Validating Archive Pattern test validates the archive pattern regex you specified in the setup form. We perform this test only if you specify a regex in the Archive Folder Pattern field.
(Merge Mode) The Validating EscapeChar test validates the escape character you specified for your CSV files and checks the length of the character which must not be more than one. We perform this test only if you specify an escape character in the Escape Character field.
(Merge Mode) The PGP Support test validates whether the connector can successfully retrieve a minimum of one sample file and a maximum of ten sample files from FTP and decrypt them using the PGP keys you uploaded. We perform this test only if you set the PGP Encryption Options toggle to ON.
(Merge Mode) The Multi-Character Delimiter Support test validates the length of the delimiter which must be within 15 characters. We perform this test only if you specify the delimiter for your CSV files in the Delimiter field.
(Merge Mode) The Finding Matching Files test checks if the connector can successfully retrieve a minimum of one sample file and a maximum of ten sample files based on the configuration you specified in the setup form.
NOTE: The tests may take a couple of minutes to complete.
Related articles
description Connector Overview
settings API Connector Configuration