DocumentDB Setup Guide
Follow our setup guide to connect Amazon DocumentDB to Fivetran.
Prerequisites
To connect your DocumentDB to Fivetran, you need:
- DocumentDB version 4.0 or higher
- Your database primary instance's IP (e.g., 1.2.3.4) or domain (your.server.com)
- Your database's port (usually 27017)
- An SSH server
Fivetran needs to connect to the database's primary instance to perform incremental updates using change streams, because DocumentDB does not support reading change streams from replica instances.
For initial syncs, Fivetran can also connect to replica instances. We recommend this because it improves sync performance.
Setup instructions
Connect to DocumentDB
Connect using SSH
Fivetran connects to a separate server in your network that provides an SSH tunnel to your DocumentDB primary instance.
To connect using SSH, configure your firewall and/or other access control systems to allow incoming connections to your DocumentDB port (usually 27017) from your SSH tunnel server's IP.
Before you proceed to the next step, you must follow our SSH connection instructions.
Connect using private networking (Fivetran-provisioned)
Connect using AWS PrivateLink
You must have a Business Critical plan to use AWS PrivateLink.
AWS PrivateLink allows VPCs and AWS-hosted or on-premises services to communicate with one another without exposing traffic to the public internet. PrivateLink is the most secure connection method. Learn more in AWS’ PrivateLink documentation.
Follow our AWS PrivateLink setup guide to configure PrivateLink for your database.
Connect using private networking (self-service)
Self-service private networking is currently in Beta.
You must have a Business Critical plan to use self-service private networking.
If self-service private networking is enabled for your account, you can set up and manage your own AWS PrivateLink connection between Fivetran and your DocumentDB database. This option provides the same security benefits as Fivetran-provisioned AWS PrivateLink - keeping traffic within your private VPC and off the public internet - but without requiring Fivetran Support to provision the connection.
For detailed setup steps, see our setup instructions for self-service Fivetran accounts.
Create user
Create a database user for Fivetran using the DocumentDB Mongo shell.
Open the Mongo shell.
Connect to your primary node as an admin user.
Go to the admin database.
Execute the following command to create a user for Fivetran. Replace <username> and <password> with a username and password of your choice.
use admin
db.createUser({
  user: "<username>",
  pwd: "<password>",
  roles: [ "readAnyDatabase" ]
})
Enable change streams
Fivetran uses change streams to perform incremental updates. You must explicitly enable change streams for all of the collections that you want Fivetran to sync.
Follow Amazon DocumentDB's Enabling Change Streams instructions to enable change streams.
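As a rough illustration of what enabling change streams per collection involves, the sketch below builds one modifyChangeStreams admin command for each collection. The modifyChangeStreams command and its database, collection, and enable fields are taken from Amazon DocumentDB's change streams documentation. The helper name buildEnableCommands and the database and collection names are hypothetical placeholders. Treat this as a sketch and confirm the details in Amazon's instructions before running anything against your cluster.

```javascript
// Hypothetical helper: builds one modifyChangeStreams command per collection.
// Each returned object is meant to be passed to db.adminCommand(...) in the Mongo shell.
function buildEnableCommands(database, collections) {
  return collections.map((collection) => ({
    modifyChangeStreams: 1,
    database: database,
    collection: collection,
    enable: true,
  }));
}

// Placeholder names (assumptions); replace them with your own database and collections.
const commands = buildEnableCommands("sales", ["orders", "customers"]);

// In the Mongo shell, connected to the primary instance as an admin user:
//   commands.forEach((cmd) => printjson(db.adminCommand(cmd)));
```

Building the commands from a list keeps the set of synced collections in one place, which makes it easier to check that every collection you want Fivetran to sync has change streams enabled.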
Set change stream log retention duration
Set the change stream log retention duration to at least 48 hours. We recommend increasing the duration to seven days.
Follow Amazon DocumentDB's Modifying the Change Stream Log Retention Duration instructions to adjust your change stream log retention duration.
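The retention periods mentioned in this step can be converted to seconds as shown below. This sketch assumes that the retention duration is configured in seconds through a cluster parameter named change_stream_log_retention_duration; confirm the parameter name and its allowed range in Amazon's instructions. The helper hoursToSeconds is hypothetical.

```javascript
// Hypothetical helper: converts a retention period in hours to seconds.
function hoursToSeconds(hours) {
  return hours * 60 * 60;
}

const minimumRetention = hoursToSeconds(48);         // 48 hours
const recommendedRetention = hoursToSeconds(7 * 24); // seven days

console.log(minimumRetention);     // 172800
console.log(recommendedRetention); // 604800
```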
Finish Fivetran configuration
In your connection setup form, enter a destination schema prefix. This prefix applies to each replicated schema and cannot be changed once your connection is created.
Choose your connection method. If you selected Connect via an SSH tunnel:
Copy the Public Key and add it to the authorized_keys file on your SSH server when configuring the SSH tunnel.
Enter the following SSH connection details:
- SSH Host - Enter either an IP address (for example, 1.2.3.4) or a domain name (for example, your.server.com). Do not use a load balancer's IP address or hostname.
- SSH Port - The port number for your SSH server.
- SSH User - The username for the SSH connection.
If TLS is enabled on your DocumentDB cluster, ensure that the Require TLS through Tunnel toggle is turned ON.
(Not applicable to self-service private networking) In the Host and ports field, click + Add and enter the hostname and port of your primary node:
Use the format hostname:port or IP:port (for example, server.example.com:27017 or 1.2.3.4:27017).
(Optional) Click + Add again to provide the hostname and port for one or more replica nodes. Fivetran may use replicas to improve the initial sync speed and balance resource usage.
Do not include any path components after the top-level domain.
(Only applicable to self-service private networking) From the PrivateLink connections drop-down menu, select your existing PrivateLink connection.
- If you want to create a new PrivateLink connection, click + Configure a new PrivateLink connection and follow the setup instructions for self-service Fivetran accounts to configure the connection in your AWS account.
- If TLS is enabled on your DocumentDB cluster, ensure that the Require TLS when using PrivateLink toggle is turned ON.
(Only applicable to self-service private networking) In the Port field, enter the port that matches your load balancer’s listener port.
The port will be automatically appended to your PrivateLink endpoint URL.
Enter the Fivetran-specific User that you created in the Create user step.
Enter the Password for the Fivetran-specific user.
Select your Pack mode.
Click Save & Test. Fivetran tests and validates the connection to your DocumentDB primary instance. Upon successful completion of the setup tests, you can sync your data using Fivetran.
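The format expected by the Host and ports field can be checked before you enter values in the form. The following is a minimal sketch, not Fivetran's validation logic. parseHostPort is a hypothetical helper that only enforces the hostname:port or IP:port shape (IPv4 only) and rejects values with path components.

```javascript
// Hypothetical helper: parses "hostname:port" or "IP:port" (IPv4 only).
// Returns { host, port } or throws if the value does not match the format.
function parseHostPort(value) {
  const match = /^([A-Za-z0-9.-]+):(\d{1,5})$/.exec(value.trim());
  if (!match) {
    throw new Error(`Expected hostname:port or IP:port, got "${value}"`);
  }
  const port = Number(match[2]);
  if (port < 1 || port > 65535) {
    throw new Error(`Port out of range: ${port}`);
  }
  return { host: match[1], port: port };
}

// Both example values from the setup form parse successfully.
console.log(parseHostPort("server.example.com:27017"));
console.log(parseHostPort("1.2.3.4:27017"));

// A value with a path component is rejected.
try {
  parseHostPort("server.example.com:27017/admin");
} catch (err) {
  console.log(err.message);
}
```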
Setup tests
Fivetran performs the following tests to ensure that we can connect to your DocumentDB primary and replica instances and that they are properly configured:
- The Connecting to SSH Tunnel Test validates the SSH tunnel details you provided in the setup form. It then checks that we can connect to your database using the SSH Tunnel.
- The Connecting to Host Test validates the database credentials you provided in the setup form. It then verifies that the database host is not private and checks that we can connect to the host.
- The Validating Certificate Test generates a pop-up window where you must choose which certificate you want Fivetran to use. It then validates that certificate and checks that we can connect to your database using TLS.
- The Connecting to Database Test connects to all the database instances you provided and verifies that your database is at least version 4.0. It then checks that we can access the schemas in your database.
- The Connecting to Database Test also checks if one of the instances provided is of type 'primary' by verifying that the change stream feature is enabled for at least one collection.
The tests may take a few minutes to finish running.
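The version requirement checked by the Connecting to Database Test can be approximated before you run the setup tests. This sketch is an assumption about how such a comparison might work, not Fivetran's implementation. meetsMinimumVersion is a hypothetical helper that takes a version string such as the one returned by db.version() in the Mongo shell.

```javascript
// Hypothetical helper: returns true if `version` is at least `minimum`.
// Compares dot-separated numeric components from left to right.
function meetsMinimumVersion(version, minimum = "4.0") {
  const actual = version.split(".").map(Number);
  const required = minimum.split(".").map(Number);
  const length = Math.max(actual.length, required.length);
  for (let i = 0; i < length; i++) {
    const a = actual[i] || 0; // missing components count as 0
    const r = required[i] || 0;
    if (a !== r) return a > r;
  }
  return true; // all components are equal
}

console.log(meetsMinimumVersion("4.0.0")); // true
console.log(meetsMinimumVersion("3.6.0")); // false
console.log(meetsMinimumVersion("5.0.0")); // true
```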