Hybrid Deployment
You must have an Enterprise or Business Critical plan to use the Hybrid Deployment model.
Fivetran's Hybrid Deployment model lets organizations sync data sources using Fivetran while ensuring that the data never leaves the secure perimeter of their own cloud or on-premises network. This architecture gives you complete control over how your data flows, so you can meet your specific data security requirements.
With the Hybrid Deployment model, you decide where to host the data pipelines while still enjoying the advantages of an automated SaaS model. Your data remains within your private network, with Fivetran serving as a unified control plane for all your data movements. This setup supports hybrid and multi-cloud deployments and offers an extensible solution complete with APIs, metadata sharing, and more. It is also straightforward to set up, configure, troubleshoot, and support.
When setting up a new data pipeline, you have the option to run it locally. To do so, you install a Hybrid Deployment Agent within your environment, and the agent communicates outbound with Fivetran. The agent manages the data pipeline processing in your network, while configuration and monitoring are still performed through the Fivetran dashboard or API. Only metadata, including monthly active rows (MAR) information, and logs are sent to Fivetran, which allows Fivetran to understand how the pipeline is running and to display the details in the dashboard.
Architecture
The following diagram outlines the high-level architecture of our Hybrid Deployment model:
Key capabilities
The key capabilities of the Hybrid Deployment model are:
- Data ingestion and preparation
- Parallel processing
- Temporary data storage: you own and manage the temporary data storage (buckets)
- Data load into a destination
Data privacy and security: Our Hybrid Deployment model processes data within your infrastructure, keeping your actual data within the secure boundaries of your network.
Hybrid Deployment Agent: You host this agent on your infrastructure. It connects your local environment to Fivetran's Managed SaaS, maintaining constant communication with Fivetran to determine when the data pipeline needs to run. The local agent picks up those details from Fivetran's orchestration layer to perform the sync.
Deployment and operation: To use the Hybrid Deployment model, start the agent container on Kubernetes or on a Linux machine equipped with Docker or Podman. Once the agent is set up on your network, it manages data pipeline processing within that network. Configuration and monitoring of the data pipeline still occur within the Fivetran environment (using the Fivetran dashboard or API). The agent sends only metadata (including sync metrics and MAR information) and logs to Fivetran's cloud for tracking and monitoring purposes; you can view them in the Fivetran dashboard.
Network security: The agent creates a secure outbound connection to Fivetran using modern encryption standards such as mTLS. You have the option to limit the outbound traffic to the Fivetran Orchestration and API endpoints.
Resilience: The control plane is a fully managed, cloud-based component of Hybrid Deployment. Our Core Services SLA supports the configuration and monitoring of your data pipeline processing.
Capacity and limitations: Each agent can support up to 10 connections. We recommend that you plan the deployment strategy with this limitation in mind.
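As a planning aid, the per-agent limit above can be turned into a quick estimate of how many agents a deployment needs. The following is a minimal sketch, not a Fivetran tool: the function name `agents_needed` is hypothetical, and it assumes connections can be distributed freely across agents.

```python
import math

# Per-agent connection limit stated in the capacity guidance above.
MAX_CONNECTIONS_PER_AGENT = 10


def agents_needed(connection_count: int,
                  per_agent_limit: int = MAX_CONNECTIONS_PER_AGENT) -> int:
    """Return the minimum number of agents for the given number of connections."""
    if connection_count < 0:
        raise ValueError("connection_count must be non-negative")
    # Round up: a partially filled agent still has to be deployed.
    return math.ceil(connection_count / per_agent_limit)


print(agents_needed(23))  # 3
```

In practice you may want more agents than this minimum, for example to separate environments or to leave headroom for new connections.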
Setup guide
Follow our step-by-step setup guides to set up the Hybrid Deployment model for secure data integration in your local environment:
Supported connectors and destinations
Fivetran supports the Hybrid Deployment model for a subset of its connectors and destinations.
On the Fivetran dashboard, you can identify the connectors and destinations that support the Hybrid Deployment model by the Hybrid Deployment icon next to their names.
Sizing guidelines
To successfully deploy and operate the Hybrid Deployment Agent, it is essential to allocate sufficient system resources based on the type and number of connections you plan to run. The following sections provide conceptual guidance to help you estimate and plan your resource requirements.
Base resource requirements
Each Hybrid Deployment Agent requires a minimum of 2 vCPUs and 4 GB of RAM. Additionally, each connection consumes the same amount of resources (2 vCPUs and 4 GB of RAM). These baseline requirements apply to non–High-Volume Agent (non-HVA) connectors.
If you want to run multiple data pipelines concurrently on a single host, you must scale the CPU and memory resources proportionally. For more information about the resources required for multiple concurrent pipelines, see the Prerequisites section of our setup guides.
Additional requirements for HVA connectors
High-Volume Agent (HVA) connectors are optimized for large-scale data movement and require more resources per connection. For each HVA connection, we recommend 4 vCPUs and 8 GB of RAM.
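The arithmetic behind these guidelines can be sketched as a small calculator. This is an illustrative estimate only: the names `Resources` and `estimate_host_resources` are hypothetical, the agent baseline is assumed to equal the standard per-connection figure (2 vCPUs and 4 GB of RAM), and the totals assume that all listed connections run concurrently on one host and that requirements add up linearly. Adjust the constants if your setup guide lists different values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    vcpus: int
    ram_gb: int


# Figures taken from the sizing guidance above (assumptions noted in the lead-in).
AGENT_BASE = Resources(vcpus=2, ram_gb=4)           # assumed equal to a standard connection
STANDARD_CONNECTION = Resources(vcpus=2, ram_gb=4)  # non-HVA connector
HVA_CONNECTION = Resources(vcpus=4, ram_gb=8)       # High-Volume Agent connector


def estimate_host_resources(standard: int, hva: int = 0) -> Resources:
    """Estimate host resources for one agent plus concurrently running connections."""
    if standard < 0 or hva < 0:
        raise ValueError("connection counts must be non-negative")
    vcpus = (AGENT_BASE.vcpus
             + standard * STANDARD_CONNECTION.vcpus
             + hva * HVA_CONNECTION.vcpus)
    ram_gb = (AGENT_BASE.ram_gb
              + standard * STANDARD_CONNECTION.ram_gb
              + hva * HVA_CONNECTION.ram_gb)
    return Resources(vcpus=vcpus, ram_gb=ram_gb)


# One agent running three standard connections and one HVA connection at once.
print(estimate_host_resources(standard=3, hva=1))  # Resources(vcpus=12, ram_gb=24)
```

If your connections rarely sync at the same time, the real requirement may be lower than this worst-case sum.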
Disk space considerations
Disk space usage depends on the volume of the source data. Data pipelines that process large datasets require more storage.
We recommend provisioning disk space based on the expected maximum volume of data in your source, and allocating additional buffer space for temporary files and retries. For Docker and Podman, ensure that the base directory where containers and images are stored, typically /var/lib/docker (for rootful) or $HOME/.local (for rootless), has at least 50 GB of available disk space.
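Before installing the agent, you can check the free space on the container storage directory with a short script. This is a minimal sketch using only the Python standard library; the function name `has_enough_disk` is hypothetical, and the two paths are the typical defaults mentioned above, which may differ on your host.

```python
import os
import shutil

REQUIRED_FREE_GB = 50  # minimum free space recommended above


def has_enough_disk(path: str, required_gb: float = REQUIRED_FREE_GB) -> bool:
    """Return True if the filesystem containing `path` has enough free space."""
    # Uses binary gigabytes (1024**3 bytes), which is slightly stricter
    # than decimal gigabytes.
    free_gb = shutil.disk_usage(path).free / 1024**3
    return free_gb >= required_gb


if __name__ == "__main__":
    # Typical storage locations; adjust if your runtime is configured differently.
    candidates = ["/var/lib/docker", os.path.expanduser("~/.local")]
    for candidate in candidates:
        if not os.path.exists(candidate):
            print(f"{candidate}: not found, skipping")
            continue
        status = "OK" if has_enough_disk(candidate) else "less than 50 GB free"
        print(f"{candidate}: {status}")
```

The check reports free space for the whole filesystem that contains the directory, so it also works when the directory itself is still empty.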
Private networking
Private networking services (such as AWS PrivateLink, Azure Private Link, and Google Cloud Private Service Connect) allow you to sync data from your source into your destination without exposing traffic to the public internet.
Hybrid Deployment works with the private networking service in your local environment to securely connect to your source and destination. You do not have to perform any additional setup to enable private networking in Hybrid Deployment.
Related articles
Hybrid Deployment FAQ
API Hybrid Deployment Agent Management