Best Practices for File Source Configurations
Question
What are the most effective ways to configure a File connector to maximize performance and minimize Monthly Active Rows (MAR) costs under the 2025 pricing model?
Environment
File connections using Merge Mode on the 2025 pricing model.
What's new?
Area | What changed | Why it matters |
---|---|---|
Per-connection pricing and Multiple Table support | Each connection (source to destination) now accrues its own MAR-based discount curve. Multiple Table support allows a single connection to sync several logically related tables, pooling MAR volume onto that curve. | Consolidate related datasets into one connection to optimize pricing. |
Re-sync detection | We detect unchanged rows from the previous sync, even though the modified timestamp is at the file level. We identify a row based on the configured primary key. | Unchanged rows are not counted towards MAR, delivering cost effectiveness without compromising data freshness. |
For more information, see our 2025 Pricing Updates.
Answer
Prerequisites
- Understand how your source files are generated:
  - Are files appended to, overwritten, or newly created?
  - Are file names unique (for example, with appended timestamps) or reused?
  - Do files contain only incremental changes or a full historical refresh?
- Identify Primary Keys - Determine whether each dataset mapped to a table includes fields that can uniquely identify each row (i.e., a natural primary key).
- Specify granular paths and patterns - Provide the most specific File Path and optional regex File Pattern to exclude irrelevant data.
- Understand Monthly Active Rows (MAR) and the pricing model - Learn how MAR is calculated, how to track your usage, and how to optimize your syncs. For more information, see our pricing documentation.
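Before saving a File Pattern, it can help to check it against a sample of real file names. The following is a minimal sketch using Python's `re` module. The pattern and file names are hypothetical, and the regex dialect your connector uses may differ from Python's, so treat this as a local sanity check only.

```python
import re

# Hypothetical pattern: daily order exports such as orders_20250131.csv.
# Replace it with the pattern you plan to enter as the File Pattern.
FILE_PATTERN = re.compile(r"orders_\d{8}\.csv")


def matching_files(names, pattern=FILE_PATTERN):
    """Return only the file names that the pattern matches in full."""
    return [name for name in names if pattern.fullmatch(name)]


# Hypothetical directory listing used to exercise the pattern.
candidates = [
    "orders_20250131.csv",
    "orders_20250131.csv.bak",
    "customers_20250131.csv",
    "orders_summary.csv",
]

print(matching_files(candidates))
```

A specific pattern like this one excludes backups and unrelated exports that a broader pattern such as `orders.*` would pick up.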
Recommended file source configurations
In all of the following configurations, we recommend grouping logically related datasets into a single connection using Multiple Table support. This approach pools their MAR onto a single discount curve, which helps optimize pricing.
Configuration A - Unique file names with incremental data
- Each file has a unique name and contains only new data.
- Primary key used for file process and load - Append file using file modified time (we treat the rows in each file as new rows).
Configuration B - Reused file names with incremental data
- Same file name reused; each upload includes only new data.
- Primary key used for file process and load - Append file using file modified time (we treat the rows in each file as new rows).
Configuration C - Reused file names, full refresh
- Same file name reused; each file is a complete snapshot of historical data plus new data.
- Primary key used for file process and load - Upsert file using custom primary key (recommended) or Upsert file using file name and line number (if the row order is stable).
- Why - Re-sync detection uses the primary key to ensure only genuinely new or changed rows count towards MAR.
Configuration D - Unique file names, full refresh
- Each file has a unique name and is a complete snapshot of historical data plus new data.
- Primary key used for file process and load - Upsert file using custom primary key.
- Why - Re-sync detection uses the primary key to ensure only genuinely new or changed rows count towards MAR.
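The four configurations above reduce to two questions: are file names unique, and does each file contain a full refresh? The following sketch encodes that decision as a small Python helper. The function name and its parameters are hypothetical and exist only for illustration. The option strings are the ones listed in Configurations A through D.

```python
APPEND = "Append file using file modified time"
UPSERT_CUSTOM_KEY = "Upsert file using custom primary key"
UPSERT_FILE_LINE = "Upsert file using file name and line number"


def recommend_primary_key_options(unique_file_names, full_refresh,
                                  stable_row_order=False):
    """Return the recommended options, most preferred first.

    Hypothetical helper that mirrors Configurations A to D in this article.
    """
    if not full_refresh:
        # Configurations A and B: files contain only new data.
        return [APPEND]
    if unique_file_names:
        # Configuration D: unique file names, full snapshot in each file.
        return [UPSERT_CUSTOM_KEY]
    # Configuration C: reused file names, full snapshot in each file.
    options = [UPSERT_CUSTOM_KEY]
    if stable_row_order:
        options.append(UPSERT_FILE_LINE)
    return options


# Configuration C with a stable row order.
print(recommend_primary_key_options(unique_file_names=False,
                                    full_refresh=True,
                                    stable_row_order=True))
```

File-name uniqueness only matters for full-refresh files. Incremental files use the append option in both cases.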
Optimization playbook
- Consolidate datasets - Use Multiple Table support to sync logically related tables through one connection, which pools their MAR onto a single discount curve.
- Use Append file using file modified time when files contain only new data - This is the most efficient option because it adds rows to the destination without extra overhead, making it ideal for incremental files.
- Use Upsert file using custom primary key for full snapshot files - Data is upserted based on your chosen primary key. Re-sync detection helps avoid repeated MAR charges, but upserts are still less cost-effective than appends because of the additional processing they require in your destination.
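To see why a stable primary key matters for full snapshot files, consider a simplified model of re-sync detection. The sketch below is not Fivetran's implementation. It assumes each row is a dictionary, that the hypothetical `id` column is the configured primary key, and that a row counts as active only when it is new or when any of its values changed since the previous sync.

```python
def count_active_rows(previous, current, primary_key):
    """Count rows in `current` that are new or changed relative to `previous`.

    Simplified illustration only; actual MAR accounting may differ.
    """
    previous_by_key = {row[primary_key]: row for row in previous}
    return sum(
        1
        for row in current
        # A missing key yields None, which never equals a row, so new rows count.
        if previous_by_key.get(row[primary_key]) != row
    )


# Hypothetical snapshot from the previous sync.
previous_snapshot = [
    {"id": 1, "status": "open"},
    {"id": 2, "status": "open"},
]

# Hypothetical full-refresh snapshot from the current sync.
current_snapshot = [
    {"id": 1, "status": "open"},    # unchanged
    {"id": 2, "status": "closed"},  # changed
    {"id": 3, "status": "open"},    # new
]

print(count_active_rows(previous_snapshot, current_snapshot, "id"))  # prints 2
```

In this model, the unchanged row with `id` 1 is not counted even though the whole file was re-delivered. That is the behavior the upsert recommendation relies on.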
Considerations
Fivetran is intentionally prescriptive about file configuration. Use our recommended file source configurations as the starting point for your file ingestion processes.
If you don't use the recommended approaches, you may experience increased MAR usage, degraded sync performance, or both.