PostgreSQL MAR Management

Use Case

You want to understand how the functioning of PostgreSQL database connections affects Monthly Active Rows (MAR).

Environment

Connector Overview

The initial sync (first-ever historical sync) replicates the entire PostgreSQL data into your destination. Depending on the size of your database, this can take anywhere from a few days to a couple of weeks. This historical sync doesn't contribute toward MAR.

If you introduce additional database tables later, you won't incur MAR.

You can configure your connection(s) in multiple ways. Creating one single connection is easier to manage, whereas creating separate connections lets you track and isolate changes coming to the database and keep them independent of each other:

You can set up one connection for static tables and another connection for dynamic tables to track MAR.
You can group your connections by any arbitrary group. For example, business unit, function, or relationship.
You can include small and large tables in a connection.

Contributing factors

There are four primary actions that contribute to MAR:

A new row is added to an existing table
A value is updated for an existing row in a table
A value is deleted within a table
A new column (with values) is added to a table

Managing MAR

You have an unlimited number of free customer-triggered re-syncs per connection each month. For more information about MAR usage, see our pricing documentation.
If you synced a table initially, later excluded it from your syncs, left the table for some time and didn't refresh it, and then later re-sync the table, every primary key row counts towards MAR.
If separate connections sync the same data from the same source (with the same primary keys), they contribute separately towards your MAR. This means if you have two or more connections based on the same connector that sync from one source to multiple destinations, we count each connection’s MAR separately.
If a table does not have a primary key, we create a synthetic (hash) primary key. The composition of this primary key differs by source. We determine the MAR for these tables using their synthetic primary keys.
Post-load transformations do not count towards MAR.
Replication frequency doesn't impact your monthly active rows because we calculate MAR by determining the unique rows that are updated.
If you delete a row that has a primary key, and then, after a while, create another row with the same primary key, this counts toward only one MAR.
In a table without a primary key, if you remove or add columns from which we generate the synthetic primary key, you incur MAR for every change that you made.
Make sure the destination is always connected. If Fivetran can't write data to your destination, we cache this data for up to twenty-four hours before discarding it.
If the destination is disconnected for more than twenty-four hours, we do the following (depending on how long the WAL retains data for):
- If the WAL retains data for less than twenty-four hours, we perform a historical re-sync, which incurs MAR.
- If the WAL retains data long enough, we may not need to perform a re-sync because we can read through the logs.
Perform regular vacuuming of PostgreSQL so that the XMIN is always in an unfrozen state; otherwise, we have to perform a full historical re-sync, which incurs MAR.
One of the best practices is to modify only new records in PostgreSQL. Modifying very old records every sync causes a high percentage of these tables to incur MAR.
Block schemas or tables from syncing if they don't contain valuable information.
Column blocking to reduce MAR is only applicable to tables that do not have primary keys in PostgreSQL. This is because we create a synthetic primary key based on a commonly changing column and other unique keys. Blocking the commonly changing column reduces MAR.
We recommend that you not pause your PostgreSQL connections for a long period of time because we then need to perform a historical re-sync. We cannot perform a log read because the logs have expired.