Events
Event tracking measures consumer behavior on a website, in an app, or in emails, usually through pixel tracking. Fivetran integrates with several services that collect events sent from your website, mobile app, or server, and then loads these events into your destination.
Supported services
Fivetran supports the following event tracking services:
- Amazon Kinesis Firehose
- Apache Kafka
- AWS MSK
- Azure Event Hubs
- Confluent Cloud
- Heroku Kafka
- Snowplow Analytics (open source)
- Webhooks
See our Swagger documentation for more information about webhook endpoints.
Sync overview
After you instrument the tracking code on your website, server, or mobile application, Fivetran's event pipeline collects, enriches, and loads all of this data into your destination in near real time. In addition to being easy to set up, Fivetran is built to scale to hundreds of millions of events per day and automatically retains a secure backup of your event data. If your destination is ever compromised, we can reload all your events for you.
The following diagram outlines the process for event collection at Fivetran:
When we receive the events in our collection service, we first buffer the events in the queue and store them in our cloud storage buckets before writing the events to a temporary file. We push the events (the temporary file) to the destination when the sync runs. For more information on the data retention period of different connectors, see our Data retention period documentation.
NOTE: We do not sync the events that are in our queue or are getting stored in the storage buckets while a sync is in progress. We process these events during the next sync.
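The queue-then-sync behavior described above can be illustrated with a small toy model. This is only a sketch for building intuition, not Fivetran's implementation: the `EventBuffer` class and its method names are hypothetical, and the real pipeline also stages events in cloud storage and a temporary file.

```python
from dataclasses import dataclass, field


@dataclass
class EventBuffer:
    """Hypothetical toy model: events wait in a queue until a sync picks them up."""

    queued: list = field(default_factory=list)  # received, not yet part of a sync
    loaded: list = field(default_factory=list)  # written to the destination

    def receive(self, event: dict) -> None:
        # New events always go to the queue, even while a sync is running.
        self.queued.append(event)

    def begin_sync(self) -> list:
        # Snapshot the events this sync will load and start a fresh queue.
        batch, self.queued = self.queued, []
        return batch

    def finish_sync(self, batch: list) -> None:
        # Push the snapshot to the destination.
        self.loaded.extend(batch)


buffer = EventBuffer()
buffer.receive({"id": 1})
buffer.receive({"id": 2})

batch = buffer.begin_sync()   # the sync starts with events 1 and 2
buffer.receive({"id": 3})     # arrives mid-sync, so it waits for the next sync
buffer.finish_sync(batch)
```

After this run, events 1 and 2 are in `loaded`, while event 3 remains in `queued` until the next sync.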
Supported regions
By default, we store event data in a cloud storage service in one of the following locations:
- EU region (for destinations run in the EU region)
- UK region (for destinations run in the UK region)
- US region (for all other destinations)
NOTE: For the Webhooks connector, you can configure Fivetran to store event data in a bucket you manage.
Event collector servers
Our collection service uses regional collector servers to receive events and store them temporarily. We expose the collector servers using the webhooks.fivetran.com sub-domain and use Amazon Route 53 as our Domain Name System (DNS) server.
We support geolocation routing for the requests to the webhooks.fivetran.com sub-domain. We route the requests to the appropriate regional collectors based on the location from which the DNS queries originate. We store and process your data within the data processing location you select in the destination setup form.
For example:
- Data processing location is US. When we receive a request to webhooks.fivetran.com, our collection service routes the request to the US collector server. We store your data in a storage bucket in the US location.
- Data processing location is EU. When we receive a request to webhooks.fivetran.com from a client in the US, our US collector server collects the request and then stores your data in a storage bucket in the EU location.
- When we receive a request to webhooks.fivetran.com originating from outside the supported regions, we route the request to the US collector server.
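The routing examples above separate two decisions: which collector receives a request (based on where the request originates) and where the data is stored (based on the data processing location). The following sketch makes that separation explicit. The function name, the region codes, and the assumption that collectors exist in exactly the US, EU, and UK regions are illustrative and not taken from Fivetran's implementation.

```python
# Assumption: collector regions mirror the storage regions listed under "Supported regions".
COLLECTOR_REGIONS = {"US", "EU", "UK"}


def route_request(client_region: str, data_processing_location: str) -> dict:
    """Hypothetical helper returning the collector and storage regions for a request."""
    # The collector is chosen from the origin of the request, falling back to US.
    collector = client_region if client_region in COLLECTOR_REGIONS else "US"
    # Storage follows the destination: EU and UK stay in region, all others use US.
    storage = (
        data_processing_location
        if data_processing_location in {"EU", "UK"}
        else "US"
    )
    return {"collector": collector, "storage": storage}


# A US client sending events for an EU destination.
print(route_request("US", "EU"))
```

The point of the sketch is that the collector region and the storage region can differ, as in the second example above.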
We support handshake requests for sources that require a response containing validation data from our collector servers before they establish the connection or create webhooks. Refer to the individual connector documentation to check whether handshake requests are supported for the connector you are using.
Data retention period
Fivetran retains event data from the Webhooks connector and other connectors that use webhooks. We store that data so that it can be re-synced if needed. The data retention period depends on the connector type.
Connector | Data Retention Period | Note |
---|---|---|
AppsFlyer | 30 days | |
Eloqua | 30 days | |
GitHub | 30 days | |
Greenhouse | 30 days | |
Help Scout | 30 days | |
HubSpot | 30 days | |
Intercom | 30 days | |
Iterable | 30 days | |
Jira | 30 days | |
Pipedrive | 30 days | |
Recharge | 30 days | |
Branch | Persistent | |
Mandrill | Persistent | |
SendGrid | Persistent | |
Shopify | Persistent | |
Snowplow | Persistent | |
SurveyMonkey | Persistent | |
Webhooks | Persistent | |
Zendesk Support | Persistent | |
Updating data
Fivetran takes the new data and appends it to the existing tables. When we encounter events that cannot be parsed (such as incomplete data), we skip those events and alert the user.
Fivetran continuously collects data and stores it long term in a staging S3 bucket. By default, we load the changes at 15-minute intervals. You can change this interval in your dashboard.
Fivetran does not propagate deleted data for events, because the events have already happened.
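The append-and-skip behavior described in this section can be sketched as follows. This is a minimal illustration that assumes JSON messages. The `append_events` function is hypothetical, and returning the skipped messages stands in for the alert mentioned above.

```python
import json


def append_events(table: list, raw_messages: list) -> list:
    """Append parsable events to `table` and return the messages that were skipped."""
    skipped = []
    for raw in raw_messages:
        try:
            event = json.loads(raw)
        except json.JSONDecodeError:
            # Incomplete or malformed data: skip it instead of failing the sync.
            skipped.append(raw)
            continue
        if not isinstance(event, dict):
            # This sketch only treats JSON objects as events.
            skipped.append(raw)
            continue
        # Existing rows are never updated or deleted; new events are only appended.
        table.append(event)
    return skipped


table = [{"id": 1}]
skipped = append_events(table, ['{"id": 2}', '{"id": 3'])
```

Here the truncated second message is skipped and reported, and the existing row for event 1 is left untouched.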
Schema changes
Fivetran does not propagate schema changes. If you stop tracking a metric, new rows will have a null value for that metric. All the historical data still exists. If you stop tracking a table, the table will still exist, but you won't get any new rows. It's not possible to delete a row in the source, because the event has already occurred.
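To illustrate what happens when a metric is no longer tracked, the sketch below maps events onto a fixed set of destination columns. The `to_row` helper and the column names are hypothetical and only model the behavior described above.

```python
def to_row(event: dict, columns: list) -> dict:
    """Map an event onto existing columns; metrics missing from the event become None."""
    return {column: event.get(column) for column in columns}


# Hypothetical destination columns created while scroll_depth was still tracked.
columns = ["event_id", "page", "scroll_depth"]

old_row = to_row({"event_id": 1, "page": "/home", "scroll_depth": 80}, columns)
new_row = to_row({"event_id": 2, "page": "/pricing"}, columns)  # metric no longer sent
```

The historical row keeps its `scroll_depth` value, while the new row carries a null (`None`) in that column.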
Supported message types
Connectors | Message types |
---|---|
Amazon Kinesis Firehose | JSON |
Apache Kafka | JSON, Avro, Protobuf, and Text |
AWS MSK | JSON, Avro, Protobuf, and Text |
Azure Event Hubs | JSON and Text |
Azure Service Bus | JSON, Text, Protobuf, and XML |
Confluent Cloud | JSON, Avro, Protobuf, and Text |
Heroku Kafka | JSON and Text |
XML format support
Fivetran supports XML format messages for the following connectors:
- Apache Kafka
- AWS MSK
- Azure Event Hubs
- Confluent Cloud
- Heroku Kafka
- Azure Service Bus
To sync XML messages, select the Message Type as Text in the connector setup form. We sync XML data to a TEXT data type column in your destination.
You can parse the XML data from the TEXT column using destination-native methods.
NOTE: Standard limitations, such as the maximum data size allowed in TEXT columns for different destinations, apply to the data.
Parse XML from TEXT
Snowflake
You can parse and access the XML data by using a combination of Snowflake functions such as PARSE_XML, CHECK_XML, TO_VARIANT, XMLGET, and GET, as in the following sample query.
Sample Query
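WITH xml_variant AS (
    -- Parse only the rows that contain valid XML (CHECK_XML returns NULL for valid XML)
    SELECT TO_VARIANT(PARSE_XML(<xml_text_column>)) AS variant
    FROM <xml_text_table>
    WHERE CHECK_XML(<xml_text_column>) IS NULL
)
-- Extract the content of the requested tag; <instance_num> is optional
SELECT GET(XMLGET(variant, <tag_name> [, <instance_num>]), '$')
FROM xml_variant;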
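If you want to check the expected result of such a query against a sample message outside the destination, the same steps (skip invalid XML, pick an instance of a tag, read its content) can be approximated in Python with the standard library. This is a rough client-side sketch, not a destination-native method. The `xml_get_text` function is hypothetical, and it assumes the tag is a direct child of the root element and has text content.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def xml_get_text(xml_text: str, tag_name: str, instance_num: int = 0) -> Optional[str]:
    """Return the text of the instance_num-th <tag_name> child, or None."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # Plays the role of the CHECK_XML filter: invalid XML is skipped.
        return None
    # Direct children of the root only (an assumption about how the tag is nested).
    matches = root.findall(tag_name)
    if instance_num < 0 or instance_num >= len(matches):
        return None
    return matches[instance_num].text


sample = "<order><item>keyboard</item><item>mouse</item></order>"
second_item = xml_get_text(sample, "item", 1)
```

With this sample, `second_item` holds the text of the second `<item>` element, and a malformed message yields `None` instead of raising an error.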