How Can I Transform JSON Data?
Use Case
You have nested JSON data in Snowflake that you need to parse to use the data for analytics. You are looking for an example of how this works.
Environment
Snowflake
Recommendation
Fivetran doesn't automatically unpack nested JSON objects to separate columns or tables in the Snowflake destination. We recommend using Snowflake’s native FLATTEN function to parse the values you are looking for into separate columns. Follow the process to Flatten and Query JSON Data.
Before manipulating your production data, try this out on sample data to familiarize yourself with the process:
In Snowflake, create sample JSON data using the following script:
CREATE OR REPLACE SCHEMA TEST; CREATE OR REPLACE TABLE TEST.JSON_TEST AS SELECT parse_json(column1) AS FT FROM VALUES ('{more: {test:{ "ID": 2, "name": "Fivetran", "function": "Data Pipeline As a Service", "code": { "abbr": "5T", "st": { stattrib:"1" } } } } } ') as raw_json;
Run the following command and analyze the output:
SELECT FT:more::STRING as more FROM TEST.JSON_TEST;
Flatten the JSON completely using the following command:
SELECT FT:more.test.ID::INTEGER as id, FT:more.test.name::STRING as name, FT:more.test.function::STRING as "function", FT:more.test.code.abbr::STRING as codeabbr, FT:more.test.code.st.stattrib::STRING as codeststattrib FROM JSON_TEST;
Considerations
JSON objects don’t always have a fixed format, and keys frequently change, making it challenging to know what keys exist.
To understand your JSON structure, use a flatten function:
select
upper(regexp_replace(f.path, '\\[[0-9]+\\]', '[]')) as path
, typeof(f.value) as type
, count(*) as record_count
from
TEST.JSON_TEST
, lateral flatten(parse_json(FT), recursive=>true) f
group by 1, 2
order by 1, 2;