Unveiling the Difference: LATERAL FLATTEN(...) vs. TABLE(FLATTEN(...)) in Snowflake
Snowflake's powerful FLATTEN
function allows you to efficiently work with nested data structures, particularly arrays and objects. While both LATERAL FLATTEN(...)
and TABLE(FLATTEN(...))
offer this capability, they differ significantly in their application and impact on data processing. This article delves into the core differences and provides practical examples to illustrate their use cases.
Understanding the Scenario
Imagine you have a table named orders
containing customer orders, each with an array of items
representing the products purchased. Your goal is to access and analyze individual items from each order. This is where FLATTEN
comes into play.
Original Code:
CREATE TABLE orders (
order_id INT,
customer_id INT,
items ARRAY
);
INSERT INTO orders (order_id, customer_id, items) VALUES
(1, 10, ['apple', 'banana']),
(2, 20, ['orange', 'grape', 'pear']);
Unveiling the Differences
1. LATERAL FLATTEN(...):
- Purpose: Used to expand an array within a single row into multiple rows, maintaining the context of the original row.
- Behavior: Introduces a new row for each element in the array, inheriting values from the parent row.
- Syntax:
LATERAL FLATTEN(array_column) AS element_name
- Example:
SELECT o.order_id, o.customer_id, f.value AS item
FROM orders o
LATERAL FLATTEN(o.items) AS f;
This query will produce the following result:
order_id | customer_id | item |
---|---|---|
1 | 10 | apple |
1 | 10 | banana |
2 | 20 | orange |
2 | 20 | grape |
2 | 20 | pear |
2. TABLE(FLATTEN(...):
- Purpose: Used to create a new table-like object from the flattened array, allowing you to perform operations on the flattened data independently.
- Behavior: Creates a temporary table-like object with columns based on the flattened array elements.
- Syntax:
TABLE(FLATTEN(array_column))
- Example:
SELECT *
FROM TABLE(FLATTEN(orders.items)) AS t
WHERE t.VALUE = 'grape';
This query will return only the row containing "grape".
Key Takeaways
- LATERAL FLATTEN: Ideal for scenarios where you need to access individual elements from an array while preserving the original row context.
- TABLE(FLATTEN): Suitable for situations where you want to manipulate the flattened data independently and perform aggregations or filtering operations.
Beyond the Basics
LATERAL FLATTEN
is typically more efficient for simple data manipulation tasks.TABLE(FLATTEN)
allows for complex queries and aggregations on the flattened data.- Both options provide valuable tools for effectively dealing with nested data structures in Snowflake.
Understanding these differences will allow you to select the most appropriate approach for your specific data analysis needs.
References: