Subqueries¶
DataFusion supports EXISTS
, NOT EXISTS
, IN
, NOT IN
and Scalar Subqueries.
The examples below are based on the following table.
❯ select * from x;
+----------+----------+
| column_1 | column_2 |
+----------+----------+
| 1 | 2 |
+----------+----------+
EXISTS¶
The EXISTS
syntax can be used to find all rows in a relation where a correlated subquery produces one or more matches
for that row. Only correlated subqueries are supported.
❯ select * from x y where exists (select * from x where x.column_1 = y.column_1);
+----------+----------+
| column_1 | column_2 |
+----------+----------+
| 1 | 2 |
+----------+----------+
1 row in set.
NOT EXISTS¶
The NOT EXISTS
syntax can be used to find all rows in a relation where a correlated subquery produces zero matches
for that row. Only correlated subqueries are supported.
❯ select * from x y where not exists (select * from x where x.column_1 = y.column_1);
0 rows in set.
IN¶
The IN
syntax can be used to find all rows in a relation where a given expression’s value can be found in the
results of a correlated subquery.
❯ select * from x where column_1 in (select column_1 from x);
+----------+----------+
| column_1 | column_2 |
+----------+----------+
| 1 | 2 |
+----------+----------+
1 row in set.
NOT IN¶
The NOT IN
syntax can be used to find all rows in a relation where a given expression’s value can not be found in the
results of a correlated subquery.
❯ select * from x where column_1 not in (select column_1 from x);
0 rows in set.
Scalar Subquery¶
A scalar subquery can be used to produce a single value that can be used in many different contexts in a query. Here is an example of a filter using a scalar subquery. Only correlated subqueries are supported.
❯ select * from x y where column_1 < (select sum(column_2) from x where x.column_1 = y.column_1);
+----------+----------+
| column_1 | column_2 |
+----------+----------+
| 1 | 2 |
+----------+----------+
1 row in set.