Pangram verdict · v3.3
We believe that this document is fully human-written
AI likelihood · overall
HumanArticle text · 1,510 words · 6 segments analyzed
../postgres-lateral-makes-quite-a-good-dslPublished on: 2026-03-15By BenLateral joins are quite neat and you can build a query eDSL with them.Postgres (and a few other databases(?)) has a lesser known or used join type known as the lateral join. They allow columns from preceding FROM clauses to be used in subqueries that are being joined.As a (bad) example, take this pretty standard query joining two tables:SELECT * FROM users u INNER JOIN posts p ON u.id = p.user_id This can be rewritten as a lateral join with:SELECT * FROM users u CROSS JOIN LATERAL (select * from posts p where u.id = p.user_id) p2Notice that the join type changed to CROSS, normally this would result in a cartesian product, but the filter inside the subquery means each post is still paired up only with its user.In fact, both queries actually get the same query plan:+----------------------------------------------------------------+ | QUERY PLAN | |----------------------------------------------------------------| | Hash Join (cost=37.00..60.52 rows=1070 width=88) | | Hash Cond: (p.user_id = u.id) | | -> Seq Scan on posts p (cost=0.00..20.70 rows=1070 width=48) | | -> Hash (cost=22.00..22.00 rows=1200 width=40) | | -> Seq Scan on users u (cost=0.00..22.00 rows=1200 width=40) | +----------------------------------------------------------------+This is actually really useful as it provides a way to solve an expressivity problem that I think most ORMs and query builders have: that queries are difficult to compose.I think most composition techniques in other query builders boil down to either passing some query builder object through a sequence of functions, each of which appends a table to be joined and a where clause, or having functions that return subqueries which you then have to handle joining with your query yourself.
Neither of these are very good, the first way probably only works in dynamically typed languages, and in the latter you lack any way to provide a posts_of_users(user_id) -> [Post] ‘function’.Worst of all IMO are ORMs which abstract away relationships. It’s cool at first to be able to write select(User).with(Post) and have it build the join automatically, or even handle a M2M join. But as soon as you need to update something it becomes even more tedious than managing the join table yourself, I’ve witnessed before the need to first load all values before you can add or remove an entry using the normal interface, lest you accidentally instruct the ORM to delete all entries before inserting yours, or try to create duplicates on one side of a M2M.Also any ORM with some .with() method is going to be hell in a typed language, it either needs to use some macro magic to summon types such as FooWithBar for every possible combination, or maybe it produces some With<Foo, bar> type that results in some horrible types With<With<Foo, Bar>, Baz> that are impossible to work with without IDE assistance.I’m also not much of a fan of the ‘generate bindings’ approach of writing plain SQL queries (either in their own files, or inline) and using some compile time code execution to generate functions for the host language to use (e.g. sqlx). With these you’re back to writing SQL, but at least you have type safety. Also this is strictly less composable than any other form of building queries.I want to present a type of query builder library that is all of:Expressive: Queries doing complicated joins shouldn’t be inscrutable, ideally such a query should be clearer represented in the eDSL than in SQLComposable: Queries should be reusable, and parameterisableType safe: The eDSL works within the type system of the host language to ensure queries are correctly typedAlways generates valid SQL: This is particularly useful when it comes to aggregations, as using an aggregate operator changes the requirements of the entire part of the query it is used inWorks with user types: I think this is one of the draws to using ORMs, in that they handle the boilerplate of making user types work with the database.
The first library I’ve encountered that provides a way to build queries compositionally using lateral joins is the Haskell library Rel8, and it looks like this:postsForUser :: Expr UserId -> Query (Post Expr) postsForUser userId = do post <- each postSchema where_ $ post.userId ==. userId pure post usersAndPosts :: Query (User Expr, Post Expr) usersAndPosts = do user <- each userSchema post <- postsForUser user.id pure (user, post)This interface is extremely expressive to the point that it feels like you’re actually just manipulating data that is already in the host language, but in actuality you’re building a sql query that will look something like this:SELECT CAST("id0_1" AS "int4") as "_1/id", CAST("name1_1" AS "bpchar"(1)[]) as "_1/name", CAST("id0_3" AS "int4") as "_2/id", CAST("user_id1_3" AS "int4") as "_2/userId", CAST("body2_3" AS "bpchar"(1)[]) as "_2/body" FROM (SELECT * FROM (SELECT "id" as "id0_1", "name" as "name1_1" FROM "user" AS "T1") AS "T1", LATERAL (SELECT "id" as "id0_3", "user_id" as "user_id1_3", "body" as "body2_3" FROM "post" AS "T1") AS "T2" WHERE (("user_id1_3") = ("id0_1"))) AS "T1"The trick is that Expr UserId doesn’t contain any UserId, but instead contains (in this case) the SQL expression id0_1, which can then in postsForUser be used as if it were any other SQL expression such as a literal.A Query is then just a SELECT ... clause, and each line of the do block introducing a query adds a CROSS JOIN LATERAL ..., and a where_ just introduces a WHERE into the query being built.
Now this does of course have one glaring issue: id0_1 is only valid when used inside a sibling subquery of the subquery introducing the *_1 columns, which is also syntactically positioned afterwards. And usually in query builders we want to make it reasonably difficult to generate invalid queries, as debugging those is always a royal pain.In Haskell this is actually solved inherently: the Expr values are only ever available ‘inside’ the Query monad. There is no way to get an Expr ‘out’ of it and therefore any Expr can only be used after it has been introduced, and only in a scope equal to (within the same Query) or deeper than (within some Query created by calling a function) that in which it was introduced.Now I’m going to stop talking about Haskell here, because what I would actually like to talk about is the Rust library I wrote which replicates the behaviour of Rel8, which I will call rust-rel8 until I can think of a catchy project name.My library looks pretty much just like Rel8, but in rust:fn posts_of_user(user_id: Expr<i32>) -> Query<Post> { query::<Post<ExprMode>>(|q| { let post = q.q(Query::each(&Post::SCHEMA)); q.where_(user_id.equals(post.user_id.clone())); post }) }
let q = query::<(User<ExprMode>, Post<ExprMode>)>(|q| { let user = q.q(Query::each(&User::SCHEMA)); let post = q.q(posts_of_user(user.id.clone()));
(user, post) }) .order_by(|x| (x.clone(), sea_query::Order::Asc));
let rows = q.all(&mut *pool).await.unwrap();You can even do cool things such as aggregating the result of posts_of_user, so that the result is (User, Vec<Post>) as it comes out of the database:let q = query::<(User<ExprMode>, ListTable<Post<ExprMode>>)>(|q| { let user = q.q(Query::each(&User::SCHEMA)); let posts = q.q(posts_of_user(user.id.clone()).many()); // that .many() turns a Query<T> into Query<ListTable<T>>
(user, posts) }) .order_by(|x| (x.0.name.clone(), sea_query::Order::Asc));
let rows: Vec<(User, Vec<Post>)> = q.all(&mut *pool).await.unwrap();Even cooler, a left outer join is made using .optional():fn latest_post_of_user(user_id: Expr<i32>) -> Query<MaybeTable<Post>> { query::<Post<ExprMode>>(|q| { let post = q.q(Query::each(&Post::SCHEMA)); q.where_(user_id.equals(post.user_id.clone())); post }) .order_by(|x| (x.id.clone(), sea_query::Order::Desc)) .limit(1) .optional() }
let q = query::<(Expr<String>, Expr<Option<String>>)>(|q| { let user = q.q(Query::values(demo_users.shorten_lifetime())); let post = q.q(latest_post_of_user(user.id.clone())); // `post` here is `MaybeTable<Post>`, we can either project an `Expr<Option<T>>` out of it, // or we could also just return it from the query, which would give us `Option<T>` // after decoding the result. let post_content = post.project(|p| p.contents.clone()); (user.name, post_content) }) .order_by(|x| (x.clone(), sea_query::Order::Asc));
let rows = q.all(&mut *pool).await.unwrap();
assert_eq!( vec![ ("Huldra".to_owned(), None), ("Leschy".to_owned(), Some("Quak!".to_owned())), ("Undine".to_owned(), Some("Croak".to_owned())) ], rows )You can even declare your own tables:#[derive(Debug, PartialEq, rust_rel8_derive::TableStruct)] struct User<'scope, Mode: TableMode = ExprMode> { id: Mode::T<'scope, i32>, name: Mode::T<'scope, String>, }
impl<'scope> User<'scope, NameMode> { const SCHEMA: TableSchema<Self> =
TableSchema { name: "users", columns: User { id: "id", name: "name", }, }; }
#[derive(Debug, PartialEq, rust_rel8_derive::TableStruct)] struct Post<'scope, Mode: TableMode = ExprMode> { id: Mode::T<'scope, i32>, user_id: Mode::T<'scope, i32>, body: Mode::T<'scope, String>, }
impl<'scope> Post<'scope, NameMode> { const SCHEMA: TableSchema<Self> = TableSchema { name: "posts", columns: Post { id: "id", user_id: "user_id", body: "body", }, }; }The .aggregate builder also is designed so you can build aggregations without the pain normally encountered when constructing them in SQL:let q = query::<Two<_, i32, i32>>(|q| { let a = q.q(Query::values([ Two { a: 1, b: 1 }, Two { a: 1, b: 2 }, Two { a: 1, b: 3 }, Two { a: 1, b: 4 }, Two { a: 2, b: 1 }, Two { a: 2, b: 2 }, Two { a: 3, b: 1 }, ])); a }) .aggregate::<(Expr<i32>, ListTable<Expr<i32>>, ListTable<Two<_, i32, i32>>)>(|a, e| { let x = a.group_by(e.a.clone()); let y = a.array_agg(e.a.clone().add(Expr::lit(1i32))); let as_array = a.array_agg(e); // .aggregate enforces that everything in the output must have passed through `a`, // and therefore must either be used as part of a group by or in an aggregation function. (x, y, as_array) });
let rows = q.all(&mut *pool).await.unwrap();
assert_eq!( vec![ ( 1, vec![2, 2, 2, 2], vec![