Returning related rows in a single SQL query using JSON
When building database-backed applications you’ll often find yourself wanting to return a row from the database along with its related rows.
A few examples:
- Retrieving a list of congressional legislators and their terms, following a foreign key relationship
- Return blog entries and their tags in one go, via a many-to-many table
You can do this in SQLite using the json_group_array() aggregation function. A couple of examples.
Legislators and their terms, via a foreign key
Simplified schema for this database:
CREATE TABLE [legislators] ( [id] TEXT PRIMARY KEY, [name] TEXT, [bio_birthday] TEXT);CREATE TABLE [legislator_terms] ( [legislator_id] TEXT REFERENCES [legislators]([id]), [type] TEXT, [state] TEXT, [start] TEXT, [end] TEXT, [party] TEXT);
Here’s a query that returns each legislator along with a JSON array of their terms:
select legislators.id, legislators.name, json_group_array(json_object( 'type', legislator_terms.type, 'state', legislator_terms.state, 'start', legislator_terms.start, 'end', legislator_terms.end, 'party', legislator_terms.party )) as terms, count(*) as num_termsfrom legislators join legislator_terms on legislator_terms.legislator_id = legislators.id group by legislators.idorder by idlimit 10
And the result:

Note that this query does group by legislators.id
which is allowed in SQLite but may not work in other databases, which might require group by legislators.id, legislators.name
instead.
Tags on blog entries, via a many-to-many table
Simplified schema:
CREATE TABLE [blog_entry] ( [id] INTEGER PRIMARY KEY, [title] TEXT);
CREATE TABLE [blog_tag] ( [id] INTEGER PRIMARY KEY, [tag] TEXT);
CREATE TABLE [blog_entry_tags] ( [id] INTEGER PRIMARY KEY, [entry_id] INTEGER, [tag_id] INTEGER, FOREIGN KEY([entry_id]) REFERENCES [blog_entry]([id]), FOREIGN KEY([tag_id]) REFERENCES [blog_tag]([id]));
Query to retrieve entries with their tags:
select blog_entry.id, blog_entry.title, json_group_array(json_object('tag', blog_tag.tag)) as tagsfrom blog_entry join blog_entry_tags on blog_entry.id = blog_entry_tags.entry_id join blog_tag on blog_tag.id = blog_entry_tags.tag_idgroup by blog_entry.idorder by blog_entry.id desc
id | title | tags |
---|---|---|
8191 | I don’t know how to solve prompt injection | [{“tag”:“ai”},{“tag”:“security”},{“tag”:“openai”}] |
8190 | Weeknotes: Datasette Lite, s3-credentials, shot-scraper, datasette-edit-templates and more | [{“tag”:“shotscraper”},{“tag”:“datasette”},{“tag”:“plugins”},{“tag”:“datasettelite”},{“tag”:“projects”},{“tag”:“s3credentials”},{“tag”:“weeknotes”}] |
8189 | Prompt injection attacks against GPT-3 | [{“tag”:“ai”},{“tag”:“gpt3”},{“tag”:“security”},{“tag”:“openai”}] |
There’s a subtle bug in the above: if an entry has no tags at all it will be excluded from the query results entirely.
You can fix that using left joins like this:
select blog_entry.id, blog_entry.title, json_group_array(json_object('tag', blog_tag.tag)) as tagsfrom blog_entry left join blog_entry_tags on blog_entry.id = blog_entry_tags.entry_id left join blog_tag on blog_tag.id = blog_entry_tags.tag_idwhere blog_entry.id < 4group by blog_entry.idorder by blog_entry.id desc
This almost works, but it outputs the following returning {"tag": null}
for entries with no tags:
id | title | tags |
---|---|---|
3 | Todo list | [{“tag” |
2 | Blogging aint easy | [{“tag” |
1 | WaSP Phase II | [{“tag” |
David Fetter showed me the solution:
select blog_entry.id, blog_entry.title, json_group_array( json_object('tag', blog_tag.tag) ) filter ( where blog_tag.tag is not null ) as tagsfrom blog_entry left join blog_entry_tags on blog_entry.id = blog_entry_tags.entry_id left join blog_tag on blog_tag.id = blog_entry_tags.tag_idgroup by blog_entry.idorder by blog_entry.id
That extra filter on the aggregation does the trick!
Other databases
Other databases are capable of the same thing, but using different functions. PostgreSQL has json_agg() for example, which is also available in Django as JSONBAgg.
Here’s an equivalent query in PostgreSQL syntax:
select blog_entry.id, title, slug, created, coalesce(json_agg(json_build_object(blog_tag.id, blog_tag.tag)) filter ( where blog_tag.tag is not null ), json_build_array()) as tagsfrom blog_entry left join blog_entry_tags on blog_entry.id = blog_entry_tags.entry_id left join blog_tag on blog_entry_tags.tag_id = blog_tag.idgroup by blog_entry.idorder by blog_entry.id
See that running here in django-sql-dashboard
.