SQLite handles aggregation differently from many other RDBMSs. Consider the following table and values:
create table foo (a int, b int);
insert into foo (a, b) values (1, 10);
insert into foo (a, b) values (2, 11);
insert into foo (a, b) values (3, 12);
If I query it thus:
select a, group_concat(b) from foo;
Normally, I would expect an error, because I haven’t included column ‘a’ in a GROUP BY clause. Below is the error produced by SQL Server (PostgreSQL emits something similar).
Column 'foo.a' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
SQLite, on the other hand, just goes along with it and produces this result:
3|10,11,12
What good is this? How did it pick the value for column ‘a’? If we add another row, a pattern seems to emerge: tentatively, it looks like it uses the most recently added row, although the choice could simply be indeterminate.
sqlite> insert into foo (a, b) values (2, 13);
sqlite> select a, group_concat(b) from foo;
2|10,11,12,13
This seems like a bug to me, but I’m wondering what our database experts here have to say about it.
(I’m using SQLite version 3.6.16 on Ubuntu.)
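The behavior described above can be reproduced from Python's standard `sqlite3` module. This is a minimal sketch that rebuilds `foo` in an in-memory database; the variable names are illustrative only. It prints the value of ‘a’ rather than assuming it, since which row a bare column is taken from may differ between SQLite versions. The last two queries show forms that stricter databases would also accept: aggregating ‘a’ explicitly, or grouping by it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table foo (a int, b int)")
conn.executemany(
    "insert into foo (a, b) values (?, ?)",
    [(1, 10), (2, 11), (3, 12)],
)

# Bare column 'a' next to an aggregate, with no GROUP BY.
# SQLite returns a single row; 'a' is taken from one of the table's rows.
bare_a, bare_b = conn.execute("select a, group_concat(b) from foo").fetchone()
print(bare_a, bare_b)

# Aggregating 'a' explicitly makes its value well defined.
max_a, max_b = conn.execute("select max(a), group_concat(b) from foo").fetchone()
print(max_a, max_b)

# Grouping by 'a' yields one row per distinct value of 'a'.
grouped = conn.execute(
    "select a, group_concat(b) from foo group by a order by a"
).fetchall()
print(grouped)
```

If the value of ‘a’ matters, the explicit forms are the safer choice, because they do not depend on which row SQLite happens to pick.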
This behavior is useful when you select several columns alongside an aggregate but only need the query engine to test one of them to form the groups. Take, for example, an Orders table and an OrderDetails table.
In many other databases, we would need to include both OrderID and OrderDate in the GROUP BY. The database would then group by both columns, which is redundant in this case, because every row for a given OrderID carries the same OrderDate. By grouping only on OrderID, we get the same results with less code and, potentially, less work for the query engine.
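The idea can be sketched with Python's standard `sqlite3` module. The schema here is an assumption made for illustration: only the table names, OrderID, and OrderDate come from the text above, while the `Quantity` column, the sample rows, and the variable names are hypothetical. The sketch runs the query both ways and compares the results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: Quantity and the sample data are made up for this sketch.
conn.execute("create table Orders (OrderID integer primary key, OrderDate text)")
conn.execute("create table OrderDetails (OrderID int, Quantity int)")
conn.executemany(
    "insert into Orders (OrderID, OrderDate) values (?, ?)",
    [(1, "2009-07-01"), (2, "2009-07-02")],
)
conn.executemany(
    "insert into OrderDetails (OrderID, Quantity) values (?, ?)",
    [(1, 5), (1, 3), (2, 7)],
)

base = """
    select o.OrderID, o.OrderDate, sum(d.Quantity)
    from Orders o
    join OrderDetails d on d.OrderID = o.OrderID
    group by {columns}
    order by o.OrderID
"""

# SQLite accepts OrderDate as a bare column when grouping on OrderID alone.
short_form = conn.execute(base.format(columns="o.OrderID")).fetchall()

# The form stricter databases require: every non-aggregated column is grouped.
full_form = conn.execute(base.format(columns="o.OrderID, o.OrderDate")).fetchall()

print(short_form)
print(full_form)
```

The two forms agree here only because OrderDate is fully determined by OrderID. If a bare column could vary within a group, the shorter form would return the value from whichever row SQLite picks.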