This is how the data is formatted:
item_name | item_serial | sub_group | conc_stuff | other_data | more_data
----------+-------------+-----------+------------+------------+-----------
foo       | bar-01-a    | widget    | b-3        | towel      | dent
foo       | bar-02-a    | widget    | a-1        | 42         | mouse
foo       | bar-03-a    | widget    | p-1        | babel      | dolphin
foo3      | bar-21-f    | widget    | f-1        | 42         | marvin
foo3      | bar-22-f    | widget    | x-1        | poetry     | vogon
I have gotten the query to perform the way I want it to; the problem is that I need to return more data.
SELECT item_name,
       array_to_string(array_agg(conc_stuff), ',') AS stuff
FROM dataset
WHERE some_selector = 'X'
GROUP BY item_name
ORDER BY item_name;
I have tried what seems simple yet logical to me:
SELECT item_name,
       item_serial,
       sub_group,
       array_to_string(array_agg(conc_stuff), ',') AS stuff
FROM dataset
WHERE some_selector = 'X'
GROUP BY item_name
ORDER BY item_name;
I need to return something that looks like this:
item_name | item_serial | sub_group | stuff
----------+-------------+-----------+-------------
foo       | bar-01-a    | widget    | a-1,b-3,p-1
foo3      | bar-21-f    | widget    | f-1,x-1,g-5
foo6      | bar-81-z    | widget    | r-1,d-8,w-0
instead of just this:
item_name | stuff
----------+--------------
foo       | a-1,b-3,p-1
foo3      | f-1,x-1,g-5
foo6      | r-1,d-8,w-0
When I try to add additional fields to the query, I get:
ERROR: column "stuff.item_serial" must appear in the GROUP BY clause or be used in an aggregate function
But I don't want to GROUP BY item_serial, I just want it to be returned with the aggregate, right?
Do I need to run a subquery? I'm sure this is simple. If there are multiple methods, which is most efficient? Some of the text I will be concatenating is coordinates (a LARGE string of text).
You need to pick one value for each name you have. You can’t have each name returned only once but the item_serial value returned multiple times.
Picking one value out of those that are there for a group is done through aggregate functions:
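A minimal sketch of such a query, assuming min() is the aggregate used to pick the value; the table, the columns, and the WHERE filter are the ones from the question:

```sql
-- Sketch only: min() is an assumed choice of aggregate for the
-- non-grouped columns; any aggregate that returns a single value works.
SELECT item_name,
       min(item_serial) AS item_serial,  -- one serial per item_name
       min(sub_group)   AS sub_group,    -- one sub_group per item_name
       array_to_string(array_agg(conc_stuff), ',') AS stuff
FROM dataset
WHERE some_selector = 'X'
GROUP BY item_name
ORDER BY item_name;
```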
This picks the “first” item_serial and sub_group for each name. If you want the last value, use max instead.
But the important thing to understand is that you have to pick one value for the non-grouped columns. And you need to tell the DBMS exactly which one it should use by supplying an aggregate function which picks one value.
SQLFiddle example: http://www.sqlfiddle.com/#!1/58009/1