James Gardner: Home > Blog > 2007 > Concatenating ref table fields in...

Concatenating ref table fields in PostgreSQL using a custom aggregate function

Posted:2007-09-19 19:19
Tags:sql

Frequently when writing web applications with a RDBMS back end I come across a situation where I have three layer one to many mapping which I want to represent in a single table.

As an example of this structure imagine that in a particular application a project can have multiple topics. Each topic can have multiple sub topics. I want to be able to represent every topic and subtopic in a single table where the first column in each row is the topic name and the second column is a comma-separated list of sub-topics associated with the particular topic from the first column.

Here are my tables showing only the topics and subtopics for a particular project:

test_database=# select * from topic;
 uid | name
-----+------
   1 | A
   2 | B
   3 | C
   4 | D
(4 rows)

test_database=# select * from subtopic;
 topic | name
-------+------
     1 | a
     2 | b
     3 | b
     3 | c
     3 | d
     4 | e
     4 | a
(7 rows)

Here's the join you might try first:

SELECT topic.name as topics, subtopic.name AS subtopics
FROM topic
LEFT JOIN subtopic ON subtopic.topic = topic.uid;

This gives the following:

 topics | subtopics
--------+-----------
 A      | a
 B      | b
 C      | b
 C      | c
 C      | d
 D      | e
 D      | a
(7 rows)

This isn't quite what we want because we want to collapse the duplicate topics so that the subtopics form a single field. To do this we need to use a GROUP BY clause to group by the topic. We'll then need an aggregate function to turn the multiple sub-topics into an array and finally we'll need a function to convert the array to a string. Luckily the PostgreSQL array_to_string function can handle the last part and it works for converting integers and other types in an array to strings too:

test_database=# SELECT array_to_string(ARRAY['a','b','c'], ', ');
 array_to_string
-----------------
 a, b, c
(1 row)

All we need is the aggregate to turn the fields into an array. The PostgreSQL documentation suggests an aggregate you can create for this purpose:

CREATE AGGREGATE array_accum (
    sfunc = array_append,
    basetype = anyelement,
    stype = anyarray,
    initcond = '{}'
);

Putting everything together we have:

SELECT
    topic.name as topics,
    array_to_string(array_accum(subtopic.name), ', ') AS subtopics
FROM topic
LEFT JOIN subtopic ON subtopic.topic = topic.uid
GROUP BY topic.name
ORDER BY topic.name;

and here's the result:

 topics | subtopics
--------+-----------
 A      | a
 B      | b
 C      | b, c, d
 D      | e, a
(4 rows)

If you want to test this example yourself here's the SQL you'll need:

CREATE AGGREGATE array_accum (
    sfunc = array_append,
    basetype = anyelement,
    stype = anyarray,
    initcond = '{}'
);

CREATE TABLE topic(
    uid INTEGER,
    name VARCHAR
);
CREATE TABLE subtopic(
    topic INTEGER,
    name VARCHAR
);

INSERT INTO topic VALUES (1,'A');
INSERT INTO topic VALUES (2,'B');
INSERT INTO topic VALUES (3,'C');
INSERT INTO topic VALUES (4,'D');

INSERT INTO subtopic VALUES (1,'a');
INSERT INTO subtopic VALUES (2,'b');
INSERT INTO subtopic VALUES (3,'b');
INSERT INTO subtopic VALUES (3,'c');
INSERT INTO subtopic VALUES (3,'d');
INSERT INTO subtopic VALUES (4,'e');
INSERT INTO subtopic VALUES (4,'a');

SELECT
    topic.name as topics,
    array_to_string(array_accum(subtopic.name), ', ') AS subtopics
FROM topic
LEFT JOIN subtopic ON subtopic.topic = topic.uid
GROUP BY topic.name
ORDER BY topic.name;

(view source)

James Gardner: Home > Blog > 2007 > Concatenating ref table fields in...