We're going to talk about a little bit of a complex grouping method using arrays.

Asked 1 years ago, Updated 1 years ago, 80 views

I would like to create a complete list based on incomplete list information in PostgreSQL. Is there a good way?
The following tables are available:

CREATE TABLE abscissa(
  a_id TEXT
  o_id_list TEXT[]
);

US>CREATE TABLE ORDINATE(
  o_id TEXT
  a_id_list TEXT[]
);

The a_id_list and o_id_list are incomplete lists of a_id and o_id, respectively.
Based on this, I want to make a complete list.
For example, suppose you have the following data:

a_1 | {o_1,o_2,o_4}
a_2 | {o_1, o_3, o_5}
a_3 | {o_3,o_5}

o_1 | {a_1,a_2}
o_2 | {a_1}
o_3 | {a_2,a_3}
o_4 | {a_1}
o_5 | {a_2,a_3}

This is

 {a_1,a_2,a_3} | {o_1,o_2,o_3,o_4,o_5}

I'd like to summarize it in such a way.
Thank you for your cooperation.

Note:

If there is no overlap,

a_4 | {o_6,o_7}
a_5 | {o_8,o_9}

This is the case.

I'm sorry that I'm not used to using it and it's clumsy.

postgresql

2022-09-30 21:11

2 Answers

"List the elements of each concatenation in the graph" seems to be a problem.

I think it would be easier and easier to understand if you search for graph problems procedurally instead of using SQL.
Many efficient algorithms (e.g., depth-first search) are also well known

I tried writing SQL, but it became complicated using recursive SQL
By adding arrays with common parts until there is no change, we get a very large graph.

CREATE TABLE foo(foo_id int4, bar_id_list int4[]);
INSERT INTO foo (foo_id, bar_id_list)
VALUES
(1,ARRAY[1,2,4]),
(2,ARRAY[1,3,5]),
(3,ARRAY [3,5]),
(4, ARRAY[6]),
(5, ARRAY [7,8]),
(6, ARRAY [8,9]),
(7,ARRAY [9,10]),
(8,ARRAY[10,11]);

WITH RECURSIVER AS (
  SELECT
    T1.foo_id,
    T1.bar_id_list
  from
    foo T1

  UNION ALL

  SELECT
    T1.foo_id,
    array_agg(DISTINCT T1.bar_id ORDER BY T1.bar_id)ASbar_id_list
  FROM(
    SELECT DISTINCT
      T1.foo_id,
      unnest(T1.bar_id_list||T2.bar_id_list)ASbar_id,
      T2.bar_id_list AS prev_bar_id_list
    from
      foo T1
    CROSS JOIN
      rT2
    WHERE
      T1.bar_id_list&T2.bar_id_list
  )T1
  GROUP BY
    foo_id
  HAVING
    max(T1.prev_bar_id_list)<array_agg(T1.bar_id ORDER BY T1.bar_id)
)
SELECT
  T1.foo_id,
  T1.bar_id_list
FROM(
  SELECT
    T1.foo_id,
    T1.bar_id_list,
    row_number()OVER(Partition BY T1.foo_id ORDER BY array_length(T1.bar_id_list,1)DESC)AS row_number
  from
    rT1
)T1
WHERE T1.row_number = 1;
 foo_id | bar_id_list  
--------+---------------
      1 | {1,2,3,4,5}
      2 | {1,2,3,4,5}
      3 | {1,2,3,4,5}
      4 | {6}
      5 | {7,8,9,10,11}
      6 | {7,8,9,10,11}
      7 | {7,8,9,10,11}
      8 | {7,8,9,10,11}
(8 rows)


2022-09-30 21:11

 with w1 as (select array_agg(a_id order by a_id) from (select distinguished unnest(a_id_list) as a_id from order) as a_id from order) as t2)
, w2as(select array_agg(o_id order by o_id) from(select distinguished unnest(o_id_list)aso_id from abscissa)ast1)
select * from w1, w2


2022-09-30 21:11

If you have any answers or tips


© 2024 OneMinuteCode. All rights reserved.