I have a question about the issue in the title.
When the system processes a task for multiple users at once, it inserts the processed users' codes into the log table as comma-separated text.
When I search that table for a particular user's log entries, I currently use a substring (intermediate) match:
select * from log_table where user_code like '%<user code>%';
As the volume of logs has grown, this has become heavy, so I am now looking at speeding it up.
At the moment I expect it to be about twice as fast if I change the column from text to integer, but is there any other way to make it faster?
Can someone advise me?
sql postgresql
Inserting the processed users' codes into the log table as comma-separated text.
It sounds like you are pushing a comma-separated list of user codes into a single column of the log table, and the first thing that concerns me is that this is unnormalized.
For normalization, I think you need an identifier for each action (an id or a datetime), plus at least two columns, for example:
process_id | user_code
-----------+-----------
         1 |    252533
         1 |    252600
         2 |    252533
         2 |    252004
         2 |    252111
         3 |    252600
         4 |    252533
...
If you make this pair the primary key, or add an ID to each log entry as the primary key, normalization can be achieved.
Once you have normalized it like this, you can search with
select * from log_table where user_code = 252533;
so there is no need for pattern matching with LIKE, and the database only has to match on the user_code column, so you can probably expect better performance.
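As a minimal sketch of what the normalized table could look like (the table name log_entry and the pair-as-primary-key layout here are assumptions for illustration, not the original schema):
-- Hypothetical normalized table: one row per (process, user) pair
CREATE TABLE log_entry (
    process_id integer NOT NULL,
    user_code  integer NOT NULL,
    PRIMARY KEY (process_id, user_code)
);
-- A plain b-tree index lets WHERE user_code = ... use an index scan
CREATE INDEX log_entry_user_code ON log_entry (user_code);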
Postgres can handle arrays, so make the column an integer array and index it. It should be considerably faster.
-- Add a column of type integer array
ALTER TABLE log_table ADD COLUMN user_codes integer[];
-- Create an index on it
CREATE INDEX log_table_user_codes ON log_table USING GIN (user_codes);
-- Select rows whose user_codes contains 10
SELECT * FROM log_table WHERE user_codes @> ARRAY[10];
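If the existing comma-separated codes live in a text column (assumed here to be named user_code, as in the question), one way to backfill the new array column might be:
-- Assumption: user_code holds text such as '252533,252600'
UPDATE log_table SET user_codes = string_to_array(user_code, ',')::integer[];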
As in Shouichi's answer, PostgreSQL lets you build this kind of index with GIN (Generalized Inverted Index).
For example, here is a table called all_label with about 1.5 million records.
testdb=# \d all_label
                        Table "all_label"
      Column       |            Type             |       Modifiers
-------------------+-----------------------------+------------------------
 :
 label             | text                        | not null
 :
testdb=# SELECT COUNT(label) FROM all_label;
  count
---------
 1526550
(1 row)
First, try the query without an index.
testdb=# EXPLAIN ANALYZE SELECT * FROM all_label WHERE label LIKE '%intermediate%';
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Seq Scan on all_label  (cost=0.00..40078.88 rows=148 width=78) (actual time=0.019..362.919 rows=7 loops=1)
   Filter: (label ~~ '%intermediate%'::text)
   Rows Removed by Filter: 1526543
 Planning time: 0.186 ms
 Execution time: 362.951 ms
Next, create an index and query.
-- Create an index on the label text split on spaces
testdb=# CREATE INDEX idx_all_label ON all_label USING GIN (string_to_array(label, ' '));
testdb=# EXPLAIN ANALYZE SELECT * FROM all_label WHERE string_to_array(label, ' ') @> '{intermediate}'::text[];
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on all_label  (cost=87.16..15290.57 rows=7633 width=78) (actual time=0.042..0.051 rows=6 loops=1)
   Recheck Cond: (string_to_array(label, ' '::text) @> '{intermediate}'::text[])
   Heap Blocks: exact=5
   ->  Bitmap Index Scan on idx_all_label  (cost=0.00..85.25 rows=7633 width=0) (actual time=0.033..0.033 rows=6 loops=1)
         Index Cond: (string_to_array(label, ' '::text) @> '{intermediate}'::text[])
 Planning time: 0.128 ms
 Execution time: 0.088 ms
Reduced from 362.951 ms to 0.088 ms.
If creating new tables or adding columns would be difficult, the approach above may be worth considering.
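Applied to the question's table, the same technique might look like the following sketch (assuming the comma-separated codes are in a text column named user_code):
-- Assumption: user_code holds comma-separated codes such as '252533,252600'
CREATE INDEX idx_log_table_user_code ON log_table USING GIN (string_to_array(user_code, ','));
-- The WHERE clause must use the same expression as the index for it to be used
SELECT * FROM log_table WHERE string_to_array(user_code, ',') @> '{252533}'::text[];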
This may not count as an answer because it strays from the tags, but I will offer it as one example.
I think it would be better to store the logs in a document-oriented DB and search them with a search engine; specifically, MongoDB and Elasticsearch.
Even if the logs are unnormalized, these are flexible and fast to search.
The other answers add to the existing table, so the data would have to be saved off and re-registered, and all the code would have to be changed at once.
Here is a way to add a table instead, transfer the data between tables, and migrate the code step by step.
When the system processes a task for multiple users at once, it inserts the processed users' codes into the log table as comma-separated text.
This is a typical anti-pattern in DB design. One of its harmful effects is that searches cannot use an index, and you have run squarely into exactly that problem.
(This is detailed in SQL Antipatterns (ISBN 978-4-87311-589-4), so please read it.)
I don't know the structure of log_table, but it should contain an ID that identifies each "action"; let's assume it is called log_id.
Create another table associated with log_table: split the comma-separated log_table.user_code and pair each code with its log_id.
log_user table
+--------+-----------+
| log_id | user_code |
+--------+-----------+
|     10 |         1 |
|     10 |         5 |
|     10 |         8 |
|     11 |         1 |
|     12 |         3 |
|     12 |        99 |
+--------+-----------+
Any "action" related to a user can be found in
SELECT log_table.* FROM log_table LEFT JOIN log_user ON Log_table.log_id=log_user.log_id WHERE log_user.user_code=10;
You can search in
If you keep inserting into the user_code column of log_table as before, you do not have to change all the code at once. Once no code refers to it any more, you can drop the column or simply leave it.
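As a minimal sketch of this migration, assuming log_table has a log_id column and the comma-separated user_code text column described above:
-- Junction table: one row per (log entry, user) pair
CREATE TABLE log_user (
    log_id    integer NOT NULL,
    user_code integer NOT NULL,
    PRIMARY KEY (log_id, user_code)
);
CREATE INDEX log_user_user_code ON log_user (user_code);
-- Backfill from the existing comma-separated column
INSERT INTO log_user (log_id, user_code)
SELECT log_id, unnest(string_to_array(user_code, ','))::integer
FROM log_table;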