Why shouldn't I index all the columns in the DB table?

Asked 2 years ago, Updated 2 years ago, 97 views

Hello, I'm a little weak with Database.
The query can be written to some extent, but the index seems to be still lacking.

People often say that if you put a lot of indexes on one table, the performance gets worse. I'm not sure why the performance is bad. For example, in MySQL, if all columns of a table are indexed, this is inefficient and there is a performance problem I often hear it, but I don't know why the performance is bad inside even if I nod.

Please give us a refreshing answer!

database

2022-09-22 21:59

2 Answers

I'm not a DBA but a programmer Let me explain it from a programmer's point of view.

Let's say you build your own DB. DB is a system for finding specific items in a huge list.

List<User> users;

It's when the size of the list is small, so you can just go for-loop and compare them one by one, but

for (int i = 0; i != users.Count; ++i)
{
    if (users[i].name == inName)
        return users[i];
}

If the list size increases or a quick search is required, something like dictionary is required.

Dictionary<string, User> nameIndex;
nameIndex.TryGetValue(inName, outUser);

It's good that the search is faster, but here's a drawback. Every time you add data, you have to add it to the index.

users.Add(newUser);
nameIndex.Add(newUser.name, newUser);

If you add indexes to all columns, it will work as below.

ageIndex.Add(newUser.age, newUser);
jobIndex.Add(newUser.job, newUser);
...

The index will need more capacity than the main body. It would be a bit difficult to use a DB to process huge data, but if there is a larger index than the main body, right?

In addition to adding, you will have to find the existing key in the index one by one and insert a new key. DB handles it on its own, so it looks fine (?), but it's terrible to think that it's programming itself.

Lastly, is the search okay?

For example, let's say you find the user who logged on last week and played the game and reward them. Let's say we simply index the gameplay time and add a record.

gameRecord.playTime = Now();
playTimeIndex.Add(gameRecord.playTime, gameRecord)

If 10,000 users play 100 times a day, the number of keys in that index is a million. Seven million in a week, maybe 2.5 billion in a year. If you have accumulated data for about a year, you need to find a case that is included in 2016-03-14 00:00 to 2016-03-20 23:59:59 out of 2.5 billion keys... As expected, it's terrible;;

That's why you shouldn't index any column.

If you index the play week column instead of the play time, the height decreases rapidly from 2.5 billion to 52.

gameRecord.playTime = Now();
gameRecord.playWeek = playTime.ToWeek()
playWeekIndex.Add(gameRecord.playWeek, gameRecord)

52 searches will be quick, right?


2022-09-22 21:59

First of all, looking at the characteristics of the index,

Number 3 here can be a problem if there are more indexes.

Each time a record is inserted and deleted, it is necessary to take an operation on all indexes that must be changed by that record and save the results back to disk.

For example, if there is only a primary key in the table, a very simple representation of the operations that occur when a record is inserted and deleted is as follows:

Now suppose that the table has an index value and an index out for another key:

It's done as above. Depending on the implementation of the DBMS, there will only be a difference in the speed of each operation, and the amount of work will not change. The more indexes you have, the more things you need to do when inserting and deleting records, all of which are related to disk I/O.

Therefore, if you create a lot of indexes, the performance of insert/delete will inevitably differ. However, many indexes are available in the search, so you will be able to perform under various search conditions. If most of the query language you use is programmed, you already know which columns (properties, columns) are searched by, so you need to create an index to use it efficiently.

So is the nature of trying to use the database and the proportion of searches high? Or are there frequent insertions/deletions? Accordingly, you have to operate strategically.


2022-09-22 21:59

If you have any answers or tips


© 2024 OneMinuteCode. All rights reserved.