How to choose the pivot in quicksort

Asked 2 years ago, Updated 2 years ago, 44 views

I'm a student in an information-related field.
I think the choice of pivot matters a lot in quicksort. What are the advantages of each of the following methods?
① Choose three elements from the array and use their median as the pivot.
② Compare the first two elements of the array and use the larger one as the pivot.
③ Choose at random.
④ Repeat ① and use the median of the candidates obtained in ① as the pivot.

To be honest, in terms of priority I think it is ④ > ① > ③ > ②. What do you think?

Advantages as I see them:
① It can avoid the minimum and maximum values and pick a value close to the median.
② I'm not sure.
③ If every method can end up picking the maximum or minimum anyway, perhaps choosing at random at least avoids unnecessary computation?
④ Compared with ①, it lowers the probability of picking the maximum or minimum and raises the probability of picking a value close to the median (at a slightly higher computational cost).

algorithm

2022-09-30 19:21

2 Answers

First of all, there is a big difference between choosing the pivot as a median or a maximum and choosing it at random: the former gives a deterministic algorithm, the latter a randomized one. Depending on how the partition step is written, quicksort may fail to terminate when the pivot is chosen badly, so with such an implementation a randomly chosen pivot can leave the sort running forever. Also note that on modern computers, generating a random value is a more expensive operation than picking the median or the maximum of a few elements.

For example, the quicksort algorithm shown on the Japanese version of Wikipedia does not terminate if the pivot chosen every time is the unique minimum element of the array. Rules such as "use the larger of the first two elements as the pivot" or "use the median of three elements as the pivot" are devices for guaranteeing that the sort terminates.
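To illustrate the termination issue, here is a minimal Python sketch. It assumes a partition that splits the array into "elements < pivot" and "elements >= pivot"; that scheme and all function names are my own choices for illustration, not a transcription of the Wikipedia code. I also assume "the first two" means the first two distinct values, because if the first two elements are equal, the larger of them could still be the minimum.

```python
def split(a, pivot):
    """Partition a into (elements < pivot, elements >= pivot)."""
    left = [x for x in a if x < pivot]
    right = [x for x in a if x >= pivot]
    return left, right


def pivot_larger_of_first_two_distinct(a):
    """Return the larger of the first two distinct values, or None if all are equal."""
    first = a[0]
    for x in a[1:]:
        if x != first:
            return max(first, x)
    return None  # all elements are equal, so there is nothing left to sort


def quicksort(a):
    if len(a) <= 1:
        return list(a)
    pivot = pivot_larger_of_first_two_distinct(a)
    if pivot is None:
        return list(a)
    left, right = split(a, pivot)
    # The pivot is not the minimum, so left is non-empty; the pivot itself
    # lands in right, so right is non-empty too. Both are shorter than a.
    return quicksort(left) + quicksort(right)


# With the minimum as pivot, the split makes no progress:
left, right = split([3, 1, 2], 1)
# left is empty and right still holds all three elements
```

Recursing on `right` in the last example would repeat the same call forever if the minimum were picked again, which is the situation the pivot rule above rules out.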

In addition, if you can choose a pivot close to the median of the array's elements, you can reduce the average number of comparisons between elements. That is what you hope for when you pick three elements from the array (for example, the first three) and use their median as the pivot. On average it should in fact be faster than the "larger of the first two" method...
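A minimal sketch of the median-of-three selection in Python. The function name and the idea of passing the three positions as arguments are my own choices for illustration; which three positions to sample is left to the caller.

```python
def median_of_three(a, i, j, k):
    """Return the index (one of i, j, k) that holds the median of a[i], a[j], a[k]."""
    if a[i] < a[j]:
        if a[j] < a[k]:
            return j                      # a[i] < a[j] < a[k]
        return k if a[i] < a[k] else i    # a[k] <= a[j], compare a[i] with a[k]
    else:
        if a[i] < a[k]:
            return i                      # a[j] <= a[i] < a[k]
        return k if a[j] < a[k] else j    # a[k] <= a[i], compare a[j] with a[k]


data = [7, 2, 9, 4, 1]
p = median_of_three(data, 0, 1, 2)        # sample the first three elements
print(data[p])                            # 7, the median of 7, 2, 9
```

The selection costs at most three comparisons, which is the small fixed overhead paid for a pivot that is never the smallest or largest of the three samples.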

Finally, the method of finding the median recursively. Consider, for example, an implementation that divides the array into three parts, applies the "choose three elements and take their median" operation to each part to obtain m1, m2, and m3, and uses the median of those three as the pivot. A similar approach is discussed in the paper "Engineering a Sort Function" (Bentley & McIlroy, 1993), which deals with the implementation of the C language's qsort function.
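The m1, m2, m3 idea can be sketched in Python as follows. The sampling positions (first, middle, and last element of each third) and the function names are assumptions of mine; this is not the code from the paper.

```python
def median3(x, y, z):
    """Median of three values."""
    return sorted((x, y, z))[1]


def median_of_three_medians(a):
    """Pick a pivot candidate from nine samples of a (requires len(a) >= 9).

    The list is cut into three parts. Each part contributes the median of
    its first, middle, and last element (m1, m2, m3), and the median of
    those three values is returned.
    """
    n = len(a)
    third = n // 3
    parts = [(0, third - 1), (third, 2 * third - 1), (2 * third, n - 1)]
    medians = []
    for lo, hi in parts:
        mid = (lo + hi) // 2
        medians.append(median3(a[lo], a[mid], a[hi]))
    return median3(*medians)


print(median_of_three_medians(list(range(9))))   # 4: m1, m2, m3 are 1, 4, 7
```

This needs up to twelve comparisons per pivot instead of three, so it is usually worth considering only for larger arrays.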

Personally, I think the best way to judge the pros and cons here is to implement the methods and measure them. Once you start tuning at this level, it becomes hard to predict how fast an implementation will be, because the result depends on factors such as how long the median approximation itself takes and how long it takes to swap elements in the array. On top of that, modern CPUs are quite good at branch prediction, and for long arrays it matters whether the data fits in the cache, so the hardware cannot be ignored either. In the end, running an experiment and checking the speed is the easiest way.
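As a starting point for such an experiment, here is a small Python harness. Everything in it is an assumption for illustration: the list-based three-way partition, the three strategy functions, and the input size. An in-place implementation in C may well rank the strategies differently, which is exactly why measuring matters.

```python
import random
import time


def quicksort(a, choose_pivot):
    """Return a sorted copy of a; choose_pivot(a) must return a value from a."""
    if len(a) <= 1:
        return list(a)
    pivot = choose_pivot(a)
    less = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]      # never empty, so recursion shrinks
    greater = [x for x in a if x > pivot]
    return quicksort(less, choose_pivot) + equal + quicksort(greater, choose_pivot)


def larger_of_first_two(a):
    return max(a[0], a[1])


def median_of_three(a):
    return sorted((a[0], a[len(a) // 2], a[-1]))[1]


def random_pivot(a):
    return random.choice(a)


def measure(choose_pivot, data, repeats=3):
    """Best wall-clock time in seconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        quicksort(data, choose_pivot)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    data = [random.randrange(1_000_000) for _ in range(5_000)]
    for strategy in (larger_of_first_two, median_of_three, random_pivot):
        print(f"{strategy.__name__:>20}: {measure(strategy, data):.4f} s")
```

Try it with random, sorted, and reverse-sorted input, and with several sizes; keep sorted inputs short in this recursive version, since a badly biased pivot makes the recursion depth grow with the input length.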

To sum up: generally speaking, taking the median of three elements (selected by some rule) is likely to be faster on average than "the larger of the first two" or "random," but it really depends on the implementation and the machine, so in the end you should run the experiment.

Supplement: the Quicksort page on the English version of Wikipedia also has some information about pivot selection that should be helpful. It also mentions introsort and dual-pivot quicksort, which are related to making sorting faster.


2022-09-30 19:21

If you are worried about the pivot selection being biased, you also have to think about what kind of input will be given.
For every method there are inputs that make the pivot biased, so in that sense none of them is better or worse than the others. But considering what kind of data is likely to be given in practice, a method like ② that becomes biased on already-sorted input is terrible. I can't say anything general about ①, ③, and ④ because I don't know the concrete procedure, but pivot selection in ④ and ① is more regular than in ③, so I would rather avoid them.
In the end, my personal ranking is ③ > (④ > ①) > ②. ④ and ① are in parentheses because I don't know exactly how they would be carried out.
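The effect of sorted input on method ② can be checked with a short Python sketch. It assumes a quicksort that drops the pivot and recurses on the smaller and larger elements; the function names are hypothetical and only the recursion depth is computed, not the sort itself.

```python
import random


def depth(a, choose_pivot):
    """Recursion depth of a quicksort that recurses on (< pivot) and (> pivot)."""
    if len(a) <= 1:
        return 0
    pivot = choose_pivot(a)
    less = [x for x in a if x < pivot]
    greater = [x for x in a if x > pivot]
    return 1 + max(depth(less, choose_pivot), depth(greater, choose_pivot))


def larger_of_first_two(a):      # method ②
    return max(a[0], a[1])


def random_pivot(a):             # method ③
    return random.choice(a)


sorted_input = list(range(200))
print(depth(sorted_input, larger_of_first_two))   # 100: each level removes only two elements
print(depth(sorted_input, random_pivot))          # varies per run, typically far smaller
```

On sorted input, method ② always picks the second-smallest element, so the depth grows linearly with the input length, while the random pivot does not depend on the input order at all.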


2022-09-30 19:21



© 2024 OneMinuteCode. All rights reserved.