What is Judy array?

Asked 2 years ago, Updated 2 years ago, 98 views

It looks like an associative array, but
For associative arrays, map I think it's sufficient that it's easily prepared in STL. Why was Judy array born?
Compared to map and set, what is the big difference between what is prepared there?

c++ algorithm data-structure

2022-09-30 20:33

1 Answers

Simply put, it's a tree structure that focuses on CPU cache efficiency.

The question appears to be a mixture of layers as features and layers as implementations, so I'll explain it separately.

feature layer

The data structure that is given a key and can retrieve the value associated with it is commonly referred to as a map.Maps can be implemented in a variety of ways.If you're just talking about "maps" as a feature, you don't really think about the processing time and memory efficiency of operations (searching, inserting, deleting, etc.).Some languages and libraries may refer to maps as "arrays" (regardless of implementation).

In terms of functionality, whether implementation is a list, array, or tree, it can be used as a map, so it's "enough.However, for example, a map of 1 million elements is difficult to use in O(n).Therefore, various data structures are considered.

std::map in C++ is a "map" at this layer in that sense, since no specific implementation is specified.However, due to the limited amount of calculation, it is often implemented in trees and hash tables.

implementation layer:array

An array as an implementation is a data structure in which keys can be counted as integers and the keys are directly indexed with values arranged in a straight line.

Speed and memory efficiency are best when the range of keys available is limited to some extent in advance and the key distribution is sufficiently dense.

However, if that condition is not met, for example, the array will waste a huge amount of memory, with keys ranging from 0 to 2^32 but only 1000 of them actually being used.

In general, the list is not an "array" in this sense."In the case of ""associative array"", the implementation is rarely a complete array, so it is better to think of it as a functional layer."If you use the term "array" at the feature layer, you may also call the array "vector" at the implementation layer to distinguish it.

implementation layer:tree

Wood structures are often used when keys are not distributed well or the range of keys is not limited.The data structure is basically reference O(logn), but there are many variations to improve performance in more detail.

  • Balance: The average depth of a tree is O(log n), but if the data is biased and all nodes are concentrated on one branch, the depth is O(n).There are many differences in how the algorithms work to balance things so that they don't happen.
  • Aliti: How many branches are coming out of the node?In the simplest binary tree, non-leaved nodes have up to two children.The algorithm is simple, but as the number of elements increases, the total number of nodes increases, and that's how much space and time are inefficient because you have to touch all parts of the memory.On the other hand, a block of disks that can be used as a single node by growing a lot of branches on a node is compatible with storing a lot of data on the disk.Poor data tends to lead to poor spatial efficiency.

Judy array is designed to adaptively change the structure of the nodes to make the most of the CPU cache, whether data is dense or sparse.

If the map in the question refers to std::map in C++, of course you can use Judy array to implement std::map.It's confusing that the name has "array", but this "array" means the functional layer, not the implementation layer.


2022-09-30 20:33

If you have any answers or tips


© 2024 OneMinuteCode. All rights reserved.