Table of Contents
- Construction Benchmarks
- Modifying Benchmarks
Now it gets interesting. This benchmark benchmarks a few things at once:
- Insert 100 million random
- Clear all entries with
- Reinsert 100 million random
intinto the same cleared map.
- Remove all of the inserted entries one by one until the map is empty again.
- Destruct the empty map.
100 million int-int pairs take at least 1526 MB. It is interesting to see how much overhead the maps have here, and how they deal with resizing their data.
clear() is interesting too, for flat maps it might be possible to optimize for data that is trivially destructible, then clear() can be very fast. Reinsertion is interesting to see if a map reuses initialized memory, and if it can gain any speed from that. Removing elements one by one is interesting to see removal performance - some maps need to rearrange entries (e.g. robin-hood based maps) which might slow down their performance.
Please see this for the full benchmark code. It makes use of
sfc64, which is an extremely fast and high quality random number generator.
Now it gets a lot more interesting. The clear winner here is
robin_hood::hash. It’s consistently faster than all other hashes. Both
folly::hasher lead to timeouts in the
absl hash variants. So these hashes should be used with great care.
For each group of hashes, the bold entries show the pareto front of speed vs. memory usage:
tsl::robin_mapis clearly the fastest map in this benchmark. It’s raw insertion speed is fastest of all. But this speed comes at a cost: it requires 4597 MB RAM, which is quite a bit more than the other maps. It is interesting to see that it seems that it does not make use of trivially destructible optimization in
clear(), because it takes about 1.04 seconds. In comparison,
robin_hood::unordered_flat_mapchecks if the entries are trivially destructible and only takes 0.0054 seconds. But even without that optimization, reinsertion speed is 2.77 seconds which is a blazingly fast compared to all other maps.
robin_hood::unordered_flat_mapis a bit slower, but requires far less peak memory: only 1717 MB. It has a 1 byte overhead per entry, and interestingly it seems to have practically the same peak memory usage as
folly::F14ValueMapis already quite a bit slower, but still on the pareto front because it uses a tad lower memory. It seems to not have the 1 byte overhead.
phmap::parallel_flat_hash_mapmakes use of multiple maps internally, so when it needs to resize, it can do so in steps which lowers the peak memory requirement.
tsl::sparse_mapis optimized for memory usage and thus takes even less memory. It is faster and uses less memory than it’s main competitior
I think it is interesting to note that
robin_hood::unordered_node_map is the fastest node-based map, featuring stable references like
std::unordered_map. Also interesting to see is that
absl::flat_hash_map’s reinsert is actually quite a bit slower than the original insertion. In that case it seems to be faster to just destroy the whole map and create a new one, instead of reusing it - which seems strange to me.
Each entry shows total runtime for that part of the benchmark.
- blue: Insert 100M entries.
- green: reinsert 100M entries.
- red: remove all 100M entries one-by-one.
- magenta: Destruct the now-empty map.