Hash Table
1. Hash Table Basics
1.1 Motivation
Associated Container (aka, Map, Dictionary) is an Abstract Data Type (ADT) composed of a collection of key-value pairs, such that each possible key appears at most once in the collection. Operations associated with such ADT allow:
- the addition of a pair to the collection
- the removal of a pair from the collection
- the modification of an existing pair
- the lookup of a value associated with a particular key
Such ADT can be implemented by Hash Table or Tree data structure.
1.2 Big Picture and Concepts
Big Picture
- Calculate the slot location in hash table by key
- Resolve collision by chaining or open addressing
注:为什么是key-value pair而不是value?因为要处理collision。
Concepts
- Hash Function: calculate slot location of hash table by key
- Collision: more than one keys maps to the same slot location in the hash table
- Chaining: a method to resolve collision
- Open Addressing: a method to resolve collision
2. Hash Function
2.1 What is a good hash function?
A good hash function satisfies the assumption of simple uniform hashing: each key is equally likely to hash to any of the \(m\) slots, independently of where any other key has hashed to.
2.2 Common Hash Functions
- Divide Method
- Multiplication Method
- Universal Hashing
3. Resolve Collision
3.1 Chaining
TODO
3.2 Open Addressing
TODO
4. std::hash and std::unordered_map
2.3.1 std::hash
template<typename Key>
struct hash;
std::hash是一个模板类,同时也是一个Functor,提供了operator()方法,返回size_t类型的slot location。
STl中对不同类型的Key做了特化处理:
2.3.2 user defined hash function
C++中自定义hash function一般是自己写std::hash的特化,主要就是重写operator()的逻辑。 TODO
2.3.3 std::unordered_map
TODO