In the last session, we implemented a Hash Table, the backbone of which is the hash function.
An ideal hash function gives a unique location for each key. However, since the number of possible keys is much larger than the number of locations, some keys will inevitably hash to the same index location. When this happens, it is called a hash collision.
For this exercise, we will attempt collision resolution using Open Addressing. Refer to your notes if you would like to learn about the other collision resolution strategy, using Closed Addressing.
Some of the open addressing strategies include:
- Linear Probing (we will be using this today)
- Quadratic Probing (not in syllabus)
- Double Hashing (not in syllabus)
- Robin Hood Hashing (not in syllabus)
With reference to the algorithm given in the notes, you are required to Implement a Hash Table class (in the hashtable_open_addressing.py file) that incorporates open addressing collision resolution, using Linear Probing.
with the attributes:
size- size of the hash tablevalues- a Python list spanning the size of the hash table, pre-filled with the None placeholder
and the following methods:
repr()- returns a formatted string containing the values in the hash table_hash(key)- takes in a
stringargumentkey - returns a hashed
integervalue denoting thelocationusing a hash function. For simplicity, the _hash method has been implemented for you.Take note, this will not be the case for exams.
- takes in a
_rehash(old_location)- returns a new location (for linear probing)
-
Tip
How do I implement rehash for linear probing?- Simply increment the old location to move to the next slot.
- If you're at the end of the table, wrap around to the beginning.
setitem(key, value)- adds/updates a key-value pair (as a tuple) in the hash table
- raises an
Exceptionif the hashtable is full -
Tip
- Step 1: Hash the key to find the starting index
- Step 2: Remember the starting index to detect when we've looped back to the beginning
- Step 3: Search (repeatedly) for an empty slot or a matching key to update
- Step 4: If the current slot is empty, insert the new key-value pair as a tuple
- Step 5: If the current slot has a matching key, update its value
- Step 6: Move to the next index using rehashing
- Step 7: If we’ve gone full circle, the table is full. Raise an exception!
getitem(key)- returns the following:- returns a value associated with the given key or
- raise a
KeyErrorif the key is not found -
Tip
- Step 1: Hash the key to find the starting index
- Step 2: Remember the starting index to detect when we've looped back to the beginning
- Step 3: Search (repeatedly) for matching key (ie. Start Probing)
- Step 4: If the current slot is empty, the key doesn’t exist (stop search)
- Step 5: Check if the key matches the current slot
- Step 6: Move to the next index using rehashing
- Step 7: Check if we’ve gone full circle and key wasn't found. Raise a KeyError exception!
Implement the _rehash(old_location) method such that:
- it accepts a string
old_locationas parameter - and returns a integer value denoting the new location of an item in the hash table, for linear probing
Implement the rest of the methods for the Hash Table (with linear probing included).
Can you figure out what changes need to be implemented, if we want to include a delitem method?
NOTE: Remember to include the docstrings for your methods