Implementing a heap and a hash table


Objectives

The objectives of this programming assignment are:

  • To implement a heap and a hash table.
  • To implement linear probing to handle hash collisions.
  • To practice writing templated classes.

Background

For this project, you have to complete two templated classes:

  1. heap.h: the Heap class is a templated, vector-based heap implementation. This is a straightforward implementation of a heap. Since it uses a vector rather than a C-style array to store the heap, you will not need to write a copy constructor, assignment operator, or destructor.
  2. hash.h: the HashTable class is a hash table implementation with a wrinkle: each bucket in the table is a heap! Collisions will be handled by linear probing. The array that stores the hash table is a dynamically allocated array (of heaps), so you will need to write a copy constructor, assignment operator, and destructor for this class.

Why would you ever want to build such a thing? I’m not sure you would, but here is a made-up application. Suppose you are a donut baker, and you distribute donuts to many different shops. You have a large number of donut types that you can bake (glazed, cinnamon, Boston cream, Bavarian cream, sprinkle, etc.) and lots of customers. Some of the customers are more important to you because they buy more donuts and always pay on time. You would like to record all your orders in the following way:

  1. First, you want to be able to search by donut type.
  2. Then, given the donut type, you always want to process the highest priority order first.

So, the donut type is the key for hashing purposes. All orders for a particular type will be stored in the same bucket, but the bucket will be a max-heap so that you can always retrieve the highest priority order quickly.

Once you see how the donut example works, it’s not too difficult to think of others. Suppose you want to track the repair priority of highway bridges. You want to search by highway name (US50, I97, I695, MD32, etc.) and, given the highway name, retrieve the bridge that is most in need of repair. The highway name is the hash key, and each bucket of the hash table is a max-heap ordered by repair priority.

Since the Heap and HashTable classes are templated, it is easy to test them with either of the two examples given, or an example of your own. See the comments in hash.h for the requirements that the template class must satisfy and donut.h for an implementation of the donut example.

Your Assignment

Your assignment is to complete the templated class Heap in heap.h and the templated class HashTable in hash.h. The classes must be implemented entirely in their .h files: there will be no separate .cpp files.

You may have been taught to implement templated files differently, creating separate .h and .cpp files that include each other and are both guarded, but that is not how we are doing it for this project.

The HashTable class uses the Heap class, so it is recommended that you implement and test Heap first. A sample driver is provided that demonstrates use of the HashTable class with a Donut object. The driver does not perform thorough testing. As always, you are responsible for thoroughly testing your program. It is particularly important that your code run without segmentation faults, memory leaks, or memory errors.

PUBLIC METHODS IN HEAP

  • A constructor with the signature:
      template <class T>
      Heap<T>::Heap() ;

Initializes the empty heap.

  • A function that returns the number of items in the heap:
      template <class T>
      unsigned Heap<T>::size() const ;

This function is implemented in-line in heap.h.

  • A function that returns true if the heap is empty and false otherwise:
      template <class T>
      bool Heap<T>::empty() const ;

This function is implemented in-line in heap.h.

  • A function that returns true if the heap has ever contained data and false otherwise:
      template <class T>
      bool Heap<T>::used() const ;

This function is implemented in-line in heap.h. It is needed to support linear probing in HashTable.

  • A function that inserts an object into the heap:
      template <class T>
      void Heap<T>::insert(const T& object) ;

The function inserts object into the heap and maintains the max-heap property.
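One common way to maintain the max-heap property on insertion is to append the new element as the last leaf and swap it upward while it outranks its parent. The sketch below shows that idea on a plain vector of unsigned values rather than the templated Heap class; the function name heapInsert is hypothetical, and the vector is assumed to already contain the unused placeholder at index 0. In the assignment itself, comparisons would go through priority() instead of comparing the values directly.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Sketch of max-heap insertion on a 1-based vector.
// heap[0] is an unused placeholder, so the parent of index i is i / 2.
// Assumes heap.size() >= 1 (the placeholder is already present).
void heapInsert(std::vector<unsigned>& heap, unsigned value) {
  heap.push_back(value);  // new element becomes the last leaf
  std::size_t i = heap.size() - 1;

  // Swap upward while the new value outranks its parent.
  // The loop runs at most once per tree level, i.e. O(log n) times.
  while (i > 1 && heap[i / 2] < heap[i]) {
    std::swap(heap[i / 2], heap[i]);
    i /= 2;
  }
}
```

Because index 0 is skipped, the parent and child index arithmetic stays simple: parent at i / 2, children at 2i and 2i + 1.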

  • A function that reads the highest priority element in the heap:
      template <class T>
      T Heap<T>::readTop() const ;

The function returns the highest priority element in the heap, but does not remove it from the heap.

  • A function that removes the highest priority element from the heap:
      template <class T>
      void Heap<T>::removeTop() ;

The max-heap property must be maintained after the highest priority element is removed.
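A typical way to restore the max-heap property after removal is to move the last leaf to the root and swap it downward with its larger child. The sketch below again uses a plain vector of unsigned values with an unused placeholder at index 0; the function names heapReadTop and heapRemoveTop are hypothetical, and the exception messages are illustrative. It also shows one way to signal an empty heap with std::range_error.

```cpp
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// heap[0] is an unused placeholder; elements live at indices 1..n.
unsigned heapReadTop(const std::vector<unsigned>& heap) {
  if (heap.size() <= 1) throw std::range_error("readTop() on empty heap");
  return heap[1];
}

void heapRemoveTop(std::vector<unsigned>& heap) {
  if (heap.size() <= 1) throw std::range_error("removeTop() on empty heap");

  heap[1] = heap.back();  // move the last leaf to the root
  heap.pop_back();

  const std::size_t n = heap.size() - 1;  // number of remaining elements
  std::size_t i = 1;

  // Swap downward with the larger child until the heap order holds.
  // The loop descends one level per iteration, so it is O(log n).
  while (2 * i <= n) {
    std::size_t child = 2 * i;
    if (child + 1 <= n && heap[child + 1] > heap[child]) ++child;
    if (heap[i] >= heap[child]) break;
    std::swap(heap[i], heap[child]);
    i = child;
  }
}
```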

  • A function to dump the contents of the heap array in index order:
      template <class T>
      void Heap<T>::dump() ;

Prints the contents of the heap in array-index order (not by priority). The template class T must overload the insertion operator (operator<<).

PUBLIC METHODS IN HASHTABLE

  • A constructor with the signature:
      template <class T>
      HashTable<T>::HashTable(unsigned size, hash_fn hash) ;

size is the size of the hash table and hash is a pointer to a hash function (hash_fn is a typedef in hash.h).

  • A destructor with the signature:
      template <class T>
      HashTable<T>::~HashTable() ;

The destructor must delete the dynamically-allocated hash table.

  • A copy constructor with the signature:
      template <class T>
      HashTable<T>::HashTable(const HashTable<T>& ht) ;

The copy constructor must make a deep copy of the hash table.

  • An assignment operator with signature:
      template <class T>
      const HashTable<T>& HashTable<T>::operator=(const HashTable<T>& ht) ;

The assignment operator must make a deep copy of the right-hand-side object. It must also guard against self-assignment.
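The destructor, copy constructor, and assignment operator follow a familiar pattern for any class that owns a dynamically allocated array. The sketch below shows that pattern on a simplified, non-templated stand-in: the class name Table and its members are hypothetical, and each bucket is a std::vector<int> in place of a Heap<T>. It is meant to illustrate deep copying and the self-assignment guard, not to prescribe the HashTable implementation.

```cpp
#include <vector>

// Simplified stand-in for HashTable: owns a dynamically allocated array of
// buckets. Each bucket here is a std::vector<int> rather than a Heap<T>.
class Table {
public:
  explicit Table(unsigned size)
      : _size(size), _buckets(new std::vector<int>[size]) {}

  // Destructor: release the array allocated with new[].
  ~Table() { delete[] _buckets; }

  // Copy constructor: allocate a separate array and copy every bucket.
  Table(const Table& rhs)
      : _size(rhs._size), _buckets(new std::vector<int>[rhs._size]) {
    for (unsigned i = 0; i < _size; ++i) _buckets[i] = rhs._buckets[i];
  }

  // Assignment: guard against self-assignment, then deep copy.
  const Table& operator=(const Table& rhs) {
    if (this != &rhs) {
      std::vector<int>* fresh = new std::vector<int>[rhs._size];
      for (unsigned i = 0; i < rhs._size; ++i) fresh[i] = rhs._buckets[i];
      delete[] _buckets;  // free the old array only after the copy succeeds
      _buckets = fresh;
      _size = rhs._size;
    }
    return *this;
  }

  std::vector<int>& bucket(unsigned i) { return _buckets[i]; }
  unsigned size() const { return _size; }

private:
  unsigned _size;
  std::vector<int>* _buckets;
};
```

Copying each bucket with its own assignment operator works here because the bucket type manages its own storage, which is also why a vector-based Heap needs no copy constructor of its own.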

  • A function that returns the size of the hash table:
      template <class T>
      unsigned HashTable<T>::tableSize() const ;

This function is implemented in-line in hash.h.

  • A function that returns the number of elements in the hash table:
      template <class T>
      unsigned HashTable<T>::numEntries() const ;

This function is implemented in-line in hash.h.

  • A function that returns the load factor of the hash table:
      template <class T>
      float HashTable<T>::lambda() const ;

This function is implemented in-line in hash.h.

  • A function that inserts an object into the hash table:
      template <class T>
      bool HashTable<T>::insert(const T& object) ;

The function inserts object into the hash table. The insertion index is determined by applying the hash function _hash, which is set in the HashTable constructor call, and then reducing its output modulo the table size (the MOD compression function). Hash collisions should be resolved using linear probing. The function returns true if the object is successfully inserted and false otherwise. Insertion can fail if, for example, the hash table is full.
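The insertion procedure can be sketched on a simplified table as follows. Everything named here is an assumption made for illustration: the Bucket struct (a key plus a vector of priorities standing in for a heap), the toy hash function sumHash, the function-pointer type HashFnSketch (the actual hash_fn typedef in hash.h may differ), and the policy of reusing the first empty bucket encountered along the probe sequence. The assignment does not spell out how emptied buckets should be reused, so treat that part as one possible choice.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Assumed shape of a hash function pointer; hash.h may define it differently.
typedef unsigned (*HashFnSketch)(const std::string&);

// Toy hash for illustration only: the sum of the character codes.
unsigned sumHash(const std::string& s) {
  unsigned h = 0;
  for (std::size_t i = 0; i < s.size(); ++i) h += static_cast<unsigned char>(s[i]);
  return h;
}

// Simplified bucket: one key plus the priorities stored under it.
struct Bucket {
  Bucket() : used(false) {}
  bool used;                    // true once the bucket has ever held data
  std::string key;              // key shared by everything in the bucket
  std::vector<unsigned> items;  // stands in for the per-bucket heap
};

bool insertSketch(std::vector<Bucket>& table, HashFnSketch hash,
                  const std::string& key, unsigned priority) {
  const std::size_t n = table.size();
  if (n == 0) return false;
  const std::size_t home = hash(key) % n;  // MOD compression
  std::size_t firstFree = n;               // n means "none found yet"

  // Linear probing: home, home + 1, home + 2, ... wrapping around.
  for (std::size_t step = 0; step < n; ++step) {
    const std::size_t i = (home + step) % n;
    Bucket& b = table[i];
    if (!b.items.empty()) {
      if (b.key == key) {  // existing bucket for this key
        b.items.push_back(priority);
        return true;
      }
      continue;  // occupied by a different key
    }
    if (firstFree == n) firstFree = i;  // remember the first empty bucket
    if (!b.used) break;  // never used: the key cannot be further along
  }

  if (firstFree == n) return false;  // every bucket holds another key
  Bucket& target = table[firstFree];
  target.used = true;
  target.key = key;
  target.items.push_back(priority);
  return true;
}
```

On a sparsely populated table the loop usually stops at or near the home index, which is what keeps insertion at constant time for small load factors.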

  • A function that reads and removes the highest priority object with the given key:
      template <class T>
      bool HashTable<T>::getNext(string key, T& obj) ;

The function locates the bucket corresponding to the given key using the hashing method described for the insert() function and resolving collisions using linear probing. It then retrieves and removes the highest priority object from the heap and returns the object in the reference parameter obj. The function returns true if it successfully locates and returns an object with the specified key value; otherwise, it returns false.
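The lookup side of linear probing can be sketched in the same simplified setting. The Bucket struct, the toy hash sumHash, and the function name getNextSketch are hypothetical, and std::max_element stands in for the heap's readTop() and removeTop(). The point of the sketch is the role of the used flag: a probe may stop at a bucket that has never held data, but it has to continue past a bucket that once held data and has since been emptied.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Toy hash for illustration only: the sum of the character codes.
unsigned sumHash(const std::string& s) {
  unsigned h = 0;
  for (std::size_t i = 0; i < s.size(); ++i) h += static_cast<unsigned char>(s[i]);
  return h;
}

// Simplified bucket: one key plus the priorities stored under it.
struct Bucket {
  Bucket() : used(false) {}
  bool used;                    // true once the bucket has ever held data
  std::string key;
  std::vector<unsigned> items;  // stands in for the per-bucket heap
};

// Removes the largest priority stored under key and returns it through out.
bool getNextSketch(std::vector<Bucket>& table, const std::string& key,
                   unsigned& out) {
  const std::size_t n = table.size();
  if (n == 0) return false;
  const std::size_t home = sumHash(key) % n;

  for (std::size_t step = 0; step < n; ++step) {
    Bucket& b = table[(home + step) % n];
    if (!b.used) return false;      // never used: the key is not in the table
    if (b.items.empty()) continue;  // emptied bucket: keep probing
    if (b.key != key) continue;     // bucket belongs to another key

    // Stand-in for readTop() followed by removeTop().
    std::vector<unsigned>::iterator top =
        std::max_element(b.items.begin(), b.items.end());
    out = *top;
    b.items.erase(top);
    return true;
  }
  return false;  // probed every bucket without finding the key
}
```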

  • A function to dump the contents of the hash table in index order:
      template <class T>
      void HashTable<T>::dump() ;

Prints the contents of the hash table in array-index order. For each bucket, it should call the heap dump() function to output the heap contents.

Additional Requirements

  • Both templated classes (Heap and HashTable) must be implemented in their respective .h files. There are no separate .cpp files for Heap and HashTable.
  • Private helper functions may be added to either templated class; however, they must be declared in the private section of the class declaration.
  • Heap must implement a vector-based max-heap data structure. It must not use the index zero element of the vector; heap elements are stored at indices 1, 2, …, size(). The template class T must provide priorities through a priority() function that returns an unsigned integer.
  • If readTop() or removeTop() is called on an empty heap, a range_error exception must be thrown.
  • The heap insert() and removeTop() methods must run in O(log n) time.
  • HashTable must implement a hash table in which each bucket is a Heap object. Each bucket contains objects with the same key value, organized in a max-heap by priority. The template class T must provide the key value for an object through a key() function that returns a string.
  • The hash table size and hash function are specified in the constructor call. The table must be dynamically allocated in the constructor.
  • Hash collisions must be resolved by linear probing.
  • Hash indices must be computed by applying the hash function to the key value and reducing the hash output modulo the table size (i.e. the MOD compression function).
  • For the HashTable class, a destructor, copy constructor, and assignment operator must be implemented.
  • Basic hash operations on a sparsely populated table (small load factor) must run in constant time. In particular, insertion to an empty table and retrieval from a table with only one element must run in constant time.

Implementation Notes

  • The _used variable and used() function of the Heap class are provided to support linear probing in HashTable. Recall that, with linear probing, buckets that have never held data must be treated differently than buckets that once held data that has since been deleted.
  • Be sure to separately test your Heap class. For grading purposes, it will be tested separately and as it is used with HashTable.

Provided Files

Header files are included for both templated classes. You must complete the implementations in the respective .h files; there are no separate .cpp files for the two classes.

  • heap.h — header file for the Heap class. Complete your heap implementation in this file.
  • hash.h — header file for the HashTable class. Complete your hash table implementation in this file.

A simple driver program that uses the Donut class is also provided. This is not a thorough test program. You are responsible for thorough testing of your implementations.

  • driver.cpp — a simple test driver for the HashTable and Heap classes. Uses the Donut class defined in donut.h.
  • donut.h — a Donut class for use with the driver program.
  • driver.txt — sample output for the driver program.

 
