July 07, 2024

#Collections: Part 4- A Guide to HashTable in Java

 

What is a Hashtable?

Java Hashtable class implements Map, which maps keys to values.  It operates on the concept of Hashing, where each key is converted by a hash function into a distinct index in an array. The index functions as a storage location for the matching value. Here is the declaration for java.util.Hashtable class.

public class Hashtable < K,V > extends Dictionary implements Map < K,V >, Cloneable, Serializable

where, K is the type of keys maintained by this map and V is the type of mapped values.

The important points about the Java Hashtable class are:
  • A Hashtable is an array of lists. Each list is known as a bucket. The position of the bucket is identified by calling the hashcode() method. A Hashtable contains values based on the key.
  • It contains only unique elements.
  • It may not have any null key or value. If you attempt to insert a null key or value, a NullPointerException will be thrown.
  • Hashtable is synchronized. This means that all operations on a Hashtable are thread-safe. This synchronization comes with a performance cost, so if thread safety is not required, HashMap is generally preferred due to better performance.
Constructors of Java Hashtable class
  • Hashtable(): Constructs a new, empty hashtable with a default initial capacity (11) and load factor (0.75).
  • Hashtable(int initialCapacity): Constructs a new, empty hashtable with the specified initial capacity and default load factor (0.75).
  • Hashtable(int initialCapacity, float loadFactor): Constructs a new, empty hashtable with the specified initial capacity and the specified load factor.
The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased. When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table is rehashed (that is, internal data structures are rebuilt) so that the hash table has approximately twice the number of buckets.

As a general rule, the default load factor (.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the lookup cost (reflected in most of the operations of the HashMap class, including get and put). The expected number of entries in the map and its load factor should be taken into account when setting its initial capacity, so as to minimize the number of rehash operations. If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash operations will ever occur.

A hash table is a collection of items which are stored in such a way as to make it easy to find them later. Each position of the hash table often called a slot, can hold an item and is named by an integer value starting at 0.

Initially, the hash table contains no items so every slot is empty. hashCode() is a hash function which is present in the Object class and returns an integer value (known as Hash value). This function provides the mapping between an item and the slot where that item belongs in the hash table.

hash = hashfunc(key)
index = hash % array_size

The hash function will take any item in the collection and return an integer in the range of slot names, between 0 and m-1.

Once the hash values have been computed, we can insert each item into the hash table at the designated position. When we want to search for an item, we simply use the hash function to compute the slot name for the item and then check the hash table to see if it is present. This searching operation is O(1), since a constant amount of time is required to compute the hash value and then index the hash table at that location. If everything is where it should be, we have found a constant time search algorithm.

Collision (it may also be called a clash), when two or more items would need to be in the same slot, which will create a problem for the hashing technique.

What is the difference between HashMap and Hashtable?

  • HashMap allows one null key and any number of null values while Hashtable does not allow null keys and null values.
  • HashMap is not synchronized or thread-safe while Hashtable is synchronized or thread-safe.
  • HashMap object values are iterated by using iterator. HashTable is the only class other than Vector which uses Enumerator to iterate the values of HashTable object.
  • HashMap is much faster and uses less memory than Hashtable as the former is unsynchronized. Unsynchronized objects are often much better in performance in compare to synchronized  object like Hashtable in single threaded environment.
  • HashMap is the subclass of the AbstractMap class, whereas Hashtable is a subclass of the Dictionary class(which is now obsolete in Jdk 1.7, so it is not used anymore). It is better off externally synchronizing a HashMap or using a ConcurrentMap implementation (e.g ConcurrentHashMap). Although Hashtable and HashMap has different superclasses but they both are implementations of the "Map" abstract data type.
  • Java 5 introduced ConcurrentHashMap which is an alternative of Hashtable and provides better scalability than Hashtable in Java.

Similarities Between HashMap and Hashtable

  • Both HashMap and Hashtable do not guarantee that the order of the map will remain constant over time. Instead use LinkedHashMap, as the order remains constant over time.
  • Both HashMap and Hashtable implement a Map interface.
  • Both HashMap and Hashtable provide constant time performance for put and get methods assuming that the objects are distributed uniformly across the bucket.
  • Both HashMap and Hashtable work on the Principle of Hashing.

No comments:

Post a Comment