April 01, 2016

Java Collections Interview Questions and Answers

JDK 1.2 introduces a new framework for collections of objects, called the Java Collections Framework.

What are Collections?
A collection (sometimes called a container) is simply an object that groups multiple elements into a single unit. Collections are used to store, retrieve, manipulate, and communicate aggregate data.

Why we need Collections when we have Array?
  • Arrays are immutable, once defined you cannot increase the size of an Array.
  • Array are not really thread-safe.
What is the difference between Collection and Collections?
Both are part of java collection framework, but both serve different purpose. Collection is a top level interface of java collection framework where as Collections is an utility class.

Collection interface is a member of java.util package and List, Set and Queue are main sub interfaces of this interface. JDK doesn’t provide any direct implementations of this interface. But, JDK provides direct implementations of it’s sub interfaces. ArrayList, Vector, HashSet, LinkedHashSet, PriorityQueue are some indirect implementations of Collection interface. Map interface, which is also a part of java collection framework, doesn’t inherit from Collection interface.

Collections is an utility class in java.util package. It consists of only static methods which are used to operate on objects of type Collection. e.g
  • Collections.max() : - This method returns maximum element in the specified collection.
  • Collections.min() : - This method returns minimum element in the given collection.
  • Collections.sort() : - This method sorts the specified collection.
  • Collections.shuffle() : - This method randomly shuffles the elements in the specified collection.
  • Collections.synchronizedCollection() : - This method returns synchronized collection backed by the specified collection.
  • Collections.binarySearch() : - This method searches the specified collection for the specified object using binary search algorithm.
  • Collections.disjoint() : - This method returns true if two specified collections have no elements in common.
  • Collections.copy() : - This method copies all elements from one collection to another collection.
  • Collections.reverse() : - This method reverses the order of elements in the specified collection.
What is the root interface in collection hierarchy ? 
Collection interface extends Iterable interface(which is present in java.lang package not in java.util package). In Oracle doc, iterable interface is not mentioned as a part of the Java Collections framework. So if interviewer ask you this question, then you should answer that the root interface is as Collection interface (which is found in java.util package).

What is a Set in Java?
The java.util.Set is an interface, which is a subtype of the java.util.Collection interface with following properties:
  • Duplicate elements are not allowed.
  • Elements are not stored in order. That means you cannot expect elements sorted in any order when iterating over elements of a Set.

What is the difference between List and Set?
Set contain only unique elements while List can contain duplicate elements. Set is unordered while List is ordered(maintains the order in which the objects are added).

What is the difference between Map and Set?

Map object has unique keys each containing some value, while Set contain only unique values.

What is a Java HashSet?
Java HashSet class is used to create a collection that uses a hash table for storage. It inherits the AbstractSet class and implements Set interface. The important points about Java HashSet class are:
  • HashSet stores the elements by using a mechanism called hashing.
  • HashSet contains unique elements only.

HashSet class declaration:
public class HashSet < E > extends AbstractSet implements Set < E > , Cloneable, Serializable 

Constructors of HashSet:
  • HashSet() : Construct a new, empty HashSet whose backing HashMap has the default capacity (16) and loadFacor (0.75).
  • HashSet(Collection c) : It is used to initialize the hash set by using the elements of the collection c.
  • HashSet(int capacity): It is used to initialize the capacity of the hash set to the given integer value capacity. The capacity grows automatically as elements are added to the HashSet.
Methods of Java HashSet class:
  • void clear() : It is used to remove all of the elements from this set.
  • boolean contains(Object o) : It is used to return true if this set contains the specified element.
  • boolean add(Object o) : It is used to adds the specified element to this set if it is not already present.
  • boolean isEmpty() : It is used to return true if this set contains no elements.
  • boolean remove(Object o) : It is used to remove the specified element from this set if it is present.
  • Object clone() : It is used to return a shallow copy of this HashSet instance: the elements themselves are not cloned.
  • Iterator iterator() : It is used to return an iterator over the elements in this set.
  • int size() : It is used to return the number of elements in this set.
 How does HashSet is implemented in Java?
HashSet is internally implemented using HashMap. Whenever we create a HashSet object, one HashMap object associated with it is also created. The elements we add into HashSet are stored as keys of this HashMap object, where as the value associated with those keys will be a constant.

Every constructor of HashSet class internally creates one HashMap object.

private transient HashMap map;
public HashSet()
{
    map = new HashMap<>();
}
public HashSet(Collection c)
{
    map = new HashMap<>(Math.max((int) (c.size()/.75f) + 1, 16));
    addAll(c);
}
public HashSet(int initialCapacity, float loadFactor)
{
    map = new HashMap<>(initialCapacity, loadFactor);
}
public HashSet(int initialCapacity)
{
    map = new HashMap<>(initialCapacity);
}


Whenever we insert an element into HashSet using add() method, it actually creates an entry in the internally backing HashMap object with element you have specified as it’s key and constant called “PRESENT” as it’s value. e.g:
private static final Object PRESENT = new Object();

add() method of HashSet class internally calls put() method of backing HashMap object by passing the element you have specified as a key and constant “PRESENT” as it’s value. e.g:
public boolean add(E e)
{
        return map.put(e, PRESENT)==null;
}

remove() method also works in the same manner.
public boolean remove(Object o)
{
        return map.remove(o)==PRESENT;
}


What is a Java LinkedHashSet?
Java LinkedHashSet class is a Linked list implementation of the set interface. It inherits HashSet class and implements Set interface. The important points about Java LinkedHashSet class are:
  • Contains unique elements only like HashSet.
  • Provides all optional set operations, and permits null elements.
  • Maintains insertion order.
Declaration for java.util.LinkedHashSet class:
public class LinkedHashSet <  E > extends HashSet < E > implements Set < E > , Cloneable, Serializable 

Constructors of Java LinkedHashSet class
  • LinkedHashSet() : It is used to construct a default LinkedHashSet.
  • LinkedHashSet(Collection c) : It is used to initialize the LinkedHashSet by using the elements of the collection c.
  • LinkedHashSet(int capacity) : It is used initialize the capacity of the LinkedHashSet to the given integer value capacity.
  • LinkedHashSet(int capacity, float fillRatio) : It is used to initialize both the capacity and the fill ratio (also called load capacity) of the hash set from its argument.  
What is a Java TreeSet?
Java TreeSet class implements the Set interface that uses a tree for storage. It inherits AbstractSet class and implements NavigableSet interface. The objects of TreeSet class are stored in ascending order. The important points about Java TreeSet class are:
  • Contains unique elements only like HashSet.
  • Access and retrieval times are quiet fast.
  • Maintains ascending order.
Declaration for java.util.TreeSet class.
public class TreeSet < E > extends AbstractSet < E > implements NavigableSet < E > , Cloneable, Serializable 
Constructors of Java TreeSet class
  • TreeSet() : It is used to construct an empty tree set that will be sorted in an ascending order according to the natural order of the tree set.
  • TreeSet(Collection c) : It is used to build a new tree set that contains the elements of the collection c.
  • TreeSet(Comparator comp) : It is used to construct an empty tree set that will be sorted according to given comparator.
  • TreeSet(SortedSet ss) : It is used to build a TreeSet that contains the elements of the given SortedSet.
Methods of Java TreeSet class
  • boolean addAll(Collection c) : It is used to add all of the elements in the specified collection to this set.
  • boolean contains(Object o) : It is used to return true if this set contains the specified element.
  • boolean isEmpty() : It is used to return true if this set contains no elements.
  • boolean remove(Object o) : It is used to remove the specified element from this set if it is present.
  • boolean add(Object o) : It is used to add the specified element to this set if it is not already present.
  • void clear() : It is used to remove all of the elements from this set.
  • Object clone() : It is used to return a shallow copy of this TreeSet instance.
  • Object first() : It is used to return the first (lowest) element currently in this sorted set.
  • Object last() : It is used to return the last (highest) element currently in this sorted set.
  • int size() : It is used to return the number of elements in this set.
What is the difference between HashSet and TreeSet?
Main differences between HashSet and TreeSet are :
a. HashSet stores the object in random order, there is no guarantee that the element we inserted first in the HashSet will be printed first in the output. While elements are sorted according to the natural ordering of its elements in TreeSet. If the objects can not be sorted in natural order than use compareTo() method to sort the elements of TreeSet object.
b. HashSet can store null object while TreeSet can not store null object. If one try to store null object in TreeSet object, it will throw Null Pointer Exception.
c. HashSet take constant time performance for the basic operations like add, remove contains and size. While TreeSet guarantees log(n) time cost for the basic operations(add,remove,contains).
d.  HashSet is much faster than TreeSet, as performance time of HashSet is constant against the log time of TreeSet for most operations (add,remove ,contains and size). Iteration performance of HashSet mainly depends on the load factor and initial capacity parameters.
e. HashSet uses equals() method for comparison in java while TreeSet uses compareTo() method for maintaining ordering.

Can a null element added to a Treeset or HashSet?
A null element can be added only if the TreeSet contains one element because when a second element is added then as per set definition a check is made to check duplicate value and comparison with null element will throw NullPointerException. HashSet is based on hashMap and can contain null element.

Why can't we add BigDecimal to TreeSet?
Lets look at the below example:
BigDecimal b1= new BigDecimal("1.0");
BigDecimal b2= new BigDecimal("1.00");
System.out.println(b1.compareTo(b2)); //0
System.out.println(b1.equals(b2)); //false

  
equals() of BigDecimal will return false, because their precision is different, whereas compareTo() will return 0 because the "value" are the same.  

Now lets do the same thing by using Double:
Double d1=new Double(1.0);
Double d2=new Double(1.00);
System.out.println(d1.compareTo(d2)); //0
System.out.println(d1.equals(d2)); //true

In above case, compareTo will return 0 and equals will return true.

So,if you store these two BigDecimal in HashSet you will end up with duplicates (violation of Set Contract), while if you store them in TreeSet you will end up with just 1 element because HashSet uses equals() to check duplicates while TreeSet uses compareTo() to check duplicates.
What is an ArrayList in Java?
Java ArrayList class uses a dynamic array for storing the elements. It inherits AbstractList class and implements List interface. The important points about Java ArrayList class are:
  • Java ArrayList class can contain duplicate elements.
  • Java ArrayList class maintains insertion order.
  • Java ArrayList class is non synchronized.
  • Java ArrayList allows random access because array works at the index basis.
  • In Java ArrayList class, manipulation is slow because a lot of shifting needs to be occurred if any element is removed from the array list.

The declaration for java.util.ArrayList class.
public class ArrayList < E > extends AbstractList < E > implements List < E > , RandomAccess, Cloneable, Serializable 

Constructors of Java ArrayList
  • ArrayList() : It is used to build an empty array list.
  • ArrayList(Collection c) : It is used to build an array list that is initialized with the elements of the collection c.
  • ArrayList(int capacity) : It is used to build an array list that has the specified initial capacity.
Methods of Java ArrayList
  • void add(int index, Object element) : It is used to insert the specified element at the specified position index in a list.
  • boolean addAll(Collection c) : It is used to append all of the elements in the specified collection to the end of this list, in the order that they are returned by the specified collection's iterator.
  • void clear() : It is used to remove all of the elements from this list.
  • int lastIndexOf(Object o) : It is used to return the index in this list of the last occurrence of the specified element, or -1 if the list does not contain this element.
  • Object[] toArray() : It is used to return an array containing all of the elements in this list in the correct order.
  • Object[] toArray(Object[] a) : It is used to return an array containing all of the elements in this list in the correct order.
  • boolean add(Object o) : It is used to append the specified element to the end of a list.
  • boolean addAll(int index, Collection c) : It is used to insert all of the elements in the specified collection into this list, starting at the specified position.
  • Object clone() : It is used to return a shallow copy of an ArrayList.
  • int indexOf(Object o) : It is used to return the index in this list of the first occurrence of the specified element, or -1 if the List does not contain this element.
  • void trimToSize() : It is used to trim the capacity of this ArrayList instance to be the list's current size.
What is the difference between Array and ArrayList in Java?
a. Array is static in size that is fixed length data structure, one can not change the length after creating the Array object. While ArrayList is dynamic in size. Each ArrayList object has instance variable capacity which indicates the size of the ArrayList. As elements are added to an ArrayList its capacity grows automatically.
b. ArrayList can not contains primitive data types (like int, float, double) it can only contains Object, while Array can contain both primitive data types as well as objects. One get a misconception that we can store primitives(int,float,double) in ArrayList, but it is not true. Let us take an example:

ArrayList  arraylistobject = new ArrayList();
arraylistobject.add(10); // try to add primitive 10

JVM through Autoboxing (converting primitives to equivalent objects internally) ensures that only objects are added to the arraylist object. Thus, above step internally works like this :

arraylistobject.add(new Integer(10)); //Converted int primitive to Integer object and added to arraylistobject
c. Length of the ArrayList is provided by the size() method while Each array object has the length variable which returns the length of the array.
e.g :
Integer arrayobject[] = new Integer[3];
arraylength= arrayobject.length;  //uses arrayobject length variable

ArrayList  arraylistobject = new ArrayList();
arraylistobject.add(10);
arraylistobject.size();//uses arraylistobject size method
d.  Array can be multi dimensional , while ArrayList is always single dimensional.
e. In Java, one can ensure Type Safety through Generics. while Array is a homogeneous data structure , thus it will contain objects of specific class or primitives of specific  data type. In array if one try to store the different data type other than the specified while creating the array object , ArrayStoreException is thrown. e.g :
String temp[] =  new String[2];  // creates a string array of size 2
temp[0] = new Integer(12); // throws ArrayStoreException, trying to add Integer object in String[]
f. We can insert elements into the arraylist object using the add() method while  in array we insert elements using the assignment operator.
e.g :
Integer addarrayobject[] = new Integer[5];
addarrayobject[0]= new Integer(1);


Similarities Between Array and ArrayList
a. add and get method : Performance of Array and ArrayList are similar for the add and get operations .Both operations runs in constant time.
b. Duplicate elements : Both array and arraylist can contain duplicate elements.
c. Null Values : Both can store null values and uses index to refer to their elements.
d. Unordered :  Both does not guarantee ordered elements.

How to convert the array of strings into the list ?
Array class of java.util package contains the method asList() which accepts the array as parameter.
e.g:
String[]  meArray =  {"I"  , "Me" , "Myself"};
List meList =  Arrays.asList(meArray);


What is the significance of ensureCapacity() method of Arraylist?
ArrayList internally implements growable dynamic array which means it can increase and decrease its size automatically. If we try to add an element to a already full ArrayList then it automatically re-sized internally to accommodate the new element however sometimes its not a good approach.

If you want to add huge number of elements to an already full ArrayList, in such case ArrayList has to be resized several number of times, which would result in a poor performance. For such scenarios you can use, ensureCapacity() method of Java.util.ArrayList class.  This method increases the size of the ArrayList by a specified capacity.

ensureCapacity(int minCapacity), takes minimum capacity as input parameter and does not return any value and does not throw any exception.

When should you use ArrayList and when should you use LinkedList?
  • If you need to support random access, without inserting or removing elements from anywhere other than the ends, then ArrayList is a better choice.
  • If you add and remove elements from the middle of the list frequently while accessing them sequentially, then LinkedList is better.
What is CopyOnWriteArrayList?  How it is different from  ArrayList in Java?
It is introduced in JDK1.5 and is a thread safe variant of ArrayList in which all mutative operations like add, set are implemented by creating a fresh copy of the underlying array. It guaranteed not to throw ConcurrentModificationException and it permits all elements including null.

Difference between Arraylist and Vector in Java
Its very common  Core Java Interview question
  • Vector is synchronized while ArrayList is not synchronized. Synchronization and thread safe means at a time only one thread can access the code .In Vector class all the methods are synchronized. Thats why the Vector object is already synchronized when it is created.
  • Vector is slow as it is thread safe. In comparison ArrayList is fast as it is non synchronized. Thus in ArrayList two or more threads can access the code at the same time, while Vector is limited to one thread at a time.
  • A Vector defaults to doubling size of its array. By default ArrayList size is 10. While when you insert an element into the ArrayList, it checks whether it reaches the last element then it will create the new array with a size = (Array size by + 50% of array size)., copy the new data of last array to new array, then old array is garbage collected by the Java Virtual Machine (JVM).
  • Other than Hashtable ,Vector is the only other class which uses both Enumeration and Iterator . While ArrayList can only use Iterator for traversing an ArrayList.
  • java.util.Vector class was there in java since the very first version of the java development kit (jdk). java.util.ArrayList  was introduced in java version 1.2 , as part of Java Collections framework. In java version 1.2, Vector class has been refactored to implement the List Inteface.
Vector implements a dynamic array. Following is the list of constructors provided by the vector class.
  • Vector( ) : This constructor creates a default vector, which has an initial size of 10.
  • Vector(int size) : This constructor accepts an argument that equals to the required size, and creates a vector whose initial capacity is specified by size.
  • Vector(int size, int incr) : This constructor creates a vector whose initial capacity is specified by size and whose increment is specified by incr. The increment specifies the number of elements to allocate each time that a vector is resized upward.
  • Vector(Collection c) : This constructor creates a vector that contains the elements of collection c.
Which collection classes are synchronized or thread-safe?
Stack, Properties, Vector and Hashtable.



What is the difference between Queue and Stack?
Queue is a data structure which is based on FIFO(first in first out) property. Stack is a data structure which is based on LIFO (last in first out) property.

What is an Iterator?
Iterator is an interface, which found in java.util package and provides methods to iterate over any Collection.

What is the difference between Iterator and Enumeration?
The main difference between Iterator and Enumeration is that Iterator has remove() method while Enumeration doesn't. Hence, using Iterator we can manipulate objects by adding and removing the objects from the collections. Enumeration behaves like a read only interface as it can only traverse the objects and fetch it.

Which design pattern followed by Iterator?
It follows iterator design pattern. Iterator design pattern provides us to navigate through the collection of objects by using a common interface without letting us know about the underlying implementation.

Enumeration is also an example of Iterator design pattern.

If an ArrayList has to be iterated to read data only, what are the possible ways and which is the fastest?
It can be done in two ways, using for loop or using iterator of ArrayList. The first option is faster than using iterator. Because value stored in arraylist is indexed access. So while accessing the value is accessed directly as per the index.

If accessing through iterator is slow then why do we need it and when to use it.
For loop does not allow the updation in the array(add or remove operation) inside the loop whereas Iterator does. Also Iterator can be used where there is no clue what type of collections will be used because all collections have iterator.

Why is it preferred to declare: List < String >  list = new ArrayList  < String > (); instead of ArrayList <  String  > = new ArrayList < String > ();
  • If later on code needs to be changed from ArrayList to Vector then only at the declaration place we can do that.
  • The most important one – If a function is declared such that it takes list. E.g void showDetails(List list);
  • When the parameter is declared as List to the function it can be called by passing any subclass of List like ArrayList,Vector,LinkedList making the function more flexible.
How to sort list of strings - case insensitive?
using Collections.sort(list, String.CASE_INSENSITIVE_ORDER);

How to make a List (ArrayList,Vector,LinkedList) read only?
A list implemenation can be made read only using Collections.unmodifiableList(list). This method returns a new list. If a user tries to perform add operation on the new list; UnSupportedOperationException is thrown.


What is the difference between Iterator and ListIterator.
Using Iterator we can traverse the list of objects in forward direction . But ListIterator can traverse the collection in both directions that is forward as well as backward.

What is the difference between fail-fast and fail-safe Iterators?
Fail-fast Iterators throws ConcurrentModificationException when one Thread is iterating over collection object and other thread structurally modify Collection either by adding, removing or modifying objects on underlying collection. They are called fail-fast because they try to immediately throw Exception when they encounter failure.

Where as, fail-safe iterator traverse over a copy or view of original collection.

Most of the Collection classes from Java 1.4 e.g. ArrayList, HashMap, HashSet has fail-fast iterators.

The other type of iterator was introduced in Java 1.5 when concurrent collection classes e.g. ConcurrentHashMap, CopyOnWriteArrayList and CopyOnWriteArraySet was introduced. These iterator uses a view of original collection for doing iteration and that's why they doesn't throw ConcurrentModificationException even when original collection was modified after iteration has begun.  This means you could iterate and work with stale value, but this is the cost you need to pay for fail-safe iterator.

What is the difference between the remove() method of Collection and remove() method of Iterator?
Collection interface defines remove(Object obj) method to remove objects from Collection. List interface adds another method remove(int index), which is used to remove object at specific index. You can use any of these method to remove an entry from Collection, while not iterating.

But, you should not use Collection's or List's remove() method during iteration then your code will throw ConcurrentModificationException. Instead use Iterator's remove() method, which removes current element from Iterator's perspective.


How to reverse the List in Collections?
Collections.reverse(listobject);

What is the difference between peek(), poll() and remove() method of the Queue interface?Both poll() and remove() method is used to remove head object of the Queue. The main difference lies when the Queue is empty(), then poll() method will return null while remove() method will throw NoSuchElementException. peek() method retrieves but does not remove the head of the Queue. If queue is empty then peek() method also returns null.


What is the difference between Synchronized Collection and Concurrent Collection?
Though all three collection classes (ConcurrentHashMap, Hashtable, Synchronized Map) are thread-safe and can be used in multi-threaded, concurrent Java application, there is a significant difference between them.

All methods of Hashtable and Synchronized Map are synchronized which makes them quite slow due to contention if a number of thread increases.

ConcurrentHashMap is scalable as it uses stripped locking technique and is specially designed for concurrent use i.e. more than one thread. By default it simultaneously allows 16 threads to read and write from Map without any external synchronization. Unlike Hashtable and Synchronized Map, it never locks whole Map, instead, it divides the map into segments and locking is done on those. Though it performs better if a number of reader threads are greater than the number of writer threads.


What is a Map?
Its an interface which maps unique keys to values. A key is an object that you use to retrieve a value at a later date.
  • Several methods throw a NoSuchElementException when no items exist in the invoking map.
  • A ClassCastException is thrown when an object is incompatible with the elements in a map.
  • A NullPointerException is thrown if an attempt is made to use a null object and null is not allowed in the map.
  • An UnsupportedOperationException is thrown when an attempt is made to change an unmodifiable map.
Method's of Map:
  • void clear( ) : Removes all key/value pairs from the invoking map.
  • boolean containsKey(Object k) : Returns true if the invoking map contains k as a key. Otherwise, returns false.
  • boolean containsValue(Object v) : Returns true if the map contains v as a value. Otherwise, returns false.
  • Set entrySet( ) : Returns a Set that contains the entries in the map. The set contains objects of type Map.Entry. This method provides a set-view of the invoking map.
  • boolean equals(Object obj): Returns true if obj is a Map and contains the same entries. Otherwise, returns false.
  • Object get(Object k) : Returns the value associated with the key k.
  • int hashCode( ) : Returns the hash code for the invoking map.
  • boolean isEmpty( ) : Returns true if the invoking map is empty. Otherwise, returns false.
  • Set keySet( ) : Returns a Set that contains the keys in the invoking map. This method provides a set-view of the keys in the invoking map.
  • Object put(Object k, Object v) : Puts an entry in the invoking map, overwriting any previous value associated with the key. The key and value are k and v, respectively. Returns null if the key did not already exist. Otherwise, the previous value linked to the key is returned.
  • void putAll(Map m) : Puts all the entries from m into this map.
  • Object remove(Object k) : Removes the entry whose key equals k.
  • Collection values( ) : Returns a collection containing the values in the map. This method provides a collection-view of the values in the map.
Why Map interface does not extend the Collection interface in Java Collections Framework ?
Map interface is not compatible with the Collection interface. Since Map requires key as well as value, e.g if we want to add key-value pair then we will use put(Object key , Object value). So there are two parameters required to add element to the HashMap object. In Collection interface add(Object o) has only one parameter. The other reason is, Map supports valueSet, keySet as well as other appropriate methods which have just different views from the Collection interface.

What is a Hashtable?
Java Hashtable class implements Map, which maps keys to values. Here is the declaration for java.util.Hashtable class.

public class Hashtable < K,V > extends Dictionary implements Map < K,V >, Cloneable, Serializable

where, K is the type of keys maintained by this map and V is the type of mapped values.

The important points about Java Hashtable class are:
  • A Hashtable is an array of list. Each list is known as a bucket. The position of bucket is identified by calling the hashcode() method. A Hashtable contains values based on the key.
  • It contains only unique elements.
  • It may have not have any null key or value.
  • It is synchronized.
Constructors of Java Hashtable class
  • Hashtable() : Constructs a new, empty hashtable with a default initial capacity (11) and load factor (0.75).
  • Hashtable(int initialCapacity) : Constructs a new, empty hashtable with the specified initial capacity and default load factor (0.75).
  • Hashtable(int initialCapacity, float loadFactor) : Constructs a new, empty hashtable with the specified initial capacity and the specified load factor.
The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased. When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table is rehashed (that is, internal data structures are rebuilt) so that the hash table has approximately twice the number of buckets.

As a general rule, the default load factor (.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the lookup cost (reflected in most of the operations of the HashMap class, including get and put). The expected number of entries in the map and its load factor should be taken into account when setting its initial capacity, so as to minimize the number of rehash operations. If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash operations will ever occur.

A hash table is a collection of items which are stored in such a way as to make it easy to find them later. Each position of the hash table, often called a slot, can hold an item and is named by an integer value starting at 0.

Initially, the hash table contains no items so every slot is empty.

hashCode() is a hash function which is present in Object class and returns an integer value (known as Hash value). This function provides the mapping between an item and the slot where that item belongs in the hash table.

hash = hashfunc(key)
index = hash % array_size


The hash function will take any item in the collection and return an integer in the range of slot names, between 0 and m-1.

Once the hash values have been computed, we can insert each item into the hash table at the designated position. When we want to search for an item, we simply use the hash function to compute the slot name for the item and then check the hash table to see if it is present. This searching operation is O(1), since a constant amount of time is required to compute the hash value and then index the hash table at that location. If everything is where it should be, we have found a constant time search algorithm.

Collision (it may also be called a “clash”), when two or more items would need to be in the same slot, which will create a problem for the hashing technique.

How to write a perfect hash function?
A hash function that maps each item into a unique slot is referred to as a perfect hash function. However, irrespective of how good a hash function is, collisions are bound to occur. Nevertheless, we do not need the hash function to be perfect to still gain performance efficiency.

One way to always have a perfect hash function is to increase the size of the hash table so that each possible value in the item range can be accommodated. This guarantees that each item will have a unique slot. Although this is practical for small numbers of items, it is not feasible when the number of possible items is large.

As writing a perfect hash function is not alays possible, our goal should be to create a hash function that minimizes the number of collisions, is easy to compute, and evenly distributes the items in the hash table. There are a number of common ways to extend the simple remainder method. Here a few ways, we can write hash function:
#folding method : In this method we divide the item (key) into equal-size pieces (the last piece may not be of equal size). These pieces are then added together to give the resulting hash value. For example, if our item was the phone number 9920600280, we would take the digits and divide them into groups of 2 (99, 20, 60, 02, 80). After the addition, 99+20+60+02+80=195. If we assume our hash table has 11 slots, then we need to perform the extra step of dividing by 11 and keeping the remainder. In this case 195 % 11 is 8, so the phone number 9920600280 hashes to slot 8.

#mid-square method: In this method, we first square the item, and then extract some portion of the resulting digits. For example, if the item were 44, we would first compute the square of 44=1936. By extracting the middle two digits, 93, and performing the remainder step, we get 5 (93 % 11). Table 5 shows items under both the remainder method and the mid-square method.

How Hashmap works in Java
HashMap in Java works on hashing principle. It is a data structure which allows us to store object and retrieve it in constant time O(1) provided we know the key.

To understand Hashing, we first need to understand what is Hash Function, Hash Value and Bucket?
  • hashCode() is a hash function  which is present in  Object class and returns an integer value.
  • The Hash value is the int value returned by the hash function.
  • A bucket is used to store key value pairs and can have multiple key-value pairs. In hash map, bucket used simple linked list to store objects. 
In hashing, hash functions are used to link key and value in HashMap. Objects are stored by calling put(key, value) method of HashMap and retrieved by calling get(key) method.

When we call put method, hashcode() method of the key object is called so that hash function of the map can find a bucket location to store value object, which is actually an index of the internal array, known as the table.

HashMap internally stores mapping in the form of Map.Entry object which contains both key and value object. (Entry object stores in the bucket like this (hash,key,value,bucketindex)

When you want to retrieve the object, you call the get() method and again pass the key object. This time again key object generate same hash code (it's mandatory for it to do so to retrieve the object and that's why HashMap keys are immutable e.g. String) and we end up at same bucket location. If there is only one object then it is returned and that's your value object which you have stored earlier. Things get little tricky when collisions occur.

Since the internal array of HashMap is of fixed size, and if you keep storing objects, at some point of time hash function will return same bucket location for two different keys, this is called collision in HashMap. (two unequal objects in Java can have same hashcode). In this case, a linked list is formed at that bucket location and a new entry is stored as next node.

If we try to retrieve an object from this linked list, we need an extra check to search correct value, this is done by equals() method. Since each node contains an entry, HashMap keeps comparing entry's key object with the passed key using equals() and when it return true, Map returns the corresponding value.

Which methods you need to override to use any object as key in HashMap?
equals() and hashCode()

What happens On HashMap in Java if the size of the HashMap  exceeds a given threshold defined by load factor?
If the size of the Map exceeds a given threshold defined by load-factor e.g. if the load factor is .75 it will act to re-size the map once it filled 75%. Similar to other collection classes like ArrayList, Java HashMap re-size itself by creating a new bucket array of size twice of the previous size of HashMap and then start putting every old element into that new bucket array. This process is called rehashing because it also applies the hash function to find new bucket location.

Does resizing of HashMap in Java creates any problem?
Yes there is potential race condition exists while resizing HashMap in Java, if two thread at the same time found that now HashMap needs resizing and they both try to resizing. On the process of resizing of HashMap, the element in the bucket which is stored in linked list get reversed in order during their migration to new bucket because Java HashMap doesn't append the new element at tail instead it append new element at the head to avoid tail traversing. If race condition happens then you will end up with an infinite loop.

What is the difference between HashMap and Hashtable?
  • HashMap allows one null key and any number of null values while Hashtable does not allow null keys and null values.
  • HashMap is not synchronized or thread-safe while Hashtable is synchronized or thread-safe .
  • Hashmap object values are iterated by using iterator. HashTable is the only class other than vector which uses enumerator to iterate the values of HashTable object.
  • Hashmap is much faster and uses less memory than Hashtable as former is unsynchronized. Unsynchronized objects are often much better in performance in compare to synchronized  object like Hashtable in single threaded environment.
  • HashMap is the subclass of the AbstractMap class, where as Hashtable is a subclass of Dictionary class(which is now obsolete in Jdk 1.7,so it is not used anymore). It is better off externally synchronizing a HashMap or using a ConcurrentMap implementation (e.g ConcurrentHashMap). Although Hashtable and HashMap has different superclasses but they both are implementations of the "Map" abstract data type.
Similarities Between HashMap and Hashtable
  • Both HashMap and Hashtable  does not guarantee that the order of the map will remain constant over time. Instead use LinkedHashMap, as the order remains constant over time.
  • Both HashMap and Hashtable implements Map interface .
  • Both HashMap and Hashtable provides constant time performance for put and get methods assuming that the objects are distributed uniformly across the bucket.
  • Both HashMap and Hashtable works on the Principle of Hashing .
Why String, Integer and other wrapper classes are considered good keys?
String is most frequently used key as well because String is immutable and final, and overrides equals and hashcode() method. Other wrapper class also shares similar property. Immutability is required, in order to prevent changes on fields used to calculate hashCode() because if key object returns different hashCode during insertion and retrieval than it won't be possible to get an object from HashMap.

Immutability is best as it offers other advantages as well like thread-safety, if you can keep your hashCode same by only making certain fields final, then you go for that as well. Since equals() and hashCode() method is used during retrieval of value object from HashMap, it's important that key object correctly override these methods and follow contact. If unequal object returns different hashcode than chances of collision will be less which subsequently improve the performance of HashMap.

Can we use any custom object as a key in HashMap?
Yes we can use any Object as key in Java HashMap provided it follows equals and hashCode contract and its hashCode should not vary once the object is inserted into Map. If the custom object is Immutable than this will be already taken care because you can not change it once created.

ALSO READ: HashCode & equals() Interview Questions in Java

How null key is handled in HashMap? Since equals() and hashCode() are used to store and retrieve values, how does it work in case of the null key?

Null keys always map to hash 0, thus index 0.

The null key is handled specially in HashMap, there are two separate methods for that putForNullKey(V value) and getForNullKey(). Later is offloaded version of get() to look up null keys.  Null keys always map to index 0.

This null case is split out into separate methods for the sake of performance in the two most commonly used operations (get and put), but incorporated with conditionals in others. In short, equals() and hashcode() method are not used in case of null keys in HashMap.

private V getForNullKey() {
        if (size == 0) {
            return null;
        }
        for (Entry e = table[0]; e != null; e = e.next) {
            if (e.key == null)
                return e.value;
        }
        return null;
}


How Remove method in HashMap works internally in Java?
To understand how remove method of HashMap works, we first need to understand the Entry object. Map.Entry is the static nested class that stores the key/value pair that forms one element of HashMap. Entry object stores in the bucket in the following way (hash,key,value,bucketindex).

That means we need hashvalue and bucketindex besides key to get access to the desired Entry object in HashMap.

HashMap implementation of the remove(key) method in Java Apis that is rt.jar :

public class HashMap< K,V >
extends AbstractMap< K,V >
implements Map< K,V >, Cloneable, java.io.Serializable
{
 public V remove (Object key){
 Entry< K,V > e =  removeEntryForKey(key);
 return (e==null ? null : e.value);
 }
}


remove(key) method calls  removeEntryForKey(key) method internally, which calculate the final hashValue of the key object, and then use that hashValue in the indexFor(int,int) method to find the first entry object in the appropriate bucket.

Since bucket(table) is a LinkedList effectively, we start traversing from the first entry object which we got by using indexFor(int,int) method in the bucket. For each entry object in the bucket we compare whether  hashValue and the key is equal to the calculated hashValue in the first step and the key passed as a parameter in the remove(key) method.

If desired Entry object is found, then we removed that single entry object from the LinkedList. Removing a single Entry object from the LinkedList is implemented just like removing a single object from the LinkedList.

Why String is popular HashMap key in Java?
Since String is immutable, its hashcode is cached at the time of creation and it doesn’t need to be calculated again. This makes it a great candidate for key in a Map and it’s processing is fast than other HashMap key objects. This is why String is mostly used Object as HashMap keys.

What is a LinkedHashMap?
LinkedHashMap class is a Linked list implementation of the Map interface, with predictable iteration order. It inherits HashMap class and implements the Map interface. The important points about Java HashMap class are:
  • A LinkedHashMap contains values based on the key.
  • It contains only unique elements.
  • It may have one null key and multiple null values.
  • It is same as HashMap instead maintains insertion order.
Here is the declaration for java.util.LinkedHashMap class, where, K is the type of keys maintained by this map and V is the type of mapped values :

public class LinkedHashMap < K,V > extends HashMap < K,V > implements Map < K,V >

Constructors of Java LinkedHashMap class
  • LinkedHashMap()   : It is used to construct a default LinkedHashMap.
  • LinkedHashMap(int capacity) :   It is used to initialize a LinkedHashMap with the given capacity.
  • LinkedHashMap(int capacity, float loadfactor) : It is used to initialize both the capacity and the loadfactor.
  • LinkedHashMap(Map m) : It is used to initialize the LinkedHashMap with the elements from the given Map class m.
What is a TreeMap?
Java TreeMap class implements the Map interface by using a tree. It provides an efficient means of storing key/value pairs in sorted order. The important points about Java TreeMap class are:
  • A TreeMap contains values based on the key. It implements the NavigableMap interface and extends AbstractMap class.
  • It contains only unique elements.
  • It cannot have null key but can have multiple null values.
  • It is same as HashMap instead maintains ascending order.

here is TreeMap class declaration, where, K is the type of keys maintained by this map and V is the type of mapped values :
public class TreeMap < K,V > extends AbstractMap < K,V > implements NavigableMap < K,V > , Cloneable, Serializable 
Constructors of Java TreeMap class
  • TreeMap() : It is used to construct an empty tree map that will be sorted using the natural order of its key.
  • TreeMap(Comparator comp) : It is used to construct an empty tree-based map that will be sorted using the comparator comp.
  • TreeMap(Map m) : It is used to initialize a tree map with the entries from m, which will be sorted using the natural order of the keys.
  • TreeMap(SortedMap sm) : It is used to initialize a tree map with the entries from the SortedMap sm, which will be sorted in the same order as sm.
Methods of TreeMap:
  • boolean containsKey(Object key) : It is used to return true if this map contains a mapping for the specified key.
  • boolean containsValue(Object value) : It is used to return true if this map maps one or more keys to the specified value.
  • Object firstKey() : It is used to return the first (lowest) key currently in this sorted map.
  • Object get(Object key) : It is used to return the value to which this map maps the specified key.
  • Object lastKey() : It is used to return the last (highest) key currently in this sorted map.
  • Object remove(Object key) : It is used to remove the mapping for this key from this TreeMap if present.
  • void putAll(Map map) : It is used to copy all of the mappings from the specified map to this map.
  • Set entrySet() : It is used to return a set view of the mappings contained in this map.
  • int size() : It is used to return the number of key-value mappings in this map.
  • Collection values() : It is used to return a collection view of the values contained in this map. 
What is difference between HashMap and TreeMap?
HashMap can contain one null key, where as TreeMap can not contain any null key.
HashMap maintains no order, whereas TreeMap maintains ascending order.

What is IdentityHashMap?
IdentityHashMap in Java was added in Java 1.4. The main difference between IdentityHashMap and HashMap in Java is that IdentityHashMap is a special implementation of Map interface which doesn't use equals() and hashCode() method for comparing object unlike other implementation of Map e.g. HashMap. Instead, IdentityHashMap uses equality operator "=="  to compare keys and values in Java which makes it faster compare to HashMap and suitable where you need reference equality check and instead of logical equality.

By the way, IdentityHashMap is a special implementation of Map interface much like EnumMap but it also violates general contract of Map interface which mandates using equals method for comparing Object.

Declaration for java.util.IdentityHashMap class:
public class IdentityHashMap < K,V >  extends AbstractMap < K,V > implements Map , Serializable, Cloneable

Constructors of IdentityHashMap:
  • IdentityHashMap() : This constructs a new, empty identity hash map with a default expected maximum size (21).
  • IdentityHashMap(int expectedMaxSize) : Tis constructs a new, empty map with the specified expected maximum size.
  • IdentityHashMap(Map < ? extends K,? extends V >  m) : This constructs a new identity hash map containing the keys-value mappings in the specified map.

Difference between IdentityHashMap and HashMap?
  • The main difference between HashMap vs IdentityHashMap is that IdentityHashMap uses equality operator "==" for comparing keys and values inside Map while HashMap uses equals() method for comparing keys and values.
  • Unlike HashMap, who uses hashcode to find bucket location, IdentityHashMap also doesn't use hashCode() instead it uses System.identityHashCode(object).
  • Since IdentityHashMap doesn't use equals() its comparatively faster than HashMap for object with expensive equals() and hashCode().
  • One more difference between HashMap and IdentityHashMap is Immutability of the key. One of the basic requirement to safely store Objects in HashMap is keys needs to be immutable, IdentityHashMap doesn't require keys to be immutable as it is not relied on equals and hashCode.

What is ConcurrentHashMap?
ConcurrentHashMap is introduced as an alternative of Hashtable and provided all functions supported by Hashtable with an additional feature called "concurrency level", which allows ConcurrentHashMap to partition Map.

ConcurrentHashMap allows multiple readers to read concurrently without any blocking. This is achieved by partitioning Map into different parts based on concurrency level and locking only a portion of Map during updates. Default concurrency level is 16, and accordingly Map is divided into 16 part and each part is governed with a different lock. This means, 16 thread can operate on Map simultaneously until they are operating on different part of Map. This makes ConcurrentHashMap high performance despite keeping thread-safety intact.

Since update operations like put(), remove(), putAll() or clear() is not synchronized, concurrent retrieval may not reflect most recent change on Map.

Difference between HashMap and ConcurrentHashMap?
a. ConcurrentHashMap is thread-safe that is the code can be accessed by single thread at a time. While HashMap is not thread-safe.
b. ConcurrentHashMap doesn’t throw a ConcurrentModificationException if one thread tries to modify it while another is iterating over it.
c. HashMap can be synchronized by synchronizedMap(HashMap) method, which will return a collection which is almost equivalent to Hashtable, where every modification operation on Map is locked on Map object while in case of ConcurrentHashMap, thread-safety is achieved by dividing whole Map into different partition based upon Concurrency level and only locking particular portion instead of locking the whole Map.
d. ConcurrentHashMap does not allow null values, so the key can not be null in ConcurrentHashMap. While In HashMap there can only be one null key.

What is WeakHashMap?A hashtable-based Map implementation with weak keys. An entry in a WeakHashMap will automatically be removed when its key is no longer in ordinary use. More precisely, the presence of a mapping for a given key will not prevent the key from being discarded by the garbage collector, that is, made finalizable, finalized, and then reclaimed. When a key has been discarded its entry is effectively removed from the map, so this class behaves somewhat differently than other Map implementations.

How can you Iterate over HashMap, Hashtable or any Map in Java
    public static void main(String[] args)
    {
        //create a map which contains Employee-Manager's mapping
        HashMap < String, String >  empManager = new HashMap < String, String > ();
        empManager.put("Martin", "Russhal");
        empManager.put("Andreas", "Mattrias");
        empManager.put("Elle", "Russhal");
     
        //using foreach loop on KeySet
        for (String key : empManager.keySet()) {
             System.out.println("Employee: " + key + ", Manager: " + empManager.get(key));
        }
     
        //using KeySet Iterator
        Set < String >  keySet = empManager.keySet();
        Iterator < String >  keySetIterator = keySet.iterator();
        String tempKey;
        while (keySetIterator.hasNext()) {
            tempKey = keySetIterator.next();
            System.out.println("Employee: " + tempKey + ", Manager: " + empManager.get(tempKey));
        }
     
        //using entrySet
        Set < Map.Entry < String, String >  >  entrySet = empManager.entrySet();
        for (Map.Entry < String, String >  entry : entrySet) {
           System.out.println("Employee: " + entry.getKey() + ", Manager: " + entry.getValue());
        }

        //print the values using map.values()
        Collection < String >  managerval=empManager.values();
        for (String manager: managerval)
        {
            System.out.println("Manager Name:"+manager);
        }
     
        //forEach and Lambda of Java 8
        //empManager.forEach((k,v)- > System.out.println("Employee : " + k + ", Manager : " + v));
    }

What will be the output of below program?
HashMap < ArrayList , String> al = new HashMap < ArrayList ,, String>();
ArrayList l1=new ArrayList();
ArrayList l2=new ArrayList();
al.put(l1,  "l1");
al.put(l2,  "l2");
System.out.println("Size="+al.size());


Output will be "Size=1", because l1.equals(l2) will return true. But suposse you add an element in l1 or l2 then output might or might not be 1 depending on the reference of String added in the l1 and l2.

-K Himaanshu Shuklaa..

1 comment:

Anonymous said...

thanks you sir

Post a Comment

RSSChomp Blog Directory