July 05, 2024

#Collections: A Guide to Spliterator in Java

What is a Spliterator?

  • A Spliterator is an iterator-like object that can traverse and partition sequences of elements. It combines the functionalities of an Iterator and a Splitter, allowing efficient parallel traversal and decomposition of elements from a source. It supports both sequential and parallel processing of data structures.
  • A Spliterator can traverse and split the underlying data source into multiple parts for concurrent processing. This makes it suitable for parallel computation.
  • Unlike traditional Iterators, Spliterators can partition off some of their elements, which can be processed in parallel by different threads. This feature enhances performance for large data sets.
  • Spliterators are designed to be stateless and immutable. This ensures that they can safely be used concurrently by multiple threads without interference.
  • Spliterators can be either ordered or unordered. Ordered spliterators maintain a specific order of elements (e.g., in a list or array), while unordered spliterators do not guarantee any specific order (e.g., in a set or hash map).
A Spliterator has eight different characteristics that describe its behaviour.
  • SIZED – if it’s capable of returning an exact number of elements with the estimateSize() method
  • SORTED – if it’s iterating through a sorted source
  • SUBSIZED – if we split the instance using a trySplit() method and obtain Spliterators that are SIZED as well
  • CONCURRENT – if the source can be safely modified concurrently
  • DISTINCT – if for each pair of encountered elements x, y, !x.equals(y)
  • IMMUTABLE – if elements held by the source can’t be structurally modified
  • NONNULL – if the source holds nulls or not
  • ORDERED – if iterating over an ordered sequence

What are the main methods of Spliterator?

Spliterator interface defines 8 methods:

  • int characteristics(): Returns a set of characteristics of this Spliterator and its elements. The result is from ORDERED, DISTINCT, SORTED, SIZED, CONCURRENT, NONNULL, SUBSIZED, and IMMUTABLE.
  • long estimateSize( ): Provides an estimate of the number of elements remaining to be traversed or split. It returns Long.MAX_VALUE if infinite, unknown, or too expensive to compute.
  • default long getExactSizeIfKnown( ) : It returns estimateSize() if this Spliterator is SIZED, else -1.
  • default Comparator<? super T> getComparator( ): If this Spliterator’s source is SORTED by a Comparator, returns that Comparator. It will return null if the source is SORTED in natural order. Otherwise, if the source is not SORTED, throws IllegalStateException.
  • default boolean hasCharacteristics(int val) : It returns true if this Spliterator’s characteristics() contain all of the given characteristics.
  • boolean tryAdvance(Consumer<? super T> action): Returns true if a remaining element exists, performs the given action on it; else returns false. If this Spliterator is ORDERED the action is performed on the next element in encounter order. Exceptions thrown by the action are relayed to the caller.
  • default void forEachRemaining(Consumer<? super T>action): This method performs the given action for each remaining element, sequentially in the current thread, until all elements have been processed or the action throws an exception. The actions are performed in encounter order if this Spliterator is ORDERED. If the action throws some exceptions they are relayed to the caller.
  • Spliterator<T> trySplit( ): Splits the elements covered by the spliterator into two parts, returning a new spliterator covering one of the parts. Otherwise, returns null. Thus, if successful, the original spliterator iterates over one portion of the sequence and the returned spliterator iterates over the other portion. This is the key method for enabling parallel processing.

Explain how Spliterator differs from Iterator in Java 8+.

Iterator is primarily used for sequential traversal and supports basic operations like hasNext(), next(), and remove(). Where as Spliterator, supports parallel traversal of elements and can partition a source of elements for parallel processing. It provides additional methods like trySplit() for splitting into two parts.

Does the Spliterator API support primitive values including double, int and long?

Yes, the Spliterator API in Java supports primitive values through specialized interfaces that handle primitive types directly. These interfaces are designed to improve performance by avoiding the overhead of boxing and unboxing primitive values to their corresponding wrapper types (Integer, Long, Double, etc.).

Java provides specialized spliterator interfaces for each of the primitive types:
  • Spliterator.OfInt: This interface is used for int values. It extends Spliterator<Integer> but operates directly on int values without boxing/unboxing.
  • Spliterator.OfLong: This interface handles long values. It extends Spliterator<Long> but operates directly on long values.
  • Spliterator.OfDouble: This interface is used for double values. It extends Spliterator<Double> but operates directly on double values.

What are the benefits of using Primitive Spliterators?

  • Performance: Avoids the overhead of boxing and unboxing primitive values to their wrapper types (Integer, Long, Double), which can improve performance significantly, especially in large datasets or parallel processing scenarios.
  • Memory Efficiency: Reduces memory usage by eliminating the need for objects to wrap each primitive value.
  • Type Safety: Ensures type safety when working directly with primitive values, preventing accidental misuse or errors.
-K Himaanshu Shuklaa..

No comments:

Post a Comment