April 04, 2017

Streams in Java 8 interview questions and answers

What is a Java Stream?
A stream represents a sequence of elements and supports different kinds of operations to perform computations on those elements.

In simple terms, a stream is like an iterator whose role is to accept a set of actions to apply to each of the elements it contains.

Why do we need Streams?
The purpose of Streams in Java 8 is to make collection processing simple and concise.

Suppose we want to iterate over a list of integers and find the sum of all the integers greater than 70. Prior to Java 8, we could do it this way:

// needs java.util.ArrayList, java.util.Iterator and java.util.List
List<Integer> intlist = new ArrayList<Integer>();
intlist.add(80);
intlist.add(80);
intlist.add(10);

private static int findSum(List<Integer> intlist) {
    // typed iterator, so it.next() can be unboxed to int
    Iterator<Integer> it = intlist.iterator();
    int sum = 0;
    while (it.hasNext()) {
        int num = it.next();
        if (num > 70)
            sum += num;
    }
    return sum;
}


To compute the sum, we have to spell out how the iteration takes place. This is called external iteration, because the client program handles the algorithm for iterating over the list. We have to write a lot of code to perform a single task, and because the program is sequential in nature, there is no easy way to run it in parallel.

The Java 8 Stream API was introduced to overcome these shortcomings. With it we can use internal iteration, which is better because the Java framework is in control of the iteration.

Internal iteration provides several features such as sequential and parallel execution, filtering based on given criteria, mapping etc. Most of the Java 8 Stream API method arguments are functional interfaces, so lambda expressions work very well with them. In Java 8, we can find the sum of all the integers greater than 70 in a list like this:

private static int findSumUsingStream(List<Integer> intlist) {
    return intlist.stream()
            .filter(i -> i > 70)
            .mapToInt(i -> i)
            .sum();
}
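
Because the iteration is internal, running the same pipeline in parallel only requires asking for a parallel stream. The following is a minimal sketch; the class name ParallelSumDemo and the method findSumInParallel are made up for illustration:

```java
import java.util.Arrays;
import java.util.List;

class ParallelSumDemo {
    // Same pipeline as the sequential version; the framework may split the work across threads.
    static int findSumInParallel(List<Integer> intlist) {
        return intlist.parallelStream()
                .filter(i -> i > 70)
                .mapToInt(Integer::intValue)
                .sum();
    }

    public static void main(String[] args) {
        List<Integer> intlist = Arrays.asList(80, 80, 10);
        System.out.println(findSumInParallel(intlist)); // 160
    }
}
```

For a list this small the parallel version is unlikely to be faster; the point is only that the calling code does not change.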


What is the difference between intermediate and terminal operations?
Intermediate operations return a stream, so we can chain multiple intermediate operations in a single statement.

Intermediate operations are always lazy: they do not process the stream at the call site, and they only process data once a terminal operation is invoked. Some of the intermediate operations are filter, map and flatMap.

Terminal operations are either void or return a non-stream result. These operations terminate the pipeline and initiate stream processing. The stream is passed through all intermediate operations during the terminal operation call. Terminal operations include forEach, reduce and collect, as well as sum on the primitive streams.


Stream operations must be non-interfering and stateless: Most stream operations accept some kind of lambda expression parameter, a functional interface specifying the exact behavior of the operation. Most of those lambda expressions must be both non-interfering and stateless.

A function is non-interfering when it does not modify the underlying data source of the stream, e.g. in the above example no lambda expression modifies 'intlist' by adding or removing elements from the collection.

A function is stateless when its result does not depend on any state that might change while the pipeline runs, so that the execution of the operation is deterministic, e.g. in the above example no lambda expression depends on any mutable variables or states from the outer scope which might change during execution.
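
As a small illustration of these two rules, the sketch below (class and method names are hypothetical) modifies the source list before the terminal operation starts, which is allowed for the standard collections, and uses only a stateless lambda inside the pipeline:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class NonInterferenceDemo {
    static String joinAfterLateAdd() {
        List<String> words = new ArrayList<>(Arrays.asList("one", "two"));
        Stream<String> stream = words.stream();
        // Allowed: the source is modified BEFORE the terminal operation starts.
        words.add("three");
        // String::toUpperCase is stateless: its result depends only on its argument.
        return stream.map(String::toUpperCase).collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        System.out.println(joinAfterLateAdd()); // ONE TWO THREE
    }
}
```

Adding to or removing from words inside one of the lambdas instead would be interference, and with an ArrayList source it may fail with a ConcurrentModificationException.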

What is the difference between Streams and Collections?
A collection is an in-memory data structure that holds values, and all the values must be populated before we start using it. A Java Stream, in contrast, is computed on demand.

A Java Stream doesn't store data. It operates on the source data structure (a collection or an array) and produces pipelined data on which we can perform specific operations. For example, we can create a stream from a list and filter it based on a condition.

Java Stream operations use functional interfaces, which makes them a very good fit for functional programming with lambda expressions.

Internal iteration in Java 8 Streams makes lazy evaluation possible for some of the stream operations (like filtering, mapping, or duplicate removal).

Java Streams are consumable: once a terminal operation has been called on a stream, it cannot be used again. Since the data is produced on demand, it's not possible to reuse the same stream multiple times.

Java 8 Streams support sequential as well as parallel processing. Parallel processing can be very helpful in achieving high performance for large collections.

How do Streams work?
A stream represents a sequence of elements and supports different kinds of operations to perform computations on those elements, e.g.:

List<String> myList = Arrays.asList("andreas", "allena", "michael", "marina", "michelle", "droy");

myList.stream()
.filter(s -> s.startsWith("m"))
.map(String::toUpperCase)
.sorted()
.forEach(System.out::println);


Output will be:
MARINA
MICHAEL
MICHELLE


What are the different kinds of streams?

Streams can be created from various data sources, especially collections. Lists and Sets support the new methods stream() and parallelStream() to create either a sequential or a parallel stream. Parallel streams are capable of operating on multiple threads.

Arrays.asList("Kempten", "Munich", "Berlin")
.stream()
.findFirst()
.ifPresent(System.out::println);  // Kempten


Calling the method stream() on a list of objects returns a regular object stream. But we don't have to create collections in order to work with streams. We can use Stream.of() to create a stream from a bunch of object references. For example:

Stream.of("Kempten", "Munich", "Berlin")
.findFirst()
.ifPresent(System.out::println);  // Kempten


All the Java Stream API interfaces and classes are in the java.util.stream package. Collections can only hold primitive values such as int or long through auto-boxing, and boxing and unboxing every element can take a lot of time, so there are specific stream types (IntStream, LongStream and DoubleStream) to handle primitive types.

IntStreams can replace the regular for-loop utilizing IntStream.range():

IntStream.range(1, 5).forEach(System.out::println);
// 1
// 2
// 3
// 4

All those primitive streams work just like regular object streams, with the following differences:
Primitive streams use specialized functional interfaces, e.g. IntFunction instead of Function or IntPredicate instead of Predicate.
Primitive streams support the additional terminal aggregate operations sum() and average().

Arrays.stream(new int[] {1, 2, 3, 4})
    .map(n -> 3 * n + 1)
    .average()
    .ifPresent(System.out::println);  // 8.5

Sometimes it's useful to transform a regular object stream to a primitive stream or vice versa. For that purpose object streams support the special mapping operations mapToInt(), mapToLong() and mapToDouble():

Stream.of("m1", "m5", "m3")
    .map(s -> s.substring(1))
    .mapToInt(Integer::parseInt)
    .max()
    .ifPresent(System.out::println);  //5

Primitive streams can be transformed to object streams via mapToObj().
IntStream.range(1, 4)
    .mapToObj(i -> "m" + i)
    .forEach(System.out::println);
// m1
// m2
// m3

Processing Order Of Streams: When executing the below code snippet, nothing is printed to the console. That is because intermediate operations will only be executed when a terminal operation is present.
Stream.of("Kempten", "Berlin", "Munich", "Frankfurt", "Hamburg")
    .filter(city -> {
       System.out.println("Filter: " + city);
        return true;
    });


Now we extend the above example by the terminal operation forEach:
Stream.of("Kempten", "Berlin", "Munich", "Frankfurt", "Hamburg")
        .filter(city -> {
            System.out.println("Filter: " + city);
            return true;
        })
        .forEach(city -> System.out.println("forEach: " + city));


It will print:
Filter: Kempten
forEach: Kempten
Filter: Berlin
forEach: Berlin
Filter: Munich
forEach: Munich
Filter: Frankfurt
forEach: Frankfurt
Filter: Hamburg
forEach: Hamburg


One might expect the operations to be executed horizontally, one after another on all elements of the stream. Instead, each element moves along the chain vertically. The first string "Kempten" passes filter and then forEach; only then is the second string "Berlin" processed.

This behavior can reduce the actual number of operations performed on each element, e.g:
Stream.of("Kempten", "Berlin", "Munich", "Frankfurt", "Hamburg")
        .map(city -> {
            System.out.println("map: " + city);
            return city.toUpperCase();
        })
        .anyMatch(city -> {
            System.out.println("anyMatch: " + city);
            return city.startsWith("M");
        });


Which will print:
map: Kempten
anyMatch: KEMPTEN
map: Berlin
anyMatch: BERLIN
map: Munich
anyMatch: MUNICH

 

The operation anyMatch returns true as soon as the predicate applies to the given input element. This is the case for the third element passed, "MUNICH". Due to the vertical execution of the stream chain, map only has to be executed three times in this case. So instead of mapping all elements of the stream, map is called as few times as possible.
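
The number of map calls can also be checked directly. This is a small sketch, assuming an AtomicInteger counter that is used purely for instrumentation (the class name ShortCircuitDemo is made up):

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

class ShortCircuitDemo {
    // Returns how many times map ran before anyMatch found a match.
    static int countMapCalls() {
        AtomicInteger mapCalls = new AtomicInteger(); // instrumentation only
        Stream.of("Kempten", "Berlin", "Munich", "Frankfurt", "Hamburg")
                .map(city -> {
                    mapCalls.incrementAndGet();
                    return city.toUpperCase();
                })
                .anyMatch(city -> city.startsWith("M"));
        return mapCalls.get();
    }

    public static void main(String[] args) {
        System.out.println(countMapCalls()); // 3
    }
}
```

"Frankfurt" and "Hamburg" are never mapped, because the sequential pipeline stops at the first match.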
 
Reusing Streams: As soon as you call any terminal operation the stream is closed, which is why streams in Java 8 cannot be reused. Let's take an example:

Stream<String> stream =
    Stream.of("Kempten", "Berlin", "Munich", "Frankfurt", "Hamburg")
    .filter(s -> s.startsWith("M"));

System.out.println(stream.anyMatch(s -> true)); // true
System.out.println(stream.noneMatch(s -> false)); // exception


In the above example, we call noneMatch after anyMatch on the same stream, which is why the following exception is thrown:
Exception in thread "main" java.lang.IllegalStateException: stream has already been operated upon or closed
    at java.util.stream.AbstractPipeline.evaluate(Unknown Source)
    at java.util.stream.ReferencePipeline.noneMatch(Unknown Source)

   
To overcome this limitation, we need to create a new stream chain for every terminal operation we want to execute. In the example below, each call to get() constructs a new stream on which we are safe to call the desired terminal operation.

Supplier<Stream<String>> supplyStream =
    ()-> Stream.of("Kempten", "Berlin", "Munich", "Frankfurt", "Hamburg")
    .filter(s->s.startsWith("M"));

       
System.out.println(supplyStream.get().anyMatch(s->true)); //true
System.out.println(supplyStream.get().noneMatch(s->false)); //true


Advanced Operations: Java 8 Streams have some complex operations like collect, flatMap and reduce.

public class Employee {
    String name;
    int age;

    Employee(String name, int age) {
        this.name = name;
        this.age = age;
    }  
    @Override
    public String toString() {
        //return "Employee with name "+name+" has an age of "+age+" years.";
        return name;
    }
   
    public static void main(String[] args) {
        // Encounter order matters for the sequential outputs shown below.
        List<Employee> employees = Arrays.asList(
            new Employee ("Andreas Kaiser", 35),
            new Employee ("Mario Morich", 31),
            new Employee ("Richard Marino", 31),
            new Employee ("Dale Bhagwagar", 48),
            new Employee ("Tiger Shroff", 26),
            new Employee ("Michael Jackson", 64),
            new Employee ("Paul Davidson", 45),
            new Employee ("K Himaanshu Shuklaa", 30));
    }
}

Collect : It's an extremely useful terminal operation to transform the elements of the stream into a different kind of result, e.g. a List, Set or Map. Collect accepts a Collector which consists of four different operations: a supplier, an accumulator, a combiner and a finisher.

A very simple way to construct a list from the elements of a stream:
List<Employee> filteredEmployees = employees.stream()
    .filter(e->e.name.startsWith("M"))
    .collect(Collectors.toList());
System.out.println(filteredEmployees);

 

Now, let us group all the Employees by age:

Map<Integer, List<Employee>> employeesByAge = employees.stream()
    .collect(Collectors.groupingBy(e->e.age));
      
employeesByAge.forEach((age, e)->System.out.format("age %s: %s \n", age, e));


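It will print something like the following (the resulting map makes no ordering guarantee, so the order of the lines may differ):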
age 64: [Michael Jackson]
age 48: [Dale Bhagwagar]
age 35: [Andreas Kaiser]
age 26: [Tiger Shroff]
age 45: [Paul Davidson]
age 30: [K Himaanshu Shuklaa]
age 31: [Mario Morich, Richard Marino]


To determine the average age of all the employees:
Double averageAge=employees.stream()
    .collect(Collectors.averagingInt(e->e.age));
System.out.println("Average Age Of All The Employees Is :"+averageAge);
//Average Age Of All The Employees Is :38.75


The summarizing collectors return a special built-in summary statistics object. With it, we can easily determine the min, max and arithmetic average age of the employees, as well as the sum and count.

IntSummaryStatistics empAgeSummary =employees.stream()
    .collect(Collectors.summarizingInt(e -> e.age));
System.out.println(empAgeSummary);


This will print:


IntSummaryStatistics{count=8, sum=310, min=26, average=38.750000, max=64}

'joining' can be used to join all employees into a single string. The joining collector accepts a delimiter as well as an optional prefix and suffix.


String joining = employees.stream()
    .filter(p -> p.age >= 45)
    .map(p -> p.name)
    .collect(Collectors.joining(" & ", "In Germany, ", " are of legal age."));
System.out.println(joining);



It will print:


In Germany, Dale Bhagwagar & Michael Jackson & Paul Davidson are of legal age.

Now let us transform the stream elements into a map, but to do so we have to specify how both the keys and the values should be mapped. IllegalStateException will be thrown if the mapped keys are not unique. To bypass this exception we can optionally pass a merge function as an additional parameter.

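If the ages are unique (for the output below, assume Richard Marino's age is 32 instead of 31; with the list as defined above this call throws IllegalStateException):


Map<Integer, String> employeeMap = employees.stream()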
    .collect(Collectors.toMap(e->e.age, e->e.name));
System.out.println(employeeMap);


This will print:


{64=Michael Jackson, 48=Dale Bhagwagar, 32=Richard Marino, 35=Andreas Kaiser, 26=Tiger Shroff, 45=Paul Davidson, 30=K Himaanshu Shuklaa, 31=Mario Morich}

If the ages are not unique:


Map<Integer, String> employeeMap = employees.stream()
    .collect(Collectors.toMap(e->e.age, e->e.name, (ename1, ename2) -> ename1 + ";" + ename2));
System.out.println(employeeMap);


It will print:
{64=Michael Jackson, 48=Dale Bhagwagar, 35=Andreas Kaiser, 26=Tiger Shroff, 45=Paul Davidson, 30=K Himaanshu Shuklaa, 31=Mario Morich;Richard Marino}
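
The difference between the two toMap variants can be reproduced in isolation. The sketch below uses a hypothetical list of names keyed by their length, so that two entries collide on the same key:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ToMapDemo {
    // Keys are string lengths, so "bob" and "amy" collide on key 3.
    static final List<String> NAMES = Arrays.asList("bob", "amy", "carol");

    // Without a merge function, a duplicate key makes collect throw.
    static boolean failsWithoutMergeFunction() {
        try {
            NAMES.stream().collect(Collectors.toMap(String::length, n -> n));
            return false;
        } catch (IllegalStateException duplicateKey) {
            return true;
        }
    }

    // With a merge function, colliding values are combined instead.
    static Map<Integer, String> withMergeFunction() {
        return NAMES.stream()
                .collect(Collectors.toMap(String::length, n -> n, (a, b) -> a + ";" + b));
    }

    public static void main(String[] args) {
        System.out.println(failsWithoutMergeFunction()); // true
        System.out.println(withMergeFunction().get(3));  // bob;amy
        System.out.println(withMergeFunction().get(5));  // carol
    }
}
```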

How to build our own special collector?
Let us create our own collector that transforms all employees of the stream into a single string consisting of all names in upper case, separated by the | pipe character. To do so, we create a new collector via Collector.of() and pass the four ingredients of a collector: a supplier, an accumulator, a combiner and a finisher.

Collector<Employee, StringJoiner, String> empCollector =
        Collector.of(
            () -> new StringJoiner("|"), // supplier,
            (j, p) -> j.add(p.name.toUpperCase()), // accumulator
            (j1, j2) -> j1.merge(j2), // combiner
            StringJoiner::toString); // finisher
String empnames = employees.stream()
    .collect(empCollector);
System.out.println(empnames);


It will print:
ANDREAS KAISER|MARIO MORICH|RICHARD MARINO|DALE BHAGWAGAR|TIGER SHROFF|MICHAEL JACKSON|PAUL DAVIDSON|K HIMAANSHU SHUKLAA

Since strings in Java are immutable, we need a helper class like StringJoiner to let the collector construct our string. The supplier initially constructs such a StringJoiner with the appropriate delimiter. The accumulator is used to add each employee's upper-cased name to the StringJoiner. The combiner knows how to merge two StringJoiners into one. In the last step the finisher constructs the desired String from the StringJoiner.
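
For comparison, the same kind of result can be obtained without a hand-written collector by mapping to upper case first and then using the built-in Collectors.joining. This is only a sketch on a plain list of names; the class and method names are made up:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class JoiningDemo {
    // Upper-cases every name and joins them with the | character.
    static String upperCaseNames(List<String> names) {
        return names.stream()
                .map(String::toUpperCase)
                .collect(Collectors.joining("|"));
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Andreas Kaiser", "Mario Morich");
        System.out.println(upperCaseNames(names)); // ANDREAS KAISER|MARIO MORICH
    }
}
```

Writing a collector by hand is mainly worthwhile when no combination of the built-in collectors fits.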
 

What is FlatMap? What is the difference between map and flatMap stream operation?
Map is limited because every object can only be mapped to exactly one other object. What if we want to transform one object into multiple others or none at all? We can do it by using flatMap.

FlatMap transforms each element of the stream into a stream of other objects. So each object will be transformed into zero, one or multiple other objects backed by streams. The contents of those streams will then be placed into the returned stream of the flatMap operation.

Let's create a list of four foos, each consisting of four bars.
public class Foo {
    String name;
    List<Bar> bars = new ArrayList<>();

    Foo(String name) {
        this.name = name;
    }
    public static void main(String[] args) {
       
        List<Foo> foos = new ArrayList<>();
       
        //creating foos
        IntStream.range(1, 5)
        .forEach(i->foos.add(new Foo("E"+i)));
       
        //creating bars
        foos.forEach(f->
        IntStream.range(1, 5)
        .forEach(i->f.bars.add(new Bar("B"+i+f.name))));
    }
}
class Bar {
    String name;

    Bar(String name) {
        this.name = name;
    }
}


FlatMap accepts a function which has to return a stream of objects. So in order to resolve the bar objects of each foo, we need to pass the appropriate function:
foos.stream()
    .flatMap(f -> f.bars.stream())
    .forEach(b -> System.out.println(b.name));
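
The "zero, one or multiple" behaviour of flatMap is easy to see on plain strings. In this sketch (hypothetical class FlatMapDemo), every line is turned into a stream of its words, and an empty line contributes nothing:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FlatMapDemo {
    // Maps one line to a stream of zero or more words.
    static Stream<String> words(String line) {
        return line.isEmpty() ? Stream.<String>empty() : Arrays.stream(line.split(" "));
    }

    // flatMap merges the per-line streams into one flat stream of words.
    static List<String> splitIntoWords(List<String> lines) {
        return lines.stream()
                .flatMap(FlatMapDemo::words)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(splitIntoWords(Arrays.asList("a b", "", "c"))); // [a, b, c]
    }
}
```

With map instead of flatMap the result would be a stream of three streams, one per input line.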


Reduce: The reduction operation combines all elements of the stream into a single result. Java 8 supports three different kinds of reduce methods.

#The first one reduces a stream of elements to exactly one element of the stream. Let's see how we can use this method to determine the oldest Employee:
employees.stream()
    .reduce((p1, p2) -> p1.age > p2.age ? p1 : p2) // compares both Employees ages in order to return the Employee with the maximum age.
    .ifPresent(System.out::println); // Michael Jackson


The reduce method accepts a BinaryOperator accumulator function. That's actually a BiFunction where both operands share the same type, in that case Employee.

#The second reduce method accepts both an identity value and a BinaryOperator accumulator. This method can be utilized to construct a new Employee with the aggregated names and ages from all other Employees in the stream:

Employee result = employees.stream()
    .reduce(new Employee("", 0),
    (p1, p2) -> {
    p1.age += p2.age;
    p1.name += p2.name;
    return p1;
    });
System.out.format("name=%s; age=%s", result.name, result.age);


This will return:
name=Andreas KaiserMario MorichRichard MarinoDale BhagwagarTiger ShroffMichael JacksonPaul DavidsonK Himaanshu Shuklaa; age=310

#The third reduce method accepts three parameters: an identity value, a BiFunction accumulator and a combiner function of type BinaryOperator. Since the identity value's type is not restricted to the Employee type, we can utilize this reduction to determine the sum of ages from all employees:
       
Integer ageSum = employees.stream()
    .reduce(0, (sum, p) -> sum += p.age, (sum1, sum2) -> sum1 + sum2);
System.out.println(ageSum); //it will print 310
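
For a plain sum, the three-argument reduce is equivalent to mapping to an IntStream and calling sum(). The sketch below compares both forms on a list of ages only (the Employee class is left out to keep it self-contained, and the class name AgeSumDemo is made up):

```java
import java.util.Arrays;
import java.util.List;

class AgeSumDemo {
    // Three-argument reduce: identity, accumulator, combiner.
    static int sumWithReduce(List<Integer> ages) {
        return ages.stream().reduce(0, (sum, age) -> sum + age, Integer::sum);
    }

    // The same result via a primitive stream.
    static int sumWithIntStream(List<Integer> ages) {
        return ages.stream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        List<Integer> ages = Arrays.asList(35, 31, 31, 48, 26, 64, 45, 30);
        System.out.println(sumWithReduce(ages));    // 310
        System.out.println(sumWithIntStream(ages)); // 310
    }
}
```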


Let's extend the above code:
Integer ageSum = employees.stream().
                reduce(0, (sum, p) -> {
                System.out.format("Accumulator: sum=%s; employee=%s\n", sum, p);
                return sum += p.age;
                },
                (sum1, sum2) -> {
                System.out.format("Combiner: sum=%s; employee=%s\n", sum1, sum2);
                return sum1 + sum2;
                });
System.out.println(ageSum);


Output of above code would be:
Accumulator: sum=0; employee=Andreas Kaiser
Accumulator: sum=35; employee=Mario Morich
Accumulator: sum=66; employee=Richard Marino
Accumulator: sum=97; employee=Dale Bhagwagar
Accumulator: sum=145; employee=Tiger Shroff
Accumulator: sum=171; employee=Michael Jackson
Accumulator: sum=235; employee=Paul Davidson
Accumulator: sum=280; employee=K Himaanshu Shuklaa
310


The combiner never gets called and the accumulator function does all the work. It first gets called with the initial identity value 0 and the first employee, Andreas Kaiser. In the following steps, sum keeps increasing by the age of the previous step's employee, up to a total age of 310.

Let's use parallelStream instead of stream; this results in an entirely different execution behavior. Now the combiner is actually called. Since the accumulator is called in parallel, the combiner is needed to sum up the separately accumulated values.

Integer ageSum = employees.parallelStream().
                reduce(0, (sum, p) -> {
                System.out.format("Accumulator: sum=%s; employee=%s\n", sum, p);
                return sum += p.age;
                },
                (sum1, sum2) -> {
                System.out.format("Combiner: sum=%s; employee=%s\n", sum1, sum2);
                return sum1 + sum2;
                });
System.out.println(ageSum);


Output of the above code would be something like this (the order of the lines varies between runs):
Accumulator: sum=0; employee=Michael Jackson
Accumulator: sum=0; employee=Richard Marino
Accumulator: sum=0; employee=Mario Morich
Accumulator: sum=0; employee=K Himaanshu Shuklaa
Accumulator: sum=0; employee=Andreas Kaiser
Accumulator: sum=0; employee=Dale Bhagwagar
Accumulator: sum=0; employee=Tiger Shroff
Combiner: sum=31; employee=48
Combiner: sum=35; employee=31
Accumulator: sum=0; employee=Paul Davidson
Combiner: sum=45; employee=30
Combiner: sum=66; employee=79
Combiner: sum=26; employee=64
Combiner: sum=90; employee=75
Combiner: sum=145; employee=165
310 


Parallel streams are executed in parallel to increase runtime performance on large amounts of input elements. Parallel streams use a common ForkJoinPool available via the static ForkJoinPool.commonPool() method. The number of threads in the underlying pool depends on the number of available processors: by default the parallelism is one less than the value reported by Runtime.getRuntime().availableProcessors().

ForkJoinPool commonPool = ForkJoinPool.commonPool();
System.out.println(commonPool.getParallelism()); // e.g. 3 on a machine with four available processors


On a machine with four available processors the common pool is therefore initialized with a parallelism of 3 by default. This value can be decreased or increased by setting the following JVM parameter:
-Djava.util.concurrent.ForkJoinPool.common.parallelism=5

Collections support the method parallelStream() to create a parallel stream of elements. Alternatively you can call the intermediate method parallel() on a given stream to convert a sequential stream to a parallel counterpart. 
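
A short sketch of switching an existing stream to parallel mode with parallel(); the class name ParallelSwitchDemo is made up, and the printed parallelism depends on the machine:

```java
import java.util.concurrent.ForkJoinPool;
import java.util.stream.IntStream;

class ParallelSwitchDemo {
    // parallel() is an intermediate operation that flips the stream into parallel mode.
    static boolean isParallelAfterSwitch() {
        return IntStream.rangeClosed(1, 100).parallel().isParallel();
    }

    // The result of the reduction is the same as for the sequential stream.
    static int parallelSum() {
        return IntStream.rangeClosed(1, 100).parallel().sum();
    }

    public static void main(String[] args) {
        System.out.println(IntStream.rangeClosed(1, 100).isParallel()); // false
        System.out.println(isParallelAfterSwitch());                    // true
        System.out.println(parallelSum());                              // 5050
        // Machine dependent, so no expected value is given here.
        System.out.println(ForkJoinPool.commonPool().getParallelism());
    }
}
```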


What is stream pipelining in Java 8?
Stream pipelining is the concept of chaining operations together. This is done by splitting the operations that can happen on a stream into two categories: intermediate operations and terminal operations.

Each intermediate operation returns an instance of Stream itself when it runs. An arbitrary number of intermediate operations can therefore be chained to form a processing pipeline.

There must then be a terminal operation which returns a final value and terminates the pipeline. 


Difference between DoubleSummaryStatistics , IntSummaryStatistics and LongSummaryStatistics?
They all do the same task, i.e. compute statistical information on a stream of data. They differ in how they store the statistical information, as each expects a different data type for the values being used.

IntSummaryStatistics and LongSummaryStatistics expect non-floating-point values and hence store statistics like min, max and sum as integral values (int or long), whereas DoubleSummaryStatistics stores this information as floating-point values.
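
The difference shows up in the return types of the getters. A minimal sketch (the class name SummaryStatsDemo and the sample values are made up):

```java
import java.util.DoubleSummaryStatistics;
import java.util.IntSummaryStatistics;
import java.util.stream.DoubleStream;
import java.util.stream.IntStream;

class SummaryStatsDemo {
    static IntSummaryStatistics intStats() {
        return IntStream.of(1, 2, 3).summaryStatistics();
    }

    static DoubleSummaryStatistics doubleStats() {
        return DoubleStream.of(1.5, 2.5).summaryStatistics();
    }

    public static void main(String[] args) {
        long intSum = intStats().getSum();          // sum of ints is reported as long
        int intMax = intStats().getMax();           // min and max stay int
        double intAverage = intStats().getAverage(); // average is always double
        double doubleSum = doubleStats().getSum();  // everything is double here
        System.out.println(intSum);     // 6
        System.out.println(intMax);     // 3
        System.out.println(intAverage); // 2.0
        System.out.println(doubleSum);  // 4.0
    }
}
```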


What is StringJoiner?
StringJoiner is a utility class in java.util for constructing a string with a desired delimiter. It was introduced in Java 8.

StringJoiner strJoiner = new StringJoiner(".");
strJoiner.add("Himaanshu").add("Shuklaa");
System.out.println(strJoiner); // prints Himaanshu.Shuklaa
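
StringJoiner also accepts a prefix and a suffix, and setEmptyValue defines what is returned when nothing has been added. A small sketch with a hypothetical helper method bracketed:

```java
import java.util.StringJoiner;

class StringJoinerDemo {
    // Joins the parts as "[a, b]" or returns "<none>" when there are no parts.
    static String bracketed(String... parts) {
        StringJoiner joiner = new StringJoiner(", ", "[", "]");
        joiner.setEmptyValue("<none>");
        for (String part : parts) {
            joiner.add(part);
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        System.out.println(bracketed("a", "b")); // [a, b]
        System.out.println(bracketed());         // <none>
    }
}
```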


Difference between findAny() and findFirst() in Java 8?
findFirst() returns the first element of the stream if the stream has an encounter order. When there is no encounter order, it may return any element from the stream. The java.util.stream package documentation says, "Streams may or may not have a defined encounter order. It depends on the source and the intermediate operations."

The return type of findFirst() is an Optional, which is empty if the stream is empty. The behavior of findFirst() does not change in the parallel scenario: if an encounter order exists, it always behaves deterministically.

findAny() allows you to find any element from a stream. Use it when you are looking for an element without paying attention to the encounter order. It also returns an Optional. In a non-parallel operation, it will most likely return the first element in the stream, but there is no guarantee for this.
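
The following sketch contrasts the two on a parallel stream created from an ordered list (the class name FindDemo is made up):

```java
import java.util.Arrays;
import java.util.List;

class FindDemo {
    static final List<Integer> NUMBERS = Arrays.asList(1, 2, 3, 4, 5);

    // A list has an encounter order, so findFirst is deterministic even in parallel.
    static int firstInParallel() {
        return NUMBERS.parallelStream().findFirst().get();
    }

    // findAny may return whichever element a worker thread reaches first.
    static int anyInParallel() {
        return NUMBERS.parallelStream().findAny().get();
    }

    public static void main(String[] args) {
        System.out.println(firstInParallel()); // 1
        System.out.println(anyInParallel());   // some element of the list, not necessarily 1
    }
}
```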



-K Himaanshu Shuklaa..
 
