3.2.4 Parallel Streams Explained
Parallel streams in Java let you process collections of data in parallel, using multi-core processors to improve performance. Knowing when and how to use them is important for tuning the performance of your Java SE 11 applications.
Key Concepts
1. Parallel Processing
Parallel processing involves dividing a task into smaller sub-tasks that can be executed concurrently on multiple processors. In the context of Java Streams, parallel processing allows you to perform operations on stream elements simultaneously, which can significantly speed up computation for large datasets.
2. Creating Parallel Streams
You can create a parallel stream from a collection by calling the parallelStream() method. Alternatively, you can convert a sequential stream to a parallel stream using the parallel() method.
Example
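import java.util.Arrays;
import java.util.List;

List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

// Unbox each Integer to an int, then sum the elements in parallel
int sum = numbers.parallelStream()
                 .mapToInt(n -> n)
                 .sum();

System.out.println(sum); // Output: 55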
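The second route, calling parallel() on an existing sequential stream, can be sketched as follows. This is a minimal illustration rather than a recommended pattern; the class name ParallelConversionDemo and the helper sumWithParallel are made up for this example, and isParallel() is used only to show which mode each stream is in.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; the name is not from the original text.
public class ParallelConversionDemo {

    // Sums the list by switching a sequential stream to parallel mode.
    static int sumWithParallel(List<Integer> numbers) {
        return numbers.stream()               // starts as a sequential stream
                      .parallel()             // switches the pipeline to parallel mode
                      .mapToInt(Integer::intValue)
                      .sum();
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

        System.out.println(numbers.stream().isParallel());            // false
        System.out.println(numbers.stream().parallel().isParallel()); // true
        System.out.println(numbers.parallelStream().isParallel());    // true
        System.out.println(sumWithParallel(numbers));                 // 55
    }
}
```

Both routes lead to the same kind of pipeline; parallel() is mainly useful when the stream does not come from a collection, for example one created with IntStream.range.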
3. Performance Considerations
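While parallel streams can improve performance, they are not always the best choice. The overhead of splitting the data and merging the results can outweigh the benefits of parallel processing for small datasets. Additionally, operations that are inherently sequential or order-dependent, such as findFirst, limit, or forEachOrdered on an ordered stream, may not benefit from parallelism. A reduce call parallelizes correctly only when its operator is associative and its identity value is a true identity for that operator.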
Example
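List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);

// Addition is associative and 0 is its identity, so the result is the same
// in parallel; with only five elements, the parallel overhead likely dominates.
int sum = numbers.parallelStream()
                 .reduce(0, (a, b) -> a + b);

System.out.println(sum); // Output: 15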
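To make the associativity requirement of reduce concrete, the sketch below compares an associative operator (addition) with a non-associative one (subtraction). The class and method names are hypothetical, chosen only for this illustration. The parallel subtraction result is deliberately left without an expected value, because it depends on how the runtime happens to split the data.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class; the name is not from the original text.
public class AssociativityDemo {

    // Addition is associative, so sequential and parallel results agree.
    static int sumSequential(List<Integer> xs) {
        return xs.stream().reduce(0, Integer::sum);
    }

    static int sumParallel(List<Integer> xs) {
        return xs.parallelStream().reduce(0, Integer::sum);
    }

    // Subtraction is NOT associative: (a - b) - c differs from a - (b - c).
    static int subtractSequential(List<Integer> xs) {
        return xs.stream().reduce(0, (a, b) -> a - b);
    }

    // In parallel, partial results are combined with the same operator,
    // so the outcome depends on how the data was split.
    static int subtractParallel(List<Integer> xs) {
        return xs.parallelStream().reduce(0, (a, b) -> a - b);
    }

    public static void main(String[] args) {
        List<Integer> numbers = IntStream.rangeClosed(1, 1000)
                                         .boxed()
                                         .collect(Collectors.toList());

        System.out.println(sumSequential(numbers));      // 500500
        System.out.println(sumParallel(numbers));        // 500500
        System.out.println(subtractSequential(numbers)); // -500500
        System.out.println(subtractParallel(numbers));   // unspecified, may differ from the line above
    }
}
```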
4. Stateful and Stateless Operations
Stateful operations, such as distinct and sorted, require maintaining state across multiple elements, which can complicate parallel processing. Stateless operations, such as map and filter, are easier to parallelize because they do not depend on the state of previously processed elements.
Example
List<String> names = Arrays.asList("Alice", "Bob", "Charlie", "Alice", "Bob");

List<String> distinctNames = names.parallelStream()
                                  .distinct()
                                  .collect(Collectors.toList());

System.out.println(distinctNames); // Output: [Alice, Bob, Charlie]
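For contrast, here is a small sketch that puts a purely stateless pipeline (filter and map) next to a stateful one (distinct and sorted) on the same input. The class and method names are illustrative assumptions, not part of any API. Because the source list is ordered and the results are gathered with Collectors.toList(), both pipelines keep a predictable order even in parallel.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class; the name is not from the original text.
public class StatefulStatelessDemo {

    // Stateless: each element is filtered and mapped independently of the others.
    static List<String> upperCaseLongNames(List<String> names) {
        return names.parallelStream()
                    .filter(n -> n.length() > 3)
                    .map(String::toUpperCase)
                    .collect(Collectors.toList());
    }

    // Stateful: distinct must track elements already seen, and sorted
    // needs the whole input before it can emit anything.
    static List<String> sortedDistinct(List<String> names) {
        return names.parallelStream()
                    .distinct()
                    .sorted()
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Alice", "Bob", "Charlie", "Alice", "Bob");

        System.out.println(upperCaseLongNames(names)); // [ALICE, CHARLIE, ALICE]
        System.out.println(sortedDistinct(names));     // [Alice, Bob, Charlie]
    }
}
```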
Examples and Analogies
Think of parallel streams as a factory assembly line where multiple workers (processors) are working simultaneously to assemble products (process data). Each worker takes a part of the product (stream element) and performs a specific task (operation). The final product (result) is assembled by combining the work of all the workers.
For example, consider a scenario where you need to calculate the sum of a large list of numbers. Using a parallel stream, you can divide the list into smaller chunks, process each chunk in parallel, and then combine the results to get the final sum.
Example
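List<Integer> largeNumbers = IntStream.rangeClosed(1, 1000000)
                                      .boxed()
                                      .collect(Collectors.toList());

// The total exceeds Integer.MAX_VALUE, so accumulate into a long
// to avoid silent int overflow.
long sum = largeNumbers.parallelStream()
                       .mapToLong(Integer::longValue)
                       .sum();

System.out.println(sum); // Output: 500000500000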
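The divide, process, and combine steps described above can also be written out by hand. The sketch below uses the fork/join framework, which parallel streams rely on internally, but it is only an illustration of the idea and not how the stream library itself is implemented. The class names and the THRESHOLD value are assumptions made for this example.

```java
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class; the name is not from the original text.
public class ChunkedSumDemo {

    // Assumed chunk size below which a range is summed directly.
    static final int THRESHOLD = 10_000;

    static class SumTask extends RecursiveTask<Long> {
        private final List<Integer> data;
        private final int from; // inclusive
        private final int to;   // exclusive

        SumTask(List<Integer> data, int from, int to) {
            this.data = data;
            this.from = from;
            this.to = to;
        }

        @Override
        protected Long compute() {
            // Small enough: process this chunk sequentially.
            if (to - from <= THRESHOLD) {
                long sum = 0;
                for (int i = from; i < to; i++) {
                    sum += data.get(i);
                }
                return sum;
            }
            // Otherwise divide the range into two halves.
            int mid = from + (to - from) / 2;
            SumTask left = new SumTask(data, from, mid);
            SumTask right = new SumTask(data, mid, to);
            left.fork();                     // run the left half asynchronously
            long rightSum = right.compute(); // compute the right half in this thread
            return left.join() + rightSum;   // combine the partial results
        }
    }

    static long chunkedSum(List<Integer> data) {
        return ForkJoinPool.commonPool().invoke(new SumTask(data, 0, data.size()));
    }

    public static void main(String[] args) {
        List<Integer> largeNumbers = IntStream.rangeClosed(1, 1000000)
                                              .boxed()
                                              .collect(Collectors.toList());
        System.out.println(chunkedSum(largeNumbers)); // 500000500000
    }
}
```

In everyday code the stream version is shorter and usually preferable; the hand-written task is shown only to make the chunking visible.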
Used with care, parallel streams can improve the performance of your Java SE 11 applications, especially when dealing with large datasets and computationally intensive operations. Measure before and after switching to confirm that the parallel version is actually faster.