× Please, log in to give us a feedback. Click here to login

You must be logged to download. Click here to login


MrBool is totally free and you can help us to help the Developers Community around the world

Yes, I'd like to help the MrBool and the Developers Community before download

No, I'd like to download without make the donation


MrBool is totally free and you can help us to help the Developers Community around the world

Yes, I'd like to help the MrBool and the Developers Community before download

No, I'd like to download without make the donation

New in Java 7: Fork/Join

Meet the new framework for parallel processing in Java 7

With the increasing number of computers with multiple processors or multiple cores in each processor, it is becoming more and more important to implement parallelism in our applications to take advantage of these resources. In some cases (there are servers being sold nowadays with more than 120 cores), parallelizing processes at the operating system level is not enough to occupy all of the available processing cores, leaving some of them idle.

There are many strategies for parallelism. Java implements the strategy of work queues + thread pools since version 5. Classic LISP and Smalltalk have the operations select / map / reduce that allow dividing large quantities of data in batches for parallel processing (bulk data operations, planned for Java 8). With JSR 166, led by Doug Lea, Java 7 brings the Fork/Join framework, which applies the divide and conquer strategy to parallelize algorithms.

Divide and conquer is a classic technique for problem solving with algorithms that do not necessarily need to be executed in parallel. The idea is to divide a problem in smaller parts which, recursively, are divided again in even smaller chunks until their solution is trivial. Then, after solving for each of the smaller pieces, the solutions are combined until reaching the solution of the whole problem. The classic application of this technique can be seen in the sorting algorithm “merge sort” (see Links).

By dividing a program in smaller pieces, each piece can be handed to a different processor or core for computation and, later, be combined for the final solution. The Fork/Join framework facilitates this implementation, promoting scalability of the application, because the developer does not need to be concerned if the machine that will execute the code contains half a dozen or hundreds of processing cores.

To exemplify, Listing 1 shows an application for the calculation of the average of 10 million numbers. An array with 10 million real numbers is generated randomly (in case you have a database with such a large amount of data, using it can be even more interesting than using random numbers) and then fed into a sequential adder or a parallel added, depending on which line of code is commented out (highlighted in bold in the listing). Finally, the sum is divided by the amount of numbers added to obtain the average.

Listing 1. Application for testing the Fork/Join framework.
package com.mrbool.java7.forkjoin;

import java.util.Random;

public class ForkJoinTest {
	private static final int QTY = 10_000_000;
	private static double[] generateNumbers(int qty) {
		double[] numbers = new double[qty];
		Random rand = new Random(System.currentTimeMillis());
		for (int i = 0; i < qty; i++) numbers[i] = rand.nextDouble() * 10d;
		return numbers;
	public static void main(String[] args) throws Exception {
		double[] numbers = generateNumbers(QTY);
		long start = System.nanoTime();
		double sum = SequentialAdder.sequentialSum(numbers);
		// double sum = ParallelAdder.parallelSum(numbers);
		long end = System.nanoTime();
		double time = (end - start) / 1_000_000.0;
		System.out.println("Average = " + (sum / QTY));
		System.out.println("Time = " + time + "ms");

The sequential implementation of the added is trivial and is shown in Listing 10. The method sequentialSum() receives an array of doubles and, for each number on the array, sums it to the variable result, which is returned at the end of the method.

Listing 2. Sequential implementation of the number sum.
package com.mrbool.java7.forkjoin;

public class SequentialAdder {
	public static double sequentialSum(double[] numbers) {
		double result = 0d;
		for (int i = 0; i < numbers.length; i++) result += numbers[i];
		return result;

The parallel adder, however, is much more complex and can be seen in Listing 11. The complexity comes from the fact that the Fork/Join framework expects to receive an object that represents a task and extends class java.util.concurrent.RecursiveAction. In the case of the adder, we have developed the class SumTask. Moreover, another class, NumberArray, encapsulates operations for splitting and summing the number array.

Listing 3.Implementation of number sum using the Fork/Join framework.
package com.mrbool.java7.forkjoin;

import java.util.concurrent.*;

public class ParallelAdder {
	public static double parallelSum(double[] numbers) {
		SumTask task = new SumTask(new NumberArray(numbers));
		ForkJoinPool pool = new ForkJoinPool();
		return task.result;

class SumTask extends RecursiveAction {
	private static final int SEQUENTIAL_LIMIT = 2_000_000;

	final NumberArray array;
	double result;
	SumTask(NumberArray array) {
		this.array = array;

	protected void compute() {
		if (array.size < SEQUENTIAL_LIMIT) {
			result = array.sum();
		else {
			SumTask left = new SumTask(array.subArrayLeft());
			SumTask right = new SumTask(array.subArrayRight());
			invokeAll(left, right);
			result = left.result + right.result;

class NumberArray {
	final double[] numbers;
	final int start;
	final int end;
	final int size;
	NumberArray(double[] numbers, int start, int end) {
		this.numbers = numbers;
		this.start = start;
		this.end = end;
		this.size = end - start + 1;

	NumberArray(double[] numbers) {
		this(numbers, 0, numbers.length - 1);
	double sum() {
		double result = 0;
		for (int i = start; i <= end; i++) result += numbers[i];
		return result;
	NumberArray subArrayLeft() {
		return new NumberArray(numbers, start, start + (size / 2) - 1);
	NumberArray subArrayRight() {
		return new NumberArray(numbers, start + (size / 2), end);

First off, the parallel adder creates the instance of the SumTask class that will work on the whole numbers array passed as argument to the parallelSum() method. Then, some threads are created by the Fork/Join framework to execute the given task. The default number of threads is equal to the number of processors available.

When processing starts, Fork/Join calls the method compute() from the class that represents the task. This method first checks if the array has a size that is sufficiently small to be processed sequentially (less than two million elements, as set by the constant SEQUENTIAL_LIMIT). In case the array is not yet under the limit, it is split in half, creating two new sum tasks: left and right. Next, the Fork/Join framework is asked to do these tasks in parallel, through the methods invokeAll(), inherited from RecursiveAction.

Following the divide and conquer strategy, eventually Fork/Join will call the method compute() of these new sub-tasks, which will continue to split the array until it becomes smaller than the specified limit. At this point, the NumberArray.sum() will add the numbers sequentially and start to produce results, which will be summed up until we are back to the task that represented the entire array.

The Fork/Join framework, thus, provides a parallelism solution that is generic – which should be completed by the developer with the implementation of the task – and portable – it is not necessary to worry about the hardware that will execute the program, as this task is delegated to the virtual machine and to the Java API. However, it is necessary to consider the following when using this framework:
• The framework is recommended only for tasks that make intensive use of the CPU and should be avoided in case the tasks mixes CPU processing with input/output (I/O) operations;
• Different values for the limit that defines when the task should be executed sequentially should be tried in order to discover through these experiments the optimal value for it – one that makes the parallel algorithm really beneficial when compared to the sequential one. Small values for this limit increase the degree of parallelism, but also increase the cost of thread management and of the task itself (array splitting, result combination, etc.).

The example of this article is for didactic purposes and might not display faster results with the utilization of the parallel algorithm. For instance, when executed in a laptop with an Intel Core 2 Duo processor, with up to 10 million numbers the sequential adder executed more rapidly than the parallel adder for different values of the constant SEQUENTIAL_LIMIT. When the amount of numbers was increased to 100 million, the parallel solution started to be faster than the sequential one, but the -Xmx2048m switch had to be used for the execution of the JVM in order to avoid OutOfMemoryError. In servers with hundreds of processing cores we expect the parallel algorithm to be more efficient even for smaller sets of data.

Página da Wikipédia que explica o Merge Sort, demonstrando-o em diversas linguagens de programação.

Columnist not avaible

What did you think of this post?
To have full access to this post (or download the associated files) you must have MrBool Credits.

  See the prices for this post in Mr.Bool Credits System below:

Individually – in this case the price for this post is US$ 0,00 (Buy it now)
in this case you will buy only this video by paying the full price with no discount.

Package of 10 credits - in this case the price for this post is US$ 0,00
This subscription is ideal if you want to download few videos. In this plan you will receive a discount of 50% in each video. Subscribe for this package!

Package of 50 credits – in this case the price for this post is US$ 0,00
This subscription is ideal if you want to download several videos. In this plan you will receive a discount of 83% in each video. Subscribe for this package!

> More info about MrBool Credits
You must be logged to download.

Click here to login