Posted by: Morten Nobel-Jørgensen | April 25, 2010

Why not have OpenMP in Java?

Java was originally created as a simplification of C++ to boost productivity. Java nicely shielded the programmers from a lot of complexity such as pointers and platform architecture.

For many years I have mainly coded in Java, but I have now started playing a bit with C++ again. Most of the time you have to write as much code in C++ as you would in Java, or more, but today I stumbled upon a C++ feature that is extremely simple compared to how you would do the same thing in Java.

I’m currently working on writing a ray-tracer in C++ – just another hobby project inspired by this lecture: MIT OpenCourseWare: Computer Graphics.

The main loop of my application shoots a ray for every pixel in the destination image:

for (int x = 0; x < imageWidth; x++) {
	for (int y = 0; y < imageHeight; y++) {
		Color shadedColor = shootRay(x, y);
		image.SetPixel(x, y, shadedColor);
	}
}

One way to optimize the code above would be to parallelize the loop, so that both cores of my CPU help me complete it as fast as possible. In Java I would have solved this in the following way:

  • Find the number of processors on the current system (Runtime.getRuntime().availableProcessors())
  • Divide the loop into as many threads as there are processors
  • Start the threads
  • Wait until all threads have completed
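To give a feel for how much code these four steps take, here is a minimal sketch of them. It is written in C++ with std::thread rather than Java, purely so it can be compared directly with the loop above. The names shootRay (replaced here by a trivial stand-in that returns an int), renderWithThreads and the flat image buffer are assumptions made for illustration, not code from the ray-tracer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for shootRay(x, y): the value depends only on the
// pixel coordinates, which makes the result easy to verify.
int shootRay(int x, int y) { return x * 31 + y; }

// Renders into a flat buffer (index = y * width + x) with hand-managed threads.
std::vector<int> renderWithThreads(int width, int height) {
    assert(width >= 0 && height >= 0);
    std::vector<int> image(static_cast<std::size_t>(width) * height);

    // 1. Find the number of processors (the call may return 0 if unknown).
    int threadCount = static_cast<int>(
        std::max(1u, std::thread::hardware_concurrency()));

    // 2. Divide the columns into one contiguous chunk per thread.
    int chunk = (width + threadCount - 1) / threadCount;

    // 3. Start the threads; each one writes only to its own columns.
    std::vector<std::thread> workers;
    for (int t = 0; t < threadCount; ++t) {
        int begin = t * chunk;
        int end = std::min(width, begin + chunk);
        if (begin >= end) break;  // no columns left to hand out
        workers.emplace_back([=, &image] {
            for (int x = begin; x < end; ++x)
                for (int y = 0; y < height; ++y)
                    image[static_cast<std::size_t>(y) * width + x] = shootRay(x, y);
        });
    }

    // 4. Wait until all threads have completed.
    for (std::thread& worker : workers) worker.join();
    return image;
}
```

Calling renderWithThreads(640, 480) would fill a 640 x 480 buffer. No locking is needed in this sketch because every pixel is written by exactly one thread.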

In Java 7 this will be a little simpler using the fork-join framework.

But how can this be solved in C++? I expected the solution to be as cumbersome as in Java, but was surprised to find out how simply and elegantly this can be solved:

#pragma omp parallel for
for (int x = 0; x < imageWidth; x++) {
	for (int y = 0; y < imageHeight; y++) {
		Color shadedColor = shootRay(x, y);
		image.SetPixel(x, y, shadedColor);
	}
}

OpenMP will do all the hard work under the hood:

  • Create a thread pool (to avoid the overhead of creating new threads every time OpenMP is asked to perform a task).
  • Divide the work among the threads
  • Start the threads and wait for all the threads to finish
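For readers who want to try this, the following is a self-contained sketch of the same loop that compiles on its own. The function renderWithOpenMP, the int-returning shootRay stand-in and the flat image buffer are assumptions made for illustration; the real ray-tracer uses its own Color and image types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for shootRay(x, y); a real ray-tracer would return a color.
int shootRay(int x, int y) { return x * 31 + y; }

// Renders into a flat buffer (index = y * width + x).
std::vector<int> renderWithOpenMP(int width, int height) {
    assert(width >= 0 && height >= 0);
    std::vector<int> image(static_cast<std::size_t>(width) * height);

    // OpenMP splits the iterations of the outer loop among its threads.
    // x is the loop variable and y is declared inside the loop body, so each
    // thread has its own copy; the image buffer is shared, but every
    // element is written by exactly one iteration.
    #pragma omp parallel for
    for (int x = 0; x < width; x++) {
        for (int y = 0; y < height; y++) {
            image[static_cast<std::size_t>(y) * width + x] = shootRay(x, y);
        }
    }
    return image;
}
```

With GCC or Clang, OpenMP is enabled with the -fopenmp flag, and with Visual C++ with /openmp. The one-line approach is only safe here because the iterations are independent of each other; a loop whose iterations update shared state would need more care.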

If for some reason you use a system that does not support OpenMP (either because the C++ compiler is old or because OpenMP has not been enabled in the build settings), the #pragma directive will simply be ignored and the loop will run sequentially.
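Because a build without OpenMP fails silently in this way, it can be useful to check whether the pragma is actually in effect. A compiler with OpenMP enabled defines the _OPENMP macro, so a small sketch like the following can report it. The helper names builtWithOpenMP and availableRenderThreads are made up for this example.

```cpp
#include <cassert>
#ifdef _OPENMP
#include <omp.h>  // only available (and only needed) when OpenMP is enabled
#endif

// True when this translation unit was compiled with OpenMP enabled.
bool builtWithOpenMP() {
#ifdef _OPENMP
    return true;
#else
    return false;
#endif
}

// Upper bound on the threads a parallel loop may use; 1 when the pragma
// is ignored and the loop runs sequentially.
int availableRenderThreads() {
#ifdef _OPENMP
    int threads = omp_get_max_threads();
#else
    int threads = 1;
#endif
    assert(threads >= 1);
    return threads;
}
```

Printing availableRenderThreads() once at startup is a cheap way to notice that the OpenMP setting was forgotten in the build configuration.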

Of course OpenMP has many other features, but the example above is fully functional and in my case halved the render time, from 70 to 35, just by adding a single line of code.

