Java Image Processing: Capturing images with Java and HTML5

This article shows how to capture images from a webcam with HTML5 and send them to an image server written in Java.

Capturing images and using other peripherals on personal computers has always depended on the hardware and on the operating system's drivers (or modules on Unix platforms). Java can manipulate images captured by a webcam or similar device through the JMF (Java Media Framework, a topic not covered in this article). Although JMF abstracts the communication with image capture devices, it still needs, through the JVM, the driver that the operating system uses to communicate with them. Thus, there was (and still is) a strong reliance on the operating system and on the capture device when using a camera or any other peripheral. The device driver is responsible for "talking" to the low-level hardware and for providing data in the most standardized way possible to the operating system, which in turn passes this data to any software that needs to process it. From there, the developer can work with the data as he or she wishes.

The new HTML5 specification does not remove the need for a device driver, but it lets the web browser abstract the data capture operation. Communication with the device is taken over by an intermediary, the browser, which provides the interface between the web application (in JavaScript) and the capture device. The idea is that the browser acts as a bridge between the device and the web application: it gets the data from the operating system, which depends heavily on drivers and devices, and exposes it to web applications in a standardized format. The browser thus plays the same mediating role that the operating system plays between drivers and applications when it hides the hardware from the running application.

One advantage of this approach is that hardware can be diversified without designing it for a given architecture or platform, as happens today with computers that must be compatible with Windows, Unix/Linux, or Mac OS. Examples of this type of device are Google Chromebooks, which run an integrated operating system (Chrome OS, based on the Chromium project), and smartphones running Firefox OS (built on Gecko, the engine behind Firefox). In the future, it should be enough for a device to have an HTML5-compliant browser (or one that complies with a later specification), and applications written to the standard will run the same way everywhere. In fact, the most advanced browsers, such as Chrome, Safari, and Firefox, already follow the standard closely. This independence and the addition of new resources favor web applications even more.

Moreover, HTML5 brings other features that Java developers are used to seeing in enterprise software development, such as access to a relational database (via Web SQL Database, which is based on SQLite), asynchronous processing on independent threads (via Web Workers, similar to implementing the Runnable interface), and 2D graphics (vector and canvas, similar to the Java 2D API) for creating games, charts, or vector applications. There is also a specification for 3D graphics in progress, based on OpenGL ES and named WebGL.

These improvements to the HTML standard benefit the professionals involved in software development. For application developers, one advantage is that communication with the image capture device is abstracted, which increases productivity: there is no need to talk to the driver directly or to configure the device (as JMF-based applications require). Companies gain another great advantage when adopting HTML5: cost reduction. Software companies that build their commercial applications or games with HTML5 reduce development costs because the code is implemented only once and needs few (or no) adjustments or conversions for other platforms. Additionally, because HTML5 is a multiplatform web solution, code fixes automatically become available to all client devices, from computers to smartphones and tablets.

On the other hand, browser vendors have not always taken seriously the specification laid down by the W3C (World Wide Web Consortium) regarding function names and available features, since there are always individual interests and different views on the standard. Unfortunately, this problem persists today and inevitably leads developers to write browser-specific code. Still, the benefits of HTML5 are large, and it should be adopted whenever possible and feasible.

Another aspect worth mentioning is that, although some mobile browsers such as iOS Safari, Chrome, and the Android browser document support for video capture, they may have problems or only partial support for this functionality, which prevents HTML5 pages from using the camera. Apparently, for now, full access to the camera is only available to applications written in native code for these platforms. However, recent news suggests that HTML5 camera access in certain browsers tends to improve in future versions.

Based on this context, the aim of this article is to present a solution for capturing and storing images, using an HTML5 frontend application and a Tomcat server running Java as the backend that centralizes and stores the images.

Environment setup

An image capture device connected to the computer is required before starting development; a webcam that is already installed and configured is enough. We suggest testing the webcam before running the example in this article (for instance, with a program like Skype). This article uses the Eclipse IDE to develop the HTML and server code, which runs on the Apache Tomcat servlet container. A browser that supports the HTML5 standard is also required.

As for the browser, Google Chrome was chosen for two reasons: 1) it is the most used browser in the world today (having surpassed IE some time ago, according to the StatCounter site); and 2) it has more JavaScript debugging features than other browsers such as Safari or Firefox (extensions and third-party plugins were not considered in this comparison). The complete list of prerequisites for the practical part of this article is:

  • A Tomcat server;
  • The Eclipse IDE;
  • The Google Chrome browser;
  • A webcam.

The project consists of an HTML5 page that captures images through the web browser and sends them to a servlet running on Tomcat. The servlet in turn generates a random file name and stores the received image in a folder on the server (C:\repositorio), which serves as the image repository. The first step is therefore to create a dynamic web project in Eclipse, as shown in Figure 1.

Figure 1. Creating a dynamic web project in Eclipse.

While creating the project, give it a name and select Tomcat as the server. The first file we will create is named web.xml, and its content should be similar to the code in Listing 1. The servlet that receives images via an AJAX call made in JavaScript will be created later on.

Listing 1. The web.xml file for the servlet that will receive the images

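 <?xml version="1.0" encoding="ISO-8859-1"?>
 <web-app version="2.5" xmlns="http://java.sun.com/xml/ns/javaee">
   <description>Capture images tutorial</description>
   <servlet>
     <servlet-name>Image Receiver</servlet-name>
     <servlet-class>com.artigo.control.ImageServlet</servlet-class>
   </servlet>
   <servlet-mapping>
     <servlet-name>Image Receiver</servlet-name>
     <url-pattern>/receiver</url-pattern>
   </servlet-mapping>
 </web-app>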

Creating the HTML page

Now that we have the initial web.xml configuration, let's implement the HTML5 page responsible for the image capture. Create a file called index.html in Eclipse, place it in the root of the web content folder (usually WebContent), and fill it with the code in Listing 2.

Listing 2. Initial code of the HTML5 page.

<!DOCTYPE html>
     <meta charset="UTF-8">
     <title>Capturing Images with HTML5</title>

The image is captured from the webcam video displayed in your browser. The video is displayed through the new HTML tag called <video>, whose content here is nothing more than the sequence of images provided by the webcam. Therefore, we need to add this tag to display the webcam video on our page and position it properly. We also include a <canvas> tag, which displays the captured photo when the user presses a button that we will add later on.

An important point is that each tag must have an ID so that the JavaScript code can find the components easily. Let's also add two buttons to our page. The first captures an image from the <video> tag by calling the JavaScript function capture(), which draws the image inside the frame provided by the <canvas> tag. The second button sends the image drawn on the canvas to the servlet. Having both buttons lets the user choose the best photo before sending it to the server. The <div> tags and the CSS are used only to improve the appearance and arrangement of the components. See Listing 3.

Listing 3. HTML5 tags to display the video and the image drawn inside the canvas.

 <div><video id="videoID" autoplay style="border: 1px solid black;"></video></div>
 <div><canvas id="canvasID" style="border: 1px solid black;"></canvas></div>
 <input type="button" value="Take photo" onclick="capture()"
      style="width: 200px; height: 30px;"/>
 <input type="button" value="Send" onclick="send()"
      style="width: 200px; height: 30px;"/>

With just this markup, our page is ready to handle image capture. Now we implement the JavaScript code. We start with a snippet that creates the global variables video and canvas and assigns them references to the respective components so we can use them later. We also use a global variable called context, which is obtained from the canvas and provides a set of features (e.g. the drawing functions of the canvas), as shown in Listing 4.

Listing 4. Global variables referencing the HTML components.

<script type="text/javascript">
 var video = document.getElementById('videoID');
 var canvas = document.getElementById('canvasID');
 var context = canvas.getContext('2d');
 // ...code shown in the next listings...
</script>

Moving on, we will implement the code that captures the image. However, as mentioned above, browsers do not always expose the standard JavaScript names. The name of the browser function that captures the image does not (yet) follow the same pattern across all browsers, so a prior check is needed. We include the code in Listing 5 after the previous code to handle this issue.

Listing 5. Handling compatibility issues with native functions for image capture.

window.URL = window.URL || window.webkitURL;
 navigator.getUserMedia = navigator.getUserMedia || navigator.webkitGetUserMedia || 
                          navigator.mozGetUserMedia || navigator.msGetUserMedia;

The window.URL property provides utility functions for creating and handling URLs. Adjusting window.URL is required because it will be used later to redirect the image flow (the webcam video) obtained by the browser to the <video> tag. Thus, we check for and assign a reference to whichever existing implementation we can use. In our example, if the window.URL property does not exist in the browser running our page, it is created and assigned the value of window.webkitURL. If the browser has none of the implementations addressed in this article, the project will not work and another browser should be used.

Likewise, the getUserMedia() function of the navigator object is used to capture the flow of frames (images) provided by the webcam. This function has the same issue as the previous one: not all browsers expose it under the name getUserMedia(). In this case, the code checks the known vendor-prefixed names and, if one exists, assigns it to navigator.getUserMedia.

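Next, we use the property corrected above (window.URL) to redirect the flow of images from the webcam to the <video> tag. When getUserMedia() is called, the browser asks for permission to use the webcam and passes the video stream (the variable stream) to our callback; createObjectURL() then turns that stream into a URL that we assign to the video source, as shown in Listing 6. Note that recent browser versions no longer accept a stream in createObjectURL(); there the stream is assigned directly to video.srcObject.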

Listing 6. Handling compatibility issues with native functions for image capture and video redirection.

 navigator.getUserMedia({
   video : true
 }, function(stream) {
   video.src = window.URL.createObjectURL(stream);
 }, function(e) { console.log('An error happened:', e); });

So far we only prepared our page to work with JavaScript code. Now it is time to implement the functions that will be called on click events of our buttons. They will take the picture and send it to the server via an AJAX request.

The first function is called capture(), and its call was added to the onclick event of the 'Take photo' button. Its implementation is shown in Listing 7.

Listing 7. Function to capture the image.

 function capture() {
   context.drawImage(video, 0, 0, canvas.width, canvas.height);
 }

The capture() function calls the drawImage() function of the context object. The drawImage() function renders the image passed as its first parameter. In this article, the global variable video serves as the source image rather than a reference to a PNG file. Thus, when the 'Take photo' button is pressed, the frame being displayed in the <video> tag is captured and drawn inside the <canvas> tag.

The function called by the second button is send(), which was assigned to the onclick event of the 'Send' button. When invoked, it performs two actions. First, it gets the image previously drawn on the <canvas> and stores it in the imageData variable by calling canvas.toDataURL(), which returns the image data in PNG format. The returned value consists of a description of the captured image, its type, and the encoded binary data (e.g. "data:image/png;base64,iVBOR..."). The text after the comma in "data:image/png;base64," (in this case "iVBOR...") is the image represented in Base64. This information will be processed and converted in our servlet later. Second, after obtaining the image data, the function creates an XMLHttpRequest to send this data to the servlet, providing its address (/article/receiver) and sending imageData with the HTTP POST method. The implementation of send() is provided in Listing 8.

Listing 8. Function to send the captured image to the servlet.

 function send() {
   var imageData = canvas.toDataURL();
   var xmlhttp = new XMLHttpRequest();
   xmlhttp.open("POST", "/article/receiver", true);
   xmlhttp.send(imageData);
 }
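
The "iVBOR" prefix mentioned above is not arbitrary: it is what the fixed 8-byte signature at the start of every PNG file looks like once it is encoded in Base64. The following Java sketch illustrates this with java.util.Base64 (available since Java 8). The class and method names are hypothetical and are not part of the article's project.

```java
import java.util.Base64;

// Hypothetical demo class, used only to illustrate the data URL format.
final class PngSignatureDemo {
    // Every PNG file starts with these 8 bytes.
    static final byte[] PNG_SIGNATURE = {
        (byte) 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A
    };

    // Base64 text of the signature alone, including padding.
    static String encodedSignature() {
        return Base64.getEncoder().encodeToString(PNG_SIGNATURE);
    }

    // Rough check that a string is a data URL carrying a Base64-encoded PNG.
    static boolean looksLikePngDataUrl(String dataUrl) {
        // Only the first 10 Base64 characters are compared: the 11th also
        // depends on the byte that follows the signature.
        String expectedStart =
                "data:image/png;base64," + encodedSignature().substring(0, 10);
        return dataUrl.startsWith(expectedStart);
    }

    public static void main(String[] args) {
        System.out.println(encodedSignature()); // prints iVBORw0KGgo=
        System.out.println(
                looksLikePngDataUrl("data:image/png;base64,iVBORw0KGgoAAAANSUhEUg")); // prints true
    }
}
```

A check like looksLikePngDataUrl() could be used on the server to reject requests that do not carry a PNG, although it is only a heuristic and does not validate the rest of the image.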

Creating the Servlet

So far we have an HTML5 page that captures an image and sends its data to a specific URL (/article/receiver). We must now create a servlet to receive the image, convert it, and save it in the folder C:\repositorio. The class com.artigo.control.ImageServlet was created, and its doPost() method was overridden to receive the image and convert it to a file. The complete servlet code is shown in Listing 9.

Listing 9. Servlet code

 package com.artigo.control;
 import java.io.File;
 import java.io.FileOutputStream;
 import java.io.Reader;
 import java.util.Random;
 import javax.servlet.http.HttpServlet;
 import javax.servlet.http.HttpServletRequest;
 import javax.servlet.http.HttpServletResponse;
 import sun.misc.BASE64Decoder;
 public class ImageServlet extends HttpServlet {
     private static final long serialVersionUID = 1L;
     public void doPost(HttpServletRequest request, HttpServletResponse response) {
         try {
             // Read the request body (the data URL) one character at a time.
             StringBuffer buffer = new StringBuffer();
             Reader reader = request.getReader();
             int current;
             while ((current = reader.read()) >= 0) {
                 buffer.append((char) current);
             }
             String data = new String(buffer);
             // Keep only the Base64 text that follows the first comma.
             data = data.substring(data.indexOf(",") + 1);
             System.out.println("PNG image data on Base64: " + data);
             FileOutputStream output = new FileOutputStream(new File("/C:/repositorio/"
                     + new Random().nextInt(100000) + ".png"));
             // Decode the Base64 text into bytes and write them to the file.
             output.write(new BASE64Decoder().decodeBuffer(data));
             output.close();
         } catch (Exception e) {
             e.printStackTrace();
         }
     }
 }

The first thing the servlet does is retrieve the image that was sent to the address /article/receiver. To make this easier, it instantiates a StringBuffer object called buffer and fills it by reading the request data one character at a time in a while loop. Once the data is fully read, the buffer also contains the part that describes the image format ("data:image/png;base64"), which does not matter because we already know that we are dealing with a PNG image in Base64. To build a byte array without this part, we have to separate out the Base64 portion that we want to convert. As mentioned previously, the relevant image data comes after the first comma in the string, so we extract it with the data.substring() method. To verify the result, a call to System.out.println() prints the value of the string to the console.
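
The split at the first comma can be sketched in isolation, outside the servlet. The example below is a minimal illustration and not the article's code: the DataUrlParts class and its field names are hypothetical. Unlike the plain substring() call, it rejects input that contains no comma instead of silently treating the whole string as image data.

```java
// Hypothetical helper that separates a data URL into its header and payload.
final class DataUrlParts {
    final String header;   // for example "data:image/png;base64"
    final String payload;  // the Base64 text after the first comma

    private DataUrlParts(String header, String payload) {
        this.header = header;
        this.payload = payload;
    }

    // Splits the data URL at the first comma.
    static DataUrlParts parse(String dataUrl) {
        int comma = dataUrl.indexOf(',');
        if (comma < 0) {
            throw new IllegalArgumentException("Not a data URL: no comma found");
        }
        return new DataUrlParts(dataUrl.substring(0, comma),
                                dataUrl.substring(comma + 1));
    }

    public static void main(String[] args) {
        DataUrlParts parts = parse("data:image/png;base64,iVBORw0KGgo=");
        System.out.println(parts.header);  // prints data:image/png;base64
        System.out.println(parts.payload); // prints iVBORw0KGgo=
    }
}
```

Keeping the header also makes it possible to check the declared image type before decoding, should the page ever send something other than PNG.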

Once we have the data, we must convert it from Base64 to a byte array before writing it to a file. If this step fails, the data is corrupted and an image viewer cannot read the file. For this conversion we import the sun.misc.BASE64Decoder class and call its decodeBuffer() method, which returns an array of bytes that can be written to the output file.
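
Since sun.misc.BASE64Decoder is an internal JDK class rather than a public API, the same decoding step can also be written with java.util.Base64, which is available from Java 8 onward. The sketch below is an alternative under stated assumptions, not the article's implementation: the ImageStore class, the save() method, and the use of a UUID for the file name are all hypothetical choices, and a temporary directory stands in for C:\repositorio.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;
import java.util.UUID;

// Hypothetical helper that decodes a Base64 payload and stores it as a PNG file.
final class ImageStore {
    // Decodes the payload and writes it to a new .png file inside `repository`.
    static Path save(Path repository, String base64Payload) throws IOException {
        // decode() throws IllegalArgumentException if the text is not valid Base64.
        byte[] bytes = Base64.getDecoder().decode(base64Payload);
        Files.createDirectories(repository);
        // A random UUID makes name collisions far less likely than a number below 100000.
        Path target = repository.resolve(UUID.randomUUID() + ".png");
        return Files.write(target, bytes);
    }

    public static void main(String[] args) throws IOException {
        // A temporary directory stands in for the article's C:\repositorio folder.
        Path repository = Files.createTempDirectory("repositorio");
        Path saved = save(repository, "iVBORw0KGgo=");
        System.out.println(saved + " (" + Files.size(saved) + " bytes)");
    }
}
```

Inside the servlet, save() would be called with the text that follows the first comma of the request body.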

Finally, the servlet creates a file to store the image in our repository. If all went well, after running our code and accessing the URL http://localhost:8080/article we see browser screens similar to the ones in Figures 2 and 3.

Figure 2. Page running on Google Chrome.

Figure 3. Image captured after pressing the "Take Photo" button.


As discussed in this article, the new HTML5 standard allows the use of hardware resources previously available only to native desktop applications or to applications created to run locally on a JVM. These new features can be leveraged to create rich web applications and obtain benefits that meet specific market needs. HTML5 can also be combined with other technologies and frameworks, such as jQuery, JSP, Servlets, EJBs, Web Services, and JavaServer Faces, on both the client side and the server side, to further expand the range of applications that can use the features presented in this article.

Web applications become increasingly attractive as the communication bottlenecks between servers and clients are reduced and networks become faster and more reliable. Adopting HTML5's new features in centralized applications can bring many benefits, such as simplicity, portability, reduced development costs, and faster decisions on bug fixes, among others.

The features of HTML5 can be a major competitive advantage for companies (especially startups) that have great ideas but lack the money, time, or resources to create and port applications to different platforms and languages.

Based on this scenario, the integration of HTML5 with Java solutions shows a promising future, combining a great design on the front end with a rich and extensive services layer on the backend.

SEO Specialist working at VML Brazil www.ricardoarrigoni.com.br
