Free Online Courses for Software Developers - MrBool
× Please, log in to give us a feedback. Click here to login
×

You must be logged to download. Click here to login

×

MrBool is totally free and you can help us to help the Developers Community around the world

Yes, I'd like to help the MrBool and the Developers Community before download

No, I'd like to download without make the donation

×

MrBool is totally free and you can help us to help the Developers Community around the world

Yes, I'd like to help the MrBool and the Developers Community before download

No, I'd like to download without make the donation

Getting Started with Java Collections Framework

In this article, well see a complete overview around the Java Collections Framework, the official and must used in the market.

Since version 1.2 of the JDK (after it was renamed to Java 2 SDK), J2SE platform includes a collections framework (the Collections Framework). A collection is an object that represents a group of objects. A collections framework is a unified architecture for representing and manipulating collections, allowing them to be manipulated independently of the implementation details.

The main advantages of the collections framework are:

  • Programming effort reduction, providing useful data structures and algorithms, so you do not need to rewrite them.
  • Increased performance, providing high-performance implementations. Because various implementations are substitutable for each interface, the programs can be easily refined by exchanging implementations.
  • Interoperability between unrelated APIs by establishing a common language to pass collections.
  • Reduction of APIs learning effort by eliminating the need to learn multiple APIs from different collections.
  • Reduction of the effort to design and implement APIs by eliminating the need to create own collections APIs.
  • Promotion of the software reuse by providing a standard interface for collections and algorithms to manipulate them.

The Collections Framework comprises:

  • Collections interfaces - represent different types of collection, such as sets, lists and associative arrays. These are interfaces form the basis of the framework.
  • General-purpose implementation - basic implementations of the collection interfaces.
  • Legacy implementations - the collection classes from earlier releases, Vector and Hashtable, have been remodeled to implement the collection interfaces (maintaining compatibility).
  • Wrappers implementations - add functionality, such as synchronization, to other implementations.
  • Convenient implementations - "mini-implementations" with high performance collections interfaces.
  • Abstract implementations - partial implementations of the collection interfaces to facilitate the creation of custom implementations.
  • Algorithms - static methods which perform useful functions on collections, such as sorting a list.
  • Utilities arrays - Utility functions for primitive arrays and objects. Strictly speaking, it does not belong to the Collections Framework, but this functionality was added to the Java platform at the same time, and makes use of the same infrastructure.

Collection

Basic interface for all types of collection. It defines the most basic operations for collections of objects, such as abstract addition (add) and removal (remove) (no information about the ordering of the elements), emptying (clear), size, conversion array (toArray), object iteration (iterator), and existence checks (contains and isEmpty).

List

Interface that extends Collection, and which defines ordered collections (sequences), where you have full control over the position of each element identified by a numerical index. In most cases, it can be seen as a "variable length array" because, as primitive arrays, it is accessible by indexes, but additionally have insertion and removal methods.

ArrayList

List implementation that internally uses an array of objects. In an insert where the size of the internal array is not enough, a new array is allocated (of equal size to 1.5 times the original array), and all content is copied to the new array. In an insert in the middle of the list (index < size), the later content to the index is shifted in a position. This implementation is recommended when the list size is predictable (avoiding reallocations) and the insertion and removal operations are done mostly at the end of the list (avoiding displacement), or when the list is more read than modified (optimized for random).

LinkedList

List implementation that internally uses a linked list. The location of an element in the nth position is made by scrolling the list of the nearest tip to the desired content. The insert is made by adding new nodes between adjacent nodes, and before the location of this position is necessary. This implementation is recommended when modifications are made mostly at the beginning and at the end of the list, and transversal is done sequentially (via iterator) or at the ends, and not random (for indexes). An example of use is as a queue (FIFO - First-In-First-Out), where the elements are removed from the list in the same sequence in which they are added.

Vector

List implementation with the same behavior of ArrayList, however, fully synchronized. By having their synchronized methods, we also have lower performance than an ArrayList, but can be used in a multitasking environment (accessed by multiple threads) without danger of consistency loss of its internal structure.

In real systems, this synchronization has just added an unnecessary overhead, because even when there is multitasking access to the list (which does not happen in most cases), it is almost always necessary to make a new synchronization in business methods, to

Let's see an example: a call to contains() method, then a call to add() method if the element does not exist, can cause race conditions if there is access to various threads concurrently. Thus, an additional synchronization is eventually needed, encompassing these two operations and obviating the inherent class synchronization. Therefore, unless there are simultaneous access of multiple threads to the list, a simple synchronization provided by Vector is enough, so prefer other implementations such as ArrayList and LinkedList, which offer superior performance.

Stack

List implementation that offers access methods to use the list as a stack (LIFO - Last-In-First-Out) like push(), pop() and peek(). Vector extends, therefore, inherits the advantages and disadvantages of this synchronization. It can be used to take advantage of the specific implementations of stack operations.

Set

Interface that defines a collection or group that does not contain duplicate objects. That is, it ignores additions if the object or an equivalent object already exists in the collection. Equivalent objects mean objects that have the same hash code (returned by the hashCode() method) and return true in a comparison made by the equals() method.

Is not guaranteed the ordering of objects, i.e., the iteration order of the objects do not necessarily have any relation to the order of object insertions. Therefore, you cannot index the elements by numerical indices, such as a List.

HashSet

Set implementation that uses a hash table (the Sun's implementation uses the HashMap class internally) to store its elements. It does not guarantee the iteration order, or the order will remain constant over the time (a collection modification can change the general ordering of the elements). By using the hash table algorithm, access is fast, both for reading and for modification.

LinkedHashSet

Set implementation that extends HashSet, but adds the predictability iteration order over the elements, that is, an iteration over its entirety (using Iterator) maintains the order of insertion (insertion of duplicated elements does not change the previous order). Internally, it maintains a doubly linked list that keeps this order. By having to keep a list parallel to the hash table, the modification of this type of collection brings in a slight drop in performance relative to the HashSet, but it is still faster than a TreeSet, which uses comparisons to determine the order of the elements.

SortedSet

Interface that extends Set by adding the natural ordering semantics of the elements. The position of the elements in the traversal of the collection is determined by the return of the compareTo(o) method, if the elements implement the Comparable interface, or compare(o1, o2) method of a helper object that implements the Comparator interface.

TreeSet

SortedSet implementation that uses internally a Treemap, which, in turn, uses the red-black ordering algorithm for the element tree. This ensures the ascending order of the collection, according to the natural order of the elements defined by the implementation of the Comparable or Comparator interface. Use this class when you need a set (unique elements) and it must always be ordered even after undergone changes. For cases where writing is made at a time before reading the elements, it may be more advantageous to make the sorting in a List, then a copy to a LinkedHashSet (depending on the size of the collection and the number of elements repetitions).

Map

Interface that defines an associative array, i.e., instead of numbers, objects are used as keys to recover the information. The keys cannot be repeated (using the same principle of Set interface), but the values can be repeated for different keys. A Map also does not necessarily have a defined order for traversal.

HashMap

Map implementation that uses a hash table to store its elements. The access time to the elements (reading and modification) is constant (very good) if the hash function is well distributed, that is, the chance of two different objects return the same value for hashCode() method is small.

LinkedHashMap

Map implementation that extends HashMap, but adds the predictability iteration order over the elements, that is, an iteration over its entirety (using Iterator) maintains the order of insertion (insertion of duplicated elements does not change the previous order). Internally, it maintains a doubly linked list that keeps this order. By having to keep a list parallel to the hash table, the modification of this type of collection brings in a slight drop in performance compared to HashMap, but it is still faster than a TreeMap, that uses comparisons to determine the order of the elements.

Hashtable

Just like the Vector, a Hashtable is a legacy of the first versions of the JDK, also synchronized in each of its operations. For the same reasons of the Vector class, give preference to other implementations such as HashMap, LinkedHashMap and TreeMap, to gain in performance.

Properties

Class that extends Hashtable, specialized to work with Strings. It is a collection itself (since it is restricted to strings), but serves as an example of a specialized implementation of the Map interface.

IdentityHashMap

Map implementation that intentionally violates the contract interface, using the identity operator (==) to define equivalence of keys. This implementation should be used only in rare cases where the semantic referential identity have to be used for equivalence.

WeakHashMap

Map implementation that uses weak references as keys, allowing the deallocation by the garbage collection of objects contained therein. This class can be used as the basis for the implementation of temporary cache with data automatically deallocated in case of little use.

SortedMap

Interface that extends Map, adding the natural ordering semantics of the elements analogous to SortedSet. It also adds the collection partition operations, with

  • headMap(k) methods - which returns a SortedMap with previous key elements to k;
  • submap (k1, k2) - which returns a SortedMap with key elements between k1 and k2;
  • And tailMap(k) - which returns a SortedMap with key elements subsequent to k.

TreeMap

SortedMap implementation that uses the Red-Black algorithm for sorting the elements tree. This ensures the ascending order of the collection, according to the natural order of the elements defined by the implementation of the Comparable or Comparator interface. Use this class when you need a set (unique elements) and it must always to be ordered even after undergone changes. Analogous to TreeSet, in cases where writing is made at a time before reading the elements may be more advantageous to make the sorting in a List, then a copy to a LinkedHashSet (depending on the size of the collection and the number of repetitions keys).

Auxiliary interfaces

The Collections framework also defines a number of auxiliary interfaces that define operations objects returned by methods of collection interfaces.

Iterator

Interface that defines the basic operations for the traversal of the collection elements. It uses the same name pattern (iterator, GoF), disengaging the code that uses the collections of their internal structures. You can remove elements from the original collection using the remove() method, which removes the current element of the iteration, but this operation is optional implementation, and a UnsupportedOperationException will be thrown if this is not available.

ListIterator

Interface that extends Iterator, adding specific functions for the List type collections. It is obtained by ListIterator() method of a List, and has traversal operation in both directions (permitted by the indexing elements carried by the List).

Comparable

Interface that defines the comparison operation of the object itself with another, used to define the natural order of the elements of a collection. It can be used if the objects to be added to the collection already implement the interface (Integer, Double, String, etc.), or will be implemented by the developer, which has a chance to embed this interface in his code.

Comparator

Interface that defines a comparison operation between two objects by an external object. It is used when the objects to be added cannot be modified to accept the Comparable interface (a third-party library, for example), or exchange the runtime ordering strategy (the ordering of the table rows is needed, for example).

This interface provides greater flexibility without significant additional cost. Prefer its use instead of the Comparable interface, unless this is necessary (that is, you use a class that expects objects to implement it) or the very semantics of objects requires this natural order (which is quite subjective and specific domain/system).

Enumeration

\'Ancestral\' of the Iterator interface, which provides the same functions, except the removal of elements from the collection. The names of its methods are much longer - hasMoreElements() and nextElement() against hasNext() and next() - which makes the typing and the code reading easier. It is used primarily only in the Vector and Hashtable classes and can be practically ignored and replaced by the Iterator interface.

RandomAccess

Marker interface, which indicates that only certain List implementations are optimized for random access (numerical index, instead of iterators). This information can be used for some classes, so that they can change their internal strategies to gain performance.

Utility classes

Utility classes, as assumed here, are classes that have only static methods, providing common algorithms.

Collections

It provides methods that perform operations on collections. Some operations are available: binary search in lists; element replacement of a list by a certain object; search for the smallest or largest element (according to the natural order defined by Comparable or Comparator interfaces); reversing a list; shuffling; elements displacement; obtaining synchronized or immutable collections, based on existing collections (through Decorator pattern).

Arrays

Utility class that provides operations on arrays, including: sorting, binary search, comparison of arrays, fill, hashCode calculation based on the elements of an array, processing String and the casing of an array by implementing fixed length List.

Programming Interfaces

The Java Collections framework provides the infrastructure for a whole data structures. In addition, the use of well-defined basic interfaces (Collection, List, Set, SortedSet, Map, SortedMap) also creates abstractions that increase the flexibility and ease of modification.

When using these classes for programming, we must always reference them by interfaces. Someone may say, "But, we can’t instantiate interfaces!" Of course not. When we need to instantiate a collection, we use the most appropriate concrete implementation, but the types of parameters and the methods returns should always be one of these basic interfaces.

For example, let’s see the Listing 1 code.

Listing 1. Shopping List example.

private Vector ShoppingList;  
public void setShoppingList(Vector list) {  
    this.ShoppingList = list;  
}  
public Vector getShoppingList() {  
    return this.ShoppingList;  
}  

Suppose we have noticed that there is no concurrent access to this object, and the inherent synchronization Vector is affecting system performance. To alter the type from Vector to ArrayList, eliminating unnecessary overhead synchronization, we have to change all the classes that use this method, as they pass and expect a Vector, not an ArrayList.

Worse, suppose these classes using our shopping list (created by other teams thus more difficult to change) also have their own internal data structures, using the LinkedList class. To pass parameters or receive values you need to convert between different collections by copying elements from one to another, further increasing the performance problem, and making the code more difficult to read and maintain.

In these cases, instead of Vector had used the List interface, would be enough to change the location of class instantiation, that the rest of the code would automatically have the new implementation without modifications. A collection created in one component could easily be used by another, because all it hopes is an implementation of List, and not a concrete class.

Therefore, use interfaces whenever you can, especially in public methods signatures, and the maximum limit references to concrete classes, even internally in the class.

What interface to use?

We now have at least six interfaces to choose from: Collection, List, Set, SortedSet, Map and SortedMap. If we add the collection to access aids, we still have the iterator and ListIterator. Which one to use?

The Collection is the most generic, so it is the most suitable forever, right?

Wrong! Because it is so general, its operations are very limited, making it difficult to handle. For example, it does not allow direct access to an element, all the reading should be made through your iterator. If the collection will always be read as a sequence of elements, and the random access (index) does not make sense, there is no problem. However, if such access is necessary (or even desirable), it makes more sense to use a List interface.

This is a problem without a definitive answer, because a lot depends on each case. But, I can list some basic rules, obtained from my own experience. Not that this is a big deal, but worth a try:

  • The Collection interface is typically used for disposable sets, they will be read only once and not internally stored or handled (at maximum, copied to another structure).
  • The Set interface serves more as a marker of the expected semantics; it does not add any operations to the Collection.
  • The SortedSet and SortedMap interfaces serve to force a contract, that the elements of the collection will come ordered. They appear mainly as parameters of methods that await this sort. However, in most cases, the Treemap and TreeSet classes are used only locally, as temporary sorting containers.

Examples of use.

Let’s see in the following listings some typical examples of Collection iteration.

Listing 2. Typical loop in a Collection using the Iterator

Iterator it = collection.iterator();  
while (it.hasNext()) {  
    String item = (String) it.next();  
    System.out.println(item);  
}  

Listing 3. Typical loop in a Map, using the Iterator

Map map = new HashMap();  
Iterator itKeys = map.keySet().iterator();  
while (itKeys.hasNext()) {  
    Object key = itKeys.next();  
    Object value = map.get(key);  
    System.out.println(key + " = " + value);  
}  

Listing 4. Using an iterator to remove elements of a collection

Iterator it = listColors.iterator();  
while (it.hasNext()) {  
    String item = (String) it.next();  
    if (item.charAt(0) == \'v\') { // removing the colors that begin with\'v\'  
        it.remove();  
    }  
}

Listing 5. Using a TreeSet to list the contents of a Map/Properties in an orderly way

Properties p = new Properties();  
//...  
Iterator it = new TreeSet(p.keySet()).iterator(); // Sort the keys before creating the iterator  
while (it.hasNext()) {  
    String key = (String) it.next();  
    System.out.println(key + "=" + p.getProperty(key));  
}  

Listing 6. Initiating a Collection

// Normal version:  
List listColors = new ArrayList();  
listColors.add("red");  
listColors.add("green");  
listColors.add("yellow");  
listColors.add("white");  

// Abbreviated version using an array
List listColors = new ArrayList(Arrays.asList(new String[] {  
    "red", "green", "yellow", "white" }));  

Listing 7. Creating and ordering a Collection

List listColors = new ArrayList(Arrays.asList(new String[] {  
    "red", "green", "yellow", "white" }));  
  
Collections.sort(listColors);  

Listing 8. Intersection of sets using Set

Set hotColors = new HashSet(Arrays.asList(new String[] {  
    "red", "orange", "yellow" }));  
  
Set setColors = new HashSet(Arrays.asList(new String[] {  
    "red", "green", "yellow", "white" }));  
  
setColors.retainAll(hotColors); // setColors == {"red", "yellow"}  

Conclusion

This is just some of the real power of the Java Collections framework. We can do almost everything with it, and it’s, for sure, the most mature library of collections in Java of the world. Go ahead, and try a little!



Web developer and passioned for web design, SEO and front end technologies.

What did you think of this post?
Services
[Close]
To have full access to this post (or download the associated files) you must have MrBool Credits.

  See the prices for this post in Mr.Bool Credits System below:

Individually – in this case the price for this post is US$ 0,00 (Buy it now)
in this case you will buy only this video by paying the full price with no discount.

Package of 10 credits - in this case the price for this post is US$ 0,00
This subscription is ideal if you want to download few videos. In this plan you will receive a discount of 50% in each video. Subscribe for this package!

Package of 50 credits – in this case the price for this post is US$ 0,00
This subscription is ideal if you want to download several videos. In this plan you will receive a discount of 83% in each video. Subscribe for this package!


> More info about MrBool Credits
[Close]
You must be logged to download.

Click here to login