INFO 450

The Collections API

"Java: How To Program - 5th Ed." Chapter 21


More Cool Stuff from the API

We have learned to create our own data structures and have realized their importance and potency in terms of making advanced applications.  The business of making our own data structures and algorithms for searching them is difficult and the potential for error exists in most steps. 

Java comes with an API designed to implement most of the popular data structures and algorithms in a comprehensive framework known as the Collections API.  Ideally, it is good to know the internals of the various data structures and algorithms in order to know which is appropriate in a given setting, but implementing them can be difficult.  In the spirit of reuse, the Collections API provides a solid and tested implementation of most classic data structures and algorithms so that you don't have to. 

The Collections API was designed to take the place of some older facets of the Java API like:

  • Hashtable
  • Stack
  • Vector

and provide a more comprehensive system of data structures and algorithms that Java programmers and use, reuse and build upon. 

the Collections API and the STL in C++

The Collections API is very similar to the Standard Template Library in C++.  This is good as any transition you would want to make "backwards" to C++ would be enhanced by your awareness of the STL and the power of reuse.

A Collection

A collection is a loose, higher-level reference to a data structure.  The Collections framework presents interfaces to each of the data structures that you are already familiar with such that you merely use methods to populate them and manipulate them.

The real power of the collections within the Collections API is the fact that they are optimized in terms of speed, memory use and efficiency.  What is best above all is the fact that the collections are ready for reuse.  Each of these collections classes are available in the java.util package and the classes can be roughly categorized about the following interfaces:

  • Collection
  • Set
  • List
  • Map

The Arrays Class

One useful class is Arrays.  Arrays provides higher-level methods for array manipulation and relieves the programmer of some of the nitty-gritty management of arrays.  Among the more useful methods are:

  • equals() - compare arrays
  • fill() - placing values into an array
  • sort() - sorting an array
  • binarySearch() - searching a sorted array

Each of these methods are available for arrays of java.lang.Object (and derivatives) and primitives. 

Arrays Class Examples

UsingArrays.java

UsingAsList.java

The Collections Interface - Root of it All

As you would imagine, the Collections API makes heavy use of inheritance and polymorphism just as the rest of the Java API does; this is where the Collection interface comes in.  Both List and Set stem directly from the Collection interface. 

The majority of the methods provided by Collection are bulk operations for an entire collection for adding, clearing, comparing and retaining objects within the collection. 

Convert to Array

Any collection can be converted to an array, which are sometimes easier to deal with for sorting purposes.

Iterator

Each collection can also produce an Iterator, which is a class similar to Enumeration which has the ability to traverse any collection.  Iterator goes further than Enumeration in that an Iterator may remove nodes from a collection.

Collections Class

The Collections class provides a series of polymorphic methods for collection manipulation with regards to sorting and searching.

Lists

A List is an ordered collection which can contain duplicate elements.  We know about the Linked-List, which can be considered, for the purposes of the Collections API, as a List.  The List is a linear data structure or sequence which is indexed starting at zero. 

ArrayList

A List can be manipulated via indexing, must like a Vector.  In fact, the ArrayList, a subclass of List, is designed to be a direct replacement for the Vector class.

LinkedList

Another subclass of List is LinkedList, which is self-explanatory.

List Examples

CollectionTest.java

ListTest.java

UsingToArray.java

Algorithms

The majority of the algorithms we have learned are related to sorting and searching linear dynamic data structures.  The Algorithms classes provide implementations of many familiar methods for sorting and searching ready to go.  Some of the Algorithms available are:

  • Sort
  • BinarySearch
  • Reverse
  • Shuffle
  • Fill
  • Copy
  • Min
  • Max

Sort

The Sort class is a generic and stable-sort (does not reorder equal-valued elements in a structure) method.  As we have studied, using the Big Oh notation, the relative prowess and complexity theory of various algorithms previously, the Sort in the Collections API is a linearithmic (N log N) method.

Sort Example

Sort1.java

Sort2.java

Shuffle

This is an interesting algorithm designed to do just what it says it will: randomly reorder elements in a List.

Shuffle Example

Cards.java

Reverse, Fill, Copy

Each of these are also self-evident and are designed to primarily work on Lists.

Min and Max

These can be used to find the minimum and maximum value in any Collection.

Using fill, copy, min and max

Algorithms1.java

Binary Search

This is also self-evident and represents a well-honed, high-speed implementation of this search algorithm.  The binarySearch algorithm will find an element in a List (Vector, ArrayList, LinkedList).

binarySearch Example

BinarySearchTest.java

Set

A Set is a Collection which contains only unique elements (no repeating values).  There are two primary implementations of the Set:

  • HashSet - replaces Hashtable
  • TreeSet - a tree implementation

As we already know these two approaches to non-duplicate data storage, there is not much else that needs to be said about them

Set Examples

SetTest.java

SortedSetTest.java

Map

A map is a key-value association which is familiar to the hashing process and discussion.  A Map will associate a key value to a record value and may not contain duplicate keys.  Each key map can contain only one value, making a one-to-one mapping.  Whereas a Set will only contain a key value, a Map contains both the key and value together; making the Map different than a normal hash table.  The implementations of Map are then very similar to those of Set:

  • HashMap - contains the key and value
  • TreeMap - contains the key and value

Sorted Map

The SortedMap interface extends Map by keeping the keys in a sort order.

Map Examples

WordTypeCount.java

Synchronization and Immutability

There are some things to think about with the Collections API with respect to multi-threaded programs.  None of the collections in the Collections API are synchronized and thus not considered thread-safe. There are a series of wrapper classes provided in order to provide synchronized thread safety:

  • Collection synchronizedCollection( Collection c )
  • List synchronizedList( List l )
  • Set synchronizedSet( Set s )
  • SortedSet synchronizedSortedSet( SortedSet s )
  • Map synchronizedMap( Map m )
  • SortedMap synchronizedSortedMap( SortedMap m )

Another series of wrapper classes exist to provided immutable read-only version of each Collection:

  • Collection unmodifiableCollection( Collection c )
  • List unmodifiableList( List aList )
  • Set unmodifiableSet( Set s )
  • SortedSet unmodifiableSortedSet( SortedSet s )
  • Map unmodifiableMap( Map m )
  • SortedMap unmodifiableSortedMap( SortedMap m )

Abstract Classes

An additional "beauty" of the Collections API are the presence of abstract classes from which a programmer can customize their own versions of the collections API.  These "bare bones" abstract classes can allow the programmer to tweak and customize and remain within the larger confines and framework of the Collections API.  Some of the abstract classes available are:

  • AbstractCollection
  • AbstractList
  • AbstractMap
  • AbstractSequentialList
  • AbstractSet