Introduction

Overview

Regardless of the programming language, we often distinguish between scalar data types, which can only hold one piece of data, and structured types, which can contain multiple data elements. Records or objects (again, in non-language-specific terms) are structured types containing multiple data elements, each one potentially of a distinct type, and with each element referenced by its name. But a more fundamental structured type is a sequence, containing multiple data elements of the same type, referenced by position or index. In Java (as in many other languages), the most basic—and most important—kind of sequence is an array.

Concepts & lineage

We can picture an array as a row of sequentially numbered post office boxes. We can open any individual box and see what it contains; we can also open any individual box and put something inside it. We usually refer to these boxes as elements, and the box numbers as indices or positions. Java arrays are zero-based: the first element position is 0, and an array containing n elements has elements in index positions 0 through (n - 1).
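As a minimal sketch of zero-based indexing, the following allocates five "boxes" and fills the first and last. The class name `ZeroBasedDemo`, the helper `lastIndex`, and the stored values are illustrative assumptions, not part of any library.

```java
class ZeroBasedDemo {
    // Index of the last element of a non-empty array: always length - 1.
    static int lastIndex(int[] boxes) {
        return boxes.length - 1;
    }

    public static void main(String[] args) {
        int[] boxes = new int[5];       // five elements, at indices 0 through 4
        boxes[0] = 10;                  // put something in the first box
        boxes[lastIndex(boxes)] = 50;   // put something in the last box (index 4)
        System.out.println(boxes[0] + " " + boxes[4]);  // prints "10 50"
    }
}
```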

In this row of post office boxes, each box is the same size and shape; therefore, each box must contain the same type of content. Continuing the analogy, imagine a row of very small, letter-sized boxes: We could put a letter in each one, but not a package. The post office boxes are homogeneous, and so are Java arrays.

Java arrays are not resizable: Once memory is allocated for an array of n elements, that block of memory can’t be expanded or contracted. The only thing we can do is allocate a new array of the modified size, and copy some or all of the element values to the new array. (Fortunately, this allocation and copying can often be done with a single invocation of the Arrays.copyOf method; see “Creation: Copy creation” for more information.)
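The "allocate and copy" step can be sketched with Arrays.copyOf, which returns a new array and leaves the original untouched. The class name `ResizeDemo`, the helper `resized`, and the sample values are assumptions made for illustration.

```java
import java.util.Arrays;

class ResizeDemo {
    // Returns a NEW array of the requested length; the source array is not modified.
    static int[] resized(int[] source, int newLength) {
        return Arrays.copyOf(source, newLength);
    }

    public static void main(String[] args) {
        int[] original = {2, 7, 1};
        int[] longer = resized(original, 5);    // extra elements get the default value 0
        int[] shorter = resized(original, 2);   // trailing elements are dropped
        System.out.println(Arrays.toString(longer));   // prints "[2, 7, 1, 0, 0]"
        System.out.println(Arrays.toString(shorter));  // prints "[2, 7]"
        System.out.println(original.length);           // prints "3": unchanged
    }
}
```

Note that any other variable still referring to the original array keeps seeing the old, unresized array.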

All of the above characteristics—zero-based indexing, homogeneity, and non-resizability—are due, in large part, to Java’s lineage: Java’s low-level syntax is derived directly from C syntax, and this extends to arrays. There are other similarities—and some important differences—between Java and C arrays; some of them will be addressed in “Declaration” and “Creation”.

Element types

The most basic arrays are defined to contain a primitive value in each element. For example, we have this array of int values, of length 5:

index	0	1	2	3	4
value	2	7	1	8	2

Arrays may also be defined to hold objects. It’s important to remember, however, that every Java variable declared with an object type actually holds a reference to an object, not the object itself; the same is true of array elements. For example, the elements of a String[] are not actually the string values, but references to those strings. We usually don’t need to worry about this distinction in Java, but it comes into play in array initialization, and also in the creation and use of multidimensional arrays.
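One place the value/reference distinction shows up is in default initialization, sketched below: a freshly allocated int[] holds zeros, while a freshly allocated String[] holds null references until strings are assigned. The class name `ElementTypesDemo`, the helper `emptyNames`, and the sample string are hypothetical.

```java
class ElementTypesDemo {
    // Allocates an array of String references; no String objects are created.
    static String[] emptyNames(int count) {
        return new String[count];
    }

    public static void main(String[] args) {
        int[] digits = {2, 7, 1, 8, 2};   // primitive elements hold the values themselves
        int[] zeros = new int[3];         // primitives default to 0
        String[] names = emptyNames(3);   // references default to null

        System.out.println(digits[3]);    // prints "8"
        System.out.println(zeros[0]);     // prints "0"
        System.out.println(names[0]);     // prints "null"

        names[0] = "Ada";                 // element 0 now refers to a String object
        System.out.println(names[0].length());  // prints "3"
    }
}
```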

Accessing elements

We access an individual element of an array by specifying the index position in square brackets after the array name. For example, if we have an array called distances, we can refer to element 2 of that array with distances[2]. Such an element reference can be used to include the value of the referenced element in a computation (e.g. distances[2] / 1609.0); it can also be used on the left-hand side of an assignment statement, to assign a value to the referenced element (e.g. distances[2] = 800;).
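The two uses of an element reference can be sketched as follows. The text does not give a type for distances, so this example assumes a double[] of distances in meters; the class name `AccessDemo`, the helper `milesAt`, and the sample values are likewise assumptions.

```java
class AccessDemo {
    static final double METERS_PER_MILE = 1609.0;  // rounded divisor taken from the text

    // Reads one element and uses its value in a computation.
    static double milesAt(double[] distances, int index) {
        return distances[index] / METERS_PER_MILE;
    }

    public static void main(String[] args) {
        double[] distances = {100, 400, 1500, 5000};  // hypothetical values, in meters

        distances[2] = 800;                     // element on the left-hand side: assignment
        double miles = milesAt(distances, 2);   // element used in an expression: read
        System.out.println(miles);
    }
}
```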

Exceptions

If we have an array with n elements, any attempt to reference an element at position n or higher, or an element at a negative position, will result in a java.lang.ArrayIndexOutOfBoundsException being thrown.
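The following sketch probes the valid index range of a three-element array by catching the exception. The class name `BoundsDemo` and the helper `throwsOutOfBounds` are hypothetical; in real code we would normally check the index against the array length rather than rely on the exception.

```java
class BoundsDemo {
    // Returns true if reading array[index] throws ArrayIndexOutOfBoundsException.
    static boolean throwsOutOfBounds(int[] array, int index) {
        try {
            int ignored = array[index];  // the read itself triggers the bounds check
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int[] values = new int[3];  // valid indices: 0, 1, 2
        System.out.println(throwsOutOfBounds(values, 2));   // prints "false"
        System.out.println(throwsOutOfBounds(values, 3));   // prints "true": position n
        System.out.println(throwsOutOfBounds(values, -1));  // prints "true": negative
    }
}
```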

Multidimensional arrays

In Java (as well as in C), it’s more accurate to say that there are arrays of arrays, rather than multidimensional arrays. This may sound like merely a semantic difference; that’s arguably true for most purposes—but not all.

Returning to element types: the elements of an array can even be arrays themselves. This makes perfect sense when we remember that Java arrays are objects. So a multidimensional array is actually a simple, one-dimensional array, but one in which each element is a reference to an object (in this case, an array). For a two-dimensional array, we usually visualize this as an array of rows, where each row is itself an array. We can access any element in such an array by first specifying the row index, and then the column index. This is illustrated in the figure that follows, where data has been declared and allocated with 3 rows, each with 4 columns.

Two-dimensional array indexing
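In code, the 3-by-4 arrangement might look like the sketch below. The name data and the dimensions come from the text; the int element type, the class name `GridDemo`, the helper `newGrid`, and the stored value are assumptions.

```java
class GridDemo {
    // Allocates an outer array of 3 row references, each referring to a 4-element int[].
    static int[][] newGrid() {
        return new int[3][4];
    }

    public static void main(String[] args) {
        int[][] data = newGrid();

        data[1][2] = 42;       // row index first, then column index
        int[] row = data[1];   // each row is an ordinary one-dimensional array

        System.out.println(data.length);     // prints "3": number of rows
        System.out.println(data[0].length);  // prints "4": columns in row 0
        System.out.println(row[2]);          // prints "42": same element as data[1][2]
    }
}
```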

Take it a step further: A three-dimensional array is actually a one-dimensional array, where each element is a reference to another one-dimensional array, and each element of those inner arrays is itself a reference to an array! We can extend this to practically any number of dimensions.¹
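The nesting can be made visible by peeling off one index at a time, as in this sketch. The class name `CubeDemo`, the helper `innermost`, and the 2-by-3-by-4 shape are arbitrary choices for illustration.

```java
class CubeDemo {
    // Two indices applied to a three-dimensional array leave a one-dimensional array.
    static int[] innermost(int[][][] cube, int i, int j) {
        return cube[i][j];
    }

    public static void main(String[] args) {
        int[][][] cube = new int[2][3][4];

        int[][] plane = cube[0];             // one index: a reference to an int[][]
        int[] line = innermost(cube, 0, 0);  // two indices: a reference to an int[]
        int cell = cube[0][0][0];            // three indices: an int value

        System.out.println(cube.length);   // prints "2"
        System.out.println(plane.length);  // prints "3"
        System.out.println(line.length);   // prints "4"
        System.out.println(cell);          // prints "0": default int value
    }
}
```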

Jagged arrays

Since each element of the outer array in a multidimensional array is a reference to a distinct array, there’s no intrinsic reason for all of the separate inner arrays to have the same number of elements. Of course, if the Java syntax required us to allocate all of these inner arrays with the same length, it would be a different story; but as we’ll see in “Creation: Jagged arrays”, this isn’t the case. In fact, in a multidimensional array, the inner arrays may well have different lengths, and there are several applications in which this capability is not only acceptable, but appropriate.
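As a rough sketch of one such application, a triangular table needs only i + 1 entries in row i. The code below allocates the outer array first and then gives each row its own length; the class name `JaggedDemo` and the helper `triangle` are hypothetical.

```java
class JaggedDemo {
    // Builds a triangular array: row i has i + 1 elements.
    static int[][] triangle(int rows) {
        int[][] result = new int[rows][];   // outer array only; each row reference is null
        for (int i = 0; i < rows; i++) {
            result[i] = new int[i + 1];     // each inner array is allocated separately
        }
        return result;
    }

    public static void main(String[] args) {
        int[][] table = triangle(3);
        for (int[] row : table) {
            System.out.println(row.length);  // prints "1", then "2", then "3"
        }
    }
}
```

Because the rows can differ in length, loops over a jagged array should use each row's own length rather than the length of row 0.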

¹ The Java language doesn’t limit the number of dimensions, but the JVM specification allows an array type at most 255 dimensions. Arguably, if this becomes an issue for some program, then arrays may not be the best approach for that program anyway.