Arrays in Java: Creation

How are Java arrays allocated, and how are their elements given initial values?

Overview

Like any other Java object, an array is allocated on the heap, and an initialization process sets up its initial state. However, an array is not created by invoking a constructor, as other objects are, and the initialization process can be a bit confusing.

Once again, the Java array initialization will look familiar to those who have programmed in C—but there are some important differences.

Memory allocation

In understanding memory allocation for an array, it helps to remember that an array is, above all, an object. The memory for all objects is allocated on the heap, usually via the new keyword. For an array, we follow new with the element type, then square brackets; in the square brackets, we specify the length of the array (or the length of the corresponding dimension in a multidimensional array).

(We can omit the array lengths if we specify initial values, as described in “Initial values”.)

Simple arrays

For example, to allocate space for an array of 10 int elements, we would write

new int[10]

This is an expression that returns a reference to the allocated array on the heap. However, the memory allocation isn’t very helpful by itself: we need to consume the reference returned, usually by assigning it to a variable. So if we declared durations as

int[] durations; 

we could then allocate space and assign the reference to durations with

durations = new int[10];

Of course, as with other Java fields and variables, we can combine the two steps into a declaration with assignment:

int[] durations = new int[10]; 
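As a minimal sketch of the steps above, the following puts the declaration, the allocation, and a first element assignment together. The class name DurationsDemo, the method name allocate, and the stored value 45 are made up for illustration.

```java
// Hypothetical demo class; only the array syntax comes from the text above.
class DurationsDemo {
    static int[] allocate() {
        int[] durations;           // declaration only: no array exists yet
        durations = new int[10];   // allocation: space for 10 int elements
        durations[0] = 45;         // store a value in the first element
        return durations;
    }

    public static void main(String[] args) {
        int[] durations = allocate();
        System.out.println(durations.length); // prints 10
        System.out.println(durations[0]);     // prints 45
    }
}
```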

Multidimensional arrays

Because a multidimensional array is actually an array of arrays, we can allocate space for the outer array alone (e.g. allocate space for an array of row references in a two-dimensional array), or allocate space for the outer array and one or more inner array dimensions as well. For example, to allocate the space for the two-dimensional data array shown in “Introduction: Multidimensional arrays”, and assuming the elements are of the int type, we would write

int[][] data = new int[3][4];

This declares data as a two-dimensional array of int (i.e. an array of int[] arrays); allocates an array of length 3, where each element is a reference to an int[] of length 4; and assigns the reference returned by new to data.

However, if at the time of allocation of the outer array, we don’t know what the length of the inner arrays will be—or if the inner arrays will be of different lengths—we would simply allocate the outer array first, without allocating space for the inner arrays. For example, we could write

int[][] data = new int[3][];

Note that we must still match dimensions: if (for example) we’re assigning a reference to a two-dimensional array-valued field or variable, then there must be 2 sets of brackets in the allocation expression on the right-hand side. Also, the length of the first (outermost) dimension must always be specified, and any unspecified dimension lengths (empty brackets) must follow all of the specified dimension lengths.

After allocating the outer array, we can allocate the inner arrays, assigning each value returned by new to the corresponding element of the outer array, e.g.:

data[0] = new int[4];
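The bracket rules are easier to see side by side. This sketch (the class and method names are illustrative assumptions) allocates the same 3-by-4 structure in one step and in stages, and lists the forms the compiler rejects as comments:

```java
// Hypothetical demo class showing which allocation forms are legal.
class DimensionsDemo {
    static int[][] allocateInStages() {
        int[][] data = new int[3][];   // outer array only; each row reference is null
        for (int row = 0; row < data.length; row++) {
            data[row] = new int[4];    // allocate each inner array separately
        }
        return data;
    }

    public static void main(String[] args) {
        int[][] full = new int[3][4];          // outer array plus three inner arrays
        int[][] staged = allocateInStages();
        System.out.println(full[2].length);    // prints 4
        System.out.println(staged[2].length);  // prints 4

        // These forms do not compile:
        // int[][] bad1 = new int[3];     // dimension mismatch: int[] is not int[][]
        // int[][] bad2 = new int[][4];   // empty brackets before a specified length
        // int[][] bad3 = new int[][];    // no length specified at all
    }
}
```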

Jagged arrays

When allocating space for the inner arrays, we can (if appropriate to the application) allocate arrays of different lengths. For example, consider this code:

int[][] data = new int[3][];
data[0] = new int[3];
data[1] = new int[2];
data[2] = new int[4];

We now have this jagged structure:

[Figure: Two-dimensional jagged array]
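One practical consequence of a jagged structure is that loops should take each row's length from the row itself. A small sketch, with hypothetical class and method names:

```java
// Hypothetical demo class; build() repeats the jagged allocation from the text.
class JaggedDemo {
    static int[][] build() {
        int[][] data = new int[3][];
        data[0] = new int[3];
        data[1] = new int[2];
        data[2] = new int[4];
        return data;
    }

    public static void main(String[] args) {
        int[][] data = build();
        for (int row = 0; row < data.length; row++) {
            // Use each row's own length as the bound, never a fixed column count.
            System.out.println("row " + row + " has " + data[row].length + " elements");
        }
    }
}
```

Using a fixed bound of 4 for every row here would fail on the first two rows with an ArrayIndexOutOfBoundsException.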

Zero-length arrays

Java allows us to allocate an array of length zero (0), and doing so is sometimes appropriate. This is a significant difference from C, where an array is simply a contiguous block of memory, and a zero-length array isn’t of much use. [1] In Java, a zero-length array is still an object, and can be quite useful.
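One common use, sketched here under the assumption of a made-up helper named positives, is returning an empty array instead of null when there are no results, so that callers can loop over the result without a null check:

```java
// Hypothetical demo class; positives() is an illustrative helper, not a library method.
class EmptyArrayDemo {
    // Returns the positive values found in the input, in order.
    static int[] positives(int[] values) {
        int count = 0;
        for (int v : values) {
            if (v > 0) count++;
        }
        int[] result = new int[count];   // count may be 0, which is legal
        int next = 0;
        for (int v : values) {
            if (v > 0) result[next++] = v;
        }
        return result;
    }

    public static void main(String[] args) {
        int[] none = positives(new int[]{-1, -2});
        System.out.println(none.length);   // prints 0
        for (int v : none) {               // body never runs; no null check needed
            System.out.println(v);
        }
    }
}
```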

The length field

Every array has a final field called length. The value of this field is set (when the array is allocated) to the number of elements in the array. This is one of the contexts in which the fact that a multidimensional array is actually an array of arrays is relevant: the length of such an array is the number of elements in the outer array, not the total number of elements in all of the inner arrays. In the two examples above, data.length has a value of 3.
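To illustrate the difference between the outer length and the total element count, here is a short sketch. The helper totalElements is a hypothetical name, and it assumes every inner array has already been allocated.

```java
// Hypothetical demo class contrasting data.length with the total element count.
class LengthDemo {
    // Sums the lengths of all inner arrays (assumes no row is null).
    static int totalElements(int[][] data) {
        int total = 0;
        for (int[] row : data) {
            total += row.length;
        }
        return total;
    }

    public static void main(String[] args) {
        int[][] data = new int[3][4];
        System.out.println(data.length);          // prints 3 (rows in the outer array)
        System.out.println(data[0].length);       // prints 4 (elements in the first row)
        System.out.println(totalElements(data));  // prints 12
        // data.length = 5;   // does not compile: length is final
    }
}
```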

Exceptions

If we attempt to allocate an array (simple, outer, or inner) with a negative length, java.lang.NegativeArraySizeException is thrown.
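Because the length is an ordinary int expression evaluated at run time, a negative value typically shows up only when the allocation executes. The following sketch (canAllocate is a made-up helper) shows the exception being caught:

```java
// Hypothetical demo class; canAllocate() exists only to show the exception.
class NegativeLengthDemo {
    // Reports whether an int[] of the given length can be allocated.
    static boolean canAllocate(int length) {
        try {
            int[] values = new int[length];
            return values.length == length;
        } catch (NegativeArraySizeException e) {
            return false;   // thrown for any negative length
        }
    }

    public static void main(String[] args) {
        System.out.println(canAllocate(0));   // prints true
        System.out.println(canAllocate(-1));  // prints false
    }
}
```

In real code it is usually better to validate the length before allocating than to catch the exception afterwards.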

Initial values

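When space for an array is allocated as described above, all of the elements are automatically filled with the default value for the declared element type: 0 for byte, short, int, and long; 0.0 for float and double; '\u0000' (the null character) for char; false for boolean; and null for every reference type, including String and array types.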

This is true even of arrays declared and allocated as local variables in methods. (This is another important difference from the behavior of C/C++.)
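A quick way to confirm the defaults is to allocate arrays of a few element types as local variables and read an element without assigning anything first. The class and method names in this sketch are illustrative only.

```java
// Hypothetical demo class; every array below is a local variable.
class DefaultsDemo {
    // Builds a string from the untouched first element of several arrays.
    static String describe() {
        int[] counts = new int[1];
        double[] ratios = new double[1];
        boolean[] flags = new boolean[1];
        char[] letters = new char[1];
        String[] names = new String[1];
        // The char is cast to int so its numeric value is visible.
        return counts[0] + " " + ratios[0] + " " + flags[0] + " "
                + (int) letters[0] + " " + names[0];
    }

    public static void main(String[] args) {
        System.out.println(describe());   // prints 0 0.0 false 0 null
    }
}
```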

This default behavior is often precisely what is needed; however, there are also many cases where we want to assign other values to array elements immediately upon or after allocation.

Of course, we can assign values to individual elements of an array (as seen in “Introduction: Accessing elements”). However, it’s often much more useful to assign values to all the elements of an array as part of the allocation statement. This functionality is supported through array initializer expressions.

Simple arrays

An array initializer expression is simply a brace-enclosed list of array values. In an array allocation expression, it follows immediately after the square brackets which would otherwise contain the array length(s). However, when an array initializer is used, no lengths are specified in the brackets; instead, the compiler gets the array length from the array initializer. For example, assume we declare weights with this statement:

int[] weights;

The following statement will allocate space for 5 int elements, assign the values specified to those elements, and assign the resulting reference to weights:

weights = new int[]{7, 3, 2, 5, 8};

With this syntax, specifying an array length in the square brackets will cause a compilation failure.

If we use declaration-with-assignment, we can choose to omit the allocation part of the statement; if we do, the allocation will be inferred by the compiler. Thus, the following two examples are equivalent:

int[] weights = new int[]{7, 3, 2, 5, 8};
int[] weights = {7, 3, 2, 5, 8};
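The shorthand form is accepted only in a declaration. Everywhere else, such as a later assignment or a method argument, the full form with new is required. A sketch, using a hypothetical sum helper:

```java
// Hypothetical demo class showing where each initializer form is allowed.
class InitializerDemo {
    // Illustrative helper used to pass an initializer as a method argument.
    static int sum(int[] values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        int[] weights = {7, 3, 2, 5, 8};      // shorthand: allowed in a declaration
        System.out.println(weights.length);   // prints 5

        weights = new int[]{1, 2, 3};         // plain assignment needs the full form
        // weights = {1, 2, 3};               // does not compile outside a declaration
        // weights = new int[3]{1, 2, 3};     // does not compile: length plus initializer

        System.out.println(sum(new int[]{7, 3, 2, 5, 8}));   // prints 25
    }
}
```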

Multidimensional arrays

Array initializers can also be used to assign values to the elements of multidimensional arrays—including jagged arrays. Once again, it’s a good idea to remember that this is actually an assignment to an array of arrays.

Take this declaration-with-assignment statement, for example:

int[][] data = {
    {10, 3, 7},
    {12, 6},
    {2, 0, 5, 6}
};

After the above statement executes, data has the following structure and content:

[Figure: Two-dimensional array after initialization]
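The resulting structure can also be checked in code. This sketch uses the full new int[][] form so that the initializer can appear in a return statement; the class and method names are assumptions made for the example.

```java
import java.util.Arrays;

// Hypothetical demo class; build() returns the same values as the statement above.
class NestedInitializerDemo {
    static int[][] build() {
        return new int[][] {
            {10, 3, 7},
            {12, 6},
            {2, 0, 5, 6}
        };
    }

    public static void main(String[] args) {
        int[][] data = build();
        System.out.println(data.length);      // prints 3
        System.out.println(data[1].length);   // prints 2
        System.out.println(data[2][3]);       // prints 6
        System.out.println(Arrays.deepToString(data));
        // prints [[10, 3, 7], [12, 6], [2, 0, 5, 6]]
    }
}
```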

Copy creation

Assigning an existing array to another array-valued variable doesn’t create a copy of the first. Instead, since all Java objects are accessed by reference, the assignment copies only the reference; there are now two variables referring to the same array, and changes made to the elements through one variable are visible through the other.
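The aliasing effect can be seen in a few lines. The names in this sketch are illustrative only.

```java
// Hypothetical demo class showing that assignment shares one array.
class AliasDemo {
    static int aliasEffect() {
        int[] first = {1, 2, 3};
        int[] second = first;   // copies the reference, not the array
        second[0] = 99;         // writes through the shared reference
        return first[0];        // the change is visible through first
    }

    public static void main(String[] args) {
        System.out.println(aliasEffect());   // prints 99
    }
}
```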

Fortunately, there are other ways to create one array as a true copy of another:

Note: Both of the above approaches come with a significant caveat: they perform shallow copies. In other words, a new array is created in both cases, but if the element type of the original is an object type (including an array), then the element values copied are the references to the objects, rather than the objects themselves.

The implication of this shallow copying is that for arrays of mutable objects, as well as for multidimensional arrays (even if the element type of the innermost arrays is primitive), additional code is needed to copy the objects referenced by the elements as well.
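The following sketch shows the difference for a two-dimensional int array. It uses clone() and java.util.Arrays.copyOf as two standard ways to make a shallow copy; these are assumed here for illustration and may not be the exact approaches the text has in mind. The deepCopy helper is hypothetical and assumes that no row is null.

```java
import java.util.Arrays;

// Hypothetical demo class comparing shallow and deep copies of an int[][].
class CopyDemo {
    // Copies the outer array and every inner array (assumes no row is null).
    static int[][] deepCopy(int[][] source) {
        int[][] copy = new int[source.length][];
        for (int row = 0; row < source.length; row++) {
            copy[row] = source[row].clone();   // rows hold primitives, so clone suffices
        }
        return copy;
    }

    public static void main(String[] args) {
        int[][] data = {{10, 3, 7}, {12, 6}};

        int[][] shallow = data.clone();                           // new outer array, shared rows
        int[][] alsoShallow = Arrays.copyOf(data, data.length);   // same effect
        int[][] deep = deepCopy(data);

        data[0][0] = -1;
        System.out.println(shallow[0][0]);      // prints -1 (row is shared)
        System.out.println(alsoShallow[0][0]);  // prints -1 (row is shared)
        System.out.println(deep[0][0]);         // prints 10 (row was copied)
    }
}
```

For arrays of mutable objects, a deep copy would also need some way to copy each element, which depends on the element class.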

  1. A zero-length array is sometimes declared as the final member of a C struct; at runtime, the struct will usually be allocated with enough memory for the array to hold as many elements as needed, and this length will be stored in another member of the struct. In many ways, this is a low-level analogue of a Java array, where every array has a length field. However, in this case, it is the struct that is being allocated, not the array, and with a non-zero size.