Java Building Blocks: Literal Values

Boolean, numeric, character, and String literal values

Overview

The Java compiler recognizes literal values for all primitive types and the String object type. In this page, we’ll explore how these literal values are written in code and interpreted by the compiler.

Notes on examples

  1. When typing (or copying and pasting) any of the examples on this page into JShell, the comment portion—starting with the two slashes (//) and continuing to the end of the line—may be omitted.

  2. Some of the literal values in the examples are rendered in different colors; these simply reflect the code syntax highlighting library’s understanding (or, in some cases, misunderstanding) of proper Java syntax. When these examples are typed or pasted into JShell, the colors seen on this page will not be displayed by JShell.

Boolean

The only literal values of the boolean type are true and false. These are case-sensitive; for example, none of True, TRUE, False, and FALSE is recognized as a boolean literal.

Integer

Formats

An integer literal value is expressed using the following elements (some are optional), which must be in the order shown:

  1. An optional sign:

    • + (default) denotes a positive value.
    • - denotes a negative value.
  2. Zero or more whitespace characters.

  3. An optional case-insensitive radix (number base) prefix:

    • No prefix (default) denotes a base-10 (decimal) value.
    • 0b denotes base 2 (binary).
    • 0x denotes base 16 (hexadecimal).
    • 0, not followed by b or x, denotes base 8 (octal).
  4. 1 or more digit characters, with underscores (_) allowed as digit-group separators.

    • In a decimal value, only the digit characters 0-9 are allowed, along with the underscore (_) digit-group separator.
    • In a binary value, only the digit characters 0 and 1 are allowed, along with the digit-group separator.
    • In a hexadecimal value, only the digit characters 0-9 and a-f (case-insensitive) are allowed, along with the digit-group separator.
    • In an octal value, only the digit characters 0-7 are allowed, along with the digit-group separator.
    • An underscore is not allowed to appear before the first digit character, or after the last digit character. In an octal representation, the 0 prefix is counted as the first digit character; this is not true of the 0b and 0x prefixes for binary and hexadecimal representations.
  5. An optional type suffix:

    • No suffix (default) for an int value.
    • L for a long value. (This is case-insensitive; however, the use of a lowercase l is strongly discouraged, to reduce the potential for confusing it with the digit 1.)

    As the above implies, there is no such thing as a byte or short literal value in Java. However, an int literal value may be assigned to a byte, short, or char variable or field; if the value assigned is outside the range of the variable type, compilation will fail.

    If the value specified is too large for the int type, and the L suffix is not used, compilation will fail.
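The suffix and range rules above can be sketched in a few declarations. This is a minimal illustration, not part of the original examples; the class name and constant names are made up for the purpose.

```java
public class IntegerSuffixDemo {
    // An int literal needs no suffix; a value beyond the int range needs L.
    static final int MAX_INT = 2_147_483_647;       // largest int value
    static final long BEYOND_INT = 2_147_483_648L;  // one more than the largest int

    // An int literal may initialize a byte, short, or char when the value fits.
    static final byte SMALL = 127;       // byte range is -128..127
    static final short MEDIUM = 32_767;  // short range is -32768..32767
    static final char LETTER = 97;       // same value as the character literal 'a'

    // byte tooBig = 128;             // would not compile: outside the range of byte
    // int tooLarge = 2_147_483_648;  // would not compile: too large for int without L

    public static void main(String[] args) {
        System.out.println(MAX_INT == Integer.MAX_VALUE);  // true
        System.out.println(BEYOND_INT - 1 == MAX_INT);     // true
        System.out.println(LETTER);                        // a
    }
}
```

The two commented-out declarations are left in to show the failure cases; uncommenting either one should stop compilation.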

Examples

Base 10 (decimal)

262143 // Valid: Recognized as base-10 representation of int 262143.
262_143 // Valid: Underscores allowed as digit-group separators.
262,143 // Invalid: Comma (,) not allowed as a digit-group separator.
2621_43 // Valid: Digit groups may be of any positive length.
_262_143 // Invalid: Underscore not allowed before first digit.
262_143_ // Invalid: Underscore not allowed after last digit.
26Z_143 // Invalid: Only 0-9 recognized as base-10 digit characters.
4_311_810_305 // Invalid: Outside the range of int.
4_311_810_305L // Valid: Recognized as long 4311810305.
4_311_810_305_L // Invalid: Underscore not allowed after last digit.

Base 2 (binary)

0b111111111111111111 // Valid: Recognized as base-2 form of int 262143 (base-10).
0B111111111111111111 // Valid: 0b prefix in base-2 representation is case-insensitive.
0b11_1111_1111_1111_1111 // Valid: Underscores allowed as digit-group separators.
0b11,1111,1111,1111,1111 // Invalid: Comma (,) not allowed as a digit-group separator.
0b111111_11111_111_1111 // Valid: Digit groups may be of any positive length.
0b_11_1111_1111_1111_1111 // Invalid: Underscore not allowed before first digit.
0b11_1111_1111_1111_1111_ // Invalid: Underscore not allowed after last digit.
0b11_1111_1111_1111_2111 // Invalid: Only 0 and 1 recognized as base-2 digits.
0b1_00000001_00000001_00000001_00000001 // Invalid: Outside the range of int.
0b1_00000001_00000001_00000001_00000001L // Valid: Recognized as long 4311810305 (base-10).
0b1_00000001_00000001_00000001_00000001_L // Invalid: Underscore not allowed after last digit.

Base 8 (octal)

0777777 // Valid: Recognized as base-8 representation of int 262143 (base-10).
0777_777 // Valid: Underscores allowed as digit-group separators.
0777,777 // Invalid: Comma (,) not allowed as a digit-group separator.
077_77_77 // Valid: Digit groups may be of any positive length.
0_777_777 // Valid: 0 prefix also treated as digit in base-8 representation.
0777_777_ // Invalid: Underscore not allowed after last digit.
0778_777 // Invalid: Only 0-7 recognized as base-8 digit characters.
040_100_200_401 // Invalid: Outside the range of int.
040_100_200_401L // Valid: Recognized as long 4311810305 (base-10).
040_100_200_401_L // Invalid: Underscore not allowed after last digit.

Base 16 (hexadecimal)

0x03FFFF // Valid: Recognized as base-16 representation of int 262143 (base-10).
0X03ffff // Valid: 0x prefix in base-16 is case-insensitive, as are digits a-f.
0x03_FF_FF // Valid: Underscores allowed as digit-group separators.
0x03,FF,FF // Invalid: Comma (,) not allowed as a digit-group separator.
0x03F_FFF // Valid: Digit groups may be of any positive length.
0x_03_FF_FF // Invalid: Underscore not allowed before first digit.
0x03_FF_FF_ // Invalid: Underscore not allowed after last digit.
0x03_FF_FG // Invalid: Only 0-9, a-f, and A-F recognized as base-16 digits.
0x0001_0101_0101 // Invalid: Outside the range of int.
0x0001_0101_0101L // Valid: Recognized as long 4311810305 (base-10).
0x0001_0101_0101_L // Invalid: Underscore not allowed after last digit.
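To confirm that the valid examples in the four sections above denote the same two values, the literals can be compared directly. This is a small sketch; the class, constant, and method names are illustrative only.

```java
public class RadixDemo {
    // The int value 262143 written in each supported base.
    static final int DECIMAL = 262_143;
    static final int BINARY = 0b11_1111_1111_1111_1111;
    static final int OCTAL = 0777_777;
    static final int HEX = 0x03_FF_FF;

    // The long value 4311810305 written in each supported base.
    static final long BIG_DECIMAL = 4_311_810_305L;
    static final long BIG_BINARY = 0b1_00000001_00000001_00000001_00000001L;
    static final long BIG_OCTAL = 040_100_200_401L;
    static final long BIG_HEX = 0x0001_0101_0101L;

    // True when every int form equals the decimal int form.
    static boolean intsAgree() {
        return BINARY == DECIMAL && OCTAL == DECIMAL && HEX == DECIMAL;
    }

    // True when every long form equals the decimal long form.
    static boolean longsAgree() {
        return BIG_BINARY == BIG_DECIMAL && BIG_OCTAL == BIG_DECIMAL && BIG_HEX == BIG_DECIMAL;
    }

    public static void main(String[] args) {
        System.out.println(intsAgree());   // true
        System.out.println(longsAgree());  // true
    }
}
```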

Character

Formats

A character literal actually specifies an integer value, but in a different format from that used for integer literals. In most contexts, this value is interpreted as a Unicode code point (the position of the character in the table of all Unicode characters) in the range $0 \ldotp \ldotp 65,535$, which is also known as the Unicode Basic Multilingual Plane.

A character literal comprises the following parts, in order:

  1. An opening single quote (').

  2. A single character, written in one of the following forms:

    • A literal Unicode character, e.g. a, 6, @, ¿. If a Java source code file contains any characters with a code point greater than 127 (such as ¿), UTF-8 is highly recommended for the file encoding.

    • A backslash followed by a Unicode code point specified in base 8 or base 16:

      • An octal number in the range 0-377. (The 0 prefix required for octal integer literals is not used here.)

        The characters that can be represented this way are limited to those with a code point in the range $0 \ldotp \ldotp 255$; this is the range occupied by the Basic Latin and Latin-1 Supplement blocks.

      • Lowercase u followed by 4 hexadecimal digits. (The 0x prefix required for hexadecimal integer literals is not used here.)

      Underscores are not permitted in the code point expression.

    • One of these escape sequences:

      • \t: tab
      • \b: backspace
      • \n: newline
      • \r: carriage return
      • \f: form feed
      • \': single quote
      • \": double quote
      • \\: backslash

    As described below, every character in a String literal is also written in one of the three forms described above.

  3. A closing single quote.
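Because a character literal is an integer value underneath, each escape sequence listed above can be checked against its code point. This is a minimal sketch; the class name and the codeOf helper are hypothetical names chosen for illustration.

```java
public class CharEscapeDemo {
    // Returns the integer value behind a char (implicit widening from char to int).
    static int codeOf(char c) {
        return c;
    }

    public static void main(String[] args) {
        System.out.println(codeOf('\t'));  // 9  (tab)
        System.out.println(codeOf('\b'));  // 8  (backspace)
        System.out.println(codeOf('\n'));  // 10 (newline)
        System.out.println(codeOf('\r'));  // 13 (carriage return)
        System.out.println(codeOf('\''));  // 39 (single quote)
        System.out.println(codeOf('\"'));  // 34 (double quote)
        System.out.println(codeOf('\\'));  // 92 (backslash)
    }
}
```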

Examples

'a' // Valid: Recognized as the value 97 (the code point of 'a').
'aa' // Invalid: character literal must contain exactly 1 character.
"a" // Beware: Not a character literal, but a String literal.
'¿' // Valid: Recognized as the value 191 (the code point of '¿').
'' // Invalid: A character literal must consist of exactly 1 character.
'\u00BF' // Valid: Recognized as the decimal value 191, which is the code point of '¿'.
'\u00bf' // Valid: Hexadecimal digits are case-insensitive.
'\U00BF' // Invalid: Leading 'u' must be lowercase.
'\u00_BF' // Invalid: Underscore not allowed as digit group separator in character literals.
"\u00bf" // Beware: Not a character literal, but a String literal.
'\277' // Valid: Recognized as the decimal value 191, which is the code point of '¿'.
'\287' // Invalid: Only 0-7 recognized as base-8 digit characters.
'\400' // Invalid: Maximum value permitted in octal character is \377 (decimal 255).
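The valid examples above show three spellings of the same character. The following sketch compares them; the class and constant names are assumptions made for illustration, and the escaped forms are used so the result does not depend on the source file encoding.

```java
public class CharLiteralDemo {
    static final char PLAIN = 'a';                // literal character
    static final char UNICODE_ESCAPE = '\u00BF';  // hexadecimal code point 00BF
    static final char OCTAL_ESCAPE = '\277';      // octal code point 277

    public static void main(String[] args) {
        // Casting to int exposes the code point stored in the char.
        System.out.println((int) PLAIN);                     // 97
        System.out.println((int) UNICODE_ESCAPE);            // 191
        System.out.println(UNICODE_ESCAPE == OCTAL_ESCAPE);  // true

        // Arithmetic on a char works on its integer value.
        System.out.println((char) (PLAIN + 1));              // b
    }
}
```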

Floating-point

An integral literal, expressed in base-10 form, may be followed by one of these case-insensitive type suffixes, to force recognition of the value as a floating-point literal:

  • d denotes a double value.
  • f denotes a float value.

Beyond that, literal values expressed in the following formats will automatically be treated as floating-point literals.

Formats

Typically, a floating-point literal value is stated with the following parts (some are optional). These must be in the order shown.

  1. An optional sign:

    • + (default) denotes a positive value.
    • - denotes a negative value.
  2. Zero or more whitespace characters.

  3. Zero or more base-10 digit characters, giving the integral part of the value. If zero digits are provided, the integral part is interpreted as 0. Underscores (_) may be included as digit-group separators, but no underscore may appear before the first digit character of this part, or after the last digit character.

  4. A decimal point.[1]

  5. Zero or more base-10 digit characters, giving the fractional part of the value. If zero digits are provided, the fractional part is interpreted as 0. Underscores (_) may be included as digit-group separators, but no underscore may appear before the first digit character of this part, or after the last digit character. At least one digit must appear in either the integral part or the fractional part; a decimal point with no digits on either side is not a literal.

  6. An optional exponent part, indicating that the floating-point value before the exponent should be multiplied by a specified power of 10. (This is part of the convention followed for scientific notation; however, while that convention has one non-zero digit for the integer part, the format recognized by the compiler does not enforce that.) This consists of 2 subparts, in the order shown:

    1. The letter e (case-insensitive).
    2. An optional sign:

      • + (default) denotes a positive exponent.
      • - denotes a negative exponent.
    3. One or more base-10 digit characters, with leading zeros allowed.
  7. An optional case-insensitive type suffix:

    • d denotes a double value.
    • f denotes a float value.

    If no type suffix is used, double is assumed.
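One way to see which type the compiler assigns to a literal is to let it be boxed and then inspect the wrapper class. This is a sketch under that assumption; the class name and the typeOf helper are hypothetical.

```java
public class FloatingTypeDemo {
    // The argument is autoboxed, so the wrapper class reveals the literal's type.
    static String typeOf(Object boxed) {
        return boxed.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        System.out.println(typeOf(1));     // Integer: no decimal point, exponent, or suffix
        System.out.println(typeOf(1d));    // Double: suffix forces a floating-point literal
        System.out.println(typeOf(1f));    // Float
        System.out.println(typeOf(1.5));   // Double: default when no suffix is given
        System.out.println(typeOf(1.5F));  // Float: suffix is case-insensitive
        System.out.println(typeOf(1e3));   // Double: exponent part present
        System.out.println(.5 == 0.5);     // true: integral part omitted
        System.out.println(5. == 5.0);     // true: fractional part omitted
    }
}
```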

There is also a (rarely used) hybrid hexadecimal/decimal floating-point format. In this format, the integral part (part 3 above) begins with the 0x prefix, and parts 3 and 5 (but not the exponent in part 6) are written using hexadecimal digit characters. This format requires the exponent part; however, the letter p (case-insensitive) is used instead of e, and the exponent, which is still written in decimal digits, denotes a power of 2 rather than a power of 10.
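A few hexadecimal floating-point literals, chosen here purely for illustration, show how the hexadecimal digits and the p exponent combine. In this notation the exponent scales the value by a power of 2. The class and constant names are not from the original text.

```java
public class HexFloatDemo {
    // Hex significand 1.8 is 1 + 8/16 = 1.5; p1 multiplies by 2 to the power 1.
    static final double THREE = 0x1.8p1;

    // Significand 1, multiplied by 2 to the power -2.
    static final double QUARTER = 0x1p-2;

    // Hex A.8 is 10 + 8/16 = 10.5; p0 leaves it unscaled.
    static final double TEN_AND_A_HALF = 0xA.8p0;

    // The f suffix still selects float; the prefix and p are case-insensitive.
    static final float HALF = 0X1P-1f;

    public static void main(String[] args) {
        System.out.println(THREE);           // 3.0
        System.out.println(QUARTER);         // 0.25
        System.out.println(TEN_AND_A_HALF);  // 10.5
        System.out.println(HALF);            // 0.5
    }
}
```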

Examples

0.15

Valid.

-0.15

Valid.

.15

Valid.

-.15

Valid.

- 0.15

Valid: whitespace allowed after sign, before integer part.

0 .15

Invalid: whitespace is allowed only after the sign character.

0.-15

Danger: A sign after the first digit character is interpreted by the compiler as an arithmetic operator. Thus, this will be interpreted as $(0.0 - 15)$, or -15.0.

0.15e3

Valid: interpreted as $(0.15 \cdot 10^3)$, or 150.0.

0.15E3

Valid: e token is not case-sensitive; recognized as $(0.15 \cdot 10^3)$, or 150.0.

0.15e-3

Valid: interpreted as $(0.15 \cdot 10^{-3})$, or 0.00015.

0.1_5e3

Valid: digit-group separators allowed between digits.
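The valid examples above, along with the "Danger" case, can be checked by comparing each expression with the value it is described as producing. This is a minimal sketch; the class and constant names are illustrative.

```java
public class FloatingExamplesDemo {
    static final double SPACED_SIGN = - 0.15;     // whitespace after the sign is accepted
    static final double NO_INTEGER_PART = .15;    // integral part defaults to 0
    static final double SUBTRACTION = 0.-15;      // compiled as (0. - 15), not as a fraction
    static final double EXPONENT = 0.15e3;        // 0.15 times 10 to the power 3
    static final double UPPER_EXPONENT = 0.15E3;  // E is accepted in either case
    static final double NEG_EXPONENT = 0.15e-3;   // 0.15 times 10 to the power -3
    static final double GROUPED = 0.1_5e3;        // underscore between digits

    public static void main(String[] args) {
        System.out.println(SPACED_SIGN == -0.15);       // true
        System.out.println(NO_INTEGER_PART == 0.15);    // true
        System.out.println(SUBTRACTION);                // -15.0
        System.out.println(EXPONENT);                   // 150.0
        System.out.println(NEG_EXPONENT == 0.00015);    // true
        System.out.println(GROUPED == UPPER_EXPONENT);  // true
    }
}
```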

String

Format

A Java String literal is expressed as a sequence of zero or more character values, enclosed in double quotes. Each of the individual characters follows the format described above for the single character of a character literal, but without single quotes enclosing each character.

Java 15 formalizes a multiline text block format for String literals (introduced in experimental form in Java 13 & 14); prior to that version, the only way to express a String literal consisting of multiple lines of text was with embedded \n (newline) characters. (Multiline text blocks are out of scope for this introduction.)

Examples

"Hello, World!"

Valid.

"Hello,\nWorld!"

Valid: string contains an embedded newline.

"\u00A1Hello, World!"

Valid: first character of the String literal is the ¡ character, for a resulting value of "¡Hello, World!"

""

Valid: a String may be empty—that is, with a length of zero.

'@'

Danger: While the sequence of characters in a String literal may consist of a single character, this example is a character literal (since it is enclosed in single quotes), which is not assignment-compatible with a String.
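The String examples above can be examined with a few standard String methods. The sketch below uses illustrative names, and shows String.valueOf as one way to turn a character literal into a String when that is what is actually needed.

```java
public class StringLiteralDemo {
    static final String GREETING = "Hello,\nWorld!";        // embedded newline escape
    static final String INVERTED = "\u00A1Hello, World!";   // starts with an inverted exclamation mark
    static final String EMPTY = "";

    static final char AT_CHAR = '@';      // character literal
    static final String AT_STRING = "@";  // one-character String literal

    // String wrong = '@';  // would not compile: char is not assignment-compatible with String

    // Explicit conversion from char to String.
    static final String CONVERTED = String.valueOf(AT_CHAR);

    public static void main(String[] args) {
        System.out.println(GREETING.length());            // 13: the escape counts as one character
        System.out.println((int) INVERTED.charAt(0));     // 161, which is hexadecimal A1
        System.out.println(EMPTY.isEmpty());              // true
        System.out.println(CONVERTED.equals(AT_STRING));  // true
    }
}
```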

Array

Formally, there is no such thing as an array literal in Java; however, there is an array initializer, which offers a similar compactness of expression and more flexibility. For more information, see “Arrays in Java”.
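For orientation only, here is a minimal sketch of the array initializer syntax mentioned above. The class, field, and method names are hypothetical.

```java
public class ArrayInitializerDemo {
    // In a declaration, the initializer in braces can stand on its own.
    static final int[] PRIMES = {2, 3, 5, 7};

    // Adds up the elements of an int array.
    static int sum(int[] values) {
        int total = 0;
        for (int value : values) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(PRIMES));  // 17

        // Outside a declaration, the initializer must follow "new int[]".
        System.out.println(sum(new int[] {1, 2, 3}));  // 6
    }
}
```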

  1. More generally, we call this a fraction point, since its usage does not imply using base 10 (decimal), and it is not always expressed via the . character.