Friday, 27 May 2016

Java Integer Cache

This article introduces and discusses the Integer cache, a feature introduced in Java 5 to save memory and improve performance. Let us first look at a sample program that uses Integer objects and showcases the caching behavior, and then study how and why it is implemented. Can you guess the output of the following Java program? Obviously there is a twist in it, and that is why we have this article.
package com.javapapers.java;

public class JavaIntegerCache {
 public static void main(String... strings) {

  Integer integer1 = 3;
  Integer integer2 = 3;

  if (integer1 == integer2)
   System.out.println("integer1 == integer2");
  else
   System.out.println("integer1 != integer2");

  Integer integer3 = 300;
  Integer integer4 = 300;
  
  if (integer3 == integer4)
   System.out.println("integer3 == integer4");
  else
   System.out.println("integer3 != integer4");
    
 }
}
We would generally expect both comparisons to be false. Although the values are the same, the compared objects should be different because they have different references. If you are a beginner: in Java, == on objects compares references, while equals() compares values. So different objects should have different references, and comparing them with == should return false. What is strange here is that the behavior is not consistent: two similar if-conditions return different boolean values.
Now let's look at the output of the above Java program:
integer1 == integer2
integer3 != integer4
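Since == on wrapper objects compares references, the usual way to get a consistent answer is to compare values instead. The following is a minimal sketch of that idea; the class name IntegerEqualityDemo and the helper sameValue are made up for illustration and are not part of the original program.

```java
// Hypothetical demo class: compares Integer objects by value rather than by reference.
class IntegerEqualityDemo {

    // Returns true when both references are null or both hold the same int value.
    static boolean sameValue(Integer a, Integer b) {
        if (a == null || b == null) {
            return a == b;
        }
        return a.equals(b);
    }

    public static void main(String[] args) {
        Integer integer3 = 300;
        Integer integer4 = 300;

        // equals() compares the wrapped values, so caching does not matter here.
        System.out.println(integer3.equals(integer4));                   // true
        // Unboxing first turns this into a plain int comparison.
        System.out.println(integer3.intValue() == integer4.intValue()); // true
        System.out.println(sameValue(integer3, integer4));               // true
    }
}
```

With either approach the result no longer depends on whether the two values happen to come from the cache.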

Java Integer Cache Implementation

In Java 5, a new feature was introduced to save memory and improve performance when handling Integer objects. Integer objects are cached internally and reused through the same object references.
  • By default, this applies to Integer values in the range -128 to +127 (inclusive).
  • This caching takes effect on autoboxing and on explicit calls to Integer.valueOf. Integer objects are not cached when they are built using the constructor.
The automatic conversion performed by the Java compiler from a primitive to its corresponding wrapper class type is called autoboxing. It is equivalent to calling valueOf, as follows:
Integer a = 10; //this is autoboxing
Integer b = Integer.valueOf(10); //under the hood
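To make the difference between autoboxing, valueOf and the constructor concrete, here is a small sketch. The class BoxingIdentityDemo and its helper viaConstructor are hypothetical names used only for this illustration. Note that the Integer(int) constructor is deprecated in newer Java versions, so the warning is suppressed here purely to demonstrate that it bypasses the cache.

```java
// Hypothetical demo class: shows which ways of obtaining an Integer share the cached instance.
class BoxingIdentityDemo {

    // The constructor always allocates a new object; it never consults the cache.
    // Deprecated since Java 9, used here only for demonstration.
    @SuppressWarnings({"deprecation", "removal"})
    static Integer viaConstructor(int i) {
        return new Integer(i);
    }

    public static void main(String[] args) {
        Integer boxed = 10;                        // autoboxing
        Integer viaValueOf = Integer.valueOf(10);  // explicit factory call
        Integer constructed = viaConstructor(10);  // explicit constructor call

        System.out.println(boxed == viaValueOf);       // true: both come from the cache
        System.out.println(boxed == constructed);      // false: the constructor made a new object
        System.out.println(boxed.equals(constructed)); // true: the values are equal
    }
}
```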
So now we know where this caching must be implemented in the JDK source. Let's look at the source of the valueOf method. The following is from JDK 1.8.0 build 25.
   /**
     * Returns an {@code Integer} instance representing the specified
     * {@code int} value.  If a new {@code Integer} instance is not
     * required, this method should generally be used in preference to
     * the constructor {@link #Integer(int)}, as this method is likely
     * to yield significantly better space and time performance by
     * caching frequently requested values.
     *
     * This method will always cache values in the range -128 to 127,
     * inclusive, and may cache other values outside of this range.
     *
     * @param  i an {@code int} value.
     * @return an {@code Integer} instance representing {@code i}.
     * @since  1.5
     */
    public static Integer valueOf(int i) {
        if (i >= IntegerCache.low && i <= IntegerCache.high)
            return IntegerCache.cache[i + (-IntegerCache.low)];
        return new Integer(i);
    }
valueOf looks up IntegerCache.cache before constructing a new Integer instance. The cache itself is maintained by a separate class, IntegerCache.
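The lookup-before-construct pattern used by valueOf can be sketched in isolation. The class below, SmallIntBox, is a made-up stand-in and not JDK code; it assumes a fixed range of -128 to 127 and exists only to show how a value is mapped to an array index by subtracting the low bound.

```java
// Hypothetical class illustrating the valueOf caching pattern; not part of the JDK.
final class SmallIntBox {
    private static final int LOW = -128;
    private static final int HIGH = 127;

    // One pre-built instance per value in [LOW, HIGH], i.e. 256 slots.
    private static final SmallIntBox[] CACHE = new SmallIntBox[HIGH - LOW + 1];

    static {
        for (int k = 0; k < CACHE.length; k++) {
            CACHE[k] = new SmallIntBox(LOW + k);
        }
    }

    private final int value;

    private SmallIntBox(int value) {
        this.value = value;
    }

    // Maps a value to its slot: LOW lands on index 0, HIGH on the last index.
    static int indexFor(int i) {
        return i - LOW;
    }

    // Returns the shared instance for in-range values, a fresh instance otherwise.
    static SmallIntBox valueOf(int i) {
        if (i >= LOW && i <= HIGH) {
            return CACHE[indexFor(i)];
        }
        return new SmallIntBox(i);
    }

    int intValue() {
        return value;
    }
}
```

Here indexFor(-128) is 0 and indexFor(127) is 255, which mirrors the i + (-IntegerCache.low) expression in the JDK source above.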

IntegerCache Class

IntegerCache is a private static nested class of the Integer class. Let's have a look at it. It is nicely documented in the JDK, and the comment gives most of the information.
 
  /**
     * Cache to support the object identity semantics of autoboxing for values between
     * -128 and 127 (inclusive) as required by JLS.
     *
     * The cache is initialized on first usage.  The size of the cache
     * may be controlled by the {@code -XX:AutoBoxCacheMax=} option.
     * During VM initialization, java.lang.Integer.IntegerCache.high property
     * may be set and saved in the private system properties in the
     * sun.misc.VM class.
     */

    private static class IntegerCache {
        static final int low = -128;
        static final int high;
        static final Integer cache[];

        static {
            // high value may be configured by property
            int h = 127;
            String integerCacheHighPropValue =
                sun.misc.VM.getSavedProperty("java.lang.Integer.IntegerCache.high");
            if (integerCacheHighPropValue != null) {
                try {
                    int i = parseInt(integerCacheHighPropValue);
                    i = Math.max(i, 127);
                    // Maximum array size is Integer.MAX_VALUE
                    h = Math.min(i, Integer.MAX_VALUE - (-low) -1);
                } catch( NumberFormatException nfe) {
                    // If the property cannot be parsed into an int, ignore it.
                }
            }
            high = h;

            cache = new Integer[(high - low) + 1];
            int j = low;
            for(int k = 0; k < cache.length; k++)
                cache[k] = new Integer(j++);

            // range [-128, 127] must be interned (JLS7 5.1.7)
            assert IntegerCache.high >= 127;
        }

        private IntegerCache() {}
    }
The Javadoc comment clearly states that this class exists to cache instances and to support autoboxing of values between -128 and 127. The high value of 127 can be raised by using the VM argument -XX:AutoBoxCacheMax=size; a value lower than 127 has no effect, because the code takes Math.max(i, 127). The caching itself happens in the for-loop. It runs from low to high, creates an Integer instance for each value, and stores the instances in an Integer array named cache. As simple as that. The cache is filled on first use, when the IntegerCache class is initialized. From then on, these cached instances are used instead of creating new instances (during autoboxing).
Actually, when this feature was first introduced in Java 5, the range was fixed to -128 to +127. Later, in Java 6, the high end of the range was mapped to java.lang.Integer.IntegerCache.high, and a VM argument allowed us to set the high number. This gives us the flexibility to tune performance according to our application's use case. Why was the range -128 to 127 chosen? It is considered to be the most widely used range of integer values. The trade-off is that the first use of the cache in a program takes some extra time to create the instances.
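One way to observe the effect of -XX:AutoBoxCacheMax is to probe the cache from the outside. The sketch below is only an illustration: the class name CacheBoundProbe and the SEARCH_LIMIT constant are arbitrary choices, and the probe assumes the behavior shown in the JDK source above, namely that valueOf returns a new object for every value beyond the cache.

```java
// Hypothetical probe: finds the highest value for which Integer.valueOf returns a shared instance.
class CacheBoundProbe {

    // Arbitrary upper bound so the loop always terminates, even with a very large cache.
    private static final int SEARCH_LIMIT = 1_000_000;

    static int highestCachedValue() {
        int v = 127; // values up to 127 are always cached
        // Two calls return the same reference only if the value is served from the cache.
        while (v < SEARCH_LIMIT && Integer.valueOf(v + 1) == Integer.valueOf(v + 1)) {
            v++;
        }
        return v;
    }

    public static void main(String[] args) {
        System.out.println("Highest cached Integer value: " + highestCachedValue());
    }
}
```

Running it with the default settings and then again with, for example, java -XX:AutoBoxCacheMax=1000 CacheBoundProbe on a HotSpot JVM should show the upper bound moving accordingly.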

Cache Enforcement in Java Language Specification

In the Boxing Conversion section of the Java Language Specification (JLS), it is stated as follows:
If the value p being boxed is an integer literal of type int between -128 and 127 inclusive (§3.10.1), or the boolean literal true or false (§3.10.3), or a character literal between ‘\u0000’ and ‘\u007f’ inclusive (§3.10.4), then let a and b be the results of any two boxing conversions of p. It is always the case that a == b.
The above statement ensures that boxed objects with values between -128 and 127 always share the same reference. The reasoning for this decision is also provided:
Ideally, boxing a primitive value would always yield an identical reference. In practice, this may not be feasible using existing implementation techniques. The rule above is a pragmatic compromise, requiring that certain common values always be boxed into indistinguishable objects. The implementation may cache these, lazily or eagerly. For other values, the rule disallows any assumptions about the identity of the boxed values on the programmer’s part. This allows (but does not require) sharing of some or all of these references. Notice that integer literals of type long are allowed, but not required, to be shared.
This ensures that in most common cases, the behavior will be the desired one, without imposing an undue performance penalty, especially on small devices. Less memory-limited implementations might, for example, cache all char and short values, as well as int and long values in the range of -32K to +32K.

Other Cached Objects

This caching behavior does not apply only to Integer objects. There are similar caching implementations for all the integer-type wrapper classes.
  • We have ByteCache doing the caching for Byte objects.
  • We have ShortCache doing the caching for Short objects.
  • We have LongCache doing the caching for Long objects.
  • We have CharacterCache doing the caching for Character objects.
Byte, Short, and Long have a fixed caching range of -128 to 127 (inclusive). For Character, the range is 0 to 127 (inclusive). These ranges cannot be modified via a VM argument; only the Integer range can.
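The same identity behavior can be checked for the other wrapper types. The following sketch uses a hypothetical class WrapperCacheDemo with small helper methods that box a value twice and compare the references. The results for in-range values are guaranteed; the results for out-of-range values are what OpenJDK-based JVMs typically produce, but the JLS allows an implementation to cache more, so treat those as an assumption.

```java
// Hypothetical demo class: boxes the same value twice and checks whether the references match.
class WrapperCacheDemo {

    static boolean longShared(long v) {
        Long a = v;
        Long b = v;
        return a == b;
    }

    static boolean shortShared(short v) {
        Short a = v;
        Short b = v;
        return a == b;
    }

    static boolean byteShared(byte v) {
        Byte a = v;
        Byte b = v;
        return a == b;
    }

    static boolean charShared(char v) {
        Character a = v;
        Character b = v;
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println("Long 127:       " + longShared(127L));          // true
        System.out.println("Long 128:       " + longShared(128L));          // typically false
        System.out.println("Short 127:      " + shortShared((short) 127));  // true
        System.out.println("Short 128:      " + shortShared((short) 128));  // typically false
        System.out.println("Byte 127:       " + byteShared((byte) 127));    // true: every byte value is in range
        System.out.println("Character 127:  " + charShared((char) 127));    // true
        System.out.println("Character 128:  " + charShared((char) 128));    // typically false
    }
}
```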
