Java gotcha: anArray.hashCode() isn’t deep, but aList.hashCode() is

By Adrian Smith, 14 Feb 2008

Every object has a hashCode and an equals method. A hash-based collection uses hashCode to decide which bucket an object is placed in, and equals to decide whether two objects that land in the same bucket are actually the same. If you add objects to a Set, which stores only unique objects, it uses these methods to determine whether two objects are the same and thus shouldn't both be stored.

If you have code like:

Set<byte[]> uniqueArrays = new HashSet<byte[]>();
uniqueArrays.add(new byte[] { 1,2,3 });
uniqueArrays.add(new byte[] { 1,2,3 });
uniqueArrays.add(new byte[] { 1,2 });
System.out.println(uniqueArrays.size() + " unique byte arrays");

This code prints "3 unique byte arrays". You might expect it to print 2, as the Set holds only two distinct array contents. But arrays inherit hashCode and equals from Object, and both are identity-based: two different array objects with the same contents are not equal, and their hash codes will generally differ. This is in contrast to, for example, the String class, whose hashCode and equals do consider the String's contents.

Set<String> uniqueStrings = new HashSet<String>();
uniqueStrings.add(new String("123"));
uniqueStrings.add(new String("123"));
uniqueStrings.add(new String("12"));
System.out.println(uniqueStrings.size() + " unique strings");

This code prints "2 unique strings". (The slightly strange-looking "new String" here makes sure that different object instances with the same content are passed to the add method; otherwise both calls would receive the same object instance, because identical string literals share a single instance.)

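The solution is to use the Arrays.hashCode(anArray) method, which computes the hash code from the array's contents. Its counterpart for comparison is Arrays.equals.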
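
To make the difference concrete, here is a minimal sketch comparing the methods an array inherits from Object with the helpers in java.util.Arrays. The class name ArrayHashDemo and the helper sameContents are made up for illustration.

```java
import java.util.Arrays;

// Illustrative class; the names here are assumptions, not from the article.
class ArrayHashDemo {
    // Content-based comparison: true when both arrays hold the same bytes in the same order.
    static boolean sameContents(byte[] a, byte[] b) {
        return Arrays.equals(a, b) && Arrays.hashCode(a) == Arrays.hashCode(b);
    }

    public static void main(String[] args) {
        byte[] first = { 1, 2, 3 };
        byte[] second = { 1, 2, 3 };

        // equals inherited from Object compares references, not contents.
        System.out.println(first.equals(second));        // false
        // The java.util.Arrays helpers look at the elements instead.
        System.out.println(sameContents(first, second)); // true
    }
}
```

The result of first.hashCode() == second.hashCode() is deliberately not shown: identity hash codes are almost always different for two objects, but nothing guarantees it.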

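Arrays.hashCode on its own doesn't help if you want to store unique arrays in a Set, because the Set still calls the array's own hashCode and equals. But if you have an object with e.g. a byte[] instance variable, then you can implement the hashCode method on that object using Arrays.hashCode (and its equals method using Arrays.equals). Alternatively, you can key a Map by the content hash, with the caveat that two different arrays whose hash codes happen to collide will overwrite one another:

Map<Integer, byte[]> map = new HashMap<Integer, byte[]>();
// Repeat for each array to be added; anArray is any byte[].
map.put(Arrays.hashCode(anArray), anArray);
Collection<byte[]> uniqueByteArrays = map.values();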
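
The first option, an object holding a byte[] instance variable, might look something like the sketch below. The wrapper name ByteArrayKey is hypothetical, and the defensive copy in the constructor is an added design choice rather than something the approach requires.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical wrapper; the name ByteArrayKey is an assumption for illustration.
final class ByteArrayKey {
    private final byte[] bytes;

    ByteArrayKey(byte[] bytes) {
        // Copy defensively so later changes to the caller's array cannot alter the hash code.
        this.bytes = bytes.clone();
    }

    @Override
    public boolean equals(Object other) {
        // Compare contents, not references.
        return other instanceof ByteArrayKey
                && Arrays.equals(bytes, ((ByteArrayKey) other).bytes);
    }

    @Override
    public int hashCode() {
        // Must agree with equals: equal contents give equal hash codes.
        return Arrays.hashCode(bytes);
    }
}

class UniqueArraysDemo {
    // Adds the same three arrays as the first example, wrapped, and returns the set size.
    static int countUnique() {
        Set<ByteArrayKey> uniqueArrays = new HashSet<ByteArrayKey>();
        uniqueArrays.add(new ByteArrayKey(new byte[] { 1, 2, 3 }));
        uniqueArrays.add(new ByteArrayKey(new byte[] { 1, 2, 3 }));
        uniqueArrays.add(new ByteArrayKey(new byte[] { 1, 2 }));
        return uniqueArrays.size();
    }

    public static void main(String[] args) {
        System.out.println(countUnique() + " unique byte arrays"); // prints "2 unique byte arrays"
    }
}
```

Because equals compares the full contents, two arrays that merely share a hash code remain distinct entries in the Set.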

Note: This only applies to arrays. Lists and Sets do consider their entries when computing their hashCodes (and in their equals methods), and thus can usefully be used as elements of Sets or as keys of Maps.
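
As a quick check of that note, the sketch below repeats the first example with List<Byte> in place of byte[]. The class and method names are illustrative only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative class; the names here are assumptions, not from the article.
class ListSetDemo {
    // Adds three lists, two of which have the same contents, and returns the set size.
    static int countUnique() {
        Set<List<Byte>> uniqueLists = new HashSet<List<Byte>>();
        uniqueLists.add(Arrays.asList((byte) 1, (byte) 2, (byte) 3));
        uniqueLists.add(Arrays.asList((byte) 1, (byte) 2, (byte) 3));
        uniqueLists.add(Arrays.asList((byte) 1, (byte) 2));
        return uniqueLists.size();
    }

    public static void main(String[] args) {
        System.out.println(countUnique() + " unique lists"); // prints "2 unique lists"
    }
}
```

One caveat: a List's hash code changes when its contents change, so a List should not be modified while it is stored in a hash-based collection.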

This article was written by Adrian Smith on 14 Feb 2008

