If you have ever worked on a Java application that had anything to do with the web, you are probably familiar with the java.net.URL class. I have a puzzle for you then: let’s say your application refers to the JavaBlogging site with two URLs that use slightly different host names. Will those URL objects be equal to each other? See the following code snippet:
import java.net.URL;

public class UrlEqualsPuzzle {
    public static void main(String[] args) throws Exception {
        URL url1 = new URL("http://www.javablogging.com");
        URL url2 = new URL("http://javablogging.com");
        System.out.println(url1.equals(url2));
    }
}
So what will get printed out? One could argue that those links refer to the same site, so we should get ‘true’… but on the other hand the addresses are spelled differently, so maybe the correct answer is ‘false’? You may not be sure which answer is the right one, but since the code is deterministic we know at least that one of them is correct… right?
Wrong! You may get either ‘true’ or ‘false’. The strange thing is that the result does not depend on the JVM, on your machine or on the Java version, but… on whether or not you are connected to the Internet!
This strange behavior is partially explained in the URL JavaDoc (Doh!): one of the conditions for two URLs to be considered equal is that their host names resolve to the same IP address. Therefore the first step in checking for equality is resolving both host names, and that normally requires a working Internet connection. If either host name cannot be resolved, the host names are compared as Strings, ignoring case.
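The rule described above can be sketched in a few lines. This is only an illustration of the documented behavior, not the JDK source: the class name HostEqualitySketch, the injected resolver and the address 192.0.2.1 (taken from the documentation-only range) are all assumptions made so the example runs without touching the network.

```java
import java.util.Map;
import java.util.function.Function;

class HostEqualitySketch {
    // Hypothetical helper. The resolver returns an address for a host name,
    // or null when the lookup fails (for example, when you are offline).
    static boolean hostsEqual(String host1, String host2,
                              Function<String, String> resolver) {
        String address1 = resolver.apply(host1);
        String address2 = resolver.apply(host2);
        if (address1 != null && address2 != null) {
            // Both names resolved: the addresses decide.
            return address1.equals(address2);
        }
        // At least one lookup failed: fall back to the names, ignoring case.
        return host1.equalsIgnoreCase(host2);
    }

    public static void main(String[] args) {
        // Made-up lookup table standing in for DNS.
        Map<String, String> fakeDns = Map.of(
                "www.javablogging.com", "192.0.2.1",
                "javablogging.com", "192.0.2.1");
        Function<String, String> online = fakeDns::get;
        Function<String, String> offline = host -> null;

        System.out.println(hostsEqual("www.javablogging.com", "javablogging.com", online));  // true
        System.out.println(hostsEqual("www.javablogging.com", "javablogging.com", offline)); // false
    }
}
```

The same two names give different answers depending only on what the resolver returns, which is exactly the surprise in the puzzle.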
What are the consequences of this behavior, besides all the confusion? One is that, because of the IP resolution, the equals() method of URL can be very slow. What is even worse, equals() is inconsistent and therefore breaks the Object.equals() contract! (See the requirements for the equals() method for comparison.)
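If all you need is to know whether two URL objects were written the same way, one possible workaround is to compare them as URIs, which involves no host name lookup. The helper name sameWrittenUrl below is hypothetical, and the sketch assumes both URLs are also valid URIs (toURI() throws URISyntaxException otherwise).

```java
import java.net.URISyntaxException;
import java.net.URL;

class UrlTextCompare {
    // Hypothetical helper: compares two URLs by their written form only.
    // URI.equals() works on the parsed components and never resolves hosts.
    static boolean sameWrittenUrl(URL a, URL b) throws URISyntaxException {
        return a.toURI().equals(b.toURI());
    }

    public static void main(String[] args) throws Exception {
        URL url1 = new URL("http://www.javablogging.com");
        URL url2 = new URL("http://javablogging.com");
        // Different host names, so this prints false, online or offline.
        System.out.println(sameWrittenUrl(url1, url2));
    }
}
```

Note that this answers a narrower question than URL.equals() tries to: it says nothing about whether the two addresses lead to the same site.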
Now check out the following ‘improvement’ to the code:
import java.net.URL;

public class UrlEqualsLoop {
    public static void main(String[] args) throws Exception {
        URL url1 = new URL("http://www.javablogging.com");
        URL url2 = new URL("http://javablogging.com");
        // Keep comparing the same two objects forever
        while (true) {
            System.out.println(url1.equals(url2));
        }
    }
}
Disconnect yourself from the Internet, run the code, connect again and enjoy URL changing its mind about the result of the equality check!
So what is the lesson from all of this? Remember that URL is cheating and its implementation of equals() breaks the contract stated in the Object class. Therefore URL should never be used in situations where it may be tested for equality, as the results may be unpredictable. Beware of using it with collections: do not use it as a Map key or as an element of a Set, or you may get yourself into trouble…
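For collections, a commonly suggested alternative is to store java.net.URI values instead, since URI compares and hashes its parsed components without any network access. A minimal sketch, assuming the addresses are valid URIs; the class and method names are made up for illustration:

```java
import java.net.URI;
import java.util.HashSet;
import java.util.Set;

class UriSetExample {
    // Hypothetical helper: collects addresses into a Set keyed by URI.
    static Set<URI> toUriSet(String... addresses) {
        Set<URI> result = new HashSet<>();
        for (String address : addresses) {
            result.add(URI.create(address));
        }
        return result;
    }

    public static void main(String[] args) {
        Set<URI> seen = toUriSet(
                "http://www.javablogging.com",
                "http://javablogging.com",
                "http://JAVABLOGGING.com"); // same host as above, different case
        // URI ignores case in the host, so the last two count as one entry.
        System.out.println(seen.size()); // 2
    }
}
```

If the rest of the code needs a URL, for example to open a connection, uri.toURL() converts at the point of use.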
Comments
Nice post.
IIRC, URI’s equals() is well implemented, so you should prefer URI over URL.
Nice post. Didn’t know this.
I have never used equals() on a URL so far. Never found the need to. An interesting equals() implementation indeed.
Is it really inconsistent? The JavaDoc for Object.equals() states:
It [equals()] is consistent: for any non-null reference values x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the objects is modified.
So it seems to be consistent in terms of the information used in its implementation. On the other hand, the JavaDoc for URL.equals() says:
Note: The defined behavior for equals is known to be inconsistent with virtual hosting in HTTP.
I believe you cannot treat the Internet connection as part of URL’s ‘information used in equals comparisons’. The objects themselves were not changed by us, yet the equals() result changes. This is IMHO inconsistency.
Another thing: this result has nothing to do with virtual hosting (two different hosts like http://www.abc.com and http://www.xyz.com resolving to the same IP), but only with the different ways the same host name is written. So basically the note on URL.equals() means that even though http://www.abc.com and http://www.xyz.com are different hosts, you can get: new URL("http://www.abc.com").equals(new URL("http://www.xyz.com")) == true
This behavior makes no sense. Whether a Host header matches the same virtual server is not decided by the IP address bound to an interface. Two names may be served by the same machine and be equivalent even though their IPs differ, and they may be different sites even if the IP matches.
Are they the same though?
The use of the subdomain (’www’) is superfluous because www is implicit when using the HTTP URL scheme; the second URL is using the implicit nature of HTTP to refer to www by default.