Try fast search NHibernate

19 November 2009

The .NET’s Attribute hole

Reading this post try to imagine how much I have worked, with my code, before create a test to test the behavior of the class Attribute.

Assertion

In .Net an Attribute is a class as any other and may have state and behavior.

True ? so and so… first of all, as class, it has some restriction as: it must inherit from System.Attribute and can’t be a generic class; second as any class it was implemented by a human with his fantasy (and this is the problem).

Perhaps, add behavior to an Attribute is not so common but, since even Microsoft is doing it, we can assume that it is a legal usage… anyway this is not the problem.

The Equal and GetHashCode issue

[AttributeUsage(AttributeTargets.Property)]
public class MyMarkerAttribute: Attribute
{
}
var attribute = new MyMarkerAttribute();
var another = new MyMarkerAttribute();
Console.WriteLine("Using == the result is: " + (attribute == another));
Console.WriteLine("Using Equals the result is:" + attribute.Equals(another));
Console.WriteLine("attribute hash:" + attribute.GetHashCode() + " another hash:" + another.GetHashCode());

This is the output:

Using == the result is: False
Using Equals the result is:True
attribute hash:10001680 another hash:10001680

Note: I haven’t implemented neither Equals nor GetHashCode. Every thing is as expected ? In a normal class this is not the expected behavior, by the way so far I can accept the situation.

Now I will change the attribute adding a integer property (about integer you know that its hash-code is its value):

[AttributeUsage(AttributeTargets.Property)]
public class MyMarkerAttribute: Attribute
{
public int Min { get; set; }
}

and now the output is:

Using == the result is: False
Using Equals the result is:True
attribute hash:0 another hash:0

What ? The HashCode now is cero ? … wait… wait… it does not end here.

[AttributeUsage(AttributeTargets.Property)]
public class MyMarkerAttribute: Attribute
{
public int Min { get; set; }
public int Max { get; set; }
}
var attribute = new MyMarkerAttribute { Min=5, Max= 456 };
var another = new MyMarkerAttribute { Min = 5, Max = 123 };

And the output is:

Using Equals the result is:False
attribute hash:5 another hash:5

As you can see the hash code is the hash of the first property (the same if it would be a field) and the most important inconsistence is that the two instance result as NOT EQUALS but they has the same HashCode.

[AttributeUsage(AttributeTargets.Property)]
public class MyMarkerAttribute: Attribute
{
private int myBergamota = 123;
public int Min { get; set; }
public int Max { get; set; }
}
var attribute = new MyMarkerAttribute {Min = 5, Max = 456};
var another = new MyMarkerAttribute { Min = 5, Max = 456 };

The output is:

Using Equals the result is:True
attribute hash:123 another hash:123

So the HashCode is the hash-code of the first property and the Equals work using the state of each property of instance of the attribute… such fantasy…

[AttributeUsage(AttributeTargets.Property)]
public class MyMarkerAttribute: Attribute
{
private int myBergamota = 123;
private int min;
public int Min
{
get { return min; }
set { min = value; }
}

public int Max { get; set; }

public void SetFieldValue()
{
myBergamota = 789;
}
}
var attribute = new MyMarkerAttribute {Min = 5, Max = 456};
var another = new MyMarkerAttribute { Min = 5, Max = 456 };

Console.WriteLine("attribute hash:" + attribute.GetHashCode() + " another hash:" + another.GetHashCode());

another.SetFieldValue();

Console.WriteLine("Using Equals the result is:" + attribute.Equals(another));
Console.WriteLine("attribute hash:" + attribute.GetHashCode() + " another hash:" + another.GetHashCode());

And the output is:

attribute hash:123 another hash:123
Using Equals the result is:False
attribute hash:123 another hash:789

What ? The hash changing at run-time ? but one of your recommendation is not that the hash-code should never change during execution ? what happen with Attribute ?

Wait a minute… who said that the hash-code change at run-time; it depend!!! depend about what… try to guess…

Now I will change only the position of the declaration of the private field:

[AttributeUsage(AttributeTargets.Property)]
public class MyMarkerAttribute: Attribute
{
private int min;
private int myBergamota = 123;
public int Min
{
get { return min; }
set { min = value; }
}

public int Max { get; set; }

public void SetFieldValue()
{
myBergamota = 789;
}
}

As you can see the only change is the position of the declaration, of the field myBergamota, from the first position to the second inside the class implementation; nothing more than the position inside the implementation!!! and now the output is:

attribute hash:5 another hash:5
Using Equals the result is:False
attribute hash:5 another hash:5

Re WHAT!???!!???! now I have the same hash-code ?!?!?!?!? only moving a declaration inside the class ?!?!?!?

Now you know that there is an ugly issue inside the .NET’s Attribute implementation and you may have a very very ugly surprise if you are writing a framework using attribute to cache metadata especially if the Attribute has behavior changing the internal state of the instance.

Resuming
The Attribute class override Equals and GetHashCode. The Equals is using the whole internal state, the GetHashCode is using only state of the first field and its value may change during execution.

The workaround
If you want a deterministic behavior the solution is simple and is the same you are applying in your classes: override Equals and GetHashCode.

If you don’t need a special behavior the implementation to use the Reference (the base of any other class) is:
public override bool Equals(object obj)
{
return ReferenceEquals(this, obj);
}

public override int GetHashCode()
{
return RuntimeHelpers.GetHashCode(this);
}

Conclusion

I like the application of your fantasy during software development, but be moderate please.

18 comments:

  1. I'm curious in knowing how exactly you are using attributes. For once, I am never newing up instances of attributes in code. The fact that
    it's possible is somewhat a bit of a mystery. Second, I would never use attributes in a set of any kind. My understanding of attributes are to mark types, members etc. statically, in which case your tests makes no sense whatsoever.

    Now, I might be wrong on this one, but I thing you are misusing the feature...

    ReplyDelete
  2. Hi,

    msdn GetHashCode documentation says that if two objects DO NOT compare as EQUAL, the GetHashCode methods for the two object do not have to return different value.

    Nice post!

    ReplyDelete
  3. Mmmmmmmm, ugly, ugly...

    Types inheriting from ValueType have behaviors like this (implementation of GetHashCode for ValueTypes is very strange) but the Attribute type inherits from Object.

    Remains to see why the Equals and GetHashCode implementation for Attribute behaves that whay.

    Go for a mate (spanish sense of that word) :-)

    ReplyDelete
  4. By the way, I'm not saying that the implementation of the Attribute type is correct, I'm just questioning whether this actually tests proper usage of the feature.
    I've never seen any code newing up an instance of an attribute...(?)

    ReplyDelete
  5. @TigerShark
    Microsoft is using Attribute with behaviour; the last most famous place is DataAnnotation but you can see the same in some WCF example.
    btw I never seen a code where the position of a declaration takes a sense in the behaviour of the class.

    ReplyDelete
  6. I took a brief look at the Attribute class in Reflector. My guess is that some clever developer thought that he needed some mechanism for automatically creating the hashcode runtime. I believe the idea is fine, but the implementation fails to take into account the fact that the member placements affects the actual hashcode calculation in the implementation.
    Funny issue, but has it ever been a real problem? I'm just curious...

    ReplyDelete
  7. @Mauricio
    The hash-code should never change during the object life-cycle. If the hash-code change you can't use that class neither in a Dictionary nor HashSet or you should specify a custom EqualityComparer.

    ReplyDelete
  8. @Fabio
    Are you saying that the result of GetHashCode() shouldn't change if member values are changed? This can only be correct for immutable types. Every type with mutators would change their hashcode whenever one of the member values changes. At least that's the way every implementation I've ever seen of GetHashCode() behaves.

    ReplyDelete
  9. I'm sorry... must be tired... got it now :)

    ReplyDelete
  10. @TigerShark
    Probably you are not using your classes as key in a Dictionary nor HashSet because if you are changing the HashCode, after add an instance to a Dictionary, you may have the same instance multiples times.
    Who work building base-frameworks knows this issue very well.

    ReplyDelete
  11. @Fabio

    Many types in the .NET framework do change their hashcodes after instantiation. If you are using these types as keys in a dictionary, it's only at the time they are inserted, the hashcode makes a difference. Afterwards, if your are changing some member value of the type, the dictionary doesn't care.

    That said, I'm still not quite sure I understand why this is a problem with attributes. Why would anyone use them as key in a dictionary?

    ReplyDelete
  12. The recommended approach is to override GetHashCode and to a XOR of all datamember "key" values. In your case you have a Min and Max:

    int override GetHashCode()
    {
    return Min.GetHashCode() ^ Max.GetHashCode;
    }

    This is documented behavior as the framework cannot guess your intended behavior for hashing.

    ReplyDelete
  13. @TigerShark
    That is not the point.
    In your opinion, calculate the hash using the first field declared in a class is something sane ? can you do the same ?
    If you are using the whole state for Equals, why you should use a different concept for GetHashCode ?
    That said... I can accept a restriction (as you can't use a mutable ValueType as key) but please give me something with a predictable behaviour and, where possible, document your fantasy (at the end somebody is paying you for that too).

    ReplyDelete
  14. @Ramon
    That is the workaround I wrote in the post and is what the base impl. is using for Equals (and that is one of the inconsistencies).

    ReplyDelete
  15. @Fabio

    Yeah, the difference between GetHashCode and Equals is quite absurd and I completely agree with you on that...

    But still, why is this a real problem?

    ReplyDelete
  16. @TigerShark
    too large to explain here.

    ReplyDelete
  17. @Fabio

    Maybe I failed to explain myself, you wrote

    "...and the most important inconsistence is that the two instance result as NOT EQUALS but they has the same HashCode."

    and according to msdn that's not an inconsistency

    ReplyDelete
  18. Regarding the hash code changing - as far as I could see, the hash code was only changed when the equality was effectively changed too. This would screw you up if you used the object as a key in a dictionary anyway.

    A hash code shouldn't change if nothing related to equality changes, but it's fine for it to change if you're changing something that *is* used in equality.

    This is why if you *are* going to use mutable types as dictionary keys, you should avoid mutating them after you've added them into the dictionary.

    I agree that the hashcode calculation is very strange, but I don't think it's as fundamentally broken as you're implying.

    ReplyDelete