
03 September 2009

Why don't choose NHibernate

Well… here I am again on the same matter: performance, again.

This post collects some thoughts prompted by this thread on the nhusers list; it is not strictly a comparison between NHibernate and EntityFramework.

The comparison

To compare performance I again used the same environment and the same domain as in this post but, this time, with 50 companies, each having 50 employees, each having 50 reports: 127550 entities in total.

So far I’m NOT an expert in EntityFramework (I’m waiting for EF4), so I used the same configuration Gergely Orosz used in his performance test. The NHibernate code is the same as in the previous post, and the EntityFramework code used to materialize all entities is the following:
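
As a quick check of that total, here is the count per level of the hierarchy (the labels companies, employees and reports follow the domain described above):

```latex
\[
\underbrace{50}_{\text{companies}}
+ \underbrace{50 \times 50}_{\text{employees}}
+ \underbrace{50 \times 50 \times 50}_{\text{reports}}
= 50 + 2500 + 125000 = 127550
\]
```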

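using (var ctx = new EFContainer())
{
    // Query every company; the query runs when the foreach starts enumerating.
    var companies = from c in ctx.EFCompanySet select c;
    foreach (var company in companies)
    {
        // Explicitly load this company's employees.
        company.Employees.Load();
        foreach (var employee in company.Employees)
        {
            // Explicitly load this employee's reports.
            employee.Reports.Load();
        }
    }
}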
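
A rough count of the round trips this loop makes, assuming each Load() call issues exactly one query (an assumption about explicit loading, not something measured in this test): one query for the companies, one Load() per company for its employees, and one Load() per employee for its reports.

```latex
\[
\underbrace{1}_{\text{companies query}}
+ \underbrace{50}_{\text{Employees.Load() calls}}
+ \underbrace{50 \times 50}_{\text{Reports.Load() calls}}
= 1 + 50 + 2500 = 2551
\]
```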

The results of the test are:

EntityFramework:

Store data seconds: 209.25 (209249 milliseconds)
Read data seconds: 57.77 (57773 milliseconds)

NHibernate:

Store data seconds: 24.29 (24287 milliseconds)
Read data seconds: 24.60 (24603 milliseconds)

What is your conclusion? That NHibernate is faster than EntityFramework? That NHibernate is slow because the materialization of 127550 entities took 24.6 seconds?

Is that what you are seeing?

My conclusions

The first time I chose NHibernate, it was because:

  • I was looking for a persistence layer based on ORM
  • the license is LGPL
  • its development is based on the mature, de facto standard persistence layer of the Java world: Hibernate

What is your daily work based on?

TDD? DDD? Code maintainability? Business requests? YAGNI? Ease of implementation? XP? Deadlines? Elegant code solutions? Performance? Sprints?

If performance is the most important concept, why are you not programming in assembler instead of C#?

Are you applying only one of these concepts, or is your work a continuous balance of all of them?

What I’m seeing in many teams is that, for various reasons, we are losing the most important concept: the balance among them.

Faced with the question “why did you choose this solution?”, many times the answer is “because the deadline is tomorrow” or “because it is the minimal code to pass my test” and so on. The worst result is not the answer itself but what happens afterwards… perhaps when you have a performance issue. If you see that the solution you chose does not meet expectations, why shouldn’t rethinking the whole solution be an option?

Have a quick look at your toolbox; perhaps you already have the right tool to solve the issue. What? Using that tool the code is not elegant? It will be harder to maintain? It is harder to implement?… And? What is the main target of that specific task?

As you can see above, a persistence layer based on ORM is not the right tool to perform bulk operations, to perform massive data mining, or to provide data for an OLAP cube… Does this mean that a persistence layer based on ORM is not a good tool? Absolutely NOT, my friend; it only means that your application has a lot of different tasks and you should choose the right tool for each specific task… have a look at your toolbox!!!

34 comments:

  1. I've had positive luck using NHibernate for data integration where we do batch processing. It's not huge volume but it's large enough. I've never had a performance issue with NHibernate. Performance is usually a process flow issue.

  2. Scott, your "balance" has produced an acceptable solution.

  3. Fabio, as you pointed out, IMHO those numbers have no value. Even if you told me that NHibernate is 10x slower than EF for this kind of operation, I would still choose NH as my ORM for other reasons! And obviously, I've never used an ORM in that way!

  4. "If the performance is the most important concept why you are not programming using assembler instead C# ?"

    Hehehe. Good post.

  5. The most important aspect for me is the answer to this question:
    How long will it take to add a completely new feature to your existing product that is two or more years old?

    Yes, the time to add the feature will definitely increase, but will the increase be n*n? n*log n? Or simply a constant rise (because there will be more lines of code)?

  6. Hey Fabio, you are using lazy loading to get the relationships (incurring the N+1 issue). Was that the original plan? Entity Framework and NHibernate do not deserve more comparisons. A lot of improvements need to be made to EF to have something as mature and stable as NHibernate.

  7. @Pablo
    No, I'm not using lazy loading... this is a test about "materialization" of the whole entity domain.
    With lazy loading the query time would be 0.001".

  8. Fabio, could you send me the test code? I can check whether your EF code is optimal.

    Btw, would you like to see a similar benchmark @ ORMBattle.NET? ;) The idea behind it is to read the graph from its root as fast as possible, and this seems a very good case.

  9. > "If the performance is the most important concept why you are not programming using assembler instead C# ?"
    > Hehehe. Good post.

    That's wrong: EF and NH are pretty close in features. Much closer than assembler is to C#, or even C++ to C#.

  10. @Alex
    Try reading the post and the related links again and you will find the code.
    btw, as explained in the previous post, for NH the materialization of the whole graph happen immediately calling IQuery.List<T> (obviously because NH's users know how drive NH).

  11. @Alex
    It is clear that you are not understanding the meaning of the post.

  12. Fabio, have you used something like prefetch for NH here? If so, have you used prefetch for EF?

    For me it's clear such a test can run in two "flavors":
    - With prefetch. In this case it shows prefetch pipeline quality.
    - Without prefetch. So it shows how fast the framework operates with collections.

    And it's a good idea to check the underlying queries here. E.g. by default DO4 tries to load up to 32 entities into a collection on the first attempt to access it. So if the collection is small, it will load it completely, but if it is large, it won't waste time on a likely unnecessary operation. Imagine if you made just a few seeks (entitySet.Contains calls) on it later.

    On the other hand, in this case we'd load the whole collection completely: enumeration implies this, and this is the first "request" we get. But if you'd make a .Contains or .Count check first, we'd get additional "preload" query fetching up to 32 items. And as you see, this could decrease the performance dramatically.

    Note that what DO does here is not a deficiency: it is an optimization we intentionally wrote to increase collection performance in real-life cases. Most collections in real life contain a quite small number of items, but in your case they always contain some "average" number, which works against such optimizations. Of course it's easy to tune the system to work around this (e.g. increase the default preload limit for a particular EntitySet), but the main point is that you must be sure you have properly configured the tool for your test.

    So that's why I think it's necessary to study the code and queries anyway.

  13. Fabio, I fully understand the post. It's clear that there is a trade-off between performance and features; it is so well known that it isn't necessary to publish a test to show this.

    And it is absolutely clear that there are better tools than ORM to perform massive data mining and bulk data modifications.

    On the other hand, here you're showing NH is faster than EF on such operations. Why? Likely, because you think the results are worth showing. Thus I'd like to check whether it was an honest comparison.

  14. Btw, I came to the conclusion that you really wanted to honestly compare EF and NH after reading the first link.

  15. Anyway, if your point is that test results aren't important at all here, the best thing you can do is to remove them ;)

    But if they are at least a bit important, you must do everything to prove they are honest.

  16. So please publish the code ;) And if you can, some output from SQL Server Profiler for read data test.

    Note that I don't "suspect" results are wrong. Moreover, I'm pretty sure NH will beat EF on store data test.

    I'm just asking you to publish the underlying code to allow the community to analyze it.

  17. Forgot to add:

    > for NH the materialization of the whole graph happen immediately calling IQuery.List(of T)

    I don't see how this is related to this test. If you could write a query fetching all these entities at once, materialize them and later iterate over them the same way without any queries, that would be a perfect example of prefetch.

    But here you're actually making a set of queries. Likely, 1+50+50*50.

    Anyway, it does not matter how you materialize the result here: at once (NH) or during its processing (DO4). Materialization time is included in the test anyway.

    > (obviously because NH's users know how drive NH)

    What do you really mean? ;)

  18. I just compared NH from the previous post and EF code. So have you tried to use eager loading with EF as well?

    See e.g. http://weblogs.asp.net/zeeshanhirani/archive/2008/07/13/eager-loading-child-entities-in-entity-framework.aspx

  19. "NH from the previous post" -> "NH code from the previous post".

  20. This comment has been removed by the author.

  21. I'm curious, is it really true that EF is JUST 2 times slower than NH on this test, although it sends ~ 2500 queries vs one in NH?

  22. Aren't emails a better way of conveying a productive conversation?

  23. It depends. I don't see anything to hide here, thus I think it's ok.

  24. Alex, please stop spamming (also because if this were my blog I would have already banned you after the 4th consecutive comment).

  25. @TFQUPPMq0Jl7aBuvicVhObvQqpMU
    Alex is showing his name.

    @Alex
    I'm living in the other side of your world (in any sense). What I'm doing, along with the behaviour, is explained in the previous post. Looking at that code, reading my explanation and reading the NHibernate reference, you can recreate my results.

  26. Ok, it's a nice topic for my blogs.

  27. @ TFQUPPMq0Jl7aBuvicVhObvQqpMU:

    I like the people saying "if it WOULD BE my ..." most of all :)

  28. @Alex
    "I'm living in the other side of your world" is literately. Most of your comments was during my "sleeping time".

  29. > I'm living in the other side of your world (in any sense).
    > "I'm living in the other side of your world" is literately.

    LOL. Fabio, I'm not angry at all anyway. The only important point is test itself and the numbers you show. See ORMBattle.NET blog for details.

    > Most of your comments was during my "sleeping time".

    I see this. That's why there are lots of them. It was my working time, and I tried to figure out how such a result could appear.

  30. @Alex
    You said: "The only important point is test itself and the numbers you show."

    Now I'm sure you haven't understood this post.

  31. Fabio, why do you argue if I understood this or not? Ok, for me (not for you!) the only important point is test itself and the numbers you show. Yes, it's clear you're going to illustrate the main idea on this test, but I don't care about this, and don't argue about it.

    Please answer my question: "is it really true that EF is JUST 2 times slower than NH on this test, although it sends ~ 2500 queries vs one in NH?".

    It must be quite simple: only 2 or 3 key presses. Don't waste your time explaining to me whether I missed the point or not. I know you're a professional, and your time is money ;)

  32. If you don't understand, the fact is that I'm not interested in comparisons... but I know that is the only thing you are interested. I hope you can understand that each of us may have different interests.
    About money: what is your OSS profile?

  33. > I know that is the only thing you are interested.

    No, this isn't the only thing I care about. It's true that I care about performance. But it is definitely not #1 on the list of my priorities in software design.
