Try fast search NHibernate

26 February 2009

NH2.1.0: generators behavior explained

After this post and this other, perhaps is the moment to dive in generators behavior. In the last year we saw an increment of NHibernate’s users and sometimes they are ignoring some facts, by the way I'm hoping this post will be useful to “old” users too. I will try to show you what happen behind the scene and hopefully this will be one of the bases, of your decision, when you must choose a POID generator strategy.

The Domain

public class Person
{
private IList<Pet> pets = new List<Pet>();
public virtual string Name { get; set; }
public virtual IEnumerable<Pet> Pets
{
get { return pets; }
}

public virtual Person Partner { get; set; }
public virtual void Add(Pet pet)
{
pet.Owner = this;
pets.Add(pet);
}
}

public class Pet
{
public virtual Person Owner { get; set; }
public virtual string Description { get; set; }
}

One Full-mapping:
<class name="Person">
<id type="int">
<generator class="native"/>
</id>
<property name="Name"/>
<many-to-one name="Partner" class="Person"/>
<bag name="Pets" cascade="all" inverse="true" access="field.camelcase">
<key column="ownerId" on-delete="cascade"/>
<one-to-many class="Pet"/>
</bag>
</class>

<class name="Pet">
<id type="int">
<generator class="native"/>
</id>
<property name="Description"/>
<many-to-one name="Owner" class="Person" column="ownerId"/>
</class>
Resuming:
  • A class (Person) with two relationship
  • One relation (Partner) doesn't have cascade
  • One relation is a collection with all cascades
  • The POID is only part of the persistence and we don’t have the ID property in the domain. This is useful especially in this case because I want run the same code changing only the id-generator-strategy, and the ID-Type, in the mapping.

The Demo

Each demo will be ran in two environments:
  1. without set the adonet.batch_size property in the NHibernate configuration.
  2. setting the adonet.batch_size to 10
For all demos the the loggers I’m watching are : "NHibernate.SQL", "NHibernate.Id", "NHibernate.AdoNet.AbstractBatcher"
The main demo is the follow
using (ISession session = factory.OpenSession())
using (ITransaction tx = session.BeginTransaction())
{
log.Debug("Saving");

var person = new Person {Name = "Person"};
var patner = new Person {Name = "PatnerOfPerson"};
person.Partner = patner;
person.Add(new Pet {Description = "Reptile"});
person.Add(new Pet {Description = "Dog"});
person.Add(new Pet {Description = "Cat"});
session.Save(patner);
session.Save(person);

log.Debug("Saved");

log.Debug("Committing");
tx.Commit();
log.Debug("Committed");
}
In a session+transaction I’m creating a Person, its partner, 3 pets associated to the Person; I’m saving the first the Partner (because it don’t has cascade actions) and then the Person. The log is to understand when “things” happens.
Two additional demos showing a “not recommended” usage of NHibernate (without enclose actions in one transaction) but useful to understand the behavior of generators.
The first:
public void AllInOneSession()
{
var animal = new Pet();
using (ISession session = factory.OpenSession())
{
session.Save(animal);
animal.Description = "Boa";
session.Update(animal);
session.Delete(animal);
}
}
I’m creating a Pet and Save+Update+Delete it all in the same session.
The second:
public void SaveWithoutFlushShouldDontPersist()
{
object savedId;
using (ISession session = factory.OpenSession())
{
var animal = new Pet();
savedId=session.Save(animal);
animal.Description = "Boa";
}

using (ISession session = factory.OpenSession())
{
var found = session.Get<Pet>(savedId);
if(found != null)
log.Debug("A Pet was found even if the session was never flushed.");
}
}
In the first session I’m saving a Pet without flush the session and in the second session I’m going to DB to ensure that the Pet was not saved.
In the first demo I will explain the meaning of the messages in the follow, when you see the same sequence, you can apply the same explication.
Ok, now I’m ready to begin the dance.

Using guid.comb or "guid

The “id” mapping here is:
<id type="guid">
<generator class="guid.comb"/>
</id>
Main demo results without use batcher:
Saving
Saved

Committing
INSERT INTO Person ... 'PatnerOfPerson'
INSERT INTO Person ... 'Person'
INSERT INTO Pet ... 'Reptile'
INSERT INTO Pet ... 'Dog'
INSERT INTO Pet ... 'Cat'
Committed
Between “Saving” and “Saved” there is NO iteration with RDBMS, all thing happens during the nh-Transaction commit. In this case we are going five times to DB (one per command).
AllInOneSession results :
There is absolutely nothing nor a single log message.
Save without flush results:
Nothing.
Main demo results with BATCHER:
Saving
Saved

Committing
Opened new IDbCommand, open IDbCommands: 1
1)Building an IDbCommand object for the SqlString: INSERT INTO Person (Name, Partner, id) VALUES (?, ?, ?)
...
2)Executing batch
Batch command: INSERT INTO Person ... 'PatnerOfPerson'
Batch command: INSERT INTO Person ... 'Person'
Closed IDbCommand
Opened new IDbCommand, open IDbCommands: 1
3)Building an IDbCommand object for the SqlString: INSERT INTO Pet (Description, ownerId, id) VALUES (?, ?, ?)
...
4)Executing batch
Batch command: INSERT INTO Pet ... 'Reptile'
Batch command: INSERT INTO Pet ... 'Dog'
Batch command: INSERT INTO Pet ... 'Cat'
Closed IDbCommand
Committed
Another time all happen during nh-transaction commit and what happen is:
  1. A command to insert People (note the plural) is created and prepared.
  2. In only one command the two persons are inserted.
  3. A command to insert Pets (note the plural) is created and prepared.
  4. In only one command the three animals are inserted.
We are doing the same insertions but going to DB only two times instead of five.

Using “hilo”

The “id” mapping here is:
<idtype="int">
    <
generatorclass="hilo"/>
</
id>
Main demo results without use batcher:
Saving
Reading high value:select next_hi from hibernate_unique_key with (updlock, rowlock)
Updating high value:update hibernate_unique_key set next_hi = @p0 where next_hi = @p1
The New high value is: 2
Saved

Committing
INSERT INTO Person ... 'PatnerOfPerson'
INSERT INTO Person ... 'Person'
INSERT INTO Pet ... 'Reptile'
INSERT INTO Pet ... 'Dog'
INSERT INTO Pet ... 'Cat'
Committed
Between “Saving” and “Saved” there is one iteration with RDBMS to retrieve the High value. Note that in a fresh new RDBMS you will see one more iteration to initialize “hibernate_unique_key” table with the first high-value. If you repeat the demo inside the same session-factory instance nothing happen between “Saving”-“Saved”  for long time (if I well remember the default max-low is Int16.MaxValue).
AllInOneSession results :
Nothing.
Save without flush results:
Nothing.
Main demo results with BATCHER:
Saving
Reading high value:select next_hi from hibernate_unique_key with (updlock, rowlock)
Updating high value:update hibernate_unique_key set next_hi = @p0 where next_hi = @p1
The New high value is: 2
Saved

Committing
Opened new IDbCommand, open IDbCommands: 1
Building an IDbCommand object for the SqlString: INSERT INTO Person (Name, Partner, id) VALUES (?, ?, ?)
...
Executing batch
Batch command: INSERT INTO Person ... 'PatnerOfPerson'
Batch command: INSERT INTO Person ... 'Person'
Closed IDbCommand
Opened new IDbCommand, open IDbCommands: 1
Building an IDbCommand object for the SqlString: INSERT INTO Pet (Description, ownerId, id) VALUES (?, ?, ?)
...
Executing batch
Batch command: INSERT INTO Pet ... 'Reptile'
Batch command: INSERT INTO Pet ... 'Dog'
Batch command: INSERT INTO Pet ... 'Cat'
Closed IDbCommand
Committed
The same behavior of “guid” (only two round-trips).

Using “identity”

As last your best friend Identity (or “native” if you are using Microsoft SQL server).
The “id” mapping here is:
<id type="int">
<generator class="identity"/>
</id>

And now, my friend, fasten your seat belt because I’m going to show you what happen in the roller-coaster.
Main demo results without use batcher:
Saving
INSERT INTO Person ...; select SCOPE_IDENTITY(); … 'PatnerOfPerson'
INSERT INTO Person ... ; select SCOPE_IDENTITY(); … 'Person'
INSERT INTO Pet ... ; select SCOPE_IDENTITY();… 'Reptile'
INSERT INTO Pet ... ; select SCOPE_IDENTITY(); … 'Dog'
INSERT INTO Pet ... ; select SCOPE_IDENTITY(); … 'Cat'
Saved

Committing
Committed
As you can see the behavior is exactly the inverse of what you see using “guid”: here all happen during saving instead during commit. Each “INSERT INTO” ran in the same moment of session.Save(object) even if we don’t close the UnitOfWork and, to be more exactly, even if we don’t know what we will do with the UnitOfWork (we don’t know if we will close with commit or rollback). Five iteration with RDBMS are happening during UnitOfWork usage instead at the end of the UnitOfWork (exactly what we want avoid using UoW pattern).
AllInOneSession results :
INSERT INTO Pet (Description, ownerId) VALUES (@p0, @p1); select SCOPE_IDENTITY.
Ups… something appear here and it is something really bad.
Save without flush results:
INSERT INTO Pet (Description, ownerId) VALUES (@p0, @p1); select SCOPE_IDENTITY()
A Pet was found even if the session was never flushed.
WOW! We are having a record in the DB even if we don’t flush the session ?
Yes, nothing so strange if you are using identity-style generators.
Main demo results with BATCHER:
And now let me show you how much is useful the batcher when you are working with identity-style generators.
Saving

Opened new IDbCommand, open IDbCommands: 1
Building an IDbCommand object for the SqlString: INSERT INTO Person (Name, Partner) VALUES (?, ?); select SCOPE_IDENTITY()
INSERT INTO Person ...; select SCOPE_IDENTITY(); ... 'PatnerOfPerson'
Opened IDataReader, open IDataReaders: 1
...
Closed IDbCommand
Opened new IDbCommand, open IDbCommands: 1
Building an IDbCommand object for the SqlString: INSERT INTO Person (Name, Partner) VALUES (?, ?); select SCOPE_IDENTITY()
INSERT INTO Person ...; select SCOPE_IDENTITY(); ...'Person'
Opened IDataReader, open IDataReaders: 1
...
Closed IDbCommand, open IDbCommands: 0
Opened new IDbCommand, open IDbCommands: 1
Building an IDbCommand object for the SqlString: INSERT INTO Pet (Description, ownerId) VALUES (?, ?); select SCOPE_IDENTITY()
INSERT INTO Pet ...; select SCOPE_IDENTITY(); ...'Reptile'
Opened IDataReader, open IDataReaders: 1
...
Closed IDbCommand, open IDbCommands: 0
Opened new IDbCommand, open IDbCommands: 1
Building an IDbCommand object for the SqlString: INSERT INTO Pet (Description, ownerId) VALUES (?, ?); select SCOPE_IDENTITY()
INSERT INTO Pet ...; select SCOPE_IDENTITY(); ... 'Dog'
Opened IDataReader, open IDataReaders: 1
...
Closed IDbCommand, open IDbCommands: 0
Opened new IDbCommand, open IDbCommands: 1
Building an IDbCommand object for the SqlString: INSERT INTO Pet (Description, ownerId) VALUES (?, ?); select SCOPE_IDENTITY()
INSERT INTO Pet ...; select SCOPE_IDENTITY(); ... 'Cat'
Opened IDataReader, open IDataReaders: 1
...
Closed IDbCommand, open IDbCommands: 0
Saved
Committing
Committed
Do you see ? … try to see again… Do you see ?
The target of the Batcher should be: Have better performance and less round-trips. Using identity-style generators, the benefits of the batcher are nullified and, obviously, don’t forget the rollback of the transaction because you don’t have another chance to nullify what happen in the session.Save().

Conclusions

This time I leave the conclusions to you; please let me know leaving a comment here ;)


kick it on DotNetKicks.com

18 comments:

  1. Very insightful... I remember a couple years ago when I was using identity and saw all this stuff in the log and thought "No wonder DBA's hate ORMS"!

    So I moved to hilo and now I typically just use guids.

    ReplyDelete
  2. for DBA oriented generator without lost all benefits my advise is : "seqHilo", "seuqence" or "guid.native"
    About "guid.native" take a look to
    http://fabiomaulo.blogspot.com/2009/02/nh210-new-generators.html

    ReplyDelete
  3. I have some questions about hilo, I know nh uses a table to store the next hi value, but I wonder where it stores the current low value? If I shut down the server, can nh remember where it is? And I find, if I set the max low to 100, the Id of the first table starts from 101, the second table starts from 201, it's a little strange. If I choose hilo, that is to say, I can't use ado.net or other client to insert new records, is it correct? Is there any advice about the max low value? Set it as 100, 1000 or just the default value?

    ReplyDelete
  4. I will try to be short...
    The High start from 1. Giving 100 your are saying : I want a new high each 100 new entities (in the same table).
    This mean that the Low start from 01 and up to 00. For that table the first ID will be 101 and then 102, 103... to 199 at this point a new high is needed (new round-trip).
    With this behavior you have an "incremental" ID, per table, between each High value request.
    In reality the High value is not per-table but per-persister (only to simplify we can talk about "table") and until the Low is less than MaxLow the High value has the life-cycle of the persister so... it is per-SessionFactory. When you restart the application a new High is required per each persister. This behavior is a little bit different than the HighLow-pattern described in ORM (an improvment IMHO).

    ReplyDelete
  5. Sorry Karron... your comment as 4 questions.
    Why you can't use ado.net ? NH is using ADO.NET to manage HiLo. You can create a class to manage the HighLow pattern outside NH... at the end it is only a pattern and you can implement it where you want.
    About the MaxLow value... more high is the MaxLow value, less roundtrips you have during the life of your application; after that it is more a matter of "style" than a matter of final result. If you like numbers starting from 10001 you can use MaxLow= 10000, but for your RDBMS what it is really important is they are incremental/sequencial.

    ReplyDelete
  6. Thank you for your replies. I have a following question. I use hilo as Id generator within NH. But sometimes, I may need use sql script or other client to do some data migrations to our database, this is outside NH. I did a test, when my application is running, I use sql script to insert a record to database manually, after that, when my application continue to insert a new record, it popups a error, because the primary key exists.

    As you said, I must write a class/application to manage the hilo pattern, but not use sql script directly, is it correct? Or there is a way that NH can detect this, and request a new hi value automatically?

    ReplyDelete
  7. Karron I don't understand the situation.
    Do you have something else inserting records in your DB ? Another application ?
    If another application need to do something with data of your application you should public a service and not allow direct-access to DB. The ID management is only the first problem... BTW nobody said you to use HiLo... you can use GUID.

    ReplyDelete
  8. Two small typos : "don’t has cascade" - doesn't have cascade.

    Great article , nicely explained :).

    ReplyDelete
  9. Thank you, Fabio, I just want to figure out something about HiLo. Because I don't know this before. Thanks for your patient explanation.

    ReplyDelete
  10. @Karron
    Guid is usually used in this situation, however as Fabio said you can manage HiLo values outside of the NHibernate. If I recall correct, HiLoIdentity generator takes some parameter including tha table and column, you can use(and change) this table to obtain new Hi value (don't forget to take concurrency precautions)

    ReplyDelete
  11. I think that identity should be generated only when sessoin.Flush() is called and not session.Save().

    For me, going with identity is fine because the id is a database id and should only be generated when the object is added to the database and not before.

    I think the problem is not with identity but with how nHibernate is doing it (too early).

    ReplyDelete
  12. @duraid
    You are watching only part of the picture forgetting relationships.

    ReplyDelete
  13. @Fabio

    Can you please explain more?

    let me give an example: let's say i created a person object and added 3 pets to it. after that I associated them with the UoW by using session.Save(). for me i don't need to have the database id's added to the objects until i do session.Flush(), don't you think so? please convince me otherwise. If it's a database id then it should be created when the object is actually added to the database, no?

    I think the confusion comes from how nHibernate method names. If i was them i would use session.Add() instead of session.Save() and session.Commit() instead of session.Flush().

    because session.Add() will acurately mean it's adding to the UoW and session.Flush is bad name all together.. what's it is a toilet?!

    ReplyDelete
  14. I don't need to convince somebody.
    I'm sure you are watching only part of the picture. Try to implement your DAL with all features NH has, then we can talk about which is the right way to do something.

    P.S. For Master-Detail + identity, NH delay the ID of detail at Flush...

    ReplyDelete
  15. I recently came across your blog and have been reading along. I thought I would leave my first comment. I don't know what to say except that I have enjoyed reading. Nice blog. I will keep visiting this blog very often.

    Ann


    http://largepet.info

    ReplyDelete
  16. Fabio, great article.

    One question - does the MsSQL "identity" problem apply also to other id generators that are using db mechanisms, such as "sequence" ?

    Thanks,
    Andrzej

    ReplyDelete
  17. @severian
    No, there is only an additional round-trip on session.Save but nothing more.
    If you are using sequence take a look to "seqhilo" that are combining HighLow with sequence.

    ReplyDelete