
03 April 2009

Tuning NHibernate: Tolerant QueryCache

Before reading this post, you should know something about the QueryCache and what it implies for tuning NH.

To summarize:

  • Using IQuery.SetCacheable(true) you can put/get the entire result of a query into/from the cache.
  • The cache is automatically invalidated when the query space changes (meaning that the cached results are thrown away when an Insert/Update/Delete is executed on one of the tables involved in the query).
  • Using IQuery.SetCacheMode(CacheMode.Refresh) you can force a cache refresh (for example, if you need to refresh the cache after a Delete/Insert).
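
The invalidation rule in the second point can be pictured with a small toy model. This is only a sketch, not NHibernate's code: `ToyQueryCache`, `TableChanged`, `Put` and `Get` are hypothetical names, and a simple counter stands in for real timestamps. Each cached result remembers when it was stored, each table remembers when it was last written, and a result is served only if no table in its query space was written afterwards.

```csharp
using System;
using System.Collections.Generic;

// Toy model (hypothetical, NOT NHibernate's implementation) of a query cache
// that is invalidated when a table in the query space changes.
public class ToyQueryCache
{
    private long clock; // logical clock, incremented on every event
    private readonly Dictionary<string, long> lastWrite =
        new Dictionary<string, long>();
    private readonly Dictionary<string, KeyValuePair<long, object>> entries =
        new Dictionary<string, KeyValuePair<long, object>>();

    // Called whenever an Insert/Update/Delete touches a table.
    public void TableChanged(string table)
    {
        lastWrite[table] = ++clock;
    }

    // Stores the result of a query together with the moment it was stored.
    public void Put(string query, object result)
    {
        entries[query] = new KeyValuePair<long, object>(++clock, result);
    }

    // Returns the cached result, or null when nothing is cached or when any
    // table in the query space was written after the result was stored.
    public object Get(string query, IEnumerable<string> querySpaces)
    {
        KeyValuePair<long, object> entry;
        if (!entries.TryGetValue(query, out entry)) return null;

        foreach (string table in querySpaces)
        {
            long written;
            if (lastWrite.TryGetValue(table, out written) && written > entry.Key)
            {
                entries.Remove(query); // stale: throw the result away
                return null;
            }
        }
        return entry.Value;
    }
}
```

In this model a write to an unrelated table leaves the entry alone, while a single write to any table of the query space is enough to discard it.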

The Case

(the picture is an eBay snapshot)

Take a look at the left side. Next to each option you can see a number, and I’m pretty sure that it does not reflect the exact state of the DB. That number is only “an orientation” for the user, probably calculated a few minutes earlier.

Now think about the SQL statements behind the scenes: how many there are and how heavy they are. A possible example for “Album Type”, using HQL, could look like:

select musicCD.AlbumType.Name, count(*) from MusicCD musicCD where musicCD.Genre = 'Classical' group by musicCD.AlbumType.Name

How much time does each “Refine search” need?

Ah… but there is no problem, I’m using NHibernate and its QueryCache… hmmmm…

Now, suppose that each time a user clicks an article you increment the number of visits of that article. What happens to your QueryCache? Yes: with each click the QueryCache is invalidated and thrown away (the same happens if any user in the world inserts/updates/deletes something in the tables involved).

The Tolerant QueryCache abstract

The Tolerant QueryCache should be an implementation of IQueryCache which understands, through its configuration properties, that updates to certain tables should not invalidate the cache of queries based on those tables.

Taking the example above, this means that an update to MusicCD does not invalidate all the “Refine search” queries, if we are caching those heavy statistics queries.
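
As a rough illustration of the idea (a sketch under assumptions, not the uNhAddIns implementation), the "is this cached result still valid?" check can simply skip the tolerated tables. `TolerantCheck.IsUpToDate` and its parameters are hypothetical names, and plain numbers stand in for timestamps.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of NHibernate or uNhAddIns.
public static class TolerantCheck
{
    // Decides whether a result cached at 'storedAt' can still be served.
    // Writes to tables listed in 'toleratedTables' are ignored on purpose.
    public static bool IsUpToDate(
        IEnumerable<string> querySpaces,
        long storedAt,
        IDictionary<string, long> lastWrite,
        ICollection<string> toleratedTables)
    {
        foreach (string table in querySpaces)
        {
            // A tolerated table never invalidates the cached result.
            if (toleratedTables.Contains(table)) continue;

            long written;
            if (lastWrite.TryGetValue(table, out written) && written > storedAt)
            {
                return false; // a non-tolerated table changed after caching
            }
        }
        return true;
    }
}
```

With "MusicCDs" tolerated, a query over MusicCDs stays valid after a write to that table, while a query that also involves a non-tolerated table is still invalidated by a write to the latter.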

The integration point

Well… at this point you should know how extensible and “injectable” NHibernate is.

For each cache region NHibernate creates an instance of IQueryCache through an implementation of IQueryCacheFactory and, as you can imagine, the concrete IQueryCacheFactory implementation can be injected through the session-factory configuration.

<property name="cache.query_cache_factory">YourQueryCacheFactory</property>

At this point we know everything we need to do to have our TolerantQueryCache:


  1. Some configuration classes to configure tolerated tables for certain regions.
  2. An implementation of IQueryCacheFactory to use the TolerantQueryCache for certain regions.
  3. The implementation of TolerantQueryCache.
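
How the first two pieces could fit together is sketched below with simplified stand-ins. Every type here (`IToyQueryCache`, `ToyRegionConfiguration`, `ToyQueryCacheFactory`, and so on) is hypothetical; NHibernate's real IQueryCache and IQueryCacheFactory have richer signatures, and the actual classes are the ones in uNhAddIns.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a per-region query cache.
public interface IToyQueryCache { string Region { get; } }

public class ToyStandardQueryCache : IToyQueryCache
{
    public ToyStandardQueryCache(string region) { Region = region; }
    public string Region { get; private set; }
}

public class ToyTolerantQueryCache : IToyQueryCache
{
    public ToyTolerantQueryCache(string region, IEnumerable<string> toleratedTables)
    {
        Region = region;
        ToleratedTables = new HashSet<string>(toleratedTables);
    }
    public string Region { get; private set; }
    public HashSet<string> ToleratedTables { get; private set; }
}

// Step 1: configuration of the tolerated tables for each region.
public class ToyRegionConfiguration
{
    private readonly Dictionary<string, List<string>> tolerated =
        new Dictionary<string, List<string>>();

    public ToyRegionConfiguration TolerantWith(string region, params string[] tables)
    {
        List<string> list;
        if (!tolerated.TryGetValue(region, out list))
        {
            list = new List<string>();
            tolerated[region] = list;
        }
        list.AddRange(tables);
        return this;
    }

    public bool TryGetToleratedTables(string region, out List<string> tables)
    {
        return tolerated.TryGetValue(region, out tables);
    }
}

// Step 2: the factory hands out a tolerant cache only for configured regions.
public class ToyQueryCacheFactory
{
    private readonly ToyRegionConfiguration configuration;

    public ToyQueryCacheFactory(ToyRegionConfiguration configuration)
    {
        this.configuration = configuration;
    }

    public IToyQueryCache GetQueryCache(string region)
    {
        List<string> tables;
        if (configuration.TryGetToleratedTables(region, out tables))
        {
            return new ToyTolerantQueryCache(region, tables);
        }
        return new ToyStandardQueryCache(region); // default behaviour elsewhere
    }
}
```

The point of the design is that regions without any tolerance configuration keep the default behaviour, so opting in is a per-region decision.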

The Test

Here is only the integration test; all implementations are available in uNhAddIns.

Domain
public class MusicCD
{
    public virtual string Name { get; set; }
}

public class Antique
{
    public virtual string Name { get; set; }
}

<class name="MusicCD" table="MusicCDs">
    <id type="int">
        <generator class="hilo"/>
    </id>
    <property name="Name"/>
</class>

<class name="Antique" table="Antiques">
    <id type="int">
        <generator class="hilo"/>
    </id>
    <property name="Name"/>
</class>

Configuration
public override void Configure(NHibernate.Cfg.Configuration configuration)
{
    base.Configure(configuration);
    configuration.SetProperty(Environment.GenerateStatistics, "true");
    configuration.SetProperty(Environment.CacheProvider,
        typeof(HashtableCacheProvider).AssemblyQualifiedName);

    configuration.QueryCache()
        .ResolveRegion("SearchStatistics")
        .Using<TolerantQueryCache>()
        .TolerantWith("MusicCDs");
}

The configuration applies only to the “SearchStatistics” region, so other regions will work with the default NHibernate implementation. NOTE: the HashtableCacheProvider is valid only for tests.

The test
// Fill DB
SessionFactory.EncloseInTransaction(session =>
{
    for (int i = 0; i < 10; i++)
    {
        session.Save(new MusicCD { Name = "Music" + (i / 2) });
        session.Save(new Antique { Name = "Antique" + (i / 2) });
    }
});

// Queries
var musicQuery =
    new DetachedQuery("select m.Name, count(*) from MusicCD m group by m.Name")
        .SetCacheable(true)
        .SetCacheRegion("SearchStatistics");

var antiquesQuery =
    new DetachedQuery("select a.Name, count(*) from Antique a group by a.Name")
        .SetCacheable(true)
        .SetCacheRegion("SearchStatistics");

// Clear SessionFactory Statistics
SessionFactory.Statistics.Clear();

// Put in second-level-cache
SessionFactory.EncloseInTransaction(session =>
{
    musicQuery.GetExecutableQuery(session).List();
    antiquesQuery.GetExecutableQuery(session).List();
});

// Asserts after execution
SessionFactory.Statistics.QueryCacheHitCount
    .Should("not hit the query cache").Be.Equal(0);

SessionFactory.Statistics.QueryExecutionCount
    .Should("execute both queries").Be.Equal(2);

// Update both tables
SessionFactory.EncloseInTransaction(session =>
{
    session.Save(new MusicCD { Name = "New Music" });
    session.Save(new Antique { Name = "New Antique" });
});

// Clear SessionFactory Statistics again
SessionFactory.Statistics.Clear();

// Execute both queries again
SessionFactory.EncloseInTransaction(session =>
{
    musicQuery.GetExecutableQuery(session).List();
    antiquesQuery.GetExecutableQuery(session).List();
});

// Asserts after execution
SessionFactory.Statistics.QueryCacheHitCount
    .Should("Hit the query cache").Be.Equal(1);

SessionFactory.Statistics.QueryExecutionCount
    .Should("execute only the query for Antiques").Be.Equal(1);

Fine! I have changed both tables, but in the second execution the result for MusicCD comes from the cache.



Code available here.




10 comments:

  1. This looks pretty sweet.

    So, if I understood correctly, the query cache is not invalidated by creates/updates/deletes to relevant entities, and the query cache is allowed to serve stale data? The benefit being that expensive queries are cached for longer, putting less strain on the app.

    I suppose for this scenario you'd run an hourly/nightly job to manually invalidate the cache and then prime it again?

  2. No Tobin, the responsibility to evict the cache is delegated to the cache provider.
    In practice, if you are using SysCache and you set the cache expiration to 10 minutes for the region "SearchStatistics", the region expires after 10 minutes.

    Thanks for asking; I will change the example to use SysCache.

  3. Cheers.

    Wouldn't that mean that every 10 minutes one of your users is going to have a *really* slow page load as the cache is rebuilt during their HTTP request?

  4. ...And, since the SysCache isn't cluster-scoped, this would multiply for each server in the farm?

  5. -The TolerantQueryCache avoids the cache invalidation for creates/updates/deletes.
    -If you use TolerantQueryCache with SysCache, after 10 minutes ONE user may have a slow response. Without TolerantQueryCache EACH user may have a slow response.
    -The decision about which cache you should use is not influenced by the TolerantQueryCache usage.

  6. The example using SysCache, as cache provider, is available
    http://code.google.com/p/unhaddins/source/browse/trunk/uNhAddIns/uNhAddIns.Test/Cache/FullIntegrationFixture.cs

  7. Nice test! I guess it would be nice to be able to override the internal key used, since sometimes not the whole table needs such tolerance. But it would require a change in the NH APIs.

    Imagine this:
    var q = new DetachedQuery("from MusicCD where Author = :Author")
    .SetCacheKey("getCdsByAuthor" + cd.Author.Id);


    // And in another place, when you create a new MusicCD:
    SomeApiCache.Remove("getCdsByAuthor" + cd.Author.Id);

    That way the user would be able to decide which query needs to be invalidated. This is helpful for example if you want to invalidate the queries that you know are affected by the change you made, but not all the table.

    That happens a lot in multitenant apps, where you want to invalidate just the queries that affect the currently connected tenant.

  8. Of course I'm not suggesting an API, I'm just talking about the requirement!

  9. @Diego
    mmmm that is another thing....
    I would avoid "manual" work.
    Perhaps you are talking about "SensibleQueryCache" ;) (not implemented yet)

  10. BTW Diego... what is the use case?
    You can use the uNhAddIns list if you want to talk about the specific case.
