Use Sql Server Query Hints with NHibernate HQL and ICriteria

When you work with Big databases with many records and not uniform distribution of data into columns used for join or where conditions, you can have really bad performance problem due to Query Plan caching. I do not want to give a deep explanation of this problem, you can find information here, but I want to outline the general problem to make clear what is happening.

We can argue a lot why this problem happens in a database engine, but basically I have a really bad situation where the db stores data of multiple customers with really non uniform distribution of data (some of them have ~1000 rows for each table, others have ~100.000 rows in some of the tables).

The problem is the following one: the software issue a query (two tables in join) filtered with a customer Id that has few rows on these tables; SQL-Server creates a plan where tables are joined with nested loops and the query executes in 2 milliseconds.


Figure 1: Query with a value of parameter that filters for a customer that has ~100 rows in the two tables used in the join (elapsed time = 2ms).

As you can see the execution time is almost zero, now I execute the very same query, changing the value of the parameter to filter records for a customer that has nearly 100.000 records, since the query is the same, sql server uses the same execution plan.


Figure 2: Query gets executed filtering for a customer that has ~100.000 rows in the two table that joins together (elapsed time = 2824 ms).

The query is three order of magnitude slower (2.824 ms vs 2 ms) and for query that are more complex (three tables in join), sometimes the execution time is more than 5 minutes and the software gives a TimeoutException. As you can see from Figure 2, the gray arrow from the bottom table is really wider respect of the arrows of Figure 1, because this customer has more rows in database. If you issue a DBCC FREEPROCCACHE command to clear all cached query plans and execute again the query of Figure 2, you got an execution time of 28 ms (100 times faster than before XD).


Figure 3: Join is done with an hash match, a technique that is more efficient for joining two tables with a lot of rows. (elapsed time = 28 ms).

In figure 3 you can see that if you clear the query plan cache and execute again the query, SQL Server check the table, and since this customer has a lot of rows it uses HASH MATCH join and not a nested loop. Now if you execute the query for the customer used in Figure 1 that has few rows, the execution time is 28 ms, higher than 2 ms because an hash match is less efficient for few rows.

This lead to a nasty problem due to order of execution of the query: if I issue the same query, first for the customer with a lot of rows and then for the customer with few rows, I got an execution time of 28 ms for each query, if I reverse the order of the query I got 2 ms and 2800 ms thus the system is much slower . This happens because the Hash Match is not good for few number of rows, (28 ms against 2 ms), but the loss of performance when we have a small set of record is not so bad; at the contrary, the Nested Loop is a KILLER for big resultset and this can even cause execution Timeout. Having such a problem in production is like sitting on a bomb ready to explode. (A customer calls telling us that the software is completely unresponsive, you verify that this is true, but for other customers everything is quite good :(


The only solution to this approach is using the OPTION (NORECOMPILE) for all queries that present this problem, or you can instruct the query governor to prefer some join option (using OPTION (HASH JOIN) for example) if you already know that this is the right execution plan for all data distribution. All these techniques are called “query hint” and are the only way to solve bad performance problem of parameterized query when you have non uniform data distribution.

Now the problem seems solved, but, wait!!!!!, most of the queries are issued by nhibernate, and I need to find a way to add query hints to nhibernate query, a task that is not supported natively by NH. Solving this problem is a two phase process, first of all you need to find a way to insert text into nhibernate generated SQL, a task that can easily solved by an interceptor.

   1: [Serializable]

   2: public class QueryHintInterceptor : EmptyInterceptor

   3: {

   4:     internal const string QueryHintHashJoinCommentString = "queryhint-option-hash";

   5:     internal const string QueryHintRecompileCommentString = "queryhint-recompile";


   7:     public override NHibernate.SqlCommand.SqlString OnPrepareStatement(NHibernate.SqlCommand.SqlString sql)

   8:     {

   9:         if (sql.ToString().Contains(QueryHintHashJoinCommentString))

  10:         {

  11:             return sql.Insert(sql.Length, " option(HASH JOIN)");

  12:         }

  13:         if (sql.ToString().Contains(QueryHintRecompileCommentString))

  14:         {

  15:             return sql.Insert(sql.Length, " option(RECOMPILE)");

  16:         }

  17:         return base.OnPrepareStatement(sql);

  18:     }

  19: }

As you can see the code is really simple, the interceptor inherits from EmptyInterceptor and override the OnPrepareStatement(), adding the right Query Hint to the end of the query. This is probably not 100% production ready code, because I’m not 100% sure that for complex query, inserting the hint at the end of the query is the right choice, but for my specific problem is enough and it is a good starting point.

If you look at the code you can verify that I’m checking for certain string in query text to add the appropriate hint, but how can you add these strings to the query to enable query hint to be generated? The answer is “comments”. First of all I add this interceptor to NH configuration, so it got attached to every session.

   1: cfg.SetInterceptor(new QueryHintInterceptor());

But I also enable comments in SQL code in NH configuration setting the property use_sql_comments to true, now I only need a couple of extension methods like this one.

   1: public static IQuery QueryHintRecompile(this IQuery query)

   2:  {

   3:      query.SetComment(QueryHintInterceptor.QueryHintRecompileCommentString);

   4:      return query;

   5:  }

That enables me to write this HQL query.

   1: session.CreateQuery("Select y from LinkResult y where ClieId = :param")

   2:     .QueryHintRecompile()

   3:     .SetInt32("param", 11)

   4:     .List();

As you can see I have now the QueryHintRecompile() method that permits me to specify for each single HQL Query (you can add similar extension method to ICriteria) the query hint to use. Running this query output this query.

23-07-2011 10-36-27

Figure 4: The query with the query hint specified by the QueryHintRecompile()

Thanks to incredible number of extension points offered by NHibernate, adding Query Hint to queries is really a breeze.



NHibernate ICriteria and filters

I have a project where I need to access tables with millions of rows, the project was born without nhibernate so the database was not designed with ORM in mind. I decided to improve performances with an Indexed view that materialized a join between five tables, wrapped with another view that does a left outer joins with other 3 tables. Then I created an entity called XXXDto, and mapped with nhibernate, with update=”false” to prevent updating and pointed it to this view.

Everything runs fine, but now I have a little new feature to implement; that entity needs another property called Bookmark, and the value is specific of each user. In the software the user can search the aforementioned entity with some criteria, then he bookmarks the ones he likes most, and the software must signal bookmarked ones with a different image in the web ui. The problem is that bookmark table was already present in the database and has this schema.


This table works this way, if a user put a bookmark on an entity, a record with username and link_id (the foreign key to other table) is inserted, if the bookmark is removed the record is removed. Now I want to avoid the N+1 select, so I decided not to issue another query for each one of the original links, moreover I used ICriteria to recover entities, since the user can set complex filter through the UI and ICriteria makes really simple to compose the query.

Thanks to nhibernate I can use a formula in the mapping, but I need to put a condition on that formula, here is how I solved the problem.

formula="(select count(*) from Bookmark B where B.bookmark_link_id = Id and B.bookmark_user = :UserContext.UserName)" />

The formula is based on the fact that Bookmark is a boolean field, so if I issue a select count(*) everything is ok because if the bookmark is present the value is 1 if it is not present the value is 0, suitable for a boolean field. The condition is that bookmark_link_id = Id where id is the id column of the entity, and the user condition is B.Bookmark_User = :UserContext.UserName. This instruction uses a filter, a global parameter supported by nhibernate, that permits me to put parameters in mappings. Now I need to define this filter in a mapping.

<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2">
    <filter-def name='UserContext'>
        <filter-param name='UserName' type='System.String'/>

Now that the filter is defined I need to set the appropriate value in code before issuing the query. Since I wrapped the query in a Query object specific of that project I have a similar code that translate my Query object to a ICriteria + Session in nhibernate

foreach (Query.Filter filter in myQuery.Filters)
    IFilter nhibFilter = _session.EnableFilter(filter.Name);
    foreach (KeyValuePair<string, object> parameter in filter.Parameters)
        nhibFilter.SetParameter(parameter.Key, parameter.Value);

To create a filter you can simply call EnableFilter() method on a session, then you set specific filter on the IFilter object returned from the EnableFilter(). Here is the query generated by this mapping.

select top 20 this_.Id as Id31_0_, 
this_.Name, ...., 
(select count(*) from Bookmark B where B.bookmark_link_id = this_.Id and B.bookmark_user = @p0) as formula1_0_ FRom ..... 

The only drawback is that in formula mapping you must use the sql name of the column involved (in that example I need to use bookmark_link_id = Id where id is the real column name that contains id), but the query is correct and everything works perfectly.