Misusing an ORM

I’ve blogged some time ago that I’m starting to consider ORM an Antipattern, and recently Mr Fowler posted similar thoughts in his bliki, moreover I have the pleasure to be one of the organizer of the first RavenDB official Course in Italy, with my dear friend Mauro as teacher.

Since I’m strongly convinced that in a full OOP approach to problem objects should not have nor setter nor getter, most of the work and complexities of an ORM is simply not needed, because you usually retrieve objects from the storage with only one function GetById and nothing else. In my long experience with NHibernate, I verified that most of the problem arise when you need to show data in UI in specific format and you start to write complex Query in HQL or ICRiteria or LINQ, then you need to spend time with NHProfiler to understand if the queries are good enough to run on production system and when objects changes a little bit you need to rewrite a lot of code to suite the new Object Model. This last point is the real pain point in DDD, where you usually should create Object Model that will be manipulated a lot before reaching a good point, after all the main value of DDD approach is being able to create a dialog with a DOMAIN EXPERT and it is impossible to find a good Object Models at the first tentative. If refactoring a model become painful, you are not allowed to modify it with easy, you are going away from DDD approach.

This is where CQRS can help you, for all objects belonging to the domain you need only to Save, LoadById, Update and delete, because every read model should be defined somewhere else. In such a scenario an ORM is really useful, because if you need to store objects inside Relational Database you can leave the ORM all the work to satisfy the CRUD part, where the R is the method GetById. To start easily with this approach you can create SQL View or stored procedures for all the Read Models you need; this imply that whenever the structure of the Domain Model changes, you need only to change all affected Read Models, some view and some stored procedure, but you have no need to refactor the code.

In this situation the ORM can really helps you, because if you change the Domain Model, you should only change the mapping, or let some Mapping by convention do this for you (ConfORM for NH is an example), regenerate the database and update only affected Read Models. If your domain is really anemic, if you expose properties from objects, even only with getters, whenever you change a domain class you should answer the question “If I change this property, what other domain objects will be affected? How many service class will be affected? How many query issued from Views will be affected?”. If you are not able to create a Read Model with SQL View or stored procedure, you can write a denormalizer that listens for DOMAIN EVENTS and populate the Read Model accordingly. In my opinion this is the scenario where an ORM can really helps you.

In such a situation a NoSql database can dramatically simplify your life, because you do not need an ORM anymore, cause you are able to save object graps into the storage directly, and you can create Read Models with Map/Reduce or with denormalizers.

But sadly enough, ORM are primarily used to avoid writing SQL and persist completely anemic domain, where all the logic reside on services. In such a scenario it is easy to abuse an ORM and probably in the long term the ORM could become much more a pain than a real help.

Gian Maria.

Square peg in a round Hole

After lots of year working with NHibernate I started to think that probably the whole concept of ORM can be considered an Antipattern. Some people prefer a “classic” approach to the problem, data is the key concept and most of the logic is inside a storage based on Relational Model.

Is this wrong?

Absolutely not, after all, for many years this was the preferred way to structure your application, and it worked quite well, but after OOP has come to life, a lot of people started appreciating this new paradigm of programming and started to think in terms of “objects” instead of “data”.

This is a radical shift, because if you really possess “object thinking”, you tend to subdivide the problem in objects that  have methods instead that in table that have relations. This lead to an obvious problem, since we need to store the data somewhere and since we already have good Relational Database Engine, the simplest solution is finding a way to save Objects in Relational Storage system and an ORM is the answer.

But after years of ORM the initial enthusiasm is completely passed away and I start believing that I’m forcing a square peg in a round hole.

Since most programmers started to work in terms of data and relations (after all university taught us to think in this way), using an ORM lead usually to highly anemic domain and tend to lead people of thinking in term of objects that have state (properties) and relations with other objects instead of object that have methods and communicates with other objects (with events).

If you start designing your entities from their state (properties) it could be the sign that you moved from “tables with relations and stored procedures that operates on tables” to “objects with relations and services that operates on objects”, and you need to realize that the situation is not changed very much.

When people start to have problems with Session duration, Lazy Loading, eager fetching (or other issue related to ORM), is the sign that the square peg is not fitting in the round hole and the ORM becomes the Hammer that forces it down (usually bringing some pain with it).

This means that if you want to do OOP you should move everything to NoSql?

Absolutely not, because sometimes you will probably find yourself forcing a round peg in a square hole :). I’m starting to think that in a real, big, complex OOP project, you need to have both type of storage: Relational and Object based. This will give you  round holes and the square holes, so you can put each peg in the right place.

At least until some new technology comes to life that bring us a new Hexagonal Peg :) that will need an Hexagonal shaped data storage Smile.

Gian Maria.

Entity Framework, a super dataset?

Entity Framework is quite a good product, but in my opinion still misses some point to be called an ORM. I must admit that I never used the 4.1 Code first in real project, but there are some stuff that still does not convince me when using EF.

When I decide to use EF (and not NH)

Sometimes I need to access small legacy databases to do a small amount of operations and I must admit that adding a EF Model from Visual Studio is a matter of seconds and the LINQ provider of EF is quite good so it is a valuable alternative to create a quick DAL. I have a simple database (four tables), I created an EF model, and then issue this query.

image

Figure 1: A very simple piece of code that query database with EF

This is a very super simple piece of code, in the first line I grab an instance of Term from the database (actually I grab the first one, in real code I have a complex query that returns a specific term). Then I want to grab all instance of Verbs that are related to this term, so I issue the second query, what will be the result?

This image exceptionally capture my face the first time I ran this query.

SNAGHTML4bb74e

Figure 2: Strange error in the query.

I’m really disappointed from this exception for many reasons. First of all I do not know why it does not work, because this is a perfectly and reasonable query to issue.

Fix the query to make it works

To verify that the query is indeed ok lets modify it in this way.

image

Figure 3: Modified version of the query, now it works.

This version works simply changing the order of the Where and ToList operators. It now works because it loads the entire Verbs table in memory and then filters out with LINQ to Object. This clearly makes me sick, because this creates a big memory pressure. The second issue with the above error is the error message, that is really obscure. Unable to create a constraint value of type xxxxxxxx. If you read with caution you could eventually understand that you cannot put a condition based on equality of objects, but only primitive type.

This error could have a good reason to exists: if the Term entity overrides the Equality method, it could be impossible to create an equivalent query in SQL to mimic the same behavior by the EF LINQ provider. Suppose that the Term overrides the Equal operator comparing two entities by checking for equality of the Value and Type properties, how could the EF provider understand this behavior to issue an equivalent query to the database? I can agree with this reason, but only if the error would be something like “Could not issue a comparison between two entities in a EF query because it is impossible to mimic the semantic of equals operator.. bla bla bla”.

A real solution, but you need to make your hands dirty

But working with objects in DDD or Domain Model or even with Active Record pattern, the programmer in 99.99% of the situation, translate the condition V.Term == term with: “gives me all verbs that are connected with term”. This because we are used to the fact that a reference from Verb entity to Term entity is part of the object graph and this can easily be translated with a comparison between identity operators. Given that, you can rewrite the above query in this way.

image

Figure 3: The right way to issue the EF query

Now the query runs fine, but you can have two completely different opinion about it.

The first opinion is negative, because I need to understand the name of the property/properties used as id by the ORM. I know that by convention you can use a property named Id for every entity, but sometimes, especially with legacy database you can have entities with composite key or identity property named differently. LINQ is great to express query in the object model, but in this situation I need to be aware of details of the ORM Mapping to issue the right query and this is bad.

The second type of opinion is positive, because this query does not suffer from mismatch between the LINQ to entities counterpart when someone redefines the equals operator for the entity.

Conclusion

I tend to agree with the dislike part, because I want my ORM to permit me to express query without being aware of the details. So I tend to consider EF not a real ORM but something like a SuperDataset, it is quick to use with database first approach, it permits you to create a DAL quickly, but it forces you to know the detail of the database, like a dataset. Clearly EF has really a lot of interesting feature respect to a dataset, but it still miss something to be considered an ORM.

Alk.

Tags:

Do you ORM?

I really believe that I could not live anymore without ORM, and this is one simple reason.

Suppose in NorthWind you should query for all Customers that have at least four orders where the total discount  is greater than 100.

image

Figure1: LINQ query to select all customers that have at least three orders with total discount greater than 100

Doing this in EF with a LINQ query is really simple (Figure 1), because you can express it with condition on the object model, and you can use the Count() LINQ operator and calculate the discount with simple math. The most important part is that you could be a SQL newbie, but you are still able to create a complex query. Now suppose that you should also paginate this result server side.

image

Figure 2: Server side pagination of a complex query thanks to Count(), Skip() and Take() operators

This is SOOOOO SIMPLE, I can use Count() on the original query to take the total count of records that satisfy the query and then I can paginate with Skip() and Take() ignoring the complexity of the query. Now, verify with EF profiler what is issued to the database. This is the count part

image

Figure 3: The count part to paginate server side

pretty scaring isn’t it? And here is the query that retrieve a single page

image

Figure 4: The query generated to retrieve page number 2, you can verify how row_number() is used to paginate server side

Ok, now I can hear some DBA crying that this query is not optimized, and that probably a stored procedure with some temp table could perform faster, but consider that:

  1. I wrote this query in few seconds, it was really simple to express that condition in LINQ
  2. OK, I’m quite smart in SQL, but I could have wrote it without even know how to paginate with row_number() OVER(… SQL instruction. The concept is that a programmer that knows LINQ could actually be a NOOB in SQL and still be able to express complex query.
  3. I’m a really Oracle Newbie, I have not tried actually, but this query can be issued to an oracle database, and I completely ignore Oracle Syntax to paginate server side. The query is not bound to a specific database because it is generated by the provider. And this is probably the best reason to use an ORM.
  4. Executing against a local Northwind database shows that it executes in milliseconds so it is not an overkill query (even if Northwind database is not so big)

SNAGHTMLdbad81

Figure 5: Timing of the executed query.

Try to do this without an ORM :).

alk.