Inserting Geography data in SQL Server with a plain C# SqlCommand

Usually I access data with an ORM like NHibernate or Entity Framework and resort to plain T-SQL only for reporting or massive data manipulation, but yesterday I had to use some plain SQL to insert data into a table that has a geography column.


Figure 1: A table with geography data

If you use EF 5 you can simply declare the property of type DbGeography and everything is handled by Entity Framework for you.

public virtual DbGeography Posizione { get; set; }

Using this type of data is simple, because EF gives you operators to express advanced concepts (e.g. ordering results by their distance from a given coordinate), and it takes care of inserting data into and loading data from the DB.

var text = string.Format("POINT({0} {1})", currentLongitude, currentLatitude); //WKT lists longitude first
DbGeography geo = DbGeography.FromText(text, 4326);

results = query.OrderBy(c => c.Posizione.Distance(geo)).ToList();
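One thing to watch out for when building the WKT string: string.Format uses the current thread culture, so on a machine with a comma decimal separator the generated text is invalid. A minimal sketch of a safer helper (the class and method names here are mine, not part of any API):

```csharp
using System;
using System.Globalization;

static class WktHelper
{
    // Builds the WKT for a point using the invariant culture, so the decimal
    // separator is always '.'. Note that WKT lists longitude before latitude.
    public static string ToPointWkt(double longitude, double latitude)
    {
        return string.Format(CultureInfo.InvariantCulture,
            "POINT({0} {1})", longitude, latitude);
    }
}
```

WktHelper.ToPointWkt(9.18, 45.46) returns "POINT(9.18 45.46)" even when the thread culture is it-IT.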

When it is time to work with plain SQL, things become a little complicated, because the geography type in SQL Server is a CLR user-defined type. The simplest solution is to use the standard geography functions of T-SQL. Suppose you need to insert into the database a record that has a geography column, but you have the coordinates as latitude and longitude; here is the insert you need.

INSERT INTO [dbo].[myTable]
(
     [Id]
     ...
    ,[Position]
)
VALUES
(
    @Id,
    ...
    geography::STPointFromText('POINT(' + CAST(@lon AS VARCHAR(20)) + ' ' + CAST(@lat AS VARCHAR(20)) + ')', 4326)
)

The trick is using the geography::STPointFromText function to create a point from latitude and longitude (again, the WKT format lists longitude first). This is probably not the most efficient way, but it is simple and permits you to pass latitude and longitude as plain SqlParameters of type Decimal.

CurrentDatabase.AddInParameter(c, "@lat", DbType.Decimal, Entity.Latitude);
CurrentDatabase.AddInParameter(c, "@lon", DbType.Decimal, Entity.Longitude);

This accomplishes the insert using only plain T-SQL, without the need to reference any of the native geography types of SQL Server.
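Putting it all together, here is a minimal sketch of the whole insert done with a plain SqlCommand; the connection string, table name and column names are hypothetical, so adapt them to your schema.

```csharp
using System;
using System.Data;
using System.Data.SqlClient;

class Program
{
    static void Main()
    {
        // Hypothetical connection string and table.
        using (var conn = new SqlConnection("Server=.;Database=MyDb;Integrated Security=true"))
        using (var cmd = conn.CreateCommand())
        {
            cmd.CommandText =
                @"INSERT INTO [dbo].[myTable] ([Id], [Position])
                  VALUES (@Id, geography::STPointFromText(
                      'POINT(' + CAST(@lon AS VARCHAR(20)) + ' ' +
                                 CAST(@lat AS VARCHAR(20)) + ')', 4326))";
            cmd.Parameters.Add("@Id", SqlDbType.Int).Value = 1;
            // Precision/Scale set explicitly so the decimals are not truncated.
            cmd.Parameters.Add(new SqlParameter("@lat", SqlDbType.Decimal)
                { Precision = 9, Scale = 6, Value = 45.46m });
            cmd.Parameters.Add(new SqlParameter("@lon", SqlDbType.Decimal)
                { Precision = 9, Scale = 6, Value = 9.18m });

            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }
}
```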

Gian Maria.

Using SaveOrUpdate in Entity Framework

This is a common question for people that used NHibernate before using EF. The problem is: I have an object and I do not want to care whether it is a new object that needs to be added to the database for the first time, an object that needs an update, an object that is already attached to the context, etc. I want to call a single upsert method called SaveOrUpdate and let the ORM take care of the rest.

Since I use NHibernate a lot, I came up in the past with the following simple code that can help you get similar functionality in EF.

public static void SaveOrUpdate<TEntity>
    (this ObjectContext context, TEntity entity)
    where TEntity : class
{
    ObjectStateEntry stateEntry = null;
    context.ObjectStateManager
        .TryGetObjectStateEntry(entity, out stateEntry);

    var objectSet = context.CreateObjectSet<TEntity>();
    if (stateEntry == null || stateEntry.EntityKey.IsTemporary)
    {
        objectSet.AddObject(entity);
    }
    else if (stateEntry.State == EntityState.Detached)
    {
        objectSet.Attach(entity);
        context.ObjectStateManager.ChangeObjectState(entity, EntityState.Modified);
    }
}

public static ObjectContext GetObjectContext(this DbContext c)
{
    return ((IObjectContextAdapter)c).ObjectContext;
}

public static void SaveOrUpdate<TEntity>
    (this DbContext context, TEntity entity)
    where TEntity : class
{
    context.GetObjectContext().SaveOrUpdate(entity);
}

This is common code you can find around the internet; thanks to extension methods, it permits you to call SaveOrUpdate directly on your DbContext object and makes your life easier. The principle this code is based on is simple: first of all, a DbContext object can be cast to IObjectContextAdapter, which has a property called ObjectContext that gives you direct access to the real manager of the lifecycle of objects.

From that object you have access to the ObjectStateManager of the DbContext, which can be used to verify the status of the entity thanks to TryGetObjectStateEntry. If the object has no state entry, or if the state entry has an EntityKey that IsTemporary, it means that this object should be added, so you can call AddObject on the correct ObjectSet obtained from a call to CreateObjectSet<TEntity> of the ObjectContext object. If the entity is in status Detached, I decided to attach it and inform EF that the entity is in state Modified.

It is quite crude, but it works. The only problem with it is when you have inheritance: if you call SaveOrUpdate passing a derived class, you can run into the error

There are no EntitySets defined for the specified entity type xxx If xxx is a derived type, use the base type instead.
Parameter name: TEntity

This can be easily solved by specifying the base type as the type parameter in the call to SaveOrUpdate, and everything should work as expected.
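For example, with a hypothetical hierarchy where UserGold derives from a mapped base class User, the call would look like this:

```csharp
// Hypothetical hierarchy: UserGold derives from User, the mapped base type.
UserGold gold = new UserGold { UserName = "alice" };

// Wrong: throws "There are no EntitySets defined for the specified entity type".
// context.SaveOrUpdate<UserGold>(gold);

// Right: pass the base type as the type parameter.
context.SaveOrUpdate<User>(gold);
context.SaveChanges();
```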

Gian Maria.

Entity Framework Code First and TPH Inheritance mapping

The major pain when you switch between two similar tools (from NH to EF in this specific scenario) is that you need to learn again how to obtain the same results, but in a different way. Yesterday I was mapping a hierarchy of objects in Entity Framework; I had a scenario where Table Per Type was the right solution and I managed to create the fluent mapping in a really short time, but when it was time to use Table Per Hierarchy the situation got worse.

If you simply map TPH, EF wants to use a discriminator nvarchar column called “Discriminator”, where it stores the name of the class to discriminate. This is not efficient, and usually a simple integer would be a much better solution, so I started looking at how to specify a different type for the discriminator column. After a little googling I found this.

public class UserGoldMap : EntityTypeConfiguration<UserGold>
{
    public UserGoldMap()
    {
        this.Map(u => u.Requires("Discriminator").HasValue(1).HasColumnType("int"));
        //rest of the mapping
    }
}

In the map of the concrete types you can use u.Requires with the name of the discriminator column. My first concern is the naming of the API, because I could never have imagined specifying the discriminator column with a call to a method called Requires. I'm pretty sure that if I need to do such a mapping again in 10 days, I will surely have forgotten the right call, because it is anything but intuitive.

Apart from this annoying fact, once I had modified the mapping my continuous test runner was still red, but the error now was:

System.Data.MetadataException: Schema specified is not valid. Errors:

MaxLength facet isn’t allowed for properties of type int.

Probably EF puts a MaxLength constraint on the Discriminator column, because it is originally declared as nvarchar(128), and it ignores the fact that I changed the type in the mapping to an int column. The solution was trivial: just avoid using Discriminator as the column name. You can use, as an example, UserType, and everything works fine.
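The final mapping, with the hypothetical UserType column name, then becomes:

```csharp
public class UserGoldMap : EntityTypeConfiguration<UserGold>
{
    public UserGoldMap()
    {
        // "UserType" is just an example name: any name other than
        // "Discriminator" avoids the leftover MaxLength facet.
        this.Map(u => u.Requires("UserType").HasValue(1).HasColumnType("int"));
        //rest of the mapping
    }
}
```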

I must admit that EF is now a much better product than it was in the past, but in my opinion it still has not-so-meaningful names in its fluent API and some strange behaviour that can surprise you as a user.

Gian Maria.

Entity Framework error Some part of your SQL statement is nested too deeply

Today a colleague told me that he got a strange Entity Framework error

Some part of your SQL statement is nested too deeply. Rewrite the query or break it up into smaller queries

I immediately asked him what kind of huge query he was issuing to EF, but the answer was pretty simple.

Context.EntityName.Where(u => listOfId.Any(s => s.Equals(u.UserId))).ToList()

where listOfId is a List<Int32> with 100 integers.

Basically he wanted to retrieve all entities related to a list of UserIds, having the UserIds inside a simple list of integers. Even if the query looks good, it got translated into a monster SQL query. Here is a screenshot of the query (it is really too big to include as text).

Figure: the huge SQL query generated by the Any version

Actually, this probably happens because EF is not able to handle an Any operator applied to a list of integers specially; it treats it as any other ANY operator, so it basically creates a subquery for each number in the list.

The solution is quite obvious: rewrite the query.

Context.EntityName.Where(u => listOfId.Contains(u.UserId)).ToList()

This is a much simpler query: it simply expresses that we want all the entities with a UserId property contained in the list of ids. EF is now capable of translating it into much simpler SQL.
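The two predicates are logically equivalent, as you can verify with plain LINQ to Objects (the data here is made up); the difference is only in how EF translates them to SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Demo
{
    static void Main()
    {
        var listOfId = new List<int> { 1, 2, 3 };
        var userIds = new[] { 2, 5, 3 };

        // The same filter expressed in the two ways discussed above.
        var viaAny = userIds.Where(u => listOfId.Any(s => s.Equals(u))).ToList();
        var viaContains = userIds.Where(u => listOfId.Contains(u)).ToList();

        Console.WriteLine(viaAny.SequenceEqual(viaContains)); // True
    }
}
```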

Figure: the SQL query generated by the Contains version, using an IN operator

This query is what we expect, a simple query with an IN operator.

The lesson learned from this example is: try to express your query in the simplest way and always look at what EF generates, to avoid bad surprises.

Gian Maria

Entity framework and Table splitting with Code First

NHibernate supports lazy loading for single properties, and this perfectly solves the situation where a table has one or more fields with a lot of data and you absolutely need to avoid loading all that data each time you load the entity. Basically with NH you can define lazy properties, and the content of such a property is loaded from the database only if you access it.

Entity Framework does not support this feature, but it can solve this scenario with Table Splitting, as described by Julie Lerman. The problem is that Julie's example is based on the EF designer, but I want to use Code First. In this post I'll describe the steps I took to move from a standard entity mapped on a single table to two entities mapped to the same table with Table Splitting, with a Code First approach.

My scenario is: I have a table with a column that contains a really large amount of text data that is used rarely. The entity mapped on that table is used on basically every action of most of the controllers in an ASP.NET web application. The overall performance is degraded, because each time more than 1 MB of data is loaded into memory only to read a bunch of other values from that table. The database should not be refactored, and we should find a solution to load the big text data only in the rare situations where it is really needed.

This is the original EntityA with its Id, Name, Surname and BigData properties, where BigData contains lots of text that is rarely needed.

public partial class EntityA
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Surname { get; set; }
    public string BigData { get; set; }
}

public class EntityAMap : EntityTypeConfiguration<EntityA>
{
    public EntityAMap()
    {
        // Primary Key
        this.HasKey(t => t.Id);

        // Properties
        this.Property(t => t.Name)
            .HasMaxLength(30);

        this.Property(t => t.Surname)
            .HasMaxLength(30);

        // Table & Column Mappings
        this.ToTable("EntityA");
        this.Property(t => t.Id).HasColumnName("Id");
        this.Property(t => t.Name).HasColumnName("Name");
        this.Property(t => t.Surname).HasColumnName("Surname");
        this.Property(t => t.BigData).HasColumnName("BigData");
    }
}

The solution to the problem is Table Splitting, or mapping two distinct entities on the same table. The “master” entity maps all the standard fields of the table; the other one maps the data that is rarely needed (BigData in this example). These are the final entities.

public class EntityA
{
    public EntityA()
    {

    }

    public virtual int Id { get; set; }
    public virtual string Name { get; set; }
    public virtual string Surname { get; set; }

    public virtual EntityAPayload Payload
    {
        get { return _payload; }
        set { _payload = value; }
    }
    private EntityAPayload _payload;
}

public class EntityAPayload
{
    public virtual int Id { get; set; }
    public virtual string BigData { get; set; }
}

The key points are: they both have the same Id property, EntityA has a reference to EntityAPayload to establish a 1:1 relation, and finally all properties are virtual to enable lazy loading. It can sound weird, but the 1:1 relation is mandatory, and it is also clearer to comprehend. If you map two unrelated entities on the very same table, EF will give you an error telling you that you should establish a relation. This limitation is not a problem for me, because I do not like the “unrelated entities” approach; after all, the Payload is part of EntityA, it makes no sense without an EntityA, and I only want the loading to be done when needed. Mapping those two is really simple.

public class EntityAMap : EntityTypeConfiguration<EntityA>
{
    public EntityAMap()
    {
        // Primary Key
        this.HasKey(t => t.Id);

        // Properties
        this.Property(t => t.Name)
            .HasMaxLength(30);

        this.Property(t => t.Surname)
            .HasMaxLength(30);

        // Table & Column Mappings
        this.ToTable("EntityA");
        this.Property(t => t.Id).HasColumnName("Id");
        this.Property(t => t.Name).HasColumnName("Name");
        this.Property(t => t.Surname).HasColumnName("Surname");
        this.HasRequired(t => t.Payload).WithRequiredPrincipal();
    }
}

public class EntityAPayloadMap : EntityTypeConfiguration<EntityAPayload>
{
    public EntityAPayloadMap()
    {
        // Primary Key
        this.HasKey(t => t.Id);
        this.ToTable("EntityA");
        this.Property(t => t.BigData).HasColumnName("BigData");
    }
}

Both classes should have an Id property mapped to the same column, and they should point to the same table. Then it is mandatory that EntityA has a 1:1 relationship with the payload, established by HasRequired().WithRequiredPrincipal(). This is everything you need to do to enable Table Splitting, but you should be aware of a couple of issues that gave me a little headache.

You are not able to save an instance of EntityA if it has the Payload property set to null: the relationship is mandatory and you cannot avoid populating it. This leads naturally to a simple solution: populate the Payload property in the constructor (or lazy-initialize it in the getter) so you will never have an EntityA without a Payload. The net result, however, is that you cannot load the payload from the database anymore. Suppose you initialize Payload in the constructor to make sure that each EntityA always has a Payload.

    public EntityA()
    {
        _payload = new EntityAPayload();
    }

If you try to load EntityA with an Include of Payload, you will get: Multiplicity constraint violated. The role ‘EntityA_Payload_Target’ of the relationship ‘EfTableSplitting.Models.EntityA_Payload’ has multiplicity 1 or 0..1.


Figure 1: Error happens if you automatically initialize _payload in the constructor

Here is basically what happens inside EF: after issuing the query to the database, when it constructs an instance of EntityA, it finds the Payload property already populated with an Id of 0, and EF complains because it got a different id from the database; it thinks that the entity refers to two distinct payload entities, when the relation is 1:1 (they should share the very same id). This supposition is confirmed by the fact that if you load an instance of EntityA normally and access the Payload.BigData property, it is always null. This happens because you are creating an empty instance of Payload in the constructor, so EF does not populate the Payload property with the proxy object that would trigger lazy loading.

This is usually not a problem, because in DDD, or more generally in OOP, you should create factory methods to create entities in a valid state. The best approach is to make the constructor protected and create a factory method that builds an instance in a valid state.

public static EntityA CreateEntity()
{
    return new EntityA()
    {
        _payload = new EntityAPayload()
    };
}

protected EntityA()
{

}

Thanks to this code, usage of the class is really simplified, here is how you can create a couple of instances, one with payload and the other without payload.

using (TestDbContext context = new TestDbContext())
{
    entity = EntityA.CreateEntity();
    entity.Name = "Gian MAria";
    entity.Surname = "Ricci";
    entity.Payload.BigData = "Sed ut perspiciatis unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, totam rem aperiam, eaque ipsa quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt explicabo. Nemo enim ipsam voluptatem quia voluptas sit aspernatur aut odit aut fugit, sed quia consequuntur magni dolores eos qui ratione voluptatem sequi nesciunt. Neque porro quisquam est, qui dolorem ipsum quia dolor sit amet, consectetur, adipisci velit, sed quia non numquam eius modi tempora incidunt ut labore et dolore magnam aliquam quaerat voluptatem. Ut enim ad minima veniam, quis nostrum exercitationem ullam corporis suscipit laboriosam, nisi ut aliquid ex ea commodi consequatur? Quis autem vel eum iure reprehenderit qui in ea voluptate velit esse quam nihil molestiae consequatur, vel illum qui dolorem eum fugiat quo voluptas nulla pariatur?";
    context.EntityAs.Add(entity);
    context.SaveChanges();

    EntityA entity2 = EntityA.CreateEntity();
    entity2.Name = "Gian MAria2";
    entity2.Surname = "Ricci2";
    context.EntityAs.Add(entity2);
    context.SaveChanges();
}

EF is able to use the protected constructor when it rebuilds entities from database data, and this makes everything work as usual. The usage of EntityA is simple: thanks to the factory method you should never worry about forgetting to populate the Payload, because it happens automatically. Since all the properties are virtual, EF can enable lazy loading, and you are able to write the following code.

using (TestDbContext context = new TestDbContext())
{
    //now the load part
    var reloaded = context.EntityAs.Single(e => e.Id == entity.Id);
    Console.WriteLine("Entity id " + reloaded.Id);
    //lazy load.
    Console.WriteLine("Payload is:" + reloaded.Payload.BigData);
}

If you use EfProfiler or a standard SQL profiler, you can verify that this piece of code actually issues two distinct queries to the database. The first one is used to load all the properties of EntityA, and only if you access the Payload object will lazy loading fetch the BigData for you. This approach is more complex than NHibernate's lazy properties, but it gives you the very same result: lazy loading one or more columns of a table.

This is the first query issued by EF

SELECT TOP (2) [Extent1].[Id]      AS [Id],
               [Extent1].[Name]    AS [Name],
               [Extent1].[Surname] AS [Surname]
FROM   [dbo].[EntityA] AS [Extent1]
WHERE  [Extent1].[Id] = 76 /* @p__linq__0 */

When you access the Payload property, EF will issue another query to load the properties mapped in the Payload entity.

SELECT [Extent1].[Id]      AS [Id],
       [Extent1].[BigData] AS [BigData]
FROM   [dbo].[EntityA] AS [Extent1]
WHERE  [Extent1].[Id] = 76 /* @EntityKeyValue1 */

I find this approach really appealing because it explicitly suggests to the user of the entity that the Payload is somewhat separate from the standard entity. While NHibernate property lazy loading is much simpler to use (just declare the property lazy), it completely hides from the user of the entity what is happening behind the scenes. The EF approach creates a structure that is much clearer, IMHO. Thanks to this clearer structure, a user can ask EF to eager-fetch the Payload if he knows in advance that the code will use it.

var reloaded = context.EntityAs
    .Include("Payload")
    .Single(e => e.Id == entity.Id);
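As a side note, if you prefer a strongly typed version over the magic string, EF (from version 4.1 on) also offers a lambda-based Include extension method in the System.Data.Entity namespace:

```csharp
using System.Data.Entity; // brings the lambda-based Include extension into scope

var reloaded = context.EntityAs
    .Include(e => e.Payload)
    .Single(e => e.Id == entity.Id);
```

This version survives renames of the Payload property, while the string-based one fails only at runtime.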

You can in fact use .Include as for any other navigation property, ignoring that the underlying database structure actually stores everything in the same table. Here is the generated query.

SELECT [Limit1].[Id]       AS [Id],
       [Limit1].[Name]     AS [Name],
       [Limit1].[Surname]  AS [Surname],
       [Extent2].[Id]      AS [Id1],
       [Extent2].[BigData] AS [BigData]
FROM   (SELECT TOP (2) [Extent1].[Id]      AS [Id],
                       [Extent1].[Name]    AS [Name],
                       [Extent1].[Surname] AS [Surname]
        FROM   [dbo].[EntityA] AS [Extent1]
        WHERE  [Extent1].[Id] = 76 /* @p__linq__0 */) AS [Limit1]
       LEFT OUTER JOIN [dbo].[EntityA] AS [Extent2]
         ON [Limit1].[Id] = [Extent2].[Id]

This is where EF fails a little, because it generates the query as if the two entities were mapped on different tables, instead of issuing a standard query on the EntityA table. This is a minor issue, but I think it should be addressed in future versions. The important aspect is that I'm able to load everything with a single roundtrip to the database when needed.

My final conclusion is: Table Splitting in Entity Framework is a really good technique, especially for legacy databases where you cannot change the schema. Thanks to it, you can split the columns of a single database table between two distinct entities, loading the less-used data only when needed. If you are not working with a legacy database, you should consider whether a standard two-table approach is preferable, especially because you can also store the less-used table in another filegroup placed on slower disks. With the emerging trend of SSDs, it is not uncommon to have server instances with 250 GB of high-speed SSD and much more space on standard (much slower) mechanical disks. Being able to decide where to place the data is a key factor in systems where speed is critical.

Gian Maria.