Manage Trees with entity framework

Quite often you need to store in database Hierarchical structure that are logically represented by a tree. There are a lot of techniques around there, but one of the most common is using a simple foreign key that refers to the same table, as in the following example

image

If you map this table with entity framework you will obtain automatically a tree structure, you only need to rename the Relationship field because they are named Employee and Employee1, I renamed them in Parent and Childs , because the resulting in-memory structure is clearer

image

Now I create a simple routine to print the tree in console, the main problem here is that for every object we need to issue a Load() call to the Childs EntityCollection to load the childs, thus to print the whole tree we issue N select, where N is the number of the node of the tree.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
public static void Print(Employee employee, Int32 level)
{
   Console.WriteLine("{0}{1}", new String('-', level), employee.Name);
   if (!employee.Childs.IsLoaded)
   {
      employee.Childs.Load();
   }
   foreach (Employee child in employee.Childs)
   {
      Print(child, level + 1);
   }
}

This permits me to load the entity with Parent = null (The root) and print every object in the tree.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
public static void Main()
{
   Console.WriteLine("Test");
   using (TestEntities context = new TestEntities())
   {
      Employee root = context.Employee
        .Where(e => e.Parent == null).First();
      Print(root, 0);
   }
}

The output is good, Entity Framework rebuild the whole tree in memory.

1
2
3
4
5
6
Test
Alkampfer
-Guardian
--Clark
--John
-Joe

The first problem is the need to issue N select, this is an overkill for performance of big trees. Updating it is really simple, you only need to change the Parent node

1
2
3
4
5
Employee Clark = context.Employee
  .Where(e => e.Name == "Clark").First();
Employee Joe = context.Employee
  .Where(e => e.Name == "Joe").First();
Clark.Parent = Joe;

Deleting a node is more problematic, because if you delete a node that contains childs you will gets an error, because of violation of foreign key, you need to delete nodes one by one from the node you want to delete to every child so it is better to build a simple function to visit a subtree

1
2
3
4
5
6
7
8
public static void Visit(Employee employee, Action<Employee> visitAction)
{
   visitAction(employee);
   if (!employee.Childs.IsLoaded)
      employee.Childs.Load();
   foreach (Employee child in employee.Childs)
      Visit(child, visitAction);
}

Then use it to find a node and all its descendants and finally issue delete statement as in the following snippet that deletes the whole tree

1
2
3
4
5
6
7
8
9
using (TestEntities context = new TestEntities())
{
   Employee root = context.Employee
     .Where(e => e.Parent == null).First();
   List<Employee> nodes = new List<Employee>();
   Visit(root, nodes.Add);
   nodes.ForEach(context.DeleteObject);
   context.SaveChanges();
}

But again, this snippet will execute N Delete queries, so it does not perform well.

The conclusion is that with EF managing a tree is quite simple, but performances can suffer of an excessive number of query issued to the database.

Alk.

Tags: Entity Framework .NET Framework.