“Unsupported filter” using ContainsAny in Mongo 2.x driver

Porting code from the legacy driver to the new driver syntax is quite annoying with the .NET MongoDb driver. In the new driver almost everything has changed, and unless you want to keep using the legacy syntax, creating a mess of old and new code, you should convert everything to the new syntax.

One annoying problem is ContainsAny in the LINQ compatibility layer. With the old driver, if you have an object that contains an array of strings and you want to filter for objects holding at least one of the values in a list of allowed values, you had to resort to this syntax.

  return Containers.AllUnsorted
                .Any(c => c.PathId.Contains(containerIdString) &&
                c.Aces.ContainsAny(aceList));

In this situation the Aces property is a HashSet<String> and aceList is a simple String[]; the last part of the query uses the ContainsAny extension method from the legacy MongoDb driver. That extension was needed because the old driver had no full support for the LINQ Any syntax.

The problem with the new driver is that, after migration, the code above still compiles because it references the legacy driver, but it throws an “Unsupported Filter” exception during execution. The solution is really simple: the new driver now supports the whole LINQ Any syntax, so you should write:

return Containers.AllUnsorted
    .Any(c => c.PathId.Contains(containerIdString) &&
              c.Aces.Any(a => aceList.Contains(a)));

As you can see, you can now write the Query with standard LINQ syntax without the need to resort to ContainsAny.
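If you prefer the typed filter builders of the new driver over LINQ, the condition on Aces can probably be expressed with AnyIn as well. This is only a sketch under my own assumptions: the Container class is a hypothetical stand-in for the real document type, the helper name WithAnyAce is invented, and the field is addressed by its string name to keep the example short.

```csharp
using System.Collections.Generic;
using MongoDB.Bson;
using MongoDB.Driver;

// Hypothetical document type, standing in for the real container class.
public class Container
{
    public ObjectId Id { get; set; }
    public string PathId { get; set; }
    public HashSet<string> Aces { get; set; }
}

public static class ContainerFilters
{
    // Builds a filter matching containers whose Aces array holds
    // at least one of the allowed values ($in applied to an array field).
    public static FilterDefinition<Container> WithAnyAce(IEnumerable<string> aceList)
    {
        return Builders<Container>.Filter.AnyIn("Aces", aceList);
    }
}
```

The resulting filter can be passed to collection.Find(...) and combined with other conditions through Builders<Container>.Filter.And.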

While I really appreciate the improved LINQ support in the new driver, it is quite annoying that the old code still compiles but throws at run-time.

Gian Maria.

Change how MongoDb C# Driver serialize Guid in new driver version

We have code based on the old legacy Mongo driver that uses the MemberSerializationOptionsConvention class to serialize every Guid as a plain string. This option is really useful because saving a Guid in Bson format is often a source of confusion (CSUUID, … ) for .NET users. On the contrary, saving a Guid as a plain string solves some pain and makes your Bson much more readable.

Here is the old code that runs with the old legacy driver.

var conventions = new ConventionPack
    {
        new MemberSerializationOptionsConvention(
            typeof (Guid),
            new RepresentationSerializationOptions(BsonType.String)
            )
    };
ConventionRegistry.Register("guidstring", conventions, t => true);

In the new Driver, these classes are removed, so we need to find a different way to serialize Guid as strings. Here is a possible solution.

public static void RegisterMongoConversions(params String[] protectedAssemblies)
{
    var guidConversion = new ConventionPack();
    guidConversion.Add(new GuidAsStringRepresentationConvention(protectedAssemblies.ToList()));
    ConventionRegistry.Register("guidstring", guidConversion, t => true);
}

This method uses a custom class called GuidAsStringRepresentationConvention, which instructs the Mongo serializer to serialize Guid as plain String. The class accepts a list of strings naming the protected assemblies. The reason behind it is that we do not want this convention applied to every type loaded into memory: we want to avoid changing how Guid are serialized for types stored in assemblies we do not control.

Here is the code of the class.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using MongoDB.Bson;
using MongoDB.Bson.Serialization;
using MongoDB.Bson.Serialization.Conventions;

/// <summary>
/// A convention that allows you to set the serialization representation of guid to a simple string
/// </summary>
public class GuidAsStringRepresentationConvention : ConventionBase, IMemberMapConvention
{
    // Names of the assemblies whose types must keep their default Guid serialization.
    private List<String> protectedAssemblies;

    // constructors
    /// <summary>
    /// Initializes a new instance of the <see cref="GuidAsStringRepresentationConvention" /> class.
    /// </summary>
    public GuidAsStringRepresentationConvention(List<String> protectedAssemblies)
    {
        this.protectedAssemblies = protectedAssemblies;
    }

    /// <summary>
    /// Applies a modification to the member map.
    /// </summary>
    /// <param name="memberMap">The member map.</param>
    public void Apply(BsonMemberMap memberMap)
    {
        var memberTypeInfo = memberMap.MemberType.GetTypeInfo();
        if (memberTypeInfo == typeof(Guid))
        {
            // Skip members declared by types that live in a protected assembly.
            var declaringTypeAssembly = memberMap.ClassMap.ClassType.Assembly;
            var asmName = declaringTypeAssembly.GetName().Name;
            if (protectedAssemblies.Any(a => a.Equals(asmName, StringComparison.OrdinalIgnoreCase)))
            {
                return;
            }

            var serializer = memberMap.GetSerializer();
            var representationConfigurableSerializer = serializer as IRepresentationConfigurable;
            if (representationConfigurableSerializer != null)
            {
                BsonType _representation = BsonType.String;
                var reconfiguredSerializer = representationConfigurableSerializer.WithRepresentation(_representation);
                memberMap.SetSerializer(reconfiguredSerializer);
            }
        }
    }
}

The code is really simple: it inspects each memberMap to verify whether the member is of type Guid. If the declaring type lives in a protected assembly it simply returns, because we want to leave serialization as-is. Otherwise, it changes the representation used by the serializer to BsonType.String.

The result is that every property of type Guid declared outside the protected assemblies is now serialized as a plain string into MongoDb.
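For a type you own that has only a few Guid members, an attribute on each member should give the same result without registering a convention. A minimal sketch, where Customer is a hypothetical class used only for illustration:

```csharp
using System;
using MongoDB.Bson;
using MongoDB.Bson.Serialization.Attributes;

// Hypothetical document type used only for illustration.
public class Customer
{
    // Stored as a plain string instead of a BSON binary value.
    [BsonRepresentation(BsonType.String)]
    public Guid Id { get; set; }

    public string Name { get; set; }
}
```

The convention remains the better fit when you want the rule applied everywhere without touching each class.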

Gian Maria.

Why I’m not a great fan of LINQ query for MongoDb

I’m not a great fan of the LINQ provider in Mongo, because I think that developers who start out using only LINQ miss the best part of working with a Document Database. The usual risk is that developers always resort to LINQ queries to load-modify-save a document instead of using the powerful update operators available in Mongo.

Despite this consideration, if you need to retrieve full document content, sometimes writing a LINQ query is the simplest approach, but, as always, not every valid LINQ statement you can write can be translated to a MongoQuery. This is the case for this query.

//apply security filtering.
documentsQuery = documentsQuery
  .Where(d => d.Aces.Any(a => permittingAces.Contains(a)))
  .Where(d => !d.Aces.Any(a => denyingAces.Contains(a)));

I need to filter all documents, finding those where the Aces property (a simple HashSet<String>) contains at least one of the aces in the permittingAces list but none of the aces listed in the denyingAces collection. While this is a perfectly valid LINQ query, if you try to issue it to Mongo you get:

Any is only support for items that serialize into documents. The current serializer is StringSerializer and must implement IBsonDocumentSerializer for participation in Any queries.

You can use Any with sub-objects, but expressing an Any condition on an array of strings is not supported. To overcome this limitation, the .NET provider for MongoDb offers a convenient ContainsAny extension method to write the previous query.

documentsQuery = documentsQuery
  .Where(d => d.Aces.ContainsAny(permittingAces))
  .Where(d => !d.Aces.ContainsAny(denyingAces));

This LINQ query works perfectly, and if you are curious how it is translated to a standard MongoQuery, you can use the GetMongoQuery() method, as I’ve described in a previous post.

This simple example shows some of the limitations you can encounter using the LINQ provider in MongoDb. My suggestion is to always prefer standard MongoQuery, because it gives you much more flexibility, especially for update operations.

Another reason in the past to stay away from the LINQ provider is that the older version of the driver, still used by many people, had a really bad implementation of the Select LINQ operator, because the projection is done client side, as stated here:
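To give an idea of what I mean, here is a sketch of the same security filter written with the legacy query builders, plus an update that adds an ace server side instead of doing load-modify-save. The class and method names (AcesQueries, Visible, GrantAce) are mine, and the collection is assumed to hold plain BsonDocument instances with an Aces array field.

```csharp
using MongoDB.Bson;
using MongoDB.Driver;
using MongoDB.Driver.Builders;

public static class AcesQueries
{
    // Documents with at least one permitting ace and no denying ace.
    public static IMongoQuery Visible(string[] permittingAces, string[] denyingAces)
    {
        return Query.And(
            Query.In("Aces", new BsonArray(permittingAces)),
            Query.NotIn("Aces", new BsonArray(denyingAces)));
    }

    // Adds an ace with $addToSet, without loading the document on the client.
    public static void GrantAce(MongoCollection<BsonDocument> documents, BsonValue id, string ace)
    {
        documents.Update(Query.EQ("_id", id), Update.AddToSet("Aces", ace));
    }
}
```

The update touches only the Aces array on the server, so no other field of the document travels over the wire.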

WARNING

Select does not result in fewer fields being returned from the server. The entire document is pulled back and passed to the native Select method. Therefore, the projection is performed client side.

This is a big problem, because the whole document is always returned from the server, using more bandwidth and more resources server side. Remember that one of the standard optimizations when you query a MongoDb instance is reducing the number of fields you load from each document. If you use the old LINQ provider and rely on Select to retrieve fewer fields from the server, you are wasting your time, because you are always loading the whole document.

Gian Maria.

Start ElasticSearch in windows with a different configuration file

When you start Elasticsearch by double clicking on Elasticsearch.bat in Windows, it uses the standard config/elasticsearch.yml file contained in the installation directory. Especially for development, it is really useful to be able to start ES with a different configuration file.

Probably my googleFu is not perfect, but each time I need to find the correct option to pass to the Elasticsearch.bat batch file I’m not able to find it with the first search and I always lose some time, which probably means that this information is not indexed perfectly.

If you are interested, the configuration option is called -Des.config and lets you specify the config file used to start your ES node.

elasticsearch.bat -Des.config=Z:\xxxx\config\elasticsearch1.yml

You can now create as many config files as you need, and simply create multiple links to the original bat file, each with a different config file, to start ES with your preferred options.

Gian Maria.

Mongo compression with Wired Tiger Engine

With version 3.0 of the Mongo database, the most welcome feature was the introduction of pluggable storage engines. This implies that we are no longer forced to use the standard MMAPv1 storage system, but can store data on our filesystem in other ways. The first and official alternative storage engine is Wired Tiger.

One of the most interesting aspects of Wired Tiger is data compression, a feature that can reduce the space of your database on disk, and that is especially effective since Mongo stores documents as BSON, where most of the data is text. Wired Tiger has three options for compression: none, snappy and zlib, but even with no compression, the space occupied by your database on disk is usually lower than with MMAPv1. Here is a simple and quick test done on a customer database.

  • MMAPv1: 3,250,453 KB
  • WiredTiger no compression: 1,219,696 KB
  • WiredTiger snappy: 603,674 KB
  • WiredTiger zlib: 466,548 KB

This particular database is full of text, which explains why Wired Tiger is so superior in terms of space occupied, but the gain is really impressive. The version with Snappy compression takes less than a fifth of the space used by MMAPv1 (about 19%), and with less disk space occupied there is less disk I/O activity to read data. The further gain you obtain with zlib (about 14% of the original size) comes at the cost of more CPU usage, and you need to measure to understand whether it is worth it in your deployment.

The major drawbacks of the Wired Tiger engine are that RoboMongo, one of the most interesting UIs to access Mongo, does not work because it still uses an old version of the shell, and that there is no automatic migration from MMAPv1 to Wired Tiger (you need to do a backup, then change storage engine, and restore).
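If you want to try a specific compressor, the engine and the block compressor can be selected in the mongod configuration file. The fragment below is only a sketch: the dbPath is a placeholder of mine, and as far as I know the compressor setting applies only to collections created after the change.

```yaml
storage:
  # Placeholder path: point it to an empty data directory.
  dbPath: Z:\data\wiredtiger
  engine: wiredTiger
  wiredTiger:
    collectionConfig:
      # Accepted values: none, snappy, zlib
      blockCompressor: zlib
```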

Gian Maria