Add a cache and… you’ve created a security bug

This is the story of a nasty bug that happened today. There is a service with a method called GetCustomers(), used both from a web application and from a Windows application. Since the landing web page uses this method to show all customer data, and since the Customers entity changes rarely, we decided to put the result of the service call in Cache with a one hour absolute expiration.

The web interface is made in WebForms, so we used an ObjectDataSource to communicate with the service; here is the code.

    String cachekey = "34F56C51-1941-480A-9801-70C6B1E31DF0";
    IList<CustomerDto> result;
    if ((result = HttpContext.Current.Cache.Get(cachekey) as IList<CustomerDto>) == null)
    {
       //result is not in cache, get it from the service and add it to the cache
       result = ItemService.GetCustomers();
       HttpContext.Current.Cache.Insert(cachekey, result, null, DateTime.Now.AddMinutes(60), Cache.NoSlidingExpiration);
    }

This code works: it uses a unique GUID as cache key because GetCustomers() has no parameters, so we simply check whether the result is in the cache and, if it is not, call the service and put the result in the cache.

Everything seems ok… or not?

The question is: is it correct to store the result in cache, given that we know we can wait one hour to see updated data?


The correct answer is: you cannot answer this question without knowing the implementation of the GetCustomers() service method. What we need to know is whether the method returns the same result every time it is called with the same parameters, or whether it also depends on some other data of the environment.

This code worked fine, until requirements changed. A new requirement introduced a new role in the system (called SimpleUser); users belonging to this role are allowed to see only a subset of all Customers, and this subset is managed by the Administrator role. This functionality was implemented inside the GetCustomers() service method: if the current user belongs to the new role, the service issues a query on the Customers table joined to the AllowedCustomers table on the current user name, returning only the allowed customers… and this introduced a bug.
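To make the dependency explicit, here is a purely hypothetical sketch of what the service method could look like after the new requirement (the repository class and its methods are invented for the example, not the real implementation). The important point is that the result now depends on the identity of the caller, not only on the method parameters.

    using System.Collections.Generic;
    using System.Security.Principal;
    using System.Threading;

    public static class ItemService
    {
        public static IList<CustomerDto> GetCustomers()
        {
            IPrincipal principal = Thread.CurrentPrincipal;
            if (principal.IsInRole("SimpleUser"))
            {
                //a SimpleUser only sees the customers an administrator assigned to him:
                //conceptually, Customers joined to AllowedCustomers on the current user name
                return CustomerRepository.GetCustomersAllowedFor(principal.Identity.Name);
            }
            //every other role sees all customers
            return CustomerRepository.GetAllCustomers();
        }
    }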

Do you spot the problem???

Suppose this scenario: an administrator enters the system, the data source issues the call to the GetCustomers() service method and puts the data in the cache. Now a SimpleUser logs into the system, the data source object finds the result of GetCustomers() in the cache and simply returns it… showing all customers. Now, if the user chooses a customer he is not associated with, the system crashes with a SecurityException, but worst of all he sees data of all Customers, even those he has no access to…

The obvious solution is to add the name of the logged-in user to the cache key when the user belongs to the SimpleUser role, but this bug reminded me how difficult it really is to create a good cache strategy.
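A minimal sketch of that fix inside the data source code shown above (the role name and the use of HttpContext.Current.User are assumptions about how the application checks roles): when the caller belongs to the SimpleUser role, the user name becomes part of the cache key, so every restricted user gets his own cached list while administrators keep sharing the common entry.

    String baseCacheKey = "34F56C51-1941-480A-9801-70C6B1E31DF0";
    String cachekey = baseCacheKey;
    if (HttpContext.Current.User.IsInRole("SimpleUser"))
    {
       //a SimpleUser sees a filtered list, so his entry must not be shared with other users
       cachekey = baseCacheKey + "_" + HttpContext.Current.User.Identity.Name;
    }
    IList<CustomerDto> result;
    if ((result = HttpContext.Current.Cache.Get(cachekey) as IList<CustomerDto>) == null)
    {
       //result is not in cache for this key, get it from the service and add it to the cache
       result = ItemService.GetCustomers();
       HttpContext.Current.Cache.Insert(cachekey, result, null, DateTime.Now.AddMinutes(60), Cache.NoSlidingExpiration);
    }

More generally, every input that influences the result of the cached call, including the identity of the caller, should be reflected in the cache key.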


Read and learn to use a tool instead of Try and learn :)

I’m a real Blend noob :) and I learned it without reading or watching any tutorial, I just opened Blend and began to use it… this is not good :) because I missed some basic features, for example that the grid can be put in Canvas or Grid mode…


Figure 1: The grid is in Canvas mode, as the upper-left icon shows, but I did not know of this feature… too bad

Now I want to change a column width, but when I drag the column boundary… here is what happens


Figure 2: The textboxes stay the same even if I resize the column; Blend adds margins for me to make the UI look the same after the column resize.

I was frustrated at having to change all the margins manually, and after one day I discovered that by clicking that upper-left icon the grid can be put in Grid mode, and now resizing columns behaves as I expected.

The lesson is: take some online lessons or read a book before using a new tool, because try and learn can really slow down productivity and make you miss some important functionality :)


Brian Harry programming practices

I’m excited about Brian Harry’s plan to write a series of posts about programming practices on his blog. The main reason is that Brian has written tons of really good code, and it is really interesting to know how he organizes his projects. From this first post I strongly agree with a couple of concepts; the first is having clean, working code at regular intervals, no longer than a few hours apart.

I made this mistake in the past, and sometimes still make it in the present: I begin coding a super new feature, with a lot of code, and after some hours I have nothing working because I’m still writing infrastructure code, etc. This is sometimes due to the fact that I’d like to start with beautiful and well-architected code, but the risk is getting lost in development. Quite often it is better to start with a “quick and dirty” functional prototype that gets you straight to the point, then refactor to a cleaner solution if the prototype is going well. This spares you the risk of working for a couple of days just to find out that you took the wrong solution path. Proceeding step by step is the best way to reach a working solution.

This means that sometimes a “similar to TDD” approach is my choice to develop some brand new piece of code I know nothing about. I like TDD, but I do not take it as a religious way of writing software, so I agree in part with Brian’s post about TDD. One of the points I agree with most is the time spent on test refactoring while you are refactoring code: if you test every method and every single property of each class, when you refactor you need to spend time refactoring the tests, or removing them :(. This is why I tend not to write too granular tests, and this means that I’m not doing full TDD :). Another risk of TDD is that it sometimes leads to “bad code”, because the only goal for the developer is having all tests green, and sometimes that leads to a lot of special cases.
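To give an idea of what “not too granular” means here, a hypothetical NUnit-style sketch (DiscountCalculator and its members are invented for the example): a single behavior-level test survives a refactoring of the class internals, while one test per property would have to be rewritten or deleted.

    using NUnit.Framework;

    [TestFixture]
    public class DiscountCalculatorTests
    {
        //one behavior-level test: it verifies the result the caller cares about,
        //so it keeps passing even if internal properties are refactored away
        [Test]
        public void Discount_is_applied_to_order_amount()
        {
            DiscountCalculator calculator = new DiscountCalculator(10);

            decimal discounted = calculator.Apply(200m);

            Assert.AreEqual(180m, discounted);
        }

        //what this style avoids is one test for every getter and setter,
        //because those tests must be touched at every refactoring
    }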

So I resort to this similar-to-TDD approach only on small parts of my projects, mainly when:

  • I need to use an API I have never used before
  • I’m working on a complex algorithm
  • I depend on external data, and I know in advance that the data will be dirty

There are other situations where I like to use TDD, but these three are the most important ones. Using TDD in the first case is a way to gain confidence with the new API while testing how it works. In the second scenario I can tackle the algorithm’s complexity one piece at a time, and having a good test suite helps me gain confidence in the code. In the third situation I need a lot of tests to verify the external data that flows into my software: I have had experiences where I sent an XSD schema to specify how I expect the data, and I got back pieces of data that did not validate, and sometimes were not even valid XML… and worse, I had to make the corrections myself because I could not rely on others to fix them. Having a test suite that helps me face one problem at a time is invaluable.
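For the third scenario, one simple way to catch dirty data early is a test helper that validates the incoming XML against the agreed XSD, so bad data fails in the test suite instead of deep inside the application. A minimal sketch using the standard .NET validating reader (class name and file paths are placeholders):

    using System.Collections.Generic;
    using System.Xml;
    using System.Xml.Schema;

    public static class ExternalDataValidator
    {
        //returns the validation errors found while reading the document;
        //an empty list means the data conforms to the agreed schema
        public static IList<string> Validate(string xmlPath, string xsdPath)
        {
            List<string> errors = new List<string>();

            XmlReaderSettings settings = new XmlReaderSettings();
            settings.ValidationType = ValidationType.Schema;
            settings.Schemas.Add(null, xsdPath); //null = take the target namespace from the schema itself
            settings.ValidationEventHandler += (sender, e) => errors.Add(e.Message);

            using (XmlReader reader = XmlReader.Create(xmlPath, settings))
            {
                while (reader.Read()) { } //reading the whole document triggers validation
            }

            return errors;
        }
    }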

But clearly the sentence of Brian’s I agree with most is “Step through everything in the debugger”. I really step through every line at least once in the debugger, just to verify that the code I wrote behaves the way I think it should :) (often the two do not coincide). Believing that TDD saves you from going into the debugger is mostly a myth: surely TDD and unit testing reduce the time you need to spend in the debugger, but they cannot eliminate it.



Have you backed up your data today?

I was reading this post, and I really agree with Jeff: every day is International Backup Awareness Day.

People usually learn the importance of a good backup the first time they suffer a big data loss, or when they see others suffering one. I remember, a looong time ago, when I worked in a computer shop in my city, one day a person came in with his computer and told us “it does not power up”. I checked the machine and the hard disk was completely gone, it did not even spin, so I told him: “the HD is gone, all the data is lost”. I remember this 50-year-old man almost crying to me: “I have 5 years of work inside that HD, tons of AutoCAD projects, I need that data”.

That episode made me understand the importance of data, so every day I find myself thinking: what happens if my internal HD fries because of a catastrophic power supply voltage spike? What happens if a super magnet erases all the data on every HD in my home? And so on.

I also learned the importance of verifying the restore procedure: nothing is more dangerous than a false sense of security. You think that you have a good backup, and when you need it…

you find that your backup is not so good, and you have lost data, or you are not able to recover all of it.

Another important consideration is that you cannot rely on others to back up your data. The data is YOURS and it is YOUR duty to back it up, especially if it is stored with some internet provider that gives you 5 GB of space for 50$ a year.

For my blog I do regular database backups, I periodically download the whole site via ftp so I do not lose images, and I restore everything on a virtual machine of mine and verify that the restored blog is ok. This gives me good confidence that if my provider completely loses all my data, I am able to restore everything from the latest backup. For each backup, the data of my blog is stored in multiple places:

1) on my host

2) on my external backup disk

3) on a virtual machine that resides on an internal disk.
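As an idea of what the ftp download step can look like, here is a minimal C# sketch that downloads a single file with FtpWebRequest (host, credentials and paths are placeholders; a real site backup would walk every remote directory and download each file):

    using System.IO;
    using System.Net;

    public static class BlogBackup
    {
        //downloads one remote file to a local path; a full backup would enumerate
        //the remote directories and call this method for every file
        public static void DownloadFile(string ftpUrl, string user, string password, string localPath)
        {
            FtpWebRequest request = (FtpWebRequest)WebRequest.Create(ftpUrl);
            request.Method = WebRequestMethods.Ftp.DownloadFile;
            request.Credentials = new NetworkCredential(user, password);

            using (FtpWebResponse response = (FtpWebResponse)request.GetResponse())
            using (Stream ftpStream = response.GetResponseStream())
            using (FileStream fileStream = File.Create(localPath))
            {
                byte[] buffer = new byte[8192];
                int read;
                while ((read = ftpStream.Read(buffer, 0, buffer.Length)) > 0)
                {
                    fileStream.Write(buffer, 0, read);
                }
            }
        }
    }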