Unable to Sysprep Windows 10 due to Candy Crush …

I was trying to sysprep a Windows 10 virtual machine hosted in Hyper-V, but I got error messages like

Package CandyCrush.. was installed for a user, but not provisioned …

It turns out that the standard Windows 10 installer installs some applications from the Store that conflict with sysprep. Now I need to uninstall them one by one, and to speed up the process I suggest using the Get-AppxPackage PowerShell cmdlet.

As an example, to uninstall every application whose name contains Candy you can issue

Get-AppxPackage -AllUsers *Candy* | Remove-AppxPackage

After you have uninstalled all the unwanted applications you should be able to sysprep your Windows 10 machine.

Gian Maria.

Retrieve image in Work Item Description with TFS API

When you try to export the content of a Work Item from Azure DevOps (online or on-premise server) you need to deal with external images that are referenced in the HTML fields of the Work Item. I’ve dealt with this subject in the past, showing how you can retrieve images with the Store and the Attachment property of the Work Item.

Sadly enough, I’ve encountered situations with the on-premise version of TFS where I found this type of image src inside HTML fields.

https://nameoftfsserver/NameOfCollection/WorkItemTracking/v1.0/AttachFileHandler.ashx?FileNameGuid=AB6fc2e0-c449-4090-ab98-fac6c87fc219&FileName=temp1554203067610.png 

As you can see the image is stored inside TFS as an attachment, because it is served by the AttachFileHandler, but you do not find any information about this image in the Attachments property of the Work Item.

TFS / Azure DevOps has different techniques to attach images to the Work Item description, and the image file is not always a real Attachment of the Work Item.

This format has no FileId to retrieve the image with the Store interface, nor is the image present in the Attachments collection of the current Work Item, so we need to download it with a standard WebClient instead of relying on some specific API call. The problem is authentication, because if you simply try to use a WebClient to download the previous url you will get a 401.

The solution is to populate the Credentials property of the WebClient with the credentials currently used to connect to TFS; in my situation I keep this value in a helper class called ConnectionManager.


Figure 1: How to correctly retrieve an image attached to a work item.

The code is really simple: just get the credentials from the ConnectionManager, generate a temp file name and use a WebClient to download the image content.

The TfsTeamProjectCollection class, once connected to a TFS / Azure DevOps instance, exposes the credentials used for the connection in its Credentials property. These can be used to make standard requests to the server with a WebClient.

The GetCredentials() method of the ConnectionManager is really simple because, after the connection is established, the instance of the TfsTeamProjectCollection class has a Credentials property that contains the credentials used to connect to the server.
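Here is a minimal sketch of such a helper. ConnectionManager is just my own helper class, so the names and the constructor shown here are illustrative assumptions; the only relevant part is the Credentials property exposed by TfsTeamProjectCollection.

using System;
using System.Net;
using Microsoft.TeamFoundation.Client;

public class ConnectionManager
{
    private readonly TfsTeamProjectCollection _collection;

    public ConnectionManager(Uri collectionUri)
    {
        // Connect and authenticate against TFS / Azure DevOps.
        _collection = new TfsTeamProjectCollection(collectionUri);
        _collection.EnsureAuthenticated();
    }

    // Credentials used for the current connection; they can be reused
    // for plain HTTP requests against the same server.
    public ICredentials GetCredentials()
    {
        return _collection.Credentials;
    }
}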

Armed with the correct credentials, we can use the standard WebClient class to download images from the server.
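As a hedged sketch (not necessarily my exact production code), the download can look like this; imageUrl is the AttachFileHandler.ashx address found in the HTML field and the helper name is illustrative.

using System;
using System.IO;
using System.Net;

public static string DownloadWorkItemImage(Uri imageUrl, ICredentials credentials)
{
    using (var client = new WebClient())
    {
        // Reuse the credentials of the TfsTeamProjectCollection:
        // without them the server answers with a 401.
        client.Credentials = credentials;

        // Generate a temp file name, keeping a .png extension so the
        // downloaded image can be opened or embedded later.
        string tempFile = Path.ChangeExtension(Path.GetTempFileName(), ".png");

        client.DownloadFile(imageUrl, tempFile);
        return tempFile;
    }
}

Usage is simply DownloadWorkItemImage(new Uri(srcFromHtmlField), connectionManager.GetCredentials()).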

Gian Maria

How to delete content in Azure DevOps wiki

Today I got a simple but interesting question about Azure DevOps: how can I completely delete the content of the wiki? There are not many reasons to do this, but sometimes you really want to start from scratch. Now suppose you have your wiki:


Figure 1: Wiki with a simple page

You have created some pages, you played a little bit with the wiki, you attached some cute pet photos and content, maybe just to gain familiarity with the tool.


Figure 2: Wiki with some content on it.

Now you want to delete everything, so that no member of the team is able to retrieve pages and content anymore.

An Azure DevOps Wiki is nothing more than a Git repository with Markdown content, so you can directly manipulate the git repository if you need to alter wiki history.

To do a low level manipulation of the wiki, you should simply clone the wiki repository locally; you can find the repository url in the UI.


Figure 3: Clone wiki repository from the UI.

That menu option simply lets you grab the url of the repository; then you can clone the repository locally and inspect all the commits done in the wiki (I use the command line but you can use any UI of your choice).


Figure 4: Content of the wiki, a simple git repository

If you look at Figure 4 you can notice that the wiki is nothing more than a git repository with a commit for each modification you made to the wiki. Now, if you really want to reset everything and start the wiki from scratch, you can simply issue a

git reset --hard SHA_OF_FIRST_COMMIT

where SHA_OF_FIRST_COMMIT is the hash of the very first commit, the one with the comment Initializing wiki, in my example 86ec4c9. After the command is executed your local wikiMaster branch points to the very first commit of the repository, an empty wiki.


Figure 5: Your local wikiMaster branch was reset to the very first commit, so wikiMaster now points to an empty wiki

Now you can simply push with the --force option to reset the remote branch to that very same commit.

git push --force

Open the wiki page again to verify that it has reverted to the original version. Actually the server still has the previous commits in its database, but they are not reachable anymore and they will be deleted over time by internal garbage collection.

Resetting to the very first commit actually deletes everything from the wiki, restoring it to its pristine state.

This scenario is not really common, but a really common one is when you mistakenly write something in the wiki, save the page and then want to delete what you have written. There are lots of reasons for this requirement: you mistakenly inserted sensitive data like passwords or tokens, or you simply wrote something that you want to permanently delete.

Looking at Figure 4, suppose you pasted a wrong image and you want to remove that image and all related content from the history of the page. If you simply edit the wiki page, remove the image and save the page again, the data is still in the history and anyone can find the content you wanted to remove. The only solution is to rewrite git history.

Since a Wiki is a git repository, everything you did remains in the history of the page; if you included sensitive information, editing the page, removing that information and saving again is not enough.

From Figure 4 you can verify that the offending commit is 97e520e. Following my previous example you could simply reset everything to the previous commit, actually deleting every content that was inserted after that commit.

git reset --hard 97e520e^

The special char ^ indicates the first parent of a commit, so the previous instruction tells git to reset to the parent of the bad commit. After this operation a git push --force will reset the branch on the server. The offending content is now gone, along with every content that was inserted after it. Actually you restored the wiki content to a past point in time.

A git reset --hard in your wiki repository allows you to restore the Wiki to a point in time, but everything that happened after that moment will be lost.

This is not a perfect approach: suppose you realize that someone stored a password in the wiki some days ago; you do not want to lose everything, but simply remove that specific content, leaving the other commits unchanged. Thanks to git flexibility you can obtain this with an interactive rebase.

git rebase 97e520e^ -i

This will actually trigger a complete rewrite of the history from the parent of the offending commit to the last commit of the wiki. I’m not going to give you a complete explanation of an interactive rebase, but basically you are presented with the list of all commits, from the commit you want to delete to the latest commit in the branch.


Figure 6: Delete the commit with interactive rebase.

In Figure 6 you see an example in which there is a single commit after the one I want to remove, but nothing changes if you have tons of commits after it. You simply need to change the command for the first commit (the commit you want to delete) from pick to d (drop). Leave all the other rows unchanged, then save the script to continue (if you are not familiar with VIM simply press i to edit the file, change it, then press ESC to come back to command mode and type :wq followed by ENTER).

This actually deletes only the commit you want to remove, leaving all the following commits unchanged. You have surgically removed a single bad save from your wiki.


Figure 7: Commit was removed, the local branch no longer contains commit 97e520e

Now, provided you are 100% sure that no one else modified the wiki in the short timespan you needed to clone and rebase the repository, you can issue a git push --force to overwrite the content of the repository on the Azure DevOps instance.

A git interactive rebase is an operation where you rewrite history, so you can selectively remove a single commit from it.

This preserves all the other content of the wiki: you only removed a single commit from it, and there is no more history of that commit inside the Wiki (actually the deleted commit still exists on the server but is unreachable, so there is no way for others to retrieve it).

If you want to completely remove a page with all its history, you need to delete multiple commits, but luckily git has filter-branch and other more advanced commands. You can find more detail here: https://help.github.com/en/articles/removing-sensitive-data-from-a-repository

Have I ever told you how much I love Git? :)

Gian Maria.

Application Insights Snapshot Debugger and a strange production problem

I want to share with you the history of a problem we had in production last week, because after lots of internet searching no article led us to the solution, so I hope that this post can help you if you experience the very same problem.

Situation: we deployed a new version of a web application based on ASP.NET Web API; it was a small increment (we deploy frequently), but suddenly the customer started experiencing slowness of the application and intermittent errors.

Logging into the machine we found a really strange situation, shown in Figure 1: we have two worker processes for the application, one is running and the other one is in Suspended state, but it is using lots of RAM.


Figure 1: Two worker processes in the same machine, but one is suspended

In Figure 1 the running process used 4 minutes of CPU time because this screenshot was taken after a process recycle, but we had situations where the running process was alive for more than 10 hours while the suspended process had not consumed a single second of CPU.

On the production server, suddenly, another worker process (w3wp.exe) started in suspended state, began consuming RAM, then disappeared.

If you search the internet, you can find lots of articles related to process recycling, but this is not our situation, because the process does not get recycled. This is our usual situation on that production server:

– Our process usually consumes 1.5 GB / 2.0 GB of RAM
– The production machine has 8 GB of RAM
– We have other services on that machine, but usually total memory consumption is around 6 GB / 7 GB

Performance of the application is good; we use almost all the memory of the server, but we never experienced problems. Now, after the last deploy, we periodically see this exact series of events in Task Manager:

– Another w3wp.exe is launched, with the very same IIS user (jarvis) and the very same command line as the existing instance of w3wp.exe
– That process is in suspended state, but it starts allocating RAM
– That process is a child of the original w3wp.exe process
– The web application works normally, all requests are handled by the original w3wp.exe process and the site responds quickly
– The suspended process continues to grow in RAM usage until it reaches approximately the working set of the original w3wp.exe process, then it is closed
– The original w3wp.exe process is not recycled and continues to work perfectly

The real problem is memory usage, because when this sequence of events starts the machine experiences memory pressure and everything is really slow; sometimes a third suspended w3wp.exe process starts and this brings down the performance of the machine.


Figure 2: Windows detects low memory, and w3wp.exe process is the problem

This is the problem when you have sized the machine tightly: when something out of your control happens and some process uses more resources, performance starts to go down.

If you try to kill the new suspended w3wp.exe process, its working set drops to 4K, but it remains suspended and cannot be killed. This is a nasty situation, because the performance of the production server was poor, and the only workaround was adding more RAM to the virtual machine, but we really hate wasting RAM on unnecessary tasks.

After a couple of hours of searching we were frustrated because:

1) This has nothing to do with recycling: the original process was not recycled, while every article on the internet refers to some recycling problem
2) This happened only after the last deploy, but we really did not change a lot of code in the last sprint
3) Whenever a suspended w3wp.exe process appears, it reaches the same amount of memory as the official w3wp.exe process before disappearing; this is important because it means the new process must be related to the old one

The only possible reason for this behavior is that something is creating another process in suspended state to take a dump of the production w3wp.exe process.

When you take a memory dump of a process it will freeze for the duration of the dump, but you can also spawn a copy of the process and take the dump from that copy. With this technique the original process continues to run.

Armed with this intuition we started searching for *.dmp files on all disks, and we immediately found a temp folder with some dumps of 1.7 GB in size. Bingo: this confirms our hypothesis, something is creating dumps, and this creates too much memory pressure.

By carefully analyzing what happened, the only explanation of the issue is "something is creating dumps of the w3wp.exe process".

Now we started searching for the possible culprit, and after some unsuccessful searches on the internet (we found tons of articles related to lots of causes but none was applicable to our situation), we were a little bit frustrated.

This is the problem when your production server does not work well: you feel in a hurry and you make mistakes. Since we had located the .dmp files, we could simply use Sysinternals Process Monitor to verify which process was reading or writing in the folder containing the dumps.

BINGO: we have a process called SnapshotUploader64 that was using that file, and we finally understood what had happened: for some reason the web application had the Application Insights Snapshot Debugger enabled.


Figure 3: Sysinternals Process Monitor shows you what process is using a file and it is an invaluable tool to understand what is happening on your machine.

Looking at the source code we found that in this sprint we had updated the reference to the Application Insights library, and the NuGet package changed the Application Insights configuration file, enabling the Snapshot Debugger for the site. No one in the team realized this, because it was nothing we had discussed.

This is a general rule: whenever you update NuGet packages, always look at config files, because sometimes package post-update actions manipulate them in a way that can cause problems.

The Snapshot Debugger is a really nice feature, you can read more here; it has a really nice mechanism based on Snappoints, where the original process is forked into a new process that is immediately suspended. Reading the documentation we found that:

The snapshot is not a copy of the full heap of the app – it’s only a copy of the page table with pages set to copy-on-write. The Snapshot Debugger only makes copies of pages in your app if the page gets modified, minimizing the memory impact on your server. In total, your app will only slow down by 10-30 milliseconds when creating snapshots. As snapshots are held in-memory on your server, they do cost ~100s of kilobytes while they are active, as well as an additional commit charge. The overhead of capturing snapshots is fixed and should therefore not affect the throughput of your app regardless of the scale of your app.

But this was absolutely not true in our situation, because the forked process consumed lots of RAM. We chose not to investigate why; instead we disabled the Snapshot Debugger from the web config and the problem went away.

It was a long morning, but our server is up and running at full speed again :)

Gian Maria.

Install latest node version in Azure Pipelines

I have a build in Azure DevOps that suddenly started failing on some agents during the build of an Angular application. Looking at the log I found this error:

You are running version v8.9.4 of Node.js, which is not supported by Angular CLI 8.0+.

Ok, the error is really clear: some developer upgraded the Angular version of the project and the Node.js version installed on some of the build servers is old. Now the obvious solution is logging into ALL build servers and upgrading the Node.js installation, so the build can run on every agent. This is not a perfect solution, especially because someone can add another build server with an outdated Node.js version and I’m stuck again.

Having a strong prerequisite on the build agent, like a specific version of Node.js, is annoying and needs to be addressed as soon as possible.

In this scenario you have two distinct ways to solve the problem. The first solution is adding a custom capability to all the agents that have at least Node 10.9, or even better, marking all agents capable of building Angular 8 with an Angular8 capability. Now you can make all builds for projects with Angular 8 demand the Angular8 capability, and let Azure DevOps choose the right agent for you.

Matching Agent Capabilities with Build Demands is a standard way to have your build definition matched to an agent that has all the prerequisites preinstalled.

But even better is using a Task (still in preview) that is able to install and configure the required Node.js version on the agent automatically.


Figure 1: Use Node.js ecosystem Task configured

Thanks to this task you do not need to worry about choosing the right agent, because the task will install the required Node.js version for you automatically. This solution is perfect because it lessens the prerequisites of the build agent: it is the build itself that automatically installs any missing prerequisite.

When you start having lots of builds and build agents, it is better not to have too many prerequisites for a build agent, because they complicate the deployment of new agents and can create intermittent failures.

If you can, you should preinstall all the required prerequisites for the build from the build definition itself, avoiding the need to require prerequisites on the agents.

Happy Azure DevOps Pipelines!