TFS 2018 Update 3 is out, what changes for Search

TFS 2018 Update 3 is finally out and in release notes there is a nice news for Search functionality, basic security now enforced through a custom plugin. Here is what you can read in release notes

Basic authorization is now enabled on the communication between the TFS and Search services to make it more secure. Any user installing or upgrading to Update 3 will need to provide a user name / password while configuring Search (and also during Search Service setup in case of remote Search Service).

ES is not secured by default, anyone can access port 9200 and interact with all the data without restriction. There is a commercial product made by ElasticSearch Inc to add security (Called Shield), but it is not free.

Traditionally for TFS search servers, it is usually enough to completely close port 9200 in the firewall (if the search is installed in the same machine of Application Tier) or to open the port 9200 of Search Server only for Application Tiers instances if Search services are installed on different machine, disallowing every other computer of the network to directly access Elastic Search instance.

Remember to always ensure minimum attack surfaces with a good Firewall Configuration. For ElasticSearch the port 9200 should be opened only for TFS Application Tiers.

Here is the step you need to perform when you upgrade to Update 3 to install and configure search services: first of all in your search configuration you can notice a warning sign, nothing was really marked as wrong, so you can teoretically move on with configuration.

2018-09-05_13-14-38

Figure 1: Search configuration page in TFS Upgrade wizard, notice the warning sign and User and Password fields

When you are in the review pane, the update process complain for missing password in the Search configuration (Figure 1). At this point people get a little bit puzzled because they do not know what to user as username and password.

2018-09-05_13-15-00

Figure 2: Summary of upgrade complains that you did not specified user and password in search configuration (Figure 1)

If you move on, you find that it is Impossible to prosecute with the update because the installer complains of a missing ElasticSearch plugin installed.

The error ElasticSearch does not have a plugin AlmSearchAuthPlugin installed is a clear indication that installation on Search server was outdated.

2018-09-05_13-22-37

Figure 3: During Readiness check, the upgrade wizard detect that search services installed in the search server (separate machine) missed some needed components.

The solution is really simple, you need to upgrade the Search component installation before you move on with Upgrading the AT instance. In my situation the search server was configured in a separate machine (a typical scenario to avoid ES to suck up too resource in the AT).

All you need to do is to copy search installation package (You have a direct link in search configuration page shown in Figure 1) to the machine devoted to search services and simply run the update command.

2018-09-05_13-25-32

Figure 4: With a simple PowerShell command you can upgrade the installation of ElasticSearch in the Search Server.

The –Operation update parameter is needed because I’ve already configured Search services in this server, but for Update 3 I needed also to specify a user and password to secure your ES instance. User and password could be whatever combination you want, just choose a secure and long password. After the installer finished, all search components are installed and configured correctly; now  you should reopen the Search configuration page (Figure 1) in the upgrade wizard, specify the same username and password you used during the Search Configuration and simply re-run readiness checks.

Now all the readiness checks should pass, and you can verify that your ElasticSearch instance is secured simply browsing port 9200 of your search server. Instead of being greeted with server information you will be first ask for user and password. Just type user and password chosen during Search component configuration and the server will respond.

2018-09-05_13-29-08

This is a huge step to have a more secure TFS Configuration, because without resorting to commercial plugin, ElasticSearch is at least protected with basic authentication.

Remember to always double check your TFS environment for potential security problems and always try to minimize attack surface with a good local firewall configuration.

I still strongly encourage you to configure firewall to allow for connection in port 9200 only from TFS Application Tier machines, because is always a best practice not to leave ports accessible to every computer in the organization.

Gian Maria.

Increase RAM for ElasticSearch for TFS Search

If you are experiencing slow search in TFS with the new Search functionality based on ElasticSearch a typical suggestion is to give more RAM to the machine where ES is installed. Clearly you should use HQ or other tools to really pin down the root cause but most of the time the problem is not enough RAM, or slow disks. The second cause can be easily solved moving data to an SSD disk, but giving more RAM to the machine, usually gives more space for OS File cache and can solve even the problem of slow disk.

ElasticSearch greatly benefit from high amount of RAM, both for query cache and for operating system cache of Memory Mapped Files

There are really lots of resources in internet that explain perfectly how ElasticSearch uses memory and how you can fine tune memory usage, but I want to start with a quick suggestion for all people that absolutely does not know ES, because one typical mistake is giving to ES machine more RAM but leaving ES settings unaltered.

Suppose I have my dedicated search machine with 4GB of RAM, the installation script of TFS configured ES instance to use a fixed memory HEAP of half of the available RAM. This is the most typical settings that works well in most of the situations. Half the memory is dedicated to Java Heap and gives ES space for caching queries, while the other 2 GB of RAM will be used by operating system to cache File access and for standard OS processes.

Now suppose that the index is really big and the response time starts to be slow, so you decided to upgrade search machine to 16 GB of RAM. What happens to ES?

image

Figure 1: Statistics on ES instance with HQ plugin shows that the instance is still using 2GB of heap.

From Figure 1 you can verify that ES is still using 2 GB of memory for the Heap, leaving 14 GB free for file system cache and OS. This is clearly not the perfect situation, because a better solution is to assign half the memory to java HEAP.

ElasticSearch has a dedicated settings for JAVA Heap usage, do not forget to change that setting if you change the amount of RAM available to the machine.

The amount of HEAP memory that ES uses is ruled by the ES_HEAP_SIZE environment variable, so you can simply change the value to 8G if you want ES to use 8 GB of Heap memory.

image

Figure 2: Elastic search uses an environment variable to set the amount of memory devoted to HEAP

But this is not enough, if you restart ElasticSearch windows service you will notice that ES still uses 1.9 GB of memory for the HEAP. The reason is: when you install ES as service, the script that install and configure the service will take the variable from the environment variable and copy it to startup script. This means that even if you change the value, the service will use the old value.

To verify this assumption, just stop the ElasticSearch service, go to ES installation folder and manually starts ElasticSearch.bat. Once the engine started, check with HQ to verify that ES is really using use 8GB of ram (Figure 2).

image

Figure 3: ES now uses the correct amount of RAM.

To solve this problem, open an administrator prompt in the bin directory of ElasticSearch, (you can find the location from the service as shown in previous post) and simply uninstall and reinstall the service. First remove the service with the service remove command, then immediately reinstall again with the service install command. Now start the service and verify that ES is using correctly 8GB of RAM.

When ElasticSearch is installed as a service, settings for Memory Heap Size are written to startup script, so you need to uninstall and reinstall the service again for ES_HEAP_SIZE to be taken in consideration

If you are interested in some more information about ES and RAM usage you can start reading some official article in ES site like: https://www.elastic.co/guide/en/elasticsearch/guide/current/heap-sizing.html

I finish with a simple tip, ES does not love virtual machine dynamic memory (it uses a Fixed HEAP size), thus it is better to give the machine a fixed amount of ram, and change HEAP size accordingly, instead of simply relying on Dynamic Memory. 

Gian Maria.

TFS 2018 search components in different server

When it is time to design topology of a TFS installation, for small team the single server is usually the best choice in term of licensing (one one Windows Server license is  needed) and simplicity of maintenance. Traditionally the only problem that can occur is: some component (especially the Analysis and Reporting services) starts to slow down the machine if the amount of data starts to become consistent.

Single machine TFS installation is probably the best choice for small teams..

The usual solution is moving Analysis Service, cube and Warehouse database on a different server. With such a configuration if the analysis service has a spike in CPU, Disk or Memory usage, the main machine with operational DB and AT continue to work with standard performance. The trick is leaving the core services in a server that is not overloaded with other tasks, this is the reason why running a build agent in TFS machine is usually a bad idea.

With TFS 2017 a new search component is introduced, based on ElasticSearch. ES is de facto the most used and most advanced Search Engine on the market, but it tends to eat CPU RAM and Disk if the workload is big. The suggestion is to start configuring search on the very same machine with AT and DB (Single server installation) and move search components if you start to notice that ES uses too many RAM or is using too much CPU when it is indexing data. Moving search components on another machine is a really simple process.

First of all you need to remove the search feature, using the remove feature wizard as shown in Figure 1 (select the first node in administration console to find the Remove Feature Link)

SNAGHTML1521f72

Figure 1: Remove feature from TFS instance

Now you should choose to remove the search feature.

image

Figure 2: Removing the Search Service functionality

After the wizard finished, you should go to the folder C:\Program Files\Microsoft Team Foundation Server 2018\Search\zip and with powershell run the command

Configure-TFSSearch.ps1 -Operation remove

to completely remove Elastic Search from your TFS instance.

Do not forget to use Configure-TFSSEarch PowerShell script to completely remove any trace of Elastic Search from the TFS machine

Now search component is not working anymore, if you issue a search you should get a message like shown in Figure 3:

image

Figure 3: Search component is not operational anymore.

At this point you open TFS administration console and start the wizard to configure the search, but this time, instead of configuring everything on the current machine, you will choose to use an existing Search Service.

image

Figure 4: Use an existing search service instead of install locally the search service

If you see in Figure 4, the instructions to install search service on another computer are really simple, you need to click the “search service package” link in the wizard to open a local folder that contains everything to setup ElasticSearch and the search service. Just copy content of that folder on another machine, install java and set the JAVA_HOME environment variable and you are ready to install Search Service.

You can find a README.txt file that explain how to configure the search service, just open a PowerShell console and then run the command

Configure-TFSSearch.ps1 -Operation install -TFSSearchInstallPath C:\ES -TFSSearchIndexPath C:\ESDATA

Please change the two folder if you want to change the location of ElasticSearch binary and ElasticSearch data. After the script ran without any error, it is time to verify that ElasticSearch is correctly installed and started. As usual, please create a DNS entry to avoid using the real name of the machine where you installed the service. In my installation I’ve configured the name tfssearch.cyberpunk.local to point to the real machine where I configured search services. Just open a browser and issue a request to http://tfssearch.cyberpunk.local:9200

image

Figure 5: Try to access local instance of ElasticSearch using DNS name

Please pay attention at firewall configuration, because ElasticSearch has no security in base installation and everyone can mess up and read data just browsing in port 9200

Now you should open your firewall to allow connection to port 9200 from every machine where an Application Tier is running. In my situation the machine with TFS installation has IP number 10.0.0.116. Remember that Elastic Search has NO Authentication (it is a module that is not free, called shield), thus it is really better to allow connections only from the TFS machine. All you need to do is creating a rule to open port 9200, but allowing connection only from the IP of the TFS machines.

image

Figure 6: Firewall configuration, the machine with the search service opens port 9200 only from the IP of the TFS machine.

Now remote desktop to TFS machine, and verify that you are indeed able to browse http://tfssearch.cyberpunk.local:9200, this confirm that configuration of the firewall allows TFS to contact the search service. Then try to access the very same address from another computer in the network and verify that it CANNOT access ElasticSearch instance. This guarantees that no one in the network can access Elastic Search directly and mess up with its data.

Now you can proceed with the Configuration Wizard in TFS instance, specifying the address of your new Search Server

image

Figure 7: Configure new search service in TFS.

Proceed and finish the wizard. At the end TFS machine will re-index all the data in the new search server, just wait some minutes and you will be able to use again search, but now all Search Components are on a dedicated server.

Gian Maria.

Decide to publish artifacts on Server or Network Share at queue time

A build usually produces artifacts and thanks to the extreme flexibility of VSTS / TFS Build system, you have complete freedom on what to include as artifacts and where you should locate them.

Actually you have a couple of options out of the box, a network share or directly on your VSTS / TFS Server. This second option is especially interesting because you does not need a special network share and you do not have permission issue (every script or person that can access the server and have build permission can use the artifacts). Having everything (build result and artifacts) on the server simplify your architecture, but it is not always feasible.

Thanks to VSTS / TFS  you can store artifacts of builds directly in your server, simplifying the layout of your ALM infrastructure.

The main problem with artifacts stored in the server happens with VSTS, because if you have really low upload bandwidth, like me, it is really a pain to wait for my artifacts to be uploaded to VSTS and is is equally a pain to wait for my releases to download everything from the web. When your build server is local and the release machine, or whatever tool need to consume build artifacts is on the same network, using a network share is the best option, especially if you have good Gigabit network.

The option to “where to store my artifacts” is not something that I want to be embedded in my build definition, I want to be able to choose at queue time, but this is not possible, because the location of the artifacts cannot be parameterized in the build.

The best solution would be to give user the ability to decide at queue time where to store artifacts. This will allow you to schedule special build that store artifacts on a network share instead that on server

The obvious and simple solution is to use Two “Publish Artifacts” task, one configured to store artifacts on the server, the other configured to store artifacts on a network share. Then create a simple variable called PublishArtifactsOnServer and run the Publish Artifacts configure to publish on the server only when this value is true.

SNAGHTML7ebdcd

Figure 1:

In Figure 1 there is the standard configuration of a Publish Artifact task that stores everything on the server and it is run on the custom condition that the build is not failed and the PublishArtifactsOnServer is true. Now you should place another Publish  Artifacts task configured to store drops on a network share.

SNAGHTML80160a

Figure 2: Another Publish Artifacts task, configured to run if PublishArtifactsOnServer is not true

In Figure 2 you can verify that the action is configured with the very same option, the only differences are the Artifact Type that is on a File Share with the corresponding network share and the Custom condition that runs this action if the build is not failing and if the PublishArtifactsOnServer variable is NOT equal to true.

This is another scenario where the Custom Conditions on task can allow you for a highly parameterized build, that allows you to specify at queue time if you want your artifacts stored on the server or onto a network share. The only drawback to this solution is that you need to duplicate your tasks, but it is a really simple things to do. Now if you queue a build where PublishArtifactsOnServer is true, you can verify that your artifats are indeed stored on server.

2017-07-07T16:28:21.0609004Z ##[section]Starting: Publish Artifact: primary drop 
2017-07-07T16:28:21.0618909Z ==============================================================================
2017-07-07T16:28:21.0618909Z Task         : Publish Build Artifacts
2017-07-07T16:28:21.0618909Z Description  : Publish Build artifacts to the server or a file share
2017-07-07T16:28:21.0618909Z Version      : 1.0.42
2017-07-07T16:28:21.0618909Z Author       : Microsoft Corporation
2017-07-07T16:28:21.0618909Z Help         : [More Information](https://go.microsoft.com/fwlink/?LinkID=708390)
2017-07-07T16:28:21.0618909Z ==============================================================================
2017-07-07T16:28:22.6377556Z ##[section]Async Command Start: Upload Artifact
2017-07-07T16:28:22.6377556Z Uploading 2 files
2017-07-07T16:28:27.7049869Z Total file: 2 ---- Processed file: 1 (50%)
2017-07-07T16:28:32.7375546Z Uploading 'Primary/xxx.7z' (14%)
2017-07-07T16:28:32.7375546Z Uploading 'Primary/xxx.7z' (28%)
2017-07-07T16:28:32.7375546Z Uploading 'Primary/xxx.7z' (42%)
2017-07-07T16:28:37.7827330Z Uploading 'Primary/xxx.7z' (57%)
2017-07-07T16:28:37.7827330Z Uploading 'Primary/xxx.7z' (71%)
2017-07-07T16:28:40.3306829Z File upload succeed.
2017-07-07T16:28:40.3306829Z Upload 'C:\vso\_work\13\a\primary' to file container: '#/534950/Primary'
2017-07-07T16:28:41.4309231Z Associated artifact 214 with build 4552
2017-07-07T16:28:41.4309231Z ##[section]Async Command End: Upload Artifact
2017-07-07T16:28:41.4309231Z ##[section]Finishing: Publish Artifact: primary drop 

As you can see from the output of the task, the task uploaded the artifacts to the server. Now you can schedule another build with PublishArtifactsOnServer to false and you should see that the other task was executed, now the artifacts are on a network share.

Gian Maria

Development and Related Work missing on new UI form after upgrade to TFS 2017

I’ve dealt on How to enable the new work Item in TFS 2017, and you could have noticed some differences from Visual Studio Team Service Work Item layout, or even with new Team Projects created after the upgrade. In Figure 1 you can see in the left the detail of a PBI on a Team Project that exists before the upgrade to TFS 2017, in the right the PBI on a Team Project created after the upgrade.

SNAGHTML113eac

Figure 1: Differencies of Work Item Form between Team Projects

This difference happens because the script that upgrade the Process Template to use the new form, does not add some part that are indeed present in newer Process Template. In the end you can be annoyed because new Team Project have different layout than existing ones. The solution is quite simple.

First of all install the Team Process Template Editor Extension for Visual Studio 2017, this will easy the process because will avoid resorting to command line. Now, as you can see in Figure 2, you should be able to use the menu Tools –> Process Editor –> Work Item Types that contains two commands: Import WIT and Export WIT.

image

Figure 2: Commands to import and export Work Item Definition of a Team Project directly inside VS

Process Template Editor will save you some command line details

With these two command you can export the definition of Work Items from a Team Project to your local disk (Export), then you can modify the definition and upload modified version to Team Project again (Import). If you choose Export, it will made you connect to a Project Collection, then choose the Team Project you want to use, and finally choose the Work Item Type, in my situation I’ve choose User Story Work Item Type and saved it to disk. When the export asks you to include the Global Definition List, please answer no.

It will be a good idea to save everything to a source controlled folder, so you can verify how you modified the Work Item Definition during the time.

After you downloaded the Work Item Definition of the Team Project you want to fix, Export the very same Work Item Definition, from a Team Project that was created after the upgrade to TFS 2017. It could be simpler instead to download the entire process template definition from the server and look for the Work Item Definition you need. Since I need to modify several Work Item Type definition, not only User Story, I decided to download the entire Process Template for the Agile definition

image

Figure 3: How to download an entire process template directly from Visual Studio.

The menu Team –> Team Project Collection Settings – > Process Template Manager will do the trick; this command will open a dialog with all the Process Template Definition and I simply grab the Agile (default) and download to a source controlled folder.

Now you should have two file User Story.xml, the first one of the TP that was upgraded (the one that missing the Development and Related Work UI Section) and the other is the one of the Process Template Definition, that contains the sections you miss.

You can just compare them with whatever tool you prefer. The result is that the two files usually are really different, but we do not care for now about all the differences, we can just scroll to the bottom to find the definition of the layout of the new UI as you can see in Figure 4.

image

Figure 4: Different UI definition section between existing and new Team Project

In the left we have Status and Planning section, while on the right we have planning and development, the two missing parts we want to see in our new Work Item Form Layout are instead on the new definition. Now you only need to adjust the layout on the left to make it include the missing two section. In Figure 5 is shown how I modified the file.

image

Figure 5: Modified Process template

You can rearrange the form as you like, as you can see I’ve made it equal to the one of the latest version of the Process Template, except that I left the Story Point in the Planning section.

Once you are done, you can use the Import WIT function from Visual Studio 2017 to upload the new definition to the Team Project you want to modify. If you made errors the Work Item Definition will not be imported, so you do not risk to damage anything.

If the import went OK, just refresh the page in the browser and you should immediately see the new layout in action, Development and Related Work section are now visible. You can also rearrange the UI if you like to move information in a way that are more suitable to your needs.

Now repeat the very same procedure for Features, Epics, Task and whatever Work Item Type that miss some section respect a newly created process template.

If you need to do this procedure for more than one Team Project, you need to repeat all the steps if the other Team Project is based on a different template (Agile, SCRUM, CMMI) or it contains different customization.

It is always a good procedure to always export the actual definition of a Work Item before starting to modify it, to avoid messing up the template.

If you did a good work and stored all customization into source control, you still need to export the definition from the template, to verify if the definition was changed from the latest time you modified it. Even if you are sure that the latest customizations are in source control, remember that when you upgrade TFS it could be possible that Process Template Definition was upgraded to enable new features (as for the new Work Item Layout Form).

Gian Maria Ricci