Habitat application portability and understanding dynamic linking of ELF binaries by Matt Wrock

I do not come from a classical computer science background and have spent the vast majority of my career working with Java, C# and Ruby - mostly on Windows. So I have managed to evade the details of exactly how native binaries find their dependencies at compile time and runtime on Linux. It just has not been a concern in the work that I do. If my app complains about missing low level dependencies, I find a binary distribution for Windows (99% of the time these exist and work across all modern Windows platforms) and install the MSI. Hopefully when the app is deployed, those same binary dependencies have been deployed on the production nodes and it would be just super if its the same version.

Recently I joined the Habitat team at Chef and one of the first things I did to get the feel of using Habitat to build software was to start creating Habitat build plans. The first plan I set out to create was .NET Core. I would soon find out that building .NET Core from source on Linux was probably a bad choice for a first plan. It uses clang instead of GCC, it has lots of cmake files that expect binaries to live in /usr/lib and it downloads built executables that do not link to Habitat packaged dependencies. Right out the gate, I got all sorts of various build errors as I plodded forward. Most of these errors centered around a common theme: "I can't find X." There were all sorts of issues beyond linking too that I won't get into here but I'm convinced that if I knew the basics of what this post will attempt to explain, I would have had a MUCH easier time with all the errors and pitfalls I faced.

What is linking and what are ELF binaries?

First lets define our terms:


There are no "Lord of the Rings" references to be had here. ELF is the Extensible and linkable format and defines how binary files are structured on Linux/Unix. This can include executable files, shared libraries, object files and more. An ELF file contains a set of headers and a number of sections for things like text, data, etc. One of the key roles of an ELF binary is to inform the operating system how to load a program into memory including all of the symbols it must link to.


Linking is a key part of the process of building an executable. The other key part is compiling. Often we refer to both jointly as "compiling" but they are really two distinct operations. First the compiler takes source code files and turn them into machine language instructions in the form of object files. These object files alone are not very useful to running a program.

Linking takes the object files (some might be from source code you wrote) and links them together with external library files to create a functioning program. If your source code calls a function from an external library, the compiler gleefully assumes that function exists and moves on. If it doesn't exist, don't worry, the linker will let you know.

Often when we hear about linking, two types are mentioned: static and dynamic. Static linking takes the external machine instructions and embeds them directly into the built executable. If all external dependencies of a program were statically linked, there would be only one executable file and no need for any dependent shared object files to be referenced.

However, we usually dynamically link our external dependencies. Dynamic linking does not embed the external code into the final executable. Instead it just points to an external shared object (.so) file (or .dll file on Windows) and loads that code into the running process at runtime. This has the benefit of being able to update external dependencies without having to ship and package your application each time a dependency is updated. Dynamic linking also results in a smaller application binary since it does not contain the external code.

On Unix/Linux systems, the ELF format specifies the metadata that governs what libraries will be linked. These libraries can be in many places on the machine and may exist in more than one place. The metadata in the ELF binary will help determine exactly what files are linked when that binary is executed.

Habitat + dynamic linking = portability

Habitat leverages dynamic linking to provide true application portability. It might not be immediately obvious what this means or why it is important or if it is even a good thing. So lets start by describing how applications typically load their dependencies in a normal environment and the role that configuration management systems like Chef play in these environments.

How you manage dependencies today

Lets say you have written an application that depends on the ZeroMQ library. You might use apt-get or yum to install ZeroMQ and its binaries are likely dropped somewhere into /usr. Now you can build and run your application and it will consume the ZeroMQ libraries installed. Unless it is told otherwise, the linker will scan the trusted Linux library locations for shared object files to link.

To illustrate this, I have built ZeroMQ from source and it produced libzmq.so.5 and put it in /usr/local/lib. If I examine that shared object with ldd, I can see where it links to its dependencies:

mwrock@ultrawrock:~$ ldd /usr/local/lib/libzmq.so.5
linux-vdso.so.1 =>  (0x00007ffffe05f000)
libunwind.so.8 => /usr/lib/x86_64-linux-gnu/libunwind.so.8 (0x00007f7e92370000)
libsodium.so.18 => /usr/local/lib/libsodium.so.18 (0x00007f7e92100000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f7e91ef0000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f7e91cd0000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f7e91ac0000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f7e917a0000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f7e91490000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f7e910c0000)
/lib64/ld-linux-x86-64.so.2 (0x00007f7e92a00000)
liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007f7e90e80000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f7e90c60000)

They are all linked to the dependencies found in the Linux trusted library locations.

Now the time comes to move to production and just like you needed to install the ZeroMQ libraries in your dev environment, you will need to do the same on your production nodes. We all know this drill and we have probably all been burned at some point - something new is deployed to production and either its dependencies were not there or they were but they were the wrong version.

Configuration Management as solution

Chef fixes this right? Kind of...it's complicated.

You can absolutely have Chef make sure that your application's dependencies are installed with the correct versions. But what if you have different applications or services on the same node that depend on a different version of the same dependency? It may not be possible to have multiple versions coexist in /usr/lib. Maybe your new version will work or maybe it won't. Especially for some of the lower level dependencies, there is simply no guarantee that compatible versions will exist. If anything, there is one guarantee: different distros will have different versions.

Keeping the automation with the application

Even more important - you want these dependencies to travel with your application. Ideally I want to install my application and know by virtue of installing it, everything it needs is there and has not stomped over the dependencies of anything else. I do not want to delegate the installation of its dependencies and the knowledge of which version to install to a separate management layer. Instead, Habitat binds dependencies with the application so that there is no question what your application needs and installing your application includes the installation of all of its dependencies. Lets look at how this works and see how dynamic linking is at play.

When you build a habitat plan, your plan will specify each dependency required by your application in your application's plan:

pkg_deps=(core/glibc core/gcc-libs core/libsodium)

Then when Habitat packages your build into its final, deployable artifact (.hart file), that artifact will include a list of every dependent Habitat package (including the exact version and release):

[35][default:/src:0]# cat /hab/pkgs/mwrock/zeromq/4.1.4/20161225135834/DEPS

At install time, Habitat installs your application package and also the packages included in its dependency manifest (the DEPS file shown above) in the pkgs folder under Habitat's root location. Here it will not conflict with any previously installed binaries on the node that might live in /usr. Further, the Habitat build process links your application to these exact package dependencies and ensures that at runtime, these are the exact binaries your application will load.

[36][default:/src:0]# ldd /hab/pkgs/mwrock/zeromq/4.1.4/20161225135834/lib/libzmq.so.5
linux-vdso.so.1 (0x00007fffd173c000)
libsodium.so.18 => /hab/pkgs/core/libsodium/1.0.8/20161214075415/lib/libsodium.so.18 (0x00007f8f47ea4000)
librt.so.1 => /hab/pkgs/core/glibc/2.22/20160612063629/lib/librt.so.1 (0x00007f8f47c9c000)
libpthread.so.0 => /hab/pkgs/core/glibc/2.22/20160612063629/lib/libpthread.so.0 (0x00007f8f47a7e000)
libstdc++.so.6 => /hab/pkgs/core/gcc-libs/5.2.0/20161208223920/lib/libstdc++.so.6 (0x00007f8f47704000)
libm.so.6 => /hab/pkgs/core/glibc/2.22/20160612063629/lib/libm.so.6 (0x00007f8f47406000)
libc.so.6 => /hab/pkgs/core/glibc/2.22/20160612063629/lib/libc.so.6 (0x00007f8f47061000)
libgcc_s.so.1 => /hab/pkgs/core/gcc-libs/5.2.0/20161208223920/lib/libgcc_s.so.1 (0x00007f8f46e4b000)
/hab/pkgs/core/glibc/2.22/20160612063629/lib64/ld-linux-x86-64.so.2 (0x0000560174705000)

Habitat guarantees that the same binaries that were linked at build time, will be linked at run time. Even better, it just happens and you don't need a separate management layer to enforce this.

This is how a Habitat package provides portability. Installing and running a Habitat package brings all of its dependencies with it. They do not all live in the same .hart package, but your application's .hart package includes the necessary metadata to let Habitat know what other packages to download and install from the depot. These dependencies may or may not already exist on the node with varying versions, but it doesn't matter because a Habitat application only relies on the packages that reside within Habitat. And even within the Habitat environment, you can have multiple applications that rely on the same dependency but different versions, and these applications can run side by side.

The challenge of portability and the Habitat studio

So when you are building a Habitat plan into a hart package, what keeps that build from pulling dependencies from the default Linux lib directories? What if you do not specify these dependencies in your plan and the build links them from elsewhere? That could break our portability. If your application builds magically from a non-Habitat controlled location, then there is no guarantee that those dependencies will land when you install your application elsewhere. Habitat constructs a build environment called a "studio" to protect against this exact scenario.

The Habitat studio is a clean room environment. The only libraries you will find in this environment are those managed by Habitat. You will find /lib and /usr/lib totally empty here:

[37][default:/src:0]# ls /lib -la
total 8
drwxr-xr-x  2 root root 4096 Dec 24 22:46 .
drwxr-xr-x 26 root root 4096 Dec 24 22:46 ..
lrwxrwxrwx  1 root root    3 Dec 24 22:46 lib -> lib
[38][default:/src:0]# ls /usr/lib -la
total 8
drwxr-xr-x 2 root root 4096 Dec 24 22:46 .
drwxr-xr-x 9 root root 4096 Dec 24 22:46 ..
lrwxrwxrwx 1 root root    3 Dec 24 22:46 lib -> lib

Habitat installs several packages into the studio including several familiar Linux utilities and build tools. Every utility and library that Habitat loads into the studio is a Habitat package itself.

[1][default:/src:0]# ls /hab/pkgs/core/
acl       cacerts    gawk      gzip            libbsd         mg       readline    vim
attr      coreutils  gcc-libs  hab             libcap         mpfr     sed         wget bash      diffutils  glibc     hab-backline    libidn         ncurses  tar         xz
binutils  file       gmp       hab-plan-build  linux-headers  openssl  unzip       zlib bzip2     findutils  grep      less            make           pcre     util-linux

This can be a double edged sword. On the one hand it protects us from undeclared dependencies being missed by our package. The darker side is that your plan may be building source that has build scripts that expect dependencies or other build tools to exist in their "usual" homes. If you are unfamiliar with how the standard Linux linker scans for dependencies, discovering what is wrong with your build may be less than obvious.

The rules of dependency scanning

So before we go any further lets take a look at how the linker works and how Habitat configures its build environment to influence where it finds dependencies at both build and run time. The linker looks at a combination of environment variables, cli options and well known directory paths and in a strict order of precedence. Here is a direct quote from the ld (the linker binary) man page:

The linker uses the following search paths to locate required shared libraries:

1. Any directories specified by -rpath-link options.
2. Any directories specified by -rpath options.  The difference between -rpath and -rpath-link is that directories specified by -rpath options are included in the executable and used at runtime, whereas the -rpath-link option is only effective at link time. Searching -rpath in this way is only supported by native linkers and cross linkers which have been configured with the --with-sysroot option.
3. On an ELF system, for native linkers, if the -rpath and -rpath-link options were not used, search the contents of the environment variable "LD_RUN_PATH".
4. On SunOS, if the -rpath option was not used, search any directories specified using -L options.
5. For a native linker, search the contents of the environment variable "LD_LIBRARY_PATH".
6. For a native ELF linker, the directories in "DT_RUNPATH" or "DT_RPATH" of a shared library are searched for shared libraries needed by it. The "DT_RPATH" entries are ignored if "DT_RUNPATH" entries exist.
7. The default directories, normally /lib and /usr/lib.
8. For a native linker on an ELF system, if the file /etc/ld.so.conf exists, the list of directories found in that file.

At build time Habitat sets the $LD_RUN_PATH variable to the library path of every dependency that the building plan depends on. We can see this in Habitat's build output when we build a Habitat plan:

zeromq: Setting LD_RUN_PATH=/hab/pkgs/mwrock/zeromq/4.1.4/20161225135834/lib:/hab/pkgs/core/glibc/2.22/20160612063629/lib:/hab/pkgs/core/gcc-libs/5.2.0/20161208223920/lib:/hab/pkgs/core/libsodium/1.0.8/20161214075415/lib

This means that at run time, when you run your application built by habitat, it will load from the "habetized" packaged dependencies. This is because setting the $LD_RUN_PATH influences how the ELF metadata is constructed and causes it to point to these Habitat package paths.

Patching pre-built binaries

Habitat not only allows one to build packages from source but also supports "binary-only" packages. These are packages that are made up of binaries downloaded from some external binary repository or distribution site. These are ideal for closed-source software or software that may be too complicated or takes too long to build. However, Habitat cannot control the linking process for these binaries. If you try to execute these binaries in a Habitat studio, you may see runtime failures.

The dotnet-core package is a good example of this. I ended up giving up on building that plan from source and instead just download the binaries from the public .NET distribution site. Running ldd on the dotnet binary, we see:

[8][default:/src:0]# ldd /hab/pkgs/mwrock/dotnet-core/1.0.0-preview3-003930/20161225145648/bin/dotnet
/hab/pkgs/core/glibc/2.22/20160612063629/bin/ldd: line 117:
No such file or directory

Well that's not very clear. This isn't even able to show us any of the linked dependencies because the glibc interpreter the ELF metadata says to use is not where the metadata says it is:

[9][default:/src:1]# file /hab/pkgs/mwrock/dotnet-core/1.0.0-preview3-003930/20161225145648/bin/dotnet
ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked,
interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32,
BuildID[sha1]=db256f0ac90cd718d8ec2d157b29437ea8bcb37f, not stripped

/lib64/ld-linux-x86-64.so.2 does not exist . We can manually fix this even after a binary is built with a tool called patchelf. We will declare a build dependency in our plan to core/patchelf and then we can use the following command:

find -type f -name 'dotnet' \
  -exec patchelf --interpreter "$(pkg_path_for glibc)/lib/ld-linux-x86-64.so.2"

Now lets try ldd again:

[16][default:/src:130]# ldd /hab/pkgs/mwrock/dotnet-core/1.0.0-preview3-003930/20161225151837/bin/dotnet
linux-vdso.so.1 (0x00007ffe421eb000)
libdl.so.2 => /hab/pkgs/core/glibc/2.22/20160612063629/lib/libdl.so.2 (0x00007fcb0b2cc000)
libpthread.so.0 => /hab/pkgs/core/glibc/2.22/20160612063629/lib/libpthread.so.0 (0x00007fcb0b0af000)
libstdc++.so.6 => not found
libm.so.6 => /hab/pkgs/core/glibc/2.22/20160612063629/lib/libm.so.6 (0x00007fcb0adb1000)
libgcc_s.so.1 => not found
libc.so.6 => /hab/pkgs/core/glibc/2.22/20160612063629/lib/libc.so.6 (0x00007fcb0aa0d000)
/hab/pkgs/core/glibc/2.22/20160612063629/lib/ld-linux-x86-64.so.2 (0x00007fcb0b4d0000)

This is better. It now links our glibc dependencies to the Habitat packaged glibc binaries, but there are still a couple dependencies that the linker could not find. At least now we can see more clearly what they are.

There is another argument we can pass to patchelf --set-rpath that can edit the ELF metadata as if $LD_RUN_PATH was set when the binary was built:

find -type f -name 'dotnet' \
  -exec patchelf --interpreter "$(pkg_path_for glibc)/lib/ld-linux-x86-64.so.2" --set-rpath "$LD_RUN_PATH" {} \;
find -type f -name '*.so*' \
  -exec patchelf --set-rpath "$LD_RUN_PATH" {} \;

So we set the rpath to the $LD_RUN_PATH set in the Habitat environment. We will also make sure to do this for each *.so file in the directory where we downloaded the distributable binaries. Finally ldd now finds all of our dependencies:

[19][default:/src:130]# ldd /hab/pkgs/mwrock/dotnet-core/1.0.0-preview3-003930/20161225152801/bin/dotnet
linux-vdso.so.1 (0x00007fff3e9a4000)
libdl.so.2 => /hab/pkgs/core/glibc/2.22/20160612063629/lib/libdl.so.2 (0x00007f1e68834000)
libpthread.so.0 => /hab/pkgs/core/glibc/2.22/20160612063629/lib/libpthread.so.0 (0x00007f1e68617000)
libstdc++.so.6 => /hab/pkgs/core/gcc-libs/5.2.0/20161208223920/lib/libstdc++.so.6 (0x00007f1e6829d000)
libm.so.6 => /hab/pkgs/core/glibc/2.22/20160612063629/lib/libm.so.6 (0x00007f1e67f9f000)
libgcc_s.so.1 => /hab/pkgs/core/gcc-libs/5.2.0/20161208223920/lib/libgcc_s.so.1 (0x00007f1e67d89000)
libc.so.6 => /hab/pkgs/core/glibc/2.22/20160612063629/lib/libc.so.6 (0x00007f1e679e5000)
/hab/pkgs/core/glibc/2.22/20160612063629/lib/ld-linux-x86-64.so.2 (0x00007f1e68a38000)

Every dependency is a Habitat packaged binary as declared in our own application's (dotnet-core here) dependencies as low level as glibc. This should be fully portable across any 64 bit Linix distribution.

Creating a Docker container Host on Windows Nano Server with Chef by Matt Wrock

This week Microsoft launched the release of Windows Server 2016 along with its ultra light headless deployment option - Nano Server. The Nano server images are many times smaller than what we have come to expect from a Windows server image. A Nano Vagrant box is just a few hundred megabytes. These machines also boot up VERY quickly and require fewer updates and reboots.

Earlier this year, I blogged about how to run a Chef client on Windows Nano Server. Things have come a long way since then and this post serves as an update. Now that the RTM Nano bits are out, we will look at:

  • How to get and run a Nano server
  • How to install the chef client on Windows Nano
  • How to use Test-Kitchen and Inspec to test your Windows Nano Server cookbooks.

The sample cookbook I'll be demonstrating here will highlight some of the new Windows container features in Nano server. It will install docker and allow you to use your Nano server as a container host where you can run, manipulate and inspect Windows containers from any Windows client.

How to get Windows Nano Server

You have a few options here. One thing to understand about Windows Nano is that there is no separate Windows Nano ISO. Deploying a Nano server involves extracting a WIM and some powershell scripts from a Windows 2016 Server ISO. You can then use those scripts to generate a .VHD file from the WIM or you can use the WIM to deploy Nano to a bare metal server. There are some shortcuts available if you don't want to mess with the scripts and prefer a more instantly gratifying experience. Lets explore these scenarios.

Using New-NanoServerImage to create your Nano image

If you mount the server 2016 ISO (free evaluation versions available here), you will find a "NanoServer\NanoServerImageGenerator" folder containing a NanoServerImageGenerator powershell module. This module's core function is New-NanoServerImage. Here is an example of using to to produce a Nano Server VHD:

Import-Module NanoServerImageGenerator.psd1
$adminPassword = ConvertTo-SecureString "vagrant" -AsPlainText -Force

New-NanoServerImage `
  -MediaPath D:\ `
  -BasePath .\Base `
  -TargetPath .\Nano\Nano.vhdx `
  -ComputerName Nano `
  -Package @('Microsoft-NanoServer-DSC-Package','Microsoft-NanoServer-IIS-Package') `
  -Containers `
  -DeploymentType Guest `
  -Edition Standard `
  -AdministratorPassword $adminPassword

This will generate a Nano Hyper-V capable image file of a Container/DSC/IIS ready Nano server. You can read more about the details and other options of this function in this Technet article.

Direct EXE/VHD download

As I briefly noted above, you can download evaluation copies of Windows Server 2016. Instead of downloading a full multi gigabyte Windows ISO, you could choose the exe/vhd download option. This will download an exe file that will extract a pre-made vhd. You can then create a new Hyper-V VM from the vhd. With that vm, just login to the Nano console to set the administrative password and you are good to go.


This is my installation method of choice. I use a packer template to automate the download of the 2016 server ISO, the generation of the image file and finally package the image both for Hyper-V and VirtualBox Vagrant providers. I keep the image publicly available on Atlas via mwrock/WindowsNano. The advantage of these images is that they are fully patched (key for docker to work with Windows containers), work with VirtualBox and enable file sharing ports so you can map a drive to Nano.

Vagrant Nano bug

One challenge working with Nano Server and cross platform automation tools such as vagrant is that Nano exposes a Powershell.exe with no -EncryptedCommand argument which many cross platform WinRM libraries leverage to invoke remote Powershell on a Windows box.

Shawn Neal and I rewrote the WinRM ruby gem to use PSRP (powershell remoting protocol) to talk powershell and allow it to interact with Nano server. This has been integrated with all the Chef based tools and I will be porting it to Vagrant soon. In the meantime, a "vagrant up" will hang after creating the VM. Know that the VM is in fact fully functional and connectable. I'll mention a hack you can apply to get Test-Kitchen's vagrant driver working later in this post.

Connecting to Windows Nano Server

Once you have a Nano server VM up and running. You will probably want to actually use it. Note: There is no RDP available here. You can connect to Nano and run commands either using native Powershell Remoting from a Windows box (powershell on Linux does not yet support remoting) or use knife-windows' "knife winrm" from Windows, Mac or Linux.

Powershell Remoting:

$ip = "<ip address of Nano Server>"

# You only need to add the trusted host once
Set-Item WSMan:\localhost\Client\TrustedHosts $ip
# use usename and pasword "vagrant" on the mwrock vagrant box
Enter-PSSession -ComputerName $ip -Credential Administrator


# mwrock vagrant boxes have a username and password "vagrant"
# add "--winrm-port 55985 for local VirtualBox
knife winrm -m <ip address of Nano Server> "your command" --winrm-userator --winrm-password

Note that knife winrm expects "cmd.exe" style commands by default. Use "--winrm-shell powershell" to send powershell commands.

Installing Chef on Windows Nano Server

Quick tip: Do not try to install a chef client MSI. That will not work.

Windows Nano server jettisons many of the APIs and subsystems we have grown accustomed to in order to achieve a much more compact and cloud friendly footprint. This includes the removal of the MSI subsystem. Nano server does support the newer appx packaging system currently best known as the format for packaging Windows Store Apps. With Nano Server, new extensions have been added to the appx model to support what is now known as "Windows Server Applications" (aka WSAs).

At Chef, we have added the creation of appx packages into our build pipelines but these are not yet exposed by our Artifactory and Bintray fed Omnitruck delivery mechanism. That will happen but in the mean time, I have uploaded one to a public AWS S3 bucket. You can grab the current client (as of this post) here. To install this .appx file (note: if using Test-Kitchen, this is all done automatically for you):

  1. Either copy the .appx file via a mapped drive or just download it from the Nano server using this powershell function.
  2. Run "Add-AppxPackage -Path <path to .appx file>"
  3. Copy the appx install to c:\opscode\chef:
  $rootParent = "c:\opscode"
  $chef_omnibus_root - Join-Path $rootParent "chef"
  if(!(Test-Path $rootParent)) {
    New-Item -ItemType Directory -Path $rootParent

  # Remove old version of chef if it is here
  if(Test-Path $chef_omnibus_root) {
    Remove-Item -Path $chef_omnibus_root -Recurse -Force

  # copy the appx install to the omnibus_root. There are serious
  # ACL related issues with running chef from the appx InstallLocation
  # This is temporary pending a fix from Microsoft.
  # We can eventually just symlink
  $package = (Get-AppxPackage -Name chef).InstallLocation
  Copy-Item $package $chef_omnibus_root -Recurse

The last item is a bit unfortunate but temporary. Microsoft has confirmed this to be an issue with running simple zipped appx applications. The ACLs on the appx install root are seriously restricted and you cannot invoke the chef client from that location. Until this is fixed, you need to copy the files from the appx location to somewhere else. We'll just copy to the well known Chef default location on Windows c:\opscode\chef.

Running Chef

With the chef client installed, its easiest to work with chef when its on your path. To add it run:

$env:path += ";c:\opscode\chef\bin;c:\opscode\chef\embedded\bin"

# For persistent use, will apply even after a reboot.
setx PATH $env:path /M

Now you can run the chef client just as you would anywhere else. Here I'll check the version using knife:

C:\dev\docker_nano_host [master]> knife winrm -m "chef-client -v" --winrm-user vagrant --winrm-password vagrant Chef: 12.14.60

Not all resources may work

I have to include this disclaimer. Nano is a very different animal than our familiar 2012 R2. I am confident that the newly launched Windows Server 2016 should work just as 2012 R2 does today, but nano has APIs that have been stripped away that we have previously leveraged heavily in Chef and Inspec. One example is Get-WmiObject. This cmdlet is not available on Nano Server so any usage that depends on it will fail.

Most of the crucial areas surrounding installing and invoking chef are patched and tested. However, there may be resources that either have not yet been patched or will simply never work. The windows_package resource is a good example. Its used to install MSIs and EXE installers not supported on Nano.

Test-Kitchen and Inspec on Nano

The WinRM rewrite to leverage PSRP allows our remote execution ecosystem tools to access Windows Nano Server. We have also overhauled our mixlib-install gem to use .Net core APIs (the .Net runtime supported on Nano) for the chef provisioners. With those changes in place, Test-Kitchen can install and run Chef, and Inspec can test resources on your Nano instances.

There are a few things to consider when using Test-Kitchen on Windows Nano:

Specifying the Chef appx installer

As I mentioned above, the "OmniTruck" system is not yet serving appx packages to Nano. However, you can tell Test-Kitchen in your .kitchen.yml to use a specific .msi or .appx installer. Here is some example yaml for running Test-Kitchen with Nano:

  name: vagrant

  name: chef_zero
  install_msi_url: https://s3-us-west-2.amazonaws.com/nano-chef-client/chef-12.14.60.appx

  name: inspec

  - name: windows-nano
      box: mwrock/WindowsNano

Inspec requires no configuration changes.

Working around Vagrant hangs

Until I refactor Vagrant's winrm communicator, it cannot talk powershell with Windows Nano. Because Test-Kitchen and Inspec talks to Nano directly via the newly PSRP supporting WinRM ruby gem, they make Vagrant's limitation nearly unnoticeable. However the RTM Nano bits exacerbated the Vagrant bug causing it to hang when it does its initial winrm auth check. This can unfortunately hang your kitchen create. You can work around this by applying a simple "hack" to your vagrant install:

Update C:\HashiCorp\Vagrant\embedded\gems\gems\vagrant-1.8.5\plugins\communicators\winrm\communicator.rb (adjusting the vagrant gem version number as necessary) and change:

result = Timeout.timeout(@machine.config.winrm.timeout) do


result = Timeout.timeout(@machine.config.winrm.timeout) do

This should get your test-kitchen runs unblocked.

Running on Azure hosted Nano images

If you prefer to run Test-Kitchen and Inspec against an Azure hosted VM instead of vagrant, use Stuart Preston's excellent kitchen-azurerm driver:

  name: azurerm

  subscription_id: 'your subscription id'
  location: 'West Europe'
  machine_size: 'Standard_F1'

  - name: windowsnano
      image_urn: MicrosoftWindowsServer:WindowsServer:2016-Nano-Server-Technical-Preview:latest

See the kitchen-azurerm readme for details regarding azure authentication configuration. As of the date of this post, RTM images are not yet available but thats probably going to change very soon. In the meantime, use TP5.

Using Chef to Configure a Docker host

One of the exciting new features of Windows Server 2016 and Nano Server is their ability to host Windows containers. They can do this using the same Docker API we are familiar with with linux containers. You could walk through the official instructions for setting this up or you could just have Chef do this for you.

Updating the Nano server

Note that in order for this to work on RTM Nano images, you must install the latest Windows updates. My vagrant boxes come fully patched and ready but if you are wondering how do you install updates on a Nano server, here is how:

$sess = New-CimInstance -Namespace root/Microsoft/Windows/WindowsUpdate -ClassName MSFT_WUOperationsSession
Invoke-CimMethod -InputObject $sess -MethodName ApplyApplicableUpdates

Then just reboot and you are good.

A sample cookbook to install and configure the Docker service

I converted the above mentioned instructions for installing Doker and configuring the service into a Chef cookbook recipie.  Its fairly straightforward:

powershell_script 'install Nuget package provider' do
  code 'Install-PackageProvider -Name NuGet -Force'
  not_if '(Get-PackageProvider -Name Nuget -ListAvailable -ErrorAction SilentlyContinue) -ne $null'

powershell_script 'install nano container package' do
  code 'Install-Module -Name xNetworking -Force'
  not_if '(Get-Module xNetworking -list) -ne $null'

zip_path = "#{Chef::Config[:file_cache_path]}/docker.zip"
docker_config = File.join(ENV["ProgramData"], "docker", "config")

remote_file zip_path do
  source "https://download.docker.com/components/engine/windows-server/cs-1.12/docker.zip"
  action :create_if_missing

dsc_resource "Extract Docker" do
  resource :archive
  property :path, zip_path
  property :ensure, "Present"
  property :destination, ENV["ProgramFiles"]

directory docker_config do
  recursive true

file File.join(docker_config, "daemon.json") do
  content "{ \"hosts\": [\"tcp://\", \"npipe://\"] }"

powershell_script "install docker service" do
  code "& '#{File.join(ENV["ProgramFiles"], "docker", "dockerd")}' --register-service"
  not_if "Get-Service docker -ErrorAction SilentlyContinue"

service 'docker' do
  action [:start]

dsc_resource "Enable docker firewall rule" do
  resource :xfirewall
  property :name, "Docker daemon"
  property :direction, "inbound"
  property :action, "allow"
  property :protocol, "tcp"
  property :localport, [ "2375" ]
  property :ensure, "Present"
  property :enabled, "True"

This downloads the appropriate docker binaries, installs the docker service and configures it to listen on port 2375.

To validate that all actually worked we have these Inspec tests:

describe port(2375) do
  it { should be_listening }

describe command("& '$env:ProgramFiles/docker/docker' ps") do
  its('exit_status') { should eq 0 }

describe command("(Get-service -Name 'docker').status") do
  its(:stdout) { should eq("Running\r\n") }

If this all passes, we know our server is listening on the expected port and that docker commands work.

Converge and Verify

So lets run these with kitchen verify:

C:\dev\docker_nano_host [master]> kitchen verify
-----> Starting Kitchen (v1.13.0)
-----> Creating <default-windows-nano>...
       Bringing machine 'default' up with 'hyperv' provider...
       ==> default: Verifying Hyper-V is enabled...
       ==> default: Starting the machine...
       ==> default: Waiting for the machine to report its IP address...
           default: Timeout: 240 seconds
           default: IP:
       ==> default: Waiting for machine to boot. This may take a few minutes...
           default: WinRM address:
           default: WinRM username: vagrant
           default: WinRM execution_time_limit: PT2H
           default: WinRM transport: negotiate
       ==> default: Machine booted and ready!
       ==> default: Machine not provisioned because `--no-provision` is specified.
       [WinRM] Established

       Vagrant instance <default-windows-nano> created.
       Finished creating <default-windows-nano> (1m15.86s).
-----> Converging <default-windows-nano>...


  Port 2375
     ✔  should be listening
  Command &
     ✔  '$env:ProgramFiles/docker/docker' ps exit_status should eq 0
  Command (Get-service
     ✔  -Name 'docker').status stdout should eq "Running\r\n"

Summary: 3 successful, 0 failures, 0 skipped
       Finished verifying <default-windows-nano> (0m11.94s).

Ok our docker host is ready.

Creating and running a Windows container

First if you are running Nano on VirtualBox, you need to add a port forwarding rule for port 2375. Also note that you will need the docker client installed on the machine where you intend to run docker commands. I'm running them from my Windows 10 laptop. To install docker on Windows 10:

Invoke-WebRequest "https://download.docker.com/components/engine/windows-server/cs-1.12/docker.zip" -OutFile "$env:TEMP\docker.zip" -UseBasicParsing

Expand-Archive -Path "$env:TEMP\docker.zip" -DestinationPath $env:ProgramFiles

$env:path += ";c:\program files\docker"

No matter what platform you are running on, once you have the docker client, you need to tell it to use your Nano server as the docker host. Simply set the DOCKER_HOST environment variable to "tcp://<ipaddress of server>:2375".

So now lets download a nanoserver container image from the docker hub repository:

C:\dev\NanoVHD [update]> docker pull microsoft/nanoserver
Using default tag: latest
latest: Pulling from microsoft/nanoserver
5496abde368a: Pull complete
Digest: sha256:aee7d4330fe3dc5987c808f647441c16ed2fa1c7d9c6ef49d6498e5c9860b50b
Status: Downloaded newer image for microsoft/nanoserver:latest

Now lets run a command...heck lets just launch an interactive powershell session inside the container with:

docker run -it microsoft/nanoserver powershell

Here is what we get:

Windows PowerShell
Copyright (C) 2016 Microsoft Corporation. All rights reserved.

PS C:\> ipconfig

Windows IP Configuration

Ethernet adapter vEthernet (Temp Nic Name):

   Connection-specific DNS Suffix  . : mshome.net
   Link-local IPv6 Address . . . . . : fe80::2029:a119:3e4f:851a%15
   IPv4 Address. . . . . . . . . . . :
   Subnet Mask . . . . . . . . . . . :
   Default Gateway . . . . . . . . . :
PS C:\>

Ahhwwww yeeeeaaaahhhhhhh.

What's next?

So we have made alot of progress over the last few months but the story is not entirely complete. We still need to finish knife bootstrap windows winrm and plug in our azure extension.

Please let us know what works and what does not work. I personally want to see Nano server succeed and of course we intend for Chef to provide a positive Windows Nano Server configuration story.

Released WinRM Gem 2.0 with a cross-platform, open source PSRP client implementation by Matt Wrock

Today we released the gems: WinRM 2.0, winrm-fs 1.0 and winrm elevated 1.0. I first talked about this work in this post and have since performed extensive testing (but I have confidence the first bug will be reported soon) and made several improvements. Today its released and available to any consuming application wanting to use it and we should see a Test-Kitchen release in the near future upgrading its winrm gems. Up next will be knife-windows and vagrant.

This is a near rewrite of the WinRM gem. Its gotten crufty over the years and its API and internal structure needed some attention. This release fixes several bugs and brings some big improvements. You should read the readme to catch up on the changes but here is how it looks in a nutshell (or an IRB shell):

mwrock@ubuwrock:~$ irb
2.2.1 :001 > require 'winrm'
opts = {
  endpoint: '',
  user: 'vagrant',
  password: 'vagrant'
conn = WinRM::Connection.new(opts); nil
conn.shell(:powershell) do |shell|
  shell.run('$PSVersionTable') do |stdout, stderr|
    STDOUT.print stdout
    STDERR.print stderr
end; nil => true
2.2.1 :002 > opts = {
2.2.1 :003 >       endpoint: '',
2.2.1 :004 >       user: 'vagrant',
2.2.1 :005 >       password: 'vagrant'
2.2.1 :006?>   }
 => {:endpoint=>"", :user=>"vagrant", :password=>"vagrant"}
2.2.1 :007 > conn = WinRM::Connection.new(opts); nil
 => nil
2.2.1 :008 > conn.shell(:powershell) do |shell|
2.2.1 :009 >       shell.run('$PSVersionTable') do |stdout, stderr|
2.2.1 :010 >           STDOUT.print stdout
2.2.1 :011?>         STDERR.print stderr
2.2.1 :012?>       end
2.2.1 :013?>   end; nil

Name                           Value
----                           -----
PSVersion                      4.0
WSManStackVersion              3.0
CLRVersion                     4.0.30319.34209
BuildVersion                   6.3.9600.17400
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0}
PSRemotingProtocolVersion      2.2

Note this is run from an Ubuntu 14.04 host targeting a Windows 2012R2 VirtualBox VM. No Windows host required.

100% Ruby PSRP client implementation

So for the four people reading this that know what this means: yaaay! woohoo! you go girl!! we talk PSRP now. yo.

No...Really...why should I care about this?

I'll be honest, there are tons of scenarios where PSRP will not make any difference, but here are some tangible points where it undoubtedly makes things better:

  • File copy can be orders of magnitude faster. If you use the winrm-fs gem to copy files to a remote windows machine, you may see transfer speeds as much as 30x faster. This will be more noticeable transferring files larger than several kilobytes. For example, the PSRP specification PDF - about 4 and a half MB - takes about 4 seconds via this release vs 2 minutes on the previous release on my work laptop. For details as to why PSRP is so much faster, see this post.
  • The WinRM gems can talk powershell to Windows Nano Server. The previous WinRM gem is unable to execute powershell commands against a Windows Nano server. If you are a test-kitchen user and would like to see this in action, clone https://github.com/mwrock/DSCTextfile and:
bundle install
bundle exec kitchen verify

This will download my WindowsNanoDSC vagrant box, provision it, converge a DSC file resource and test its success with Pester. You should notice that not only does the nano server's .box file download from the internet MUCH faster, it boots and converges several minutes faster than its Windows 2012R2 cousin.

Stay tuned for Chef based kitchen converges on Windows Nano!

  • You can now execute multiple commands that operate in the same scope (runspace). This means you can share variables and imported commands from call to call because calls share the same powershell runspace whereas before every call ran in a separate powershell.exe process. The winrm-fs gem is an example of how this is useful.
def stream_upload(input_io, dest)
  read_size = ((max_encoded_write - dest.length) / 4) * 3
  chunk, bytes = 1, 0
  buffer = ''
    $to = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath("#{dest}")
    $parent = Split-Path $to
    if(!(Test-path $parent)) { mkdir $parent | Out-Null }
    $fileStream = New-Object -TypeName System.IO.FileStream -ArgumentList @(

  while input_io.read(read_size, buffer)
    bytes += (buffer.bytesize / 3 * 4)
    logger.debug "Wrote chunk #{chunk} for #{dest}" if chunk % 25 == 0
    chunk += 1
    yield bytes if block_given?
  buffer = nil # rubocop:disable Lint/UselessAssignment

  [chunk - 1, bytes]

def stream_command(encoded_bytes)
    $fileStream.Write($bytes, 0, $bytes.length)

Here  we issue some powershell to create a FileStream, then in ruby we iterate over an IO class and write to that FileSteam instance as many times as we need and then dispose of the stream when done. Before, that FileStream would be gone on the next call and instead we'd have to open the file on each trip.

  • Non administrator users can execute commands. Because the former WinRM implementation was based on winrs, a user had to be an administrator in order to authenticate. Now non admin users, as long as they belong to the correct remoting users group, can execute remote commands.

This is just the beginning

In and of itself, a WinRM release may not be that exciting but lays the groundwork for some great experiences. I cant wait to explore testing infrastructure code on windows nano further and, sure, sane file transfer rates sounds pretty great.

How can we most optimally shrink a Windows base image? by Matt Wrock

I have spent alot of time trying to get my Windows vagrant boxes as small as possible. I blogged pretty extensively on what optimizations one can make and how those optimizations can be automated with Packer. Over the last week I've leveraged that automation to allow me to collect data on exactly how much each technique I employ saves in the final image. The results I think are very interesting.

Diving into the data

The metrics above reflect the savings yielded in a fully patched Windows 2012 R2 VirtualBox base image. The total size of the final compressed .box vagrant file with no optimizations was 7.71GB and 3.71GB with all optimizations applied.

I have previously blogged the details involved in each optimization and my Packer templates can be found online that automate this process. Let me quickly summarize these optimizations in order of biggest bang for buck:

  • SxS Cleanup (54%): The Windows SxS folder can grow larger and larger over time. This has historically been a major problem and until not too long ago, the only remedy was to periodically repave the OS. Among other things, this folder includes backups for every installed update so that they can be undone if necessary. The fact of the matter is that most will never rollback any update. Windows now expose commands and scheduled tasks that allow us to periodically trim this data. Naturally this will have the most impact the more updates that have been installed.
  • Removing windows features or Features On Demand (25%): Windows ships with almost all installable features and roles on disk. In many/most cases, a server is built for a specific task and its dormant unenabled features simply take up valuable disk space. Another relatively new feature in Windows management is the ability to totally remove these features from disk. They can always be restored later either via external media or Windows Update.
  • Optimize Disk (13%): This is basically a defragmenter and optimizes the disk according to its used sectors. This will likely be more important as disk activity increases between OS install and the time of optimization.
  • Removing Junk/Temp files (5%): Here we simply delete the temp folders and a few other unnecessary files and directories created during setup. This will likely have minimal impact if the server has not undergone much true usage.
  • Removing the Page File (3%): This is a bit misleading because the server will have a page file. We just make sure that the image in the .box file has no page file (possibly a GB in space but compresses to far less). On first boot, the page file will be recreated.

The importance of "0ing" out unused space

This is something that is of particular importance for VirtualBox images. This is the act of literally flipping every unused bit on disk to 0. Otherwise the image file treats this space as used in a final compressed .box file. The fascinating fact here is if you do NOT do this, you save NOTHING. At least for VirtualBox but not Hyper-V and that is all I measured. So our 7.71 GB original patched OS with all optimizations applied but without this step compressed to 7.71GB. 0% savings.

This is small?

Lets not kid ourselves. As hard as we try to chip away at a windows base image, we are still left with a beast of an image. Sure we can cut a fully patched Windows image almost in half but it is still just under 4 GB. Thats huge especially compared to most bare Linux base images.

If you want to experience a truly small Windows image, you will want to explore Windows Nano Server. Only then will we achieve orders of magnitude of savings and enter into the Linux "ballpark". The vagrant boxes I have created for nano weigh in at about 300MB and also boot up very quickly.

Your images may vary

The numbers above reflect a particular Windows version and hypervisor. Different versions and hypervisors will assuredly yield different metrics.

There is less to optimize on newer OS versions

This is largely due to the number of Windows updates available. Today, a fresh Windows 2012 R2 image will install just over 220 updates compared to only 5 on Windows 2016 Technical Preview 5. 220 updates takes up alot of space, scattering bits all over the disk.

Different Hypervisor file types are more efficient than others

A VirtualBox .vmdk will not automatically optimize as well as a Hyper-V .vhd/x. Thus come compression time, the final vagrant VirtualBox .box file will be much larger if you dont take steps yourself to optimize the disk.

Creating a Windows Server 2016 Vagrant box with Chef and Packer by Matt Wrock

I've been using Packer for a bit over a year now to create the Windows 2012 R2 Vagrant box that I regularly use for testing various server configuration scripts. My packer template has been evolving over time but is composed of some Boxstarter package setup and a few adhoc Powershell scripts. I have blogged about this process here. This has been working great, but I'm curious how it would look differently if I used Chef instead of Boxstarter and random powershell.

Chef is a much more mature configuration management platform than Boxstarter (which I would not even label as configuration management). My belief is that breaking up what I have now into Chef resources and recipes will make the image configuration more composable and easier to read. Also as an engineer employed by Chef, I'd like to be able to walk users through how this would look using Chef.

To switch things up further, I'm conducting this experimentation on a whole new OS - Windows Server 2016 TP5. This means I dont have to worry about breaking my other templates, my windows updates will be much smaller (5 updates vs > 220) and I can use DSC resources for much of the configuring. So this post will guide you through using Chef and Packer together and dealing with the "gotchas" which I ran into. The actual template can be found on github here.

If you want to "skip to the end," I have uploaded both Hyper-V and VirtualBox providers to Atlas and you can use them with vagrant via:

 vagrant init mwrock/Windows2016
 vagrant up

Preparing for the Chef Provisioner

There are a couple things that need to happen before our Chef recipes can run.

Dealing with cookbook dependencies

I've taken most of the scripts that I run in a packer run and have broken them down into various Chef recipes encapsulated in a single cookbook I include in my packer template repository. Packer's Chef provisioners will copy this cookbook to the image being built but what about other cookbooks it depends on? This cookbook uses the windows cookbook, the wsus-client cookbook and dependencies that they have and so on, but packer does not expose any mechanism for discovering those cookbooks and downloading them.

I experimented with three different approaches to fetching these dependencies. The first two really did the same thing: installed git and then cloned those cookbooks onto the image. The first method I tried did this in a simple powershell provisioner and the second method used a Chef recipe. The down sides to this approach were:

  • I had to know upfront what the exact dependency tree was and each git repo url.
  • I also would either have to solve all the versions myself or just settle for the HEAD of master for all cookbook dependencies.

Well there is a well known tool that solves these problems: Berkshelf. So my final strategy was to run berks vendor to discover the correct dependencies and their versions and download them locally to vendor/cookbooks which we ignore from source control:

C:\dev\packer-templates [master]> cd .\cookbooks\packer-templates\
C:\dev\packer-templates\cookbooks\packer-templates [master]> berks vendor ../../vendor/cookbooks
Resolving cookbook dependencies...
Fetching 'packer-templates' from source at .
Fetching cookbook index from https://supermarket.chef.io...
Using chef_handler (1.4.0)
Using windows (1.44.1)
Using packer-templates (0.1.0) from source at .
Using wsus-client (1.2.1)
Vendoring chef_handler (1.4.0) to ../../vendor/cookbooks/chef_handler
Vendoring packer-templates (0.1.0) to ../../vendor/cookbooks/packer-templates
Vendoring windows (1.44.1) to ../../vendor/cookbooks/windows
Vendoring wsus-client (1.2.1) to ../../vendor/cookbooks/wsus-client

Now I include both my packer-templates cookbook and the vendored dependent cookbooks in the chef-solo provisioner definition:

"provisioners": [
    "type": "chef-solo",
    "cookbook_paths": ["cookbooks", "vendor/cookbooks"],
    "guest_os_type": "windows",
    "run_list": [

Configuring WinRM

As we will find as we make our way to a completed vagrant .box file, there are a few key places where we will need to change some machine state outside of Chef. The first of these is configuring WinRM. Before you can use either the chef-solo provisioner or a simple powershell provisioner, WinRM must be configured correctly. The Go WinRM library cannot authenticate via NTLM and so we must enable Basic Authentication and allow unencrypted traffic. Note that my template removes these settings prior to shutting down the vm before the image is exported since my testing scenarios have NTLM authentication available.

Since we cannot do this from any provisioner, we do this in the vm build step. We add a script to the <FirstLogonCommands> section of our windows answer file. This is the file that automates the initial install of windows so we are not prompted to enter things like admin password, locale, timezone, etc:

  <SynchronousCommand wcm:action="add">
     <CommandLine>cmd.exe /c C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -File a:\winrm.ps1</CommandLine>

The winrm.ps1 script looks like:

netsh advfirewall firewall add rule name="WinRM-HTTP" dir=in localport=5985 protocol=TCP action=allow
winrm set winrm/config/service/auth '@{Basic="true"}'
winrm set winrm/config/service '@{AllowUnencrypted="true"}'

As soon as this runs on our packer build, packer will detect trhat WinRM is accessible and will move on to provisioning.

Choosing a Chef provisioner

There are two Chef flavored provisioners that come "in the box" with packer. The chef-client provisioner is ideal if you store your cookbooks on a Chef server. Since I am storing the cookbook with the packer-templates to be copied to the image, I am using the chef-solo provisioner.

Both provisioners will install the Chef client on the windows VM and will then converge all recipes included in the runlist specified in the template:

  "provisioners": [
      "type": "chef-solo",
      "cookbook_paths": ["cookbooks", "vendor/cookbooks"],
      "guest_os_type": "windows",
      "run_list": [

Windows updates and other WinRM unfriendly tasks

The Chef provisioners invoke the Chef client via WinRM. This means that all of the restrictions of WinRM apply here. That means no windows updates, no installing .net, no installing SQL server and a few other edge case restrictions.

We can work around these restrictions by isolating these unfriendly commands and running them directly via the powershell provisioner set to run "elevated":

      "type": "powershell",
      "script": "scripts/windows-updates.ps1",
      "elevated_user": "vagrant",
      "elevated_password": "vagrant"

When elevated credentials are used, the powershell script is run via a scheduled task and therefore runs in the context of a local user free from the fetters of WinRM. So we start by converging a Chef runlist with just enough configuration to set things up. This includes turning off automatic updates by using the wsus-client::configure recipe so that manually running updates will not interfere with automatic updates kicked off by the vm. The initial runlist also installs the PSWindowsUpdate module which we will use in the above powershell provisioner.

Here is our install-ps-modules.rb recipe that installs the Nuget package provider so we can install the PSWindowsUpdate module and the other DSC modules we will need during our packer build:

powershell_script 'install Nuget package provider' do
  code 'Install-PackageProvider -Name NuGet -Force'
  not_if '(Get-PackageProvider -Name Nuget -ListAvailable -ErrorAction SilentlyContinue) -ne $null'

%w{PSWindowsUpdate xNetworking xRemoteDesktopAdmin xCertificate}.each do |ps_module|
  powershell_script "install #{ps_module} module" do
    code "Install-Module #{ps_module} -Force"
    not_if "(Get-Module #{ps_module} -list) -ne $null"

The windows-updates.ps1 looks like:

Get-WUInstall -WindowsUpdate -AcceptAll -UpdateType Software -IgnoreReboot

Multiple Chef provisioning blocks

After windows updates, I move back to Chef to finish off the provisioning:

      "type": "chef-solo",
      "remote_cookbook_paths": [
      "guest_os_type": "windows",
      "skip_install": "true",
      "run_list": [

A couple important things to include when running the Chef provisioner more than once is to tell it not to install Chef and to reuse the cookbook directories it used on the first run.

For some reason, the Chef provisioners will download and install chef regardless of whether or not Chef is already installed. Also, on the first Chef run, packer copied the cookbooks from your local environment to the vm. When it copies these cookbooks on subsequent attempts, its incredibly slow (several minutes). I'm assuming this is due to file checksum checking logic in the go library. You can avoid this sluggish file copy by just referencing the remote cookbook paths setup by the first run with the remote_cookbook_paths array shown above.

Cleaning up

Once the image configuration is where you want it to be, you might (or not) want to remove the Chef client. I try to optimize my packer setup for minimal size and the chef-client is rather large (a few hundred MB). Now you can't remove Chef with Chef. What kind of sick world would that be? So we use the powershell provisioner again to remove Chef:

Write-Host "Uninstall Chef..."
if(Test-Path "c:\windows\temp\chef.msi") {
  Start-Process MSIEXEC.exe '/uninstall c:\windows\temp\chef.msi /quiet' -Wait

and then clean up the disk before its exported and compacted into its final .box file:

Write-Host "Cleaning Temp Files"
try {
  Takeown /d Y /R /f "C:\Windows\Temp\*"
  Icacls "C:\Windows\Temp\*" /GRANT:r administrators:F /T /c /q  2>&1
  Remove-Item "C:\Windows\Temp\*" -Recurse -Force -ErrorAction SilentlyContinue
} catch { }

Write-Host "Optimizing Drive"
Optimize-Volume -DriveLetter C

Write-Host "Wiping empty space on disk..."
$Volume = Get-WmiObject win32_logicaldisk -filter "DeviceID='C:'"
$ArraySize= 64kb
$SpaceToLeave= $Volume.Size * 0.05
$FileSize= $Volume.FreeSpace - $SpacetoLeave
$ZeroArray= new-object byte[]($ArraySize)
$Stream= [io.File]::OpenWrite($FilePath)
try {
   $CurFileSize = 0
    while($CurFileSize -lt $FileSize) {
        $Stream.Write($ZeroArray,0, $ZeroArray.Length)
        $CurFileSize +=$ZeroArray.Length
finally {
    if($Stream) {
Del $FilePath

What just happenned?

All of the Chef recipes, powershell scripts and packer templates can be cloned from my packer-templates github repo, but in summary, this is what they all did:

  • Installed windows
  • Installed all windows updates
  • Turned off automatic updates
  • Installed VirtualBox guest additions (only in vbox-2016.json template)
  • Uninstalled Powershell ISE (I dont use this)
  • Removed the page file from the image (it will re create itself on vagrant up)
  • Removed all windows featured not enabled
  • Enabled file sharing firewall rules so you can map drives to the vm
  • Enabled Remote Desktop and its firewall rule
  • Cleaned up the windows SxS directory of update backup files
  • Set the LocalAccountTokenFilterPolicy so that local users can remote to the vm via NTLM
  • Removes "junk" files and folders
  • Wiped all unused space on disk (might seem weird but makes the final compressed .box file smaller)

Most of this was done with Chef resources and we were also able to make ample use of DSC. For example, here is our remote_desktop.rb recipe:

dsc_resource "Enable RDP" do
  resource :xRemoteDesktopAdmin
  property :UserAuthentication, "Secure"
  property :ensure, "Present"

dsc_resource "Allow RDP firewall rule" do
  resource :xfirewall
  property :name, "Remote Desktop"
  property :ensure, "Present"
  property :enabled, "True"

Testing provisioning recipes with Test-Kitchen

One thing I've found very important is to be able to test packer provisioning scripts outside of an actual packer run. Think of this, even if you pair down your provisioning scripts to almost nothing, a packer run will always have to run through the initial windows install. Thats gonna be several minutes. Then after the packer run, you must wait out the image export and if you are using the vagrant post-provisioner, its gonna be several more minutes while the .box file is compressed. So being able to test your provisioning scripts in an isolated environment that can be spun up relatively quickly can save quite a bit of time.

I have found that working on a packer template includes three stages:

  1. Creating a very basic box with next to no configuration
  2. Testing provisioning scripts in a premade VM
  3. A full Packer run with the provisioning scripts

There may be some permutations of this pattern. For example I might remove windows update until the very end.

Test-Kitchen comes in real handy in step #2. You can also use the box produced by step #1 in your Test-Kitchen run. Depending on if I'm building Hyper-V or VirtualBox provider I'll go about this differently. Either way, a simple call to kitchen converge can be much faster than packer build.

Using kitchen-hyperv to test scripts on Hyper-V

The .kitchen.yml file included in my packer-templates repo uses the kitchen-hyperv driver to test my Chef recipes that provision the image:

  name: hyperv
  parent_vhd_folder: '../../output-hyperv-iso/virtual hard disks'
  parent_vhd_name: packer-hyperv-iso.vhdx

If I'm using a hyperv builder to first create a minimal image, packer puts the build .vhdx file in output-hyperv-iso/virtual hard disks. I can use kitchen-hyperv and point it at that image and it will create a new VM using that vhdx file as the parent of a new differencing disk where I can test my recipes. I can then have test-kitchen run these recipes in just a few minutes or less which is a much tighter feedback loop than packer provides.

Using kitchen-vagrant to test on Virtualbox

If you create a .box file with a minimal packer template, it will output that .box file in the root of the packer-template repo. You can add that box to your local vagrant repo by running:

vagrant box add 2016 .\windows2016min-virtualbox.box

Now you can test against this with a test-kitchen driver config that looks like:

  name: vagrant
  box: 2016

Check out my talk on creating windows vagrant boxes with packer at Hashiconf!

I'll be talking on this topic next month (September 2016) at Hashiconf. You can use my discount code SPKR-MWROCK for 15% off General Admission tickets.