Understanding and troubleshooting WinRM connection and authentication: a thrill seeker's guide to adventure / by Matt Wrock

Connecting to a remote windows machine is often far more difficult than one would have expected. This was my experience years ago when I made my first attempt to use powershell remoting to connect to an Azure VM. At the time, powershell 2 was the hotness and many were talking up its remoting capabilities. I had been using powershell for about a year at the time and thought I'd give it a go. It wasn't simple at all and took a few hours to finally succeed.

Now, armed with 2012R2 and more knowledge, it's simpler. But let's say you are trying to connect from a linux box using one of the open source WinRM ports: there are several gotchas.

I started working for Chef about six weeks ago and it is not at all uncommon to find customers and fellow employees struggling with failure to talk to a remote windows node. I'd like to lay out in this post some of the fundamental moving parts as well as the troubleshooting decision tree I often use to figure out where things are wrong and how to get connected.

I'll address cross platform scenarios using plain WinRM, powershell remoting from windows and some Chef specific tooling using the knife-windows gem.

Connecting and Authenticating

In my experience these are the primary hurdles to WinRM sweet success. First is connecting. Can I successfully establish a connection on a WinRM port to the remote machine? There are several things to get in the way here. Then a yak shave or two later you get past connectivity but are not granted access. What's that you say? You are signing in with admin credentials to the box?...I'm sorry say that again?...huh?...I just can't hear you.

TL;DR - A WinRM WTF checklist:

I am going to go into detail in this post on the different gotchas and their accompanying settings needed to successfully connect and execute commands on a remote windows machine using WinRM. However, if you are stuck right now and don't want to sift through all of this, here is a cheat sheet list of things to set to get you out of trouble:

On the remote windows machine:

  • Run Enable-PSRemoting
  • Open the firewall with: netsh advfirewall firewall add rule name="WinRM-HTTP" dir=in localport=5985 protocol=TCP action=allow
  • Accessing via cross platform tools like chef, vagrant, packer, ruby or go? Run these commands:
winrm set winrm/config/client/auth '@{Basic="true"}'
winrm set winrm/config/service/auth '@{Basic="true"}'
winrm set winrm/config/service '@{AllowUnencrypted="true"}'

Note: DO NOT use the above winrm settings on production nodes. They should be used only on test instances when troubleshooting WinRM connectivity.

This checklist is likely to address most trouble scenarios when accessing winrm over HTTP. If you are still stuck or want to understand this domain more, please read on.

Barriers to entry

Let's talk about connectivity first. Here are the key issues that can prevent connection attempts to a WinRM endpoint:

  • The WinRM service is not running on the remote machine
  • The firewall on the remote machine is refusing connections
  • A proxy server stands in the way
  • Improper SSL configuration for HTTPS connections

We'll address each of these scenarios but first...

How can I know if I can connect?

It can often be unclear whether we are fighting a connection or authentication problem. So I'll point out how you can determine if you can eliminate connectivity as a potential issue.

On Mac/Linux:

$ nc -z -w1 <IP or host name> 5985;echo $?

This uses netcat available on the mac and most linux distros. Assuming you are using the default HTTP based WinRM port 5985 (more on determining the correct port in just a bit), if the above returns 0, you know you are getting through to a listening WinRM endpoint on the other side.

On Windows:

Test-WSMan -ComputerName <IP or host name>

Again, this assumes you are trying to connect over the default HTTP WinRM port (5985). If you are connecting over HTTPS instead, add -UseSSL. This should return some non-error response that looks something like:

wsmid         : http://schemas.dmtf.org/wbem/wsman/identity/1/wsmanidentity.xsd
ProtocolVersion : http://schemas.dmtf.org/wbem/wsman/1/wsman.xsd
ProductVendor   : Microsoft Corporation
ProductVersion  : OS: 0.0.0 SP: 0.0 Stack: 3.0

WinRM Ports

The above commands used the default WinRM HTTP port to attempt to connect to the remote WinRM endpoint - 5985. WinRM is a SOAP based HTTP protocol.

Side Note: In 2002, I used to car pool to my job in Sherman Oaks California with my friend Jimmy Bizzaro and would kill time by reading "Programming Web Services with SOAP" an O'Reilly publication. This was cutting edge, cool stuff. Java talking to .net, Java talking to Java but from different machines. This was the future. REST was done in a bed or on a toilet. So always remember, today's GO and Rust could be tomorrow's soap.

Anyhoo...WinRM can talk HTTP and HTTPS. The default ports are 5985 and 5986 respectively. However the default ports can be changed. Now usually the change is driven by network address translation. Sure these ports can be changed locally too, but in my experience if you need to access WinRM on ports other than 5985 or 5986 it's usually to accommodate NAT. So check your Virtualbox NAT config or your Azure or EC2 port mappings to see if there is a port forwarding to 5985/6 on the VM. Those would be the ports you need to use. The Test-WSMan cmdlet also takes a -port parameter where you can provide a non standard WinRM port.
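
When NAT is involved it is easy to lose track of which forwarded port to test. Here is a minimal sketch, assuming a POSIX shell and the same netcat flags used earlier, that tries each candidate port in turn and reports which ones accept a TCP connection. The function name winrm_probe is made up for illustration.

```shell
# Probe candidate WinRM ports on a host and report which ones accept a TCP connection.
# Usage: winrm_probe <host> [port ...]   (defaults to 5985 and 5986)
# Returns 0 if at least one port is open, 1 otherwise.
winrm_probe() {
  host=$1
  shift
  # Fall back to the default HTTP and HTTPS WinRM ports when none are given.
  if [ $# -eq 0 ]; then set -- 5985 5986; fi
  rc=1
  for port in "$@"; do
    if nc -z -w1 "$host" "$port" >/dev/null 2>&1; then
      echo "$port open"
      rc=0
    else
      echo "$port closed"
    fi
  done
  return $rc
}
```

Calling winrm_probe <IP or host name> checks the two defaults. Passing extra arguments checks whatever ports your NAT or cloud port mapping actually exposes.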

So now you know the port to test but you are getting a non 0 netcat response or an error thrown from Test-WSMan. Now What?

Is WinRM turned on?

This is the first question I ask. If winrm is not listening for requests, then there is nothing to connect to. There are a couple of ways to turn it on. What you usually do NOT want to do is simply start the winrm service. Not that that is a bad thing, it's just not likely going to be enough. The two best ways to "turn on" WinRM are:

winrm quickconfig

or the powershell approach:

Enable-PSRemoting

For default windows 2012R2 installs (not altered by group policy), this should be on by default. However, on windows 2008R2 and client SKUs WinRM will be turned off until enabled.

Foiled by Public Network Location

You may get the following error when enabling winrm:

Set-WSManQuickConfig : <f:WSManFault xmlns:f="http://schemas.microsoft.com/wbem/wsman/1/wsmanfault" Code="2150859113"
Machine="localhost"><f:Message><f:ProviderFault provider="Config provider"
path="%systemroot%\system32\WsmSvc.dll"><f:WSManFault xmlns:f="http://schemas.microsoft.com/wbem/wsman/1/wsmanfault"
Code="2150859113" Machine="win81"><f:Message>WinRM firewall exception will not work since one of the network
connection types on this machine is set to Public. Change the network connection type to either Domain or Private and
try again. </f:Message></f:WSManFault></f:ProviderFault></f:Message></f:WSManFault>
At line:1 char:1
+ Set-WSManQuickConfig -Force
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (:) [Set-WSManQuickConfig], InvalidOperationException
    + FullyQualifiedErrorId : WsManError,Microsoft.WSMan.Management.SetWSManQuickConfigCommand

You need to set the Network Location to Private. I have written a post devoted to Internet Connection Type. There are different ways to set the location on different windows versions. You can view the details in the above post but the one that is the most obscure but universally works across all versions is:

$networkListManager = [Activator]::CreateInstance([Type]::GetTypeFromCLSID([Guid]"{DCB00C01-570F-4A9B-8D69-199FDBA5723B}")) 
$connections = $networkListManager.GetNetworkConnections() 

# Set network location to Private for all networks 
$connections | % {$_.GetNetwork().SetCategory(1)}


Wall of fire

In some circles called a firewall. This can often be a blocker. For instance, while winrm is on by default on 2012R2, its firewall rules will block public traffic from outside its own subnet. So if you are trying to connect to a server in EC2 or Azure for example, opening this firewall restriction is important and can be done with:

HTTP:

netsh advfirewall firewall add rule name="WinRM-HTTP" dir=in localport=5985 protocol=TCP action=allow

HTTPS:

netsh advfirewall firewall add rule name="WinRM-HTTPS" dir=in localport=5986 protocol=TCP action=allow

This also affects client SKUs which by default do not open the firewall to any public traffic. If you are on a client version of windows 8 or higher, you can also use the -SkipNetworkProfileCheck switch when enabling winrm via Enable-PSRemoting which will at least open public traffic to the local subnet and may be enough if connecting to a machine on a local hypervisor.

Proxy Servers

As already stated, WinRM runs over http. Therefore if you have a proxy server sitting between you and the remote machine you are trying to connect to, you need to make sure that the request goes through that proxy server. This is usually not an issue if you are on a windows machine and using a native windows API like powershell remoting or winrs to connect. They will simply use the proxy settings in your internet settings.

Ruby tooling like Chef, Vagrant, or others uses a different mechanism. If the tool is using the WinRM ruby gem, like chef and vagrant do, they rely on the HTTP_PROXY environment variable instead of the local system's internet settings. As of knife-windows 1.1.0, the http_proxy settings in your knife.rb config file will make their way to the HTTP_PROXY environment variable. You can manually set this as follows:

Mac/Linux:

$ export HTTP_PROXY="http://<proxy server>:<proxy port>/"

Windows Powershell:

$env:HTTP_PROXY="http://<proxy server>:<proxy port>/"

Windows Cmd:

exit

Friends don't let friends use cmd.exe and you are my friend.

SSL

I'm saving SSL for the last connection issue because it is more involved (which is why folks often opt for HTTP over the more secure HTTPS). There is extra configuration required on both the remote and local side and that can vary by local platform. Let's first discuss what must be done on the remote WinRM endpoint.

Create a self signed certificate

Assuming you have not purchased an SSL certificate from a valid certificate authority, you will need to generate a self signed certificate. If you are on a 2012R2 windows os version or later, this is trivial:

$c = New-SelfSignedCertificate -DnsName "<IP or host name>" -CertStoreLocation cert:\LocalMachine\My

Read ahead for issues with New-SelfSignedCertificate and certificate verification with openssl libraries.

Creating a HTTPS WinRM listener

Now WinRM needs to be configured to respond to https requests. This is done by adding an https listener and associating it with the thumbprint of the self signed cert you just created.

winrm create winrm/config/Listener?Address=*+Transport=HTTPS "@{Hostname=`"<IP or host name>`";CertificateThumbprint=`"$($c.ThumbPrint)`"}"

Adding firewall rule

Finally enable winrm https requests through the firewall:

netsh advfirewall firewall add rule name="WinRM-HTTPS" dir=in localport=5986 protocol=TCP action=allow

SSL client configuration

At this point you should be able to reach a listening WinRM endpoint on the remote server. On a mac or linux box, a netcat check on the https winrm port should be successful:

$ nc -z -w1 <IP or host name> 5986;echo $?

On Windows, running Test-NetConnection (a welcome alternative to telnet on windows 8/2012 or higher) should show an open TCP port:

C:\> Test-NetConnection <IP> -Port 5986

ComputerName           : <IP>
RemoteAddress          : <IP>
RemotePort             : 5986
InterfaceAlias         : vEthernet (External Virtual Switch)
SourceAddress          : <local IP>
PingSucceeded          : True
PingReplyDetails (RTT) : 0 ms
TcpTestSucceeded       : True

However, trying to establish a WinRM connection will likely fail with a certificate validation error unless you install, on the client, that same self signed cert you created on the remote endpoint.

If you try to test the connection on windows using Test-WSMan as we saw before, you would receive this error:

Test-WSMan -ComputerName 192.168.1.153 -UseSSL
Test-WSMan : <f:WSManFault
xmlns:f="http://schemas.microsoft.com/wbem/wsman/1/wsmanfault" Code="12175"
Machine="ultrawrock"><f:Message>The server certificate on the destination
computer (192.168.1.153:5986) has the following errors:
The SSL certificate is signed by an unknown certificate authority.
The SSL certificate contains a common name (CN) that does not match the
hostname.     </f:Message></f:WSManFault>
At line:1 char:1
+ Test-WSMan -ComputerName 192.168.1.153 -UseSSL
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (192.168.1.153:String) [Test-W
   SMan], InvalidOperationException
    + FullyQualifiedErrorId : WsManError,Microsoft.WSMan.Management.TestWSManC
   ommand

Now you have a few options depending on your platform and needs:

  • Do not install the certificate and disable certificate verification (not recommended)
  • Install to the windows certificate store if you are on windows and need to use native windows APIs like powershell remoting
  • Export the certificate to a .pem file for use with ruby based tools like chef

Ignoring certificate validation errors

This is equivalent to when you are browsing the internet in a standard browser and try to view a https based site with an invalid cert and the browser gives you a scary warning that you are about to go somewhere potentially dangerous but gives you the option to go there anyway even though that's probably a really bad idea.

If you are testing, especially using a local hypervisor, the risk of a man in the middle attack is pretty small, but you didn't hear that from me. If you do not want to go through the trouble of installing the certificate (we'll go through those steps shortly), here is what you need to do:

Powershell Remoting:

$options=New-PSSessionOption -SkipCACheck -SkipCNCheck
Enter-PSSession -ComputerName <IP or host name> -Credential <user name> -UseSSL -SessionOption $options

WinRM Ruby Gem:

irb
require 'winrm'
w=WinRM::WinRMWebService.new('https://<ip or host>:5986/wsman', :ssl, :user => '<user>', :pass => '<password>', :no_ssl_peer_verification => true)

Chef's knife-windows

knife winrm -m <ip> ipconfig -x <user> -P <password> -t ssl --winrm-ssl-verify-mode verify_none

Installing to the Windows Certificate store

This is the more secure route and will allow you to interact with the machine via powershell remoting without being nagged that your certificate is not valid.

The first thing to do is download the certificate installed on the remote machine:

$webRequest = [Net.WebRequest]::Create("https://<ip or host>:5986/wsman")
try { $webRequest.GetResponse() } catch {}
$cert = $webRequest.ServicePoint.Certificate

Now we have an X509Certificate instance of the certificate used by the remote winrm HTTPS listener. So we install it in our local machine certificate store along with the other root certificates:

$store = New-Object System.Security.Cryptography.X509Certificates.X509Store `
  -ArgumentList  "Root", "LocalMachine"
$store.Open('ReadWrite')
$store.Add($cert)
$store.Close()

Having done this, we can now validate the SSL connection with Test-WSMan:

C:\> Test-WSMan -ComputerName 192.168.43.166 -UseSSL
wsmid        : http://schemas.dmtf.org/wbem/wsman/identity/1/wsmanidentity.xsd
ProtocolVersion : http://schemas.dmtf.org/wbem/wsman/1/wsman.xsd
ProductVendor   : Microsoft Corporation
ProductVersion  : OS: 0.0.0 SP: 0.0 Stack: 3.0

Now we can use tools like powershell remoting or winrs to talk to the remote machine.

Exporting the certificate to a .pem/.cer file

The above certificate store solution works great on windows for windows tools, but it won't help for many cross platform scenarios like connecting from non-windows or using chef tools like knife-windows. The WinRM gem used by tools like Chef and Vagrant takes a certificate file which is expected to be a base 64 encoded public key only certificate file. It commonly has a .pem, .cer, or .crt extension.

On windows you can export the X509Certificate we downloaded above to such a file by using the following lines of powershell:

"-----BEGIN CERTIFICATE-----" | Out-File cert.pem -Encoding ascii
[Convert]::ToBase64String($cert.Export('cert'), 'InsertLineBreaks') |
  Out-File .\cert.pem -Append -Encoding ascii
"-----END CERTIFICATE-----" | Out-File cert.pem -Encoding ascii -Append

With this file you could use Chef's knife winrm command from the knife-windows gem to run commands on a windows node:

knife winrm -m 192.168.1.153 ipconfig -x administrator -P Pass@word1 -t ssl -f cert.pem

Problems with New-SelfSignedCertificate and openssl

If the certificate on the server was generated using New-SelfSignedCertificate, cross platform tools that use openssl libraries may fail to verify the certificate unless New-SelfSignedCertificate was used with the -CloneCert argument and passed a certificate that includes a BasicConstraint property identifying it as a CA. When you view the certificate's properties in the certificate manager GUI, the Basic Constraints field should identify the certificate as a CA:

[Screenshot: certCA.PNG]
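
If you are on a Mac or Linux box rather than in front of the certificate manager GUI, the same property can be read with openssl. This is a minimal sketch, assuming the certificate has already been saved as a PEM file. The helper name cert_is_ca is hypothetical.

```shell
# Succeeds (exit 0) when the PEM certificate in file $1 carries a
# Basic Constraints extension that marks it as a CA.
cert_is_ca() {
  openssl x509 -in "$1" -noout -text 2>/dev/null |
    grep -A1 'X509v3 Basic Constraints' |
    grep -q 'CA:TRUE'
}
```

For example, cert_is_ca mycertfile.pem && echo "marked as a CA" prints the message only when the extension is present and set.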

There are several alternatives to the convenient New-SelfSignedCertificate cmdlet if you need a cert that must be verified with openssl libraries:

  1. Disable peer verification (not recommended) as shown earlier
  2. Create a private/public key certificate using openssl's req command and then use openssl pkcs12 to combine those 2 files into a pfx file that can be imported to the winrm listener's certificate store. Note: Make sure to include the "Server Authentication" Extended Key Usage (EKU), which is not added by default
  3. Use the handy New-SelfSignedCertificateEx script, which is available from the Technet Script Center and provides finer grained control of the certificate properties. Make sure to use the -IsCA argument:
. .\New-SelfSignedCertificateEx.ps1
New-SelfsignedCertificateEx -Subject "CN=$env:computername" `
 -EKU "Server Authentication" -StoreLocation LocalMachine `
 -StoreName My -IsCA $true
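
Option 2 above can be sketched roughly as follows. This is one possible arrangement rather than the definitive recipe: the function name make_winrm_pfx, the file names, the 2048 bit RSA key and the one year lifetime are all assumptions chosen for illustration. It writes its own small openssl config so that the CA basic constraint and the "Server Authentication" EKU do not depend on whatever the system openssl.cnf happens to contain.

```shell
# Create a self signed certificate for a WinRM HTTPS listener and bundle it
# with its private key into a .pfx file.
# Arguments: <host name for the CN> <pfx password> [output directory]
make_winrm_pfx() {
  host=$1
  pfx_pass=$2
  outdir=${3:-.}

  # Explicit config: CN, CA basic constraint and Server Authentication EKU.
  cat > "$outdir/winrm-openssl.cnf" <<EOF
[req]
distinguished_name = dn
x509_extensions = winrm_ext
prompt = no

[dn]
CN = $host

[winrm_ext]
basicConstraints = critical,CA:TRUE
extendedKeyUsage = serverAuth
EOF

  # Generate an unencrypted private key and a self signed certificate.
  openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
    -config "$outdir/winrm-openssl.cnf" \
    -keyout "$outdir/winrm-key.pem" -out "$outdir/winrm-cert.pem" || return 1

  # Combine key and certificate into a password protected pfx.
  openssl pkcs12 -export \
    -inkey "$outdir/winrm-key.pem" -in "$outdir/winrm-cert.pem" \
    -out "$outdir/winrm.pfx" -passout "pass:$pfx_pass"
}
```

The resulting winrm.pfx is what gets imported on the windows side, and winrm-cert.pem is the public file you would hand to cross platform clients.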

Exporting the self signed certificate on non-windows

If you are not on a windows machine, all this powershell is going to produce far different output than what is desirable. However, it's actually even simpler to do this with the openssl s_client command:

openssl s_client -connect <ip or host name>:5986 -showcerts </dev/null 2>/dev/null|openssl x509 -outform PEM >mycertfile.pem

The output mycertfile.pem can now be passed to the -f argument of knife-windows commands to execute commands via winrm:

mwrock@ubuwrock:~$ openssl s_client -connect 192.168.1.155:5986 -showcerts </dev/null 2>/dev/null|openssl x509 -outform PEM >mycertfile.pem
mwrock@ubuwrock:~$ knife winrm -m 192.168.1.155 ipconfig -x administrator -P Pass@word1 -t ssl -f ~/mycertfile.pem
WARNING: No knife configuration file found
192.168.1.155
192.168.1.155 Windows IP Configuration
192.168.1.155
192.168.1.155
192.168.1.155 Ethernet adapter Ethernet:
192.168.1.155
192.168.1.155    Connection-specific DNS Suffix  . :
192.168.1.155    Link-local IPv6 Address . . . . . : fe80::6c3f:586a:bdc0:5b4c%12
192.168.1.155    IPv4 Address. . . . . . . . . . . : 192.168.1.155
192.168.1.155    Subnet Mask . . . . . . . . . . . : 255.255.255.0
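
Before handing an exported file to other tools, it can help to confirm that the name you connect with matches the name the certificate was issued for, since a mismatch is what produces the common name error shown earlier with Test-WSMan. The following is a small sketch, assuming you connect by host name and that your openssl build supports the x509 -checkhost option. The helper name cert_matches_host is made up.

```shell
# Succeeds (exit 0) when the certificate in PEM file $1 is valid for host name $2.
cert_matches_host() {
  openssl x509 -in "$1" -noout -checkhost "$2" 2>/dev/null |
    grep -q 'does match certificate'
}
```

For example, cert_matches_host ~/mycertfile.pem <host name> || echo "name mismatch" flags a certificate that was generated for a different name.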

Authentication

As you can probably tell so far, a lot can go wrong and there are several moving parts to establishing a successful connection with a remote windows machine over WinRM. However, we are not there yet. Most of the gotchas here are when you are using HTTP instead of HTTPS and you are not domain joined. This tends to describe 95% of the dev/test scenarios I come in contact with.

As we saw above, there is quite a bit of ceremony involved in getting SSL just right and running WinRM over HTTPS. Let's be clear: it's the right thing to do, especially in production. However, you can avoid the ceremony but that just means there are other ceremonial sacrifices to be made. At this point, if you are connecting over HTTPS, authentication is pretty straightforward. If not, there are often additional steps to take. However these additional steps tend to be less friction laden, but more security heinous, than the SSL setup.

HTTP, Basic Authentication and cross-platform

Neither the Ruby WinRM gem nor the Go winrm package interacts with the native windows APIs needed to make Negotiate authentication possible, so both must use Basic Authentication when using the HTTP transport. So unless you are either using native windows WinRM via winrs or powershell remoting or using knife-windows on a windows client (more on this in a bit), you must tweak some of the WinRM settings on the remote windows server to allow plain text basic authentication over HTTP.

Here are the commands to run:

winrm set winrm/config/client/auth '@{Basic="true"}'
winrm set winrm/config/service/auth '@{Basic="true"}'
winrm set winrm/config/service '@{AllowUnencrypted="true"}'

One bit of easy guidance here is that if you can't use Negotiate authentication, you really really should be using HTTPS with verifiable certificates. However if you are just trying to get off the ground with local Vagrant boxes and you find yourself in a situation getting WinRM Authentication errors but know you are passing the correct credentials, please try running these on the remote machine before inflicting personal bodily harm.
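
To see whether Basic Authentication over HTTP is actually being accepted, without involving any Ruby or Go tooling, you can send a bare WS-Management Identify request (the same kind of request Test-WSMan submits) with curl. This is a rough sketch for test boxes only: the function names are hypothetical, and I am assuming the endpoint is reachable at /wsman on the port given. A 200 status suggests the credentials were accepted, while a 401 suggests they, or Basic Authentication itself, were rejected.

```shell
# Print a WS-Management Identify request body.
wsman_identify_body() {
  cat <<'EOF'
<s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope" xmlns:wsmid="http://schemas.dmtf.org/wbem/wsman/identity/1/wsmanidentity.xsd">
  <s:Header/>
  <s:Body><wsmid:Identify/></s:Body>
</s:Envelope>
EOF
}

# POST the Identify request using Basic authentication over plain HTTP and
# print only the HTTP status code.
# Arguments: <host> <user> <password> [port, default 5985]
winrm_basic_check() {
  wsman_identify_body | curl -s -o /dev/null -w '%{http_code}\n' \
    --user "$2:$3" \
    -H 'Content-Type: application/soap+xml;charset=UTF-8' \
    --data-binary @- \
    "http://$1:${4:-5985}/wsman"
}
```

Keep in mind that this sends the password in clear text and exposes it on the command line, which is exactly the trade off this section describes.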

I always include these commands in windows packer test images because that's what packer and vagrant need to talk to a windows box since they always use HTTP and are cross platform without access to the Negotiate APIs.

This is quite the security hole indeed but usually tempered by the fact that it is on a test box in a NATed network on the local host. Perhaps we are due for a vagrant PR allowing one to pass SSL options in the Vagrantfile. That would be simple to add.

Chef's winrm-s gem using windows negotiate on windows

Chef uses a separate gem that mostly monkey patches the WinRM gem if it sees that winrm is authenticating from windows to windows. In this case it leverages win32 APIs to use Negotiate authentication instead of Basic Authentication and therefore the above winrm settings can be avoided. However, if accessing from a linux client, it will drop to Basic Authentication and the settings shown above must then be present.

Local user accounts

Windows remote communication tends to be easier when you are using domain accounts. This is because domains create implicit trust boundaries so windows adds restrictions when using local accounts. Unfortunately the error messages you can sometimes get do not at all make it clear what you need to do to get past these restrictions. There are two issues with local accounts that I will mention:

Qualifying user names with the "local domain"

One thing that has previously tripped me up and I have seen others struggle with is related to authenticating local users. You may have a local user (not a domain user) and it is getting access denied errors trying to log in. However if you prefix the user name with './', then the error is resolved. The './' prefix is equivalent to '<local host or ip>\<user>'. Note that the './' prefix may not work in a windows login dialog box. In that case use the host name or IP address of the remote machine instead of '.'.

Setting the LocalAccountTokenFilterPolicy registry setting

This does not apply to the built in administrator account. So if you only log on as administrator, you will not run into this. However let's say I create a local mwrock account and even add this account to the local Administrators security group. If I try to connect remotely with this account using the default remoting settings on the server, I will get an Access Denied error if using powershell remoting or a WinRMAuthentication error if using the winrm gem. This is typically only visible on 2012R2. By default, the winrm service is running on a newly installed 2012R2 machine with an HTTP listener but without the LocalAccountTokenFilterPolicy enabled, while 2008R2 and client SKUs have no winrm service running at all. Running winrm quickconfig or Enable-PSRemoting on any OS will enable the LocalAccountTokenFilterPolicy, which will allow local accounts to log on. This simply sets the LocalAccountTokenFilterPolicy subkey of HKLM\software\Microsoft\Windows\CurrentVersion\Policies\system to 1.

Trusted Hosts with HTTP, non domain joined powershell remoting

There is an additional security restriction imposed by powershell remoting when connected over HTTP in a non domain joined (work group) environment. You need to add the host name of the remote machine to the list of trusted hosts. This is a white list of hosts you consider ok to talk to. If there are many, you can comma delimit the list. You can also include wildcards for domains and subdomains:

Set-Item "wsman:\localhost\client\trustedhosts" -Value 'mymachine,*.mydomain.com' -Force

Setting your trusted hosts list to a single wildcard would allow all hosts:

Set-Item "wsman:\localhost\client\trustedhosts" -Value '*' -Force

You would only do this if you simply interact with local test instances and even that is suspect.

Double-Hop Authentication

Let's say you want to access a UNC share on the box you have connected to or in any other way use your current credentials to access another machine. This will typically fail with the ever informative Access Denied error. You can enable what's called credential delegation by using a different type of authentication mechanism called CredSSP. This is only available using Powershell remoting and requires extra configuration on both the client and remote machines.

On the remote machine, run:

Enable-WSManCredSSP -Role Server

On the client there are a few things to set up. First, similar to the server, you need to enable it but also specify a white list of endpoints. This is formatted similarly to the trusted hosts discussed above:

Enable-WSManCredSSP -Role Client -DelegateComputer 'my_trusted_host.me.org'

Next you need to edit the local security policy on the machine to allow delegation to specific endpoints. In the gpedit GUI, navigate to Computer Configuration > Administrative Templates > System > Credential Delegation and enable "Allow Delegating Fresh Credentials". Further, you need to add the endpoints you authorize delegation to. You can add WSMAN\*.my_domain.com to allow all endpoints in the my_domain.com domain. You can add as many entries as you need.

Certificate based authentication

Even more secure than usernames and passwords is using an x509 certificate signed by a trusted certificate authority. Many use this technique when using SSH with SSH keys. Well, the same is possible with WinRM. I won't get into the details here since I have blogged separately on this topic here.

Windows Nano TP 3

As of the date of this post, Microsoft has released technical preview 3 of its new Windows Nano flavored server OS. I have previously blogged about this super light weight os but here is a winrm related bit of info that is unique to nano as of this version at least: there are no tools to tweak the winrm settings. Neither the winrm command nor the winrm powershell provider is present.

In order to make changes, you must edit the registry directly. These settings are located at:

 HKLM\Software\Microsoft\Windows\CurrentVersion\WSMAN

Other Caveats

I've written an entire post on this topic and will not go into the same detail here. Basically I have found that once winrm is correctly configured, there is still a small subset of operations that will fail in a remote context. Any interaction with wsus is an example but please read my previous post for more. When you hit one of these road blocks, you typically have two options:

  1. Use a Scheduled Task to execute the command in a local context
  2. Install an SSH server and use that

The second option appears to be imminent and in the end will make all of this easier and perhaps render this post irrelevant.