Diagnosing Failed RDS Broker Role

A while back, we received a support case regarding a Windows Server 2016 box set up with an all-in-one RDS configuration, where a single server is both the broker and the session host.  At some point after the build engineer handed the box off to the client, the RDS roles stopped working.  Let’s walk through the troubleshooting process and the final resolution.

Symptoms

When the RDS role is working, the Remote Desktop Services tab in Server Manager looks roughly like this:

After the problem started, though, we saw the following symptoms.  Server Manager wasn’t loading the RDS details:

Using PowerShell to get details of the RD Deployment fails:

Trying to redo the RDS configuration fails:

Troubleshooting

To troubleshoot this issue, we tried a few different things.  Initially, we thought maybe the RD Broker role configuration had gotten corrupted.  However, removing and re-adding the RD Broker role didn’t help.  Since all the RDS-related PowerShell commands failed with the error in the above screenshot, we couldn’t get any further info that way.

Next, we started looking into the event logs.  All of the RDS and Terminal Services related logs were clear of errors.  However, the Windows Remote Management log showed this error each time we ran the Get-RDServer PowerShell command:

This error code, 2150859180, isn’t clearly documented anywhere.  However, error codes can be represented in either decimal or hex.  To get more info, we ran the code through a decimal-to-hex converter and found that the hex value for this error is 803381AC.  Plugging this into a search engine in hex format as 0x803381AC shows that it maps to ERROR_WSMAN_REMOTESHELLS_NOT_ALLOWED.
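The conversion doesn't need an online tool.  Here is a minimal Python sketch; the decode_error helper is a name made up for illustration, and splitting the value into facility and code fields assumes the standard HRESULT bit layout, which the event log doesn't confirm for this value:

```python
def decode_error(code):
    """Render a 32-bit Windows error code in hex and split it into HRESULT-style fields."""
    # Some tools print the same code as a negative signed integer; mask to 32 bits first.
    code &= 0xFFFFFFFF
    return {
        "hex": f"0x{code:08X}",            # the form that search engines match best
        "failure": bool(code >> 31),       # top bit set means failure
        "facility": (code >> 16) & 0x7FF,  # 11-bit facility field (assumed HRESULT layout)
        "code": code & 0xFFFF,             # low 16 bits: facility-specific code
    }


info = decode_error(2150859180)
print(info["hex"])
```

Running this against the value from the event log prints 0x803381AC, the same string we searched for.  Masking to 32 bits first means the signed form of the same number, -2144108116, decodes to the same hex value.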

Resolution

With a more specific error message, ERROR_WSMAN_REMOTESHELLS_NOT_ALLOWED, we can track this down much more easily.  It’s clear that remote shells are blocked for some reason.  The easiest way to disable remote shells is through Group Policy, so we ran “gpresult /h” to generate an HTML report and found:

For this screenshot, I recreated the issue in my lab, so the setting is applied through Local Group Policy.  In the original client environment, there was a GPO for applying security standards that had this rule enabled.  To test this out, we changed the registry value behind this setting from 0 to 1 and restarted the WinRM service:

After doing so, the RDS roles began functioning correctly:

Knowing that the “Allow Remote Shell Access” setting was causing the issue, we created an overriding GPO that re-enabled that setting for just this server.
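To check which way the policy is currently set on a given server, a small read-only script can help.  This is a sketch under stated assumptions: the registry path and value name below are where I'd expect the “Allow Remote Shell Access” policy to land, but they aren't spelled out in the text above, so verify them against your own gpresult report.  The helper names are made up for illustration.

```python
import sys

# Assumed policy location; confirm against your gpresult output before relying on it.
POLICY_PATH = r"SOFTWARE\Policies\Microsoft\Windows\WinRM\Service\WinRS"
POLICY_VALUE = "AllowRemoteShellAccess"


def describe_policy(value):
    """Interpret the policy DWORD: 0 blocks remote shells, 1 allows them."""
    if value is None:
        return "not configured"
    if value == 0:
        return "blocked"
    if value == 1:
        return "allowed"
    return "unexpected value"


def read_policy():
    """Return the policy DWORD, or None if the policy is not set.  Windows only."""
    import winreg  # standard library, but only available on Windows

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, POLICY_PATH) as key:
            value, _value_type = winreg.QueryValueEx(key, POLICY_VALUE)
            return value
    except FileNotFoundError:
        # The key or value is missing, so no policy is applied.
        return None


if __name__ == "__main__" and sys.platform == "win32":
    print("Remote shell access:", describe_policy(read_policy()))
```

The script only reads the value.  Any change should still go through Group Policy as described above, since a manual registry edit is overwritten at the next policy refresh.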

Summary

Overall, this was a tricky issue to diagnose, and there was a lot of head-scratching during the troubleshooting phase.  However, knowing two things really helped resolve it.  First, understanding that RDS and Server Manager use WinRM to discover the RDS-related information from the server pointed us toward the event log at Applications and Services Logs > Microsoft > Windows > Windows Remote Management.  Second, converting the error code from decimal to hex and running a web search with the hex form is what really got us to the resolution.  Being able to decipher error codes is an important part of any troubleshooting scenario.