Ask the Directory Services Team

Renew Certificate Authority Certificates on Windows Server Core. No Problem!

Hi there!  Rob and Jim here from the Directory Services team.  Today’s blog walks through an administrative procedure that comes up more frequently now that PKI hierarchies are being deployed to Windows Server Core operating systems.

Installing the Certificate Services Role on Windows Server Core will not be covered in this blog, but this is a good reference for that task: https://learn.microsoft.com/en-us/powershell/module/adcsdeployment/install-adcscertificationauthority?view=windowsserver2022-ps

 

In our scenario we already have an Offline Root CA and an Enterprise Subordinate CA whose certificate needs to be renewed.  Both of these PKI roles are installed on the Windows Server Core operating system.

 

  1. To start the renewal process, check whether the following registry values are in place so we know whether, and where, the Certificate Signing Request (CSR) file is going to be written.
HKLM\System\CurrentControlSet\Services\CertSvc\Configuration\CA Name
Value Name:  RequestFileName
Value Name:  ParentCAMachine
Value Name:  ParentCAName

 

 

If they are specified, these registry settings control the name and location of the CSR file.  Make sure that the folder path already exists before moving forward.  GENERATING A CSR WILL NOT create a missing folder.
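Since a missing folder is an easy thing to overlook, here is a minimal Python sketch of that pre-check. It assumes you have already read the RequestFileName value yourself and pass it in as a string; the function names and the `exists` parameter are our own illustration and not part of any Microsoft tooling.

```python
import os
from pathlib import PureWindowsPath


def csr_folder_of(request_file_name):
    """Return the folder portion of a RequestFileName registry value."""
    # PureWindowsPath parses Windows-style paths regardless of the host OS.
    return str(PureWindowsPath(request_file_name).parent)


def csr_folder_ready(request_file_name, exists=os.path.isdir):
    """True if the folder the CSR will be written to already exists.

    `exists` is injectable so the check can be exercised without touching
    a real file system (an assumption made for this sketch).
    """
    return exists(csr_folder_of(request_file_name))
```

On the CA itself you would call `csr_folder_ready(value)` with the default `exists` and create the folder by hand if it returns False.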

 

If the Root CA is an Enterprise Root CA (domain joined) the CSR creation will use the two Parent registry values to submit the certificate request to this Root CA.  However, these values will not work and will probably not be present when utilizing an Offline (Standalone) Root CA.

2. In an elevated command prompt on the subordinate Issuing CA, decide whether to reuse the CA’s existing private key or generate a new one, and then run the following command:

  CertUtil -RenewCert [ReuseKeys]

 

 

If you want to generate a new private key for the Subordinate CA, then type: CertUtil -RenewCert

If you want to use the existing private key for the Subordinate CA, then type: CertUtil -RenewCert ReuseKeys

Additional information on CA certificate renewal options can be found here - Certification Authority Renewal - Win32 apps | Microsoft Learn

  3. Copy the resultant CSR .req file over to the Root CA.

  4. Now we can submit the request that we just copied to the Root CA, which is also running on the Windows Server Core OS. We are going to use the CertReq.exe command to submit this request to the Standalone Root CA.
CertReq -Submit -Config "RootCAComputerDNSName\RootCAName" SubCACSRFileName.req

 

Example of the command line.

 

Root CA Computer name: FourthCoffeeCA01.FourthCoffee.com
Root CA Name:  FourthCoffee Root CA 01
Sub CA CSR File Name:  FourthCoffeeSubCARenewal01.req

CertReq -Submit -Config "FourthCoffeeCA01.FourthCoffee.com\FourthCoffee Root CA 01" FourthCoffeeSubCARenewal01.req

IMPORTANT:  You should get a Request ID as part of the output.  You will need to make a note of this for forthcoming steps.
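If you are scripting this step, the Request ID can be pulled out of the captured CertReq output rather than copied by hand. The sketch below is Python and assumes the output contains a line of the form `RequestId: 3`; the helper name and the sample text are illustrative assumptions, so check them against what your CA actually prints.

```python
import re


def parse_request_id(certreq_output):
    """Return the first Request ID found in CertReq output, or None."""
    # Accept both RequestId: 3 and RequestId: "3" (assumed output shapes).
    match = re.search(r'RequestId:\s*"?(\d+)"?', certreq_output)
    return int(match.group(1)) if match else None


# Illustrative sample only, not captured from a real CA.
sample = 'RequestId: 3\nRequestId: "3"\n'
request_id = parse_request_id(sample)
```

Returning None instead of raising leaves it to the caller to decide what to do when no Request ID is present.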

 

5.  If the CA Manager needs to approve the CSR, this is easiest to do in the Certification Authority management console (certsrv.msc) when it is available.
However, since the Root CA is also running the Windows Server Core OS, we must run the following command:

 

CertUtil -Resubmit RequestIDNumber

Example of the command line, using the Request ID from step 4 (Request ID = 3):

CertUtil -Config "FourthCoffeeCA01.FourthCoffee.com\FourthCoffee Root CA 01" -resubmit 3

 

6.  We next need to retrieve the CER/CRT file from the Root CA so that we can install the certificate on the Subordinate CA to complete the renewal.

 

CertUtil -View -Restrict "RequestID=RequestIDNumber" -out RawCertificate > C:\FourthCoffeeSubCACert.cer

7.  Assuming the Root CA's certificate has not been renewed, we just need to copy the resultant FourthCoffeeSubCACert.cer file back to the subordinate CA that is being renewed.

8.  Back on the subordinate CA, in an elevated command prompt, we then need to install the subordinate CA's certificate using the following command:

CertUtil -InstallCert CACertFileName
 Example:  Certutil -InstallCert FourthCoffeeSubCACert.cer

 

When this command is run, the Certificate Services service on the subordinate CA will start.
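To see the whole round trip in one place, here is a rough Python sketch that only assembles the command lines from the steps above, in order, together with the machine each one runs on. It does not execute anything, the function and parameter names are our own, and in practice the Request ID is only known after the submit step has run.

```python
def renewal_commands(root_host, root_ca, csr_file, request_id, cert_file,
                     reuse_keys=False):
    """Return (machine, command line) pairs for the renewal, in order."""
    # The -Config value is "CA computer DNS name\CA name", quoted because
    # CA names usually contain spaces.
    config = '"{}\\{}"'.format(root_host, root_ca)
    renew = "CertUtil -RenewCert" + (" ReuseKeys" if reuse_keys else "")
    return [
        ("subordinate CA", renew),
        ("root CA", "CertReq -Submit -Config {} {}".format(config, csr_file)),
        ("root CA", "CertUtil -Config {} -Resubmit {}".format(config, request_id)),
        ("root CA", 'CertUtil -View -Restrict "RequestID={}" -out RawCertificate > {}'
                    .format(request_id, cert_file)),
        ("subordinate CA", "CertUtil -InstallCert {}".format(cert_file)),
    ]


for machine, command in renewal_commands(
        "FourthCoffeeCA01.FourthCoffee.com", "FourthCoffee Root CA 01",
        "FourthCoffeeSubCARenewal01.req", 3, r"C:\FourthCoffeeSubCACert.cer"):
    print(machine + ": " + command)
```

The copy steps between the two machines are manual and are not represented here.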

 

We hope this blog will take some of the mystery and challenge out of interacting with Microsoft PKI running on  Windows Server Core.

 

Robert “what were you thinking” Greene and Jim “that’s for the birds” Tierney.

 

 

Windows Server vNext at Microsoft Ignite, and What's New in Active Directory Technical Takeoff

Hey everybody, Ryan Ries here with a quick heads-up that there is some hot-off-the-presses content you need to check out if you're interested in Windows Server and Active Directory. And if you're reading this, I know you are.

 

First, we have the Windows Server vNext session from Ignite. It has a ton of announcements, demos, news on what's coming and we are looking to drive a ton of customer interest in all the changes for AD, Hyper-V, clustering, file server, storage, performance as well as the new breakout features like Hotpatch and Pay-As-You-Go. Video and slides here: 

What’s New in Windows Server v.Next (microsoft.com)

 

Next, we have a session by my friends Wayne McIntyre and Linda Taylor on what's new in Active Directory. Come learn about 32KB database pages, delegated MSAs, and more! Here's a link to the entire Microsoft Technical TakeOff schedule, which includes many amazing sessions:

Event | Microsoft Technical Takeoff: Windows + Intune - November 27-30, 2023

 

And here is the link specifically to the Active Directory session:

What's new in Active Directory | Microsoft Technical Takeoff

NDES and the dreaded 2 & 10 Event IDs stating "The parameter is incorrect"

 

Hey guys, Rob here again. Today I am going to go over a pair of typical Network Device Enrollment Service (NDES) event IDs that you will inevitably encounter if you are maintaining an environment with NDES installed.

 

These two events always seem to run together and can be seen in the Application Event log. 

 

Log Name: Application
Source: Microsoft-Windows-NetworkDeviceEnrollmentService
Date: DATE_TIME
Event ID: 2
Level: Error
User: NDES_APPLICATION_POOL_IDENTITY
Computer: NDES_SERVER_COMPUTER_NAME
Description:  The Network Device Enrollment Service cannot be started (0x80070057). The parameter is incorrect.

Log Name: Application
Source: Microsoft-Windows-NetworkDeviceEnrollmentService
Date: DATE_TIME
Event ID: 10
Level: Error
User: NDES_APPLICATION_POOL_IDENTITY
Computer: NDES_SERVER_COMPUTER_NAME
Description:  The Network Device Enrollment Service cannot retrieve one of its required certificates (0x80070057). The parameter is incorrect.

 

As you can see, although it is nice to get errors about a service or application, they do you no good if they do not contain enough information to act on.  Hopefully, this is where this blog will be helpful to all of you.

 

The above errors happen for one of three common reasons. 

 

  1. The application pool identity account running the SCEP (Simple Certificate Enrollment Protocol) application pool cannot access the private keys for one or both Registration Authority (RA) certificates.
  2. One or both RA certificates were NOT issued by the Certification Authority to which NDES is configured to forward Certificate Signing Requests (CSRs).
  3. The RA certificates are failing revocation checks. This means that either the RA certificates or one of the CA (certification authority) certificates in the chain are failing revocation checks for some reason. If you try to access either the /CertSrv/MSCEP/MSCEP.dll or /CertSrv/MSCEP_Admin endpoints on the NDES Server, you will also see an HTTP 500 error.

 

Missing Private Key permissions 

Below are the steps for the first scenario to validate / add the application pool identity account. 

Private key Permissions: 

First, make sure you gave the Application Pool Identity account permission to the private keys on the newly issued certificates. 

  1. Run: CertLM.msc 
  2. Expand: Certificates - Local Computer\Personal\Certificates 
  3. Click on the new certificate, and then right click on it, and select "All Tasks".
  4. Click on "Manage Private Keys".
  5. You should see the Permissions dialog box.  Click the Add button, and type in the account running the SCEP Application Pool. 
  6. Click the OK button. 
  7. This account only requires "Allow" "Read" permissions. 
  8. Once the permissions have been configured, click the OK button. 
  9. Do this for all certificates that were recently renewed / issued. 
  10. Open an elevated command prompt and type:  IISReset 
  11. Then, test accessing the website. 

 

 

 

If you are unsure what account is being used for the SCEP application pool, you can find this out by doing the following: 

  1. Run:  InetMgr.exe 
  2. Navigate to:  SERVERNAME\Application Pools 
  3. Find the application pool named SCEP, and then look at the Identity column. 

NOTE:  If the NDES role was configured without a domain service account and is using the ApplicationPoolIdentity, please understand that ApplicationPoolIdentity and NetworkService are not the computer account.  You will need to add NetworkService to the private key permissions for these two accounts.  These accounts have very restricted rights on the system itself.

 

 

Registration Authority certificates issued by wrong CA. 

The second scenario can only happen when you have more than one Certification Authority in the environment, you have renewed the Registration Authority certificates, and one or both certificates were NOT issued by the Certification Authority that NDES is sending the certificate signing requests to.

 

The first thing we need to determine is what CA issued the two NDES certificates.  

  1. Run: CertLM.msc 
  2. Expand:  Certificates - Local Computer\Personal\Certificates 
  3. Look at the Issued By column for the current RA certificate.  This will tell us the CA that issued the certificates. 
  4. To find out the CA NDES is configured to use run:  Regedit.exe 
  5. Navigate to:  HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Cryptography\MSCEP\CAInfo 
  6. The registry value named Configuration shows you the CA computer \ CA Name that NDES is using. 
  7. Validate that this is the CA that issued both RA certificates.  If not, delete the certificates not issued by this CA and enroll again. 

NOTE: If you are using the MMC (Microsoft Management Console) to do the enrollment, you can specify the CA to use when you are filling out the information.  You would click on the Certification Authority tab and select the CA to use. 
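The comparison in the steps above is simple enough to sketch in Python. This assumes you have copied the Configuration registry value and the Issued By column into strings yourself; the function names, certificate labels, and CA names are hypothetical and only for illustration.

```python
def split_ca_config(config):
    """Split a 'CA computer\\CA name' configuration string into its two parts."""
    computer, sep, ca_name = config.partition("\\")
    if not sep or not computer or not ca_name:
        raise ValueError("expected 'CA computer\\CA name', got: " + repr(config))
    return computer, ca_name


def certificates_to_reenroll(config, issued_by):
    """Return the RA certificates whose issuer is not the CA NDES is using.

    `issued_by` maps a label for each RA certificate to the value shown in
    its Issued By column.
    """
    _, ca_name = split_ca_config(config)
    return [label for label, issuer in issued_by.items()
            if issuer.casefold() != ca_name.casefold()]


# Hypothetical values for illustration.
wrong = certificates_to_reenroll(
    "ca01.contoso.com\\Contoso Issuing CA 01",
    {"RA signing certificate": "Contoso Issuing CA 01",
     "RA encryption certificate": "Contoso Issuing CA 02"},
)
print(wrong)
```

Anything this returns is a candidate for the delete-and-enroll-again step; an empty list means both certificates came from the CA that NDES is configured to use.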

 

 

After procuring the NDES certificates from the correct CA, you must perform an IISReset from an elevated command prompt.

 

One or both RA certificates are failing chaining or revocation checks. 

 The third scenario is a bit trickier as most customers are not familiar with CAPI2 operational logging and how to interpret the data being provided. I am going to concentrate on looking at the NDES RA certificates to determine if they are failing a revocation check. By no means is this meant to be an exhaustive guide on how to use CAPI2 to troubleshoot chaining or revocation checking failures. 

 

The first two problems usually show themselves once NDES has been in place for one or more years and it fails just after the existing NDES certificates are replaced. So, if everything was working before replacing the RA certificates, please review the two previous scenarios before jumping to an issue with certificate chaining or revocation checking.

 

What is certificate chaining? 

 Certificate chaining refers to the computer being able to take an end entity certificate and follow the chain all the way up to a Root CA certificate that is in the Trusted Root store of the computer. If the certificate cannot be chained to a Root CA certificate, then the certificate would not be considered a ‘Trusted’ certificate since the computer does not trust the root CA that issued the entire certificate chain. 

 

What is revocation checking? 

 All certificates except Root CA certificates have a field on them called CRL Distribution Points (CDP).  This lists different URLs that host a file known as a Certificate Revocation List (CRL).  This file lists all issued certificates that, for several reasons, the CA Manager decided should no longer be trusted, and they wanted the world to know about this fact. 

 

Just like certificates, these CRL files have a finite lifetime restricting how long they can be used.  Once the CRL’s Next update value has been reached it is no longer trusted, and the computer MUST download the newest CRL at that time.  If it fails to download the CRL because the URL is not reachable, or the CRL has not been updated at the URL, then it will generate a revocation check failure. When the revocation check fails, the computer can no longer trust the certificate it was trying to use. In this case, it would mean that NDES would not trust the RA certificate and thus NDES would fail to start / run. 
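The time window logic is worth spelling out. Below is a minimal Python sketch that assumes you already have the CRL's This update and Next update values as timezone-aware datetimes (for example, read off a CRL dump by hand). The function name is ours, and this is not how CryptoAPI is implemented internally.

```python
from datetime import datetime, timezone


def crl_status(this_update, next_update, now):
    """Classify a CRL as 'not yet valid', 'valid', or 'expired' at `now`."""
    if now < this_update:
        return "not yet valid"
    if now >= next_update:
        # Past Next update: a newer CRL must be downloaded.
        return "expired"
    return "valid"


# Hypothetical dates for illustration.
issued = datetime(2024, 1, 1, tzinfo=timezone.utc)
expires = datetime(2024, 7, 1, tzinfo=timezone.utc)
print(crl_status(issued, expires, datetime(2024, 8, 15, tzinfo=timezone.utc)))
```

An "expired" result for a CRL that cannot be replaced by a fresh download is exactly the situation that produces the revocation check failure described above.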

 

Keep in mind that in a two-tier PKI hierarchy where you have an Offline Root CA and an Online Enterprise Issuing CA that issued the RA certificates, the following checks will happen: 

 

  • The RA certificate is going to be looked at to find the download locations of the Online Enterprise Issuing CA’s CRL.  Then, we will validate that the RA certificate has NOT been revoked by the Online Enterprise Issuing CA Manager. 
  • The Online Enterprise Issuing CA’s certificate is going to be looked at to find the download locations of the Offline Root CA’s CRL.  Then, we will validate that the Online Enterprise Issuing CA certificate has NOT been revoked by the Offline Root CA Manager.

As you can see, this can get complicated quickly depending on how many tiers the PKI hierarchy has within the environment. 
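To make the pattern concrete, here is a small Python sketch that lists which CA's CRL is consulted for each certificate in a chain. It assumes the chain is given leaf first and root last; the function name is ours, and the Root CA certificate itself is not checked because it has no CDP.

```python
def revocation_checks(chain):
    """Return (certificate, CA whose CRL is checked) pairs, leaf first."""
    # Each certificate is checked against the CRL published by its issuer,
    # which is the next entry up the chain.
    return list(zip(chain, chain[1:]))


two_tier = ["RA certificate", "Online Enterprise Issuing CA", "Offline Root CA"]
for cert, crl_publisher in revocation_checks(two_tier):
    print(cert + " is checked against the CRL from " + crl_publisher)
```

Every extra tier adds one more CRL that has to be reachable and current.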

 

Enabling logging and data collection. 

First, we must enable CAPI2 Operational Event logging on the NDES server in question.   

  1. Run: Eventvwr.msc 
  2. Navigate to:  Event Viewer\Applications and Services Logs\Microsoft\Windows\CAPI2\Operational 
  3. Right click on Operational log, and we are going to do two things: 
    1. Increase the maximum log size to at least 10000 KB.  A 1 MB log file is not going to be enough, and if the server is busy 10 MB might not be enough, but it is a start.
    2. Enable the log. 
  4. Click the OK button. 
  5.  A few commands need to be run in an elevated command prompt.   
    a. IISReset

This makes IIS (Internet Information Services) reload / reset and will cause the NDES application pool to try and load the certificates again once someone hits either the /CertSrv/MSCEP/MSCEP.dll or /CertSrv/MSCEP_Admin sites. 

 

              b. CertUtil -SetReg Chain\ChainCacheResyncFileTime @Now 

 

This command tells CryptoAPI (CAPI) not to rely on the Crypto cache and instead attempt to access the real locations for AIA (Authority Information Access) and CDP locations.  If these fail, it still allows the Crypto cache to be used so it will NOT cause an outage.  It just helps with putting more error events in the log. 

 

              c. CertUtil -URLCache * Delete  

 

This command tells CryptoAPI to delete everything in its FILE cache.  This will NOT delete anything in its memory cache or per process Memory cache. 

 

    6. Lastly, try to access the /CertSrv/MSCEP_Admin page.  You should see an HTTP 500 error, which is fine at this time, and you should also see the NetworkDeviceEnrollmentService events 2 and 10 in the Application event log.

 

 

Log Name: Application
Source: Microsoft-Windows-NetworkDeviceEnrollmentService
Date: DATE_TIME
Event ID: 2
Level: Error
User: NDES_APPLICATION_POOL_IDENTITY
Computer: NDES_SERVER_COMPUTER_NAME
Description:  The Network Device Enrollment Service cannot be started (0x80070057). The parameter is incorrect.

Log Name: Application
Source: Microsoft-Windows-NetworkDeviceEnrollmentService
Date: DATE_TIME
Event ID: 10
Level: Error
User: NDES_APPLICATION_POOL_IDENTITY
Computer: NDES_SERVER_COMPUTER_NAME
Description:  The Network Device Enrollment Service cannot retrieve one of its required certificates (0x80070057). The parameter is incorrect.

 

    7. Once the issue is reproduced, I suggest you return to the CAPI2 Operational Event log properties and disable it.  We do not want other CryptoAPI calls happening on the server to push or overwrite the data in the event log. 

 

The CAPI2 event logs could have quite a few events in them.  A good event ID based filter to start with is to only show the following events: 11,30,41-42,51-53,90 
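If you export the log and post-process it outside Event Viewer, the same filter expression is easy to expand. This is a small Python sketch with a function name of our own choosing; it only handles comma-separated IDs and simple low-high ranges like the ones shown above.

```python
def parse_event_filter(expression):
    """Expand a filter such as '11,30,41-42' into a set of event IDs."""
    ids = set()
    for part in expression.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            ids.update(range(int(low), int(high) + 1))  # ranges are inclusive
        else:
            ids.add(int(part))
    return ids


wanted = parse_event_filter("11,30,41-42,51-53,90")
print(sorted(wanted))
```

You can then keep only the exported events whose ID is in `wanted`.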

 

 

To find out what events are of interest for the Registration Authority (RA) certificates, you will want to do a Find in the event logs for either the thumbprint value of the certificate or the subject name of the certificate. 

 

Example of a common RA certificate subject name is: ADATAM-WEB01-MSCEP-RA 

 

The subject name is usually the NDES server's name followed by -MSCEP-RA.  If the certificates have been renewed at least once, you will want to validate the current subject names of the certificates based on the Exchange Enrollment Agent (Offline request) and CEP Encryption templates that are in use.

 

   CAPI2 Events of Interest 

It helps to understand a little bit about the different CAPI2 events related to chaining and revocation checking that you are going to see in the event log:

Event ID 90 X509 Objects (X509Objects) - Shows all the certificates, CRLs and OCSP (Online Certificate Status Protocol) responses it was able to collect either via the certificate stores or via the CryptoAPI cache.  This is a good event to review to see whether the OS (operating system) found all certificates in the chain.  If you do not see a required certificate, then the chaining function will not succeed.  This event also shows more detail about each certificate than the other events in the log.

 

Event ID 11 Build Chain (CertGetCertificateChain) - Shows if the certificate chains to a valid root certification authority.  In addition, it does revocation checks to see if all certificates in the chain succeed or fail their revocation check. 

 

Event ID 30 Verify Chain Policy (CertVerifyChainPolicy) - The first thing to do with this event is determine what policy is getting verified.  There are several types of policy checks that this event can report (see the CertVerifyChainPolicy link above for the list of policy checks).  Given the policy, it will show either success or failure.  The policy check itself could pass and the event still show a failure: if Event ID 11 fails because of a revocation check failure, you will see this same failure here.

 

Event ID 41 Verify Revocation (CertVerifyRevocation) - Shows what CAPI2 knows about the status of the CryptoAPI cache in reference to revocation information.   

 

Event ID 42 Reject Revocation Information (CertRejectedRevocationInfo) - Shows that CryptoAPI cached data is being rejected as it is either stale or needs to go off system to get the latest CRL / OCSP response from the network. 

 

Event ID 53 Retrieve Object from Network (CryptRetrieveObjectByUrlWire) - Shows the status of attempting to access a specific AIA or CDP URI (Uniform Resource Identifier).  It will give you the call status too. 

 

Example of troubleshooting with CAPI2 logging enabled 
  1. First filter the collected CAPI2 event log with the following:  11,30,41-42,51-53,90 
  2. Click on Find and type something unique about the certificate.  Either Subject Name or thumbprint value. 

     

  3. We can see the first instance where the subject name is found, and it is shown as an error.   

When looking at these events you want to have the Detail, Friendly View when reviewing the entries. 

 

4. Now when looking at the event, we will be interested in looking at multiple events in the logs to determine what is going on.  Typically, in this type of problem, you want to look at the events in the following order:  90, 11, 30, and lastly 53. 

 

 

    5. We can see in event 11 that this is failing a revocation check.  You will want to pay attention to the TrustStatus field in the Details section.  The first TrustStatus is the overall TrustStatus.  This tells you about the entire chain and specifically that one of the certificates in the trust path failed revocation.   

 Below the overall TrustStatus, it will show each individual Chain Element (each certificate in the chain) in the certificate trust path and its TrustStatus.

 

From looking at the above, we can determine that the ADATAM-WEB01-MSCEP-RA certificate is the one that is failing the revocation check.  This means we need to look at the CA that issued this certificate and validate that its CRL is reachable and valid at the URIs (Uniform Resource Identifier) in question. 

If the PKI hierarchy has more CA’s, you may discover that the RA certificate is valid and that the Issuing CA’s certificate failed validation.  If that is the case, it would mean that there is an issue with the Root CA’s CRLs (certificate revocation lists). 

 

6. We are not going to look at the Event 30s as there are no policy checks that would be validated in the context of NDES RA certificates being valid or not. 

7. Next we would typically jump right to Event 53s to see what might be going on with accessing the CRL / OCSP URLs. 

First thing to look at is the URL in the event.  This tells us what path it is trying to access: 

 

 

 

 The next part is we can see the HTTP Request and HTTP Response from the OCSP Server. 

 

 It was an error of HTTP 503.  This tells us there is an issue with the OCSP Server that must be addressed to resolve the NDES problem. 

 

8.  Another example is trying to access a Certificate Revocation List (CRL) file. 

 

 

 

 

 It was able to successfully download the CRL file as evident by the following HTTP 200 (OK): 

 

 

9.  But after successfully downloading the CRL file we see Event 42 error. 

 

 

So, we can see that it is stating that the CRL at the HTTP URL path is no longer valid. 

 

Usually, when an HTTP 500 error is seen and is related to revocation checking, the cause is an unreachable or expired CRL.   In most of the cases I have seen, it is the Root CA’s CRL that has expired, and the customer has forgotten to boot the Root CA and publish the new CRL that gets created at service start.

 

Common Network Device Enrollment Service (NDES) configuration wizard failures

Hey all! Rob Greene here. We see cases where the Network Device Enrollment Service (NDES) configuration wizard fails to complete successfully.

 

 

Please keep in mind that you can get these error messages outside of NDES installation, however we are not going to be covering those errors within this blog.  This blog is going to concentrate on the assumption that everything is working fine in general with regards to issuing certificates within the environment, but the NDES configuration wizard is failing.

 

The most often encountered errors by customers are:

  • Access Denied
  • RPC Communication
  • AD CS Service Stop / Start

Access Denied Message

 

The first error message is the dreaded Access Denied error message while running through the wizard like the one below.

 

 

Or if looking at the deployment operational logs:

Event Viewer\Application and Services Logs\Microsoft\Windows\CertificateServices-Deployment\Operational

Log Name:      Microsoft-Windows-CertificateServices-Deployment/Operational
Source:        Microsoft-Windows-CertificateServices-Deployment
Date:          [Date/Time]
Event ID:      104
Level:         Error
User:          [DOMAIN\USER]
Computer:      [NDES Computer Name]

Description:
System.Exception:
System.Exception: CMSCEPSetup::InitializeDefaults: Access is denied. 0x80070005 (WIN32: 5 ERROR_ACCESS_DENIED)
   at Microsoft.CertificateServices.ServerManager.DeploymentPlugIn.Provider.PowerShellCommandExecutor.Execute(Command command, IPowerShellEngine powerShellEngine, IRehydrator rehydrator)
   at Microsoft.CertificateServices.ServerManager.DeploymentPlugIn.Provider.PowerShellExecutablePR`2.ExecuteCommand(CommandParameter[] parameters)
   at Microsoft.CertificateServices.ServerManager.DeploymentPlugIn.Provider.NDES.Operations.Initialize.Execute(PostConfigurationTaskData taskData)
   at Microsoft.CertificateServices.ServerManager.DeploymentPlugIn.Provider.AsyncOperationP`1.DoWork(Object sender, DoWorkEventArgs eventArgs)

 

There are several tasks that are happening when going through the configuration wizard, and most of these tasks require an elevated account.  Due to this account elevation requirement, if Access Denied is being seen during the configuration, it will mean that the account running the wizard does not have the required permissions.

 

Here is the list of tasks that are done:

  1. Modify the permissions on the certificate template named:  IPSec (Offline request).  It adds the Application Pool Identity account that was specified in the NDES configuration wizard with Enroll permissions for the template.

Use the CertTmpl.msc console while logged in as the account used to run the NDES configuration wizard to try and set Enroll permissions on the template.  Was it able to successfully set permissions on this template?

 

  2. Modify the CertificateTemplates attribute on the CA's pKIEnrollmentService object.  The object is in the Configuration partition (CN=CA NAME,CN=Enrollment Services,CN=Public Key Services,CN=Services,CN=Configuration,DC=Forest Root Domain) of the CA it is targeting.  The following template names are added to the CertificateTemplates attribute: IPSECIntermediateOffline, CEPEncryption, and EnrollmentAgentOffline.

Use the CertSrv.msc console on the CA computer while logged in as the account used to run the NDES configuration wizard and try and add these templates to the CA.  Was it able to successfully add the templates to the CA?

 

  3. Stop and Start of the Active Directory Certificate Services service on the certification authority (CA) computer.

From the NDES Server, use Services.msc console and try and restart the AD CS service while logged in as the account used to run the NDES configuration wizard. Was it able to stop and start the AD CS Service?

 

If any of these tasks fail, you will see the error message of Access Denied.  So, the first thing to check is to ensure that the account used to run the NDES configuration wizard can do each of these tasks independently of the wizard.
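If you want to look at the pKIEnrollmentService object directly while checking the second task, you will need its distinguished name. Here is a small Python sketch that builds it from the pattern given above. The function name is ours, and it does not escape special characters (such as commas) in the CA name, so treat it as a starting point only.

```python
def enrollment_service_dn(ca_name, forest_root_dns_name):
    """Build the DN of a CA's pKIEnrollmentService object."""
    # fabrikam.com becomes DC=fabrikam,DC=com
    domain_components = ",".join(
        "DC=" + label for label in forest_root_dns_name.split("."))
    return ("CN=" + ca_name + ",CN=Enrollment Services,CN=Public Key Services,"
            "CN=Services,CN=Configuration," + domain_components)


print(enrollment_service_dn("Fabrikam Root CA1 G2", "fabrikam.com"))
```

Permissions on this object are what decide whether the account running the wizard can update the CertificateTemplates attribute.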

 

How RPC communication works

Remote Procedure Call (RPC) has two components.

 

Endpoint Mapper – The endpoint mapper listens on port TCP 135. The point of the endpoint mapper is to have a database of each RPC based application (via UUID) and then know what high / ephemeral port the RPC application is listening on.

 

RPC application / DCOM application - When a DCOM or RPC based application starts up, it finds an available high port (also known as an ephemeral port), typically in the range of 49152 – 65535. Once it finds a port, it registers its RPC application (identified by its UUID) and that port with the RPC Endpoint Mapper.

 

When an RPC / DCOM based client application wants to connect to the RPC/DCOM application it first contacts the RPC Endpoint Mapper and asks to be given the port number for the RPC/DCOM application via the UUID information. The endpoint mapper looks this information up and then returns the high port that the RPC / DCOM application gave it. Then the RPC / DCOM client application attempts to connect to the high port given to it by the RPC endpoint mapper.
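The register-then-look-up flow is easier to see as a toy model. The Python sketch below is purely conceptual: the class, the UUID, and the port number are made up for illustration, and nothing here talks to the real RPC Endpoint Mapper.

```python
class ToyEndpointMapper:
    """Conceptual stand-in for the RPC Endpoint Mapper listening on TCP 135."""

    LISTEN_PORT = 135

    def __init__(self):
        self._port_by_uuid = {}

    def register(self, uuid, port):
        """Called by an RPC / DCOM application once it has picked a high port."""
        self._port_by_uuid[uuid] = port

    def lookup(self, uuid):
        """Called by a client; returns the registered port or None."""
        return self._port_by_uuid.get(uuid)


# Hypothetical UUID and port for illustration.
mapper = ToyEndpointMapper()
mapper.register("00000000-0000-0000-0000-0000000000aa", 49670)
port = mapper.lookup("00000000-0000-0000-0000-0000000000aa")
print("client connects next to port", port)
```

This is why a firewall has to allow both TCP 135 and the high port range: the first connection only tells the client where to go for the second one.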

 

For more information on RPC and how it works see this:

https://learn.microsoft.com/en-us/troubleshoot/windows-client/networking/rpc-errors-troubleshooting

The RPC server is unavailable (RPC_S_SERVER_UNAVAILABLE) – 0x800706ba / 1722

 

Other than Access Denied, this is the most often seen error when running the configuration wizard.

 The RPC Server is unavailable.  0x800706ba (WIN32: 1722 RPC_S_SERVER_UNAVAILABLE) dialog box.

 

The event log entry for this is going to look something like the below:

Log Name:      Microsoft-Windows-CertificateServices-Deployment/Operational
Source:        Microsoft-Windows-CertificateServices-Deployment
Date:          [Date/Time]
Event ID:      104
Level:         Error
User:          [DOMAIN\USER]
Computer:      [NDES Computer Name]

Description:

Microsoft.CertificateServices.Deployment.Common.NDES.NetworkDeviceEnrollmentServiceSetupException:
Microsoft.CertificateServices.Deployment.Common.NDES.NetworkDeviceEnrollmentServiceSetupException: The Network Device Enrollment Service setup failed because certification authority (CA) "[CA COMPUTERNAME]\CA NAME" could not be contacted. Make sure that the CA is properly configured and available. The error is: The RPC server is unavailable. 0x800706ba (WIN32: 1722 RPC_S_SERVER_UNAVAILABLE)
   at Microsoft.CertificateServices.ServerManager.DeploymentPlugIn.Provider.PowerShellCommandExecutor.Execute(Command command, IPowerShellEngine powerShellEngine, IRehydrator rehydrator)
   at Microsoft.CertificateServices.ServerManager.DeploymentPlugIn.Provider.NDES.NDESPSHProviderContext.Validate()
   at Microsoft.CertificateServices.ServerManager.DeploymentPlugIn.Provider.NDES.Operations.SetCAConfiguration.Execute(CAConfigurationParameters caInformationParameter)
   at Microsoft.CertificateServices.ServerManager.DeploymentPlugIn.DeploymentWizard.Common.ViewModels.CAConfigurationViewModel.Validate()

 

This type of error will come from a few different scenarios.

 

DCOM Permissions / Hardening Mismatch issues:

 

Run the following command from the NDES Server and target the Certification Authority that is the specific CA the NDES server will be a proxy for. If you get back Access Denied, then you will have problems with DCOM permissions.

 

CA Computer Name of: fab-rt-rootca01.fabrikam.com
CA Name of:  Fabrikam Root CA1 G2
CertUtil -Config "fab-rt-rootca01.fabrikam.com\Fabrikam Root CA1 G2" -ping

 

See the following KB:  KB5004442—Manage changes for Windows DCOM Server Security Feature Bypass (CVE-2021-26414)
https://support.microsoft.com/en-us/topic/kb5004442-manage-changes-for-windows-dcom-server-security-feature-bypass-cve-2021-26414-f1400b52-c141-43d2-941e-37ed901c769c

This could be from the DCOM Hardening setting being mismatched between the NDES Server and the Certification Authority.

 

Ports being blocked by Firewalls

 

A Firewall (hardware based or software based) is preventing RPC / DCOM communications between the NDES Server and the server running the Certification Authority Service.  To see if this is an issue you can run the following CertUtil.exe command.

CA Computer Name of: fab-rt-rootca01.fabrikam.com
CA Name of: Fabrikam Root CA1 G2
CertUtil -Config "fab-rt-rootca01.fabrikam.com\Fabrikam Root CA1 G2" -ping

When things are correct you should see output like this:

Connecting to fab-rt-rootca01.fabrikam.com\Fabrikam Root CA1 G2 ...
Server "Fabrikam Root CA1 G2" ICertRequest2 interface is alive (437ms)
CertUtil: -ping command completed successfully.

If this fails with “The RPC Server is unavailable (0x800706ba (WIN32: 1722 RPC_S_SERVER_UNAVAILABLE))”, then connectivity from the NDES Server to the Certification Authority needs to be investigated.

 

While running the above CertUtil command, collect double-sided network traces. Double-sided network traces means you run a network tracing tool on the NDES Server and the Certification Authority at the same time. Look in the resultant traces and see whether traffic to the required ports is leaving the NDES Server and successfully reaching the Certification Authority server.
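As a quick first check before collecting traces, you can test whether a TCP connection to the endpoint mapper port opens at all. This Python sketch uses only the standard socket module; the function name is ours, and a successful connect to TCP 135 says nothing about the high port that the CA's RPC interface registered, so it complements the CertUtil -ping test rather than replacing it.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, `tcp_port_open("fab-rt-rootca01.fabrikam.com", 135)` run from the NDES Server tells you whether the endpoint mapper is reachable from there.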

Service Control Manager times out waiting for AD CS Service to Stop and Start

 

As stated earlier, the NDES configuration wizard needs to be able to successfully stop and start the AD CS Service on the Certification Authority server.  Even if you can stop and start the service, NDES configuration can still fail if the AD CS Service cannot be stopped and started within a 30-second window.

 

NDES stops and starts the service via the Service Control Manager (SCM) APIs.  If you have ever attempted to stop/start a service and noticed it does not stop/start quickly, you might have seen a message stating that Service Control Manager cannot tell you if the service was successfully stopped / started, as it did not report back in a timely fashion.  SCM will only wait 30 seconds for the service to return the status of the stop/start command it sent.  Once the service takes longer than 30 seconds, SCM stops waiting for it.

 

NDES first sends the stop command to SCM for AD CS, then uses SCM to find out when the service is successfully stopped.  It then sends the start command to SCM for AD CS and again uses SCM to find out when the service is successfully started.

 

We typically see this fail in the following two scenarios:

  1. The AD CS Service uses a Hardware Security Module (HSM), and the AD CS service does not start quickly because it requires Operator Cards or because communication with the HSM is slow.
  2. The AD CS Service just takes a long time to stop and start.  This typically happens because the AD CS auditing setting “Start and stop Active Directory Certificate Services” is enabled on the Certification Authority.  To turn this setting off:
    1. Launch CertSrv.msc
    2. Right click on the CA’s computer object and select Properties.
    3. Click on the Auditing tab.
    4. Uncheck “Start and stop Active Directory Certificate Services”.
    5. Click the OK button.
    6. In an elevated command prompt type:  Net Stop CertSvc & Net Start CertSvc

Depending on how long the service takes to stop and start with either or both of these issues, the Service Control Manager (SCM) can be configured to wait longer than the default 30 seconds, as described in the following TechNet Wiki content.

Event ID 7011: Service Timeout - TechNet Articles - United States (English) - TechNet Wiki (microsoft.com)

 

Increase the service timeout period for Service Control Manager (SCM)

 

The Service Control Manager will generate an event if a service does not respond within the defined timeout period (the default timeout period is 30000 milliseconds). To resolve this problem, use the Registry Editor to change the default timeout value for all services.

To perform this procedure, you must have membership in the Administrators group, or you must have been delegated the appropriate authority.

Caution: Incorrectly editing the registry may severely damage your system. Before making changes to the registry, you should back up any valued data.

To change the service timeout period:

  1. Click the Start button, then click Run, type regedit, and click OK.
  2. In the Registry Editor, click the registry subkey HKLM\SYSTEM\CurrentControlSet\Control.
  3. In the details pane, locate the ServicesPipeTimeout entry, right-click that entry, and then select Modify.
     Note: If the ServicesPipeTimeout entry does not exist, you must create it by selecting New on the Edit menu, then DWORD Value, typing ServicesPipeTimeout, and pressing Enter.
  4. Click Decimal, enter the new timeout value in milliseconds and then click OK.
  5. Restart the computer.
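On Windows Server Core there is no Start menu, so the same change is usually made from the command line. The following is a minimal Python sketch that converts a timeout in seconds to milliseconds and builds the equivalent reg add command line. The function name is hypothetical and the 60-second value is only an example, not a recommendation.

```python
def services_pipe_timeout_command(seconds: int) -> str:
    """Build a `reg add` command that sets ServicesPipeTimeout.

    The registry stores the timeout in milliseconds, so the value is
    converted before it is placed on the command line.
    """
    if seconds <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    milliseconds = seconds * 1000
    return (
        r'reg add "HKLM\SYSTEM\CurrentControlSet\Control" '
        f"/v ServicesPipeTimeout /t REG_DWORD /d {milliseconds} /f"
    )


# Example: ask SCM to wait 60 seconds instead of the default 30.
print(services_pipe_timeout_command(60))
```

Run the printed command in an elevated command prompt and restart the computer, as in step 5 above.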

 

If you ran into one of these errors, I hope this was helpful in determining what was going on and in resolving the issue.

 

 

 

Preview of SAN URI for Certificate Strong Mapping for KB5014754

Hello, this is Matthew Palko, senior product management lead in Enterprise & Security, and today I have some information to share about the new changes to strong certificate mapping in Active Directory.

 

KB5014754, released in May 2022, introduced changes to Active Directory Kerberos Key Distribution Center (KDC) behavior on Windows Server 2008 and later when validating certificates during certificate-based authentication. These changes were made to address elevation of privilege vulnerabilities that leverage certificate spoofing. 

The KDC changes require certificates for a user or computer object to be strongly mapped to Active Directory. The KB describes multiple mapping options, including manual mapping options and automatic mapping that will populate an OID extension with a device or user SID for online certificate templates from Active Directory Certificate Services (AD CS).   

We are announcing the preview of a new strong mapping format that will work with KDCs running Windows Server Preview Build 25246 and later. This mapping uses the user SID and can be used for manual mapping and offline certificate requests. This new mapping is a Subject Alternative Name (SAN) tag-based URI which uses the following format: 

URL=tag:microsoft.com,2022-09-14:sid:<value> 

In this SAN URI, “microsoft.com” and “2022-09-14” are hard-coded values which should not be modified. The only value that needs to be provided when using the SAN URI is the user or device SID which will replace the <value> field.  
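To make the format concrete, here is a minimal Python sketch that builds the SAN URI from a SID and rejects input that does not look like a SID. The function name, the validation pattern, and the SID in the example are assumptions for illustration; only the tag format itself comes from the text above.

```python
import re

# Loose shape check for a SID string such as S-1-5-21-...-1104.
# This pattern is an assumption for this sketch, not an official grammar.
SID_PATTERN = re.compile(r"^S-1-\d+(-\d+)+$")


def strong_mapping_san_uri(sid: str) -> str:
    """Return the tag URI for the given user or device SID."""
    if not SID_PATTERN.match(sid):
        raise ValueError(f"not a SID: {sid!r}")
    # "microsoft.com" and "2022-09-14" are fixed parts of the format.
    return f"tag:microsoft.com,2022-09-14:sid:{sid}"


# Placeholder SID; use the real objectSid of the account from AD.
print("URL=" + strong_mapping_san_uri("S-1-5-21-1004336348-1177238915-682003330-1104"))
```

The "URL=" prefix is what the certificate request uses to mark the SAN entry as a URI.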

Below is an example of a certificate that has been issued with this SAN URI. Under the Subject Alternative Name field, the tag is listed in the Value section populated with a user’s SID.  

 

 

Existing strong mappings that are described in KB5014754 are not being modified and this new mapping is an additional option to provide more flexibility in issuing certificates that meet the strong mapping requirement. 

Strong mapping is currently not enabled by default. If you are implementing strong mapping using the SAN URI and want to verify that it works, you can use the audit events described in KB5014754. 

Considerations for Schannel 

Schannel-based servers with KB5014754 installed will attempt to map client certificates by default. This requires the server to query a Domain Controller to confirm the mapping. If a server is not running in a domain environment and does not need certificate mapping, mapping can be disabled by setting the SCH_CRED_NO_SYSTEM_MAPPER flag in SCH_CREDENTIALS on the server. If a server is part of a domain environment, disabling certificate mapping for Schannel will disable the protections against the elevation of privilege vulnerabilities that KB5014754 is meant to address, which will leave your environment at risk of those attacks. 

Example Certificate INF 

This is an example inf file for a smart card certificate request that includes the new SAN URI field: 

[Version] 

Signature="$Windows NT$" 

 

[Strings] 

szOID_SUBJECT_ALT_NAME = 2.5.29.17 

szOID_ENHANCED_KEY_USAGE = 2.5.29.37 

szOID_PKIX_KP_SERVER_AUTH = 1.3.6.1.5.5.7.3.1 

szOID_PKIX_KP_CLIENT_AUTH = 1.3.6.1.5.5.7.3.2 

szOID_KP_SMARTCARD_LOGON = 1.3.6.1.4.1.311.20.2.2 

 

[NewRequest] 

; list of Keys / Values can be found here:  https://learn.microsoft.com/en-us/windows-server/administration/windows-commands/certreq_1#newrequest 

subject="CN=USER DN" 

; put the FQDN of the server or name of the user after CN= 

; if you need a blank subject field use Subject="CN="

 

KeySpec = AT_NONE 

; KeySpec can only be set to one value, and NOT Multiple values. 

; If set to 1 AT_KEYEXCHANGE, used for Exchange (Encryption/RSA). This is used with Cryptographic Service Providers (CSP). 

; If set to 2 AT_SIGNATURE, used for Signature (think Diffie-Hellman, but can use RSA signing too).  This is used with Cryptographic Service Providers (CSP). 

; If set to 0 AT_NONE, used with Key Storage Providers (KSP) typically. 

 

KeyLength = 2048 

; Can be 1024, 2048, 4096, 8192, or 16384, default is 1024 

 

KeyUsage= "CERT_DIGITAL_SIGNATURE_KEY_USAGE | CERT_NON_REPUDIATION_KEY_USAGE | CERT_KEY_ENCIPHERMENT_KEY_USAGE" 

; CERT_DIGITAL_SIGNATURE_KEY_USAGE -- 80 (128) 

; CERT_NON_REPUDIATION_KEY_USAGE -- 40 (64) 

; CERT_KEY_ENCIPHERMENT_KEY_USAGE -- 20 (32) 

; CERT_DATA_ENCIPHERMENT_KEY_USAGE -- 10 (16) 

; CERT_KEY_AGREEMENT_KEY_USAGE -- 8 

; CERT_ENCIPHER_ONLY_KEY_USAGE -- 1 

; CERT_DECIPHER_ONLY_KEY_USAGE -- 8000 (32768) 

 

;

; NOTE: All true/false should be in all lowercase letters!

;

 

Exportable=false 

; true - means that you can export the private key. 

; false - means you cannot export the private key. This is Default 

; If this is being used with a Smartcard, you may want to set it to false. 

 

MachineKeySet=false 

; true - means that the certificate is for a machine and not the logged on user. 

; false - Means the certificate is for the User. 

 

HashAlgorithm=Sha256 

;Hash Algorithm to be used for this request (CSR). 

; Sha256, sha384, sha512, sha1, md5, md4, md2 

 

SMIME = false  

; If this parameter is set to true, an extension with the object identifier value 1.2.840.113549.1.9.15 is added to the request.  

; The number of object identifiers depends on the operating system version installed and CSP capability,  

; which refer to symmetric encryption algorithms that may be used by Secure Multipurpose Internet Mail Extensions (S/MIME) applications such as Outlook. 

 

PrivateKeyArchive = false  

; true - means that the private key will be archived on the CA. 

; false - Means the private key will only reside on the requesting machine.  This is Default. 

; If you do archive the private key, then you MUST use a RequestType of CMC 

 

UseExistingKeySet = false  

; Used to specify that an existing key pair should be used in building a certificate request.  

; If this key is set to true, you must also specify a value for the RenewalCert key or the KeyContainer name.  

; You must not set the Exportable key because you cannot change the properties of an existing key. In this case, no key material is generated when the certificate request is built. 

 

ProviderName="Microsoft Software Key Storage Provider" 

; ProviderType=1 

; ProviderType setting NOT NEEDED for Key Storage Providers. 

; use Certutil -csplist to get the list of Provider names with its Provider Number. 

 

RequestType=CMC 

; Values are:  CMC, PKCS10, PKCS10-, PKCS7.  Default is PKCS10 

 

[Extensions] 

; list of Extensions can be found here:  https://learn.microsoft.com/en-us/windows-server/administration/windows-commands/certreq_1#extensions 

 

%szOID_ENHANCED_KEY_USAGE%="{text}%szOID_KP_SMARTCARD_LOGON%, 

_continue_ = %szOID_PKIX_KP_CLIENT_AUTH%" 

 

%szOID_SUBJECT_ALT_NAME% = "{text}UPN=user@contoso.com& 

_continue_ = EMail=user@contoso.com& 

_continue_ = URL=tag:microsoft.com,2022-09-14:sid:<value>" 

; list of options for SAN can be found here:  https://learn.microsoft.com/en-us/windows-server/administration/windows-commands/certreq_1#extensions 

;<value> should be replaced with the user's actual SID from AD 
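The KeyUsage comments in the INF list each flag with its hexadecimal and decimal value. As a worked example, here is a minimal Python sketch that ORs the named flags together so you can see the single number that the KeyUsage line in the sample represents. The table is copied from the INF comments above; the function name is hypothetical.

```python
# Flag values as listed in the INF comments (hexadecimal).
KEY_USAGE = {
    "CERT_DIGITAL_SIGNATURE_KEY_USAGE": 0x80,
    "CERT_NON_REPUDIATION_KEY_USAGE": 0x40,
    "CERT_KEY_ENCIPHERMENT_KEY_USAGE": 0x20,
    "CERT_DATA_ENCIPHERMENT_KEY_USAGE": 0x10,
    "CERT_KEY_AGREEMENT_KEY_USAGE": 0x08,
    "CERT_ENCIPHER_ONLY_KEY_USAGE": 0x01,
    "CERT_DECIPHER_ONLY_KEY_USAGE": 0x8000,
}


def combine_key_usage(*names: str) -> int:
    """Bitwise-OR the named key usage flags into one value."""
    value = 0
    for name in names:
        value |= KEY_USAGE[name]  # KeyError on a misspelled flag name
    return value


combined = combine_key_usage(
    "CERT_DIGITAL_SIGNATURE_KEY_USAGE",
    "CERT_NON_REPUDIATION_KEY_USAGE",
    "CERT_KEY_ENCIPHERMENT_KEY_USAGE",
)
print(hex(combined))  # 0x80 | 0x40 | 0x20
```

For the three flags used in the sample INF this works out to 0xe0 (224).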

A Print Nightmare Artifact - krbtgt/NT Authority

Hello! This is Jessev from the Directory Services team with some advice on how to deal with an annoyance created by the print spooler service. We on the Directory Services team tend to see this issue more often than our User Experience team, which handles printer issues.

 

The term “Print Nightmare” is related to the security vulnerability fixed in the July 6 2021 (7B.21) update. What is described in this blog post is a situation that can develop as a result of the fix for the so-called Print Nightmare vulnerability.

KB5005010: Restricting installation of new printer drivers after applying the July 6, 2021 updates - Microsoft Support

 

Common symptoms include slow or sluggish DCs, slow or sluggish print servers, slow print clients, clients that are unable to connect to print queues, and the like.

 

As with many performance issues, the source of the problem can fester and stew for a long time, until all of a sudden, problems begin to manifest. Sooner or later thresholds are breached and the symptoms begin. So, before you reach for the sword of MaxConcurrentAPI, to slay performance monsters on your DCs: Investigate, see if perhaps the source of the problem is this Print Nightmare artifact.

 

Before we get into it, I am going to assume you know how to take and evaluate a network trace. Any network sniffer will do, so long as it captures full packets and can parse Kerberos traffic. We here in Directory Services use Netmon 3.4 and Wireshark.

 

Cause

The issue is caused by the spooler service sending a bad Service Principal Name (SPN) to a Domain Controller (DC) by way of the InitializeSecurityContext function. These Kerberos ticket requests fail, so the client resorts to NTLM. The bad SPN sent by the spooler is “krbtgt/NT Authority”.

 

The client spooler will reach out to the print server frequently to check the queue, check connectivity, check for updated printer drivers and of course to send print jobs. Each of these connections requires some level of authentication. This extra authentication traffic leads to the performance problems and a loss of sanity by security teams, wondering why their logs are flooded with Kerberos and NTLM traffic.

 

Symptoms

Symptoms tend to fall into these three areas:

  • Slow Print Jobs, slow update of print queue, printer driver update failures
  • Slow DCs, slow logons, replication issues
  • Excessive Kerberos and NTLM traffic

 

Identify and Prove Data Collection

The best source of data for finding the artifact is a network trace from a DC or a known noisy client. The artifact will not be seen on the print server itself, unless that print server is also a DC. Note that the clients could be using any DC in the domain; it is on these DCs that the artifact will be found. Take the network and site topologies, and the relative location of the print server(s), into account when deciding on which DC to collect the network trace. When in doubt, collect the network trace on the DC closest to the clients, on the PDC emulator, or on a client that you know is furiously attempting to authenticate.

 

Evaluating the Network trace

Once you have the trace, filter on Kerberosv5 (just ‘kerberos’ for Wireshark) and then look for frames that display the error, KDC_ERR_S_PRINCIPAL_UNKNOWN. In the frame details, expand Kerberos, then the Kerberos error and look for the sname, krbtgt/NT Authority. This sname is our artifact. Also, if this is the source of performance problems, there will be no lack of example frames.

 

More precise filters:

Netmon: KerberosV5.KrbError.Sname.String.NameString.String.String.OctetStream == "NT Authority"

Wireshark: kerberos.SNameString == "NT Authority"

 

An example image from Netmon is below:

 

Mitigation

The mitigation for this is the registry value RpcNamedPipeAuthentication. If the goal is for short term relief, change the value on the top-talkers identified using the method described with Log Parser Studio (LPS) in the section below. For a more long-term solution, this value may be deployed using a group policy preference item. 

 

Registry Stuff! We set this on the clients:

HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\Windows NT\Printers\RPC
RpcNamedPipeAuthentication = 0x2 (DWORD)

 

Setting RpcNamedPipeAuthentication to 0x2 does not lead to a security vulnerability. This registry value controls whether the client machine sends authentication information to the remote machine for RPC over named pipes calls. Depending on the configuration of the environment, the remote machine might reject an incoming RPC call that lacks the authentication information, but that does not open a security vulnerability.
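For quick relief on a handful of top talkers, before a group policy preference item is in place, the value can be imported from a .reg file. The following is a minimal Python sketch that generates the text of such a file for the key and value named above. The function name is hypothetical, and this is only one of several ways to push a registry value.

```python
def dword_reg_file(key_path: str, value_name: str, value: int) -> str:
    """Return the text of a .reg file that sets a single DWORD value."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("a DWORD must fit in 32 bits")
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{key_path}]\n"
        # .reg files write DWORDs as eight hexadecimal digits.
        f'"{value_name}"=dword:{value:08x}\n'
    )


PRINTER_RPC_KEY = (
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\Windows NT\Printers\RPC"
)
print(dword_reg_file(PRINTER_RPC_KEY, "RpcNamedPipeAuthentication", 0x2))
```

Save the output with a .reg extension and import it on the client from an elevated prompt.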

 

Top Talkers

Now that we have evidence that the Print Nightmare Artifact could be the source of our problems, we can target top talkers for symptom relief. We do this by collecting netlogon logs from one or more DCs and then aggregating the log(s) into a table using Log Parser Studio. 

 

Ok ... so how do we know from which DCs to collect netlogon logs?

 

When the client fails to authenticate with Kerberos, it falls back to NTLM. The print server passes the client's NTLM authentication through to the DC with which the print server has its secure channel. To find the print server’s secure channel, use the method below.

 

On the print server run the following from an elevated command prompt, using contoso.com as the example domain:

 

Nltest /sc_verify:contoso.com

 

The result will resemble the following:

 

Flags: b0 HAS_IP  HAS_TIMESERV

Trusted DC Name \\dc.contoso.com

Trusted DC Connection Status Status = 0 0x0 NERR_Success

Trust Verification Status = 0 0x0 NERR_Success

The command completed successfully

 

In this case, the DC with which the print server has its secure channel is dc.contoso.com.
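If you have many print servers to check, the DC name can be pulled out of saved Nltest output. This is a minimal Python sketch that assumes the output contains a "Trusted DC Name" line in the form shown above; the function name is hypothetical.

```python
import re
from typing import Optional


def trusted_dc_name(nltest_output: str) -> Optional[str]:
    """Return the DC named on the 'Trusted DC Name' line, if present."""
    match = re.search(
        r"^\s*Trusted DC Name\s+\\\\(\S+)",  # the name follows two backslashes
        nltest_output,
        re.MULTILINE,
    )
    return match.group(1) if match else None
```

The function returns None when the line is missing, for example when the secure channel check itself failed.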

 

Netlogon Logging

Now that we have identified the DC, we can enable netlogon logging. Note that, if a preferred DC could not be identified, we can collect netlogon logs from all or some of the DCs. Collect from DCs that are in the same site as the print server if nothing else.

 

The commands below assume an elevated command prompt.

 

To enable Netlogon logging:

 

Nltest /dbflag:0x2080ffff

 

To disable Netlogon logging:

 

Nltest /dbflag:0x0

 

The default netlogon log location is here: c:\windows\debug\netlogon.log.

 

Use the frequency of the Kerberos errors in the network trace to judge how long to let Netlogon logging collect data. Netlogon logging is relatively lightweight, so you can leave it running for as long as desired. The Netlogon log rolls over first-in, first-out (FIFO); the default maximum size is 20 MB.

 

Log Parser Studio

Once we have our netlogon.log file(s), we can evaluate that data with Log Parser Studio (LPS).

 

Download and Install

Log Parser Studio (LPS) can be downloaded here.

 

Create the Query
  1. In LPS, click on File on the menu bar, then New, and then New Query.
  2. In the bottom window, delete the default contents and paste in the query below. When you paste in the query, mind the word wrap and make sure to clean up any leading or trailing spaces.

SELECT EXTRACT_SUFFIX(TEXT,0,'Returns ') AS ERR,

TO_UPPERCASE (extract_prefix(extract_suffix(TEXT, 0, 'logon of '), 0, 'from ')) as UserName,

TO_UPPERCASE (extract_prefix(extract_suffix(TEXT, 0, 'from '), 0, 'Returns ')) as MachineName,

 COUNT(*) FROM '[LOGFILEPATH]'

WHERE INDEX_OF(TO_UPPERCASE (TEXT),'SAMLOGON') >0

AND INDEX_OF(TO_UPPERCASE(TEXT),'RETURNS') >0

AND NOT INDEX_OF(TO_UPPERCASE(TEXT),'KERBEROS') >0

GROUP BY ERR, UserName, MachineName ORDER BY COUNT(*) DESC

 

  3. Click the Log Type item on the middle bar and choose TEXTLINELOG.

 

Run Query
  1. Click the Log button from the tool bar and navigate to the directory that contains your netlogon log(s).
  2. Click the Run Query button (the red exclamation icon).

 

Examine Results

The query will sort the list with the top talkers at the top. Each line will resemble the following:

ERR    UserName        MachineName                          Count(All *)
0x0    CONTOSO\jdoe    WORKSTATION01 (VIA PRINTSERVER01)    2416

 

 

Once the query has completed running, you may export the results to a CSV using the Green Arrow icon on the toolbar.
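If Log Parser Studio is not available, the same aggregation can be approximated with a short script. The following is a minimal Python sketch that mirrors the query's filters and text extraction. It assumes each relevant netlogon.log line contains the literal markers "logon of ", " from " and " Returns ", exactly as the query does; the function name is hypothetical.

```python
from collections import Counter


def top_talkers(lines):
    """Count SamLogon 'Returns' lines per (error, user, machine).

    Mirrors the LPS query: keep lines that mention SAMLOGON and RETURNS
    but not KERBEROS, then split on the same text markers.
    """
    counts = Counter()
    for line in lines:
        upper = line.upper()
        if "SAMLOGON" not in upper or "RETURNS" not in upper or "KERBEROS" in upper:
            continue
        try:
            after_logon = line.split("logon of ", 1)[1]
            user, rest = after_logon.split(" from ", 1)
            machine, err = rest.split(" Returns ", 1)
        except (IndexError, ValueError):
            continue  # line does not have the expected shape
        counts[(err.strip(), user.strip().upper(), machine.strip().upper())] += 1
    # most_common() sorts by count, highest first, like ORDER BY COUNT(*) DESC.
    return counts.most_common()
```

To use it, open netlogon.log with Python's open(), pass the file object to top_talkers(), and print the first few entries to see the noisiest clients.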

 

The registry value change, described previously, can be applied to these top talkers. Once the value is changed on the client and the client is rebooted, a new netlogon log can be examined to prove that the client has stopped being so noisy. This can be helpful in a situation where change control requires proof of a solution before being rolled out to the enterprise.

 

Conclusion

This scenario can lead to a lot of difficult to define problems. DCs can become slow causing logon delays, replication errors and the like. Print servers slow down, printer clients can have problems sending print jobs, checking the queue and updating printer drivers.

 

Once the scenario is confirmed, the fix can be rolled out to the entire enterprise with a group policy preference item. If immediate relief is required, because an enterprise-wide change must pass through change control and ‘InfoSec’ teams, use Log Parser Studio to evaluate the netlogon logs and identify the clients that are creating the most load, and get them under control.

 

(Though I've no ETA, and it is just a rumor, this particular value might be exposed in the future as an individual setting in Group Policy, instead of having to deploy it with a Group Policy Preference item (registry change)).

 

P.S. Special thanks to two customers, MJ and MB, who helped in one fashion or another in the creation of this blog post - You know who you are!

Having issues since deploying November 2022 Security Updates to your domain controller?

Hello, Chris Cartwright here from Directory Services support team. Taking a breather from the phone calls. In the past few weeks, there have been a large number of questions, rumors, and suggestions thrown around about the November 2022 security updates.

 

Microsoft Support recommends that you read these articles to gain the most understanding of topics discussed in this and related blogs:

 

There are two issues that we are currently seeing after installing the November 2022 security update or the Out of Band (OOB) version of this update. Please review the associated blog posts below to determine if you need to take action on one, or perhaps both scenarios.

  1. Memory leaks within LSASS.exe on domain controllers.
    https://techcommunity.microsoft.com/t5/ask-the-directory-services-team/so-you-say-your-dc-s-memory-is-getting-all-used-up-after/ba-p/3696318 

  2. Kerberos authentication failures caused by non-intersecting encryption types (KDC_ERR_ETYPE_NOSUPP error).
    A behavior change was made that exposes failures in environments that restrict Kerberos encryption types such that the configured types no longer intersect, and/or in environments where FAST, Windows Claims, Compound Identity, or SID compression are configured.
    https://techcommunity.microsoft.com/t5/ask-the-directory-services-team/what-happened-to-kerberos-authentication-after-installing-the/ba-p/3696351 

So, you say your DC’s memory is getting all used up after installing November 2022 security update

Hello, Chris here from Directory Services support team with part 2 of the series.

 

After installing the November 2022/Out of Band update on your domain controllers you might experience a memory leak happening within LSASS.exe (Local Security Authority Subsystem Service).  This could affect domain controller performance, cause operational failures, and/or reliability issues. 

 

If you have already patched your domain controllers, the December 13, 2022 security update should resolve the known memory leak within LSASS.exe; see the table below. If you are not yet comfortable installing that update, read on for a workaround:

OS                        Resolving Rollup KB    Resolving Security Only Update
Windows Server 2019       5021237                N/A
Windows Server 2016       5021235                N/A
Windows Server 2012 R2    5021294                5021296
Windows Server 2012       5021285                5021303
Windows Server 2008 R2    5021291                5021288
Windows Server 2008       5021289                5021293

 

In short, there is a registry workaround for the memory leak.  If you haven’t yet installed the December update or newer, you can set the following registry value to avoid this problem.  Run the following command in an elevated command prompt on all of your domain controllers:

reg add "HKLM\System\CurrentControlSet\services\KDC" -v "KrbtgtFullPacSignature" -d 0 -t REG_DWORD

The above registry change will stop the memory leak without stopping and starting the KDC Service.  It WILL NOT free up memory that has already been leaked within LSASS.  So, it is recommended that a reboot be done of the domain controller when it is feasible to do so.

 

Note: Once you have installed the patch that resolves this known issue, you should either remove this value or set KrbtgtFullPacSignature to a higher setting depending on what your environment will allow. It is recommended to enable Enforcement mode as soon as your environment is ready. See: KB5020805: How to manage Kerberos protocol changes related to CVE-2022-37967

 

To see whether you are affected by this specific memory issue, watch the following performance counter in Perfmon.exe and check whether it rises constantly: 

\Process(lsass)\Private Bytes

 

You will want to monitor “Private Bytes” for LSASS over a period of time.  If this value just keeps increasing after installation of the November 2022/OOB update, then you are more than likely affected by this issue.  Under normal behavior, this value goes up during higher loads on the DC and comes back down over time when the DC is less busy. Please be aware that domain controllers will, by default, attempt to cache as much of the Active Directory database in memory as possible. See the linked section of Memory usage considerations in AD DS performance tuning | Microsoft Learn.
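If you export the counter samples, the "just keeps increasing" check can be made a little less subjective. Here is a minimal Python sketch, assuming a list of Private Bytes readings taken at regular intervals. The function name and the 90% threshold are arbitrary choices for illustration, not values from Microsoft guidance.

```python
def keeps_increasing(samples, rising_share=0.9):
    """Return True when nearly every sample is higher than the one before.

    samples: Private Bytes readings in time order.
    rising_share: fraction of intervals that must rise (assumed threshold).
    """
    if len(samples) < 3:
        raise ValueError("need at least three samples")
    steps = [later - earlier for earlier, later in zip(samples, samples[1:])]
    rising = sum(1 for step in steps if step > 0)
    return samples[-1] > samples[0] and rising / len(steps) >= rising_share
```

A series such as 100, 120, 150, 180, 230 (in MB) is flagged, while a series that climbs under load and falls back again is not. Treat the result as a hint and confirm it against the DC's workload before drawing conclusions.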

 

Information about the changes made to Kerberos Privilege Attribute Certificate (PAC) with the November 2022 security update:
KB5020805: How to manage Kerberos protocol changes related to CVE-2022-37967

 

Links to operating system versions affected by this issue:

https://learn.microsoft.com/en-us/windows/release-health/status-windows-10-1809-and-windows-server-2019#2966msgdesc

https://learn.microsoft.com/en-us/windows/release-health/status-windows-10-1607-and-windows-server-2016#2966msgdesc

https://learn.microsoft.com/en-us/windows/release-health/status-windows-8.1-and-windows-server-2012-r2#2966msgdesc

https://learn.microsoft.com/en-us/windows/release-health/status-windows-server-2012#2966msgdesc
https://learn.microsoft.com/en-us/windows/release-health/status-windows-7-and-windows-server-2008-r2-sp1#2966msgdesc

https://learn.microsoft.com/en-us/windows/release-health/status-windows-server-2008-sp2#2966msgdesc

 

Introduction to this blog series: https://techcommunity.microsoft.com/t5/ask-the-directory-services-team/having-issues-since-deploying-november-2022-security-updates-to/ba-p/3696512 

Part 3 of this blog series: https://techcommunity.microsoft.com/t5/ask-the-directory-services-team/what-happened-to-kerberos-authentication-after-installing-the/ba-p/3696351 

What happened to Kerberos Authentication after installing the November 2022/OOB updates?

Hello, Chris here from Directory Services support team with part 3 of the series.

 

With the November 2022 security update, some things were changed as to how the Kerberos Key Distribution Center (KDC) Service on the Domain Controller determines what encryption types are supported by the KDC and what encryption types are supported by default for users, computers, Group Managed Service Accounts (gMSA), and trust objects within the domain.

 

It is strongly recommended that you read the following article before going forward if you are not certain what Kerberos encryption types are or which ones are supported by the Windows operating system:

Understanding Kerberos encryption types: https://techcommunity.microsoft.com/t5/core-infrastructure-and-security/decrypting-the-selection-of-supported-kerberos-encryption-types/ba-p/1628797

 

Before we dive into what all has changed, note that there were some unexpected behaviors with the November update:

November out-of-band announcement: https://techcommunity.microsoft.com/t5/ask-the-directory-services-team/november-2022-out-of-band-update-released-take-action/ba-p/3680144

Kerberos changes related to Encryption Type: https://support.microsoft.com/en-us/topic/kb5021131-how-to-manage-the-kerberos-protocol-changes-related-to-cve-2022-37966-fd837ac3-cdec-4e76-a6ec-86e67501407d

November out-of-band guidance: https://learn.microsoft.com/en-us/windows/release-health/windows-message-center#2961

 

After installing Windows Updates released on November 8, 2022 on Windows domain controllers, you might have issues with Kerberos authentication.

This specific failure is identified by the logging of Microsoft-Windows-Kerberos-Key-Distribution-Center Event ID 14 in the System event log of DC role computers with this unique signature in the event message text:

 

While processing an AS request for target service <service>, the account <account name> did not have a suitable key for generating a Kerberos ticket (the missing key has an ID of 1). The requested etypes : <etype numbers >. The accounts available etypes : <etype numbers>. Changing or resetting the password of <account name> will generate a proper key.

 

Where (a.) “the missing key has an ID 1” and (b.) "4" is not listed in the "requested etypes" or "account available etypes" fields.
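When there are many Event ID 14 entries to go through, the two conditions above can be checked with a script. The following is a minimal Python sketch that assumes the event message follows the template quoted above, with the etype numbers separated by whitespace. The function name is hypothetical.

```python
import re


def matches_nov_2022_signature(message: str) -> bool:
    """Check an Event ID 14 message for the signature described above."""
    # Condition (a): the missing key has an ID of 1.
    if "missing key has an ID of 1" not in message:
        return False
    requested = re.search(r"requested etypes\s*:\s*([-\d\s]+)", message)
    available = re.search(r"available etypes\s*:\s*([-\d\s]+)", message)
    if not requested or not available:
        return False
    # Condition (b): "4" appears in neither etype list.
    etypes = requested.group(1).split() + available.group(1).split()
    return "4" not in etypes
```

Messages that do not follow the template return False, so review those by hand.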

 

You need to read the links above. If you are experiencing the signature above, Microsoft strongly recommends installing the November out-of-band (OOB) patch, which mitigated this regression. The OOB should be installed on top of or in place of the Nov 8 update on DC role computers, while paying attention to the special install requirements for Windows Updates on pre-WS 2016 DCs running on the Monthly Rollup (MR) or Security Only (SO) servicing branches.
Note that this out-of-band patch will not fix all issues. You should keep reading. If you tried to disable RC4 in your environment, you especially need to keep reading.

 

There was a change made to how the Kerberos Key Distribution Center (KDC) Service determines what encryption types are supported and what should be chosen when a user requests a TGT or Service Ticket.

 

Prior to the November 2022 update, the KDC made some assumptions:

  • If the User’s/GMSA’s/Computer’s/Service account’s/Trust object’s msDS-SupportedEncryptionTypes attribute was NULL (blank) or a value of 0, the KDC assumed the account only supported RC4_HMAC_MD5.
  • If the Windows Kerberos Client on workstations/Member Servers and KDCs are configured to ONLY support either one or both versions of AES encryption, the KDC would create an RC4_HMAC_MD5 encryption key as well as create AES Keys for the account if msDS-SupportedEncryptionTypes was NULL or a value of 0. This meant you could still get AES tickets.
  • Configurations where FAST/Windows Claims/Compound Identity/Disabled Resource SID Compression were implemented had no impact on the KDC’s decision for determining Kerberos Encryption Type. See below screen shot of an example of a user account that has these higher values configured but DOES NOT have an encryption type defined within the attribute.

 

After the November 2022 update, the KDC makes the following decisions:

  • If the User’s/GMSA’s/Computer’s/Service account’s/Trust object’s msDS-SupportedEncryptionTypes attribute is NULL (blank) or a value of 0, the KDC defaults to an RC4_HMAC_MD5 encrypted ticket with AES256_CTS_HMAC_SHA1_96 session keys if the DefaultDomainSupportedEncTypes registry value is not configured on the KDC. If this value IS configured, then the KDC uses the value defined in the registry.
  • If the User’s/GMSA’s/Computer’s/Service account’s/Trust object’s msDS-SupportedEncryptionTypes attribute is neither NULL nor a value of 0, the KDC will use the most secure intersecting (common) encryption type specified. If the KDC’s Kerberos client is NOT configured to support any of the encryption types configured in the account’s msDS-SupportedEncryptionTypes attribute, then the KDC will NOT issue a TGT or Service Ticket, as there is no common encryption type between the Kerberos client, the Kerberos-enabled service, and the KDC.

As explained above, the KDC no longer proactively adds AES support for Kerberos tickets. If AES is NOT configured on the objects, authentication will more than likely fail if RC4_HMAC_MD5 has been disabled within the environment. In other words, authentication interactions that worked before the November 2022 (11B) update, but should not have, now correctly fail.


On top of that, if FAST, Compound Identity, Windows Claims, or Resource SID Compression has been enabled on accounts that don’t have specific encryption types specified, the KDC will also NOT issue Kerberos tickets for them, because the msDS-SupportedEncryptionTypes attribute on those accounts is no longer NULL or 0. These technologies/functionalities are outside the scope of this article. You can read more about these higher bits here: FAST, Claims, Compound auth and Resource SID compression.

 

Steps to address the issues:

So now that you have the background as to what has changed, we need to determine a few things.

 

If the November 2022/OOB updates have been deployed to your domain controller(s), determine whether the domain controllers (KDCs) are failing to issue Kerberos TGTs or Service tickets. This can be done by filtering the System event log on the domain controllers for the following:

Event Log: System
Event Source: Kerberos-Key-Distribution-Center
Event IDs: 14, 16, 26, 27, 42
NOTE: For a detailed description of each event and what it means, see the section later in this article labeled: Kerberos Key Distribution Center Event error messages.
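
If you would rather save this filter than click through the dialog each time, the same criteria can be written as an Event Viewer XML query. The Python sketch below only assembles the query text. The function name is made up for the example, and it assumes the full provider name is Microsoft-Windows-Kerberos-Key-Distribution-Center, which is the name used for these events elsewhere on this page.

```python
# Sketch: build an Event Viewer XML query for the KDC events listed above.
import xml.etree.ElementTree as ET

KDC_PROVIDER = "Microsoft-Windows-Kerberos-Key-Distribution-Center"
KDC_EVENT_IDS = (14, 16, 26, 27, 42)


def build_kdc_query(event_ids=KDC_EVENT_IDS, provider=KDC_PROVIDER):
    """Return an XML query for the System log, filtered by provider and event IDs."""
    id_clause = " or ".join(f"EventID={event_id}" for event_id in event_ids)
    xpath = f"*[System[Provider[@Name='{provider}'] and ({id_clause})]]"
    query_list = ET.Element("QueryList")
    query = ET.SubElement(query_list, "Query", Id="0", Path="System")
    select = ET.SubElement(query, "Select", Path="System")
    select.text = xpath
    # encoding="unicode" returns a str instead of bytes
    return ET.tostring(query_list, encoding="unicode")
```

Calling print(build_kdc_query()) produces text that can be pasted into the XML tab of the Filter Current Log dialog after ticking "Edit query manually".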

 

If any of these events started appearing around the time the November security update was installed, then we already know that the KDC is having issues issuing TGTs or Service tickets.

 

First, we need to determine if your environment was configured for Kerberos FAST, Compound Identity, Windows Claims or Resource SID Compression. This can easily be done in one of two ways:

  1. Running the following Windows PowerShell command to show you the list of objects in the domain that are configured for these:
     Get-ADObject -Filter "msDS-supportedEncryptionTypes -bor 0xf0000 -and -not msDS-supportedEncryptionTypes -bor 0x1c"
  2. Running the 11B checker (see sample script at the end of the article found at GitHub - takondo/11Bchecker ) and checking the list of accounts under the section, “There are X objects that have msDS-SupportedEncryptionTypes configured, but no encryption protocol is allowed.”
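
To make the filter in method 1 easier to reason about, here is the same bit test written as a small Python predicate. It is only a restatement of that condition for attribute values you have already exported; the function name and constants are illustrative.

```python
# Sketch of the condition expressed by the Get-ADObject filter in method 1.

FEATURE_BITS = 0xF0000  # FAST, Compound Identity, Claims, Resource SID Compression
ETYPE_BITS = 0x1C       # RC4, AES128, AES256


def needs_explicit_etypes(value):
    """True if any feature bit is set but no RC4/AES encryption type is listed."""
    if value is None:
        # Attribute not populated: the object is not matched by the filter.
        return False
    return (value & FEATURE_BITS) != 0 and (value & ETYPE_BITS) == 0
```

A value of 0x10000 (FAST only) matches, while 0x10018 (FAST plus AES128 and AES256) does not.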

If any objects are returned, then you MUST configure the supported encryption types on each object’s msDS-SupportedEncryptionTypes attribute.  If a large number of objects in the Active Directory domain are returned, then it would be best to add the encryption types needed with one of the Windows PowerShell commands below:

Set-ADUser [sAMAccountName] -KerberosEncryptionType [CommaSeparatedListOfEtypes]

Set-ADComputer [sAMAccountName] -KerberosEncryptionType [CommaSeparatedListOfEtypes]

Set-ADServiceAccount [sAMAccountName] -KerberosEncryptionType [CommaSeparatedListOfEtypes]

 

Supported values for ETypes: DES, RC4, AES128, AES256
NOTE:  The value “None” is also supported by the PowerShell Cmdlet, but will clear out any of the supported encryption types.

 

If no objects are returned via method 1, or 11B checker doesn’t return any results for this specific scenario, it would be easier to modify the default supported encryption type for the domain via a registry value change on all the domain controllers (KDCs) within the domain. This is done by adding the following registry value on all domain controllers.

HKLM\System\CurrentControlSet\Services\KDC
Value Type: REG_DWORD
Value Name: DefaultDomainSupportedEncTypes

The value data required depends on which encryption types must be configured for the domain or forest for Kerberos Authentication to succeed again.

 

Some of the common values to implement are:
For AES128_CTS_HMAC_SHA1_96 and AES256_CTS_HMAC_SHA1_96 support, you would set the value to: 0x18. This will exclude use of RC4 on accounts with an msDS-SupportedEncryptionTypes value of NULL or 0 and require AES.

 

For RC4_HMAC_MD5, AES128_CTS_HMAC_SHA1_96 and AES256_CTS_HMAC_SHA1_96 support, you would set the value to: 0x1C. This will allow use of both RC4 and AES on accounts with an msDS-SupportedEncryptionTypes value of NULL or 0. Note: This will allow the use of RC4 session keys, which are considered vulnerable.

 

If you want to include AES256_CTS_HMAC_SHA1_96_SK (Session Key), then you would add 0x20 to the value. So, with the above examples, 0x18 would become 0x38, and 0x1C would become 0x3C.
More information about the supported encryption type bit flags can be found here: 

https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-kile/6cfc7b50-11ed-4b4d-846d-6f08f0812919
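
The arithmetic behind these values is a bitwise OR of the individual flags. The following Python sketch shows how 0x18, 0x1C, 0x38 and 0x3C are composed, and how a value such as 0x27 breaks back down into names. The flag values come from the MS-KILE table linked above; the helper names are only for illustration.

```python
# Supported encryption type bit flags (low bits of msDS-SupportedEncryptionTypes
# and DefaultDomainSupportedEncTypes).
ENC_TYPE_FLAGS = {
    "DES_CBC_CRC": 0x01,
    "DES_CBC_MD5": 0x02,
    "RC4_HMAC_MD5": 0x04,
    "AES128_CTS_HMAC_SHA1_96": 0x08,
    "AES256_CTS_HMAC_SHA1_96": 0x10,
    "AES256_CTS_HMAC_SHA1_96_SK": 0x20,
}


def enc_types_value(*names):
    """OR together the flags for the given encryption type names."""
    value = 0
    for name in names:
        value |= ENC_TYPE_FLAGS[name]
    return value


def describe(value):
    """List the encryption type names whose bits are set in value."""
    return [name for name, bit in ENC_TYPE_FLAGS.items() if value & bit]
```

For instance, hex(enc_types_value("AES128_CTS_HMAC_SHA1_96", "AES256_CTS_HMAC_SHA1_96")) gives '0x18', and adding "AES256_CTS_HMAC_SHA1_96_SK" to the same call gives '0x38'.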

 

The KDC registry value can be added manually on each domain controller, or it could be easily deployed throughout the environment via Group Policy Preference Registry Item deployment.

 

Mismatched Kerberos Encryption Types

The next issue needing attention is the problem of mismatched Kerberos Encryption Types and missing AES keys. You can leverage the same 11b checker script mentioned above to look for most of these problems.


Here’s an example of an environment that is going to have problems with explanations in the output (Note: This script does not make any changes to the environment. It just outputs a report to the screen):

*****************************************
Legacy OS detected: CN=F42003,CN=Computers,DC=contoso,DC=com
This OS is not compatible with the new default behavior, and authentication to this computer will fail after the DC is upgraded to 11B/11OOB

Explanation: This computer is running an unsupported Operating System that requires RC4 to be enabled on the domain controller.

 

There are 2 objects that do not have msDS-SupportedEncryptionTypes configured.
When authenticating to this target, Kerberos will default to the setting of DefaultDomainSupportedEncTypes registry on the authenticating DC.
This defaults to a value of 0x27, which means 'use AES for session keys and RC4 for ticket encryption'
If this target service does not support AES, please set msDS-SupportedEncryptionTypes to 4 on this object so that only RC4 is used.
CN=alice,CN=Users,DC=contoso,DC=com
CN=bob,CN=Users,DC=contoso,DC=com

Explanation: If you have disabled RC4, you need to manually set these accounts accordingly, or leverage DefaultDomainSupportedEncTypes. If you still have RC4 enabled throughout the environment, no action is needed.

 

======================================
There are 1 objects that have msDS-SupportedEncryptionTypes configured, but no encryption protocol is allowed.
This can cause authentication to/from this object to fail.
Please either delete the existing msDS-SupportedEncryptionTypes settings, or add supported etypes.
Example: Add 0x1C to signify support for AES128, AES256, and RC4
CN=server12,CN=Users,DC=contoso,DC=com

Explanation: The fix action for this was covered above in the FAST/Windows Claims/Compound Identity/Resource SID compression section.


======================================
There are 2 computers/services that are configured for RC4/DES only
If you have any clients or DCs that are configured to only support AES, authentication will fail
Here is the list of objects that are RC4/DES only:
CN=computer4,CN=Users,DC=contoso,DC=com
CN=Stefan,CN=Users,DC=contoso,DC=com

Explanation: If you are trying to enforce AES anywhere in your environment, these accounts may cause problems. You need to investigate why they have been configured this way and either reconfigure, update, or replace them.

 

A common scenario where authentication fails after installing November update on DCs in this condition is if DCs are configured to only support AES
Example: Setting the 'Configure encryption types allowed for Kerberos' policy to AES only on DCs
Here are the DCs configured for AES only:
CN=DC1,OU=Domain Controllers,DC=contoso,DC=com

Explanation: This is warning you that RC4 is disabled on at least some DCs. You’ll need to consider your environment to determine if this will be a problem or is expected.

 

Want more Information about Windows OS and Kerberos? Attribute msDS-SupportedEncryptionTypes

To avoid redundancy, I will briefly cover a very important attribute called msDS-SupportedEncryptionTypes on objectClasses of User. For our purposes today, that means user, computer, and trustedDomain objects. Here's an example of that attribute on a user object:

If you haven’t patched yet, you should still check for some issues in your environment prior to patching via the same script mentioned above.

 

If you have already patched, you need to keep an eye out for the following Kerberos Key Distribution Center events.  If you see any of these, you have a problem.

 

Event ID 14 Description: While processing an AS request for target service krbtgt/contoso.com, the account Client$ did not have a suitable key for generating a Kerberos ticket (the missing key has an ID of 5). The requested etypes : 18 17 23 3 1. The accounts available etypes : 23. Changing or resetting the password of krbtgt will generate a proper key.


Translation: The krbtgt account has not been reset since AES was introduced into the environment.
Resolution: Reset the krbtgt account password after ensuring that AES has not been explicitly disabled on the DC.

Event ID 42 Description: The Kerberos Key Distribution Center lacks strong keys for account krbtgt. You must update the password of this account to prevent use of insecure cryptography.


Translation: The krbtgt account has not been reset since AES was introduced into the environment.
Resolution: Reset the krbtgt account password after ensuring that AES has not been explicitly disabled on the DC.


Event ID 26 Description: While processing an AS request for target service krbtgt/CONTOSO.COM, the account Client$ did not have a suitable key for generating a Kerberos ticket (the missing key has an ID of 3). The requested etypes were 18. The accounts available etypes were 23 18 17.


Translation: The DC, krbtgt account, and client have a Kerberos Encryption Type mismatch.
Resolution: Analyze the DC and client to determine why the mismatch is occurring.


Event ID 16 Description: While processing a TGS request for the target server http/foo.contoso.com, the account admin@contoso.com did not have a suitable key for generating a Kerberos ticket (the missing key has an ID of 8). The requested etypes were 18 17 23 24 -135. The accounts available etypes were 23 18 17. Changing or resetting the password of <account name> will generate a proper key.


Translation: The encryption types specified by the client do not match the available keys on the account or the account’s encryption type configuration.
Resolution: Reset <account name>’s password after ensuring that AES has not been explicitly disabled on the DC, or ensure that the client’s and service account’s encryption types have a common algorithm.


Event ID 27 Description: While processing a TGS request for the target server http/foo.contoso.com, the account admin@CONTOSO.COM did not have a suitable key for generating a Kerberos ticket (the missing key has an ID of 9). The requested etypes were 23 3 1. The accounts available etypes were 23 18 17.


Translation: The encryption types configured on the service account for foo.contoso.com are not compatible with the encryption types specified by the DC. (Another Kerberos Encryption Type mismatch)
Resolution: Analyze the DC, the service account that owns the SPN, and the client to determine why the mismatch is occurring.
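
The etype numbers in these event messages are Kerberos encryption type IDs, not the bit flags used in msDS-SupportedEncryptionTypes. A small helper along the lines of the following Python sketch can translate a message into something easier to read. The names are illustrative, and only the well-known IDs are mapped; anything else (such as the negative value in Event ID 16) is reported as unknown rather than guessed.

```python
import re

# Well-known Kerberos encryption type IDs seen in the events above.
ETYPE_NAMES = {
    1: "DES-CBC-CRC",
    3: "DES-CBC-MD5",
    17: "AES128-CTS-HMAC-SHA1-96",
    18: "AES256-CTS-HMAC-SHA1-96",
    23: "RC4-HMAC",
    24: "RC4-HMAC-EXP",
}

# Matches both "requested etypes : 18 17" and "available etypes were 23 18 17".
_ETYPE_LIST = re.compile(
    r"(requested|available) etypes\s*(?:were|:)\s*(-?\d+(?:\s+-?\d+)*)"
)


def etypes_in_event(message):
    """Return the requested/available etype lists with IDs mapped to names."""
    result = {}
    for kind, numbers in _ETYPE_LIST.findall(message):
        result[kind] = [
            ETYPE_NAMES.get(int(n), f"unknown ({n})") for n in numbers.split()
        ]
    return result
```

Passing the description text of any of the events above returns a dictionary with "requested" and "available" keys, which makes a missing common type much quicker to spot.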

 

All of the events above would appear on DCs. There is one more event I want to touch on, but it would be hard to track since it is logged on the clients in the System event log.

 

Event log: System
Source: Security-Kerberos
Event ID: 4

Description: The Kerberos client received a KRB_AP_ERR_MODIFIED error from the server ADATUMWEB$. The target name used was HTTP/adatumweb.adatum.com. This indicates that the target server failed to decrypt the ticket provided by the client. This can occur when the target server principal name (SPN) is registered on an account other than the account the target service is using. Ensure that the target SPN is only registered on the account used by the server. This error can also happen if the target service account password is different than what is configured on the Kerberos Key Distribution Center for that target service. Ensure that the service on the server and the KDC are both configured to use the same password. If the server name is not fully qualified, and the target domain (ADATUM.COM) is different from the client domain (CONTOSO.COM), check if there are identically named server accounts in these two domains, or use the fully-qualified name to identify the server.
Possible problem: Account hasn't had its password reset (twice) since AES was introduced to the environment or some encryption type mismatch.

 

Translation: There is a mismatch between what the requesting client supports and the target service account.
Resolution: Analyze the service account that owns the SPN and the client to determine why the mismatch is occurring.

 

By now you should have noticed a pattern. Things break down if you haven’t reset passwords in years, or if you have mismatched Kerberos Encryption policies. Keep in mind the following rules/items:

  • If you still have pre-Windows Server 2008/Vista servers or clients:

    Such devices may still obtain Kerberos tickets, but cannot decrypt tickets with AES session keys generated by November patched DCs. Pre-2008 devices do not support the msDS-SupportedEncryptionTypes attribute. Upgrade these servers to a supported version of Windows. IT Admins and CTOs should reread the Sarbanes–Oxley Act of 2002.

  • If you have third-party Kerberos clients or systems (Java, Linux, etc.) that are currently using RC4 or DES:

    Contact the third-party vendor to see if the device/application can be reconfigured or updated to support AES encryption, otherwise replace them with devices/applications that support AES encryption and AES session keys.

    To dump the supported encryption types for a keytab file on Linux, run:

    klist -kte /var/kerberos/krb5/user/KeyTabFileName

    To find out which encryption types are configured in a keytab file on Windows, run:

    KTPass /in KeyTabFileName
  • An entire forest and all trusts should have a common Kerberos encryption type to avoid a likely outage. You must ensure that msDS-SupportedEncryptionTypes is also configured appropriately for the configuration you have deployed.

  • If your security team gives you a baseline image or a GPO that has RC4 disabled, and you haven’t finished prepping the entire environment to solely support AES, point them to this article. Make sure they accept responsibility for the ensuing outage.

  • You'll want to leverage the security logs on the DCs throughout any AES transition effort, looking for RC4 tickets being issued. You need to enable auditing for "Kerberos Authentication Service" and "Kerberos Service Ticket Operations" on all Domain Controllers. Events 4768 and 4769 will then be logged and will show the encryption type used. The field you'll need to focus on is called "Ticket Encryption Type" and you're looking for 0x17, which indicates an RC4 ticket was issued. The XML query below can be used to filter for these:
<QueryList>
<Query Id="0" Path="Security">
<Select Path="Security">*[EventData[Data[@Name='TicketEncryptionType'] and (Data='0x17')]]
</Select>
</Query>
</QueryList>
  • You need to evaluate the pwdLastSet attribute (shown as PasswordLastSet in PowerShell) for all user accounts (including service accounts) and make sure it is a date later than when Windows Server 2008 (or later) DCs were introduced into the environment.

  • Previous guidance said “Do not wait for AES to be enforced on you.” This guidance stands.
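
Once 4768/4769 events have been collected, a quick tally shows how much RC4 is still being issued. The Python sketch below assumes each event has already been parsed into a dictionary with a TicketEncryptionType key holding the hex string from the event. That input shape is an assumption for the example, not the output format of any particular export tool.

```python
from collections import Counter

# Ticket Encryption Type values as they appear in events 4768/4769.
TICKET_ETYPE_NAMES = {
    0x01: "DES-CBC-CRC",
    0x03: "DES-CBC-MD5",
    0x11: "AES128-CTS-HMAC-SHA1-96",
    0x12: "AES256-CTS-HMAC-SHA1-96",
    0x17: "RC4-HMAC",
    0x18: "RC4-HMAC-EXP",
}


def tally_ticket_etypes(events):
    """Count issued tickets per encryption type from parsed 4768/4769 events."""
    counts = Counter()
    for event in events:
        etype = int(event["TicketEncryptionType"], 16)  # accepts "0x17"
        # Unmapped values are kept as their hex string rather than dropped.
        counts[TICKET_ETYPE_NAMES.get(etype, hex(etype))] += 1
    return counts
```

Any non-zero count under "RC4-HMAC" points at accounts or services that still need attention before RC4 is disabled.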

Happy Hunting,

Chris Cartwright

Sample Script

 

Update 12/17/2022: 

The sample script "11B checker" text previously found at the bottom of this post has been removed.  The script is now available for download from GitHub at GitHub - takondo/11Bchecker.

It includes enhancements and corrections since this blog post's original publication.  If you obtained a version previously, please download the new version.

 

Related blogs:

Introduction to this blog series: https://techcommunity.microsoft.com/t5/ask-the-directory-services-team/having-issues-since-deploying... 

Part 2 of this blog series: https://techcommunity.microsoft.com/t5/ask-the-directory-services-team/so-you-say-your-dc-s-memory-is-getting-all-used-up-after/ba-p/3696318 

November 2022 Out of Band update released! Take action!

Greetings from the Windows Directory Services team!

The team wanted to bring to your attention the November 17th, 2022 release of an Out of Band (OOB), non-security update that addresses the Kerberos authentication issues experienced in some environments after installing November 8, 2022 (or later) updates on domain controllers.

 

After installing Windows Updates released on November 8, 2022 on Windows domain controllers, you might have issues with Kerberos authentication.

This specific failure is identified by the logging of Microsoft-Windows-Kerberos-Key-Distribution-Center Event ID 14 in the System event log of DC role computers with this unique signature in the event message text:

 

While processing an AS request for target service <service>, the account <account name> did not have a suitable key for generating a Kerberos ticket (the missing key has an ID of 1). The requested etypes : <etype numbers >. The accounts available etypes : <etype numbers>. Changing or resetting the password of <account name> will generate a proper key.


Where

  • (a.) “the missing key has an ID 1” and (b.) "4" is not listed in the "requested etypes" or "account available etypes" fields.

Some scenarios that might be affected:

For important details about how to obtain and install the November OOB update, please see the following link on Windows Release Health Message Center at: