System Center Operations Manager Not Working, Diagnosed and Fixed (2026 Guide)
Why System Center Operations Manager Stops Working
I've seen this exact scenario play out on dozens of enterprise networks: an admin pushes a System Center Operations Manager agent to a perfectly healthy Windows Server machine, and the whole thing collapses with a cryptic error that tells you almost nothing useful. Or you've had SCOM running fine for months, and then one Monday morning your agents go gray and your monitoring console is dark. The frustration is real, especially when the downstream impact means your team is flying blind on production servers.
System Center Operations Manager not working is almost never caused by one thing. In my experience, the failures cluster into two dominant categories that account for the vast majority of cases: authentication failures during agent push installations, and Active Directory integration breakdowns when your management group scales past certain limits. Both are fixable. Neither has an obvious error message that points you straight to the cause, which is exactly why so many admins spend hours going in circles.
The first category hits you as error 800706D3, "The authentication service is unknown." You're trying to push a SCOM 2012 agent from a Windows Server 2012 management server to another Windows Server 2012 machine, and the Operations Manager server flat-out can't open the Service Control Manager on the target. The error text sounds like a Kerberos or firewall problem. It isn't. Nine times out of ten, someone disabled the DNS Client service on the agent server, usually because the DNS cache was causing problems, and that single change silently breaks MSRPC authentication for remote service management.
The second category shows up as Event ID 20012 from source "OpsMgr Connector" and Event ID 2000 from "HealthService" in the Operations Manager event log. Agents report that they can't find the primary management server through Active Directory policy. The root cause is a hard limit baked into Operations Manager: agents simply cannot parse more than 10 service connection points (SCPs). If your management group has grown to more than 10 management servers and you're using the Automatically Manage Failover option in your agent assignment rule, every agent starts failing to locate a primary server at startup. The environment error code 0x8007000A in Event 2000 is the fingerprint.
Both problems are fixable with no reinstallation required. But you do need to understand what you're actually dealing with before you start clicking. I'll walk you through diagnosing which issue you have, then the exact steps to resolve each one.
The Quick Fix for System Center Operations Manager, Try This First
Before you spend an hour in network traces and Group Policy, run through this single check. It resolves the most common SCOM agent push installation failure fast, and it takes about two minutes.
On the target agent server (not the management server, the machine you're pushing the agent to), open Services:
- Press Win + R, type services.msc, and hit Enter.
- Scroll down to DNS Client.
- Check the Status and Startup Type columns. If DNS Client is stopped and set to Disabled, you've found your problem.
- Right-click DNS Client and select Properties.
- Change Startup type to Automatic, then click Start.
- Click OK.
Now go back to your Operations Manager console and retry the agent push. In most cases, the push succeeds immediately. The Service Control Manager on the agent server becomes reachable again, and the MSRPC bind that was previously failing with "authentication_type_not_recognized" (visible in network traces at the Bind Nack frame) completes cleanly.
If the DNS Client was already running, or re-enabling it didn't fix your push, jump straight to the step-by-step section below, because you're likely hitting the SCP limit issue with Active Directory integration instead.
If you're seeing gray agent states rather than a push failure, check your Operations Manager event log first. Open Event Viewer, navigate to Applications and Services Logs > Operations Manager, and filter for Event IDs 20012 and 2000. If both are present, that's the AD integration SCP problem, and the quick fix for that is a different process covered in Steps 3 through 5 below.
This step is specifically for the SCOM error 800706D3 and "The authentication service is unknown" failure. The DNS Client service is not just about resolving hostnames. Other Windows components, including the MSRPC authentication stack that SCOM relies on for remote Service Control Manager access, depend on it being active.
You can check this remotely from the management server using PowerShell if you'd prefer not to RDP to every agent:
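# StartType needs Windows PowerShell 5.x (older versions return an empty column); -ComputerName is not available in PowerShell 7
Get-Service -ComputerName server2012agent.contoso.local -Name "Dnscache" | Select-Object Name, Status, StartType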
If Status shows Stopped and StartType shows Disabled, that's your culprit. To fix it remotely from the management server:
# Set the service to Automatic and start it
$svc = Get-WmiObject Win32_Service -ComputerName "server2012agent.contoso.local" -Filter "Name='Dnscache'"
$svc.ChangeStartMode("Automatic")
$svc.StartService()
Alternatively, connect via the Services console directly: right-click Computer in the left pane of Services (services.msc), choose Connect to another computer, and type the agent server name. Navigate to DNS Client in the list, right-click, select Properties, change the Startup type, and start it.
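If you have more than a handful of agents to check, the same WMI query can be run against a list of servers and the results summarized. The following is a minimal sketch, not a tested production script: the function name `Get-DnsClientFinding` and the server names in the usage comments are hypothetical, and the function only assumes its input objects expose the `SystemName`, `State`, and `StartMode` properties that `Win32_Service` instances carry.

```powershell
# Sketch: flag servers whose DNS Client service needs attention.
# Input objects are expected to look like Win32_Service instances
# (SystemName, State, StartMode). The function name is hypothetical.
function Get-DnsClientFinding {
    param(
        [Parameter(Mandatory)]
        [object[]]$Service
    )
    foreach ($s in $Service) {
        [pscustomobject]@{
            Server    = $s.SystemName
            State     = $s.State
            StartMode = $s.StartMode
            # Needs a fix if it is not running or is prevented from starting
            NeedsFix  = ($s.State -ne 'Running') -or ($s.StartMode -eq 'Disabled')
        }
    }
}

# Example use from the management server (server names are placeholders):
# $servers  = 'agent01.contoso.local', 'agent02.contoso.local'
# $services = Get-WmiObject Win32_Service -ComputerName $servers -Filter "Name='Dnscache'"
# Get-DnsClientFinding -Service $services | Where-Object { $_.NeedsFix }
```

Keeping the evaluation separate from the remote query means you can review the findings before changing anything, and then apply the fix only to the servers that were flagged.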
Once DNS Client is running, open your Operations Manager console, go to Administration > Device Management > Agent Managed, find the failed server, right-click it, and choose Repair or retry the push installation. You should see the agent state change from pending to healthy within a few minutes.
After re-enabling the DNS Client service, it's worth doing a quick sanity check before you call the job done, especially in enterprise environments where other issues can stack on top of each other.
From the management server, verify you can now reach the Service Control Manager on the agent server:
# Test remote SCM access
$connection = [System.ServiceProcess.ServiceController]::GetServices("server2012agent.contoso.local")
$connection | Select-Object -First 5 Name, Status
If that returns a list of services without throwing an exception, MSRPC authentication is working. If it still throws "The authentication service is unknown", or if you can reproduce error 1747 by opening Services and connecting to the remote machine, then the DNS Client restart alone wasn't sufficient, and you need to look at whether a Group Policy is forcibly disabling the service (covered in Advanced Troubleshooting below).
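It helps to know that error 1747 and error 800706D3 are the same failure reported two ways: 800706D3 is an HRESULT that wraps Win32 error 1747 (RPC_S_UNKNOWN_AUTHN_SERVICE, "The authentication service is unknown"). Likewise, 0x8007000A wraps Win32 error 10 ("The environment is incorrect"). The sketch below decodes an HRESULT into its facility and code so you can check this yourself; the function name `ConvertFrom-HResult` is hypothetical.

```powershell
# Sketch: split an HRESULT into its facility and error code.
# Facility 7 means the low 16 bits are a plain Win32 error code.
function ConvertFrom-HResult {
    param(
        [Parameter(Mandatory)]
        [string]$Hex
    )
    # Strip an optional 0x prefix and parse as a 64-bit value to avoid sign issues
    $value = [Convert]::ToInt64(($Hex -replace '^0x', ''), 16)
    [pscustomobject]@{
        Facility = ($value -shr 16) -band 0x7FF   # bits 16-26
        Code     = $value -band 0xFFFF            # bits 0-15
    }
}

ConvertFrom-HResult '800706D3'     # Facility 7, Code 1747
ConvertFrom-HResult '0x8007000A'   # Facility 7, Code 10
```

Once you have the Win32 code, `net helpmsg 1747` on any Windows machine prints the matching message text.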
You can also confirm healthy SCOM agent connectivity in the console: go to Monitoring > Operations Manager > Agent Health State. A healthy agent shows green. If the agent is still gray more than 10 minutes after the DNS Client fix, right-click the agent and select Health Explorer to drill into which specific monitor is failing.
Check the Operations Manager event log on the agent server as well. Open Event Viewer, navigate to Applications and Services Logs > Operations Manager, and look for any remaining errors. A healthy agent logs Event ID 1210 from HealthService ("New configuration became active") once it has connected and received its configuration from the management server.
If your problem is agents failing to find the primary management server, and you're seeing Event 20012 from "OpsMgr Connector" and Event 2000 from "HealthService", the path forward is different. This is the Active Directory integration SCP limit problem, and it shows up specifically in management groups with more than 10 management servers using the Automatically Manage Failover option.
First, confirm the diagnosis. On an affected agent, open Event Viewer and filter the Operations Manager log for these two events:
- Event ID 20012, Source: OpsMgr Connector, "The OpsMgr Connector did not find any connection policy in Active Directory for management group [name]"
- Event ID 2000, Source: HealthService, "The Management Group [name] failed to start. The error message is the environment is incorrect.(0x8007000A)."
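The check for those two events can be scripted rather than filtered by hand. This is a rough sketch under a few assumptions: the function name `Test-ScpLimitSignature` is hypothetical, and the input objects are assumed to expose `Id`, `ProviderName`, and `Message` the way `Get-WinEvent` records do. The source names and the 0x8007000A code are taken from the two events listed above.

```powershell
# Sketch: return $true when both symptoms of the SCP limit are present.
# Expects objects shaped like Get-WinEvent output (Id, ProviderName, Message).
function Test-ScpLimitSignature {
    param(
        [Parameter(Mandatory)]
        [AllowEmptyCollection()]
        [object[]]$Record
    )
    # Event 20012 from the connector: no connection policy found in AD
    $connector = @($Record | Where-Object {
        $_.Id -eq 20012 -and $_.ProviderName -eq 'OpsMgr Connector'
    })
    # Event 2000 from the health service carrying the 0x8007000A code
    $health = @($Record | Where-Object {
        $_.Id -eq 2000 -and $_.ProviderName -eq 'HealthService' -and
        $_.Message -like '*0x8007000A*'
    })
    return ($connector.Count -gt 0 -and $health.Count -gt 0)
}

# Example use on an affected agent:
# $events = Get-WinEvent -FilterHashtable @{ LogName = 'Operations Manager'; Id = 20012, 2000 } -ErrorAction SilentlyContinue
# Test-ScpLimitSignature -Record @($events)
```

A result of `$true` only tells you the event pattern matches. You still need to confirm the management server count and the failover setting before changing the assignment rule.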
If you have diagnostic tracing enabled, check the TracingGuidsNative.log file for the entry "SCP Not found for primary ManagementServer" followed by "AD integration is enabled but primary info was not located. Ignore this MG." That sequence of log entries is definitive: you're hitting the 10-SCP parsing limit.
Count your management servers in the Operations console: go to Administration > Management Servers. If you see more than 10, and your agent assignment rule has "Automatically manage failover" selected, you've confirmed the root cause. Move to Step 4.
The fix for the SCP limit problem is to restructure your agent assignment rule so that Operations Manager publishes no more than 10 SCPs. Here's the exact process from the Operations console.
Log on with an account that's a member of the Operations Manager Administrators role, then:
- In the Operations console, click Administration.
- In the Administration workspace, click Management Servers.
- Right-click the primary management server and select Properties.
- In the Management Server Properties dialog, select the Auto Agent Assignment tab.
- Select the existing agent assignment setting and click Edit to open the Agent Assignment and Failover Wizard.
- On the Inclusion Criteria page, copy the LDAP query shown there and paste it into a Notepad file. You'll need this in a moment.
- Click Cancel to close the wizard without saving.
- Back on the Auto Agent Assignment tab, click Delete to remove the existing agent assignment setting. Confirm the deletion.
Now you'll rebuild the rule with the SCP count under control. The key is to avoid the Automatically Manage Failover option if you have more than 10 management servers. Click Add to open the Agent Assignment and Failover Wizard fresh, and work through the Domain and Inclusion Criteria pages using the LDAP query you saved to Notepad. On the failover configuration page, manually specify which management servers to include as failover targets rather than letting SCOM auto-select all of them. Keep the total at 10 or fewer, since agents fail only when they have to parse more than 10 SCPs.
After you've recreated the agent assignment rule with 10 or fewer SCPs, give Operations Manager a few minutes to publish the updated SCPs to Active Directory. Then restart the HealthService on one of the affected agents to force it to re-read the AD policy:
# Run this on the affected agent server
Restart-Service HealthService
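Watch the Operations Manager event log on that agent. After the restart, the agent should locate the primary management server via AD, and Events 20012 and 2000 should stop appearing. A healthy agent then logs Event ID 1210 from HealthService ("New configuration became active"). If you see Event ID 20070 from OpsMgr Connector instead, the agent reached the management server but the connection was closed right after authentication, which usually means the agent is not yet authorized or is still waiting in Pending Management.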
If you want to verify the SCP count in Active Directory directly, you can query it with PowerShell from a domain controller or a machine with AD tools installed:
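# Counts every SCP tagged for Operations Manager in the forest root domain.
# If your agents live in a different domain, run this there and use $root.defaultNamingContext instead.
$root = [ADSI]"LDAP://RootDSE"
$searcher = New-Object System.DirectoryServices.DirectorySearcher
$searcher.SearchRoot = [ADSI]("LDAP://" + $root.rootDomainNamingContext)
$searcher.Filter = "(objectClass=serviceConnectionPoint)"
$searcher.PageSize = 1000   # page through results rather than stopping at the default size limit
$searcher.FindAll() | Where-Object { $_.Properties["keywords"] -like "*Operations Manager*" } | Measure-Object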
The count returned should be 10 or fewer for your management group. If it's higher, the AD cleanup from the rule deletion may not have completed. Wait a few minutes and recheck. If stale SCP objects still remain, remove them manually with ADSI Edit. AD integration normally publishes these objects under the OperationsManager container for your management group in the domain, so look there first.
Roll out the HealthService restart across all affected agents once you've confirmed one agent recovers cleanly. In large environments, do this in batches to avoid overwhelming the management server with simultaneous reconnections.
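One way to stage that rollout is to split the agent list into fixed-size batches and pause between them. The sketch below covers only the batching; the function name `Split-IntoBatch`, the batch size, the pause length, and the file name in the usage comments are all assumptions to adjust for your environment, and the commented remote restart assumes PowerShell remoting is enabled on the agents.

```powershell
# Sketch: split a list of server names into batches of a fixed size.
function Split-IntoBatch {
    param(
        [Parameter(Mandatory)]
        [string[]]$Items,
        [ValidateRange(1, 1000)]
        [int]$BatchSize = 25
    )
    for ($i = 0; $i -lt $Items.Count; $i += $BatchSize) {
        $end = [Math]::Min($i + $BatchSize, $Items.Count) - 1
        # The leading comma emits each batch as a single array object
        , @($Items[$i..$end])
    }
}

# Example use (file name and timings are placeholders):
# $agents = Get-Content .\affected-agents.txt
# foreach ($batch in (Split-IntoBatch -Items $agents -BatchSize 25)) {
#     Invoke-Command -ComputerName $batch -ScriptBlock { Restart-Service HealthService }
#     Start-Sleep -Seconds 120   # let this batch reconnect before starting the next
# }
```

Smaller batches with a longer pause are the safer starting point. You can widen them once you see how quickly the management servers absorb the reconnecting agents.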
Advanced Troubleshooting for System Center Operations Manager Not Working
When the standard fixes don't resolve System Center Operations Manager agent connectivity problems, you need to get into the deeper diagnostic layers. Here's what I check in enterprise environments where the straightforward approaches haven't solved it.
Reading Network Traces for Error 800706D3
If you've re-enabled the DNS Client service but the SCOM agent push installation failure persists with error 800706D3, capture a network trace during the failed push attempt using either Network Monitor or Wireshark. Filter on the management server and agent server IP addresses and look for the MSRPC bind sequence. A healthy bind looks like this:
MSRPC:c/o Bind: scmr(SCMR) UUID{367ABB81-9844-35F1-AD32-98F038001003}
MSRPC:c/o Bind Ack
A failing bind shows:
MSRPC:c/o Bind: scmr(SCMR) UUID{367ABB81-9844-35F1-AD32-98F038001003}
MSRPC:c/o Bind Nack: Reject Reason: authentication_type_not_recognized
The Bind Nack with rejection reason "authentication_type_not_recognized" is the network-level confirmation of the DNS Client dependency failure. If you still see this after re-enabling DNS Client, a Group Policy may be forcibly disabling the service. Check Computer Configuration > Policies > Windows Settings > Security Settings > System Services in your GPO chain for the DNS Client service. If it's set to Disabled there, a local fix won't survive the next policy refresh or reboot.
Group Policy Override for DNS Client
Run gpresult /h gpresult.html on the agent server and open the resulting HTML file. Search for "DNS Client" or "Dnscache" under Computer Configuration > Policies > Windows Settings > Security Settings > System Services. If a GPO is disabling it, you'll see the policy name and the linked OU. Coordinate with your AD team to either modify the GPO or move the agent server to an OU where that policy doesn't apply.
Enabling Diagnostic Tracing for AD Integration Issues
For Event 20012 and 2000 problems, enabling diagnostic tracing gives you the TracingGuidsNative.log file with granular detail about exactly where the SCP lookup fails. On the affected agent, open a Command Prompt as Administrator and run:
cd "C:\Program Files\Microsoft Monitoring Agent\Agent\Tools"
StartTracing.cmd VER
Reproduce the issue by restarting the HealthService, then stop tracing and convert the binary trace files to readable text:
StopTracing.cmd
FormatTracing.cmd
Open TracingGuidsNative.log in C:\Windows\Temp\OpsMgrTrace (newer agent versions write to C:\Windows\Logs\OpsMgrTrace instead) and search for "SCP Not found" and "AD integration is enabled but primary info was not located." The surrounding log lines show exactly which management group and SCP query failed.
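Trace logs get large quickly, so it can be easier to search for the two marker strings from PowerShell than to scroll through the file. The following is a small sketch: the function name `Find-ScpTraceMarker` is hypothetical, and you pass in whatever path your TracingGuidsNative.log actually ended up at.

```powershell
# Sketch: list trace lines that contain either SCP failure marker.
function Find-ScpTraceMarker {
    param(
        [Parameter(Mandatory)]
        [string]$Path
    )
    $markers = @(
        'SCP Not found for primary ManagementServer',
        'AD integration is enabled but primary info was not located'
    )
    # -SimpleMatch treats the markers as literal text, not regular expressions
    Select-String -Path $Path -Pattern $markers -SimpleMatch |
        Select-Object LineNumber, Line
}

# Example use (adjust the folder to where your agent writes its traces):
# Find-ScpTraceMarker -Path 'C:\Windows\Temp\OpsMgrTrace\TracingGuidsNative.log'
```

The line numbers in the output let you jump straight to the surrounding entries in a text editor to see which management group the lookup was for.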
Antivirus Exclusions
It's worth checking whether your antivirus solution is intercepting SCOM's service management calls. Operations Manager has specific recommended antivirus exclusions, covering the SCOM agent directory, the Health Service data store, and the MOM SDK service binaries. If exclusions aren't in place, real-time scanning can interrupt the very operations that SCOM depends on for agent communication. Check the official Microsoft guidance for the current exclusion list specific to your SCOM version.