This article offers tips on fixes and changes for System Center Operations Manager 2007 (OpsMgr) agent health. It lists a variety of hotfixes and registry keys. Some of the problems they address may not be directly visible in your environment but could still be causing issues that you are not aware of. If you have agents in a grey state, these fixes may help. If they do not, see the following:
1. Make sure that your agents, management servers and gateways are running the latest Cumulative Update (CU). At the time this article was written, the latest CU was CU5, which can be downloaded at the following link:
2. All agents, management servers and gateways should be running the JET EDB hotfix for Windows found in KB981263:
To apply this hotfix, you must be running Windows Server 2003 Service Pack 2 (SP2) or Windows Server 2008 Service Pack 2 (SP2) on your computer.
3. The Avg. Disk Sec/Transfer counter can return an incorrect, very large value. For more information see the following:
4. You cannot receive event notifications after you back up and then clear event logs in Windows Server 2008 or in Windows Vista. This keeps OpsMgr from alerting on events after the event logs are cleared:
For Windows 2008 or Vista http://support.microsoft.com/kb/2458331
This issue does not occur in Windows Server 2008 R2 or Windows 7.
5. The “Win32_Service” WMI class leaks memory in Windows Server 2008 R2 and in Windows 7:
This is included in Server 2008 R2 SP1 and Windows 7 SP1.
6. A hotfix is available that improves the stability of the Windows Management Instrumentation repository in Windows Server 2003:
7. WSH binaries are overwritten by Windows File Protection after you install Windows Script 5.7 on a computer that is running Windows Server 2003 or Windows XP. For Operations Manager, this may be the cause of high CPU utilization:
8. A managed application has a high number of thread handles and of event handles in the Microsoft .NET Framework 2.0. For Operations Manager, this would cause a high number of handles:
9. The CPU usage of an application or a service that uses MSXML 6.0 to handle XML requests reaches 100% in Windows Server 2008, Windows Vista, Windows XP Service Pack 3, or other systems that have MSXML 6.0 installed. For Operations Manager, this would cause high CPU usage for the Monitoringhost.exe process:
10. New AFD connections fail when software that uses TDI drivers is installed on a Windows Server 2008 or Windows Vista SP1 system that is running on a computer that has multiple processors:
11. FIX: You receive a “Provider Load Failure” error message or the Wmiprvse.exe process stops responding when you use a SQL Server WMI provider to obtain information about SQL Server 2005, SQL Server 2008, or SQL Server 2008 R2 services:
12. Extended Protection for Authentication. This feature enhances the protection and handling of credentials when authenticating network connections by using Integrated Windows Authentication (IWA). In Operations Manager, client push may fail with an error that appears in a network trace as:
Server Error, (91) Invalid user identifier
13. Description of the rollup update for the .NET Framework 3.5 Service Pack 1 on Windows XP and on Windows Server 2003 (976765, 980773 and 976769): June 8, 2010:
14. Description of the rollup update for the .NET Framework 3.5 Service Pack 1 and the .NET Framework 2.0 Service Pack 2 on Windows XP and on Windows Server 2003 (976765 and 980773): June 8, 2010:
15. A rollup hotfix package for Windows Server 2008 Failover Clustering WMI provider:
1. Subkey: HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\HealthService\Parameters
Name: Persistence Version Store Maximum
Value: Number of 16-kilobyte pages
The default size of the version store depends on the Operations Manager role and is defined as the number of 16-kilobyte pages to allocate in memory. The default values are as follows:
• Agent (workstation operating systems): 640 (10 megabytes)
• Agent (server operating systems): 1920 (30 megabytes)
• Management Server: 5120 (80 megabytes)
If you experience a problem with the configuration of this registry value you will see the following event:
Event Type: Error
Event Source: ESE
Event Category: Transaction Manager
Event ID: 623
Description: HealthService (<PID>) The version store for instance <instance> (“<name>”) has reached its maximum size of <value>Mb. It is likely that a long-running transaction is preventing cleanup of the version store and causing it to build up in size. Updates will be rejected until the long-running transaction has been completely committed or rolled back. Possible long-running transaction:
Session-context ThreadId: <value>.
Note: This event may report the issue with other Operations Manager processes, depending on the affected role.
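When triaging event 623, it can help to pull the reported limit out of the description and convert it back to 16-kilobyte pages, so that it can be compared with the registry value. The following is a minimal Python sketch under stated assumptions: the sample description uses a made-up PID, instance number and store name, and the helper names are hypothetical rather than part of any Operations Manager tooling.

```python
import re

PAGE_SIZE_KB = 16  # the version store is sized in 16-kilobyte pages

# Matches the "maximum size of <value>Mb" fragment of the ESE 623 description.
_MAX_SIZE = re.compile(r"maximum size of (\d+)\s*Mb", re.IGNORECASE)


def version_store_limit_mb(description):
    """Return the limit (in megabytes) reported by the event, or None."""
    match = _MAX_SIZE.search(description)
    return int(match.group(1)) if match else None


def limit_in_pages(limit_mb):
    """Convert a limit in megabytes to a count of 16-kilobyte pages."""
    return limit_mb * 1024 // PAGE_SIZE_KB


# Hypothetical sample text; PID, instance and name are placeholders.
SAMPLE = (
    'HealthService (1234) The version store for instance 0 ("Example Store") '
    "has reached its maximum size of 80Mb. It is likely that a long-running "
    "transaction is preventing cleanup of the version store."
)

if __name__ == "__main__":
    limit = version_store_limit_mb(SAMPLE)
    print(limit, "MB =", limit_in_pages(limit), "pages")
```

A result of 5120 pages for an 80 MB limit corresponds to the Management Server default listed above.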
We recommend that you set the version store size to double its default size. For example, if you set the version store size on a computer that hosts a Management Server role, set the registry value to 10240 (decimal).
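The arithmetic behind this recommendation can be sketched in a few lines of Python. The role keys and function names below are illustrative only. The generated `reg add` command assumes the value is stored as a REG_DWORD, which the text above does not state, so verify the type before using it.

```python
PAGE_SIZE_KB = 16  # each version store page is 16 kilobytes

# Default page counts per role, as listed in this article.
DEFAULT_PAGES = {
    "agent_workstation": 640,
    "agent_server": 1920,
    "management_server": 5120,
}

SUBKEY = r"HKLM\System\CurrentControlSet\Services\HealthService\Parameters"
VALUE_NAME = "Persistence Version Store Maximum"


def pages_to_megabytes(pages):
    """Convert a count of 16-kilobyte pages to megabytes."""
    return pages * PAGE_SIZE_KB / 1024


def recommended_pages(role):
    """Return double the default page count for the given role."""
    return DEFAULT_PAGES[role] * 2


def reg_add_command(pages):
    """Build a reg.exe command line (assumption: REG_DWORD value type)."""
    return (
        f'reg add "{SUBKEY}" /v "{VALUE_NAME}" '
        f"/t REG_DWORD /d {pages} /f"
    )


if __name__ == "__main__":
    for role, default in DEFAULT_PAGES.items():
        doubled = recommended_pages(role)
        print(
            f"{role}: default {default} pages "
            f"({pages_to_megabytes(default):.0f} MB), "
            f"recommended {doubled} pages "
            f"({pages_to_megabytes(doubled):.0f} MB)"
        )
    print(reg_add_command(recommended_pages("management_server")))
```

For a Management Server this yields 10240 pages (160 megabytes), which matches the example value given above.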
2. Subkey: HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\HealthService\Parameters\Management Groups\<MGNAME>
If you experience an issue with this registry value it will be noted by the following event:
Event Source: HealthService
Event Category: Health Service
Event ID: 2015
A workflow in the Health Service has generated a message which exceeds the size limit, and has been discarded.
Note: Increasing the limit can adversely affect management group performance, because it causes a large amount of discovery data to be collected by the Management Server and the Root Management Server.