Archive for SCOM – Alerts

SCOM – Agent – Health Service will not start (System Event ID 7024)

Sometimes I have an issue with an agent.

Error Message:
The SCOM Agent won’t run on a monitored system. When one tries to start the OpsMgr HealthService this event is logged:
Log Name: System
Source: Service Control Manager
Date: xx/xx/xxxx xx:xx:xx
Event ID: 7024
Task Category: None
Level: Error
Keywords: Classic
User: N/A
Computer: xxxxx.xxx

Description: The OpsMgr Health Service service terminated with service-specific error 2147500037 (0x80004005).
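The event shows the same error code in decimal (2147500037) and hex (0x80004005). As a rough sketch of how the two forms relate, the Python snippet below converts the decimal value to its unsigned hex form and to the signed 32-bit form that some tools print instead. The function name `describe_service_error` is made up for this illustration and is not part of SCOM or Windows. 0x80004005 is the generic HRESULT E_FAIL (unspecified failure), so the code alone does not identify the root cause.

```python
def describe_service_error(code: int) -> dict:
    """Return the unsigned hex and signed 32-bit forms of an error code.

    Hypothetical helper for illustration only.
    """
    unsigned = code & 0xFFFFFFFF  # keep the low 32 bits
    # Values with the top bit set are negative when read as a signed 32-bit int.
    signed = unsigned - 0x1_0000_0000 if unsigned & 0x8000_0000 else unsigned
    return {"hex": f"0x{unsigned:08X}", "signed": signed}


info = describe_service_error(2147500037)
print(info["hex"], info["signed"])  # prints: 0x80004005 -2147467259
```

Searching for either the hex form or the signed form can help when the decimal value from the event turns up no results.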

Generate alerts based on entries from SQL Database

I just ran into this good article from Tao Yang. I found it very useful. Please go and have a look.

Link to ‘SCOM MP Authoring Example: Generate alerts based on entries from SQL Database’

 

 

Orphaned Agents

Run the following query to list all orphaned agents, and verify that the result contains the agents in question.

NOTE: This is not a supported solution from Microsoft.

 

-- Check for orphaned health services (e.g. agent).
declare @DiscoverySourceId uniqueidentifier;
set @DiscoverySourceId = dbo.fn_DiscoverySourceId_User();

SELECT TME.[TypedManagedEntityId], HS.PrincipalName
FROM MTV_HealthService HS
INNER JOIN dbo.[BaseManagedEntity] BHS with(nolock)
    ON BHS.[BaseManagedEntityId] = HS.[BaseManagedEntityId]
-- get host managed computer instances
INNER JOIN dbo.[TypedManagedEntity] TME with(nolock)
    ON TME.[BaseManagedEntityId] = BHS.[TopLevelHostEntityId]
    AND TME.[IsDeleted] = 0
INNER JOIN dbo.[DerivedManagedTypes] DMT with(nolock)
    ON DMT.[DerivedTypeId] = TME.[ManagedTypeId]
INNER JOIN dbo.[ManagedType] BT with(nolock)
    ON DMT.[BaseTypeId] = BT.[ManagedTypeId]
    AND BT.[TypeName] = N'Microsoft.Windows.Computer'
-- only with missing primary
LEFT OUTER JOIN dbo.Relationship HSC with(nolock)
    ON HSC.[SourceEntityId] = HS.[BaseManagedEntityId]
    AND HSC.[RelationshipTypeId] = dbo.fn_RelationshipTypeId_HealthServiceCommunication()
    AND HSC.[IsDeleted] = 0
INNER JOIN DiscoverySourceToTypedManagedEntity DSTME with(nolock)
    ON DSTME.[TypedManagedEntityId] = TME.[TypedManagedEntityId]
    AND DSTME.[DiscoverySourceId] = @DiscoverySourceId
WHERE HS.[IsAgent] = 1
    AND HSC.[RelationshipId] IS NULL;

 

 

Then follow these steps, which bring the orphaned agents back to Pending Management. From there, approve the agents.

 

1. Back up the OperationsManager database.

2. Run the SQL script below to remove the orphaned agents.

 

-- DELETE!!! all orphaned agents.
declare @TypedManagedEntityId uniqueidentifier;
declare @DiscoverySourceId uniqueidentifier;
declare @LastErr int;
declare @TimeGenerated datetime;

set @TimeGenerated = GETUTCDATE();
set @DiscoverySourceId = dbo.fn_DiscoverySourceId_User();

DECLARE EntitiesToBeRemovedCursor CURSOR LOCAL FORWARD_ONLY READ_ONLY FOR
SELECT TME.[TypedManagedEntityId]
FROM MTV_HealthService HS
INNER JOIN dbo.[BaseManagedEntity] BHS
    ON BHS.[BaseManagedEntityId] = HS.[BaseManagedEntityId]
-- get host managed computer instances
INNER JOIN dbo.[TypedManagedEntity] TME
    ON TME.[BaseManagedEntityId] = BHS.[TopLevelHostEntityId]
    AND TME.[IsDeleted] = 0
INNER JOIN dbo.[DerivedManagedTypes] DMT
    ON DMT.[DerivedTypeId] = TME.[ManagedTypeId]
INNER JOIN dbo.[ManagedType] BT
    ON DMT.[BaseTypeId] = BT.[ManagedTypeId]
    AND BT.[TypeName] = N'Microsoft.Windows.Computer'
-- only with missing primary
LEFT OUTER JOIN dbo.Relationship HSC
    ON HSC.[SourceEntityId] = HS.[BaseManagedEntityId]
    AND HSC.[RelationshipTypeId] = dbo.fn_RelationshipTypeId_HealthServiceCommunication()
    AND HSC.[IsDeleted] = 0
INNER JOIN DiscoverySourceToTypedManagedEntity DSTME
    ON DSTME.[TypedManagedEntityId] = TME.[TypedManagedEntityId]
    AND DSTME.[DiscoverySourceId] = @DiscoverySourceId
WHERE HS.[IsAgent] = 1
    AND HSC.[RelationshipId] IS NULL;

OPEN EntitiesToBeRemovedCursor

FETCH NEXT FROM EntitiesToBeRemovedCursor
INTO @TypedManagedEntityId

WHILE @@FETCH_STATUS = 0
BEGIN
    BEGIN TRAN

    -- Delete entity
    EXEC @LastErr = [p_RemoveEntityFromDiscoverySourceScope] @TypedManagedEntityId, @DiscoverySourceId, @TimeGenerated;
    IF @LastErr <> 0
        GOTO Err

    COMMIT TRAN

    -- Get the next typedmanagedentity to delete.
    FETCH NEXT FROM EntitiesToBeRemovedCursor
    INTO @TypedManagedEntityId
END

CLOSE EntitiesToBeRemovedCursor
DEALLOCATE EntitiesToBeRemovedCursor

GOTO Done

Err:
ROLLBACK TRAN
GOTO Done

Done:

Private Bytes Exceeded threshold – Management Server

WORKAROUND

To work around this problem, apply an override that applies to management servers to the Health Service Handle Count Threshold monitor and to the Health Service Private Bytes Threshold monitor. To do this, follow these steps:

1. Click Start, point to All Programs, click System Center Operations Manager 2007, and then click Operations Console.
2. In the Operations Console, click Authoring.
3. In the Authoring pane, expand Management Pack Objects, and then click Monitors.
4. On the Operations Console toolbar, click View, and then click Scope.
5. In the Scope Management Pack Objects by target(s) dialog box, click Clear All.
6. In the Look for box, type Health Service.
7. Click to select the Health Service check box, and then click OK.
8. In the Monitors pane, expand Health Service, expand Entity Health, expand Performance, expand Health Service Performance, right-click Health Service Handle Count Threshold, point to Overrides, point to Override the Monitor, and then click For a group.
9. In the Select Object dialog box, click to select the group named Management Server Computer Group, and then click OK.
10. In the Override properties dialog box, click to select the Agent Performance Monitor Type – Threshold check box.
11. In the Override Setting field, change the value from 2000 to 10000, and then click OK.
12. Right-click Health Service Private Bytes Threshold, point to Overrides, point to Override the Monitor, and then click For a group.
13. In the Select Object dialog box, click to select the group named Management Server Computer Group, and then click OK.
14. In the Override properties dialog box, click to select the Agent Performance Monitor Type – Threshold check box.
15. In the Override Setting field, change the value from 104857600 to 1610612736, and then click OK.
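The Private Bytes values in step 15 are given in bytes, which makes them hard to read at a glance. The short Python sketch below converts them to binary units, assuming 1 MiB = 1024 × 1024 bytes. The helper `format_bytes` is made up for this illustration.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def format_bytes(value: int) -> str:
    """Render a byte count in GiB when it is at least 1 GiB, otherwise in MiB."""
    if value >= GIB:
        return f"{value / GIB:g} GiB"
    return f"{value / MIB:g} MiB"


print(format_bytes(104857600))   # prints: 100 MiB
print(format_bytes(1610612736))  # prints: 1.5 GiB
```

So the override raises the Private Bytes threshold on management servers from 100 MiB to 1.5 GiB.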

Login failed for user ‘

Solution:

Open the SCOM Console – Administration – Security – Run-As Accounts

Data Warehouse SQL Server Authentication Account – Account tab
Account name: should be blank. You will see “………………….” in both password fields; this is normal.

 

Reporting SDK SQL Server Authentication Account – Account tab
Account name: should be blank. You will see “………………….” in both password fields; this is normal.

 

If the account named in the error appears in either of these Account name fields, it needs to be reset.

How do I reset the account?

1. Next to Account name, remove the name, then place the cursor at the start of the Account name field and press the space bar once.
2. Clear the “Password” box: place the cursor at the start of the “Password:” field and press the space bar once.
3. Clear the “Confirm Password” box: place the cursor at the start of the “Confirm Password:” field and press the space bar once.
4. The Account name is now blank, and each of the password fields shows a single “.”
5. Click Apply, then OK.

When you enter an account in either the Data Warehouse SQL Server Authentication Account or the Reporting SDK SQL Server Authentication Account, you are telling SCOM that a SQL account is already in place and should be used to access the database. When these accounts are left blank, SCOM falls back to the accounts listed under “Type: Windows”, which appear directly below the two accounts above.

Alert generated by Send Queue % Used Threshold

In the future, this can be avoided by putting agents into maintenance mode.
To solve it right now:
You need to run the FlushHealthServiceStateCache and HealthServiceRestart tasks.
  • Go to the OpsMgr Console -> Monitoring
  • Create a new state view targeted to HealthService
  • Wait for the view to be populated
  • Select the HealthService of interest and enable the Actions pane
  • Find and run the FlushHealthServiceStateCache and HealthServiceRestart tasks