SCOM Health Script/Audit

So, As part of my work at my company it’s my responsibility to perform bi-anual health check on all our SCOM clients.

Last year I’ve created a powershell script that checks and reports on the SCOM environment that it’s ran against. Things that the script put into an HTML file for me are things like

1. Uptime of the servers

2. Database last backup

3. Aggregations Behind on SCOM

4. Database free space on the ACS,DW,OPS and ReportServer databases.

5. Grooming Status

etc etc.

I’ve decided in the last month to start using the SCOM SDK with VB.net as to learn the SDK and refresh my VB.net programming knowledge, call it a challenge if you will.

So below is a screen shot of what the GUI looks like at this point, for now I’m just focusing on putting everything in listviews with some intellengence build into some of the list views, ie: turning the listview line red/yellow or green depending on the healthstate of the management server, turning the listview line red if the collation on the server/instance or on the database itself is wrong. The GUI will be enhanced and improved at a later stage, for now the information/data is the most important, after all the HTML file created by this application will be used instead of this GUI.

Screenshot

healthcheck

The user can then specify the saving location of the html file and save all information in the gui as an HTML file.

You specify the management server, for a SCOM 2007 environment you will specify the RMS, for SCOM 2012 SP1/R2 you specify any management server, you also specify the userid, password and domain of an domain account with sufficient permission to connect via the SDK, normally any account in the Operations Manager Administrators Role.

You can also specify a userid, password and domain to connect to the different databases, if not specified the account for the SDK connection will be used by the application to connect to SCOM.

The application picks up all the information required from the DatabaseServer name of the OPS db and DW from the registry and also using the SCOM SDK.

You will see that the Agent Properties and Agent Performance listviews are empty.

In the “Agent Properties” listview i will put the properties including “Proxy Enabled”, Port, Compression and Patch List. The idea is to check for hotfixes and UR/CU for the SCOM product and other properties. Grayed out agents will show gray and also colors to represent the health state of all other states, unintialized agents a.k.a unmonitored will show white in the GUI and HTML file.

The “Agent Performance” listview will show the last CPU and Memory performance counters for each and every agent for the last 7/4 hours.  In the HTML if the agent has not send either CPU/Memory in the last 7/4 hours the whole table row in the HTML file will show red so as to stand out in the report.

The SMTP server address, email address and port is retrieved from the SCOM notification channels, still have to think about multiple different SMTP servers and ports, which to use, maybe give the user a change to pick which one to use. The idea is to use this SMTP information to send the HTML report and Management Server alerts to me and to the SCOM engineers working on the account.

The large table OPS DB listview contains the largest tables in the OPS DB, any normal healthy environment will always have the performance and event tables as the largest tables in the DB, well at least that’s my best practise.

So there is still a huge amount of work involved in getting this application done, at the end of the day the hard work will be worthwhile for me as it makes my work and turn-around time less and quicker – I call this a win.

I will post new posts and screenshot in future.

SCOM agents not sending performance data

I’ve finally managed to enhance the query and the SSRS report.

Example of report

Image

This query focuses only agents that’s not grayed out and available (monitored).

Performance counters looked for in the query are

Processor > % Processor Time

LogicalDisk > % Free Space > C:

Memory > Available MBytes

Below is the query. Note: You will probably have to change the DisplayName_ and IsVirtualNode for your OperationsManager database.

This query targets the OperationsManager database. The query uses “with (nolock)” to prevent locks on the database while the query is executing.

if object_id(‘tempdb..#temptable’) IS NOT NULL
DROP TABLE #temptable

SELECT     distinct bmetarget.Name into #temptable
FROM        OperationsManager.dbo.BaseManagedEntity AS BMESource WITH (nolock) INNER JOIN
 OperationsManager.dbo.Relationship AS R WITH (nolock) ON
R.SourceEntityId = BMESource.BaseManagedEntityId INNER JOIN
OperationsManager.dbo.BaseManagedEntity AS BMETarget WITH (nolock) ON
 R.TargetEntityId = BMETarget.BaseManagedEntityId inner join mtv_computer d on bmetarget.name=d.[DisplayName_55270A70_AC47_C853_C617_236B0CFF9B4C]
 and d.IsVirtualNode_E817D034_02E8_294C_3509_01CA25481689 is null
WHERE     (bmetarget.fullname like ‘Microsoft.Windows.Computer%’)

if object_id(‘tempdb..#healthstate’) IS NOT NULL
DROP TABLE #healthstate

select  megv.path, megv.ismanaged, megv.isavailable, megv.healthstate into #healthstate
from managedentitygenericview as megv with (nolock) inner join managedtypeview as mtv with (nolock)
on megv.monitoringclassid=mtv.id
where mtv.name =’microsoft.systemcenter.agent’

if object_id(‘tempdb..#perfcpudata’) IS NOT NULL
DROP TABLE #perfcpudata

select Path, ‘CPU’ as ‘Cat’ into #perfcpudata
from PerformanceDataAllView pdv with (NOLOCK)
inner join PerformanceCounterView pcv on pdv.performancesourceinternalid = pcv.performancesourceinternalid
inner join BaseManagedEntity bme on pcv.ManagedEntityId = bme.BaseManagedEntityId
where (TimeSampled < GETUTCDATE() AND TimeSampled > DATEADD(MINUTE,-240, GETUTCDATE()))
and objectname =’Processor’ and countername=’% Processor Time’

if object_id(‘tempdb..#perfmemdata’) IS NOT NULL
DROP TABLE #perfmemdata

select Path,’Memory’ as ‘Cat’ into #perfmemdata
from PerformanceDataAllView pdv with (NOLOCK)
inner join PerformanceCounterView pcv on pdv.performancesourceinternalid = pcv.performancesourceinternalid
inner join BaseManagedEntity bme on pcv.ManagedEntityId = bme.BaseManagedEntityId
where (TimeSampled < GETUTCDATE() AND TimeSampled > DATEADD(MINUTE,-240, GETUTCDATE()))
and objectname =’Memory’ and countername=’Available MBytes’

if object_id(‘tempdb..#perfdiskdata’) IS NOT NULL
DROP TABLE #perfdiskdata

select Path,’Disk’ as ‘Cat’ into #perfdiskdata
from PerformanceDataAllView pdv with (NOLOCK)
inner join PerformanceCounterView pcv on pdv.performancesourceinternalid = pcv.performancesourceinternalid
inner join BaseManagedEntity bme on pcv.ManagedEntityId = bme.BaseManagedEntityId
where (TimeSampled < GETUTCDATE() AND TimeSampled > DATEADD(MINUTE,-240, GETUTCDATE()))
and objectname =’LogicalDisk’ and countername=’% Free Space’ and instancename=’C:’

if object_id(‘tempdb..#temptable1’) IS NOT NULL
DROP TABLE #temptable1
create table #temptable1 (
name nvarchar(250),
cat nvarchar(20),
val nvarchar(2)
)
insert into #temptable1
select name, ‘CPU’ as ‘cat’, ‘1’ as ‘val’
from #temptable where name not in
(select path from #perfcpudata)

insert into #temptable1
select name, ‘Memory’ as ‘cat’, ‘1’ as ‘val’
from #temptable where name not in
(select path from #perfmemdata)

insert into #temptable1
select name, ‘Disk’ as ‘cat’, ‘1’ as ‘val’
from #temptable where name not in
(select path from #perfdiskdata)

if object_id(‘tempdb..#output’) IS NOT NULL
DROP TABLE #output
create table #output (
name nvarchar(250),
cpu nvarchar(2),
memory nvarchar(2),
disk nvarchar(2)
)

insert into #output
select distinct tt.name ,’0′,’0′,’0′
from #temptable1 as tt, #healthstate as hs
where tt.name=hs.path collate SQL_Latin1_General_CP1_CI_AS
and hs.isavailable=1
and hs.ismanaged=1
and hs.healthstate is not null

update #output set cpu=1 where #output.name in (select name from #temptable1 where #temptable1.name=#output.name and #temptable1.cat=’CPU’)
update #output set memory=1 where #output.name in (select name from #temptable1 where #temptable1.name=#output.name and #temptable1.cat=’Memory’)
update #output set disk=1 where #output.name in (select name from #temptable1 where #temptable1.name=#output.name and #temptable1.cat=’Disk’)

select * from #output

Output of results in SQL management studio

Image