Script to check event log for id = 7 for a specific drive?

ken · February 22, 2022, 3:50pm

I’d like to monitor for bad blocks on system drives. Does anyone have a script for this?

If not, is there a script variable that holds the timestamp of the last run on that specific asset, so I could limit the event log search?

jordanritz · February 22, 2022, 4:30pm

There really aren’t any script variables for the last time a script was run. But you can probably base your time frame on the recurring schedule you set for running the monitoring script.

Say your policy runs the script once a week: just pull 7 days of logs, or whatever time frame you need.
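
A minimal sketch of that approach, assuming the bad-block entries are Event ID 7 from the `disk` provider in the System log and that the script runs weekly. `New-BadBlockFilter`, `Get-BadBlockEvent`, and `$LookbackDays` are hypothetical names, not Syncro features:

```powershell
# Build a Get-WinEvent filter for bad-block events inside the lookback window.
# Assumption: bad blocks are logged as Event ID 7 by the "disk" provider.
function New-BadBlockFilter {
    param([int]$LookbackDays = 7)

    $days = [math]::Abs($LookbackDays)   # accept 7 or -7
    @{
        LogName      = 'System'
        ProviderName = 'disk'
        Id           = 7
        StartTime    = (Get-Date).AddDays(-$days)
    }
}

# Query the log (Windows only). Get-WinEvent raises an error when nothing
# matches, so the error is silenced and nothing is returned in that case.
function Get-BadBlockEvent {
    param([int]$LookbackDays = 7)

    $filter = New-BadBlockFilter -LookbackDays $LookbackDays
    Get-WinEvent -FilterHashtable $filter -ErrorAction SilentlyContinue
}
```

In a monitoring script, `@(Get-BadBlockEvent -LookbackDays 7).Count` would give the number of hits for the week. Match the lookback to the policy schedule so that runs neither overlap heavily nor leave gaps.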

jeff · February 22, 2022, 5:00pm

So on the topic of bad blocks… Yesterday I had a client send me a phone pic of a POST SMART DISK ERROR on reboot. I was not expecting that, as Syncro’s disk monitoring was enabled. So I need to step up my disk monitoring even more.

I have a similar script that I use to watch for Veeam messages. It might help you build one.

https://github.com/mrmsp/msp-script-dev/blob/main/veeam-backup-monitoring.ps1

# Create a Syncro RMMAlert when a Veeam backup fails. Also sets the asset custom field "Backup Status" to latest Veeam result. 
# Required Asset custom field "Backup Status". Must be run daily as the event log search is limited to 1 day back.
# Optional pass in $prev_status platform variable from {asset_custom_field_backup_status} to Alert if no backups after x number of days
# Optional pass in $num_days runtime variable to allow a few days between backups for some machines (weekends/mobile devices). Default 5 days
Import-Module $env:SyncroModule

## Set the default number of days back to check for good backups
if (!(Test-Path variable:num_days)) { $num_days = -5 }
## If a positive number was passed in, convert it to negative
if ($num_days -gt 0) { $num_days *= -1 }
## If days was set to 0, then change it to 1 day
if ($num_days -gt -1) { $num_days = -1 }

## Fetch the most recent Veeam log entry
$event = Get-EventLog "Veeam Agent" -InstanceID 190 -newest 1 -ErrorAction SilentlyContinue
##$event = Get-EventLog "Veeam Agent" -newest 1 -After (Get-Date).AddDays(-1) | Where-Object {$_.EventID -eq 190}

## No backup events found and Backup Status is empty
if ($event.count -eq 0 -and $prev_status -eq "") {
    write-host "Veeam Backup Missing! No Backups found."

This file has been truncated.

On a related matter: I have been trying to filter down the noise from Syncro’s disk error messages. I think some of them come from external devices such as SD cards, for example “The driver detected a controller error on \Device\Harddisk2\DR2”. There is no good reference for what “Harddisk2” or “DR2” means, so I have built a little script to gather more info for diagnosis. If anyone wants to help debug, here is what I have so far.

https://github.com/mrmsp/msp-script-dev/blob/main/diagnose-disk-controller-alerts.ps1

## Use this to diagnose which disk is reporting an event log "controller error". Most of the time it will be removable disks like USB flash drives 
## Syncro likes to alert on errors like this: "The driver detected a controller error on \Device\Harddisk2\DR2"
##
## Figuring out what is Harddisk2 (physical) DR2 (arbitrary sequential disk number) can be done by looking at the output of dd.exe --list
## Determine \Device\HarddiskN\DRx Where N is the physical drive Number and x is the sequential disk number
##
## Download dd from http://www.chrysocome.net/dd and include it in Syncro files. Set Destination File Name to c:\windows\temp\dd.exe
##
## Note: other messages, such as "The device, \Device\Harddisk0\DR0, has a bad block.", may also be generated. This script won't find those.

if ($RunMode -eq "list_drives") {
    ## Use dd to dump a list of disks to the Script Output
    c:\windows\temp\dd.exe --list
    
    ##Get-PhysicalDisk | Select -Prop DeviceId,FriendlyName,SerialNumber
    Get-PhysicalDisk | Format-List DeviceId,FriendlyName,SerialNumber
    ## Index Time          EntryType   Source                 InstanceID Message 
    Get-EventLog -Logname System -EntryType Error -Message "*Device\Harddisk*" | Format-List TimeGenerated,Message
    #| Tee-Object -Variable hdderrors
    #Write-Host "Total disk device errors: $($hdderrors.count)"

This file has been truncated.

Jimmie · February 22, 2022, 6:21pm

Curious if you have the HD SMART Failure enabled on the policy. You can view the smart info in Backgrounding Tools, but beyond that, there’s no visibility into it.

jeff · February 22, 2022, 6:37pm

Thanks @jimmie I completely missed the SMART Monitor! All this time I was just watching the Windows event log.

Instructions:
In the policy: Add a Monitor > Alerts > Add Trigger > “HD SMART FAILURE”. It should appear in the asset under Monitoring a few minutes after saving the policy. See image here: Syncro Asset Details

Jimmie · February 22, 2022, 6:40pm

In CWA, the data was cached and also showed values in red if they were out of bounds. Unfortunately, this is just another bit of data that is stuck in Backgrounding Tools and not available throughout the platform like it should be. The agent has checks built into it to check the SMART status, so it wouldn’t be that difficult for it to be logged in the system.

jeff · February 22, 2022, 6:45pm

I was curious what the results would look like on this failing drive I am now tracking. While I am waiting for the alert to pop from Syncro, here is what I can see in a terminal.

C:\WINDOWS\system32> wmic /namespace:\\root\wmi path MSStorageDriver_FailurePredictStatus
Active  InstanceName                                                   PredictFailure  Reason  
TRUE    SCSI\Disk&Ven_INTEL&Prod_SSDSC2BW240A4\4&36cd859d&0&000000_0   TRUE            0       
TRUE    SCSI\Disk&Ven_WDC&Prod_WD5000AAKX-60U6A\4&36cd859d&0&010000_0  FALSE           0       
C:\WINDOWS\system32>
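
For use inside a monitoring script, the same WMI class can be read with `Get-CimInstance` instead of `wmic`. This is a minimal sketch that assumes an elevated session; `Select-PredictedFailure` and `Get-PredictedFailure` are hypothetical helper names:

```powershell
# Keep only the disks whose SMART status predicts a failure.
function Select-PredictedFailure {
    param([object[]]$Status)

    $Status | Where-Object { $_.PredictFailure } |
        Select-Object InstanceName, Reason
}

# Read the same class that the wmic command queries (Windows only, elevated).
function Get-PredictedFailure {
    $status = Get-CimInstance -Namespace 'root\wmi' `
        -ClassName 'MSStorageDriver_FailurePredictStatus' -ErrorAction SilentlyContinue
    Select-PredictedFailure -Status $status
}
```

With the two drives shown above, `Get-PredictedFailure` would be expected to return only the INTEL SSD row, since that is the one reporting `PredictFailure` as TRUE.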

Jimmie · February 22, 2022, 6:47pm

Backgrounding Tools > System Info > Smart, can you see the “failure” there?

jeff · February 22, 2022, 6:54pm

There are 2 drives (SSD/HDD) and Syncro Live has the SMART data reversed. Looking at the SMART values, one of the drives is clearly failing: ReallocatedSectorCount is way over the threshold.

Added clarity on the reversed drives: the boot drive is the SSD, which is failing. The SMART data presented for the SSD was the one with 0 ReallocatedSectorCount. So be careful out there and double check which drive you replace.

[Screenshot: 2022-02-22 12_48_38-HP-CCC _ Syncro Live]

[Screenshot: 2022-02-22 12_48_03-HP-CCC _ Syncro Live]

ken · February 22, 2022, 7:31pm

I hear you on the noise. I can’t enable the bad block detection; it’s way too noisy. That’s why I want to filter on the system drive, and preferably on the data drives on a server as well.
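
One way to cut that noise is to resolve the system drive to its disk number and keep only the events that name that disk. This is a minimal sketch, assuming the `N` in `\Device\HarddiskN\` matches the `DiskNumber` reported by `Get-Partition` and that bad blocks are logged as Event ID 7 by the `disk` provider in the System log. The function names and `$LookbackDays` are hypothetical:

```powershell
# Keep only messages that name the given disk, e.g. \Device\Harddisk0\DR0.
# The trailing backslash stops disk 1 from also matching Harddisk10.
function Select-DiskEventMessage {
    param([string[]]$Message, [int]$DiskNumber)

    $pattern = '\\Device\\Harddisk' + $DiskNumber + '\\'
    $Message | Where-Object { $_ -match $pattern }
}

# Windows only: find bad-block messages for the disk holding the system drive.
function Get-SystemDiskBadBlock {
    param([int]$LookbackDays = 7)

    $letter = $env:SystemDrive.Substring(0, 1)               # "C:" -> "C"
    $disk   = (Get-Partition -DriveLetter $letter).DiskNumber
    $filter = @{
        LogName      = 'System'
        ProviderName = 'disk'                                # assumption
        Id           = 7
        StartTime    = (Get-Date).AddDays(-$LookbackDays)
    }
    $events = Get-WinEvent -FilterHashtable $filter -ErrorAction SilentlyContinue
    Select-DiskEventMessage -Message $events.Message -DiskNumber $disk
}
```

For data drives on a server, the same message filter could be run once per disk number of interest instead of only for the system disk.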

jeff · February 22, 2022, 7:44pm

All of my servers are virtualized, so I use SNMP to watch the iDRAC which watches the hardware RAID card.

Jimmie · February 22, 2022, 9:20pm

The realData column should be the right one to look at. It looks like your SSD has a value of 28 for reallocated sectors and the other drive has 0. Some reallocation is normal; drives are designed with spare sectors to accommodate it. The Current and Threshold values are more like a health score: if Current falls below Threshold, that’s normally when SMART trips. Syncro is missing the Worst attribute, though, which makes it a little hard to diagnose. So yeah, I don’t know what tripped the predictive failure lol. You might need CrystalDiskInfo to see the info better.
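
The Current/Threshold rule described above can be sketched as a small check. The attribute values below are hypothetical and were not read from the drive in this thread. The comparison uses "at or below", which is the usual SMART convention and slightly stricter than "falls below":

```powershell
# Flag SMART attributes whose normalized Current value is at or below Threshold.
# Assumption: a Threshold of 0 means the attribute never trips, so it is skipped.
function Select-TrippedSmartAttribute {
    param([object[]]$Attribute)

    $Attribute | Where-Object { $_.Threshold -gt 0 -and $_.Current -le $_.Threshold }
}

# Hypothetical sample values for illustration only.
$sample = @(
    [pscustomobject]@{ Name = 'ReallocatedSectorCount'; Current = 100; Threshold = 10; RealData = 28 }
    [pscustomobject]@{ Name = 'ExampleWornAttribute';   Current = 5;   Threshold = 10; RealData = 0 }
)

Select-TrippedSmartAttribute -Attribute $sample | Select-Object Name, Current, Threshold
```

Only the made-up `ExampleWornAttribute` row comes back. A raw count of 28 reallocated sectors does not by itself push Current below Threshold, which is why the raw data and the health values need to be read together.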

isaacg · February 22, 2022, 11:18pm

Here’s what I use: Monitor - Drive SMART Values (Pastebin.com). Syncro’s SMART monitor is kind of useless, as SMART typically doesn’t fail until the drive is in really bad shape, if ever. I also have the ‘Disk’ events check in the event log policy, minus the one for controller errors, as it’s super noisy with goofy USB drive controllers.

jeff · February 23, 2022, 1:47am

Thanks @isaacg this script is great.
