Script to check event log for id = 7 for a specific drive?

I’d like to monitor for bad blocks on system drives. Does anyone have a script for this?

If not, is there a script variable that holds the timestamp of the last run on that specific asset, so I could limit the event log search?

There really aren’t any script variables for the last time a script was run. But you can probably base your time frame on the recurring schedule you set for running the monitoring script.

Say in your policy you set the script to run once a week: just pull 7 days of logs, or whatever time frame you need.
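A rough sketch of that lookback idea, written in Python only because it is easy to test; the query string itself can be pasted into a PowerShell or batch script. It assumes the bad block events land in the System log under the `disk` provider with event ID 7. `build_query`, `query_events`, and `lookback_days` are names made up for this example, not Syncro variables.

```python
import subprocess


def build_query(event_id: int = 7, lookback_days: int = 7, provider: str = "disk") -> str:
    """Build an event log XPath filter limited to the last `lookback_days` days."""
    # timediff() in event log XPath filters is expressed in milliseconds.
    lookback_ms = lookback_days * 24 * 60 * 60 * 1000
    return (
        f"*[System[Provider[@Name='{provider}'] and (EventID={event_id}) "
        f"and TimeCreated[timediff(@SystemTime) <= {lookback_ms}]]]"
    )


def query_events(query: str) -> str:
    """Run the filter against the System log with wevtutil (Windows only)."""
    result = subprocess.run(
        ["wevtutil", "qe", "System", f"/q:{query}", "/f:text"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Match lookback_days to the schedule set in the policy (weekly here).
    print(build_query(event_id=7, lookback_days=7))
```

To narrow this to one drive, the returned text could then be searched for the `\Device\HarddiskN` string of the disk you care about.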

So on the topic of bad blocks… Yesterday I had a client send me a phone pic of a POST SMART disk error on reboot. I was not expecting that, as Syncro’s disk monitoring was enabled. So I need to step up my disk monitoring even more.

I have a script of similar type that I use to watch for Veeam messages. This might help you build one.

In a related matter… I have been trying to filter down the noise a bit on disk error messages from Syncro. I think I am getting messages from external SD cards and the like, such as “The driver detected a controller error on \Device\Harddisk2\DR2”. There is no good reference for what “Harddisk2” or “DR2” refers to, so I have built a little script to get some more info to diagnose. If anyone wants to help debug, here is what I have so far.

Curious if you have the HD SMART Failure enabled on the policy. You can view the smart info in Backgrounding Tools, but beyond that, there’s no visibility into it.

Thanks @jimmie I completely missed the SMART Monitor! All this time I was just watching the Windows event log.

Instructions:
In the policy: Add a Monitor > Alerts > Add Trigger > “HD SMART FAILURE”. It should appear in the asset under Monitoring a few minutes after saving the policy.

In CWA, the data was cached and also showed values in red if they were out of bounds. Unfortunately, this is just another bit of data that is stuck in Backgrounding Tools and not available throughout the platform like it should be. The agent has checks built into it to check the SMART status, so it wouldn’t be that difficult for it to be logged in the system.

I was curious what the results would look like on this failing drive I am now tracking. While I am waiting for the Alert to pop from Syncro here is what I can see in a terminal.

C:\WINDOWS\system32> wmic /namespace:\\root\wmi path MSStorageDriver_FailurePredictStatus
Active  InstanceName                                                   PredictFailure  Reason  
TRUE    SCSI\Disk&Ven_INTEL&Prod_SSDSC2BW240A4\4&36cd859d&0&000000_0   TRUE            0       
TRUE    SCSI\Disk&Ven_WDC&Prod_WD5000AAKX-60U6A\4&36cd859d&0&010000_0  FALSE           0       
C:\WINDOWS\system32>
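For anyone who wants to alert on that output from a script, here is a minimal sketch that picks out the rows where PredictFailure is TRUE. It parses the exact text pasted above and assumes the InstanceName contains no spaces, which holds for this sample but may not on other hardware. `failing_drives` is a made-up helper name.

```python
from typing import List

# The wmic output pasted above, used here as sample input.
SAMPLE = r"""Active  InstanceName                                                   PredictFailure  Reason
TRUE    SCSI\Disk&Ven_INTEL&Prod_SSDSC2BW240A4\4&36cd859d&0&000000_0   TRUE            0
TRUE    SCSI\Disk&Ven_WDC&Prod_WD5000AAKX-60U6A\4&36cd859d&0&010000_0  FALSE           0
"""


def failing_drives(wmic_output: str) -> List[str]:
    """Return the InstanceName of every row whose PredictFailure column is TRUE."""
    failing = []
    for line in wmic_output.splitlines():
        fields = line.split()
        # Data rows have four columns: Active, InstanceName, PredictFailure, Reason.
        # The header row is skipped because its first field is not TRUE/FALSE.
        if len(fields) == 4 and fields[0] in ("TRUE", "FALSE") and fields[2] == "TRUE":
            failing.append(fields[1])
    return failing


if __name__ == "__main__":
    for name in failing_drives(SAMPLE):
        print("Predicted failure:", name)
```

On this sample it flags only the INTEL SSD, which matches the drive being tracked.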

Backgrounding Tools > System Info > Smart, can you see the “failure” there?

There are 2 drives (SSD/HDD) and Syncro Live has the SMART data reversed. Clearly one of the drives is failing by looking at the SMART values. ReallocatedSectorCount is way over the threshold.

Added clarity on the reversed drives: the boot drive is the SSD, which is failing. The SMART data presented for the SSD was the one with 0 ReallocatedSectorCount. So be careful out there and double check which drive you replace.

[Two screenshots from Syncro Live on HP-CCC, 2022-02-22]

I hear you on the noise. I can’t enable the bad block detection; it’s way too much noise. That’s why I want to filter based on the system drive, and preferably data drives on a server as well.

All of my servers are virtualized, so I use SNMP to watch the iDRAC which watches the hardware RAID card.

The realData should be the right column to look at. It looks like your SSD has a value of 28 for reallocated sectors, and the other drive has 0. It’s normal for some reallocation to happen; drives are designed with spare sectors to accommodate that. The Current and Threshold values are more like a health score: if the Current value falls below the Threshold value, that’s when it normally trips SMART. Syncro is missing the Worst attribute though, so it makes it a little hard to diagnose. So yeah, I don’t know what tripped the predictive failure lol. Might need CrystalDiskInfo to see the info better.
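To make the Current vs Threshold point concrete, here is a small sketch. The only number taken from this thread is the raw count of 28; the Current and Threshold values are hypothetical. It also assumes an attribute counts as tripped when Current is at or below Threshold, and that a Threshold of 0 never trips. `SmartAttribute` and `has_tripped` are made-up names.

```python
from dataclasses import dataclass


@dataclass
class SmartAttribute:
    name: str
    current: int    # normalized health value, higher is better
    threshold: int  # vendor-set limit for the normalized value
    raw: int        # the raw counter (the realData column)


def has_tripped(attr: SmartAttribute) -> bool:
    """Assumed rule: tripped when the normalized value is at or below the threshold."""
    # The raw counter is deliberately not compared against the threshold.
    return attr.threshold > 0 and attr.current <= attr.threshold


if __name__ == "__main__":
    # raw=28 is from the thread; current and threshold are hypothetical.
    reallocated = SmartAttribute("ReallocatedSectorCount", current=100, threshold=10, raw=28)
    print(reallocated.name, "tripped:", has_tripped(reallocated))
```

The takeaway is that a raw count of 28 being larger than a threshold of 10 does not by itself mean the attribute has failed, since the two numbers are on different scales.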

Here’s what I use: Monitor - Drive SMART Values - Pastebin.com. Syncro’s SMART monitor is kind of useless, as SMART typically doesn’t fail until the drive is in really bad shape, if ever. I also have the ‘Disk’ events check in the event log policy, minus the one for controller errors, as it’s super noisy with goofy USB drive controllers.

Thanks @isaacg this script is great.