I'm wondering if anybody has a good implementation in Syncro for a heartbeat monitor that opens a ticket if an asset stays down for a defined period of time.
It has been suggested that I run a script to verify a service's status, but that checks too far down the line and brings complex scenario-handling requirements.
I just want to know if the agent has been unavailable for 5 or 10 minutes, and I would prefer a ticket, since a 10-minute outage is likely an infrastructure incident that I need to investigate.
Does anyone have thoughts, ideas, or successful implementations they are willing to share?
We do this natively in policies. You’d use the offline alert and set the number of minutes before an alert fires. When the alert fires you can pick that up in Automated Remediation, and take whatever actions you’d like (including opening tickets).
Yep, go into Automated Remediation. That is where you manage alert-based actions. Use the Trigger Category for the offline alert as the condition, then attach as many actions as you need. It's a super granular system, so you can do just about anything you want.
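If it helps to see the logic spelled out, here is a rough Python sketch of what an offline threshold plus a ticket action amounts to. This is not Syncro's API or how Syncro implements it; `OfflineMonitor`, `open_ticket`, and the asset name `server-01` are all made up for illustration, and in Syncro you would configure this in the policy and Automated Remediation rather than write code.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class OfflineMonitor:
    """Hypothetical model of an offline alert with a minute threshold."""
    threshold: timedelta
    last_seen: dict = field(default_factory=dict)  # asset name -> last check-in time
    alerted: set = field(default_factory=set)      # assets with an open alert

    def check_in(self, asset, now):
        # A check-in records the heartbeat and clears any open alert.
        self.last_seen[asset] = now
        self.alerted.discard(asset)

    def evaluate(self, now):
        """Return assets that have just crossed the offline threshold."""
        newly_offline = []
        for asset, seen in self.last_seen.items():
            if now - seen >= self.threshold and asset not in self.alerted:
                self.alerted.add(asset)
                newly_offline.append(asset)
        return newly_offline


def open_ticket(asset, minutes):
    # Stand-in for the remediation action; returns a ticket subject line.
    return f"{asset} offline for at least {minutes} minutes"


monitor = OfflineMonitor(threshold=timedelta(minutes=10))
start = datetime(2024, 1, 1, 9, 0)
monitor.check_in("server-01", start)

early = monitor.evaluate(start + timedelta(minutes=5))    # below threshold
late = monitor.evaluate(start + timedelta(minutes=10))    # threshold reached
repeat = monitor.evaluate(start + timedelta(minutes=15))  # alert already open
tickets = [open_ticket(asset, 10) for asset in late]
```

The point of tracking `alerted` is that one outage produces one ticket rather than a new ticket on every evaluation, which is the behavior you would want from an alert-driven action.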