The Site Reliability Guardian provides an automated change impact analysis to validate service availability, performance, and capacity objectives across various systems. This enables DevOps platform engineers to make the right release decisions for new versions and empowers SREs to apply Service-Level Objectives (SLOs) for their critical services.
While the Dynatrace Site Reliability Guardian simplifies the adoption of DevOps and SRE best practices to ensure reliable, secure, and high-quality releases in general, the provisioned workflow is key for automating those best practices in particular. This sample shows how the workflow acts on changes in your environment and how it performs a validation to support the right decision in a release or progressive delivery process.
In this lab exercise, you will learn more about the workflow leveraged by the Site Reliability Guardian.
Workflows allow you to:
Open Apps -> Site Reliability Guardian.
Select + Guardian.
Select Create without template. A new guardian is displayed in the editor.
Enter the following query:
fetch logs
| fieldsAdd errors = toLong(loglevel == "ERROR")
| summarize errorRate = sum(errors)/count() * 100
Select Run query, then select the last 1 hour to preview the results for your current error rate.
Select "Lower than these numbers is good" and enter the thresholds:
1
0.4
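To make the objective more concrete, the following is a minimal Python sketch of what the query and thresholds express. It is not Dynatrace code and does not call any Dynatrace API. It assumes log records are plain dictionaries with a `loglevel` key, that 0.4 is the warning threshold and 1 is the failure threshold, and that a value exactly on a threshold counts as breaching it. The function names `error_rate` and `classify` are hypothetical.

```python
from typing import Iterable, Mapping


def error_rate(records: Iterable[Mapping[str, str]]) -> float:
    """Percentage of records with loglevel ERROR.

    Mirrors sum(errors)/count() * 100 from the query above.
    """
    rows = list(records)
    if not rows:
        # Assumption: an empty window is reported as 0% for this sketch.
        return 0.0
    errors = sum(1 for row in rows if row.get("loglevel") == "ERROR")
    return errors / len(rows) * 100


def classify(value: float, warning: float = 0.4, failure: float = 1.0) -> str:
    """Classify a value where lower is good.

    Assumption: which threshold is warning and which is failure,
    and the >= comparison, are illustrative choices.
    """
    if value >= failure:
        return "fail"
    if value >= warning:
        return "warning"
    return "pass"


if __name__ == "__main__":
    # 5 errors out of 1000 log lines gives an error rate of about 0.5%.
    sample = [{"loglevel": "ERROR"}] * 5 + [{"loglevel": "INFO"}] * 995
    rate = error_rate(sample)
    print(rate, classify(rate))
```

Under these assumptions, an error rate between the two thresholds produces a warning rather than a failed validation.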
In today's lab on Workflows, we'll use the guardian we created in the previous step and run it on a schedule to ensure we always meet our SLO.
Select the Time interval trigger.
Set the interval to 10 minutes and the Rule parameter to every day.
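The cadence this trigger produces can be sketched in plain Python. This is only an illustration of a 10-minute interval and does not reflect how Dynatrace schedules workflows internally. The function name `next_runs` and the start time are hypothetical.

```python
from datetime import datetime, timedelta
from typing import List


def next_runs(start: datetime, interval_minutes: int = 10, count: int = 3) -> List[datetime]:
    """Return the next `count` run times after `start`, spaced by the interval."""
    step = timedelta(minutes=interval_minutes)
    return [start + step * i for i in range(1, count + 1)]


def runs_per_day(interval_minutes: int = 10) -> int:
    """Number of whole intervals that fit into 24 hours."""
    return (24 * 60) // interval_minutes


if __name__ == "__main__":
    for run in next_runs(datetime(2024, 1, 1, 9, 0)):
        print(run.isoformat())
    print(runs_per_day())
```

With a 10-minute interval running every day, the guardian validation would execute 144 times per day, so each validation covers a short, recent window of log data.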
In this section, you should have completed the following:
✅ Understand what a Site Reliability Guardian is and how it can strengthen SRE practices
✅ Understand use cases for Workflows
✅ Created a Site Reliability Guardian
✅ Created a simple workflow using the SRG