Top 10 Writing Prompt Templates for SREs to Boost Automation

In Site Reliability Engineering (SRE), automation is central to keeping systems stable and efficient. Well-crafted prompts can make automation workflows more effective, helping SREs troubleshoot, monitor, and optimize systems. Below are 10 prompt templates written for common SRE tasks; adapt the specifics to your own environment.

1. System Health Check Prompt

Generate a comprehensive system health report based on recent logs, metrics, and alerts. Include CPU usage, memory consumption, disk I/O, network traffic, and error rates to identify potential issues.
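
One way to use this template in automation is to attach a metrics snapshot to it before handing it to whatever model or assistant receives the prompt. The following is a minimal sketch: the function name `build_health_prompt`, the service name, and the metric keys are hypothetical and not tied to any particular monitoring tool.

```python
HEALTH_PROMPT = (
    "Generate a comprehensive system health report based on recent logs, "
    "metrics, and alerts. Include CPU usage, memory consumption, disk I/O, "
    "network traffic, and error rates to identify potential issues."
)


def build_health_prompt(service: str, snapshot: dict[str, float]) -> str:
    """Append a name-sorted metrics snapshot to the health-check prompt."""
    lines = [f"- {name}: {value}" for name, value in sorted(snapshot.items())]
    return f"{HEALTH_PROMPT}\n\nService: {service}\nMetrics:\n" + "\n".join(lines)


# Hypothetical example values, for illustration only.
prompt = build_health_prompt(
    "checkout-api", {"error_rate": 0.02, "cpu_percent": 71.5}
)
```

Sorting the metrics keeps the prompt text stable between runs, which makes changes in the output easier to compare.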

2. Incident Response Automation

Create a step-by-step incident response plan for [specific issue], including initial diagnosis, escalation procedures, and remediation steps. Tailor the plan to recent incident data.

3. Deployment Rollback Script

Write a script to automate rollback procedures for failed deployments. Ensure it checks current deployment status, preserves data integrity, and restores previous stable versions.
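
What the generated script looks like depends on the deployment tooling in use. As a tooling-neutral sketch, the core decision (which version to restore) might be expressed as below. The `Release` record, its `healthy` flag, and `pick_rollback_target` are hypothetical names for illustration; a real script would call the deployment system's own API and handle data migrations separately.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Release:
    version: str
    healthy: bool  # assumed to come from post-deploy health checks


def pick_rollback_target(history: list[Release]) -> Optional[Release]:
    """Return the most recent healthy release before the current one.

    `history` is ordered oldest to newest; the last entry is the current
    deployment. Returns None when no rollback is needed or none is possible.
    """
    if not history or history[-1].healthy:
        return None  # nothing deployed, or the current deployment is fine
    for release in reversed(history[:-1]):
        if release.healthy:
            return release
    return None


# Hypothetical deployment history.
history = [Release("1.4.0", True), Release("1.4.1", False), Release("1.5.0", False)]
target = pick_rollback_target(history)
```

Checking the current deployment's status first mirrors the prompt's requirement that the script verify state before restoring anything.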

4. Log Analysis and Anomaly Detection

Analyze recent logs to detect anomalies or patterns indicating potential failures. Summarize findings and suggest proactive measures to prevent future issues.
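
A simple statistical baseline can serve as a cross-check on the findings this prompt produces. The sketch below assumes per-minute error counts have already been extracted from the logs, and compares each count with the mean and standard deviation of the preceding window. The window size of 5 and the threshold of 3.0 are arbitrary illustrative values, and `find_spikes` is a hypothetical name.

```python
from statistics import mean, pstdev


def find_spikes(
    counts: list[int], window: int = 5, z_threshold: float = 3.0
) -> list[int]:
    """Return indexes whose count exceeds the trailing-window mean by more
    than z_threshold standard deviations of that window."""
    spikes = []
    for i in range(window, len(counts)):
        baseline = counts[i - window:i]
        mu = mean(baseline)
        sigma = pstdev(baseline) or 1.0  # fall back to 1.0 on a flat baseline
        if (counts[i] - mu) / sigma > z_threshold:
            spikes.append(i)
    return spikes


# Hypothetical errors-per-minute series with one obvious spike.
errors_per_minute = [2, 3, 2, 4, 3, 2, 3, 40, 3, 2]
spikes = find_spikes(errors_per_minute)
```

A trailing window is used instead of a global z-score because a single large spike inflates the global standard deviation and can hide itself in short series.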

5. Resource Scaling Recommendations

Based on current system metrics, recommend optimal resource scaling actions. Include thresholds for CPU, memory, and network usage that trigger automatic scaling.
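
Thresholds recommended in response to this prompt can be encoded as a small rule check. In this sketch, the threshold values (80% and 30%) and the names `ScalingPolicy` and `scaling_action` are illustrative assumptions, not recommended settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingPolicy:
    scale_up_above: float = 80.0    # percent utilization; assumed value
    scale_down_below: float = 30.0  # percent utilization; assumed value


def scaling_action(
    metrics: dict[str, float], policy: ScalingPolicy = ScalingPolicy()
) -> str:
    """Scale up if any metric is hot; scale down only if all are idle."""
    if not metrics:
        return "hold"
    if any(value > policy.scale_up_above for value in metrics.values()):
        return "scale_up"
    if all(value < policy.scale_down_below for value in metrics.values()):
        return "scale_down"
    return "hold"


# Hypothetical utilization percentages.
action = scaling_action({"cpu": 91.0, "memory": 55.0, "network": 20.0})
```

The asymmetry is deliberate: one saturated resource is enough to justify adding capacity, while removing capacity is only safe when every resource is underused.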

6. Security Audit Checklist

Generate a security audit checklist for the latest system configuration. Highlight common vulnerabilities, misconfigurations, and compliance issues to address.

7. Capacity Planning Forecast

Create a capacity planning forecast based on historical data. Include growth trends, peak usage times, and recommendations for future infrastructure investments.
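
To illustrate the growth-trend part of this prompt, the sketch below fits a straight line to historical usage with ordinary least squares and extrapolates it. This is a deliberate simplification: it assumes evenly spaced observations and ignores seasonality and peak usage times. The function name and the sample data are hypothetical.

```python
def linear_forecast(history: list[float], periods_ahead: int) -> float:
    """Fit y = a + b*x by ordinary least squares (x = 0, 1, 2, ...) and
    extrapolate periods_ahead steps past the last observation."""
    n = len(history)
    if n < 2:
        raise ValueError("need at least two observations")
    x_mean = (n - 1) / 2
    y_mean = sum(history) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(history))
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept + slope * (n - 1 + periods_ahead)


# Hypothetical monthly storage usage in TB, growing by 10 TB per month.
monthly_storage_tb = [100.0, 110.0, 120.0, 130.0]
forecast = linear_forecast(monthly_storage_tb, periods_ahead=3)
```

With this perfectly linear sample series, the forecast three months past the last observation is 160 TB.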

8. Automated Alert Message

Draft an automated alert message template for critical system failures. Ensure it includes essential details such as affected components, severity, and recommended actions.
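
A template drafted from this prompt can be stored and filled in by the alerting pipeline. The sketch below uses Python's standard `string.Template`; the field names, wording, and example values are assumptions chosen to match the details listed above.

```python
from string import Template

ALERT_TEMPLATE = Template(
    "[$severity] $component is failing\n"
    "Summary: $summary\n"
    "Recommended action: $action"
)


def render_alert(severity: str, component: str, summary: str, action: str) -> str:
    """Fill the alert template. substitute() is used instead of
    safe_substitute() so an unfilled placeholder raises an error."""
    return ALERT_TEMPLATE.substitute(
        severity=severity.upper(),
        component=component,
        summary=summary,
        action=action,
    )


# Hypothetical alert values.
message = render_alert(
    "critical",
    "payments-db",
    "replication lag above 5 minutes",
    "fail over to the standby",
)
```

Failing loudly on a missing field is a design choice: an alert with a blank component or action is arguably worse than a rendering error caught in testing.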

9. Configuration Drift Detection

Compare current system configurations against baseline settings. Identify and report any drift or unauthorized changes.
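
For flat key-value configurations, the comparison itself can be sketched in a few lines. The example below assumes both configurations have already been loaded into dictionaries; the function name `detect_drift` and the configuration keys are hypothetical, and nested configurations would need a recursive variant.

```python
def detect_drift(
    baseline: dict[str, object], current: dict[str, object]
) -> dict[str, dict]:
    """Classify keys as added, removed, or changed relative to the baseline."""
    added = {k: current[k] for k in current.keys() - baseline.keys()}
    removed = {k: baseline[k] for k in baseline.keys() - current.keys()}
    changed = {
        k: {"expected": baseline[k], "actual": current[k]}
        for k in baseline.keys() & current.keys()
        if baseline[k] != current[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical baseline and live configuration.
baseline = {"max_connections": 200, "tls": "1.3", "log_level": "info"}
current = {"max_connections": 500, "tls": "1.3", "debug_port": 9229}
drift = detect_drift(baseline, current)
```

Separating added keys from changed values helps with triage, since a setting that never existed in the baseline (such as an open debug port) often deserves a different response than a tuned value.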

10. Post-Incident Analysis Report

Generate a detailed post-incident analysis report summarizing root causes, impact, response effectiveness, and lessons learned. Include actionable recommendations for future prevention.