feat: add Prometheus alerting rules for API latency and error-rate SLAs (#846) #863

Merged
RUKAYAT-CODER merged 13 commits into rinafcode:main from KingDavid9999:fixes-issue-#846 on Jun 30, 2026

@KingDavid9999

Contributor

Summary

Closes #846
Adds Prometheus alerting rules for SLA breach detection on the TeachLink backend, along with Alertmanager Slack routing config and on-call runbooks. Previously, SLA breaches could be detected only by manually watching dashboards.

Changes

New files

charts/teachlink-backend/templates/prometheus-rules.yaml: PrometheusRule CR with 4 alerts
charts/teachlink-backend/values.yaml: configurable thresholds, Alertmanager/Slack routing
charts/teachlink-backend/Chart.yaml: Helm chart metadata
charts/teachlink-backend/templates/_helpers.tpl: standard Helm template helpers
docs/RUNBOOKS.md: on-call runbook for all 4 alerts
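
For readers unfamiliar with the PrometheusRule format, here is a minimal sketch of what one latency alert and one error-rate alert of this kind might look like once rendered. It is not the content of the merged `prometheus-rules.yaml`: the resource name, alert names, metric names (`http_request_duration_seconds_bucket`, `http_requests_total`), the `status` label, the `runbook` annotation key, and the thresholds are all assumptions chosen for illustration.

```yaml
# Illustrative sketch only. Alert names, metric names, labels, and thresholds
# are assumptions, not the values shipped in this PR.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: teachlink-backend-sla          # hypothetical resource name
  labels:
    app.kubernetes.io/name: teachlink-backend
spec:
  groups:
    - name: teachlink-backend.sla
      rules:
        # Fires when the 95th-percentile request latency stays above 0.5 s.
        - alert: ApiLatencyP95High     # hypothetical alert name
          expr: |
            histogram_quantile(
              0.95,
              sum by (le) (rate(http_request_duration_seconds_bucket[5m]))
            ) > 0.5
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "p95 API latency above 500 ms for 5 minutes"
            runbook: "docs/RUNBOOKS.md"
        # Fires when more than 1% of requests return a 5xx status.
        - alert: ApiErrorRateHigh      # hypothetical alert name
          expr: |
            sum(rate(http_requests_total{status=~"5.."}[5m]))
              /
            sum(rate(http_requests_total[5m])) > 0.01
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "API 5xx error rate above 1% for 5 minutes"
            runbook: "docs/RUNBOOKS.md"
```

In a Helm chart, the literal thresholds (`0.5`, `0.01`) and the `for` durations would typically be replaced with template references to keys in `values.yaml`, which is presumably how the configurable thresholds listed above are wired in.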

@RUKAYAT-CODER

Contributor

Thank you for contributing to the project.

RUKAYAT-CODER merged commit 97a6c56 into rinafcode:main on Jun 30, 2026
5 checks passed