Overview
An automated test that verifies a service or system component is functioning correctly.
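As a rough illustration of the definition above, the sketch below runs a set of probes and reports whether a component looks healthy. The function name `run_checks`, the probe names, and the report layout are all hypothetical choices made for this example, not part of any particular tool.

```python
from typing import Callable, Dict


def run_checks(probes: Dict[str, Callable[[], bool]]) -> dict:
    """Run each probe and summarise the results (hypothetical report format)."""
    results = {}
    for name, probe in probes.items():
        try:
            ok = bool(probe())
        except Exception:
            # A probe that raises is treated as a failed check.
            ok = False
        results[name] = "pass" if ok else "fail"
    # The component counts as healthy only if every probe passed.
    all_passed = all(value == "pass" for value in results.values())
    return {"status": "healthy" if all_passed else "unhealthy", "checks": results}


if __name__ == "__main__":
    # Stand-in probes; real ones might ping a database or call an endpoint.
    report = run_checks({"database": lambda: True, "cache": lambda: False})
    print(report)
```

In practice a result like this is usually exposed over an endpoint or command that a load balancer or orchestrator polls on a schedule, which is what makes the test automated.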
More in DevOps & Infrastructure
Vertical Scaling
CI/CD
Increasing the resources (CPU, RAM, storage) of an existing machine to handle more load.
Prometheus
Observability
An open-source monitoring and alerting toolkit designed for reliability and scalability in cloud-native environments.
Helm
Containers & Orchestration
A package manager for Kubernetes that simplifies the deployment and management of applications using charts.
Rolling Update
CI/CD
A deployment strategy that gradually replaces instances of the previous version with the new version.
GitOps
Infrastructure as Code
An operational framework using Git repositories as the single source of truth for declarative infrastructure and applications.
Runbook
Site Reliability
A documented set of procedures for handling routine operations and troubleshooting common issues.
Capacity Planning
Site Reliability
The process of determining the production capacity needed to meet changing demands for an organisation's products.
Alerting
Observability
Automated notifications triggered when system metrics or conditions exceed predefined thresholds.