Overview
A quantitative measure of some aspect of the level of service being provided.
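As a rough illustration, a measure of this kind is often computed as a ratio or a percentile over recent requests. The sketch below assumes two common choices, request availability and latency percentile. The `Request` record, the function names, and the sample data are hypothetical and not part of any particular monitoring tool.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    status: int        # HTTP status code (assumed field)
    latency_ms: float  # time taken to serve the request (assumed field)


def availability(requests):
    """Fraction of requests that did not fail with a server error (5xx)."""
    if not requests:
        raise ValueError("no requests to measure")
    good = sum(1 for r in requests if r.status < 500)
    return good / len(requests)


def latency_percentile(requests, pct):
    """Nearest-rank percentile of request latency, in milliseconds."""
    if not requests:
        raise ValueError("no requests to measure")
    ordered = sorted(r.latency_ms for r in requests)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


# Hypothetical sample: three successful requests and one server error.
sample = [
    Request(200, 120.0),
    Request(200, 95.0),
    Request(503, 900.0),
    Request(200, 110.0),
]

print(availability(sample))            # 0.75
print(latency_percentile(sample, 50))  # 110.0
```

Either number on its own is only a measurement. It becomes useful once it is compared against a target agreed for the service.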
More in DevOps & Infrastructure
Incident Management
Site Reliability: The processes and tools for detecting, responding to, resolving, and learning from service disruptions.
Service Discovery
CI/CD: The automatic detection of devices and services on a network, enabling dynamic service-to-service communication.
Distributed Tracing
Observability: A method of tracking requests as they flow through distributed systems to diagnose latency and failure points.
Capacity Planning
Site Reliability: The process of determining the production capacity needed to meet changing demands for an organisation's products.
Horizontal Scaling
CI/CD: Adding more machines or nodes to a system to handle increased load.
Ansible
Infrastructure as Code: An open-source automation tool for configuration management, application deployment, and task automation.
GitOps
Infrastructure as Code: An operational framework using Git repositories as the single source of truth for declarative infrastructure and applications.
Monitoring
Observability: The continuous observation of system performance, availability, and health using automated tools and dashboards.