Overview
Adding more machines or nodes to a system to handle increased load.
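As a rough illustration of why adding nodes helps, the sketch below spreads a fixed number of requests across a pool of nodes and reports the busiest node's share. It assumes simple round-robin distribution, which is only one of several possible strategies. The function name `distribute` and the node names are hypothetical, chosen for this example.

```python
from collections import Counter
from itertools import cycle


def distribute(num_requests, nodes):
    """Assign requests to nodes round-robin and return the per-node request count."""
    if not nodes:
        raise ValueError("at least one node is required")
    # Start every node at zero so idle nodes still appear in the result.
    load = Counter({node: 0 for node in nodes})
    node_cycle = cycle(nodes)
    for _ in range(num_requests):
        load[next(node_cycle)] += 1
    return dict(load)


# Hypothetical pools: the same 1200 requests before and after adding two nodes.
before = distribute(1200, ["node-1", "node-2"])
after = distribute(1200, ["node-1", "node-2", "node-3", "node-4"])

print(max(before.values()))  # 600 requests on the busiest node
print(max(after.values()))   # 300 requests on the busiest node
```

Doubling the pool halves the per-node load in this idealized case. Real systems rarely divide work this evenly, since request cost and node capacity vary.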
More in DevOps & Infrastructure
Site Reliability Engineering
Site Reliability: A discipline applying software engineering principles to infrastructure and operations to create scalable, reliable systems.
Runbook
Site Reliability: A documented set of procedures for handling routine operations and troubleshooting common issues.
Rollback
CI/CD: The process of reverting a system to a previous version or state after a failed deployment or update.
Observability
Observability: The ability to understand a system's internal state from its external outputs, encompassing metrics, logs, and traces.
Distributed Tracing
Observability: A method of tracking requests as they flow through distributed systems to diagnose latency and failure points.
Configuration Management
Infrastructure as Code: The practice of systematically managing system configurations and keeping them consistent.
Logging
Observability: The practice of recording events, errors, and system activities for debugging, auditing, and analysis.
Puppet
Infrastructure as Code: A configuration management tool that automates the provisioning and management of infrastructure.