Deep LearningArchitectures

Gradient Checkpointing

Overview

A memory optimisation that trades computation for memory by recomputing intermediate activations during the backward pass instead of storing them all during the forward pass.

More in Deep Learning