Overview
A partitioning algorithm that divides data into k clusters by minimising the total squared distance between each point and the centroid of its assigned cluster.
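The definition above can be illustrated with a minimal sketch. It assumes the usual iterative procedure for this kind of centroid-based clustering (often called Lloyd's algorithm): assign each point to its nearest centroid, recompute each centroid as the mean of its points, and repeat until the assignments stop changing. The function names (`k_means`, `squared_distance`, `mean_point`, `inertia`), the random initialisation, and the `max_iter` and `seed` parameters are illustrative assumptions, not part of the original entry.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, max_iter=100, seed=0):
    """Cluster `points` into `k` groups; returns (centroids, assignments)."""
    rng = random.Random(seed)
    # Assumption: initialise centroids by sampling k distinct input points.
    centroids = rng.sample(points, k)
    assignments = None

    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Stop once no point changes cluster.
        if new_assignments == assignments:
            break
        assignments = new_assignments

        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = mean_point(members)

    return centroids, assignments


def inertia(points, centroids, assignments):
    """Total squared distance from each point to its assigned centroid."""
    return sum(squared_distance(p, centroids[a]) for p, a in zip(points, assignments))


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    centroids, labels = k_means(data, k=2)
    print(centroids, labels, inertia(data, centroids, labels))
```

The result depends on the initial centroids, so in practice the procedure is often run from several random starts and the run with the lowest `inertia` value is kept.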
More in Machine Learning
Feature Store
MLOps & Production: A centralised repository for storing, managing, and serving machine learning features, ensuring consistency between training and inference environments across an organisation.
Ensemble Learning
MLOps & Production: Combining multiple machine learning models to produce better predictive performance than any single model.
Cross-Validation
Training Techniques: A resampling technique that partitions data into subsets, training on some and validating on others to assess model generalisation.
Bias-Variance Tradeoff
Training Techniques: The balance between a model's ability to minimise bias (error from assumptions) and variance (sensitivity to training data fluctuations).
Unsupervised Learning
MLOps & Production: A machine learning approach where models discover patterns and structures in data without labelled examples.
Logistic Regression
Supervised Learning: A classification algorithm that models the probability of a binary outcome using a logistic function.
Model Serialisation
MLOps & Production: The process of converting a trained model into a format that can be stored, transferred, and later reconstructed for inference.
SMOTE
Feature Engineering & Selection: Synthetic Minority Over-sampling Technique, a method for addressing class imbalance by generating synthetic examples of the minority class.