Overview
The process of reducing the numerical precision of a model's weights and activations from floating-point to lower-bit representations, decreasing memory usage and inference latency.
Cross-References(1)
More in Artificial Intelligence
Backward Chaining
Reasoning & PlanningAn inference strategy that starts with a goal and works backward through rules to determine what facts must be true.
AI Orchestration
Infrastructure & OperationsThe coordination and management of multiple AI models, services, and workflows to achieve complex end-to-end automation.
AI Agent Orchestration
Infrastructure & OperationsThe coordination and management of multiple AI agents working together to accomplish complex tasks, routing subtasks between specialised agents based on capability and context.
Reinforcement Learning from Human Feedback
Training & InferenceA training paradigm where AI models are refined using human preference signals, aligning model outputs with human values and quality expectations through reward modelling.
Frame Problem
Foundations & TheoryThe challenge in AI of representing the effects of actions without having to explicitly state everything that remains unchanged.
Cognitive Computing
Foundations & TheoryComputing systems that simulate human thought processes using self-learning algorithms, data mining, pattern recognition, and natural language processing.
AI Tokenomics
Infrastructure & OperationsThe economic model governing the pricing and allocation of computational resources for AI inference, including per-token billing, rate limiting, and credit systems.
Strong AI
Foundations & TheoryA theoretical form of AI that would have consciousness, self-awareness, and the ability to truly understand rather than simulate understanding.