"how numbers are stored and used in computers"
The MXFP8 (Microscaling 8-bit Floating Point) format is a low-precision floating-point number representation introduced in the Open Compute Project (OCP) Microscaling Formats (MX) specification. Designed specifically for artificial intelligence (AI) workloads—particularly inference—MXFP8 balances minimal memory footprint with sufficient dynamic range, making it well-suited for high-throughput, energy-efficient computing environments.
MXFP8 is one of the most promising formats in the emerging class of sub-16-bit representations. It is especially important in large-scale deep learning systems where memory bandwidth, cache size, and matrix throughput are dominant constraints.
Modern neural networks often tolerate significant reductions in numerical precision, especially during inference. With sufficient robustness in architecture and training, activations and weights can be quantized to formats as low as 8 or even 4 bits, while maintaining acceptable accuracy.
The MXFP8 format is optimized for:

- Minimal memory footprint (one byte per value)
- High arithmetic throughput on matrix hardware
- Energy-efficient inference in datacenter and edge deployments
This design is part of an industry-wide push to standardize compact numerical formats, with support from Intel, Meta, NVIDIA, AMD, Arm, and others through the OCP consortium.
MXFP8 specifies two floating-point subformats:

- E4M3: 4 exponent bits, 3 mantissa bits
- E5M2: 5 exponent bits, 2 mantissa bits

Each has:

- 1 sign bit
- An exponent field with a fixed bias
- A mantissa (fraction) field, for a total of 8 bits
These formats represent numbers in the IEEE-like normalized form:

value = (-1)^sign × 1.mantissa × 2^(exponent - bias)

Where:

- sign is the single sign bit (0 for positive, 1 for negative)
- mantissa is the stored fraction, with an implicit leading 1 for normal values (dropped for subnormals)
- exponent is the stored, biased exponent field
- bias is 7 for E4M3 and 15 for E5M2
| Format | Exponent Bits | Mantissa Bits | Exponent Bias | Dynamic Range | Precision |
|--------|---------------|----------------|----------------|----------------------|-----------|
| E4M3   | 4             | 3              | 7              | ~2^-9 to ±448        | Higher (3 fraction bits) |
| E5M2   | 5             | 2              | 15             | ~2^-16 to ±57,344    | Lower (2 fraction bits)  |
E4M3 offers slightly more precision and a smaller range. E5M2 sacrifices precision for a wider range, which may be useful for unnormalized tensors or values that vary drastically in scale (e.g., logits, attention scores).
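To make the encoding concrete, the sketch below decodes a raw 8-bit pattern into its real value for either subformat. The function name and layout are illustrative rather than taken from any particular library, and special values (NaN in E4M3, Inf/NaN in E5M2) are ignored for brevity.

```python
def decode_fp8(byte: int, exp_bits: int, man_bits: int, bias: int) -> float:
    """Decode one 8-bit value with the given exponent/mantissa widths and bias
    (illustrative sketch; NaN/Inf codes are not handled)."""
    sign = -1.0 if (byte >> 7) & 0x1 else 1.0
    exponent = (byte >> man_bits) & ((1 << exp_bits) - 1)
    mantissa = byte & ((1 << man_bits) - 1)

    if exponent == 0:  # subnormal: no implicit leading 1
        return sign * (mantissa / (1 << man_bits)) * 2.0 ** (1 - bias)
    # Normal number: implicit leading 1
    return sign * (1.0 + mantissa / (1 << man_bits)) * 2.0 ** (exponent - bias)

# E4M3: 0b0011_1000 -> 1.0, E5M2: 0b0011_1100 -> 1.0
print(decode_fp8(0b00111000, exp_bits=4, man_bits=3, bias=7))   # 1.0
print(decode_fp8(0b00111100, exp_bits=5, man_bits=2, bias=15))  # 1.0
```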
MXFP8 formats allow neural network inference to be:

- Smaller in memory, since each value occupies a single byte
- Faster, because more values fit per cache line and per matrix instruction
- More energy-efficient, as less data moves between memory and compute units

These benefits are most pronounced in:

- Large-scale model serving, where memory bandwidth and cache size are dominant constraints
- Edge and embedded deployments with tight power and memory budgets
- Matrix-heavy workloads such as attention and large matrix multiplications
Frameworks and compilers typically use block floating point (BFP) or scaling-factor techniques to keep tensors well-distributed across the limited exponent range. This can be done:

- Per tensor, with a single scale shared by all elements
- Per channel or per row, which tracks distribution differences along a dimension
- Per block, as in the OCP MX specification, where small groups of elements (32 in MX) share one power-of-two scale

A minimal per-block scaling sketch follows this list.
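The following sketch illustrates per-block power-of-two scaling. It is a simplified illustration rather than the normative MX algorithm: it assumes a block size of 32 and the E4M3 maximum of 448, and it only computes and applies the scales (it does not produce actual 8-bit codes).

```python
import numpy as np

E4M3_MAX = 448.0   # largest finite E4M3 magnitude
BLOCK = 32         # MX block size: 32 elements share one scale

def block_scale(x: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Split x into blocks of 32 and pick one power-of-two scale per block
    so that the scaled values fit inside the E4M3 range (illustrative)."""
    x = x.reshape(-1, BLOCK)
    amax = np.abs(x).max(axis=1, keepdims=True)
    # Power-of-two scale chosen so that amax / scale <= E4M3_MAX
    scale = 2.0 ** np.ceil(np.log2(np.maximum(amax, 1e-30) / E4M3_MAX))
    scaled = x / scale  # values are now representable in E4M3
    return scaled, scale

data = np.random.randn(128).astype(np.float32) * 1000.0
scaled, scales = block_scale(data)
print(np.abs(scaled).max(), scales.ravel()[:4])  # all |scaled| <= 448
```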
Hardware such as Intel’s AMX, NVIDIA Tensor Cores, and Google TPUs has either implemented or proposed efficient handling of such ultra-low-precision formats.
| Format | Bits | Exponent | Mantissa | Range | Use Case |
|----------|------|----------|----------|-------------------|------------------|
| FP32     | 32   | 8        | 23       | ~±3.4 × 10^38     | General-purpose compute          |
| FP16     | 16   | 5        | 10       | ~±65,504          | Mixed-precision training         |
| BF16     | 16   | 8        | 7        | ~±3.4 × 10^38     | Training and inference           |
| E4M3     | 8    | 4        | 3        | ~±448             | Inference weights/activations    |
| E5M2     | 8    | 5        | 2        | ~±57,344          | Wide-range tensors (e.g., logits)|
| INT8     | 8    | n/a      | n/a      | −128 to 127       | Integer-quantized inference      |
Note that MXFP8 formats are not covered by IEEE 754, but follow similar encoding logic. Unlike INT8, MXFP8 values are non-uniformly distributed across the number line, enabling better representation of values near zero—common in deep learning.
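The non-uniform spacing is easy to see by enumerating the positive E4M3 values; the helper below is a standalone illustration rather than a library routine (code 0x7F is NaN in the OCP E4M3 encoding and is skipped).

```python
# Enumerate all positive finite E4M3 values to show non-uniform spacing.
def e4m3_value(byte: int) -> float:
    exponent = (byte >> 3) & 0xF
    mantissa = byte & 0x7
    if exponent == 0:  # subnormal
        return (mantissa / 8.0) * 2.0 ** (1 - 7)
    return (1.0 + mantissa / 8.0) * 2.0 ** (exponent - 7)

values = sorted(e4m3_value(b) for b in range(0x01, 0x7F))  # positive, non-NaN codes
gaps = [b - a for a, b in zip(values, values[1:])]
print(values[:4])          # values ~0.002 apart near zero
print(gaps[0], gaps[-1])   # smallest gap ~0.00195, largest gap 32.0
```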
Let’s encode the decimal number 1.0.

In E4M3:

- Sign = 0
- 1.0 = 1.0 × 2^0, so the stored exponent is 0 + bias = 7 = 0111
- Mantissa (fraction after the implicit leading 1) = 000

This yields the 8-bit binary: 0 0111 000 (hex 0x38).

For E5M2:

- Sign = 0
- Stored exponent = 0 + bias = 15 = 01111
- Mantissa = 00

Binary: 0 01111 00 (hex 0x3C).
Conversion from FP32 → MXFP8 typically uses nearest-even rounding or stochastic rounding to minimize error bias.
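The sketch below shows one way to realize nearest-even conversion to E4M3. It is a brute-force reference, not how hardware converters work: it searches all 8-bit codes for the closest representable value, breaks ties toward an even mantissa LSB, and assumes saturation at ±448 (one common overflow policy).

```python
def e4m3_decode(code: int) -> float:
    """Decode an E4M3 code to a float (0x7F/0xFF NaN codes are excluded below)."""
    sign = -1.0 if code & 0x80 else 1.0
    exponent = (code >> 3) & 0xF
    mantissa = code & 0x7
    if exponent == 0:  # subnormal
        return sign * (mantissa / 8.0) * 2.0 ** (1 - 7)
    return sign * (1.0 + mantissa / 8.0) * 2.0 ** (exponent - 7)

# All finite codes (0x7F / 0xFF encode NaN in the OCP E4M3 format)
FINITE = [c for c in range(256) if (c & 0x7F) != 0x7F]

def fp32_to_e4m3(x: float) -> int:
    """Round-to-nearest-even conversion by exhaustive search (NaN inputs not handled)."""
    x = max(-448.0, min(448.0, x))  # saturate to the E4M3 range
    # Ties go to the code whose mantissa LSB is even
    return min(FINITE, key=lambda c: (abs(e4m3_decode(c) - x), c & 1))

code = fp32_to_e4m3(0.3)
print(hex(code), e4m3_decode(code))  # nearest E4M3 value to 0.3 (0.3125)
```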
These formats are not appropriate for traditional scientific computing or financial systems where numerical integrity is critical.
The OCP MX Format Specification v1.0 establishes a shared, open standard for 8-, 6-, and 4-bit floating-point formats. By aligning the industry around common encodings like E4M3 and E5M2, the MXFP8 format encourages:

- Interoperability across hardware from different vendors
- Portable model checkpoints and quantization tooling
- Shared compiler and kernel support instead of per-vendor variants
As compiler stacks (e.g., TVM, XLA, MLIR) and frameworks (e.g., PyTorch, TensorFlow) integrate MXFP8-aware kernels, MXFP8 is expected to become a dominant format for next-generation AI inference workloads.
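As a taste of framework support, recent PyTorch releases expose prototype FP8 dtypes (torch.float8_e4m3fn and torch.float8_e5m2); the minimal sketch below assumes such a build, and operator coverage for these dtypes still depends on the version and backend.

```python
import torch

x = torch.randn(4, dtype=torch.float32) * 10

x_e4m3 = x.to(torch.float8_e4m3fn)   # cast to E4M3 storage
x_e5m2 = x.to(torch.float8_e5m2)     # cast to E5M2 storage

# Round-trip back to float32 to inspect the quantization error
print(x)
print(x_e4m3.to(torch.float32))
print(x_e5m2.to(torch.float32))
```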
MXFP8 provides an elegant tradeoff between range and precision within a single byte. Its two configurations, E4M3 and E5M2, offer flexibility depending on workload characteristics. When paired with intelligent quantization and block scaling strategies, MXFP8 enables compact, performant, and energy-efficient model execution—especially in production and at the edge.
For practitioners building high-performance AI systems, understanding and leveraging MXFP8 is quickly becoming essential.