# FP8 Introduction

## Basics

- FP32: https://en.wikipedia.org/wiki/Single-precision_floating-point_format
- FP16: https://en.wikipedia.org/wiki/Half-precision_floating-point_format
- BF16: https://en.wikipedia.org/wiki/Bfloat16_floating-point_format
- FP8: https://en.wikipedia.org/wiki/Minifloat
- FP8 in deep learning: https://arxiv.org/abs/2206.02915

## FP8 config

arxiv.org/abs/2206.02915 uses different exponent biases for weights and activations; to keep things simple, we use the same bias for both.

1.5.2 (1 sign, 5 exponent, 2 mantissa bits) has a smaller m_cnt, and a 2-bit x 2-bit multiply is easy to implement on an FPGA. 1.4.3 (1 sign, 4 exponent, 3 mantissa bits) has more precision than 1.5.2, but less dynamic range.

Note that we do not encode INF/NaN in FP8; out-of-range values are simply clamped.

Here is the fp8 table:
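Below is a minimal sketch of a script that prints such a table. The layout (1.4.3), the bias value of 7, the helper name `fp8_value`, and the column formatting are assumptions for illustration, not the original script; set `e_cnt, m_cnt = 5, 2` for the 1.5.2 layout. Because there is no INF/NaN, every exponent/mantissa code decodes to a finite value.

```
def fp8_value(e, m, e_cnt, m_cnt, bias):
    """Magnitude encoded by exponent field e and mantissa field m (no INF/NaN)."""
    if e == 0:
        # Subnormal: no implicit leading 1.
        return (m / (1 << m_cnt)) * 2.0 ** (1 - bias)
    # Normal: implicit leading 1.
    return (1 + m / (1 << m_cnt)) * 2.0 ** (e - bias)


e_cnt, m_cnt, bias = 4, 3, 7  # assumed 1.4.3 layout with bias 7; use 5, 2 for 1.5.2

# Header row: one column per mantissa code.
print("     ", end="")
for m in range(1 << m_cnt):
    print(f"{'m=' + str(m):>10s}", end="")
print()

# One row per exponent code, listing the positive values of the format.
for e in range(1 << e_cnt):
    print(f"e={e:2d} ", end="")
    for m in range(1 << m_cnt):
        print(f"{fp8_value(e, m, e_cnt, m_cnt, bias):10.5f}", end="")
    print()
```

The sign bit only mirrors this table to negative values, so printing the positive half is enough to see the spacing of representable numbers.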