Skip to content

Floating-point representation

IEEE 754 standard.

2 types:

  • single precision
  • double precision

Single precision

Uses bits.

  • sign bit - bit
  • exponent - bit
  • mantissa - bit

Sign bit

if positive or zero. if negative.

Exponent

Exponent field range - . In this range is defined for normal numbers. and are reserved for subnormal, infinite, signed zeros and NaN.

To support negative exponents, we subtract (half of ) from this range. . This range is the representable range.

Mantissa

In scientific notation, the part that doesn’t contain the base and the power.

In binary scientific notation, there will always be exactly one bit before the dot. So we don’t include that one.

Double precision

Uses bits.

  • sign bit - bit
  • exponent - bit
  • mantissa - bit

Sign bit

if positive or zero. if negative.

Exponent

Exponent field range - . In this range is defined for normal numbers. and are reserved for subnormal, infinite, signed zeros and NaN.

To support negative exponents, we subtract (half of ) from this range. . This range is the representable range.

Mantissa

In scientific notation, the part that doesn’t contain the base and the power.

In binary scientific notation, there will always be exactly one bit before the dot. So we don’t include that one.