Binary Number System: Representations
IEEE 32-Bit Floating Point Format (IEEE Standard 754)
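A 32-bit float is stored as three fields, s (1 bit), e (8 bits) and f (23 bits), where s is the sign bit, e is the biased exponent and f is the fraction. A normalized number has the value (-1)^s × 1.f × 2^(e - 127).

Denormals: e = 0. Specials (NaN, Inf, -Inf): e = 255.

Example of the normalized form: 1.100000000 × 2^0 (1.5 decimal).

What is the IEEE 32-bit floating point representation for the decimal number -11.5?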
1. Convert to binary. To the left of the binary point, represent the magnitude of the integer part of the number. To the right of the binary point, represent the fractional part (note: this may lose accuracy, since not every decimal fraction has a finite binary expansion). The first position after the binary point is the 2^-1 position (0.5 decimal), then 2^-2 (0.25), 2^-3 (0.125), etc.

   -11.5₁₀ = -1011.1₂

2. Convert to normalized binary scientific notation, i.e. move the binary point left or right as far as necessary until a single 1 is to the left of the binary point (the form 1.f):

   -1011.1₂ = -1.0111₂ × 2^3

   Note: in the special case of 0.0, all 32 bits are 0. This is a denormal, since there is no 1 to the left of the binary point.

3. Determine s, e and f. s = 1 for negative, 0 for positive, so here s = 1. e is the exponent from step 2 plus the bias of 127: e = 3 + 127 = 130 = 10000010₂. f is the string of digits to the right of the binary point: f = 0111.

4. Assemble the 32 bits, padding f to the right with zeroes:

   s  e         f
   1  10000010  01110000000000000000000
5. Convert to hex: 1100 0001 0011 1000 0000 0000 0000 0000 = C1380000 (hex)
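The five steps above can be checked mechanically. The following Python sketch walks through them for -11.5 and compares the result with the encoding produced by the standard library's `struct` module. The helper name `encode_float32_by_hand` is hypothetical, and the sketch assumes a nonzero, finite input that is exactly representable as a normalized 32-bit float (it does no rounding and does not handle denormals or specials).

```python
import math
import struct


def encode_float32_by_hand(x: float) -> int:
    """Follow the manual steps for a nonzero, finite, normalized value."""
    if x == 0 or not math.isfinite(x):
        raise ValueError("sketch only handles nonzero finite values")

    # Sign: s = 1 for negative, 0 for positive.
    s = 1 if x < 0 else 0
    m = abs(x)

    # Step 2: normalize so that 1 <= m < 2, tracking the power of two.
    exponent = 0
    while m >= 2.0:
        m /= 2.0
        exponent += 1
    while m < 1.0:
        m *= 2.0
        exponent -= 1

    # Step 3: add the bias of 127 and keep 23 fraction bits (truncating).
    e = exponent + 127
    f = int((m - 1.0) * (1 << 23))

    # Step 4: assemble s (1 bit), e (8 bits), f (23 bits).
    return (s << 31) | (e << 23) | f


bits = encode_float32_by_hand(-11.5)

# Step 4 as binary fields, step 5 as hex.
print(f"{bits >> 31:01b} {(bits >> 23) & 0xFF:08b} {bits & 0x7FFFFF:023b}")
# prints: 1 10000010 01110000000000000000000
print(f"{bits:08X}")
# prints: C1380000

# Cross-check against the big-endian float32 encoding from struct.
reference = struct.unpack(">I", struct.pack(">f", -11.5))[0]
assert bits == reference
```

The repeated division or multiplication by 2 mirrors moving the binary point by hand; each such operation is exact in binary floating point, so no accuracy is lost during normalization.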
Note special cases:
s = 0, e=255, f = all zeroes: +Infinity
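The special cases can be recognized from the e and f fields alone. The sketch below uses a hypothetical helper, `classify_float32`, to split a 32-bit pattern into s, e and f and label it. The -Infinity and NaN branches rely on the standard IEEE 754 rules (e = 255 with f all zeroes is an infinity whose sign is given by s; e = 255 with any nonzero f is a NaN), which go slightly beyond the single case listed above.

```python
import struct


def classify_float32(bits: int) -> str:
    """Label a 32-bit pattern using only its s, e and f fields."""
    s = (bits >> 31) & 0x1   # sign bit
    e = (bits >> 23) & 0xFF  # 8-bit exponent field
    f = bits & 0x7FFFFF      # 23-bit fraction field

    if e == 255:
        # Specials: infinities have f all zeroes, NaNs have nonzero f.
        if f == 0:
            return "-Infinity" if s else "+Infinity"
        return "NaN"
    if e == 0:
        # e = 0: zero when f is also 0, otherwise a denormal.
        return "zero" if f == 0 else "denormal"
    return "normalized"


def float_to_bits(x: float) -> int:
    """Return the big-endian float32 bit pattern of x as an integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


print(classify_float32(0x7F800000))                    # +Infinity
print(classify_float32(float_to_bits(float("-inf"))))  # -Infinity
print(classify_float32(float_to_bits(float("nan"))))   # NaN
print(classify_float32(0x00000000))                    # zero
print(classify_float32(float_to_bits(-11.5)))          # normalized
```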