Floating Point Numbers

Floating point numbers represent real numbers that have fractional parts. In this section, floating point values are also referred to as scalars. Although the compiler’s default for numbers is “no float”, C51 can represent scalar values in several formats: IEEE-754, Reverse IEEE-754, and BCD float, double, and long double. The IEEE-754 format generates the smallest and fastest code and is the standard that Franklin Software has used since C51 V2. Franklin’s implementation of Reverse IEEE-754 was developed for “big endian” architectures such as the Intel 251. The BCD representations generate more code, but provide considerably more precision and accuracy.

IEEE-754 Single Precision Float

The default (single precision) floating point format for C51 is stored in 4 bytes (32 bits). This format complies with the IEEE-754 standard. There are two components to a floating point number: the mantissa and the exponent. The mantissa stores the actual digits of the number. The exponent stores the power of two by which the mantissa is multiplied.

The exponent is an 8-bit value in the range 0 to 255 and is stored relative to 127. The actual value of the exponent is calculated by subtracting 127 from the stored value (0 to 255). The value of the exponent can be anywhere from +128 to -127. The mantissa is a 24-bit value whose most significant bit (MSB) is always 1 and is, therefore, not stored. There is also a sign bit which indicates if the floating point number is positive or negative.

IEEE-754 floating point numbers are stored in the memory of the 8051 using the following format:

Address     +0           +1           +2           +3
Contents    MMMM MMMM    MMMM MMMM    EMMM MMMM    SEEE EEEE
where:

S    represents the sign bit where 1 is negative and 0 is positive.

E    is the exponent, stored as an unsigned 8-bit value with an offset (bias) of 127.

M    is the 23-bit normalized mantissa. The highest bit is always 1 and, therefore, is not stored.
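The field layout above can be sketched in portable C. This is only an illustration written for a host compiler, not C51 library code; the type `ieee_fields` and the function `ieee_split` are hypothetical names introduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: the fields of one single precision value. */
typedef struct {
    unsigned sign;      /* S: 1 = negative, 0 = positive */
    unsigned stored_e;  /* E: 8-bit exponent as stored (0 to 255) */
    int      exponent;  /* actual exponent: stored value minus 127 */
    uint32_t mantissa;  /* M: 23 stored bits, without the implied leading 1 */
} ieee_fields;

/* mem points at the 4 bytes in storage order (+0 first, +3 last). */
static ieee_fields ieee_split(const uint8_t *mem)
{
    ieee_fields f;

    assert(mem != NULL);
    f.sign     = (unsigned)(mem[3] >> 7);
    /* Seven exponent bits come from byte +3, the lowest one from byte +2. */
    f.stored_e = (unsigned)(((mem[3] & 0x7Fu) << 1) | (mem[2] >> 7));
    f.exponent = (int)f.stored_e - 127;
    f.mantissa = ((uint32_t)(mem[2] & 0x7Fu) << 16)
               | ((uint32_t)mem[1] << 8)
               |  (uint32_t)mem[0];
    return f;
}
```

The exponent straddles two bytes, which is why byte +2 contributes one bit to E and seven bits to M.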

Using the above format, the floating point number 12.5 would be stored as a hexadecimal value of 0x00004841. In memory, this would appear as follows:

Address     +0      +1      +2      +3
Contents    0x00    0x00    0x48    0x41

Using the above format, the floating point number -12.5 would be stored as a hexadecimal value of 0x000048C1. In memory, this would appear as follows:

Address     +0      +1      +2      +3
Contents    0x00    0x00    0x48    0xC1
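As a rough cross-check of the two memory images above, the following host-side sketch writes a C `float` into four bytes in the same order. It assumes the host `float` is itself an IEEE-754 single precision value, which is common but not guaranteed; `float_to_c51_bytes` is a hypothetical name, not a C51 library routine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write value into out[0..3] in the storage order shown above:
   low mantissa byte at +0, sign and exponent byte at +3.
   ASSUMPTION: the host float is a 32-bit IEEE-754 value. */
static void float_to_c51_bytes(float value, uint8_t *out)
{
    uint32_t bits;

    assert(out != NULL);
    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&bits, &value, sizeof bits);      /* copy the raw bit pattern */
    out[0] = (uint8_t)(bits & 0xFFu);
    out[1] = (uint8_t)((bits >> 8) & 0xFFu);
    out[2] = (uint8_t)((bits >> 16) & 0xFFu);
    out[3] = (uint8_t)(bits >> 24);
}
```

Shifting the 32-bit value, rather than copying bytes directly, keeps the result independent of the host's own byte order.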

It is fairly simple to convert floating point numbers to and from their hexadecimal storage equivalents. The following example demonstrates how this is done. For this example, we will use the value for -12.5 shown above.

Note that the floating point storage representation is not an intuitive format: the bytes are stored least significant byte first, the reverse of the order in which the number is normally written. For the sake of this example, we will refer to them with the most significant byte first. So, the stored value 0x000048C1 becomes 0xC1480000. To convert this value to a floating point number, separate the bits as specified in the floating point storage format table shown above.

For example:

Address     +3          +2          +1          +0
Format      SEEEEEEE    EMMMMMMM    MMMMMMMM    MMMMMMMM
Binary      11000001    01001000    00000000    00000000
Hex         C1          48          00          00

From this illustration, you can determine the following information:
  The sign bit is 1, indicating a negative number.

  The exponent value is 10000010  binary, 82 hex, or 130 decimal. Subtracting 127 from 130 leaves 3 which is the actual exponent.

  The mantissa appears as the following binary number:
10010000000000000000000
There is an understood binary point at the left of the mantissa that is always preceded by a 1. This digit is not stored in the hexadecimal representation of the floating point number. Adding the 1 and the binary point to the beginning of the mantissa gives the following:
1.10010000000000000000000
Now, we adjust the mantissa for the exponent. A negative exponent moves the binary point to the left. A positive exponent moves the binary point to the right. Because the exponent is 3, the mantissa is adjusted as follows:
1100.10000000000000000000

Finally, we have a binary floating point number. Binary digits that are to the left of the binary point represent the power of two corresponding to their position. For example, 1100 represents
(1 × 2^3) + (1 × 2^2) + (0 × 2^1) + (0 × 2^0) which equals 12.

Binary digits that are to the right of the binary point also represent the power of two corresponding to their position. However, because these digits are to the right of the binary point, the powers are negative. For example, .100… represents (1 × 2^-1) + (0 × 2^-2) + (0 × 2^-3) + … which equals .5.

Adding these values together, we get 12.5.  Because the sign bit was set, we must include a negative sign. So, the hexadecimal value 0x000048C1  is -12.5.
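The conversion steps above can be sketched in C. This is a minimal illustration that follows the walkthrough literally rather than an excerpt from the C51 library; `ieee_bytes_to_double` is a hypothetical name, and the special encodings (stored exponent 0 or 255) are deliberately not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* mem holds the 4 bytes in storage order (+0 first).
   Stored exponents 0 and 255 are not handled in this sketch. */
static double ieee_bytes_to_double(const uint8_t *mem)
{
    uint32_t bits, mantissa;
    int exponent;
    double value;

    assert(mem != NULL);

    /* Reverse the storage order: 00 00 48 C1 becomes 0xC1480000. */
    bits = ((uint32_t)mem[3] << 24) | ((uint32_t)mem[2] << 16)
         | ((uint32_t)mem[1] << 8)  |  (uint32_t)mem[0];

    exponent = (int)((bits >> 23) & 0xFFu) - 127;  /* e.g. 130 - 127 = 3 */
    mantissa = (bits & 0x7FFFFFu) | 0x800000u;     /* restore the implied 1 */

    /* Dividing by 2^23 places the binary point after the leading 1. */
    value = (double)mantissa / 8388608.0;

    /* Move the binary point right (positive) or left (negative). */
    for (; exponent > 0; exponent--) value *= 2.0;
    for (; exponent < 0; exponent++) value /= 2.0;

    return (bits >> 31) ? -value : value;
}
```

Passing the bytes 0x00, 0x00, 0x48, 0xC1 reproduces the result of the manual conversion, -12.5.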

Reverse IEEE-754 Single Precision Float

Another format in common use today is the Reverse IEEE-754 format. This format is used because it lends itself better to “big-endian” processors like the Intel 251. Its use and layout are the same as for the IEEE-754 format already described, but with the byte order reversed.

The exponent is an 8-bit value in the range 0 to 255 and is stored relative to 127. The actual value of the exponent is calculated by subtracting 127 from the stored value (0 to 255). The value of the exponent can be anywhere from +128 to -127. The mantissa is a 24-bit value whose most significant bit (MSB) is always 1 and is, therefore, not stored. There is also a sign bit which indicates if the floating point number is positive or negative.

Reverse IEEE-754 floating point numbers are stored in memory using the following format:

Address     +0           +1           +2           +3
Contents    SEEE EEEE    EMMM MMMM    MMMM MMMM    MMMM MMMM
where:

S    represents the sign bit where 1 is negative and 0 is positive.

E    is the exponent, stored as an unsigned 8-bit value with an offset (bias) of 127.

M    is the 23-bit normalized mantissa. The highest bit is always 1 and, therefore, is not stored.

Using the above format, the floating point number 12.5 would be stored as a hexadecimal value of 0x41480000. In memory, this would appear as follows:

Address     +0      +1      +2      +3
Contents    0x41    0x48    0x00    0x00

Using the above format, the floating point number -12.5 would be stored as a hexadecimal value of 0xC1480000. In memory, this would appear as follows:

Address     +0      +1      +2      +3
Contents    0xC1    0x48    0x00    0x00

The rules for converting these floating point numbers are similar to the method previously discussed.
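Because the two layouts differ only in byte order, converting between them amounts to reversing the four bytes. A minimal sketch follows; `swap_float_layout` is a hypothetical helper name, not part of C51.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reverse the 4 bytes in place. This converts IEEE-754 storage order to
   Reverse IEEE-754 order; applying it again converts back. */
static void swap_float_layout(uint8_t *mem)
{
    uint8_t t;

    assert(mem != NULL);
    t = mem[0]; mem[0] = mem[3]; mem[3] = t;
    t = mem[1]; mem[1] = mem[2]; mem[2] = t;
}
```

For -12.5, the bytes 0x00, 0x00, 0x48, 0xC1 become 0xC1, 0x48, 0x00, 0x00, matching the two memory tables.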

NOTE: The current IEEE floating point implementation in C51 does not support double precision float. Users requiring this kind of extended precision should refer to the BCD sections below.

BCD Float (32-bit)

For high precision, some applications can benefit from the BCD floating point format. The ANSI standard regards double and float as floating point variable type specifiers. As the 8051 has several instructions to perform calculations in the BCD format, the bcd keyword has been added to provide this format. One of the advantages of using the BCD format is that it ensures a constant and continuing precision. Numbers whose decimal representation is exact in BCD (such as 0.01) often have no exact representation in binary. The bcd keyword can be used with the float (32-bit), double (48-bit), and long double (56-bit) specifiers.
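The difference between decimal and binary fractions can be observed on a host compiler with a small check. The sketch below assumes the host `float` is a 32-bit IEEE-754 value, and `exact_in_single` is a hypothetical name. It tests whether a value survives a round trip through single precision unchanged.

```c
#include <assert.h>

/* Returns 1 if x is unchanged after conversion to single precision
   and back, 0 otherwise.
   ASSUMPTION: the host float is a 32-bit IEEE-754 value. */
static int exact_in_single(double x)
{
    volatile float f;   /* volatile forces a real single precision store */

    assert(sizeof(float) == 4);
    f = (float)x;
    return (double)f == x;
}
```

A value such as 12.5 passes this check, while 0.01 does not: its binary expansion never terminates, so it is rounded at every binary precision, and the `double` literal 0.01 is itself already an approximation.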

BCD float values are implemented as follows. A BCD float is represented in 4 bytes. The mantissa is contained in the first three bytes, and the exponent is contained in the last byte. In all cases the exponent is coded in a 6-bit format (bits 0 to 5 of the last byte). Bit 6 is the opposite of the sign of the exponent (0 if negative, 1 if positive), and bit 7 is the sign of the number. A BCD double occupies 6 bytes (48 bits) and a BCD long double occupies 7 bytes (56 bits); in both, the last byte holds the exponent and the preceding bytes hold the mantissa.

BCD float values are stored in memory using the following format:

Address     +0           +1           +2           +3
Contents    MMMM MMMM    MMMM MMMM    MMMM MMMM    SNEE EEEE
where:

S    represents the sign bit where 1 is negative and 0 is positive.

N    represents the inverse of the exponent sign, where 1 is a positive exponent and 0 is a negative exponent.

E    represents the 6-bit exponent value.

M    represents the binary coded decimal 24-bit mantissa value (six decimal digits).

Using the format described above, the floating point number 0.123789e+03 would be stored as a hexadecimal value of 0x12378943. In memory, this would appear as follows:

Address     +0      +1      +2      +3
Contents    0x12    0x37    0x89    0x43

Using the format described above, the floating point number -0.123789e+03 would be stored as a hexadecimal value of 0x123789C3. In memory, this would appear as follows:

Address     +0      +1      +2      +3
Contents    0x12    0x37    0x89    0xC3
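The two examples can be decoded with a short host-side sketch. It rests on assumptions inferred from the examples in this section rather than from a format specification: the mantissa digits d1 d2 d3 … are read as the fraction 0.d1d2d3…, and the low seven bits of the last byte are read as the decimal exponent plus 64 (so 0x43 gives +3). The function name `bcd_to_double` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a BCD value of len bytes (4 = float, 6 = double, 7 = long double).
   ASSUMPTIONS inferred from the examples, not from a specification:
   - the mantissa digits form the fraction 0.d1d2d3...
   - in the last byte, bit 7 is the number sign and the low 7 bits hold
     the decimal exponent plus 64 (0x43 -> +3). */
static double bcd_to_double(const uint8_t *mem, size_t len)
{
    uint64_t digits = 0;
    double value, pow10 = 1.0;
    int power, steps;
    size_t i;

    assert(mem != NULL && len >= 2 && len <= 8);

    /* Collect the packed decimal digits of the mantissa, two per byte. */
    for (i = 0; i + 1 < len; i++) {
        digits = digits * 100u
               + (uint64_t)(mem[i] >> 4) * 10u
               + (uint64_t)(mem[i] & 0x0Fu);
    }

    /* Net power of ten: exponent minus the number of mantissa digits. */
    power = (int)(mem[len - 1] & 0x7Fu) - 64 - (int)(2 * (len - 1));

    for (steps = (power < 0) ? -power : power; steps > 0; steps--) {
        pow10 *= 10.0;
    }
    value = (power < 0) ? (double)digits / pow10 : (double)digits * pow10;

    return (mem[len - 1] & 0x80u) ? -value : value;
}
```

Under these assumptions, the bytes 0x12, 0x37, 0x89, 0x43 decode to approximately 123.789, and changing the last byte to 0xC3 flips the sign. The result is a binary `double`, so it is only as exact as that type allows.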

BCD Double (48-bit)

A BCD double is represented in 6 bytes. The mantissa is contained in the first five bytes, and the exponent is contained in the last byte. The exponent is coded in a 6-bit format. The sixth bit is the opposite to the sign of the exponent (0 if negative, 1 if positive) and the seventh bit is the number sign.

Using the format described above, the floating point number 0.1234567890e-02 would be stored as a hexadecimal value of 0x12345678903E. In memory, this would appear as follows:

Address     +0      +1      +2      +3      +4      +5
Contents    0x12    0x34    0x56    0x78    0x90    0x3E
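The last byte of the examples so far (0x43, 0xC3, and 0x3E) is consistent with a simple rule: set bit 7 for a negative number, and store the decimal exponent plus 64 in the remaining seven bits, which makes bit 6 the inverse of the exponent sign. The sketch below encodes that reading. It is an inference from the examples, not a documented algorithm, and `bcd_exponent_byte` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Build the last byte of a BCD float, double, or long double.
   ASSUMPTION inferred from the examples: the low 7 bits hold the
   decimal exponent plus 64, so the exponent range is -64 to +63. */
static uint8_t bcd_exponent_byte(int negative, int exponent)
{
    assert(exponent >= -64 && exponent <= 63);
    return (uint8_t)((negative ? 0x80u : 0x00u) | (unsigned)(exponent + 64));
}
```

With this rule, a positive number with exponent +3 yields 0x43, the same number negated yields 0xC3, and a positive number with exponent -2 yields 0x3E.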

BCD Long Double (56-bit)

A BCD long double is represented in 7 bytes. The mantissa is contained in the first six bytes, and the exponent is contained in the last byte.  The exponent is coded in a 6-bit format. The sixth bit is the opposite to the sign of the exponent (0 if negative, 1 if positive) and the seventh bit is the number sign.

Using the format described above, the floating point number -0.123456789012e-01 would be stored as a hexadecimal value of 0x123456789012BF. In memory, this would appear as follows:

Address     +0      +1      +2      +3      +4      +5      +6
Contents    0x12    0x34    0x56    0x78    0x90    0x12    0xBF