• Madeira Team

Is it REAL?

I encounter too many databases in which numeric columns are defined as REAL or FLOAT. These data types are approximate, meaning they can only represent values approximately. They are meant to represent very large or very small numbers, and they should be used in databases where a very wide range of values should be expressed, such as scientific systems. In most cases, when we need to represent numbers, we actually need exact numbers, such as 15.873 or 98.2. Such numbers can represent transaction amounts, product sizes or room temperatures.

Using REAL or FLOAT when actually an exact number is required can lead to problems. The application that expects an exact value might behave unexpectedly when given an approximate value. Recently, I was involved in a data migration project, in which I was requested to transfer financial data from the current system to a new system. All the monetary columns in the current database were defined as REAL. Before I continue with the story, let me give you some background about these data types.

When it comes to numeric data types, SQL Server offers two categories of data types: exact and approximate. The exact data types include the integer family (TINYINT, SMALLINT, INT and BIGINT), the DECIMAL and NUMERIC types, and the MONEY and SMALLMONEY types. The approximate data types include the FLOAT and REAL data types.

Let’s put some order in this mess. The integer family data types are quite obvious. They differ in the storage size and the range of integer values. Nothing much to say about that.

The DECIMAL and NUMERIC types are equivalent. These are synonyms, so let’s talk about DECIMAL from now on. The DECIMAL type includes two parameters: precision (p) and scale (s). The precision determines the maximum total number of decimal digits that can be stored, both to the left and to the right of the decimal point. The scale determines the maximum number of digits, out of the total number of digits, which can be stored to the right of the decimal point. Therefore, 0 <= s <= p. The important thing to note about the DECIMAL data type is that it is exact. It has a fixed number of digits to the left and to the right of the decimal point, and you can represent any exact decimal value, which falls within the range of this type (-10^38 + 1 to 10^38 – 1).

The MONEY and SMALLMONEY are meant to represent monetary or currency values, but the implementation is actually very similar to DECIMAL and NUMERIC, so it’s not really interesting to talk about it either.

What I would really like to talk about is the FLOAT and REAL data types. These are approximate data types, and they are implemented based on the IEEE 754 standard for floating-point arithmetic. Floating point is a mechanism for representing very large or very small numbers. In this mechanism, numbers are represented approximately to a fixed number of significant digits and scaled using an exponent. The internal representation of a FLOAT or REAL number consists of 3 parts: The sign bit (S), the exponent (E) and the fraction (F). The IEEE 754 standard defines several types of floating point representations, which differ in the size of the exponent and the fraction parts. In SQL Server, only two types are supported: single precision and double precision. The following table presents the sizes of these two types:TypeSignExponentFractionSingle Precision1823Double Precision11152


FLOAT allows us to represent both types by supplying a parameter that indicates the number of bits that are used to store the fraction (plus one additional hidden bit). But SQL Server actually treats any value between 1 and 24 as 24, which determines a single precision number, and similarly any value between 25 and 53 is treated as 53, which determines a double precision number. So it doesn’t really matter if you specify FLOAT(25) or FLOAT(37). In both cases you will end up with a double precision number that uses 52 bits for the fraction and 8 bytes overall. The REAL data type is actually a synonym to FLOAT(24), and as you should already know by now, it represents a single precision number, in which there are 23 bits for the fraction and 4 bytes overall. So just like we focused on DECIMAL, and ignored NUMERIC, let’s focus on REAL from now on, and ignore FLOAT.

Is it getting too complicated? Let’s look at an example:

Transact-SQL

DECLARE @Real AS REAL = 1.234567; SELECT BinaryRepresentation = CAST (@Real AS BINARY(4)); GO

1

2

3

4

5

DECLARE

@Real AS REAL = 1.234567;

SELECT

BinaryRepresentation = CAST (@Real AS BINARY(4));

GO

The binary representation of the REAL number 1.234567 is 0x3F9E064B. Isn’t it great? Let’s analyze this value. The first bit is the sign bit (S), and it determines whether the value is positive (0) or negative (1). The next 8 bits represent the exponent (E), and the rest 23 bits represent the fraction (F). Without getting into all the bitwise stuff, here is how we can calculate S, E and F:

Transact-SQL

DECLARE @Binary AS BINARY(4) = 0x3F9E0651 , @S AS INT , @E AS INT , @F AS INT; SET @S = CAST (@Binary AS INT) & 0x80000000; SET @E = (CAST (@Binary AS INT) & 0x7F800000) / POWER (2,23); SET @F = (CAST (@Binary AS INT) & 0x007FFFFF) * 2; SELECT S = @S , E = @E , F = @F ; GO

1

2

3

4

5

6

7

8

9

10

11

12

13

DECLARE

@Binary AS BINARY(4) = 0x3F9E0651 ,

@S AS INT ,

@E AS INT ,

@F AS INT;

SET @S = CAST (@Binary AS INT) & 0x80000000;

SET @E = (CAST (@Binary AS INT) & 0x7F800000) / POWER (2,23);

SET @F = (CAST (@Binary AS INT) & 0x007FFFFF) * 2;

SELECT

S = @S ,

E = @E ,

F = @F ;

GO

In our case: S = 0, E = 127 and F = 3935382. Now, according to IEEE 754, the formula to calculate the value is: (-1)^S x 2^(E-127) x 1.F. Here it is in T-SQL:

Transact-SQL

DECLARE @S AS INT = 0 , @E AS INT = 127 , @F AS INT = 3935382; SELECT RealValue = CAST ( CAST (POWER (-1.0 , @S) AS REAL) * CAST (POWER (2.0 , @E-127) AS REAL) * CAST ((@F | 0x01000000) AS REAL) / POWER (2 , 24) AS REAL ); GO

1

2

3

4

5

6