
# Convert DF34_RAW to Decimal in SQL

This is a poor man’s implementation to read **DF34_RAW** datatypes in SQL.

## Introduction

While building BI reports in native HANA for SAP's ACM module, we ran into many tables that use 128-bit decimal floating-point numbers stored in the **DF34_RAW** format. As this is a native ABAP type, it cannot be read directly in SQL.

We were already building a lot of calculation views and reports in HANA, so we needed a way to read the values from these fields.

Discussions with SAP support did not lead to a solution, so I set out to find an alternative native-SQL approach.

We’ve been using this approach for some time now, so I thought it was good to share what we came up with.

## Binary structure

When I started, I naively tried to see if **DF34_RAW** has the same inner workings as https://en.wikipedia.org/wiki/Decimal128_floating-point_format.

Although the structure looks similar, the representation is not.

What follows is my understanding of the format for our use-case. *Don’t expect this to be complete or 100% correct.*

| Sign | combination | significand |
|---|---|---|
| 1 bit | 17 bits | 110 bits |
| `s` | `mmmmmmm mmmmmmmmmm` | `cccccccccc cccccccccc cccccccccc cccccccccc cccccccccc cccccccccc cccccccccc cccccccccc cccccccccc cccccccccc cccccccccc` |

Here:

- `s` denotes the sign-bit. (In the examples below, a sign-bit of `0` appears to mark a negative value.)
- `cccccccccc` are 10-bit declets that can be read as an integer with a maximum of 999. (For a negative sign-bit, values are counted inversely, i.e. `999 - value`.)

| Sign | declet | value |
|---|---|---|
| `1` | `0000000000` | 000 |
| `1` | `0001101100` | 083 |
| `1` | `1111100111` | 999 |
| `0` | `0000000000` | 999 |
| `0` | `1111100111` | 000 |

- `mmmmmmm mmmmmmmmmm` contains both the most significant digit and the exponent value. This combined part can be read as an integer in a similar fashion as the declets. The last digit of that integer is the most significant digit. (For a negative sign-bit, it is `9 - value`.) For the exponent, we drop that last digit and subtract `6176`, since 128-bit decimal floating-point numbers support a range of `±0000000000000000000000000000000000×10^−6176` to `±9999999999999999999999999999999999×10^6111`.

For the negative sign-bit, the exponent is a bit different.

| Sign | combination | int value | exponent | msd | value |
|---|---|---|---|---|---|
| `1` | `0111101 1011111101` | 61765 | 0 | 5 | 5 |
| `1` | `0111101 1100000011` | 61771 | 1 | 1 | 10 |
| `1` | `0111101 1100001111` | 61783 | 2 | 3 | 300 |
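Putting these rules together, the decoding step can be sketched in Python (used here purely for illustration; the actual implementation is SQL). The sketch assumes the combination field and declets have already been read as integers, as described above; the slightly different negative-exponent handling is left out.

```python
# Sketch of the decoding rules above (illustrative Python, not HANA SQL).
# Assumes the 17-bit combination part and each declet have already been
# read as integers, e.g. via a lookup table.

def decode_combination(value, negative=False):
    """Split the combined integer into (exponent, most significant digit)."""
    msd = value % 10              # the last digit is the most significant digit
    if negative:
        msd = 9 - msd             # negative values are inverse counted
    exponent = value // 10 - 6176 # drop the msd digit, remove the 6176 bias
    # (for negative values the exponent handling differs slightly; omitted here)
    return exponent, msd

def decode_declet(value, negative=False):
    """A declet integer is a 3-digit group, inverse counted when negative."""
    return 999 - value if negative else value

# The example rows from the combination table above:
print(decode_combination(61765))  # (0, 5)  ->  5 * 10^0 = 5
print(decode_combination(61771))  # (1, 1)  ->  1 * 10^1 = 10
print(decode_combination(61783))  # (2, 3)  ->  3 * 10^2 = 300
```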

## Challenges

To implement this in SQL however, we have two challenges:

- we need to individually select bit-ranges
- we need to convert those bit-ranges to integer values.

**Challenge 1** can be overcome by masking bit fields with the `BITAND` function, converting the binary values to hexadecimal VARCHARs via `BINTOHEX`, and then selecting ranges using `SUBSTRING`.

Since we work with hexadecimal values, we can only select multiples of 4 bits at a time. The best we can do is select 12 bits instead of 10 for each declet, so some of our ranges are misaligned by 2 bits. This means we need to bit-shift some of the values (or simply divide by 4).
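To illustrate the idea, here is a Python stand-in for the `BINTOHEX`/`SUBSTRING` slicing and the 2-bit correction; the bit pattern is made up for the example.

```python
# Illustration of Challenge 1 (Python stand-in for BINTOHEX + SUBSTRING).
# We slice a 12-bit (3 hex digit) chunk and shift away the 2 extra bits.

raw = 0x2F1ABC                 # made-up bit pattern, just to show the slicing

hex_str = format(raw, '06x')   # ~ BINTOHEX: '2f1abc'

chunk = int(hex_str[3:6], 16)  # ~ SUBSTRING: the 12-bit chunk 0xabc
declet = chunk >> 2            # drop the 2 misaligned bits (== chunk // 4)

print(hex(chunk), declet)
```

Which side the extra bits end up on depends on where in the 128 bits a declet falls; here the shift is shown for trailing bits only.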

**Challenge 2** is a bit harder, since we now have hexadecimal strings instead of binary values, and no native function exists for this conversion. The discussion thread at https://archive.sap.com/discussions/thread/3652555 tried to solve this with a scalar UDF using a while loop. Although it works, the performance and especially the intermediate memory allocation are **terrible**. So instead, I opted for something databases are good at: *lookup* tables.

For our purpose we only have 1000 distinct values anyway.

In our lookup table, we can also tackle the issue of bit-shifting.
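As a sketch of that idea in Python: precompute the value for every possible 12-bit hex chunk once, with the bit-shift baked in, so that at query time a plain join replaces all per-row bit twiddling. Note the chunk-to-value mapping below is a plain binary reading purely for illustration; the actual DF34_RAW declet mapping is what our SQL lookup table implements.

```python
# Sketch of the lookup-table idea: one precomputed entry per possible
# 3-hex-digit chunk, so joins replace per-row conversion logic.
# The mapping here is a plain binary reading, for illustration only.

lookup = {}
for chunk in range(4096):               # every possible 12-bit chunk
    hex_key = format(chunk, '03x')      # key as it comes out of SUBSTRING
    declet = chunk >> 2                 # bit-shift handled inside the table
    lookup[hex_key] = min(declet, 999)  # clamp invalid patterns (sketch only)

print(lookup['00c'])   # 0x00c = 12 -> 12 >> 2 = 3
print(lookup['fff'])   # clamped to 999
```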

## Implementation

Initially I wanted something modular and reusable, so I built several scalar UDFs that could be called for any column containing DF34_RAW values. However, this turned out to be pretty bad for performance: each UDF allocated several MB of memory for every line item it worked on, and for some large datasets I noticed several TB of memory being allocated and deallocated during execution. Rewriting the logic using lookup tables and simple left joins made things easier and keeps performance reasonable.

The downside is that it needs to be implemented for each table and for each source field.

In the end, my logic is to split the **DF34_RAW** field into 14 separate fields, perform 13 joins to look up their values, and finally bring everything together in a string format that can be passed to the `TO_DECIMAL` function, like `to_decimal(1.231e2)`.
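The assembly of that final string can be sketched like this in Python, with made-up digit values; positive values only, since negative ones additionally need the inverse-counted digits described earlier.

```python
# Sketch of the final step: combine the most significant digit, the
# declet digit groups and the exponent into a scientific-notation
# string, as would be handed to TO_DECIMAL in SQL.
# Positive values only; the digit values below are made up.

def assemble(msd, declets, exponent):
    fraction = ''.join(f"{d:03d}" for d in declets)  # 3 digits per declet
    return f"{msd}.{fraction}e{exponent}"

s = assemble(msd=1, declets=[231], exponent=2)
print(s)            # 1.231e2
print(float(s))     # 123.1
```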

Instead of cluttering this post with hard-to-format code, the entire example can be found at: https://github.com/maxnuf/df34_raw_convesion.

I still hope that some day a native conversion function will be made available in HANA. In the meantime, I hope this implementation can help out others who need to read **DF34_RAW** values using SQL.

I know the implementation is not perfect, so ideas and suggestions to improve it are welcome.