Rutger Heijmerikx

Taming large hierarchies in SAP BW

Introduction

Hierarchies are one of the most powerful reporting objects in SAP BW and for that reason they are used extensively. As of BW 7.3, SAP dramatically improved the data loading concept for hierarchies; since then, the data can be sourced from virtually anywhere.

The processing of hierarchy data now takes place in special transformations containing several segments that make up a BW hierarchy (e.g. header data, hierarchy items and texts). Even though this setup is very flexible, loading large hierarchies can still be problematic.

Memory and performance issues can still occur when processing huge datasets, but this blog explains how to overcome them with a little effort.

Problem description

In the example used here, a customer-territory hierarchy is loaded from a non-SAP CRM system that contains all customers and their territories for every country. Since the company is a multinational, this hierarchy contained hundreds of thousands of entries: not only were there many customers to begin with, but each customer could also be part of multiple territories (e.g. different product groups).

This hierarchy was a crucial part of the BW authorization model, because it was used to give each sales rep access to the customers and territories he was allowed to see. The daily loading process became pretty cumbersome as more countries went live in CRM, until one day, it was no longer possible to reliably load the data. The jobs crashed due to memory problems and time-outs.

The data model at a glance

The structure of the hierarchy was several layers deep, but at the top level there was always the country. The figure below shows the structure:

The source DSO already contained the data in the format required by the hierarchy transformation, i.e. the proper Node IDs and Parent IDs, descriptions, InfoObject references, etc.
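
To give an idea of what such a flattened hierarchy looks like, the snippet below shows a few simplified example records in Python. The field names and values are illustrative assumptions only, not the actual data model of this customer.

    # Simplified, hypothetical example of the flattened records in the source DSO.
    # Field names and values are illustrative assumptions, not the real data model.
    source_dso = [
        # (node_id, parent_id, node_name,       description)
        (1,         0,         "ROOT",          "Customer/territory hierarchy"),
        (2,         1,         "DE",            "Germany"),
        (3,         1,         "US",            "United States"),
        (4,         2,         "TERR_DE_NORTH", "Territory Germany North"),
        (5,         4,         "CUST_0000123",  "Customer 123"),
        (6,         3,         "TERR_US_WEST",  "Territory US West"),
        (7,         6,         "CUST_0000123",  "Customer 123 in a second territory"),
    ]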

A solution was needed to load the hierarchy in a consistent way and without memory issues.

Divide and conquer

The best way to solve the issue was to look at the various data loading options available on the Update tab of the DTP.

For hierarchies there are special options available under the heading Update method: Full update, Update subtree and Insert subtree. By default the Full update option is selected, which loads and replaces the whole hierarchy. The two other options are used to load a portion of the hierarchy: the update option fully replaces an existing subtree and the insert option inserts a subtree without deleting existing nodes.

Press F1 (Help) on one of these radio buttons to see the detailed SAP Help information; the differences are explained there quite well.

By using a smart combination of the first two options (Full update and Update subtree), we can load the hierarchy in multiple steps.

Step 1 – The top-level DTP

First, the top level of the hierarchy is loaded in full, which deletes any existing data and recreates the skeleton of our model. This DTP is set up in such a way that only the root node is picked up. After data activation, the hierarchy looks as follows:

The essential DTP settings needed for this part are as follows:

The filter settings contained a restriction on 0H_NODEID (Technical ID) so that only the root node is picked up. In the source it was enforced that the Node ID of the root node was always 00000001, which made the filter setting straightforward.

Extraction mode was set to Full, since the hierarchy is built up with each data load.

Next, the Update tab must be set to the default settings, i.e. Full Update.

Step 2 – The subtree DTPs

After loading the root node, the rest of the hierarchy has to be added step-by-step using multiple DTPs. One needs to create as few DTPs as possible for good performance, but as many as necessary to process the data without memory errors.

A reliable way is to split up the number range of the technical Node ID. Since this is a numeric field, one can create ranges like 2 through 999, 1000 through 1999, and so on, to cover all possible values. Depending on the source data, other fields may be easier to use. Choose wisely so that no data is missed and the sets are reasonably balanced.
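
As an illustration of how such balanced ranges could be derived, the sketch below splits a list of technical Node IDs into a chosen number of roughly equal-sized intervals. It is plain Python operating on an extracted list of IDs, purely to show the idea; it is not code that runs inside BW.

    def balanced_node_id_ranges(node_ids, number_of_dtps):
        """Split a list of technical Node IDs into roughly equal-sized,
        non-overlapping intervals, one per subtree DTP (illustrative sketch)."""
        ids = sorted(node_ids)
        chunk_size = -(-len(ids) // number_of_dtps)   # ceiling division
        ranges = []
        for i in range(0, len(ids), chunk_size):
            ranges.append((ids[i], ids[min(i + chunk_size, len(ids)) - 1]))
        return ranges

    # Example: Node IDs 2..134214 spread over three subtree DTPs
    print(balanced_node_id_ranges(range(2, 134215), 3))
    # [(2, 44739), (44740, 89477), (89478, 134214)]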

The figure below displays the new situation after one of these DTPs has been loaded.

In this example, the source data was also polluted with orphaned nodes, which caused the DTP load to abort. The source data could not easily be fixed, so extra code was needed to work around this issue.

The DTP filter selects only the country nodes (the red boxes in the figure above), and the start routine in the transformation then recursively adds the child nodes (the purple boxes) top-down. The full start routine coding to build the hierarchy is beyond the scope of this document, but the sketch below illustrates the basic idea.
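
The real routine is ABAP inside the hierarchy transformation; the sketch below merely illustrates the recursive top-down expansion in Python, assuming a simple (node_id, parent_id) representation of the source records. A walk like this also never reaches orphaned nodes, which hints at one conceivable way to keep them out of the result, although the actual workaround code used in the project is not shown here.

    def expand_subtrees(all_nodes, entry_node_ids):
        """Collect the entry nodes plus all their descendants, level by level.
        all_nodes is a list of (node_id, parent_id) pairs; entry_node_ids are
        the country nodes selected by the DTP filter (illustrative sketch)."""
        children_of = {}
        for node_id, parent_id in all_nodes:
            children_of.setdefault(parent_id, []).append(node_id)

        result, current_level = [], list(entry_node_ids)
        while current_level:
            result.extend(current_level)
            next_level = []
            for node_id in current_level:
                next_level.extend(children_of.get(node_id, []))
            current_level = next_level
        return result

    # Tiny example: countries 2 and 3 under root 1, plus an orphaned node 99
    # whose parent 50 does not exist; the orphan is simply never reached.
    nodes = [(2, 1), (3, 1), (4, 2), (5, 4), (6, 3), (99, 50)]
    print(expand_subtrees(nodes, entry_node_ids=[2, 3]))   # [2, 3, 4, 6, 5]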

In short, the following settings were applied in the DTPs to load the subtrees:

  • Extraction mode Full

  • Node ID is set to the interval you want to cover (here it was set to 2 – 134214)

  • Parent ID is set to 00000001 to select only the nodes directly under the root node (see the sketch after this list)
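
Expressed as a plain predicate, the two filter values together select the entry nodes of the subtree, i.e. the direct children of the root whose technical Node ID falls inside the interval assigned to this DTP. The field and parameter names below are assumptions for illustration only.

    def is_entry_node(node_id, parent_id, id_from=2, id_to=134214, root_id=1):
        """Filter logic of one subtree DTP as a plain predicate (sketch):
        keep the nodes directly under the root whose technical Node ID falls
        inside the interval assigned to this DTP."""
        return parent_id == root_id and id_from <= node_id <= id_to

    print(is_entry_node(node_id=3, parent_id=1))   # True:  country node, entry point of a subtree
    print(is_entry_node(node_id=5, parent_id=4))   # False: deeper node, added later by the start routine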

On the Update tab, ensure that the Update Subtree option is selected. This is an essential step!

Apply these settings to all remaining DTPs; simply make a copy of this DTP and adjust the Node ID interval to suit.

Save the code changes and activate the transformation. There should be at least three DTPs under your transformation: a full load for the root node and at least two DTPs for loading the subtrees.

All you need to do now is create a process chain. The first step must load and activate the root-level DTP. After activation, you can then load the remaining DTPs for the subtrees.

For more detailed information or assistance, you can always give us a call or contact us via our contact page.