Rutger Heijmerikx

SAP HANA and R: how to integrate - the Outside-In Approach

Looking at the SAP landscape, ideally you would want to run R scripts (with R installed on a separate server) from a HANA Analysis Process in SAP BW, or integrate R scripts into your HANA SQL code (the Inside-Out scenario); we will cover this in our next blog.

In this blog post we would like to discuss the Outside-In approach.

(note: SAP TechEd course slide).

However, we would first like to focus on the basics: simply getting some data available to test your model and play around with, either in the standard R environment or in RStudio.

The easiest way to test your R model is to load a flat file into your R environment with one of the ‘read.’ statements.

For a simple ‘.csv’ file, e.g. an ‘example.csv’ on your desktop, a single ‘read.csv’ statement is already sufficient.
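As a minimal sketch of such a statement: the snippet below first writes a small, made-up CSV file to a temporary folder so that it runs anywhere, and then reads it back with ‘read.csv’. The file contents and column names are assumptions for illustration only; in practice you would point ‘read.csv’ at your own file.

```r
# Create a small illustrative CSV file (hypothetical contents) in a temp folder
csv_path <- file.path(tempdir(), "example.csv")
writeLines(c("id,product,amount",
             "1,A,10.5",
             "2,B,20.0",
             "3,C,7.25"), csv_path)

# A single read.csv call is enough to get a data frame
example <- read.csv(csv_path)

# Inspect the column types that R derived from the file
str(example)
```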

One problem with the above example is that if you look up the location of the example.csv file on your desktop through its context menu and properties, it shows something like ‘C:\Users\nvand\Desktop\example.csv’. Copying that path into RStudio as-is gives an error, because inside a string R treats the ‘\’ as the start of an escape sequence rather than as a path separator.
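One way around this, sketched below, is to write the path with doubled backslashes or with forward slashes; R accepts both on Windows. The path is the example path from above, and the conversion with ‘gsub’ is just one possible approach.

```r
# The path as copied from Windows, "C:\Users\...", fails to parse,
# because R reads "\U" as the start of an escape sequence.

# Option 1: double every backslash
win_path <- "C:\\Users\\nvand\\Desktop\\example.csv"

# Option 2: use forward slashes
fwd_path <- "C:/Users/nvand/Desktop/example.csv"

# Converting the first form into the second
converted <- gsub("\\", "/", win_path, fixed = TRUE)
identical(converted, fwd_path)
```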

We of course know the other issues we might run into when using flat files (e.g. decimal separators), as well as the work needed to create a sufficiently big file with proper content for testing.

If you have access to a HANA test system, we suggest the following approach instead (Outside-In scenario):
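To illustrate the decimal-separator issue: a file exported with European regional settings typically uses ‘;’ between fields and ‘,’ as the decimal mark. The sketch below uses made-up data and shows that ‘read.csv2’, or ‘read.csv’ with explicit ‘sep’ and ‘dec’ arguments, reads such amounts as numbers.

```r
# Hypothetical file using ';' as field separator and ',' as decimal mark
csv2_path <- file.path(tempdir(), "example_eu.csv")
writeLines(c("id;amount",
             "1;3,5",
             "2;4,25"), csv2_path)

# read.csv2 defaults to sep = ";" and dec = ","
eu_data <- read.csv2(csv2_path)

# Equivalent call with the separators spelled out
eu_data_explicit <- read.csv(csv2_path, sep = ";", dec = ",")

# The amount column is numeric in both cases
str(eu_data)
```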

1. Make sure the correct ODBC drivers (32 or 64 bit) are configured on your system via Control Panel -> Data Sources (ODBC), by adding a new DSN based on the HDBODBC32 driver or its 64-bit equivalent.

Define Server:Port as the instance server followed by the port, which is generally 30015 by default (i.e. for instance number 00). Both settings can also be found in HANA Studio, in the Additional Properties of the Database User Logon settings.

2. Install the ‘RODBC’ package within your R environment with the ‘install.packages’ statement;

3. Initialize the RODBC package with the ‘library’ statement;

4. Open a connection with the DSN name and a valid user ID and password, e.g. with ‘odbcConnect’, and store it in a channel variable;

5. Pull the data from the HANA database into your R environment, e.g. with ‘sqlFetch’ or ‘sqlQuery’.

Here the channel is the one you defined in step 4. It is wise to limit your dataset with the more detailed parameters available for the “sqlFetch” or “sqlQuery” statements. For more information about those, use the ? help functionality within R, for example ‘?sqlFetch’. Additionally, be careful about which system you connect to (e.g. for security or stability reasons).
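Steps 2 to 5 can be sketched as follows. This is only an illustration under assumptions: the DSN name ‘HANA_TEST’, the user and password placeholders and the table name ‘MYSCHEMA.SALES’ are hypothetical and must be replaced by your own values. The helper functions ‘build_sample_query’ and ‘fetch_hana_sample’ are not part of RODBC; they are introduced here purely for illustration. The RODBC calls themselves (‘odbcConnect’, ‘sqlQuery’, ‘odbcClose’) need an installed driver and a reachable system, so the final call is left commented out.

```r
# Step 2 (run once): install.packages("RODBC")

# Hypothetical helper: build a query that returns at most n rows
build_sample_query <- function(table, n = 100) {
  sprintf("SELECT TOP %d * FROM %s", as.integer(n), table)
}

# Hypothetical helper wrapping steps 3 to 5
fetch_hana_sample <- function(dsn, uid, pwd, table, n = 100) {
  library(RODBC)                                      # step 3
  channel <- odbcConnect(dsn, uid = uid, pwd = pwd)   # step 4
  if (!inherits(channel, "RODBC")) {
    stop("Could not connect to DSN: ", dsn)
  }
  on.exit(odbcClose(channel))                         # always release the connection
  sqlQuery(channel, build_sample_query(table, n))     # step 5, limited to n rows
}

# Example call with placeholder values (assumptions, adjust to your system):
# sales <- fetch_hana_sample("HANA_TEST", "MY_USER", "MY_PASSWORD",
#                            "MYSCHEMA.SALES", n = 50)
```

With ‘sqlFetch’, the row limit can be passed through its ‘max’ argument instead, e.g. ‘sqlFetch(channel, "SALES", max = 50)’, where the table name is again a placeholder.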

When you want to proceed with the next step, whether that is creating your own R procedures or setting up a proof of concept, McCoy's consultants specialize in Predictive Analytics and are more than willing to help you along the way.

Please feel free to contact us for more information.