Technical Articles
Import data from Object Store to SAP HANA Cloud
Introduction
This is a short memorandum-style blog post.
I developed an application on SAP Cloud Platform that had a requirement to import data from an Excel file on the client side into a table in SAP HANA Cloud.
As Maxime Simon previously explained, there are various ways to import data into SAP HANA Cloud. In this case, however, since the work was to be done by office staff, an easy and low-cost method was required.
Note 1: The original requirement was to “import Excel files”, as described above. In the actual application, the Excel sheet is converted into a CSV file with an npm module and stored in the Object Store, but I will skip that part and only explain how to upload the CSV file and import it.
Note 2: This blog is about SAP Cloud Platform Cloud Foundry on AWS. I think the same method can be used on Azure, but I have not verified it.
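For reference, the Excel-to-CSV conversion mentioned in Note 1 can be done with an npm package such as xlsx (SheetJS); this is only an illustrative sketch, not necessarily the module or code used in the actual application:

const XLSX = require('xlsx');
const fs = require('fs');

// Read the uploaded workbook and convert its first sheet to CSV text
const workbook = XLSX.readFile('/app/public/temp/' + req.file.filename);
const firstSheet = workbook.Sheets[workbook.SheetNames[0]];
const csvText = XLSX.utils.sheet_to_csv(firstSheet);

// Write the CSV next to the original file so the rest of the flow stays the same
fs.writeFileSync('/app/public/temp/' + req.file.filename + '.csv', csvText);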
The first thing I did.
Initially, I didn’t think too hard about this requirement and simply wrote code that reads the uploaded CSV file into an array in memory and inserts it into SAP HANA Cloud row by row with an INSERT statement. The simplified code is as follows.
const fs = require('fs');
const csvSync = require('csv-parse/lib/sync');

// Read the uploaded CSV file and convert it into a two-dimensional array
let csvdata = fs.readFileSync('/app/public/temp/' + req.file.filename);
let csvarray = csvSync(csvdata);

// .......... (connection setup omitted)

// Insert the CSV rows one by one
var sql = 'INSERT INTO "HCIMPORTSAMPLE_HDI_DB_1"."hcimportsample.db::testtable" VALUES(?,?,?,?,?,?,?,?,?,?);';
var stmt = conn.prepare(sql);
for (let i = 0; i < csvarray.length; i++) {
    try {
        stmt.exec([csvarray[i][0], csvarray[i][1], csvarray[i][2], csvarray[i][3], csvarray[i][4],
                   csvarray[i][5], csvarray[i][6], csvarray[i][7], csvarray[i][8], csvarray[i][9]]);
    } catch (err) {
        console.log("Insert error --- ", i.toString());
        console.log("SQL=", sql);
        console.log("DB Error: SQL Execution --- ", err);
        return;
    }
}
*This code first saves the CSV file uploaded from the web browser under /app/public/temp.
This approach is fine as long as the CSV file only has a few rows. However, with a test file of 10,000 rows, it took about 40 seconds to run (including the file upload).
Setting
conn.setAutoCommit(false);
before the loop so that it runs as a single transaction did not result in a very dramatic improvement either.
Whether this is acceptable depends on your requirements, but in my opinion it takes too much time.
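For reference, here is a minimal sketch of that single-transaction variant, assuming conn is the same HANA client connection used above (e.g. @sap/hana-client, which also provides commit and rollback) and that sql and csvarray are defined as in the previous snippet:

conn.setAutoCommit(false);              // run the whole loop as one transaction
var stmt = conn.prepare(sql);
try {
    for (let i = 0; i < csvarray.length; i++) {
        stmt.exec([csvarray[i][0], csvarray[i][1], csvarray[i][2], csvarray[i][3], csvarray[i][4],
                   csvarray[i][5], csvarray[i][6], csvarray[i][7], csvarray[i][8], csvarray[i][9]]);
    }
    conn.commit();                      // commit all inserted rows at once
} catch (err) {
    conn.rollback();                    // discard everything on the first error
    console.log("DB Error: SQL Execution --- ", err);
} finally {
    conn.setAutoCommit(true);           // restore the default autocommit behavior
}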
SAP HANA Cloud can import data from Amazon S3 and Azure Storage, and since the Object Store service on SAP Cloud Platform is backed by these, its contents can be loaded into SAP HANA Cloud with the IMPORT statement. So I decided to upload the CSV file to the Object Store first and then import it from there.
Setting up access to the Object Store
Use this information to set up access to the Object Store from SAP HANA Cloud. Also, since the data will be loaded with the IMPORT statement, don’t forget to give IMPORT permission to the database user, whose credentials can be found in the Sensitive Data section of the HDI container.
Import from Object Store
Uploading data from Node.js to the Object Store is done with the aws-sdk package. If you have bound your application to the Object Store service instance, you can read the access information from VCAP_SERVICES instead of writing it directly into the code.
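A minimal sketch of reading those credentials, assuming the Object Store service instance is bound to the application so that it appears under the objectstore key of VCAP_SERVICES:

// Parse the service credentials that Cloud Foundry injects into the environment
const vcap_services = JSON.parse(process.env.VCAP_SERVICES);
const objectstoreCredentials = vcap_services.objectstore[0].credentials;
// objectstoreCredentials.bucket, .region, .access_key_id and .secret_access_key
// are the values used in the code below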
The simplified code for uploading the file to S3 and importing the data into the SAP HANA Cloud database looks like this:
var AWS = require('aws-sdk');
const fs = require('fs');

// Read the uploaded CSV file from the temporary directory
let csvdata = fs.readFileSync('/app/public/temp/' + req.file.filename);

// Configure the AWS SDK with the credentials of the bound Object Store service
const credentials = new AWS.Credentials({
    accessKeyId: vcap_services.objectstore[0].credentials.access_key_id,
    secretAccessKey: vcap_services.objectstore[0].credentials.secret_access_key
});
AWS.config.credentials = credentials;
AWS.config.update({ region: vcap_services.objectstore[0].credentials.region });

var s3 = new AWS.S3();
var params = {
    Bucket: vcap_services.objectstore[0].credentials.bucket,
    Key: req.file.filename,
    Body: csvdata
};

// Upload the CSV file to the Object Store (S3), then import it into SAP HANA Cloud
var putObjectPromise = s3.putObject(params).promise();
putObjectPromise.then(function (data) {
    var sql = "IMPORT FROM CSV FILE 's3-" + vcap_services.objectstore[0].credentials.region + "://" +
        vcap_services.objectstore[0].credentials.access_key_id + ":" + vcap_services.objectstore[0].credentials.secret_access_key + "@" +
        vcap_services.objectstore[0].credentials.bucket + "/" + req.file.filename + "' INTO " +
        '"HCIMPORTSAMPLE_HDI_DB_1"."hcimportsample.db::testtable";';
    var stmt = conn.prepare(sql);
    try {
        stmt.exec();
    } catch (err) {
        console.log("IMPORT ERROR --- ");
        console.log("SQL=", sql);
        console.log("DB Error: SQL Execution --- ", err);
    }
    return;
}).catch(function (err) {
    console.log(err);
    return;
});
In this example, the CSV file is first placed in a directory local to the application and then copied to the Object Store, because I reused the code from the loop-insert example above. Ideally, the data should be uploaded directly to the Object Store (a sketch of that variant follows below).
The file uploaded to the Object Store (S3) is then imported into SAP HANA Cloud.
In this example, I was able to import the same data as before in less than 2 seconds, including the upload. IMPORT really is fast.
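As a minimal sketch of the direct-upload variant mentioned above (assumptions: the upload is handled by multer, which req.file in the code above implies, app is an Express application, the upload form field is named file, and s3 and vcap_services are set up as in the previous snippet):

const multer = require('multer');
// Keep the uploaded file in memory instead of writing it under /app/public/temp
const upload = multer({ storage: multer.memoryStorage() });

app.post('/upload', upload.single('file'), function (req, res) {
    var params = {
        Bucket: vcap_services.objectstore[0].credentials.bucket,
        Key: req.file.originalname,
        Body: req.file.buffer           // the CSV content, straight from memory
    };
    s3.putObject(params).promise()
        .then(function () {
            // run the same IMPORT FROM CSV FILE statement as shown above
            res.sendStatus(200);
        })
        .catch(function (err) {
            console.log(err);
            res.sendStatus(500);
        });
});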
Finally.
As you can see, the combination of the Object Store service and SAP HANA Cloud can be used in real applications, and it is especially useful when import speed matters.
Since I could only paste part of the code in the article, I have prepared a simple sample that can be run as a web application.
I hope everyone is safe.