Skip to Content

The SDI File Adapter is one of the adapters preinstalled with every SDI Agent. Its purpose is to make any file, fixed width or delimited, in every character encoding and all formats available to Hana as a virtual table to select from as easy as from any other Hana table..




Once the File Adapter is made visible in Hana it provides multiple tables out of the box, e.g. FILEDIRECTORY to get a list of all available files or FILECONTENT with file content as one large BLOB. As this might allow any Hana user with the create-remote-source privilege to read files, there needs to be some security built around.

When deploying the adapter in the agent, a root directory is to be set. The adapter does not allow to reach outside of this root directory, ever. Therefore sensible files like /etc/passwd or C:\Windows cannot be read. Second is some password (or AccessToken as it is labeled) to make sure the person creating the remote source in Hana is allowed to use this adapter.

The administrator installing the agent decides on these settings by deploying the FileAdapter and changing the preferences of this adapter in the AgentConfig tool.



As one virtual table specifies the file format definition and multiple files can have the matching structure, there is no 1:1 relationship between files and virtual tables. So what should the adapter return when browsing the list of available virtual tables? To provide this browsing option the adapter does scan a File Format Root Directory for files of name *.cfg and each file contains the format information. This is the second directory to configure.


With this information the remote source and the virtual tables can be created and the file content selected from.


Remote Source


When creating a remote source the root and file format directories are set by default to the directories of the adapter and can be further restricted here. The idea is, maybe one remote source points to the plan data subdirectory, another to the employee data (both sub directories of the adapter root directory) and users are allowed to use either the one or the other remote source. Just in case more fine grained security is needed.

The AccessToken has to match the token entered in the adapter preferences as security measure.



Alternatively the remote source can be created via SQL as well



CONFIGURATION ‘<?xml version=”1.0″ encoding=”UTF-8″?>

<ConnectionProperties name=”ConnectionInfo”>

<PropertyEntry name=”rootdir”>/usr/sap/FileAdapter/FileServer</PropertyEntry>

<PropertyEntry name=”fileformatdir”>/usr/sap/FileAdapter/FileformatDefinitions</PropertyEntry>



  ‘<CredentialEntry name=”AccessTokenEntry”>




The built-in tables


The following tables are provided by the adapter always. These are supposed to help building applications using the File Adapter.



  • FILEDIRECTORY: List all files of the remote source, including the sub directories. Supports pushdown of LIKE and = on the PATH and NAME column.
  • FILECONTENT: Allows to read a single file as is. It requires the PATH and NAME column to be restricted to a single file using a where clause and returns the entire file content as one BLOB.
  • FILECONTENTTEXT: As above but this time the file content is returned as NCLOB. That means the file is read with a specific character encoding. This can either be the default – based on the Byte Order Mark of the data file or the OS default – or provided as dataprovisioning parameter. In this example a ASCII file is read with UTF32 codepage, hence producing wrong characters.
  • FILECONTENTROWS: Again returns a BLOB but this time one row per line. A line is supposed to end with a \n (ASCII newline) character/byte.
  • CODEPAGES: Returns the list of all supported values for codepages by querying the adapter’s JVM installation.
  • LOCALE: Returns the locales the JVM does support.


Defining file formats


One option of creating a virtual table for a given file format is executing the create virtual table statement manually and providing the format information as additional metadata. Below is an example of the syntax and it also lists all possible parameters. Only some of them are mandatory, most can be omitted to leave them at a useful default.


create virtual table “v_plan” at “FileSource”.”<NULL>”.”<NULL>”.”v_plan”

REMOTE PROPERTY ‘dataprovisioning_parameters’=

‘<?xml version=”1.0″ encoding=”UTF-8″ standalone=”yes”?>


<Parameter name=”FORCE_FILENAME_PATTERN”>plan%.txt</Parameter>

<Parameter name=”FORCE_DIRECTORY_PATTERN”></Parameter>

<Parameter name=”FORMAT”>CSV</Parameter>

<Parameter name=”CODEPAGE”>UTF-8</Parameter>

<Parameter name=”ROW_DELIMITER”>\r\n</Parameter>

<Parameter name=”COLUMN_DELIMITER”>;</Parameter>

<Parameter name=”TEXT_QUOTES”></Parameter>

<Parameter name=”SKIP_HEADER_LINES”>0</Parameter>

<Parameter name=”QUOTED_TEXT_CONTAIN_ROW_DELIMITER”>false</Parameter>

<Parameter name=”ESCAPE_CHAR”></Parameter>

<Parameter name=”TEXT_QUOTES_ESCAPE_CHAR”></Parameter>

<Parameter name=”ERROR_ON_COLUMNCOUNT”></Parameter>

<Parameter name=”LOCALE”>de_DE</Parameter>

<Parameter name=”DATEFORMAT”>dd. MMMM yyyy</Parameter>

<Parameter name=”TIMEFORMAT”>HH:mm:ss</Parameter>

<Parameter name=”SECONDDATEFORMAT”>yyyy.MM.dd HH:mm:ss</Parameter>

<Parameter name=”TIMESTAMPFORMAT”>yyyy.MM.dd HH:mm:ss</Parameter>

<Parameter name=”COLUMN”>YEAR;Integer</Parameter>

<Parameter name=”COLUMN”>COST_CENTER;varchar(8)</Parameter>

<Parameter name=”COLUMN”>PLAN_AMOUNT;decimal (15, 2)</Parameter>

<Parameter name=”COLUMN”>CHANGE_SECONDDATE;seconddate</Parameter>



The other option is to create a .cfg file inside the Directory of file format definitions or a sub directory thereof. The content of the file are logically the same. The file has to have a first line with a description and then lists all parameters as key value pairs. Above example as cfg file would look like:


Fileformat to read a CSV file with plan data




















COLUMN=PLAN_AMOUNT;decimal (15, 2)



At the end both versions have the same endresult, a virtual table with all these parameters as virtual table attributes. The only difference is with the cfg files the browsing shows something and then the virtual table is created, whereas in the first option the table is created directly.


Note: Changing the cfg file after creating the virtual table has no effect. These parameters are really copied into the Hana data dictionary.




In case the file is a fixed width file, so the column start and end position is always the same, the position information is needed in addition.


Fixed width demo









COLUMN=PLAN_AMOUNT;decimal (15, 2)



Parameters in detail



  • FORCE_FILENAME_PATTERN, FORCE_DIRECTORY_PATTERN: A user might execute a simple “select * from virtualtable” without any where clause on directory and name of file. In that case every single file in the root directory and subdirectories will be read and parsed according to this virtual table format definitions. That might take a while a produce many errors. But if the virtual table does match to files in a certain directory, directory tree or to certain file names only, it makes sense to hard wire this information into the virtual table directly. Examples:
    • FORCE_DIRECTORY_PATTERN=/usr/sap/FileAdapter/FileServer/plandata: Reads all files in that directory only
    • FORCE_DIRECTORY_PATTERN=/usr/sap/FileAdapter/FileServer/plandata/%: Reads all files in that directory and subdirectories
    • FORCE_DIRECTORY_PATTERN=/usr/sap/FileAdapter/FileServer/plan%: Reads all files in directories that start with “plan”
    • FORCE_FILENAME_PATTERN=plan%.txt: Reads files like plan20120101.txt
    • FORCE_DIRECTORY_PATTERN=/usr/sap/FileAdapter/FileServer/plandata  FORCE_FILENAME_PATTERN=plan%.txt: Files inside the directory and matching the provided name pattern
  • FORMAT: Is either CSV (default) or FIXED
  • CODEPAGE: The character encoding the file is read with. By default the operating system default is used. In case the file has a Byte Order Mark this codepage is used always. Valid values of the Java installation can be found by creating a virtual table for CODEPAGE and querying its contents.
  • ROW_DELIMITER: A character sequence indicating the end of a row in the file. In case these are non-printable characters they can be provided encoded as \d65536 or \xFFFF or as Unicode notation \u+FFFF. Alternatively the typical \r and \n is supported as well. Examples:
    • \n Unix standard
    • \r\n Windows standard
    • \d13\d10 Windows standard but characters provided as decimal number
    • \x0D\x0A Windows standard but characters provided as hex number
  • SKIP_HEADER_LINES: In case the file has a header, the number of lines to be skipped is entered here.
  • ERROR_ON_COLUMNCOUNT: By default a row with less columns then defined is considered okay. By setting this parameter to true, it is expected that all rows of the file have as many columns as defined.
  • LOCALE: The decimal and date conversion from the strings in the file into native numbers or dates might be locale specific. For example the text “3,1415” is a perfectly nice value for PI but in the German notation where the comma is the decimal separator. With an English locale it would be a large integer number 31415. Same thing with dates in case the month or day name is used. “14. Oktober 2000” with locale German is fine, for all other languages it is not. Again, valid values for the locale can be found by querying a virtual table based on the LOCALE table of the adapter.
  • DATEFORMAT, TIMEFORMAT, SECONDDATEFORMAT, TIMESTAMPFORMAT: The file format can use these datatypes for the date/time related values. Each can have a different format string. The syntax of the format string is the Java SimpleDateFormat syntax.
  • COLUMN: multiple entries each consist of the columnname;datatype where the datatype is any normal Hana datatype.


CSV only

  • COLUMN_DELIMITER: The character sequence indicating the next column. If non-printable characters are used then either one of the encodings will work \d65536, \xFFFF or \u+FFFF
    • ; Meaning the ; is the column separator, so a line looks like 2000;IT Costs;435.55
    • | using the pipe character as delimiter
    • \d09 using an ASCII tab character as delimiter
  • TEXT_QUOTES: Sometimes text data is enclosed in quotes so a column delimiter inside the text does not break the format. The line 2000;IT Costs; software related only;435.55 would appear as 4 columns as the text contains a semicolon as well. If the file was created with quotes like 2000;”IT Costs; software related only”;435.55 then there is no such issue but the file parser needs to act more carefully and not just search for the next column delimiter. It needs to check if the text is inside the text quote character or outside.
  • ESCAPE_CHAR: Another way to deal with inline special characters is to escape those, like in 2000;IT Costs\; software related only;435.55. Here the \ char is an escape char and indicates that the subsequent character is to be taken literally, not as e.g. column delimiter.
  • TEXT_QUOTES_ESCAPE_CHAR: Leaves the question on how to make quotes appear inside the text, like in IT Costs; “software related” only. One option the file creator might have used is to simply use the global escape character: 2000;”IT Costs; \”software related\” only”;435.55. Another popular method is the have the quotes escaped by another quote like in 2000;”IT Costs; “”software related”” only”;435.55. In that case both the TEXT_QUOTE=” and the TEXT_QUOTE_ESCAPE_CHAR=” are set to the ” character.
  • QUOTED_TEXT_CONTAIN_ROW_DELIMITER: The default value is false and tells the parser regardless of any quotes or escape characters the text inside a row does never contain the row delimiter character sequence. In this case the parser can break the file into rows much faster, it needs to search for the character sequence only, and only the column parsing has to consider the escape and quote characters. If set to true parsing will be slower.


Fixed Width only

  • COLUMNSSTARTENDPOSITION: In a fixed width file the column positions need to be specified for each column. Example:
    • 0-3;4-11;12-37;38-53 defines that the first 4 characters are the first column, the next 8 contain the data for the second column etc. Columns need to be in the proper order.
    • 0;4;12;38 is equivalent to above example, the last column ends with the line end
    • 0;4;12;38-53 can be used as well. In fact every single column can either specify the start and end position or just the start.
  • ROWLENGTH: In fixed with files there does not need to be a row delimiter. Often the file has some and then they need to be stripped away. Examples assuming the last data character is at index 53 as specified above:
    • ROWLENGTH=56 ROW_DELIMITER= would work for a file that has a row delimiter. The payload text ranges from 0..53 and hence 54 characters long plus two characters for \r\n. But the last column does not contain the \r\n as it is told to end at index 53.
    • ROWLENGTH=54 ROW_DELIMITER=\r\n is equivalent to above. Each row is expected to be 54 characters plus 2 characters long. The main advantage of this notation is that COLUMNSSTARTENDPOSITION=0;4;12;38 would work as well as the trailing \r\n is stripped away. In the previous example the the last column would start at 38 but end at index 55 due to rowlength=56 and hence contain the \r\n characters in the last column.




Using the virtual table


When the virtual table is created it has 5 additional columns:

  • PATH: The path name of the file being read
  • NAME: The file name this row belongs to
  • ROWNUMBER: The line number inside the file
  • PARTITION: All rows get evenly distributed into 10 partitions (ROWNUMBER modulo 10)
  • ERROR: A string with all column parsing errors




Executing a select * from virtualtable will return all rows of all files the virtual table is allowed to read, that is all files or the matching ones based on FORCE_DIRECTORY_PATTERN and FORCE_FILENAME_PATTERN settings. Therefore each row tells what file it belongs to in the PATH and NAME columns. As this text can be quite lengthy, it might be a good idea to not select those columns unless the information is needed of course.

This is actually an important capability of the adapter, it supports projections. Therefore only those columns actually needed are transferred, parsed, converted into the proper datatype. The “top” clause is pushed down to the adapter as well, it would read the first lines and then finish.

The adapter also supports filters on PATH and NAME hence all where clauses with equal, in-lists or like are pushed to the adapter and it can navigate to the matching files directly.


When a parsing error occurs the reading does continue still but the information about all errors in that row are put into the ERROR column. Things like “Cannot convert to seconddate using the provided format and locale (abcd)”. Hence reading the file without checking the ERROR column is dangerous.


In order to simplify finding issues in the file, the line number is returned in the ROWNUMBER column. In above example the file starts with row number 2 as the first line is the header line.


In case the file is very large and has lots of expensive data conversions the virtual table provides a PARTITION column to read a subset of the file only. This can be used to conveniently execute for example two insert…select statements in parallel, each processing half the file.

insert into target select * from v_us_county_census where partition in (0,1,2,3,4);

insert into target select * from v_us_county_census where partition not in (0,1,2,3,4);


Since the flowgraph supports partitioned reading as well, the file adapter works well together with that feature.


To report this post you need to login first.


You must be Logged on to comment or reply to a post.

  1. Former Member


    Thank you for your post! We’re using Smart Data Integration to virtualize multiple tables from existing SQL Server sources. From a practical perspective, accessing smaller reference information from legacy sources without having to actually replicate the data in HANA is a way to stay current from outside source systems without the overhead of maintaining an alternative subscription and publication model.

      1. Former Member

        LOL! Well, let me know when your newest baby is due–and of course– what the name/acronym for her will be, so that I can take her on a trial run as soon as possible! 🙂

        1. Werner Daehn Post author

          Well, as analogies go, I will have my hands full of nurturing it and let it get more advanced. With the adapters I am quite happy, the UIs take the most time.

  2. Colm Noonan


    This is a great post! I have the File Adapter working for CSV files. However, I am have difficulty with the Fixed File Format. As I am getting a “unknown keyword in the fileformat cfg file” error message. I cannot see an error in my CFG file. I’m wondering if Fixed File Format functionality was released in SPS 9 or a later release of HANA?




    1. Werner Daehn Post author

      As I recall it was an SPS10 functionality.

      Please keep in mind the cfg file has to have a first line with the description. But that is unrelated to the problem you are facing.

      1. Colm Noonan

        Hi Werner,

        Thank you for your reply. I upgraded everything to SPS 11 (DP Agent, HANA DB, HANA Client, HANA Studio) and it now works for me.



  3. Former Member

    Hi all,

    Thanks for this post! Helps a lot.

    I´m facing the follwing issue:
    config setting “locale=de_DE” is ignored when replicating decimals into SAP HANA.

    in config file i´ve added rows



    data file has entry e.g. 10,5

    when replicating is see 105 in SAP HANA virtual table (instead of 10,5)

    when changing data value in data file from 10,5 to 10.5 value is replicated correctly (although locale is still set to de_DE).

    anyone facing similar issue?



      1. Former Member

        Hello Werner, Andreas,

        had a similar issue: CFG file with locale ‘de_DE’ and the value ‘0,01’ was written into the target table as value ‘1’.

        Opened a internal incidenet ticket 1670064335 which was closed on March 24th,

        with the comment that the issue has been identified and a fix will be available

        in the next version of the DPAgent.

        Perhaps the resolution covers your above described problem that the “,” is not correctly interpreted.

        BR Dietmar

        1. Werner Daehn Post author

          Root cause got identified by development: The locale de_DE is stored in the virtual table properties table as German_Germany and this string the locale formatter does not recognize this as German locale.

  4. Former Member

    Hello, thank you for this, it helped greatly to get me going.  I am picking up a csv file from a share and loading it into a table in Hana DB and is working great.  I am running into an issue when I try to do more than one file.  I have even just copied the file and changed name to make sure it was not an issue with the other file.  I have my config file set up like below.  But get this error:  (dberror) 2048 – column store error: “_SYS_BIC”.”SUGAR.INNOVATION.SDI::SDI_FILE_DF_SP”: line 8 col 0 (at pos 193): [2048] (range 3): column store error: task framework: [403] internal error: Remote execution error Data is not read by Hana as it seems for query “SELECT “FieldInventory_Inventory”.”PATH”, “FieldInventory_Inventory”.”NAME”, “FieldInventory_Inventory”.”ROWNUMBER”, “FieldInventory_Inventory”.”PlantNumber”, “FieldInventory_Inventory”.”FiscalYear”, “FieldInventory_Inventory”.”FiscalPeriod”, “FieldInventory_Inventory”.”Material”, “FieldInventory_Inventory”.”Mat_DEsc”, “FieldInventory_Inventory”.”Unit”, “FieldInventory_Inventory”.”Quantity”, “FieldInventory_Inventory”.”Damaged”, “FieldInventory_Inventory”.”OnHold”, “FieldInventory_Inventory”.”Aged” FROM “Inventory” “FieldInventory_Inventory” “

    Fileformat to read a CSV file with plan data














    DATEFORMAT=dd. MMMM yyyy


    SECONDDATEFORMAT=yyyy.MM.dd HH:mm:ss

    TIMESTAMPFORMAT=yyyy.MM.dd HH:mm:ss











    1. Jim Havlicek

      Hi Edward,

      I see this question is more than a year old, but, yes, you are correct – this was a bug in the FileAdapter.  It was fixed a while back and is available in the latest versions on both the 1.3.x line and the 2.x line.



  5. Former Member

    Good day,

    What if the files that I receive have incremental information and at the end of the file processing I want to move the file so the FileAdapter does not have to scan that file again?, what would be the right approach to this?

    1. Jim Havlicek

      Hi Antonio,

      with the FileAdapter, you can create a virtual procedure that will run any batch/script file that you put on the DP Agent machine.   So you can write a batch file to delete/move the files, and then call it from a flowgraph.


      • Here are some links to the online doc that show how to create the virtual procedure

      START HERE : This is the main link that gets you started – it shows how to call a BAT file or script file through the FileAdapter.


      This page shows how to create a virtual procedure with WebIDE.  It is NOT specific to the FileAdapter.  The FileAdapter has fewer capabilities than a regular DBMS adapter, so some of the magic mentioned on this page didn’t seem to work for the FileAdapter.    For instance, step 3



      This link shows how to create a virtual procedure with SQL, but frankly it’s discusses some more  complicated topics than you need to get a simple one working.



      This page describes the CALL statement


      Good luck,  hope that helps,


  6. Former Member

    Great Blog Thanks,

    One issue i noticed is when i tried to use

    insert into “PhysicalTable” select Columns from “LogicalTable” where “PARTITION” in (0,1,2,3)

    i am getting errors saying IN operator is not allowed.

    SAP DBTech JDBC: [403]: internal error: Error opening the cursor for the remote database Unsupported condition (IN)


    its working fine when i use

    insert into “PhysicalTable” select Columns from “LogicalTable” where “PARTITION” = 0





  7. Former Member



    I m loading a fixed file, about 40 columns and about  700.00 rows.

    i m facing a very bad problem.

    ROWS: ‘112148’,’392516′

    Are ALWAYS loaded in a bad way. I tried to manage the file chaing rows, characters, position without any success.

    Have anybody faced this issue?

    thanks in advance

    1. Jim Havlicek

      Hi Matteo,

      yes, this was a bug in the fileAdapter – under some conditions, every few hundred thousand rows, the last few columns would get garbled.  This has been fixed – I think it in version 1.3.9, but certainly in the latest version on both the 1.3.x line and the 2.x line.




  8. Clemens Kopfer


    what is best way to see if that partitioning on virtual table (I use flowgraph) works?

    I use HANA 1.0 SPS12 rev 009 with DPA on 1.3.5. So that should be sufficient.
    Somewhere I should see several tasks running in parallel, but my gut feeling is that this is not working…

    Tried it with splitting up Northrine-Westphalia into its “counties”.


    1. Jim Havlicek

      Hi Clemens,

      yes, you are correct – there is a bug preventing partitioning from working with the FileAdapter.  This bug is being fixed in both the 1.3.x lines and 2.x lines and should be released soon.  For now, your only alternative is to manually split a large file into several files and load each individually.



  9. Former Member

    Hi Werner,

    .TXT files are placed in DP Agent Server and it connected to HANA using File Adapter of SDI. The Column are separated by | (Pipe Delimiter). The virtual table is not able to read the last column data


    i have data in this format in .txt file. The last column data (for e.g. 680769.8) is not visible in virtual table in hana for all the rows.

    Please find the cfg file .

    Please help me how can i get the last column data in virtual table.




    1. Jim Havlicek

      Hi Smith,

      yes, Vladimir is correct.  This was a bug in a couple of versions of the DP Agent, but it is fixed in the latest 1.3.x and 2.x versions.




  10. Former Member

    @jim.havlicek We are doing a scenario where we need to load HANA DB via SDI and source is File. Can you please let us know if xml files are supported in file adapter and does file adapter support complex nested structure in xml?


    File(xml file) –> SDI –> HANA DB table


    Thanks in advance,


    1. Jim Havlicek

      Hi Aman,

      no, the FileAdapter only supports delimited and fixed-length flat (i.e. no nesting) files.

      I will ask around a bit, but perhaps the Camel JDBC adapter in SDI would work for you.  You’d have to grab a JDBC driver for XML and try it.  I am not sure how well that will handle a nested structure either.


      Perhaps I am missing something – I will post back after I ask around the dev team.



  11. Former Member

    Hi Werner,

    We are trying to access multiple .txt files as source to one single virtual table. we have one .cfg file where we mentioned FORCE_FILENAME_PATTERN=*.txt to read all the files.  Are we doing anything wrong? we see only one file under file formats source connection .

    1. Jim Havlicek

      Hi Kishore,

      the FileAdapter uses % as the wildcard, not *.    So, %.txt should get your oars back in the water.


      There are some edge cases that I should make you aware of :

      • the ‘txt’ part is case-sensitive, even on windows.  So, if that causes an issue, there is a complicated workaround – let me know.
      • also, the period acts as a single-character wildcard.  So, %.txt will match to a filename of whatevertxt    (with no extension)



  12. Kok Weng Leong

    Hi Werner, Jim,

    I am testing with the FileAdapter on my local machine to a HANA 1.0 SPS12 DB on the Cloud Platform. Everything works fine, except it only reads up to 1,000 rows in the source file then stops. I tried changing the limits on the FileAdapter but it is still reading only 1,000 rows, and also tried adding parameter remote_objects_tree_max_size on the server.

    Any ideas where this limit is set?




    1. Kok Weng Leong


      I ran the data provisioning without the config tool running on my machine, and it is loading past the 1000 row limit. At least that is what I think fixed it. I have not changed any limits on HANA.





Leave a Reply