Skip to Content
Author's profile photo Former Member

Importing ” into BI

Recently I have been working with BI to do some management reporting.  One of my tasks is to take some text files with some fairly dirty data and import them into our BI system.

Our file contains a list of semi-colon separated fields with double quotes surrounding fields containing semi-colons as part of the data.  The first thing that threw me is that BI calls the surrounding quote characters the escape character.  In my world (being a Java programmer for as long as I can remember), the escape character is normally the back slash and is used to change the way the next character is interpreted.  So \n becomes a new line, \t is a tab and \” is used to put a double quote into the middle of a double quoted string.  This is also how the system that produces our source data handles the issue – if a double quote shows up in the middle of a double quoted string, it “escapes” it with a back slash – \”.

 File split error.  Hmmm, not good.

So after a quick search threw up no ideas, I fell back on the swiss army knife of the Linux user – Perl.  Before long I put together a quick Perl one liner to pull out these escaped double quote characters.


perl -pi -e ‘s/
“//g’ perl -pi -e ‘s/
“(?!;)//g’ for f in ls *; do echo Processing file $f; perl -pi -e ‘s/
“(?!;)//g’ $f; done</pre>

Assigned Tags

      1 Comment
      You must be Logged on to comment or reply to a post.
      Author's profile photo Former Member
      Former Member
      This sounds great. However how do I execute the perl commands in BI on the upload?