
Scenario:

     Load special characters, such as non-English characters, from a flat file (.txt) into a target table whose code page is set to UTF-8.

To load special characters from a flat file into a target table, we need to set the appropriate code page (encoding) in each of the following:

a) In the source file

We need to set the encoding of the source file to UTF-8. How this is done depends on the text editor. The steps for changing the encoding in Notepad are:

File -> Save As, then change the Encoding to UTF-8.

/wp-content/uploads/2013/07/1_250912.png
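
When there are many files, or no editor is at hand, the same re-encoding can be scripted. The following is a minimal sketch in Python, not part of BODS; the function name `convert_to_utf8` and the default source encoding `cp1252` are assumptions for illustration, so substitute whatever encoding the file was actually written in.

```python
from pathlib import Path


def convert_to_utf8(src_path, dst_path, source_encoding="cp1252"):
    """Re-encode a text file to UTF-8 and return the number of characters.

    source_encoding is an assumption: it must match the encoding the file
    was really saved in, otherwise characters will be misread.
    """
    # Work on raw bytes so that line endings are preserved exactly.
    raw = Path(src_path).read_bytes()
    text = raw.decode(source_encoding)
    Path(dst_path).write_bytes(text.encode("utf-8"))
    return len(text)
```

Writing to a separate destination file keeps the original intact, which makes it easy to compare the two if the wrong source encoding was assumed.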

b) Flat file properties

    In BODS, the code page in the flat file properties is normally set to <default>, as shown below:

/wp-content/uploads/2013/07/2_250919.jpg

     Change it to utf-8.

     /wp-content/uploads/2013/07/3_250920.jpg
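
Once the file format declares utf-8, the file itself has to be valid UTF-8. A quick check before running the job can be sketched in Python as below. The helper name `first_invalid_utf8_offset` is hypothetical and the check runs outside BODS.

```python
def first_invalid_utf8_offset(path):
    """Return the byte offset of the first invalid UTF-8 sequence, or None.

    None means the whole file decodes cleanly as UTF-8.
    """
    with open(path, "rb") as handle:
        data = handle.read()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the offset of the first byte that could not be decoded.
        return exc.start
    return None
```

A non-None result usually means the file was saved in a different encoding than the one declared in the flat file properties.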

c) Datastore properties

    Go to the target datastore properties in BODS. Normally the code page is set to <default>, as shown below:

/wp-content/uploads/2013/07/4_250921.jpg

    

      Change Code page and Server code page to utf-8.

   /wp-content/uploads/2013/07/5_250922.jpg


16 Comments


  1. Simon Le Monnier

    What if the underlying target (and / or source) code page for the DataStore is not utf-8 ?

    It could be set to, for instance, iso-8859-1. The Data Services documentation specifies that the DataStore code page should be set to match the underlying database / adapter / web service code page. If they differ, characters that exist in the superset (utf-8) but are not encoded in the subset (iso-8859-1) can be lost, even with transcoding. Not good!

    In an ideal world every data source and target would be Unicode but, in reality, that is just not always the case!
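
The kind of loss described in this comment can be reproduced outside Data Services. The sketch below uses Python's standard codecs to list the characters of a string that have no representation in a narrower target code page. The function name `unrepresentable_chars` and the sample string are made up for illustration, and this does not show how Data Services itself transcodes.

```python
def unrepresentable_chars(text, target_encoding="iso-8859-1"):
    """Return the distinct characters of text that target_encoding cannot encode."""
    missing = []
    for ch in text:
        try:
            ch.encode(target_encoding)
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)
    return missing


sample = "Zürich 東京 €"  # illustrative sample data
print(unrepresentable_chars(sample))

# With errors="replace", each unrepresentable character is silently
# replaced by "?" when encoding, which is how data gets lost unnoticed.
print(sample.encode("iso-8859-1", errors="replace"))
```

Here "ü" survives because iso-8859-1 contains it, while the Japanese characters and the euro sign do not.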

    1. Theju R

      Simon,

      It's not a problem. If the target database supports UTF-8, this will work fine. Normally, databases support this code page. The ODBC settings also need to be changed to the desired code page; otherwise there could be data loss while inserting into the target.

  2. Christian A Gonzaga

    Just to add that if the target table is on SQL Server, there is one more setting to be considered:

    Change the varchar datatype to nvarchar.

    This will ensure that all other special / multi-byte / non-English characters are handled.

  3. Jay vardhan

    You are right that the flat file should be saved as utf-8 and that the format settings should be changed to utf-8. As far as databases are concerned, though, you don't need to change anything, as they are mostly Unicode.

    Thanks for the post.

  4. Naresh Veeramachineni

    Good information, thank you. But we ran into an issue while handling Japanese characters. The source is Oracle (UTF8), and we set all the code page settings as you described above. For two years our code has worked fine processing Japanese characters. But we got a record that has Japanese characters with a length of 15000. The job hangs or sometimes gives 'End of Communication channel'. If we trim the data to 14000, it works fine. We tried different NLS_LANG values and code pages, and it is still not working. Any suggestions, please?
