This post offers some advice to people embarking upon their Unicode process. It is definitely not an exhaustive list of things to do.

Your company or client has decided to do either a combined upgrade and Unicode conversion or a straight Unicode conversion of your MDMP system. The first piece of advice I will give you, my technical colleagues, is to be prepared for a wild ride and to be saddled with many responsibilities.

An MDMP system is one that serves many different countries whose languages cannot all be displayed using the default SAP codepage 1100. As a result, additional codepages were introduced to expand the number of characters that applications could support. Unicode is capable of displaying every character in every language, so it simplifies many of the system operations.

Because the process is so data intensive, a cross-discipline team needs to be responsible for this part of the project. Having been through this process, I would define my project dream team as having the following members:
1. Basis consultant – responsible for running the Unicode conversion process.
2. Data migration consultant – responsible for ensuring that the language scans and processes are done properly and that the vocabularies are properly maintained. (This could also be an internationalisation (I18N) expert, but there are not many around.)
3. A client representative – responsible for talking to the business to determine the data flow of processes, as well as how the tables and data are being used.
4. Language assignment team – responsible for assigning the unknown words in the vocabulary to a language and a codepage.
5. A good team lead with a strong technical background.

This is a wish list, and it is based on only a single project, but I probably got much more heavily involved than many Basis consultants would. Effectively, I took on roles 1, 2 and 5 from the list above.

I am going to break down the timeline of our Unicode conversion process system by system.
First we ran the process against DEV. This was challenging because 4.6C does not support the full set of pre-Unicode tools: there is no SPUMG or UCCHECK. Instead there is a limited tool called SPUM4, which scans every word in every record in every table to ensure that it has a language assigned to it. If a word in a record is not present in a vocabulary, it is flagged as needing assignment.
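The scan-and-flag idea described above can be sketched in a few lines of Python. This is only an illustration of the concept, not how SPUM4 is implemented: the function name `find_unassigned_words`, the sample records and the dictionary-based vocabulary are all hypothetical.

```python
import re


def find_unassigned_words(records, vocabulary):
    """Return the unique words that have no language assigned yet.

    `records` is an iterable of text fields and `vocabulary` maps a
    lower-cased word to a language key. Both are hypothetical stand-ins
    for the table contents and the vocabulary the scan works with.
    """
    unassigned = set()
    for record in records:
        # Split each record into words and look each one up.
        for word in re.findall(r"\w+", record):
            key = word.lower()
            if key not in vocabulary:
                unassigned.add(key)
    return unassigned


# Hypothetical sample data: two words already have a language assigned.
vocabulary = {"order": "EN", "invoice": "EN"}
records = ["Order Moskva", "Invoice Moskva depot"]

unassigned = find_unassigned_words(records, vocabulary)
print(sorted(unassigned))
```

Every word that comes back from a scan like this is one that somebody has to assign to a language and codepage by hand, which is why the number of unknown unique words matters so much later in the process.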

We engaged SAP and received assistance from one of their I18N experts, Nils; it would be fair to say he wrote the book on the SAP Unicode process. With Nils we ran the scans throughout the system and started the data analysis of the results. We found a very nasty surprise within the vocabulary: Russian users had been entering data in a non-standard way, using the I18N settings as shown below. This meant that the data in the system was effectively stored with the character codes of the Microsoft Russian codepage, not those of the SAP Russian codepage. As a result, if a word was not assigned to the correct codepage, the character codes of its letters would be converted wrongly and the word would be corrupted.

After much deliberation, we established that the data within the DEV system was not great and that we needed to know the scale of the problem. We therefore completed the language assignment to within 10,000 unknown unique words and ran the CUUC (combined upgrade and Unicode conversion) process. Once the conversion was completed we found massive levels of corruption in the database, far too much to fix using SUMG.
We learnt a valuable lesson: the Russian data entry was going to be one of the major challenges throughout the process. We also decided to use copies of Production to improve our data quality and to take a more iterative approach to the vocabulary conversion.
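To make the Russian data-entry problem concrete, here is a small Python sketch of what happens when text stored in one Cyrillic codepage is converted as if it were in another. The codec names are my assumptions: I am treating the Microsoft Russian codepage as Windows-1251 (`cp1251`) and SAP codepage 1500 as ISO 8859-5 (`iso8859_5`). The sample word is invented.

```python
# Assumed Python codec equivalents of the two codepages involved.
MS_RUSSIAN = "cp1251"      # assumption: Microsoft Russian codepage
SAP_RUSSIAN = "iso8859_5"  # assumption: SAP codepage 1500

word = "Москва"  # invented sample word

# The bytes as they would sit in the database after the non-standard entry.
stored = word.encode(MS_RUSSIAN)

# Converting with the codepage the data was really entered in works.
correct = stored.decode(MS_RUSSIAN)

# Converting with the codepage the language key implies does not fail,
# it silently produces different Cyrillic letters.
corrupted = stored.decode(SAP_RUSSIAN)

print(correct == word)    # the word survives
print(corrupted == word)  # the word is corrupted
print(corrupted)
```

The important point is that the wrong conversion raises no error. The result is still valid Cyrillic text, just not the text the user typed, so the corruption only shows up when somebody reads the data.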

The project team obtained a copy of Production and used it to create a new QAS system. We began the process anew, but this time we did three new things:
1. Introduced a new codepage, designed to accommodate the Microsoft Russian words (English/Russian).
2. Repurposed a language that uses the same codepage as SAP Russian (codepage 1500), in this case BG. This became our SAP Russian (Russian/Russian).
3. Added both SAP Russian (RU) and English (EN) to the ambiguous language list. This means any word the system recognises as belonging to either language is placed in the vocabulary to be checked.


Once we got within our comfort zone of 10,000 unique words, we executed the CUUC, with worse results than the previous run. The table UMGCCTL (the Unicode control table) became corrupted during the process, so the R3load process could not determine the correct codepage for each record and converted all the tables using codepage 1100. This was a horrible turn of events, as the technical team had given up much of their Christmas to complete the CUUC process, but there was a silver lining.
We had another item to check to ensure a correctly running export and conversion. It also prompted the project to grant another two attempts at conversion, as well as a repeat of the Unicode process on QAS. This time it completed successfully, but we had chosen the wrong road when assigning English and Russian/Russian as ambiguous.
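The effect of losing the control table can be illustrated with a short Python sketch. It is not a model of R3load itself: the `convert_record` function and the dictionary standing in for the control table are hypothetical, and I am assuming that codepage 1100 corresponds to Latin-1 (`latin_1`) and codepage 1500 to ISO 8859-5 (`iso8859_5`).

```python
FALLBACK = "latin_1"  # assumption: Python equivalent of SAP codepage 1100


def convert_record(raw, language, control):
    """Decode one record using the codepage recorded for its language.

    `control` is a hypothetical stand-in for the control table, mapping
    a language key to a codec name. If the language is missing, the
    conversion falls back to the default codepage.
    """
    codec = control.get(language, FALLBACK)
    return raw.decode(codec)


# Invented sample record, stored in the assumed SAP Russian codepage.
raw = "Москва".encode("iso8859_5")

healthy_control = {"RU": "iso8859_5"}
broken_control = {}  # simulates the corrupted control table

good = convert_record(raw, "RU", healthy_control)
bad = convert_record(raw, "RU", broken_control)

print(good)
print(bad)
```

Latin-1 assigns a character to every possible byte, so the fallback conversion never fails. That matches what we saw: the run completed, and the damage was only visible afterwards in the converted data.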

At this point the project team was a little bruised and battered, but we had learnt a great deal about the process. These lessons would give us a great deal of confidence in the later phases, because you can learn more from your mistakes than you can from your successes.
