Technology Blogs by SAP
former_member230159
Greetings for the day!

PART I


ERROR PHASE:


I recently ran into an issue while upgrading ECC 6.0 to EHP8 on Oracle 12c using SUM 2.0.

The upgrade stopped in phase MAIN_SHDPREPUT/START_SHDI_PREPUT.

The detailed error was:

Severe error(s) occurred in phase MAIN_SHDPREPUT/START_SHDI_PREPUT!

Last error code set: Shadow instance couldn't be started, check 'STARTSSC.LOG' and 'DEVTRACE.LOG': Process /usr/sap/<SID>/SUM/abap/exe/sapcontrol exited with 2, see '/usr/sap/<SID>/SUM/abap/log/SAPup.ECO' for details

To analyze errors during the start of the shadow instance, check the 'STARTSSC.LOG' and 'DEVTRACE.LOG' files in directory '/usr/sap/<SID>/SUM/abap/log'.

Repeat the phase until the shadow instance is started and you can log on to instance number '02'.

Trouble Ticket Generation
A trouble ticket and an archive with all relevant log files have been generated.
Trouble ticket: "/usr/sap/<SID>/SUM/abap/log/SAPup_troubleticket.log"
Log archive: "/usr/sap/<SID>/SUM/abap/log/SAPup_troubleticket_logs.sar"
Repeat phase MAIN_SHDPREPUT/START_SHDI_PREPUT to continue at the point it


ENVIRONMENT:


The operating system is Red Hat Enterprise Linux 7.2.

PART II


ANALYSIS PHASE:


As a first step, I looked up what the phase START_SHDI_PREPUT does: it is the point where SUM starts the shadow instance for the first time.

Next, I looked for more information in the two traces named in the error message, STARTSSC.LOG and DEVTRACE.LOG.

Contents of STARTSSC.LOG:

1 ETQ201 Entering upgrade-phase "MAIN_SHDPREPUT/START_SHDI_PREPUT"
1 ETQ399 SYSTEM HEALTH MANAGER: check for instance processlist.
1 ETQ399 SAPCONTROL MANAGER: getProcessList with host: sanysandbox and instance: 02
3 ETQ120 : PID execute '~/exe/sapcontrol -format script -prot NI_HTTP -host sanysandbox -nr 02 -function GetProcessList', output written to '/usr/sap/SID/SUM/abap/log/SAPup.ECO'.
3WETQ122 : PID exited with status 4 '' (time:     0.0/    0.0/    0.0/85MB real/usr/sys/maxmem)
1 ETQ399 SYSTEM HEALTH MANAGER: System is down, go on with start action
1 ETQ399 SAPCONTROL MANAGER: StartWait with host: sanysandbox and instance: 02
3 ETQ120 : PID execute '~/exe/sapcontrol -prot NI_HTTP -host sanysandbox -nr 02 -function StartWait 300 10', output written to '/usr/sap/SID/SUM/abap/log/SAPup.ECO'.
3WETQ122 : PID exited with status 2 '' (time:    41.0/    0.0/    0.0/85MB real/usr/sys/maxmem)
1 ETQ399 SYSTEM MANAGER: SAPControl action START failed for instance 02 ('SAPCONTROL MANAGER: call (sapcontrol) failed with return code '-1'
1 ETQ399 ').
1 ETQ399 SYSTEM MANAGER: CheckSystemStatus.
1 ETQ399 SAPCONTROL MANAGER: getProcessList with host: sanysandbox and instance: 02
3 ETQ120 : PID execute '~/exe/sapcontrol -format script -prot NI_HTTP -host sanysandbox -nr 02 -function GetProcessList', output written to '/usr/sap/SID/SUM/abap/log/SAPup.ECO'.
3 ETQ122 : PID exited with status 0 '' (time:     1.0/    0.0/    0.0/85MB real/usr/sys/maxmem)
1EETQ399 SYSTEM MANAGER: START of mandatory instance 02 on server sanysandbox has failed
2EETQ399 Starting shadow instance failed
1EETQ399 Last error code set is: Shadow instance
1EETQ399Xcouldn't be started, check 'STARTSSC.LOG' and 'DEVTRACE.LOG': Process /usr/sap/SID/SUM/abap/exe/sapcontrol exited with 2, see '/usr/sap/SID/SUM/abap/log/SAPup.ECO' for details
1EETQ204 Upgrade phase "START_SHDI_PREPUT" aborted with severe errors

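From this log I can see that the first GetProcessList call (exit status 4) found the shadow instance down, so SUM called StartWait to start instance 02. That call exited with status 2 after 41 seconds, and SUM reported that the start of instance 02 had failed.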
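When STARTSSC.LOG is long, it can help to pull out just the sapcontrol calls and their exit statuses. The following is a minimal sketch, not a SUM tool: the function name `sapcontrol_results` is my own, and the two patterns are derived only from the ETQ120/ETQ122 lines shown in the excerpt above, so they may need adjusting for other log layouts.

```python
import re

# Patterns based only on the ETQ120/ETQ122 lines in the excerpt above (assumption).
EXEC_RE = re.compile(r"ETQ120 .*-function (\w+)")
EXIT_RE = re.compile(r"ETQ122 .*exited with status (\d+)")

def sapcontrol_results(log_text):
    """Return (function_name, exit_status) pairs in log order."""
    results = []
    pending = None  # function name from the most recent ETQ120 line
    for line in log_text.splitlines():
        m = EXEC_RE.search(line)
        if m:
            pending = m.group(1)
            continue
        m = EXIT_RE.search(line)
        if m and pending is not None:
            results.append((pending, int(m.group(1))))
            pending = None
    return results

sample = """\
3 ETQ120 : PID execute '~/exe/sapcontrol -format script -prot NI_HTTP -host sanysandbox -nr 02 -function GetProcessList', output written to '/usr/sap/SID/SUM/abap/log/SAPup.ECO'.
3WETQ122 : PID exited with status 4 '' (time:     0.0/    0.0/    0.0/85MB real/usr/sys/maxmem)
3 ETQ120 : PID execute '~/exe/sapcontrol -prot NI_HTTP -host sanysandbox -nr 02 -function StartWait 300 10', output written to '/usr/sap/SID/SUM/abap/log/SAPup.ECO'.
3WETQ122 : PID exited with status 2 '' (time:    41.0/    0.0/    0.0/85MB real/usr/sys/maxmem)
"""

print(sapcontrol_results(sample))
# prints [('GetProcessList', 4), ('StartWait', 2)]
```

In practice the sample string would be replaced by the contents of STARTSSC.LOG read from the SUM log directory.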

The GetProcessList output written to SAPup.ECO shows which process did not come up:

EXECUTING /usr/sap/SID/SUM/abap/exe/sapcontrol -format script -prot NI_HTTP -host <hostname> -nr 02 -function GetProcessList

DATE / TIME

GetProcessList
OK
0 name: msg_server
0 description: MessageServer
0 dispstatus: GREEN
0 textstatus: Running
0 starttime:
0 elapsedtime: 0:00:41
0 pid:
1 name: enserver
1 description: EnqueueServer
1 dispstatus: GREEN
1 textstatus: Running
1 starttime:
1 elapsedtime: 0:00:41
1 pid:
2 name: disp+work
2 description: Dispatcher
2 dispstatus: GRAY
2 textstatus: Stopped
2 starttime:
2 elapsedtime:
2 pid:

As we can see, the message server and the enqueue server are running (GREEN), but disp+work, the dispatcher, is stopped (GRAY).
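The same check can be scripted. Below is a small sketch that groups the "-format script" lines by their leading process index and lists every process that is not GREEN. The helper names `parse_process_list` and `not_running` are hypothetical, and the parser assumes the "index key: value" layout seen in the output above.

```python
from collections import defaultdict

def parse_process_list(text):
    """Group 'N key: value' lines by process index N."""
    procs = defaultdict(dict)
    for line in text.splitlines():
        index, _, rest = line.strip().partition(" ")
        if not index.isdigit() or ":" not in rest:
            continue  # skip headers such as 'GetProcessList' and 'OK'
        key, _, value = rest.partition(":")
        procs[int(index)][key.strip()] = value.strip()
    return [procs[i] for i in sorted(procs)]

def not_running(processes):
    """Names of all processes whose dispstatus is not GREEN."""
    return [p["name"] for p in processes if p.get("dispstatus") != "GREEN"]

# Shortened copy of the output shown above.
sample = """\
GetProcessList
OK
0 name: msg_server
0 dispstatus: GREEN
0 textstatus: Running
0 elapsedtime: 0:00:41
1 name: enserver
1 dispstatus: GREEN
1 textstatus: Running
2 name: disp+work
2 dispstatus: GRAY
2 textstatus: Stopped
2 elapsedtime:
"""

print(not_running(parse_process_list(sample)))
# prints ['disp+work']
```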

Next, I checked the other trace file, DEVTRACE.LOG, and found these entries:

M  ***LOG Q01=> ThInit, WPStart (Workp. 0 1 ) [thxxhead.c   1108]
M
M DATE / TIME
M  ThInit: running on host <hostname>
I  MtxInit: 0 0 0
I  *** ERROR => e=28 semget(20214,1,2016) (28: No space left on device) [semux.c      502]
M  *** ERROR => ThrRegisterSem: SemInit(14) failed [thxxrun1.c   311]
M  *** ERROR => ThCallHooks: event handler ThrRegisterSem for event BEFORE_DB_CONNECT failed (-1) [thSos.c      2124]
M  in_ThErrHandle: 1
M  *** ERROR => ThCallHooks: hook failed (step TH_INIT, thRc ERROR-CORE-INTERNAL_ERROR, action STOP_WP, level 1) [thxxhead.c   2560]

 
In the end, work process W0 died and disp+work stopped:

M
M  ***
M  *** work process W0 died => ThCallHooks: hook failed
M  call ThrShutDown (1)...
M  ***LOG Q02=> wp_halt, WPStop (Workp. 0 ) [dpuxtool.c   317]

 

This is all the information that we need to analyze the issue. 😉

 

PART III


RESOLUTION PHASE:


The key entries in DEVTRACE.LOG are:

I  *** ERROR => e=28 semget(20214,1,2016) (28: No space left on device) [semux.c      502]
M  *** ERROR => ThrRegisterSem: SemInit(14) failed [thxxrun1.c   311]
M  *** ERROR => ThCallHooks: event handler ThrRegisterSem for event BEFORE_DB_CONNECT failed (-1) [thSos.c      2124]
M  in_ThErrHandle: 1

These entries indicate that too few semaphores are available on the system. The semget system call returns ENOSPC (error code 28) when creating a new semaphore set would exceed either the maximum number of semaphore sets (SEMMNI) or the system-wide maximum number of semaphores (SEMMNS). Despite the message text "No space left on device", this error has nothing to do with disk space.
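To make the trace line easier to read, here is a minimal sketch that decodes it with Python's standard errno table. The function name `decode_semget_error` is my own, and I am assuming the trace prints the arguments in the order of the semget(key, nsems, semflg) system call; the source does not confirm that.

```python
import errno
import os
import re

# Layout taken from the DEVTRACE.LOG line above; argument order is an assumption.
SEMGET_RE = re.compile(r"e=(\d+) semget\((\d+),(\d+),(\d+)\)")

def decode_semget_error(line):
    """Return the error number, its symbolic name and the semget arguments."""
    m = SEMGET_RE.search(line)
    if m is None:
        return None
    err, key, nsems, flags = (int(g) for g in m.groups())
    return {
        "errno": err,
        "errno_name": errno.errorcode.get(err, "UNKNOWN"),
        "message": os.strerror(err),
        "key": key,
        "nsems": nsems,
        "flags": flags,
    }

line = "I  *** ERROR => e=28 semget(20214,1,2016) (28: No space left on device) [semux.c      502]"
info = decode_semget_error(line)
print(info["errno_name"], info["key"], info["nsems"])
```

If the argument order assumption holds, the call asked for a set containing a single semaphore, which would point towards the number of sets rather than the set size as the exhausted resource.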

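The limits for semaphores and semaphore sets are controlled through the Linux kernel parameter kernel.sem. Its four values are, in order, SEMMSL (maximum semaphores per set), SEMMNS (maximum semaphores system-wide), SEMOPM (maximum operations per semop call), and SEMMNI (maximum number of semaphore sets).

My initial value for kernel.sem was:
kernel.sem = 250 32000 100 128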
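Since the four numbers are easy to mix up, a small sketch that names them may help. The field order SEMMSL, SEMMNS, SEMOPM, SEMMNI is the one Linux uses for kernel.sem; the class and function names (`SemLimits`, `parse_kernel_sem`) are hypothetical helpers of mine.

```python
from typing import NamedTuple

class SemLimits(NamedTuple):
    semmsl: int  # maximum semaphores per set
    semmns: int  # maximum semaphores system-wide
    semopm: int  # maximum operations per semop call
    semmni: int  # maximum number of semaphore sets

def parse_kernel_sem(value):
    """Parse the four whitespace-separated numbers of kernel.sem."""
    fields = value.split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}")
    return SemLimits(*(int(f) for f in fields))

limits = parse_kernel_sem("250 32000 100 128")
print(limits.semmni)  # prints 128
print(limits.semmns)  # prints 32000
```

On a running Linux system the same string can be read from /proc/sys/kernel/sem, so `parse_kernel_sem(open("/proc/sys/kernel/sem").read())` should give the live limits.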

I logged in at OS level and changed the kernel.sem entry in /etc/sysctl.conf.

To load the new settings from the file into the running kernel, I ran "sysctl -p".

The output of sysctl -p looked like this:

net.ipv4.ip_forward = 0
kernel.shmmax = 89161967780
kernel.shmmni = 4096
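kernel.sem = 1250 256000 100 1024
kernel.shmall = 18350080

The kernel.sem line shows the value I changed to. Compared with the initial setting, it raises the maximum semaphores per set from 250 to 1250, the system-wide maximum from 32000 to 256000, and the maximum number of semaphore sets from 128 to 1024. After this I resumed the upgrade and it completed successfully 🙂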
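To see how much headroom is left after such a change, the current usage can be compared against the limits. This is only a sketch: the function names are mine, the sample rows are made-up values for illustration, and I am assuming the usual /proc/sysvipc/sem layout with a header line that contains an nsems column.

```python
def semaphore_usage(sysvipc_sem_text):
    """Return (number_of_sets, total_semaphores) from /proc/sysvipc/sem content."""
    lines = sysvipc_sem_text.strip().splitlines()
    if not lines:
        return (0, 0)
    header = lines[0].split()
    nsems_col = header.index("nsems")
    rows = [line.split() for line in lines[1:] if line.strip()]
    return (len(rows), sum(int(r[nsems_col]) for r in rows))

def headroom(kernel_sem, usage):
    """Free semaphore sets and free semaphores for a kernel.sem string."""
    semmsl, semmns, semopm, semmni = (int(x) for x in kernel_sem.split())
    sets, total = usage
    return {"sets_free": semmni - sets, "semaphores_free": semmns - total}

# Hypothetical sample content, not taken from the system in this blog.
sample = """\
       key      semid perms      nsems   uid   gid  cuid  cgid      otime      ctime
     20214          0   740          1  1001  1001  1001  1001          0 1500000000
     20215          1   740          1  1001  1001  1001  1001          0 1500000000
"""

usage = semaphore_usage(sample)
print(usage)                                     # prints (2, 2)
print(headroom("1250 256000 100 1024", usage))   # prints {'sets_free': 1022, 'semaphores_free': 255998}
```

A value of sets_free or semaphores_free close to zero would suggest that the semget error is about to return.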

 

The SAP Notes/KBAs and the wiki page I used to resolve the issue are listed below:

 

SAP KBAs/Notes :


2663418 - Semaphore error - e=28 semget No space left on device

1496410 - Red Hat Enterprise Linux 6.x: Installation and Upgrade

2002167 - Red Hat Enterprise Linux 7.x: Installation and Upgrade

1635808 - Oracle Linux 6.x SAP Installation and Upgrade

2069760 - Oracle Linux 7.x SAP Installation and Upgrade

1275776 - Linux: Preparing SLES for SAP environments (For SLES, the sapconf/saptune will configure high enough values for kernel.sem)

941735 - SAP memory management system for 64-bit Linux systems

SAP Wiki :


ERROR shmget ... (28: No space left on device), ERROR semget ... (28: No space left on device)

 

Let me know if you have any questions.

 

Thank you and Best regards,
Manjunath Hanmantgad