Former Member

Restarting the unresponsive Adaptive Job Service using a script

We recently ran into an issue where scheduled jobs on our BI 4.1 platform would not run as expected; they simply would not start. We have a clustered server setup, with the job servers running on a couple of nodes. When this happens, the job server status in the CMC still appears to be fine. We have raised the issue with SAP support and are working on a resolution. In the meantime, we use the PowerShell script below as a workaround to restart the job service when schedules do not run as expected.

Solution

We scheduled a simple Web Intelligence report to run every 5 minutes. Using PowerShell, we check the status of this report in the auditing database. When no entries have been made in the auditing database within a set threshold time, we use CCM.exe to restart the Job Server.


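#Auditing database credentials
$dataSource = "Auditing Database Server"
$user = "Auditing database user"
#Named $password because $pwd is PowerShell's automatic variable for the current directory
$password = 'Auditing database password'
$database = "Auditing Database Name"
#Number of minutes without a heartbeat entry before the Job Server is restarted
$threshold_mins = 10
$connectionString = "Server=$dataSource;uid=$user;pwd=$password;Database=$database;Integrated Security=False;"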
#Win AD authentication to SQL Server
#$connectionString = "Server=$dataSource;Database=$database;Integrated Security=True;"
#Query to check the scheduled report event in auditing database
#getdate() returns local server time; -480 minutes shifts Western Australia Standard Time (UTC+8) to UTC before comparing with Start_time
$query = "select count(0) from ADS_EVENT where Event_Type_ID=1011 and Object_name='Webi Heart Beat v1.0' and Start_time>=dateadd(minute,-480-$threshold_mins ,getdate())"
$connection = New-Object System.Data.SqlClient.SqlConnection
$connection.ConnectionString = $connectionString
$connection.Open()
$command = $connection.CreateCommand()
$command.CommandText = $query
$result = $command.ExecuteReader()
#write-output $result.HasRows
while($result.read())
{
    Write-Output $result.GetValue(0)
    if ($result.GetValue(0) -gt 0)
    {
        Write-Output "Job Server Ok"
    }
    else
    {
        #Send a notification email
        Send-MailMessage -To "abc@xyz.com" -From "abc@xyz.com" -Subject "Job Server being restarted" -SmtpServer smtp.xyz.com.au
        #Restart the Job Server: stop it, then start it again
        #The password placeholder is quoted because an unquoted # starts a PowerShell comment
        & "C:\Program Files\SAP BusinessObjects Enterprise XI 4.0\win64_x64\ccm.exe" -managedstop SIA_BOBJ_DEV.AdaptiveJobServer -username administrator -password '######'
        & "C:\Program Files\SAP BusinessObjects Enterprise XI 4.0\win64_x64\ccm.exe" -managedstart SIA_BOBJ_DEV.AdaptiveJobServer -username administrator -password '######'
    }
}
$connection.Close()
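
The script above hard-codes the -480 minute offset for Western Australia Standard Time and builds the SQL text by string interpolation. As a hedged alternative, the sketch below computes the cutoff in UTC on the PowerShell side and passes it as a query parameter, so the same script could run on servers in other time zones. It assumes that Start_time in ADS_EVENT is stored in UTC, which is what the original offset implies. The function names Get-HeartbeatCutoffUtc and Get-HeartbeatCount are hypothetical helpers introduced here for illustration and are not part of the original script.

```powershell
# Sketch only. Both function names are hypothetical helpers, not part of the original script.
# Assumption: ADS_EVENT.Start_time holds UTC timestamps (implied by the -480 minute offset).

function Get-HeartbeatCutoffUtc {
    param(
        [int]$ThresholdMinutes = 10,
        # Injectable clock so the calculation can be checked with a fixed time.
        [datetime]$NowUtc = [datetime]::UtcNow
    )
    # Heartbeat events older than this instant do not count as "recent".
    return $NowUtc.AddMinutes(-$ThresholdMinutes)
}

function Get-HeartbeatCount {
    param(
        [Parameter(Mandatory)][string]$ConnectionString,
        [Parameter(Mandatory)][string]$ReportName,
        [Parameter(Mandatory)][datetime]$CutoffUtc
    )
    $connection = New-Object System.Data.SqlClient.SqlConnection $ConnectionString
    try {
        $connection.Open()
        $command = $connection.CreateCommand()
        # Same filter as the original query, with the report name and cutoff as parameters.
        $command.CommandText = 'select count(0) from ADS_EVENT where Event_Type_ID = 1011 ' +
            'and Object_name = @name and Start_time >= @cutoff'
        [void]$command.Parameters.AddWithValue('@name', $ReportName)
        [void]$command.Parameters.AddWithValue('@cutoff', $CutoffUtc)
        # count(0) returns a single value, so ExecuteScalar is enough.
        return [int]$command.ExecuteScalar()
    }
    finally {
        # Release the connection even if the query fails.
        $connection.Dispose()
    }
}

# Possible usage with the variables defined in the script above:
# $cutoff = Get-HeartbeatCutoffUtc -ThresholdMinutes $threshold_mins
# $count  = Get-HeartbeatCount -ConnectionString $connectionString -ReportName 'Webi Heart Beat v1.0' -CutoffUtc $cutoff
# if ($count -gt 0) { Write-Output "Job Server Ok" }
```

If the count comes back as 0, the notification and CCM.exe restart steps from the original script would follow unchanged.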


      12 Comments
      Sivakumar Chandrasekaran

      Yes Mohanraj, it's a bit tricky to find a hung service. A failed service is easy to detect, but it is not easy to tell whether a running service is actually working properly.

      We also used another mechanism to check whether the service is responding; we were using a Perl script :-). I was thinking of another option, such as checking whether the process is making progress, but never got the chance to try it.

      Former Member
      Blog Post Author

      With the job server, it is one of the JobServerChild processes that seems to hang.

      Sivakumar Chandrasekaran

      If 5 jobs are running, there will be 5 job server child processes, so 🙂

      Our BO environment has 4 job servers and each can run 5 jobs at a time, so we were struggling with this issue for some time 🙂

      Former Member
      Blog Post Author

      4.1 is much more stable and our job servers do not fail often. I hope 4.2 makes it even better.

      JinChong Tsai

      Have you tested adding the following to the end of the AdaptiveJobServer's command line?

      -RequestTimeoutMinutes 60 -type outproc

      so that the AJS will time out after 60 minutes of idle time and terminate its children when the job is completed.

      Regards,

      Jin-Chong

      Former Member
      Blog Post Author

      We have been asked by SAP to add these additional parameters to the Job Server this week. I shall wait a few weeks before confirming whether it helps in our case or not. Usually our job server goes into the unresponsive state once a month.

      Former Member

      Have you noticed any change? This has recently plagued our system and we are trying to find root causes and fixes other than restarting SIA. We too have a large clustered environment with 15,000 scheduled jobs and growing.

      I do appreciate the PowerShell script you've created. Will definitely be looking to implement something similar if this continues.

      Thanks,

      Corey

      Former Member
      Blog Post Author

      Hi Corey,

      We couldn't find the root cause of the issue. Restarting just the Adaptive Job Server fixes our issue; the script above restarts only the Job Service. If your deployment has more than one Job Server, the chances of them all going down at the same time are much lower.

      Sivakumar Chandrasekaran

      We are using this parameter, but no luck. The outproc parameter is good; each job runs in a separate child process.

      Jesús Antonio Santos Giraldo

      Well,

      It seems BO 4.2 SP3 hasn't changed at all. I have the exact same behavior.

      I'm going to try the "-RequestTimeoutMinutes 60 -type outproc" parameters and see how things go.

      Hope not to have to implement the PowerShell script described here...

      J.

      Former Member

      Hello,

      We have the same issue with the Adaptive Job Server, which is why we restart the SIA nightly.

      Sometimes some further servers are not active or do not run after the restart, but only sometimes. Now we would like to overcome this problem by implementing a kind of server probe. Is there a way to get the states of the servers (running, active) out of the CMS database so that we can do a restart with this script?

      Thanks.
      Kind regards,
      Tobias 

      Srinivas Perumandla

      Hi Mohanraj,

      We are experiencing the same issue with the Adaptive Job Server and would like to try this PowerShell script. However, I am a novice with PowerShell. Can you please help with the steps to deploy this script?

      Thanks,

      Srinivas