SAP Batch Job Restart on Error
Background
IT-Conductor offers an automation solution for restarting SAP batch jobs when they fail. It covers detection of aborted batch jobs, automatic restart of the failed jobs, and notification of the appropriate job owner, with the job log delivered as an attachment. In advanced cases, IT-Conductor can also restart the job with a specific variant and/or from a specific step. Depending on how complex the restart conditions for a particular job are, IT-Conductor can be configured to carry out this process automatically, which reduces the MTTR (Mean Time to Repair).
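The flow described above (detect aborted jobs, restart them, notify the owner with the job log attached) can be sketched as a small in-memory model. This is only an illustration of the logic, not IT-Conductor's implementation or API: the `Job` and `Notification` classes, the status strings, and the `handle_aborted_jobs` function are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Job:
    """Hypothetical stand-in for an SAP batch job record."""
    name: str
    owner: str
    status: str                    # illustrative values: "finished", "aborted", "released"
    log: str = ""
    variant: Optional[str] = None


@dataclass
class Notification:
    """Hypothetical message sent to the job owner."""
    recipient: str
    subject: str
    attachment: str                # the job log travels with the notification


def handle_aborted_jobs(
    jobs: List[Job], restart_variant: Optional[str] = None
) -> Tuple[List[Job], List[Notification]]:
    """Return (restarted job copies, owner notifications) for every aborted job."""
    restarted: List[Job] = []
    notifications: List[Notification] = []
    for job in jobs:
        if job.status != "aborted":
            continue  # only failed jobs are restarted
        # Restart as a fresh copy; optionally override the variant (the advanced case).
        restarted.append(
            Job(
                name=job.name,
                owner=job.owner,
                status="released",
                variant=restart_variant or job.variant,
            )
        )
        notifications.append(
            Notification(
                recipient=job.owner,
                subject=f"Job {job.name} aborted and was restarted",
                attachment=job.log,
            )
        )
    return restarted, notifications
```

Calling `handle_aborted_jobs(jobs)` leaves finished jobs untouched and returns one restarted copy plus one notification per aborted job. Passing `restart_variant` models the restart with a specific variant.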
Prerequisites
In your SAP environment, create a dedicated SAP service user to monitor and execute the jobs.
In the IT-Conductor main menu, navigate to Support → Downloads → SAP Security Downloads, and download the “SAP NW Batch Scheduling Role Import” file.
Using transaction code PFCG, assign this role to the SAP service user you just created for job monitoring.
In the IT-Conductor Service Grid, navigate to the system where you’ll be creating batch jobs and select “Accounts”.
Create a robot user in IT-Conductor and associate it with the previously created SAP account. Give the user a descriptive name.
Create Threshold Override for Job Restart
Threshold overrides can be created from templates, and IT-Conductor provides templates for all metrics. Since we want to restart a job after it has failed or been aborted, we’ll work from the existing overrides for the aborted background jobs metric.
Navigate to System → Background jobs → Aborted → Threshold override.
Click the “Create Override from the Templates” icon.
Click the template to create a new override.
Click Save to complete the override configuration.
Create a Recovery Activity to Restart the Job
A recovery activity is an option that allows you to automatically take action whenever an incident occurs in IT-Conductor. Recovery activities are predefined by IT-Conductor Support based on the required automation process or scenario.
Return to the Threshold Override you just created and scroll down to the “Recovery” section.
To turn on the recovery activity, select “Warning” or “Alarm” in the “Recovery on” option.
If you select “Warning”, the recovery activity runs when the defined Warning threshold is exceeded.
If you select “Alarm”, the recovery activity runs when the defined Alarm threshold is exceeded.
Select a recovery activity from the “Recovery” list. In this case, we’re going to select the activity for Copy and Start Job.
Select the previously created robot user as “Owner”.
Check the “Alert” box if you want to be alerted whenever this recovery activity occurs.
Save the recovery activity.
Optional: if you wish to be notified when a job has failed, select either “Warning” or “Alarm” in the “Alert On” option.
Batch Job Recovery Activity in IT-Conductor
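The effect of the “Recovery on” setting can be illustrated with a short sketch. It assumes the metric is a count of aborted jobs and that “exceeded” means strictly greater than the threshold. The threshold values and the `should_run_recovery` function are hypothetical and are not taken from IT-Conductor.

```python
from typing import Dict, Optional


def should_run_recovery(
    aborted_count: int,
    thresholds: Dict[str, int],
    recovery_on: Optional[str],
) -> bool:
    """Decide whether the recovery activity fires for the current metric value.

    recovery_on is "Warning", "Alarm", or None when recovery is switched off.
    """
    if recovery_on is None:
        return False  # recovery not enabled on this override
    # Assumption: a threshold is "exceeded" when the value is strictly greater.
    return aborted_count > thresholds[recovery_on]


# Hypothetical thresholds: warn on any aborted job, alarm above two.
example_thresholds = {"Warning": 0, "Alarm": 2}
```

With these example thresholds, one aborted job triggers recovery when “Recovery on” is set to “Warning” but not when it is set to “Alarm”. Choosing “Warning” therefore restarts jobs sooner, while “Alarm” waits for the more severe condition.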