In the coming weeks, Atlassian will introduce an automation enhancement for all its products and editions to make rules more reliable. Automation will now attempt to re-execute rules that were interrupted by temporary system issues. The goal is for workflows to keep running smoothly despite such interruptions.
A great feature, but one that still leaves some questions unanswered. We answer the most important ones.
When is an automation rule restarted?
To answer this question, you need to know the difference between configuration errors and system errors. Essentially, configuration errors persist until someone fixes them, while system errors are temporary.
- Configuration errors occur when a rule's components are set up incorrectly. This happens, for example, when a rule tries to create a work item in a project that has already been deleted.
- Configuration errors also occur when external systems change, for example, when a web request action in an automation rule points to an outdated API endpoint of a third-party provider.
These error types do not trigger an automatic retry, because repeating the rule would not produce a different result. But what are system errors?
- System errors are temporary interruptions within the Atlassian platform, such as brief service interruptions. These are usually rectified quickly.
- A system error also occurs when, for example, a rule action is supposed to transition a work item from one status to another and its execution is interrupted by a temporary error.
In such cases, automation will attempt to resume the rule from the point of interruption. It keeps retrying until either the action has been executed successfully or the 7-day retry window (measured from the time of the original error) has expired.
The retry mechanism itself does not count toward a rule's processing time. Only the time spent executing the actions is counted.
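The retry behavior described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Atlassian's implementation: the names `ConfigurationError`, `TransientSystemError`, `Outcome`, and `run_with_retries` are hypothetical, the clock is simulated, and the retry interval is invented, because the documentation does not say how often retries happen.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Optional

# From the article: retries stop 7 days after the original error.
RETRY_WINDOW = timedelta(days=7)


class ConfigurationError(Exception):
    """Persistent problem (deleted project, outdated endpoint). Never retried."""


class TransientSystemError(Exception):
    """Temporary platform problem. Eligible for retry."""


@dataclass
class Outcome:
    status: str                # "success", "failed", or "expired"
    attempts: int
    processing_seconds: float  # only time spent executing the action


def run_with_retries(
    action: Callable[[], float],
    start: datetime,
    retry_interval: timedelta = timedelta(hours=6),  # assumption, not documented
) -> Outcome:
    """Run `action` until it succeeds, fails permanently, or the window expires.

    `action` is a hypothetical callable that returns the seconds it spent
    working, or raises one of the two error types above.
    """
    clock = start                          # simulated clock, no real sleeping
    first_error_at: Optional[datetime] = None
    attempts = 0
    processing = 0.0
    while True:
        attempts += 1
        try:
            processing += action()         # only execution time is counted
            return Outcome("success", attempts, processing)
        except ConfigurationError:
            # Repeating would not change the result, so give up immediately.
            return Outcome("failed", attempts, processing)
        except TransientSystemError:
            if first_error_at is None:
                first_error_at = clock     # the window starts at the first error
            clock += retry_interval        # waiting adds no processing time
            if clock - first_error_at > RETRY_WINDOW:
                return Outcome("expired", attempts, processing)
```

In this sketch the waiting time between attempts only advances the simulated clock and never adds to `processing_seconds`, which mirrors the statement that the retry mechanism does not count toward a rule's processing time.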
Can retries be tracked?
If a retry succeeds, the fact that the rule was retried usually goes unnoticed. However, the audit log shows which rules are waiting to be retried and which have already been retried successfully.
1. New Status: In the Retry Queue
If a rule has been stopped due to a system error and is waiting to be retried or is currently being retried, it is given the status "In the retry queue". To find out exactly where the rule was stopped, click on "Show more".
2. Existing Status: Successfully Retried
If a rule was stopped due to a system error and then retried successfully, the existing success status is displayed. If you click on "Show more", the component that was retried is shown with a circular arrow symbol instead of a tick.
3. Existing Status: Retried and Failed
If a rule was stopped due to a system error, retried, and failed again, the existing error status is displayed, with the details available under "Show more".
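The three cases above amount to a small decision table, sketched here for illustration. The function names and the `"success"` and `"error"` labels are placeholders, since the article only refers to the "existing" success and error statuses; only "In the retry queue" is taken from the text.

```python
from typing import Optional


def audit_status(retry_result: Optional[bool]) -> str:
    """Status shown for a rule run that was stopped by a system error.

    retry_result is None while the retry is pending or in progress,
    True once a retry has succeeded, and False once it has failed.
    """
    if retry_result is None:
        return "In the retry queue"   # new status named in the article
    # Placeholder labels for the existing success and error statuses.
    return "success" if retry_result else "error"


def component_marker(was_retried: bool) -> str:
    """Symbol shown next to a successful component under "Show more"."""
    return "circular arrow" if was_retried else "tick"
```

Read this way, a successful retry differs from an ordinary successful run only in the symbol shown next to the retried component.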
Open Questions
Some questions could be clarified, but others remain open even after reading the documentation. For example:
- To what extent is this feature configurable (per project, rule, or action)? How many retries are attempted?
- The period of 7 days seems quite long. Can it be shortened, or can a retry even be canceled manually?
- Can the rule's creator be notified when a failed action is retried?
- Can the rule be changed within the 7-day retry period?
- What happens to the retries if the automation rule is changed in the meantime?
- How can temporary interruptions be quickly identified in the audit log?
- Can specific actions be singled out as the ones worth retrying?
Hopefully Atlassian has at least provided some configuration options, but this is not certain at the moment. We will know more in a few weeks.