Welcome back to the next instalment of our beyond-the-deployment journey! In the previous part, we delved into the world of user support during the post-go-live phase. Today, we're diving into another crucial aspect of the Hypercare period: system monitoring.
As explained previously, Hypercare is a post-deployment phase where extra resources and attention are focused on supporting users and business processes. The project delivery team stays involved to ensure a smooth transition and address any unexpected issues.
For me, the most important part of this phase is user support, as covered in Part 1. However, there is another, equally important element that we need to focus on – system monitoring.
One of the first things I like to actively check is the system automation and integrations.
System automation comes in different flavours, but as a functional consultant, I would mainly focus on Power Automate flows and classic workflows.
Power Automate
When it comes to Power Automate, there are a few things that we can do.
First one – ideally, you have implemented the “run after fail” pattern in all your Flows, so this should be a breeze. Simply monitor the setup you have configured – whether it’s a mailbox that receives failure notifications or a dashboard that shows the failed runs.
If you are not sure what I am talking about, watch Griffin’s video “How to CORRECTLY USE Run After in POWER AUTOMATE” where he explains it: https://www.youtube.com/watch?v=3UcWxFl9bww
If your Flows have not been built with the "try...catch" approach, it is not the end of the world – there are other ways of monitoring the Flows.
The alternative is to check the Flow runs via the Power Platform admin centre, where under “Analytics” you will find Power Automate. In there, you will see runs, usage, errors, etc.
Of course, you can also check the flows individually. However, when you have 100+ flows, you will certainly want to avoid that. Therefore, check the “Errors” tab regularly. Being proactive will help you catch and resolve potential issues. Sometimes you will be able to catch errors before users report them. In some cases, users might not even be aware of any issues, or there might be delays in them noticing and reporting errors.
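If you'd rather script a quick health check than click through the portal, the definitions of your cloud flows live in the Dataverse “workflow” table under category 5. Below is a minimal sketch using the Dataverse Web API – the environment URL and token are placeholders you would supply yourself (acquiring the Azure AD token, e.g. via MSAL, is out of scope here):

```python
import requests

ORG_URL = "https://yourorg.crm.dynamics.com"  # placeholder environment URL
TOKEN = "<access-token>"  # placeholder; acquire via Azure AD / MSAL
headers = {"Authorization": f"Bearer {TOKEN}", "Accept": "application/json"}

# Cloud flows are rows in the 'workflow' table with category 5.
# statecode: 0 = Draft (off), 1 = Activated (on).
resp = requests.get(
    f"{ORG_URL}/api/data/v9.2/workflows",
    headers=headers,
    params={
        "$select": "name,statecode,modifiedon",
        "$filter": "category eq 5",
        "$orderby": "modifiedon desc",
    },
)
resp.raise_for_status()

for flow in resp.json()["value"]:
    state = "on" if flow["statecode"] == 1 else "OFF - check me"
    print(f'{flow["name"]}: {state}')
```

A flow that has been switched off (for example, automatically after repeated failures) will show up as Draft, which makes this a handy first-pass check.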
Classic Workflows
If you are like me and still using the classic workflows, make sure you monitor them as well. There are a few ways of doing this.
System Jobs
The first one is via Settings, then System Jobs. In there, you will see all types of system jobs. You can filter the rows to those where the System Job Type is Workflow. For me, the unfortunate thing is that you need to apply that filter every time…
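To save yourself from re-applying that filter, you can pull the same list programmatically – system jobs live in the “asyncoperation” table, where operationtype 10 means Workflow and statuscode 31 means Failed. A minimal sketch (same placeholder URL and token as before):

```python
import requests

ORG_URL = "https://yourorg.crm.dynamics.com"  # placeholder environment URL
TOKEN = "<access-token>"  # placeholder
headers = {"Authorization": f"Bearer {TOKEN}", "Accept": "application/json"}

# System jobs live in the 'asyncoperation' table.
# operationtype 10 = Workflow, statuscode 31 = Failed.
resp = requests.get(
    f"{ORG_URL}/api/data/v9.2/asyncoperations",
    headers=headers,
    params={
        "$select": "name,statuscode,friendlymessage,createdon",
        "$filter": "operationtype eq 10 and statuscode eq 31",
        "$orderby": "createdon desc",
        "$top": "50",
    },
)
resp.raise_for_status()

for job in resp.json()["value"]:
    print(f'{job["createdon"]}  {job["name"]}: {job.get("friendlymessage") or ""}')
```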
Advanced Find
However, I have an alternative for that! What is it? I simply use the Advanced Find. Yes, the tool that you use to find Accounts, Contacts, etc., can also be used to check on your workflow runs.
For the background workflows, all the runs can be found under the “Processes” table (entity).
I create a view with all the “failed” workflows and the ones in “waiting”.
For the latter, I would usually set a date range on “created on”, as sometimes it might take a few days for a workflow to meet all the conditions and resume. Nonetheless, there will also be situations when, unfortunately, a workflow was poorly designed... And since the conditions will never be met, the workflow will sit in “waiting” until it is manually cancelled.
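Under the hood, Advanced Find generates FetchXML, and the very same query can be run through the Web API's fetchXml parameter. Here is a rough sketch of my “failed or waiting” view – the seven-day window on “created on” is just an example value:

```python
import urllib.parse

import requests

ORG_URL = "https://yourorg.crm.dynamics.com"  # placeholder environment URL
TOKEN = "<access-token>"  # placeholder
headers = {"Authorization": f"Bearer {TOKEN}", "Accept": "application/json"}

# Workflow jobs (operationtype 10) that are Failed (31) or Waiting (10),
# created in the last 7 days - roughly what Advanced Find would build.
fetch_xml = """
<fetch top="100">
  <entity name="asyncoperation">
    <attribute name="name" />
    <attribute name="statuscode" />
    <attribute name="createdon" />
    <filter>
      <condition attribute="operationtype" operator="eq" value="10" />
      <condition attribute="statuscode" operator="in">
        <value>31</value>
        <value>10</value>
      </condition>
      <condition attribute="createdon" operator="last-x-days" value="7" />
    </filter>
    <order attribute="createdon" descending="true" />
  </entity>
</fetch>
"""

url = (
    f"{ORG_URL}/api/data/v9.2/asyncoperations"
    f"?fetchXml={urllib.parse.quote(fetch_xml)}"
)
resp = requests.get(url, headers=headers)
resp.raise_for_status()
for job in resp.json()["value"]:
    print(job["createdon"], job["statuscode"], job["name"])
```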
While a background workflow is running, you have the option to Cancel, Pause, or Postpone it. If you have previously paused a workflow, you can Resume it.
If you are interested in this method, I created a post that shows how to use Advanced Find to create a view for monitoring system processes:
However, the real-time workflows (and actions) do not use System Job records because they occur immediately. If you want a view of all the errors for the synchronous processes, use the Process Session entity. (There is no log for successful operations.)
To monitor the issues, you need to enable logging for errors by checking the “Keep Logs for workflow jobs that encountered errors” option in the “Workflow Log Retention” area at the bottom of the process's Administration tab.
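Once that option is enabled, the resulting Process Session rows can be queried like any other table. A minimal sketch – I am assuming that filtering on the errorcode column isolates the failures, so verify that against your own data:

```python
import requests

ORG_URL = "https://yourorg.crm.dynamics.com"  # placeholder environment URL
TOKEN = "<access-token>"  # placeholder
headers = {"Authorization": f"Bearer {TOKEN}", "Accept": "application/json"}

# Real-time workflow errors are logged as Process Session rows
# (logical name: processsession) once error logging is enabled.
# Assumption: failed sessions have 'errorcode' populated.
resp = requests.get(
    f"{ORG_URL}/api/data/v9.2/processsessions",
    headers=headers,
    params={
        "$select": "name,errorcode,startedon,completedon",
        "$filter": "errorcode ne null",
        "$orderby": "startedon desc",
        "$top": "50",
    },
)
resp.raise_for_status()
for session in resp.json()["value"]:
    print(session["startedon"], session["name"], session["errorcode"])
```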
Background Processes
If you know the specific record that the workflow failed for, you can access it via the Related tab on a record, then Background Processes. This will show all the system jobs that have been started in the context of the record; you will also see their status and be able to troubleshoot from there.
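If you prefer a query over clicking into the record, the same list is just the system jobs filtered by their “regarding” record – the GUID below is a placeholder:

```python
import requests

ORG_URL = "https://yourorg.crm.dynamics.com"  # placeholder environment URL
TOKEN = "<access-token>"  # placeholder
headers = {"Authorization": f"Bearer {TOKEN}", "Accept": "application/json"}

# Placeholder GUID of the record the workflow ran against.
record_id = "00000000-0000-0000-0000-000000000000"

# Same data as the Background Processes view: system jobs whose
# 'regarding' record is the one we care about.
resp = requests.get(
    f"{ORG_URL}/api/data/v9.2/asyncoperations",
    headers=headers,
    params={
        "$select": "name,statuscode,friendlymessage,createdon",
        "$filter": f"_regardingobjectid_value eq {record_id}",
        "$orderby": "createdon desc",
    },
)
resp.raise_for_status()
for job in resp.json()["value"]:
    print(job["createdon"], job["statuscode"], job["name"])
```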
Process Session
You can also open the workflow itself and navigate to the Process Session tab. This will show only the system jobs for this workflow. If you have multiple issues with the same workflow, it is good to check its overall execution history this way.
Additional tip:
Over time, these logs will take up storage unnecessarily. Therefore, when you opt in to keeping the logs, also set up a periodic bulk delete job to clear them out.
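Bulk delete jobs are normally set up through the UI (Settings, then Data Management, then Bulk Record Deletion), but for completeness, here is a hedged sketch of scheduling one via the Web API's BulkDelete action. Treat the payload shape as an assumption and double-check it against the Web API reference before relying on it:

```python
import requests

ORG_URL = "https://yourorg.crm.dynamics.com"  # placeholder environment URL
TOKEN = "<access-token>"  # placeholder
headers = {
    "Authorization": f"Bearer {TOKEN}",
    "Content-Type": "application/json",
}

# Weekly bulk-delete of completed workflow system jobs older than 30 days.
# The exact serialisation of typed Values may differ - verify against
# the BulkDelete action documentation before use.
payload = {
    "QuerySet": [
        {
            "EntityName": "asyncoperation",
            "Criteria": {
                "FilterOperator": "And",
                "Conditions": [
                    {
                        "AttributeName": "operationtype",
                        "Operator": "Equal",
                        "Values": [{"Type": "System.Int32", "Value": 10}],
                    },
                    {
                        "AttributeName": "statecode",  # 3 = Completed
                        "Operator": "Equal",
                        "Values": [{"Type": "System.Int32", "Value": 3}],
                    },
                    {
                        "AttributeName": "completedon",
                        "Operator": "OlderThanXDays",
                        "Values": [{"Type": "System.Int32", "Value": 30}],
                    },
                ],
            },
        }
    ],
    "JobName": "Clear old workflow logs",
    "SendEmailNotification": False,
    "ToRecipients": [],
    "CCRecipients": [],
    "RecurrencePattern": "FREQ=WEEKLY;INTERVAL=1",
    "StartDateTime": "2024-01-01T02:00:00Z",
}

resp = requests.post(
    f"{ORG_URL}/api/data/v9.2/BulkDelete", headers=headers, json=payload
)
resp.raise_for_status()
print("Bulk delete job scheduled.")
```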
Plugins
If you have implemented any plugins to handle automation, there are a few ways to monitor them too.
Settings
The first one is to refer to Settings, then Plug-In Trace Log.
Advanced Find
The alternative is to use Advanced Find again, just like for the workflows. The table you want to query is called “Plug-In Trace Logs” – a much more obvious name than the one for workflow execution details, isn’t it?
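And once more, the same data is available programmatically – the table's logical name is “plugintracelog”, and failed executions are the rows where the exception details are populated. A minimal sketch:

```python
import requests

ORG_URL = "https://yourorg.crm.dynamics.com"  # placeholder environment URL
TOKEN = "<access-token>"  # placeholder
headers = {"Authorization": f"Bearer {TOKEN}", "Accept": "application/json"}

# Plug-in trace logs live in the 'plugintracelog' table.
# Rows with exception details populated are the failed executions.
# (Depending on how traces were written, you may need to filter on
# an empty string rather than null.)
resp = requests.get(
    f"{ORG_URL}/api/data/v9.2/plugintracelogs",
    headers=headers,
    params={
        "$select": "typename,messagename,primaryentity,exceptiondetails,createdon",
        "$filter": "exceptiondetails ne null",
        "$orderby": "createdon desc",
        "$top": "25",
    },
)
resp.raise_for_status()
for log in resp.json()["value"]:
    print(f'{log["createdon"]}  {log["typename"]} ({log["messagename"]})')
    print(f'  {log["exceptiondetails"][:200]}')
```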
XrmToolBox
Last, but not least (that I know of), is to use XrmToolBox and the Plugin Trace Viewer tool, which allows you to easily filter and investigate plugin runs.
Integrations
Integrations are an extensive subject – you can have a one-directional or a bi-directional integration, and you can set up different frequencies for them, with simple or complicated rules.
There are also multiple methods to handle them – Power Automate, dual-write (e.g. D365 model-driven app to D365 FinOps), APIs, or more complex and sophisticated approaches.
As a functional consultant, I would usually be responsible for limited activities around this, but there are two checks I like to run.
The first is to engage a more technical resource and ask them to support a review of the integration performance.
The second is to verify that the integration is working as it should by reconciling the data from both sources to ensure it was integrated properly. It is a more time-consuming exercise, but if the automation does not show any errors, this method confirms that the logic works as designed.
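The reconciliation itself can be as simple as comparing the business keys exported from both systems. A deliberately simplified sketch – the file names and the “order_id” key column are hypothetical stand-ins for your own extracts:

```python
# Simplified reconciliation sketch: the CSV extracts and the 'order_id'
# key column are hypothetical - substitute your own exports.
import csv


def load_keys(path: str, key_column: str) -> set[str]:
    """Load the set of business keys from an exported CSV."""
    with open(path, newline="", encoding="utf-8") as f:
        return {row[key_column] for row in csv.DictReader(f)}


source_keys = load_keys("crm_orders.csv", "order_id")
target_keys = load_keys("finops_orders.csv", "order_id")

missing_in_target = source_keys - target_keys
unexpected_in_target = target_keys - source_keys

print(f"Records missing in target: {len(missing_in_target)}")
print(f"Records only in target:    {len(unexpected_in_target)}")
for key in sorted(missing_in_target)[:20]:
    print("  missing:", key)
```

Anything present in the source but missing in the target is a record the integration silently dropped – usually the first thing worth investigating.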
Low Code Plugins – the future of real-time automation?
At the time of writing this blog, low-code plugins are still in preview. However, this technology seems to be the next great thing that could potentially replace the classic workflows. The latest and best feature of low-code plugins is “plugin monitoring”, which will be invaluable in the early days of monitoring (and obviously beyond the Hypercare period too!).
To find out more about it, check out Nathan Rose’s post:
Conclusion
In summary, mastering the art of automation monitoring in the early days post-deployment is not merely about reacting to issues, but about proactively checking that the system automation and integrations are operating seamlessly.
With the right tools, knowledge, and foresight, we can catch and resolve potential issues early and minimise the negative impact of errors that were not caught during the testing phase.
However, apart from user support and system monitoring, there are other very important activities in the Hypercare period – I’m looking forward to sharing them with you in the next part.
Stay in touch, so you won’t miss it!