
App Service Auto-Heal

How many times has one of our applications started to experience problems like these?

  • Memory problems
  • Slow requests
  • 500 errors

These problems can often be resolved temporarily with a simple restart or another mitigation step. The trouble is that this kind of issue can occur outside business hours. The Auto-Heal feature is a great tool for setting up custom mitigation actions that run when certain conditions are met (you configure the conditions and actions you need).

Where can I configure the Auto-Heal tool?

  • In the Azure Portal, go to the App Service itself. In its menu, click “Diagnose and solve problems”.

How do I configure the Auto-Heal feature?

Step 1:

First, select the condition that should trigger the mitigation rule.

The conditions supported by Auto-Heal:

  • Request Duration: examines slow requests.
  • Memory Limit: examines process memory in private bytes.
  • Request Count: examines the number of requests.
  • Status Codes: examines the number of requests and their HTTP status codes.
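To make the four conditions more concrete, here is a minimal sketch of how each one could be evaluated over a list of recent requests. It is only a conceptual model, not the way the platform implements Auto-Heal: the `Request` record, the function names, the time window, and the "at least" comparisons are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Request:
    """One finished request (hypothetical record, not an Azure type)."""
    finished_at: datetime
    status: int
    duration: timedelta


def in_window(request, now, interval):
    """True when the request finished within `interval` before `now`."""
    return now - interval <= request.finished_at <= now


def request_count_met(requests, now, interval, count):
    """Request Count: at least `count` requests inside the window."""
    recent = [r for r in requests if in_window(r, now, interval)]
    return len(recent) >= count


def status_code_met(requests, now, interval, count, status):
    """Status Codes: at least `count` requests with `status` inside the window."""
    hits = [r for r in requests
            if in_window(r, now, interval) and r.status == status]
    return len(hits) >= count


def request_duration_met(requests, now, interval, count, time_taken):
    """Request Duration: at least `count` requests that took `time_taken` or longer."""
    slow = [r for r in requests
            if in_window(r, now, interval) and r.duration >= time_taken]
    return len(slow) >= count


def memory_limit_met(private_bytes_kb, limit_kb):
    """Memory Limit: process private bytes reached the configured limit."""
    return private_bytes_kb >= limit_kb
```

For example, `status_code_met(requests, now, timedelta(minutes=5), 10, 500)` would model a rule such as "10 requests returned a 500 within 5 minutes".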

Note: Parameters that have a red asterisk next to them are required fields.

After reading the description, select the blue button to configure the rule parameters that you need.

Step 2:

Select the mitigation action that Auto-Heal should perform when the conditions configured in step 1 are met.

The Auto-Heal feature supports three possible actions to mitigate the problems:

  1. Recycle Process
  2. Log an Event
  3. Custom Action
    • Run Diagnostics
      • Memory Dump
      • Java Memory Dump
      • Java Thread Dump
      • CLR Profiler
      • CLR Profiler with Thread Stacks
    • Run Any Executable

Action modes available:

Collect: In this mode, data is collected according to the selected action. While the data is being collected, the process is frozen until the collection completes. How long the process stays frozen depends directly on how much memory the process is using.

Kill: In this mode, the process is killed when the condition is met. Kill is a forceful termination of the process and not a graceful exit. All requests that the current worker process is processing will be terminated, and end users may see 502 errors.

Collect and Kill: This mode combines the Collect and Kill modes. When the process meets the configured condition, the data is collected (the process is frozen until the dump has been generated) and then the process is killed. This is a forced kill: any requests still being processed on the worker get no grace period to finish. No analysis is performed, but after the session ends you can analyze the memory dump by clicking the Analyze button.

Collect, Kill, and Analyze: In this mode, the data is collected and the process is killed when the process meets the configured condition. In addition, the data is analyzed and an analysis report is generated.
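The differences between the four modes can be summarized as the sequence of steps each one runs. The sketch below is just a restatement of the descriptions above; the `Mode` enum, the `steps_for` function, and the step labels are hypothetical names chosen for illustration and do not correspond to any Azure API.

```python
from enum import Enum


class Mode(Enum):
    COLLECT = "Collect"
    KILL = "Kill"
    COLLECT_AND_KILL = "Collect and Kill"
    COLLECT_KILL_ANALYZE = "Collect, Kill, and Analyze"


# Modes that freeze the process to collect data, and modes that kill it.
COLLECTING = {Mode.COLLECT, Mode.COLLECT_AND_KILL, Mode.COLLECT_KILL_ANALYZE}
KILLING = {Mode.KILL, Mode.COLLECT_AND_KILL, Mode.COLLECT_KILL_ANALYZE}


def steps_for(mode):
    """Return the ordered steps a mode performs once the condition is met."""
    steps = []
    if mode in COLLECTING:
        steps.append("freeze process and collect data")
    if mode in KILLING:
        steps.append("force-kill process")
    if mode is Mode.COLLECT_KILL_ANALYZE:
        steps.append("analyze data and generate report")
    return steps


for mode in Mode:
    print(f"{mode.value}: {' -> '.join(steps_for(mode))}")
```

Only Collect leaves the process running afterwards, and only Collect, Kill, and Analyze produces a report without a manual click on the Analyze button.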

Step 3:

Sometimes when an app has a long startup time, depending on the mitigation rule conditions that are set, it may kick off the mitigation action during app startup, which is not the intended use case. By modifying the startup time, you can specify how much time the mitigation rule should wait after the process startup before the mitigation rule kicks off.
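The effect of the startup time can be pictured as a simple guard in front of the rule. This is a minimal sketch under assumptions: `should_run_mitigation` and its parameters are hypothetical names, and the real feature may measure the process uptime differently.

```python
from datetime import timedelta


def should_run_mitigation(condition_met, process_uptime, startup_time):
    """Ignore the rule until the process has been running for `startup_time`."""
    if process_uptime < startup_time:
        # Still inside the startup window: never mitigate, even if the
        # condition (for example, slow requests during warm-up) is met.
        return False
    return condition_met


# A slow-starting app: the condition is met 30 seconds after startup,
# but the rule is configured to wait 2 minutes.
print(should_run_mitigation(True, timedelta(seconds=30), timedelta(minutes=2)))
print(should_run_mitigation(True, timedelta(minutes=5), timedelta(minutes=2)))
```

The first call models a condition met during warm-up, which is ignored; the second models the same condition after the startup window has passed.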

Step 4:

Review and save the settings.
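When reviewing, it can help to write the rule from steps 1 to 3 down as plain data. The sketch below does that with a Python dictionary. The key names are an assumption: they follow the `autoHealRules` section of the App Service site configuration as we recall it, so please verify them against the current reference before relying on them. The values (ten 500 errors in five minutes, recycle, two-minute startup time) are made up for the example.

```python
import json

# Assumed shape of an Auto-Heal rule; key names should be verified
# against the official site configuration reference.
auto_heal_settings = {
    "autoHealEnabled": True,
    "autoHealRules": {
        # Step 1: the condition (here: status codes).
        "triggers": {
            "statusCodes": [
                {"status": 500, "count": 10, "timeInterval": "00:05:00"}
            ],
        },
        # Step 2: the action, plus the step 3 startup time override.
        "actions": {
            "actionType": "Recycle",
            "minProcessExecutionTime": "00:02:00",
        },
    },
}

print(json.dumps(auto_heal_settings, indent=2))
```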

Saving the settings restarts the application, so we recommend making these changes outside business hours.

Keep in mind that these mitigation actions should be considered a temporary workaround. The idea of this feature is to provide tools that help identify the cause of the unexpected behavior.

Auto-Heal is different from Proactive Auto-Heal.

NOTE: This feature is only available for Windows applications.
