Common issues

SNAT App Services

¿What is SNAT?

Source NAT occurs when a internal or private host need to make a connection to an external host.  

The device that make the NAT, will be the one in charge to change the private IP of the host and give a public IP

Tuple of connection:
  • TCP
  • Source IP
  • Source Port
  • Destination IP
  • Destination port

What would happen if two Vm’s inside the same cloud service create an outgoing connection?

We can’t change the destination IP or destination port. The tuple should be unique so we will need to change something, the option is changing the Source information.

This translation is very common, for example when you connect your phone to a public Internet site through WIFI router, your router does the network address translation too.

SNAT with App Service

It’s very common, the  web application need to connect to other endpoints inside or outside of Azure. Like Database, Redis, API’s or any other Restful webservice.

App service web application can’t establish network connections to the external Internet endpoints directly.

This is because of the Architecture  of the App Service. We have a scale unit(stamp), the worker instances bounded inside the scale unit. The app services is hosted by one or some App Service worker instances.

The workers don’t have Internet IP addresses assigned. In each stamp we have VIP and is shared among all the workers inside it, we can’t have a dedicated VIP for each worker. Every stamp has a load balancer will be the one in charge to do the Source Network Address Translation, aka SNAT, to connect to external IP addresses.

How does SNAT work?

  1. App Service application sends a TCP package to an Internet IP address. The source IP address and port number of the package is internal(the IP of the worker)
  2. The TCP package is routed from a worker instance to the load balancer. The load balancer perform the SNAT, changes the source IP and port of the TCP package into its own ones and sends it out to the Internet.
  1. The Internet server receives the TCP package. Later when it sends back any package, it uses the IP address and port of the load balancer, as the destination of the package.
  2. When the load balancer receives a package routed back from the Internet server, it changes the destination IP and port to the ones of the worker instance, by using the mapping record above. The package then can be routed back to the worker instance.

Azure offer 160 ports on the VIP of the load balancer and the Stamp have 5 IP to make the SNAT. Every IP have 65536 ports, these ports are shared by all instances inside a stamp. A typical App Service stamp has thousands of instances. The SNAT ports is a limited resource of a stamp.

This behavior is usually a result of when the application is not reusing existing connections. Creating an outbound connection per request is bad practice and can lead to connection exhaustion.

How to know if my application is experiencing SNAT port exhausted

When an instance’s SNAT ports are exhausted, the following symptoms can be observed from the application:

  • Slow and pending on connecting to the remote endpoint.
  • Socket exceptions when the connections timeout in the web application
  • Intermittent issues connecting to other resource.

How to solve SNAT issues

At this point you need to make some code changes and follow the best practices to avoid SNAT. To resolve the SNAT issue we need to:

  • Modify the application to reuse connections.
  • Modify the application to use connection pooling
  • Modify the application to use less aggressive retry logic
  • Use keepalives to reset the outbound idle timeout
  • Ensure the backend services can return response quickly.
  • Scale out the App Service plan to more instances
  • Use App Service Environment, whose worker instance can have more SNAT ports, due to its smaller instances pool size.
  • Stay below 100 parallel connections to the same endpoint. 

One of the most recent recommendations is consider using Regional VNET integration.

Regional VNET Integration with Services Endpoints

If your destination is an Azure service that supports service endpoints, you can avoid SNAT port exhaustion issues by using regional VNet Integration and service endpoints or private endpoints.

When you use regional VNet Integration and place service endpoints on the integration subnet, your app outbound traffic to those services will not have outbound SNAT port restrictions. Likewise, if you use regional VNet Integration and private endpoints, you will not have any outbound SNAT port issues to that destination.

Pros:
  • The application will no longer be restricted by SNAT limitations.
  • No code changes or application redeployment required.
  • Low cost, low maintenance configuration that can provide quick mitigation while you evaluate code changes.
Cons:
  • All the endpoints the app connects to must be hosted on Azure.
  • Additional configuration is required on dependent services to enable service endpoints.

VNET integration with NAT Gateway.

If your destination is hosted outside Azure, you can avoid SNAT port exhaustion issues by using VNet Integration and routing the traffic through a NAT gateway.

As the traffic is routed through VNET, it is not subjected to SNAT limits. Outbound calls to the internet endpoint is made from the NAT gateway, at this point, the connection will use one of the 64K SNAT ports pre-allocated to the NAT gateway.

All the applications that are a part of this VNET and are routing traffic through NAT gateway will share these 64K ports. This gives you a lot more than the default 128 pre-allocated SNAT ports when running on app services without this setup.

If your site is already in a VNET, you do not need to change its VNET configuration. Continue with the next step to setup and configure a NAT gateway and route traffic through it.

Note: If the destination endpoint resolves to a public IP address, you may need to configure the app setting WEBSITE_VNET_ROUTE_ALL=1 on your site to force all traffic through VNET, and hence, through the NAT gateway.

Pros:
  • Available SNAT ports are much more than the default 128.
  • No code changes or application redeployment required.
  • This approach works even if the application is connecting to endpoints hosted outside Azure.
Cons:
  • Additional cost for VNET and NAT gateway. Consider this an interim mitigation while you evaluate code changes for a more long term fix.
  • Additional configuration is required outside of Azure App Services.

Leave a Reply

Your email address will not be published. Required fields are marked *