¿What is SNAT?
Source NAT occurs when a internal or private host need to make a connection to an external host.
The device that make the NAT, will be the one in charge to change the private IP of the host and give a public IP
Tuple of connection:
- Source IP
- Source Port
- Destination IP
- Destination port
What would happen if two Vm’s inside the same cloud service create an outgoing connection?
We can’t change the destination IP or destination port. The tuple should be unique so we will need to change something, the option is changing the Source information.
This translation is very common, for example when you connect your phone to a public Internet site through WIFI router, your router does the network address translation too.
SNAT with App Service
It’s very common, the web application need to connect to other endpoints inside or outside of Azure. Like Database, Redis, API’s or any other Restful webservice.
App service web application can’t establish network connections to the external Internet endpoints directly.
This is because of the Architecture of the App Service. We have a scale unit(stamp), the worker instances bounded inside the scale unit. The app services is hosted by one or some App Service worker instances.
The workers don’t have Internet IP addresses assigned. In each stamp we have VIP and is shared among all the workers inside it, we can’t have a dedicated VIP for each worker. Every stamp has a load balancer will be the one in charge to do the Source Network Address Translation, aka SNAT, to connect to external IP addresses.
How does SNAT work?
- App Service application sends a TCP package to an Internet IP address. The source IP address and port number of the package is internal(the IP of the worker)
- The TCP package is routed from a worker instance to the load balancer. The load balancer perform the SNAT, changes the source IP and port of the TCP package into its own ones and sends it out to the Internet.
- The Internet server receives the TCP package. Later when it sends back any package, it uses the IP address and port of the load balancer, as the destination of the package.
- When the load balancer receives a package routed back from the Internet server, it changes the destination IP and port to the ones of the worker instance, by using the mapping record above. The package then can be routed back to the worker instance.
Azure uses an algorithm to determine the number of preallocated SNAT ports available. The algorithm bases the number of ports on the size of the backend pool. You can find more details about it on the following article:
This behavior is usually a result of when the application is not reusing existing connections. Creating an outbound connection per request is bad practice and can lead to connection exhaustion.
How to know if my application is experiencing SNAT port exhausted
When an instance’s SNAT ports are exhausted, the following symptoms can be observed from the application:
- Slow and pending on connecting to the remote endpoint.
- Socket exceptions when the connections timeout in the web application
- Intermittent issues connecting to other resource.
How to solve SNAT issues
At this point you need to make some code changes and follow the best practices to avoid SNAT. To resolve the SNAT issue we need to:
- Modify the application to reuse connections.
- Modify the application to use connection pooling
- Modify the application to use less aggressive retry logic
- Use keepalives to reset the outbound idle timeout
- Ensure the backend services can return response quickly.
- Scale out the App Service plan to more instances
- Use App Service Environment, whose worker instance can have more SNAT ports, due to its smaller instances pool size.
- Stay below 100 parallel connections to the same endpoint.