Session Initiation Protocol (SIP) is a communications protocol for Internet telephony, defined by the IETF in RFC 3261 for signaling and controlling multimedia sessions.
SIP architecture
SIP architecture is composed of several components.
- SIP devices such as IP phones and softphones act as a User Agent Client (UAC) when starting a call and as a User Agent Server (UAS) when receiving one.
- In the middle of the signaling path is the SIP proxy, which handles registration and user location.
- A Back-to-Back User Agent (B2BUA) is a server directly connected to a client; it may connect to a SIP proxy or act like one.
SIP is responsible only for signaling, as the name suggests; the media flows directly between the User Agents.
A gateway is needed to connect to the PSTN (Public Switched Telephone Network) or a PBX (Private Branch eXchange), since you cannot insert telephony cards or phone lines into a SIP server.
Registration
The client first registers with the proxy server so that the proxy's UserLocation table records where the client can be reached.
The client sends a REGISTER request with the following information to the proxy:
To: sip:1001@domain1.com
Contact:<sip:234793489234@192.168.1.1:20202>
The server responds with a 401 Unauthorized message that carries a digest authentication challenge, including a nonce.
The client then sends the request again with the following information:
An MD5 response hash computed from the challenge nonce, the username, and the password (combined with the realm, the method, and the URI).
The server now sends a 200 OK.
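The hash in the last step can be sketched as follows. This is a minimal illustration of the legacy digest calculation without the qop option; the password, realm, nonce, and URI values are made up for the example and do not come from a real exchange.

```python
import hashlib


def md5_hex(text):
    """Return the lowercase hex MD5 digest of a UTF-8 string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, nonce, method, uri):
    """Compute the Digest 'response' value (form without qop)."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # who is asking
    ha2 = md5_hex(f"{method}:{uri}")                 # what is being asked
    return md5_hex(f"{ha1}:{nonce}:{ha2}")           # tied to the server's nonce


# Hypothetical values: only the username 1001 and domain1.com appear in the text.
response = digest_response(
    username="1001",
    realm="domain1.com",
    password="secret",
    nonce="abc123",
    method="REGISTER",
    uri="sip:domain1.com",
)
print(response)
```

Because the nonce is part of the hash, a captured response cannot simply be replayed against a fresh challenge, and the password itself never travels over the wire.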
Making the call: Session setup
A SIP ladder diagram shows the signaling as a sequence diagram, while the trapezoid shows the infrastructure topology.
In these diagrams, User A on Domain A is calling User B on Domain B.
The Domain A proxy must query a DNS server to locate the Domain B server and determine its IP address.
For an INVITE request, the proxy immediately responds with 100 Trying.
The Domain B server locates User B using its UserLocation table and sends the request on to User B. This triggers a 180 Ringing response back to User A. When User B answers, the response goes back as 200 OK.
In the INVITE message, the message body carries a Session Description Protocol (SDP) payload that contains the codec information.
The response message also has a message body, which conveys acceptance of the codec information from the SDP payload.
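The codec exchange above can be sketched as a comparison of two SDP bodies. This is only an illustration: the SDP text below is a made-up example, and the helper name `codecs_in_sdp` is hypothetical. It reads only `a=rtpmap` lines, so codecs advertised purely by static payload number would be missed.

```python
def codecs_in_sdp(sdp):
    """Collect codec names from a=rtpmap lines, in order of appearance."""
    codecs = []
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("a=rtpmap:"):
            # Format: a=rtpmap:<payload type> <encoding name>/<clock rate>
            _, encoding = line.split(" ", 1)
            codecs.append(encoding.split("/")[0])
    return codecs


# Made-up offer (from the INVITE) and answer (from the 200 OK).
OFFER_SDP = """
v=0
o=userA 2890844526 2890844526 IN IP4 192.168.1.1
s=-
c=IN IP4 192.168.1.1
t=0 0
m=audio 49170 RTP/AVP 0 8
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
"""

ANSWER_SDP = """
v=0
o=userB 2890844527 2890844527 IN IP4 192.168.2.1
s=-
c=IN IP4 192.168.2.1
t=0 0
m=audio 3456 RTP/AVP 0
a=rtpmap:0 PCMU/8000
"""

offered = codecs_in_sdp(OFFER_SDP)
accepted = codecs_in_sdp(ANSWER_SDP)
# Codecs both sides agreed on, kept in the caller's preference order.
common = [codec for codec in offered if codec in accepted]
print(common)
```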
Now the media session is established, and the session is confirmed with an ACK. If the ACK does not arrive within about 32 seconds (64*T1 with the default timers in RFC 3261), the session is torn down.
When the session is complete, one of the parties sends BYE and the other responds with 200 OK.
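The whole exchange described in this section can be written down as a simple ladder. This is a simplified sketch: responses really travel back through the proxies hop by hop, but they are collapsed here into single arrows, and the names `CALL_FLOW` and `render_ladder` are invented for the example.

```python
# (sender, receiver, message) in the order described in the text.
CALL_FLOW = [
    ("User A", "Proxy A", "INVITE (SDP offer)"),
    ("Proxy A", "User A", "100 Trying"),
    ("Proxy A", "Proxy B", "INVITE"),
    ("Proxy B", "User B", "INVITE"),
    ("User B", "User A", "180 Ringing"),
    ("User B", "User A", "200 OK (SDP answer)"),
    ("User A", "User B", "ACK"),
    # ... media flows between the user agents here ...
    ("User A", "User B", "BYE"),
    ("User B", "User A", "200 OK"),
]


def render_ladder(flow):
    """Format each step as one text line of the ladder."""
    return [f"{src:>8} --{msg}--> {dst}" for src, dst, msg in flow]


ladder = render_ladder(CALL_FLOW)
for line in ladder:
    print(line)
```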
SIP Proxy vs B2BUA (Back to Back User Agent)
SIP Proxy: It is similar to an HTTP proxy and is used by providers for high call volumes.
Examples include OpenSIPS and Kamailio.
A proxy is mostly pass-through: it just changes the address as it forwards. As shown below, it changes the address in the Request-URI.
B2BUA (Back to Back User Agent): It has more overhead but offers features such as call transfer and call queuing. It is used in PBXs such as FreeSWITCH and Asterisk.
It can be said to have two legs: the first call is terminated at the server, and the second call runs from the server to the client. In this case the Contact is different, the Call-ID is different, and the result is two calls bridged on the server.
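The difference can be sketched with plain dictionaries standing in for SIP requests. This is a toy model under stated assumptions, not how OpenSIPS, Kamailio, FreeSWITCH, or Asterisk are implemented; the function names and field names are hypothetical.

```python
import uuid


def proxy_forward(request, located_contact):
    """Proxy style: keep the dialog identifiers, rewrite only the Request-URI."""
    forwarded = dict(request)
    forwarded["request_uri"] = located_contact
    return forwarded


def b2bua_bridge(request, located_contact, server_contact):
    """B2BUA style: leg A ends at the server, and a brand new leg B is created."""
    return {
        "request_uri": located_contact,
        "call_id": uuid.uuid4().hex,  # new Call-ID for the second leg
        "contact": server_contact,    # the server presents itself as the caller
    }


incoming = {
    "request_uri": "sip:1001@domain1.com",
    "call_id": "a84b4c76e66710",
    "contact": "sip:userA@192.168.1.1:20202",
}

proxied = proxy_forward(incoming, "sip:1001@192.168.1.50:5060")
leg_b = b2bua_bridge(incoming, "sip:1001@192.168.1.50:5060", "sip:pbx@10.0.0.5:5060")

print(proxied["call_id"] == incoming["call_id"])  # same call end to end
print(leg_b["call_id"] == incoming["call_id"])    # two separate calls, bridged
```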
Types of NAT (Network Address Translation)
Cone types: one external port (1234 in the examples below) maps to the internal address 10.0.0.1:8000, whatever the destination.
Full Cone: static
The router holds a static mapping between an external address and an internal address.
The fixed mapping, configured manually on the router, will be (Packet In at port 1234 -> 10.0.0.1:8000).
Any computer sending a packet to port 1234 will have it delivered to 10.0.0.1:8000.
Restricted Cone: dynamic, restricts by IP
The internal computer has to start the communication before it can receive packets back, so the IP of an external computer is mapped to the internal address only if a request went out from the internal computer first.
Any computer sending a packet to port 1234 will have it delivered to 10.0.0.1:8000 only if 10.0.0.1 sent a packet to that computer first.
The mapping in the router will be (Packet In at port 1234 from whitelisted IP -> 10.0.0.1:8000), and it is created only after 10.0.0.1 tries to reach the external IP.
Port Restricted Cone: dynamic, restricts by IP and port
This is similar to the above but adds a restriction on ports: the router records both the IP and the port, so traffic from any other port on the external computer cannot reach the internal computer.
Any computer sending a packet to port 1234 will have it delivered to 10.0.0.1:8000 only if 10.0.0.1 sent a packet to that IP:port first.
The mapping in the router will be (Packet In at port 1234 from whitelisted IP && whitelisted port -> 10.0.0.1:8000), and it is created only after 10.0.0.1 tries to reach the external IP and port.
Symmetric NAT
Here a new mapping (port) is generated for each new external destination, so computer 2 at port 3000 is blocked on port 1234 but allowed on port 5678.
So there will be two mappings in this case:
(Packet In at port 1234 from 200.210.1.1:2000 -> 10.0.0.1:8000)
(Packet In at port 5678 from 200.210.1.1:3000 -> 10.0.0.1:8000)
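The four behaviours can be compared with a small simulation. This is a toy model with several assumptions: the class `Nat` and its methods are invented for illustration, external ports are handed out sequentially starting at 1234 (so the second symmetric mapping gets 1235 rather than 5678), and the full cone mapping is created on the first outbound packet instead of being configured by hand.

```python
from dataclasses import dataclass, field


@dataclass
class Nat:
    kind: str  # "full_cone", "restricted", "port_restricted" or "symmetric"
    next_port: int = 1234
    mappings: dict = field(default_factory=dict)  # mapping key -> external port
    sent_to: dict = field(default_factory=dict)   # external port -> contacted (ip, port) set

    def outbound(self, internal, dest):
        """Record an outgoing packet and return the external port it uses."""
        # Symmetric NAT allocates one port per destination; cone NATs reuse one.
        key = (internal, dest) if self.kind == "symmetric" else internal
        if key not in self.mappings:
            self.mappings[key] = self.next_port
            self.next_port += 1
        port = self.mappings[key]
        self.sent_to.setdefault(port, set()).add(dest)
        return port

    def inbound_allowed(self, ext_port, source):
        """Decide whether a packet from source may enter through ext_port."""
        contacted = self.sent_to.get(ext_port)
        if contacted is None:
            return False  # no mapping exists on this port
        if self.kind == "full_cone":
            return True   # anyone may use the mapping
        if self.kind == "restricted":
            return source[0] in {ip for ip, _ in contacted}  # IP must match
        return source in contacted  # port_restricted and symmetric: IP and port


internal = ("10.0.0.1", 8000)
peer = ("200.210.1.1", 2000)
peer_other_port = ("200.210.1.1", 3000)

nat = Nat(kind="symmetric")
port_a = nat.outbound(internal, peer)
port_b = nat.outbound(internal, peer_other_port)
print(port_a, port_b)
print(nat.inbound_allowed(port_a, peer_other_port))
print(nat.inbound_allowed(port_b, peer_other_port))
```

Switching `kind` to one of the cone types and repeating the calls shows the same external port being reused for every destination.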
STUN server
A STUN (Simple Traversal of UDP through NATs) server helps in bridging the call between devices behind a NAT (the NAT must not be a symmetric NAT).
It has two functions:
1. First it discovers if user is behind NAT and what’s the external address of user
2. Then it discovers what type of NAT it is.
TEST 1: Binding request with no flags. The server responds from the same IP:port the request was sent to.
TEST 2: Binding request with the change-IP and change-port flags. The server responds from a different IP and port.
TEST 3: Binding request with just the change-port flag. The server responds from a different port.
To do so, during the binding process (before registration) the device sends a binding request with no flags to the STUN server and learns its external IP from the response.
It then detects the NAT type by sending the change requests and checking whether the responses from the changed address arrive.
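The decision logic behind these tests can be sketched as below. It follows the classic discovery procedure from RFC 3489, which also repeats TEST 1 against the server's alternate address to spot symmetric NATs; that extra step is not in the list above. The function name `classify_nat` and its parameters are hypothetical, and the inputs are assumed to have been collected already by a STUN client.

```python
def classify_nat(local_addr, test1, test2, test1_alt, test3):
    """Classify the NAT from STUN test results.

    local_addr: the device's own (ip, port)
    test1:      mapped (ip, port) reported by TEST 1, or None if no response
    test2:      True if the TEST 2 response (changed IP and port) arrived
    test1_alt:  mapped (ip, port) from TEST 1 repeated against the server's
                alternate address (assumed to have been answered)
    test3:      True if the TEST 3 response (changed port only) arrived
    """
    if test1 is None:
        return "udp_blocked"
    if test1 == local_addr:
        # No address translation at all; a firewall may still filter replies.
        return "open_internet" if test2 else "symmetric_firewall"
    if test2:
        return "full_cone"  # a reply from an unknown IP:port got through
    if test1_alt != test1:
        return "symmetric"  # the mapping changed with the destination
    return "restricted_cone" if test3 else "port_restricted_cone"


local = ("10.0.0.1", 8000)
mapped = ("203.0.113.5", 1234)

print(classify_nat(local, mapped, True, mapped, True))
print(classify_nat(local, mapped, False, ("203.0.113.5", 5678), False))
print(classify_nat(local, mapped, False, mapped, False))
```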
In Full Cone, Restricted Cone, and Port Restricted Cone NATs, the mapping from the NAT device to an internal device stays the same for every destination, so an external device can use that mapping to come back in on the next request (subject to the IP and port restrictions described above). A one-time discovery of the external-facing port is therefore sufficient for subsequent calls to direct traffic to that port.
TURN server
A TURN (Traversal Using Relays around NAT) server helps in bridging the call between devices behind a symmetric NAT.
In symmetric NATs the mapping from the NAT device to an internal device is not static, and inbound traffic is blocked unless traffic to that peer originated from behind the NAT first. An external device cannot directly use a discovered mapping to come back in with RTP packets, because the external-facing port changes for each caller or responder. We therefore need a SIP and RTP media proxy that provides a static rendezvous point: both sides send their traffic to it, and it translates the IP addresses in the packets so the flow works.
Cheers – Amit Tomar