sip 1.12.11devel
Sofia SIP User Agent Library - "sip" - SIP Parser Module

Module Meta Information

The Sofia sip module contains interface to the SIP parser and the header and message objects.

Contact: Pekka Pessi <Pekka.Pessi@nokia-email.address.hidden>
Status: Sofia SIP Core library
License: LGPL

Overview

The structure of each header is defined in <sofia-sip/sip.h>. In addition to the header structure, a header class structure and some standard functions are defined for each header in the include file <sofia-sip/sip_header.h>. For header X, the types, functions, macros and header class are declared in <sofia-sip/sip_protos.h> and <sofia-sip/sip_hclass.h>. See SIP Header X - Conventions for a detailed description of these header-specific boilerplate declarations.

In addition to this interface, the SIP parser documentation describes the functionality required when the parser is extended with a new header. It is possible to add new headers to the SIP parser or to extend the definition of existing ones.

Parsing SIP Messages

The Sofia SIP parser follows the recursive-descent principle. In other words, it is a program that descends the SIP syntax tree top-down recursively. (All syntax trees have their root at the top and grow downwards.)

In the case of SIP such a parser is very efficient. The parser can choose between different forms based on each token, as SIP syntax is carefully designed so that it requires only minimal scan-ahead. It is also easy to extend a recursive-descent parser via a standard API, unlike, for instance, a LALR parser generated by Bison.

The abstract message module msg contains a high-level parser engine that drives the parsing process and invokes the SIP parser for each header. As there is no framing between SIP messages, the parser considers any received data, be it a UDP datagram or a TCP stream, as a message stream, which may consist of one or more SIP messages. The parser works by first separating the stream into fragments, then building a complete message based on the parsing result. After a message is completed, it can be given to the message stream customer (typically a protocol state machine). The parser continues processing the stream and feeding the messages to the protocol engine until the end of the stream is reached.

For each message, the parser starts by separating the first fragment, which is either a request or status line. After the first line has been processed, the parser engine continues by separating the headers one-by-one from the message. After the parser encounters an empty line separating the headers and the message body (payload), it invokes a function parsing the separator and payload fragment(s). When the message is complete, the parser can hand the message over to the protocol engine. Then it is ready to start again with first fragment of the next message.
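The separation step described above can be sketched in plain C. This is a minimal illustration only, not the msg module's implementation: the names frag_kind, struct frag and split_message() are hypothetical, and the sketch assumes a single complete message in one contiguous buffer, no header folding, and a payload that extends to the end of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fragment kinds; the real parser uses header classes. */
enum frag_kind { FRAG_FIRST_LINE, FRAG_HEADER, FRAG_SEPARATOR, FRAG_PAYLOAD };

struct frag {
  enum frag_kind kind;
  const char *data;  /* points into the message buffer, nothing is copied */
  size_t len;        /* length without the line terminator */
};

/* Split one message into fragments.
 * Returns the number of fragments stored, or -1 if max is too small. */
int split_message(const char *buf, size_t size, struct frag *out, int max)
{
  size_t pos = 0;
  int n = 0;
  int in_headers = 1;

  while (pos < size && in_headers) {
    const char *nl = memchr(buf + pos, '\n', size - pos);
    size_t end = nl ? (size_t)(nl - buf) : size;  /* index of LF, or end */
    size_t len = end - pos;

    if (len > 0 && buf[pos + len - 1] == '\r')
      len--;                                      /* strip the CR of CRLF */
    if (n == max)
      return -1;

    out[n].data = buf + pos;
    out[n].len = len;
    if (n == 0) {
      out[n].kind = FRAG_FIRST_LINE;              /* request or status line */
    } else if (len == 0) {
      out[n].kind = FRAG_SEPARATOR;               /* empty line ends headers */
      in_headers = 0;
    } else {
      out[n].kind = FRAG_HEADER;
    }
    n++;
    pos = nl ? end + 1 : size;
  }

  if (pos < size) {                               /* the rest is payload */
    if (n == max)
      return -1;
    out[n].kind = FRAG_PAYLOAD;
    out[n].data = buf + pos;
    out[n].len = size - pos;
    n++;
  }
  return n;
}
```

Each fragment only records a pointer and a length into the original buffer, which mirrors the idea that fragments refer to data in the I/O buffer rather than owning copies of it.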

Separating byte stream to messages

When the parsing process has completed, the request or status line, each header, the separator and the payload are all in their own fragment structures. The fragments form a dual-linked list known as the fragment chain, as shown in the above figure. The buffers for the message, the fragment chain, and other bookkeeping data are held by the generic message type, msg_t, defined in <sofia-sip/msg.h>. The internal structure of msg_t is known only within the msg module and is hidden from other modules.

The abstract message module msg also drives the reverse process, invoking the encoding method of each fragment so that the whole outgoing SIP message is encoded properly.

SIP Header as a C struct

Just separating headers from each other and from the message body is not usually enough. When a header contains structured data, the header contents should be converted to a form that is convenient to use from C programs. For that purpose, the message parser needs a special function for each individual header. The header-specific parsing function divides the contents of the header into semantically meaningful segments and stores the result in a header-specific structure.

The parser passes the fragment contents to a parsing function immediately after it has separated a fragment from the message. The parsing function is defined by the header class. The header class is either determined by the fragment position (first line, separator line or payload), or it is found from the hash table using the header name as the key. There is also a special header class for unknown headers, that is, headers with a name that is not recognized by the parser.
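The lookup by header name, with a fallback to the unknown class, could look roughly like the following. This is a sketch under stated assumptions: struct hclass, find_class() and the small open-addressing table are hypothetical and far simpler than the real header class structure. Only a handful of headers and their compact forms are registered.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily reduced header class. */
struct hclass {
  const char *name;  /* full header name */
  char compact;      /* compact form, or 0 if there is none */
};

static const struct hclass known[] = {
  { "Via", 'v' }, { "From", 'f' }, { "To", 't' },
  { "Call-ID", 'i' }, { "CSeq", 0 },
};
const struct hclass unknown_class = { "(unknown)", 0 };

#define TABLE_SIZE 16  /* must exceed the number of registered keys */
static const struct hclass *table[TABLE_SIZE];
static int table_ready;

/* Case-insensitive hash, since header names are case-insensitive. */
static unsigned hash_name(const char *s, size_t n)
{
  unsigned h = 0;
  for (size_t i = 0; i < n; i++)
    h = h * 31u + (unsigned)tolower((unsigned char)s[i]);
  return h % TABLE_SIZE;
}

static int same_name(const char *a, size_t n, const char *b)
{
  if (strlen(b) != n)
    return 0;
  for (size_t i = 0; i < n; i++)
    if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
      return 0;
  return 1;
}

/* Insert with linear probing. */
static void table_insert(const char *key, size_t n, const struct hclass *hc)
{
  unsigned i = hash_name(key, n);
  while (table[i])
    i = (i + 1) % TABLE_SIZE;
  table[i] = hc;
}

static void init_table(void)
{
  for (size_t k = 0; k < sizeof known / sizeof known[0]; k++) {
    table_insert(known[k].name, strlen(known[k].name), &known[k]);
    if (known[k].compact)  /* register the compact form as a second key */
      table_insert(&known[k].compact, 1, &known[k]);
  }
  table_ready = 1;
}

/* Find the class for a header name; unknown names get unknown_class. */
const struct hclass *find_class(const char *name, size_t n)
{
  if (!table_ready)
    init_table();
  for (unsigned i = hash_name(name, n); table[i]; i = (i + 1) % TABLE_SIZE) {
    const struct hclass *hc = table[i];
    if (same_name(name, n, hc->name))
      return hc;
    if (n == 1 && hc->compact &&
        tolower((unsigned char)name[0]) == hc->compact)
      return hc;
  }
  return &unknown_class;
}
```

Returning a dedicated unknown class instead of NULL means the caller can treat every fragment uniformly, which is the role the text above gives to the special class for unknown headers.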

For instance, the From header has the following syntax:

from = ("From" | "f") ":"
( name-addr | addr-spec ) *( ";" addr-params )
name-addr = [ display-name ] "<" addr-spec ">"
addr-spec = SIP-URL | URI
display-name = *token | quoted-string
addr-params = *( tag-param | generic-param )
tag-param = "tag" "=" ( token | quoted-string )

When a From header is parsed, the header parser function sip_from_d() separates the display-name, addr-spec and each parameter in the addr-params list. The parsing result is assigned to a sip_from_t structure, which is defined as follows:

typedef struct sip_addr_s {
  sip_common_t       a_common[1];  // Common fragment info
  sip_unknown_t     *a_next;
  char const        *a_display;    // Display name
  url_t              a_url[1];     // URL
  msg_param_t const *a_params;     // Parameter table
  char const        *a_tag;        // Tag parameter
} sip_from_t;

The string containing the display-name is put into the a_display field, the URL contents can be found in the a_url field, and the list of addr-params parameters is put in the a_params array. If there is a tag-param present, a pointer to the parameter value is assigned to a_tag field.
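To make the mapping from syntax to structure concrete, here is a small stand-alone sketch of a From parser. It is not sip_from_d(): the names struct from_hdr and parse_from() are hypothetical, the URL is kept as an unparsed string, and the sketch assumes that parameter values contain no ";" and that a quoted display-name keeps its quotes. Like a recursive-descent parser, it picks the name-addr or addr-spec form by looking at the next characters only.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_PARAMS 8

/* Hypothetical, simplified counterpart of the header structure. */
struct from_hdr {
  const char *display;                 /* display name, or NULL */
  const char *url;                     /* unparsed addr-spec */
  const char *params[MAX_PARAMS + 1];  /* NULL-terminated parameter list */
  const char *tag;                     /* value of the tag parameter, or NULL */
};

static char *skip_ws(char *p)
{
  while (*p == ' ' || *p == '\t')
    p++;
  return p;
}

static void rtrim(char *p)
{
  size_t n = strlen(p);
  while (n > 0 && (p[n - 1] == ' ' || p[n - 1] == '\t'))
    p[--n] = '\0';
}

/* prefix must be lowercase */
static int has_prefix_ci(const char *s, const char *prefix)
{
  for (; *prefix; s++, prefix++)
    if (tolower((unsigned char)*s) != *prefix)
      return 0;
  return 1;
}

/* Parse a From header value in place; s is modified (NUL-terminated
 * pieces). Returns 0 on success, -1 on a syntax error. */
int parse_from(char *s, struct from_hdr *h)
{
  char *p = skip_ws(s);
  char *lt, *rest;
  int n = 0;

  memset(h, 0, sizeof *h);

  if (*p == '"') {                      /* quoted display-name */
    char *q = p + 1;
    while (*q && *q != '"') {
      if (*q == '\\' && q[1])
        q++;                            /* skip the escaped character */
      q++;
    }
    if (*q != '"')
      return -1;
    lt = strchr(q + 1, '<');
    if (!lt)
      return -1;
  } else {
    lt = strchr(p, '<');
  }

  if (lt) {                             /* name-addr form */
    char *gt = strchr(lt + 1, '>');
    if (!gt)
      return -1;
    *lt = '\0';
    *gt = '\0';
    rtrim(p);
    h->display = *p ? p : NULL;
    h->url = lt + 1;
    rest = gt + 1;
  } else {                              /* bare addr-spec form */
    char *end = p + strcspn(p, "; \t");
    if (end == p)
      return -1;
    rest = *end ? end + 1 : end;
    *end = '\0';
    h->url = p;
  }

  for (p = rest; *p; p = rest) {        /* ";"-separated parameters */
    char *end = p + strcspn(p, ";");
    rest = *end ? end + 1 : end;
    *end = '\0';
    p = skip_ws(p);
    rtrim(p);
    if (!*p)
      continue;                         /* empty item, e.g. before first ";" */
    if (n == MAX_PARAMS)
      return -1;
    h->params[n++] = p;
    if (has_prefix_ci(p, "tag="))
      h->tag = p + 4;                   /* points at the parameter value */
  }
  return 0;
}
```

As in the description above, the tag field does not hold a copy: it points into the same string as the corresponding entry in the parameter list.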

SIP Message as a C struct

It is not enough to represent a SIP message as a collection of headers following each other. The programmer also needs a convenient way to access certain headers at the SIP message level, for example, accessing directly the From header instead of going through all headers and examining their name. The structured view to the SIP message is provided via a C struct with type sip_t.

In other words, a single message is represented by two types. The first type (msg_t) is private to the msg module and inaccessible to an application programmer. The second (sip_t) is a public structure containing the parsed headers.

The sip_t structure is defined as follows:

typedef struct sip_s {
  msg_common_t        sip_common[1];    // Used with recursive inclusion
  msg_pub_t          *sip_next;         // Ditto
  void               *sip_user;         // Application data
  unsigned            sip_size;         // Size of structure
  int                 sip_flags;        // Parser flags
  sip_error_t        *sip_error;        // Erroneous headers
  sip_request_t      *sip_request;      // Request line
  sip_status_t       *sip_status;       // Status line
  sip_via_t          *sip_via;          // Via (v)
  sip_route_t        *sip_route;        // Route
  sip_record_route_t *sip_record_route; // Record-Route
  sip_max_forwards_t *sip_max_forwards; // Max-Forwards
  ...
} sip_t;

As you can see above, the public sip_t structure contains the common header members that are also found at the beginning of a header structure. The sip_size member indicates the size of the structure; the application can extend the parser and the sip_t structure beyond the original size. The sip_flags member contains various flags used during the parsing and printing process. They are documented in <sofia-sip/msg.h>. These boilerplate members are followed by the pointers to various message elements and headers.

Note
Within the msg module, the public structure is known as msg_pub_t. The application programmer can cast a msg_t pointer to sip_t with sip_object() function (or macro).

Result of Parsing Process

Let us now show how a simple message is parsed and presented to the application. As an example, we choose a BYE message with only the mandatory fields included:

BYE sip:joe@example.com SIP/2.0
Via: SIP/2.0/UDP sip.example.edu;branch=d7f2e89c.74a72681
Via: SIP/2.0/UDP pc104.example.edu:1030;maddr=110.213.33.19
From: Bobby Brown <sip:bb@example-email.address.hidden>;tag=77241a86
To: Joe User <sip:joe@example-email.address.hidden>;tag=7c6276c1
Call-ID: 4c4e911b@pc104.example.edu
CSeq: 2 BYE

The figure below shows the layout of the BYE message above after parsing:

BYE message and its representation in C

The leftmost box represents the message of type msg_t. The next box from the left represents the sip_t structure, which contains pointers to the header objects. The next column contains the header objects. There is one header object for each message fragment. The rightmost box represents the I/O buffer used when the message was received. Note that the I/O buffer may be non-contiguous and composed of many separate memory areas.

The message object has links to the public message structure (m_object), to the dual-linked fragment chain (m_frags) and to the I/O buffer (m_buffer). The public message header structure contains pointers to the headers according to their type. If there are multiple headers of the same type (as with the two Via headers in the above message), the headers are put into a single-linked list.

Each fragment has pointers to the succeeding and preceding fragments. It also contains a pointer to the corresponding data within the I/O buffer and the length of that data.

The main purpose of the fragment chain is to preserve the original order of the headers. If there were a third Via header after CSeq in the message, the fragment representing it would come after the CSeq header in the fragment chain, but directly after the second Via in the header list.
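The difference between the two orderings can be illustrated with a small model. The types and names below (struct hdr, struct message, append_header()) are hypothetical and only mimic the idea: every header is linked into the dual-linked fragment chain in arrival order, and additionally into a single-linked list of headers of the same type.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct hdr {
  const char *name;   /* header name, e.g. "Via" */
  const char *value;
  struct hdr *succ;   /* next fragment in message order */
  struct hdr *prev;   /* previous fragment in message order */
  struct hdr *next;   /* next header of the same type */
};

struct message {
  struct hdr *frags_head, *frags_tail;  /* fragment chain */
  struct hdr *via;                      /* stand-ins for per-type pointers */
  struct hdr *cseq;
};

/* Pick the per-type list for a header name; NULL if not tracked here. */
static struct hdr **slot_for(struct message *m, const char *name)
{
  if (strcmp(name, "Via") == 0)
    return &m->via;
  if (strcmp(name, "CSeq") == 0)
    return &m->cseq;
  return NULL;
}

void append_header(struct message *m, struct hdr *h)
{
  struct hdr **slot;

  /* 1. Append to the fragment chain, preserving arrival order. */
  h->succ = NULL;
  h->next = NULL;
  h->prev = m->frags_tail;
  if (m->frags_tail)
    m->frags_tail->succ = h;
  else
    m->frags_head = h;
  m->frags_tail = h;

  /* 2. Append to the list of headers with the same type. */
  slot = slot_for(m, h->name);
  if (slot) {
    while (*slot)
      slot = &(*slot)->next;
    *slot = h;
  }
}
```

Appending Via, Via, CSeq and then a third Via yields the chain Via, Via, CSeq, Via, while following the next pointers from the first Via visits the three Via headers consecutively.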


Sofia-SIP 1.12.11devel - Copyright (C) 2006 Nokia Corporation. All rights reserved. Licensed under the terms of the GNU Lesser General Public License.