Section 16.2. Socket Descriptors

A socket is an abstraction of a communication endpoint. Just as they would use file descriptors to access a file, applications use socket descriptors to access sockets. Socket descriptors are implemented as file descriptors in the UNIX System. Indeed, many of the functions that deal with file descriptors, such as read and write, will work with a socket descriptor.

To create a socket, we call the socket function.

#include <sys/socket.h>

int socket(int domain, int type, int protocol);

Returns: file (socket) descriptor if OK, -1 on error

The domain argument determines the nature of the communication, including the address format (described in more detail in the next section). Figure 16.1 summarizes the domains specified by POSIX.1. The constants start with AF_ (for address family) because each domain has its own format for representing an address.

Figure 16.1. Socket communication domains

Domain       Description
AF_INET      IPv4 Internet domain
AF_INET6     IPv6 Internet domain
AF_UNIX      UNIX domain
AF_UNSPEC    unspecified

We discuss the UNIX domain in Section 17.3. Most systems also define the AF_LOCAL domain, which is an alias for AF_UNIX. The AF_UNSPEC domain is a wildcard that represents "any" domain. Historically, some platforms have provided support for additional network protocols, such as AF_IPX for the NetWare protocol family, but domain constants for these protocols are not defined by the POSIX.1 standard.

The type argument determines the type of the socket, which further determines the communication characteristics. The socket types defined by POSIX.1 are summarized in Figure 16.2, but implementations are free to add support for additional types.

Figure 16.2. Socket types

Type              Description
SOCK_DGRAM        fixed-length, connectionless, unreliable messages
SOCK_RAW          datagram interface to IP (optional in POSIX.1)
SOCK_SEQPACKET    fixed-length, sequenced, reliable, connection-oriented messages
SOCK_STREAM       sequenced, reliable, bidirectional, connection-oriented byte streams

The protocol argument is usually zero, to select the default protocol for the given domain and socket type. When multiple protocols are supported for the same domain and socket type, we can use the protocol argument to select a particular protocol. The default protocol for a SOCK_STREAM socket in the AF_INET communication domain is TCP (Transmission Control Protocol). The default protocol for a SOCK_DGRAM socket in the AF_INET communication domain is UDP (User Datagram Protocol).
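As a minimal sketch of the call described above, the following helpers create a socket with the default protocol and then ask the system which type the descriptor has. The names make_socket and socket_type are hypothetical, introduced only for illustration; getsockopt with the SO_TYPE option is a standard call that this section does not otherwise cover.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a socket, passing 0 as the protocol so
 * that the default for this domain and type is selected.
 * Returns the descriptor, or -1 with errno set by socket(). */
int make_socket(int domain, int type)
{
    return socket(domain, type, 0);
}

/* Hypothetical helper: report the SOCK_* type of an existing socket.
 * Returns the type, or -1 if getsockopt() fails. */
int socket_type(int fd)
{
    int type;
    socklen_t len = sizeof(type);

    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &type, &len) == -1)
        return -1;
    return type;
}
```

Under these assumptions, make_socket(AF_INET, SOCK_STREAM) would request a TCP socket and make_socket(AF_INET, SOCK_DGRAM) a UDP socket. In either case the caller should compare the result against -1 before using it and call close on the descriptor when done.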

With a datagram (SOCK_DGRAM) interface, no logical connection needs to exist between peers for them to communicate. All you need to do is send a message addressed to the socket being used by the peer process. A datagram, therefore, provides a connectionless service.

A byte stream (SOCK_STREAM), on the other hand, requires that, before you can exchange data, you set up a logical connection between your socket and the socket belonging to the peer you want to communicate with.

A datagram is a self-contained message. Sending a datagram is analogous to mailing someone a letter. You can mail many letters, but you can't guarantee the order of delivery, and some might get lost along the way. Each letter contains the address of the recipient, making the letter independent from all the others. Each letter can even go to different recipients.

In contrast, using a connection-oriented protocol for communicating with a peer is like making a phone call. First, you need to establish a connection by placing a phone call, but after the connection is in place, you can communicate bidirectionally with each other. The connection is a peer-to-peer communication channel over which you talk. Your words contain no addressing information, as a point-to-point virtual connection exists between both ends of the call, and the connection itself implies a particular source and destination.

With a SOCK_STREAM socket, applications are unaware of message boundaries, since the socket provides a byte stream service. This means that when we read data from a socket, it might not return the same number of bytes written by the process sending us data. We will eventually get everything sent to us, but it might take several function calls.

A SOCK_SEQPACKET socket is just like a SOCK_STREAM socket except that we get a message-based service instead of a byte-stream service. This means that the amount of data received from a SOCK_SEQPACKET socket is the same amount as was written. The Stream Control Transmission Protocol (SCTP) provides a sequential packet service in the Internet domain.
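The difference in message boundaries can be observed with a small experiment. This sketch assumes a connected pair of UNIX domain sockets created with socketpair, a function not introduced in this section, and assumes that the platform supports the requested type in the UNIX domain (SOCK_SEQPACKET is available there on Linux, for example, but not everywhere). The name first_recv_size is hypothetical.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: send a 3-byte message and then a 5-byte message
 * over a UNIX domain socket pair of the given type, then receive once
 * into a buffer large enough for both.
 * Returns the size of that first receive, or -1 on error. */
ssize_t first_recv_size(int type)
{
    int sv[2];
    char buf[64];
    ssize_t n = -1;

    if (socketpair(AF_UNIX, type, 0, sv) == -1)
        return -1;               /* type may be unsupported here */

    if (send(sv[0], "abc", 3, 0) == 3 &&
        send(sv[0], "defgh", 5, 0) == 5)
        n = recv(sv[1], buf, sizeof(buf), 0);

    close(sv[0]);
    close(sv[1]);
    return n;
}
```

With SOCK_SEQPACKET, the first receive returns exactly the 3 bytes of the first message. With SOCK_STREAM, the same call may return any count from 1 to 8, since the two writes are just part of one byte stream.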

A SOCK_RAW socket provides a datagram interface directly to the underlying network layer (which means IP in the Internet domain). Applications are responsible for building their own protocol headers when using this interface, because the transport protocols (TCP and UDP, for example) are bypassed. Superuser privileges are required to create a raw socket to prevent malicious applications from creating packets that might bypass established security mechanisms.

Calling socket is similar to calling open. In both cases, you get a file descriptor that can be used for I/O. When you are done using the file descriptor, you call close to relinquish access to the file or socket and free up the file descriptor for reuse.

Although a socket descriptor is actually a file descriptor, you can't use a socket descriptor with every function that accepts a file descriptor argument. Figure 16.3 summarizes most of the functions we've described so far that are used with file descriptors and describes how they behave when used with a socket descriptor. Unspecified and implementation-defined behavior usually means that the function doesn't work with socket descriptors. For example, lseek doesn't work with sockets, since sockets don't support the concept of a file offset.

Figure 16.3. How file descriptor functions act with sockets

Function                                        Behavior with socket
close (Section 3.3)                             deallocates the socket
dup, dup2 (Section 3.12)                        duplicates the file descriptor as normal
fchdir (Section 4.22)                           fails with errno set to ENOTDIR
fchmod (Section 4.9)                            unspecified
fchown (Section 4.11)                           implementation defined
fcntl (Section 3.14)                            some commands supported, including F_DUPFD, F_GETFD, F_GETFL, F_GETOWN, F_SETFD, F_SETFL, and F_SETOWN
fdatasync, fsync (Section 3.13)                 implementation defined
fstat (Section 4.2)                             some stat structure members supported, but how left up to the implementation
ftruncate (Section 4.13)                        unspecified
getmsg, getpmsg (Section 14.4)                  works if sockets are implemented with STREAMS (i.e., on Solaris)
ioctl (Section 3.15)                            some commands work, depending on underlying device driver
lseek (Section 3.6)                             implementation defined (usually fails with errno set to ESPIPE)
mmap (Section 14.9)                             unspecified
poll (Section 14.5.2)                           works as expected
putmsg, putpmsg (Section 14.4)                  works if sockets are implemented with STREAMS (i.e., on Solaris)
read (Section 3.7) and readv (Section 14.7)     equivalent to recv (Section 16.5) without any flags
select (Section 14.5.1)                         works as expected
write (Section 3.8) and writev (Section 14.7)   equivalent to send (Section 16.5) without any flags

Communication on a socket is bidirectional. We can disable I/O on a socket with the shutdown function.

#include <sys/socket.h>

int shutdown(int sockfd, int how);

Returns: 0 if OK, -1 on error

If how is SHUT_RD, then reading from the socket is disabled. If how is SHUT_WR, then we can't use the socket for transmitting data. We can use SHUT_RDWR to disable both data transmission and reception.
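The effect of SHUT_WR can be sketched with both ends of a connection held in one process. This example assumes a connected pair of UNIX domain stream sockets obtained from socketpair, which is not introduced in this section; the name half_close_demo is hypothetical.

```c
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo of shutting down one direction of a connection.
 * Returns 0 if every step behaves as described, -1 otherwise. */
int half_close_demo(void)
{
    int sv[2];
    char buf[8];
    int ok = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    /* Send a request, then declare that we will send nothing more. */
    if (write(sv[0], "req", 3) != 3 || shutdown(sv[0], SHUT_WR) == -1)
        goto out;

    /* The peer receives the data and then sees end of file. */
    if (read(sv[1], buf, sizeof(buf)) != 3 ||
        read(sv[1], buf, sizeof(buf)) != 0)
        goto out;

    /* The other direction is still open, so the peer can reply. */
    if (write(sv[1], "ok", 2) != 2 ||
        read(sv[0], buf, sizeof(buf)) != 2)
        goto out;

    ok = (memcmp(buf, "ok", 2) == 0) ? 0 : -1;
out:
    close(sv[0]);
    close(sv[1]);
    return ok;
}
```

After the SHUT_WR call, the descriptor sv[0] remains valid for reading; only transmission from that end has been disabled.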

Given that we can close a socket, why is shutdown needed? There are several reasons. First, close will deallocate the network endpoint only when the last active reference is closed. This means that if we duplicate the socket (with dup, for example), the socket won't be deallocated until we close the last file descriptor referring to it. The shutdown function allows us to deactivate a socket independently of the number of active file descriptors referencing it. Second, it is sometimes convenient to shut a socket down in one direction only. For example, we can shut a socket down for writing if we want the process we are communicating with to be able to determine when we are done transmitting data, while still allowing us to use the socket to receive data sent to us by the process.
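The first reason can be illustrated by duplicating a descriptor and comparing close with shutdown. This is a sketch under stated assumptions: it uses socketpair to get a connected pair of UNIX domain stream sockets, and the helpers at_eof and dup_demo are hypothetical names. The check relies on poll with a zero timeout so that it never blocks.

```c
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a read on fd would report end of
 * file right now, 0 if not, -1 on error. Never blocks. */
int at_eof(int fd)
{
    struct pollfd p = { .fd = fd, .events = POLLIN };
    char c;
    ssize_t n;
    int r = poll(&p, 1, 0);

    if (r == -1)
        return -1;
    if (r == 0 || !(p.revents & (POLLIN | POLLHUP)))
        return 0;                        /* nothing to read yet */
    n = recv(fd, &c, 1, MSG_PEEK);       /* look without consuming */
    if (n == -1)
        return -1;
    return n == 0;
}

/* Hypothetical demo: returns 0 if closing one of two duplicated
 * descriptors leaves the connection open while shutdown ends
 * transmission, -1 otherwise. */
int dup_demo(void)
{
    int sv[2], copy, ok = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    copy = dup(sv[0]);
    if (copy == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    close(sv[0]);                        /* copy still refers to the socket */
    if (at_eof(sv[1]) != 0)
        goto out;                        /* peer should not see EOF yet */

    if (shutdown(copy, SHUT_WR) == -1)
        goto out;
    if (at_eof(sv[1]) != 1)
        goto out;                        /* now the peer sees EOF */
    ok = 0;
out:
    close(copy);
    close(sv[1]);
    return ok;
}
```

In this sketch, closing sv[0] has no visible effect on the peer because the duplicate keeps the socket alive, whereas shutdown acts on the socket itself no matter how many descriptors refer to it.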