14.4. STREAMS
The STREAMS mechanism is provided by System V as a general way to interface communication drivers into the kernel. We need to discuss STREAMS to understand the terminal interface in System V, the use of the poll function for I/O multiplexing (Section 14.5.2), and the implementation of STREAMS-based pipes and named pipes (Sections 17.2 and 17.2.1).
A stream provides a full-duplex path between a user process and a device driver. There is no need for a stream to talk to a hardware device; a stream can also be used with a pseudo-device driver. Figure 14.13 shows the basic picture for what is called a simple stream.
Figure 14.13. A simple stream
Beneath the stream head, we can push processing modules onto the stream. This is done using an ioctl command. Figure 14.14 shows a stream with a single processing module. We also show the connection between these boxes with two arrows to stress the full-duplex nature of streams and to emphasize that the processing in one direction is separate from the processing in the other direction.
Figure 14.14. A stream with a processing module
Any number of processing modules can be pushed onto a stream. We use the term push, because each new module goes beneath the stream head, pushing any previously pushed modules down. (This is similar to a last-in, first-out stack.) In Figure 14.14, we have labeled the downstream and upstream sides of the stream. Data that we write to a stream head is sent downstream. Data read by the device driver is sent upstream.
STREAMS modules are similar to device drivers in that they execute as part of the kernel, and they are normally link edited into the kernel when the kernel is built. If the system supports dynamically loadable kernel modules (as do Linux and Solaris), then we can take a STREAMS module that has not been link edited into the kernel and try to push it onto a stream; however, there is no guarantee that arbitrary combinations of modules and drivers will work properly together.
We access a stream with the functions from Chapter 3: open, close, read, write, and ioctl. Additionally, three new functions were added to the SVR3 kernel to support STREAMS (getmsg, putmsg, and poll), and another two (getpmsg and putpmsg) were added with SVR4 to handle messages with different priority bands within a stream. We describe these five new functions later in this chapter.
The pathname that we open for a stream normally lives beneath the /dev directory. Simply looking at the device name using ls -l, we can't tell whether the device is a STREAMS device. All STREAMS devices are character special files.
Although some STREAMS documentation implies that we can write processing modules and push them willy-nilly onto a stream, the writing of these modules requires the same skills and care as writing a device driver. Generally, only specialized applications or functions push and pop STREAMS modules.
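The last-in, first-out ordering of pushed modules can be illustrated with a small user-space model. This is only a sketch of the ordering, not of the kernel mechanism: the type stream_model, the functions model_push and model_pop, and the limit MAX_MODULES are hypothetical names invented for this illustration. On a real STREAMS system, modules are pushed and popped with ioctl requests on an open descriptor.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_MODULES 8           /* arbitrary limit for this model */

/* User-space model of the modules between a stream head and its driver.
 * names[0] is the module directly beneath the stream head. */
struct stream_model {
    const char *names[MAX_MODULES];
    size_t      count;
};

/* Model of a push: the new module goes directly beneath the stream
 * head, moving every previously pushed module one position down.
 * Returns 0 on success, -1 if the model is full. */
int
model_push(struct stream_model *s, const char *name)
{
    assert(s != NULL && name != NULL);
    if (s->count == MAX_MODULES)
        return(-1);
    memmove(&s->names[1], &s->names[0], s->count * sizeof(s->names[0]));
    s->names[0] = name;
    s->count++;
    return(0);
}

/* Model of a pop: removes the module directly beneath the stream head.
 * Returns its name, or NULL if no module is present. */
const char *
model_pop(struct stream_model *s)
{
    const char *top;

    assert(s != NULL);
    if (s->count == 0)
        return(NULL);
    top = s->names[0];
    s->count--;
    memmove(&s->names[0], &s->names[1], s->count * sizeof(s->names[0]));
    return(top);
}
```

In this model, pushing "ptem", then "ldterm", then "ttcompat" leaves "ttcompat" directly beneath the stream head and "ptem" closest to the driver; popping returns the names in the reverse of the order in which they were pushed.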
STREAMS Messages
All input and output under STREAMS is based on messages. The stream head and the user process exchange messages using read, write, ioctl, getmsg, getpmsg, putmsg, and putpmsg. Messages are also passed up and down a stream between the stream head, the processing modules, and the device driver.
Between the user process and the stream head, a message consists of a message type, optional control information, and optional data. We show in Figure 14.15 how the various message types are generated by the arguments to write, putmsg, and putpmsg. The control information and data are specified by strbuf structures:

    struct strbuf {
        int   maxlen;  /* size of buffer */
        int   len;     /* number of bytes currently in buffer */
        char *buf;     /* pointer to buffer */
    };

When we send a message with putmsg or putpmsg, len specifies the number of bytes of data in the buffer. When we receive a message with getmsg or getpmsg, maxlen specifies the size of the buffer (so the kernel won't overflow the buffer), and len is set by the kernel to the amount of data stored in the buffer. We'll see that a zero-length message is OK and that a len of -1 can specify that there is no control or data.
Why do we need to pass both control information and data? Providing both allows us to implement service interfaces between a user process and a stream. Olander, McGrath, and Israel [1986] describe the original implementation of service interfaces in System V. Chapter 5 of AT&T [1990d] describes service interfaces in detail, along with a simple example. Probably the best-known service interface, described in Chapter 4 of Rago [1993], is the System V Transport Layer Interface (TLI), which provides an interface to the networking system.
Another example of control information is sending a connectionless network message (a datagram). To send the message, we need to specify the contents of the message (the data) and the destination address for the message (the control information). If we couldn't send control and data together, some ad hoc scheme would be required. For example, we could specify the address using an ioctl, followed by a write of the data. Another technique would be to require that the address occupy the first N bytes of the data that is written using write. Separating the control information from the data and providing functions that handle both (putmsg and getmsg) is a cleaner way to handle this.
There are about 25 different types of messages, but only a few of these are used between the user process and the stream head. The rest are passed up and down a stream within the kernel. (These message types are of interest to people writing STREAMS processing modules, but can safely be ignored by people writing user-level code.)
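The len conventions for strbuf and the control/data split of a datagram can be made concrete with a short sketch. It assumes nothing about the kernel: struct strbuf_model is a local copy of the layout shown earlier, renamed so that it cannot clash with a system definition, and classify_part and describe_datagram are hypothetical helpers invented for the illustration. Treating a null pointer as "absent" follows the rule this section gives for the ctlptr and dataptr arguments of putmsg.

```c
#include <assert.h>
#include <stddef.h>

/* Local mirror of the strbuf layout (hypothetical name). */
struct strbuf_model {
    int   maxlen;   /* size of buffer */
    int   len;      /* number of bytes currently in buffer */
    char *buf;      /* pointer to buffer */
};

enum part_state { PART_ABSENT, PART_EMPTY, PART_PRESENT };

/* Classify the control or data part of an outgoing message.
 * A null pointer or a len of -1 means the part is absent; a len of 0
 * is a legitimate zero-length part; anything larger carries bytes. */
enum part_state
classify_part(const struct strbuf_model *p)
{
    if (p == NULL || p->len == -1)
        return(PART_ABSENT);
    assert(p->len >= 0);
    return((p->len == 0) ? PART_EMPTY : PART_PRESENT);
}

/* Describe a datagram as a control part (the destination address)
 * plus a data part (the contents), the split motivated in the text.
 * For sending, len is the field that matters; maxlen is filled in
 * only so that no member is left uninitialized. */
void
describe_datagram(struct strbuf_model *ctl, struct strbuf_model *dat,
                  char *addr, int addrlen, char *payload, int paylen)
{
    assert(ctl != NULL && dat != NULL);
    assert(addrlen >= 0 && paylen >= 0);
    ctl->buf = addr;
    ctl->len = addrlen;
    ctl->maxlen = addrlen;
    dat->buf = payload;
    dat->len = paylen;
    dat->maxlen = paylen;
}
```

With an address of 6 bytes and a payload of 4 bytes, both parts classify as PART_PRESENT; setting a len to 0 gives PART_EMPTY, and a len of -1 (or a null pointer) gives PART_ABSENT.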
We'll encounter only three of these message types with the functions we use (read, write, getmsg, getpmsg, putmsg, and putpmsg):
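Every message on a stream has a queueing priority: high-priority messages (the highest priority), priority band messages, or ordinary messages (the lowest priority).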
Ordinary messages are simply priority band messages with a band of 0. Priority band messages have a band of 1 to 255, with a higher band specifying a higher priority. High-priority messages are special in that only one is queued by the stream head at a time. Additional high-priority messages are discarded when one is already on the stream head's read queue.
Each STREAMS module has two input queues. One receives messages from the module above (messages moving downstream from the stream head toward the driver), and one receives messages from the module below (messages moving upstream from the driver toward the stream head). The messages on an input queue are arranged by priority. We show in Figure 14.15 how the arguments to write, putmsg, and putpmsg cause these various priority messages to be generated.
There are other types of messages that we don't consider. For example, if the stream head receives an M_SIG message from below, it generates a signal. This is how a terminal line discipline module sends the terminal-generated signals to the foreground process group associated with a controlling terminal.
putmsg and putpmsg Functions
A STREAMS message (control information or data, or both) is written to a stream using either putmsg or putpmsg. The difference between these two functions is that the latter allows us to specify a priority band for the message.
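The queueing rules described earlier (ordinary messages in band 0, bands 1 to 255 in increasing priority, and at most one high-priority message held by the stream head) can be modeled in user space. The sketch below is not how a kernel implements its queues: the types msg_model and readq_model, the functions msg_rank and readq_put, and the capacity QUEUE_MAX are all hypothetical. It also assumes first-in, first-out order among messages of equal priority, which the text does not state.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_MAX 16            /* arbitrary capacity for this model */

struct msg_model {
    int hipri;  /* nonzero for a high-priority message */
    int band;   /* 0 = ordinary, 1..255 = priority band; unused if hipri */
    int id;     /* caller-chosen tag to tell messages apart */
};

struct readq_model {
    struct msg_model msgs[QUEUE_MAX];   /* msgs[0] is read first */
    size_t           count;
};

/* Ordering key: a high-priority message ranks above every band. */
static int
msg_rank(const struct msg_model *m)
{
    return(m->hipri ? 256 : m->band);
}

/* Place a message on the model of the stream head's read queue.
 * Returns 1 if queued, 0 if discarded (a second high-priority
 * message, or the model's queue is full). */
int
readq_put(struct readq_model *q, struct msg_model m)
{
    size_t pos, i;

    assert(q != NULL);
    assert(m.hipri || (m.band >= 0 && m.band <= 255));
    /* A queued high-priority message is always at the front. */
    if (m.hipri && q->count > 0 && q->msgs[0].hipri)
        return(0);
    if (q->count == QUEUE_MAX)
        return(0);
    /* Insert behind every message of equal or higher rank
     * (assumed FIFO order within one priority). */
    pos = 0;
    while (pos < q->count && msg_rank(&q->msgs[pos]) >= msg_rank(&m))
        pos++;
    for (i = q->count; i > pos; i--)
        q->msgs[i] = q->msgs[i - 1];
    q->msgs[pos] = m;
    q->count++;
    return(1);
}
```

Queueing an ordinary message, a band 5 message, and a high-priority message, then attempting a second high-priority message and finally a band 9 message, leaves four messages in the model in the order: high priority, band 9, band 5, ordinary. The second high-priority message is discarded.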
We can also write to a stream, which is equivalent to a putmsg without any control information and with a flag of 0.
These two functions can generate the three different priorities of messages: ordinary, priority band, and high priority. Figure 14.15 details the combinations of the arguments to these two functions that generate the various types of messages.
The notation "N/A" means not applicable. In this figure, a "no" for the control portion of the message corresponds to either a null ctlptr argument or ctlptr->len being -1. A "yes" for the control portion corresponds to ctlptr being non-null and ctlptr->len being greater than or equal to 0. The data portion of the message is handled equivalently (using dataptr instead of ctlptr).
STREAMS ioctl Operations
In Section 3.15, we said that the ioctl function is the catchall for anything that can't be done with the other I/O functions. The STREAMS system continues this tradition.
Between Linux and Solaris, there are almost 40 different operations that can be performed on a stream using ioctl. Most of these operations are documented in the streamio(7) manual page. The header <stropts.h> must be included in C code that uses any of these operations. The second argument for ioctl, request, specifies which of the operations to perform. All the requests begin with I_. The third argument depends on the request. Sometimes, the third argument is an integer value; sometimes, it's a pointer to an integer or a structure.
Example: isastream Function
We sometimes need to determine if a descriptor refers to a stream or not. This is similar to calling the isatty function to determine if a descriptor refers to a terminal device (Section 18.9). Linux and Solaris provide the isastream function.
Like isatty, this is usually a trivial function that merely tries an ioctl that is valid only on a STREAMS device. Figure 14.16 shows one possible implementation of this function. We use the I_CANPUT ioctl command, which checks if the band specified by the third argument (0 in the example) is writable. If the ioctl succeeds, the stream is not changed. We can use the program in Figure 14.17 to test this function. Running this program on Solaris 9 shows the various errors returned by the ioctl function:
$ ./a.out /dev/tty /dev/fb /dev/null /etc/motd
/dev/tty: streams device
/dev/fb: not a stream: Invalid argument
/dev/null: not a stream: No such device or address
/etc/motd: not a stream: Inappropriate ioctl for device
Note that /dev/tty is a STREAMS device, as we expect under Solaris. The character special file /dev/fb is not a STREAMS device, but it supports other ioctl requests. These devices return EINVAL when the ioctl request is unknown. The character special file /dev/null does not support any ioctl operations, so the error ENODEV is returned. Finally, /etc/motd is a regular file, not a character special file, so the classic error ENOTTY is returned. We never receive the error we might expect: ENOSTR ("Device is not a stream").
Figure 14.16. Check if descriptor is a STREAMS device

    #include <stropts.h>
    #include <unistd.h>

    int
    isastream(int fd)
    {
        return(ioctl(fd, I_CANPUT, 0) != -1);
    }

Figure 14.17. Test the isastream function

    #include "apue.h"
    #include <fcntl.h>

    int
    main(int argc, char *argv[])
    {
        int     i, fd;

        for (i = 1; i < argc; i++) {
            if ((fd = open(argv[i], O_RDONLY)) < 0) {
                err_ret("%s: can't open", argv[i]);
                continue;
            }
            if (isastream(fd) == 0)
                err_ret("%s: not a stream", argv[i]);
            else
                err_msg("%s: streams device", argv[i]);
        }
        exit(0);
    }

Example
If the ioctl request is I_LIST, the system returns the names of all the modules on the stream: the ones that have been pushed onto the stream, including the topmost driver. (We say topmost because in the case of a multiplexing driver, there may be more than one driver. Chapter 12 of Rago [1993] discusses multiplexing drivers in detail.) The third argument must be a pointer to a str_list structure:

    struct str_list {
        int                sl_nmods;    /* number of entries in array */
        struct str_mlist  *sl_modlist;  /* ptr to first element of array */
    };

We have to set sl_modlist to point to the first element of an array of str_mlist structures and set sl_nmods to the number of entries in the array:

    struct str_mlist {
        char l_name[FMNAMESZ+1];  /* null terminated module name */
    };

The constant FMNAMESZ is defined in the header <sys/conf.h> and is often 8. The extra byte in l_name is for the terminating null byte.
If the third argument to the ioctl is 0, the count of the number of modules is returned (as the value of ioctl) instead of the module names. We'll use this to determine the number of modules and then allocate the required number of str_mlist structures.
Figure 14.18 illustrates the I_LIST operation. Since the returned list of names doesn't differentiate between the modules and the driver, when we print the module names, we know that the final entry in the list is the driver at the bottom of the stream.
If we run the program in Figure 14.18 from both a network login and a console login, to see which STREAMS modules are pushed onto the controlling terminal, we get the following:

    $ who
    sar        console      May  1 18:27
    sar        pts/7        Jul 12 06:53
    $ ./a.out /dev/console
    #modules = 5
     module: redirmod
     module: ttcompat
     module: ldterm
     module: ptem
     driver: pts
    $ ./a.out /dev/pts/7
    #modules = 4
     module: ttcompat
     module: ldterm
     module: ptem
     driver: pts

The modules are the same in both cases, except that the console has an extra module on top that helps with virtual console redirection. On this computer, a windowing system was running on the console, so /dev/console actually refers to a pseudo terminal instead of to a hardwired device. We'll return to the pseudo terminal case in Chapter 19.
Figure 14.18. List the names of the modules on a stream

    #include "apue.h"
    #include <fcntl.h>
    #include <stropts.h>
    #include <sys/conf.h>

    int
    main(int argc, char *argv[])
    {
        int                 fd, i, nmods;
        struct str_list     list;

        if (argc != 2)
            err_quit("usage: %s <pathname>", argv[0]);

        if ((fd = open(argv[1], O_RDONLY)) < 0)
            err_sys("can't open %s", argv[1]);
        if (isastream(fd) == 0)
            err_quit("%s is not a stream", argv[1]);

        /*
         * Fetch number of modules.
         */
        if ((nmods = ioctl(fd, I_LIST, (void *) 0)) < 0)
            err_sys("I_LIST error for nmods");
        printf("#modules = %d\n", nmods);

        /*
         * Allocate storage for all the module names.
         */
        list.sl_modlist = calloc(nmods, sizeof(struct str_mlist));
        if (list.sl_modlist == NULL)
            err_sys("calloc error");
        list.sl_nmods = nmods;

        /*
         * Fetch the module names.
         */
        if (ioctl(fd, I_LIST, &list) < 0)
            err_sys("I_LIST error for list");

        /*
         * Print the names.
         */
        for (i = 1; i <= nmods; i++)
            printf(" %s: %s\n", (i == nmods) ? "driver" : "module",
              list.sl_modlist++->l_name);

        exit(0);
    }

write to STREAMS Devices
In Figure 14.15 we said that a write to a STREAMS device generates an M_DATA message. Although this is generally true, there are some additional details to consider.
First, with a stream, the topmost processing module specifies the minimum and maximum packet sizes that can be sent downstream. (We are unable to query the module for these values.) If we write more than the maximum, the stream head normally breaks the data into packets of the maximum size, with one final packet that can be smaller than the maximum.
The next thing to consider is what happens if we write zero bytes to a stream. Unless the stream refers to a pipe or FIFO, a zero-length message is sent downstream. With a pipe or FIFO, the default is to ignore the zero-length write, for compatibility with previous versions. We can change this default for pipes and FIFOs using an ioctl to set the write mode for the stream.
Write Mode
Two ioctl commands fetch and set the write mode for a stream. Setting request to I_GWROPT requires that the third argument be a pointer to an integer, and the current write mode for the stream is returned in that integer. If request is I_SWROPT, the third argument is an integer whose value becomes the new write mode for the stream. As with the file descriptor flags and the file status flags (Section 3.14), we should always fetch the current write mode value and modify it rather than set the write mode to some absolute value (possibly turning off some other bits that were enabled).
Currently, only two write mode values are defined.
A stream also has a read mode, and we'll look at it after describing the getmsg and getpmsg functions.
getmsg and getpmsg Functions
STREAMS messages are read from a stream head using read, getmsg, or getpmsg.
Note that flagptr and bandptr are pointers to integers. The integer pointed to by these two pointers must be set before the call to specify the type of message desired, and the integer is also set on return to the type of message that was read.
If the integer pointed to by flagptr is 0, getmsg returns the next message on the stream head's read queue. If the next message is a high-priority message, the integer pointed to by flagptr is set to RS_HIPRI on return. If we want to receive only high-priority messages, we must set the integer pointed to by flagptr to RS_HIPRI before calling getmsg.
A different set of constants is used by getpmsg. We can set the integer pointed to by flagptr to MSG_HIPRI to receive only high-priority messages. We can set the integer to MSG_BAND and then set the integer pointed to by bandptr to a nonzero priority value to receive only messages from that band, or higher (including high-priority messages). If we only want to receive the first available message, we can set the integer pointed to by flagptr to MSG_ANY; on return, the integer will be overwritten with either MSG_HIPRI or MSG_BAND, depending on the type of message received. If the message we retrieved was not a high-priority message, the integer pointed to by bandptr will contain the message's priority band.
If ctlptr is null or ctlptr->maxlen is -1, the control portion of the message will remain on the stream head's read queue, and we will not process it. Similarly, if dataptr is null or dataptr->maxlen is -1, the data portion of the message is not processed and remains on the stream head's read queue. Otherwise, we will retrieve as much of the control and data portions of the message as our buffers will hold, and any remainder will be left on the head of the queue for the next call.
If the call to getmsg or getpmsg retrieves a message, the return value is 0. If part of the control portion of the message is left on the stream head read queue, the constant MORECTL is returned.
Similarly, if part of the data portion of the message is left on the queue, the constant MOREDATA is returned. If both control and data are left, the return value is (MORECTL|MOREDATA).
Read Mode
We also need to consider what happens if we read from a STREAMS device. There are two potential problems.
The default handling for condition 1 is called byte-stream mode. In this mode, a read takes data from the stream until the requested number of bytes has been read or until there is no more data. The message boundaries associated with the STREAMS messages are ignored in this mode. The default handling for condition 2 causes the read to return an error if there is a control message at the front of the queue. We can change either of these defaults. Using ioctl, if we set request to I_GRDOPT, the third argument is a pointer to an integer, and the current read mode for the stream is returned in that integer. A request of I_SRDOPT takes the integer value of the third argument and sets the read mode to that value. The read mode is specified by one of the following three constants:
Three additional constants can be specified in the read mode to set the behavior of read when it encounters messages containing protocol control information on a stream:
Only one of the message read modes and one of the protocol read modes can be set at a time. The default read mode is (RNORM|RPROTNORM).
Example
The program in Figure 14.19 is the same as the one in Figure 3.4, but recoded to use getmsg instead of read. If we run this program under Solaris, where both pipes and terminals are implemented using STREAMS, we get the following output:

    $ echo hello, world | ./a.out           requires STREAMS-based pipes
    flag = 0, ctl.len = -1, dat.len = 13
    hello, world
    flag = 0, ctl.len = 0, dat.len = 0      indicates a STREAMS hangup
    $ ./a.out                               requires STREAMS-based terminals
    this is line 1
    flag = 0, ctl.len = -1, dat.len = 15
    this is line 1
    and line 2
    flag = 0, ctl.len = -1, dat.len = 11
    and line 2
    ^D                                      type the terminal EOF character
    flag = 0, ctl.len = -1, dat.len = 0     tty end of file is not the same as a hangup
    $ ./a.out < /etc/motd
    getmsg error: Not a stream device

When the pipe is closed (when echo terminates), it appears to the program in Figure 14.19 as a STREAMS hangup, with both the control length and the data length set to 0. (We discuss pipes in Section 15.2.) With a terminal, however, typing the end-of-file character causes only the data length to be returned as 0. This terminal end of file is not the same as a STREAMS hangup. As expected, when we redirect standard input to be a non-STREAMS device, getmsg returns an error.
Figure 14.19. Copy standard input to standard output using getmsg

    #include "apue.h"
    #include <stropts.h>

    #define BUFFSIZE    4096

    int
    main(void)
    {
        int             n, flag;
        char            ctlbuf[BUFFSIZE], datbuf[BUFFSIZE];
        struct strbuf   ctl, dat;

        ctl.buf = ctlbuf;
        ctl.maxlen = BUFFSIZE;
        dat.buf = datbuf;
        dat.maxlen = BUFFSIZE;
        for ( ; ; ) {
            flag = 0;       /* return any message */
            if ((n = getmsg(STDIN_FILENO, &ctl, &dat, &flag)) < 0)
                err_sys("getmsg error");
            fprintf(stderr, "flag = %d, ctl.len = %d, dat.len = %d\n",
              flag, ctl.len, dat.len);
            if (dat.len == 0)
                exit(0);
            else if (dat.len > 0)
                if (write(STDOUT_FILENO, dat.buf, dat.len) != dat.len)
                    err_sys("write error");
        }
    }
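The difference between a STREAMS hangup and a terminal end of file, as observed in the sample run above, can be captured in a small helper. This is a sketch based only on the lengths printed in that run; the enumeration getmsg_outcome and the function classify_getmsg are hypothetical names, and other combinations of lengths are deliberately left uninterpreted rather than guessed at.

```c
#include <assert.h>

enum getmsg_outcome {
    GM_DATA,        /* data length > 0: there are bytes to copy */
    GM_TTY_EOF,     /* control length -1, data length 0 */
    GM_HANGUP,      /* control length 0, data length 0 */
    GM_OTHER        /* a combination this sketch does not interpret */
};

/* Interpret the ctl.len and dat.len values left by a successful
 * getmsg call, following the cases seen in the sample output.
 * Each length is either -1 (no such part) or a byte count >= 0. */
enum getmsg_outcome
classify_getmsg(int ctl_len, int dat_len)
{
    assert(ctl_len >= -1 && dat_len >= -1);
    if (dat_len > 0)
        return(GM_DATA);
    if (dat_len == 0 && ctl_len == 0)
        return(GM_HANGUP);
    if (dat_len == 0 && ctl_len == -1)
        return(GM_TTY_EOF);
    return(GM_OTHER);
}
```

Applied to the run above, the lengths (-1, 13) classify as GM_DATA, (0, 0) as GM_HANGUP, and (-1, 0) as GM_TTY_EOF. The program in Figure 14.19 exits in both of the last two cases because it tests only dat.len; a program that must tell them apart would also look at ctl.len.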