5.8. Standard I/O Efficiency

Using the functions from the previous section, we can get an idea of the efficiency of the standard I/O system. The program in Figure 5.4 is like the one in Figure 3.4: it simply copies standard input to standard output, using getc and putc. These two routines can be implemented as macros.

Figure 5.4. Copy standard input to standard output using `getc` and `putc`

#include "apue.h"

int
main(void)
{
     int     c;

     while ((c = getc(stdin)) != EOF)
         if (putc(c, stdout) == EOF)
             err_sys("output error");

     if (ferror(stdin))
         err_sys("input error");

     exit(0);
}

We can make another version of this program that uses fgetc and fputc, which should be functions, not macros. (We don't show this trivial change to the source code.)
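For reference, the change might look something like the following sketch. It is written without apue.h so that it stands alone, and the helper name copy_bytes is a hypothetical one introduced here, not something from the original program.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: copy "in" to "out" one byte at a time using
 * fgetc and fputc. Returns the number of bytes copied, or -1 on an
 * input or output error.
 */
long
copy_bytes(FILE *in, FILE *out)
{
    int     c;
    long    n = 0;

    assert(in != NULL && out != NULL);

    while ((c = fgetc(in)) != EOF) {
        if (fputc(c, out) == EOF)
            return -1;          /* output error */
        n++;
    }

    if (ferror(in))
        return -1;              /* input error, not end of file */

    return n;
}
```

Calling copy_bytes(stdin, stdout) from main and exiting with a nonzero status when it returns -1 would give roughly the same behavior as Figure 5.4.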

Finally, we have a version that reads and writes lines, shown in Figure 5.5.

Figure 5.5. Copy standard input to standard output using `fgets` and `fputs`

#include "apue.h"

int
main(void)
{
    char    buf[MAXLINE];

    while (fgets(buf, MAXLINE, stdin) != NULL)
        if (fputs(buf, stdout) == EOF)
            err_sys("output error");

    if (ferror(stdin))
        err_sys("input error");

    exit(0);
}

Note that we do not close the standard I/O streams explicitly in Figure 5.4 or Figure 5.5. Instead, we know that the exit function will flush any unwritten data and then close all open streams. (We'll discuss this in Section 8.5.) It is interesting to compare the timing of these three programs with the timing data from Figure 3.5. We show this data when operating on the same file (98.5 MB with 3 million lines) in Figure 5.6.

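Figure 5.6. Timing results using standard I/O routines

| Function                         | User CPU (seconds) | System CPU (seconds) | Clock time (seconds) | Bytes of program text |
|----------------------------------|--------------------|----------------------|----------------------|-----------------------|
| best time from Figure 3.5        | 0.01               | 0.18                 | 6.67                 |                       |
| fgets, fputs                     | 2.59               | 0.19                 | 7.15                 | 139                   |
| getc, putc                       | 10.84              | 0.27                 | 12.07                | 120                   |
| fgetc, fputc                     | 10.44              | 0.27                 | 11.42                | 120                   |
| single byte time from Figure 3.5 | 124.89             | 161.65               | 288.64               |                       |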

For each of the three standard I/O versions, the user CPU time is larger than that of the best read version from Figure 3.5, because the loop in the character-at-a-time versions is executed about 100 million times and the loop in the line-at-a-time version is executed 3,144,984 times. In the read version, the loop is executed only 12,611 times (for a buffer size of 8,192). The difference in clock times comes from the difference in user times and the difference in the times spent waiting for I/O to complete, as the system times are comparable.

The system CPU time is about the same as before, because roughly the same number of kernel requests are being made. Note that an advantage of using the standard I/O routines is that we don't have to worry about buffering or choosing the optimal I/O size. We do have to determine the maximum line size for the version that uses fgets, but that's easier than trying to choose the optimal I/O size.

The final column in Figure 5.6 is the number of bytes of text space (the machine instructions generated by the C compiler) for each of the main functions. We can see that the version using getc and putc takes the same amount of space as the one using the fgetc and fputc functions. Usually, getc and putc are implemented as macros, but in the GNU C library implementation, the macro simply expands to a function call.

The version using line-at-a-time I/O is almost twice as fast as the version using character-at-a-time I/O. If the fgets and fputs functions are implemented using getc and putc (see Section 7.7 of Kernighan and Ritchie [1988], for example), then we would expect the timing to be similar to the getc version. Actually, we might expect the line-at-a-time version to take longer, since we would be adding the overhead of 200 million extra function calls to the existing 6 million ones. What is happening with this example is that the line-at-a-time functions are implemented using memccpy(3). Often, the memccpy function is implemented in assembler instead of C, for efficiency.
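To illustrate why memccpy suits this job, here is a minimal sketch of copying one line out of an in-memory buffer. The helper copy_line is hypothetical and is not the fgets implementation of any particular C library; it only shows how a single memccpy call can replace a byte-by-byte loop that checks for a newline.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* request the memccpy declaration */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy one line from src (srclen valid bytes)
 * into dst (dstsize bytes), stopping after the first newline.
 * The result is always null terminated. Returns the number of bytes
 * copied, not counting the terminating null byte.
 */
size_t
copy_line(char *dst, size_t dstsize, const char *src, size_t srclen)
{
    size_t  n;
    char    *end;

    assert(dst != NULL && src != NULL && dstsize > 0);

    /* Copy at most dstsize - 1 bytes, leaving room for the null byte. */
    n = (srclen < dstsize - 1) ? srclen : dstsize - 1;

    /*
     * memccpy copies up to n bytes and stops once it has copied '\n'.
     * It returns a pointer just past the copied '\n' in dst, or NULL
     * if no '\n' was found in the first n bytes.
     */
    end = memccpy(dst, src, '\n', n);
    if (end != NULL)
        n = (size_t)(end - dst);

    dst[n] = '\0';
    return n;
}
```

The scan for the newline and the copy happen in one library call per line, rather than one function call per byte, which is consistent with the lower user CPU time of the line-at-a-time version.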

The last point of interest with these timing numbers is that the fgetc version is so much faster than the BUFFSIZE=1 version from Figure 3.5. Both involve the same number of function calls (about 200 million), yet the fgetc version is almost 12 times faster in user CPU time and slightly more than 25 times faster in clock time. The difference is that the version using read executes 200 million function calls, which in turn execute 200 million system calls. With the fgetc version, we still execute 200 million function calls, but this ends up being only 25,222 system calls. System calls are usually much more expensive than ordinary function calls.
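The effect of buffering on the number of system calls can be sketched with a small reader that hands out one byte per function call but calls read only when its buffer is empty. Everything here (struct minibuf, minibuf_init, minibuf_getc, and the 8,192-byte buffer size) is a hypothetical illustration, not the actual layout or behavior of any standard I/O implementation.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* request POSIX declarations such as ssize_t */
#endif
#include <assert.h>
#include <stdio.h>      /* for EOF */
#include <unistd.h>     /* for read */

#define MINIBUF_SIZE 8192   /* assumed buffer size */

/* Hypothetical buffered reader that counts its read system calls. */
struct minibuf {
    int             fd;                 /* descriptor to read from */
    unsigned char   buf[MINIBUF_SIZE];  /* the buffer itself */
    unsigned char   *ptr;               /* next unread byte in buf */
    ssize_t         cnt;                /* unread bytes left in buf */
    long            nreads;             /* number of read calls made */
};

void
minibuf_init(struct minibuf *mb, int fd)
{
    assert(mb != NULL);
    mb->fd = fd;
    mb->ptr = mb->buf;
    mb->cnt = 0;
    mb->nreads = 0;
}

/*
 * Return the next byte, or EOF at end of file or on a read error
 * (this sketch does not distinguish the two). One function call per
 * byte, but only one read call per MINIBUF_SIZE bytes.
 */
int
minibuf_getc(struct minibuf *mb)
{
    assert(mb != NULL);

    if (mb->cnt <= 0) {
        /* Buffer is empty: this is the only place a system call occurs. */
        mb->cnt = read(mb->fd, mb->buf, sizeof(mb->buf));
        mb->nreads++;
        if (mb->cnt <= 0) {
            mb->cnt = 0;
            return EOF;
        }
        mb->ptr = mb->buf;
    }

    mb->cnt--;
    return *mb->ptr++;
}
```

After draining a file through minibuf_getc, the nreads field shows how few read calls were needed relative to the number of bytes returned.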

As a disclaimer, you should be aware that these timing results are valid only on the single system they were run on. The results depend on many implementation features that aren't the same on every UNIX system. Nevertheless, having a set of numbers such as these, and explaining why the various versions differ, helps us understand the system better. From this section and Section 3.9, we've learned that the standard I/O library is not much slower than calling the read and write functions directly. The approximate cost that we've seen is about 0.11 seconds of CPU time to copy a megabyte of data using getc and putc. For most nontrivial applications, the largest amount of the user CPU time is taken by the application, not by the standard I/O routines.
