Section 3.9. I/O Efficiency

The program in Figure 3.4 copies a file, using only the read and write functions. The following caveats apply to this program.

Figure 3.4. Copy standard input to standard output

#include "apue.h"

#define BUFFSIZE 4096

int
main(void)
{
    ssize_t n;              /* read returns ssize_t, not int */
    char    buf[BUFFSIZE];

    /* Copy until read returns 0 (end of file) or -1 (error). */
    while ((n = read(STDIN_FILENO, buf, BUFFSIZE)) > 0)
        if (write(STDOUT_FILENO, buf, n) != n)
            err_sys("write error");

    if (n < 0)
        err_sys("read error");

    exit(0);
}

The program reads from standard input and writes to standard output, assuming that these have been set up by the shell before the program is executed. Indeed, all normal UNIX system shells provide a way to open a file for reading on standard input and to create (or rewrite) a file on standard output. This saves the program from having to open the input and output files itself.

Many applications assume that standard input is file descriptor 0 and that standard output is file descriptor 1. In this example, we use the two defined names, STDIN_FILENO and STDOUT_FILENO, from <unistd.h>.

The program doesn't close the input file or output file. Instead, the program uses the feature of the UNIX kernel that closes all open file descriptors in a process when that process terminates.

This example works for both text files and binary files, since there is no difference between the two to the UNIX kernel.

One question we haven't answered, however, is how we chose the BUFFSIZE value. Before answering that, let's run the program using different values for BUFFSIZE. Figure 3.5 shows the results for reading a 103,316,352-byte file, using 20 different buffer sizes.

The file was read using the program shown in Figure 3.4, with standard output redirected to /dev/null. The file system used for this test was the Linux ext2 file system with 4,096-byte blocks. (The st_blksize value, which we describe in Section 4.12, is 4,096.) This accounts for the system CPU time reaching its minimum of about 0.16 seconds at a BUFFSIZE of 4,096. Increasing the buffer size beyond this has little positive effect.

Most file systems support some kind of read-ahead to improve performance. When sequential reads are detected, the system tries to read in more data than an application requests, assuming that the application will read it shortly. From the last few entries in Figure 3.5, it appears that read-ahead in ext2 stops having an effect after 128 KB.

Figure 3.5. Timing results for reading with different buffer sizes on Linux

BUFFSIZE  User CPU (seconds)  System CPU (seconds)  Clock time (seconds)       #loops
       1              124.89                161.65                288.64  103,316,352
       2               63.10                 80.96                145.81   51,658,176
       4               31.84                 40.00                 72.75   25,829,088
       8               15.17                 21.01                 36.85   12,914,544
      16                7.86                 10.27                 18.76    6,457,272
      32                4.13                  5.01                  9.76    3,228,636
      64                2.11                  2.48                  6.76    1,614,318
     128                1.01                  1.27                  6.82      807,159
     256                0.56                  0.62                  6.80      403,579
     512                0.27                  0.41                  7.03      201,789
   1,024                0.17                  0.23                  7.84      100,894
   2,048                0.05                  0.19                  6.82       50,447
   4,096                0.03                  0.16                  6.86       25,223
   8,192                0.01                  0.18                  6.67       12,611
  16,384                0.02                  0.18                  6.87        6,305
  32,768                0.00                  0.16                  6.70        3,152
  65,536                0.02                  0.19                  6.92        1,576
 131,072                0.00                  0.16                  6.84          788
 262,144                0.01                  0.25                  7.30          394
 524,288                0.00                  0.22                  7.35          198

We'll return to this timing example later in the text. In Section 3.14, we show the effect of synchronous writes; in Section 5.8, we compare these unbuffered I/O times with the standard I/O library.

Beware when trying to measure the performance of programs that read and write files. The operating system will try to cache the file incore, so if you measure the performance of the program repeatedly, the successive timings will likely be better than the first. This is because the first run will cause the file to be entered into the system's cache, and successive runs will access the file from the system's cache instead of from the disk. (The term incore means in main memory. Back in the day, a computer's main memory was built out of ferrite core. This is where the phrase "core dump" comes from: the main memory image of a program stored in a file on disk for diagnosis.)

In the tests reported in Figure 3.5, each run with a different buffer size was made using a different copy of the file so that the current run didn't find the data in the cache from the previous run. The files are large enough that they cannot all remain in the cache at once (the test system was configured with 512 MB of RAM).