3.14. fcntl FunctionThe fcntl function can change the properties of a file that is already open.
In the examples in this section, the third argument is always an integer, corresponding to the comment in the function prototype just shown. But when we describe record locking in Section 14.3, the third argument becomes a pointer to a structure. The fcntl function is used for five different purposes.
We'll now describe the first seven of these ten cmd values. (We'll wait until Section 14.3 to describe the last three, which deal with record locking.) Refer to Figure 3.6, since we'll be referring to both the file descriptor flags associated with each file descriptor in the process table entry and the file status flags associated with each file table entry.
The return value from fcntl depends on the command. All commands return 1 on an error or some other value if OK. The following four commands have special return values: F_DUPFD, F_GETFD, F_GETFL, and F_GETOWN. The first returns the new file descriptor, the next two return the corresponding flags, and the final one returns a positive process ID or a negative process group ID. ExampleThe program in Figure 3.10 takes a single command-line argument that specifies a file descriptor and prints a description of selected file flags for that descriptor. Note that we use the feature test macro _POSIX_C_SOURCE and conditionally compile the file access flags that are not part of POSIX.1. The following script shows the operation of the program, when invoked from bash (the Bourne-again shell). Results vary, depending on which shell you use. $ ./a.out 0 < /dev/tty read only $ ./a.out 1 > temp.foo $ cat temp.foo write only $ ./a.out 2 2>>temp.foo write only, append $ ./a.out 5 5<>temp.foo read write The clause 5<>temp.foo opens the file temp.foo for reading and writing on file descriptor 5. Figure 3.10. Print file flags for specified descriptor#include "apue.h" #include <fcntl.h> int main(int argc, char *argv[]) { int val; if (argc != 2) err_quit("usage: a.out <descriptor#>"); if ((val = fcntl(atoi(argv[1]), F_GETFL, 0)) < 0) err_sys("fcntl error for fd %d", atoi(argv[1])); switch (val & O_ACCMODE) { case O_RDONLY: printf("read only"); break; case O_WRONLY: printf("write only"); break; case O_RDWR: printf("read write"); break; default: err_dump("unknown access mode"); } if (val & O_APPEND) printf(", append"); if (val & O_NONBLOCK) printf(", nonblocking"); #if defined(O_SYNC) if (val & O_SYNC) printf(", synchronous writes"); #endif #if !defined(_POSIX_C_SOURCE) && defined(O_FSYNC) if (val & O_FSYNC) printf(", synchronous writes"); #endif putchar('\n'); exit(0); } ExampleWhen we modify either the file descriptor flags or the file status flags, we must be careful to fetch the existing flag value, modify it as desired, and then set the new flag value. We can't simply do an F_SETFD or an F_SETFL, as this could turn off flag bits that were previously set. Figure 3.11 shows a function that sets one or more of the file status flags for a descriptor. If we change the middle statement to val &= ~flags; /* turn flags off */ we have a function named clr_fl, which we'll use in some later examples. This statement logically ANDs the one's complement of flags with the current val. If we call set_fl from Figure 3.4 by adding the line set_fl(STDOUT_FILENO, O_SYNC); at the beginning of the program, we'll turn on the synchronous-write flag. This causes each write to wait for the data to be written to disk before returning. Normally in the UNIX System, a write only queues the data for writing; the actual disk write operation can take place sometime later. A database system is a likely candidate for using O_SYNC, so that it knows on return from a write that the data is actually on the disk, in case of an abnormal system failure. We expect the O_SYNC flag to increase the clock time when the program runs. To test this, we can run the program in Figure 3.4, copying 98.5 MB of data from one file on disk to another and compare this with a version that does the same thing with the O_SYNC flag set. The results from a Linux system using the ext2 file system are shown in Figure 3.12. The six rows in Figure 3.12 were all measured with a BUFFSIZE of 4,096. The results in Figure 3.5 were measured reading a disk file and writing to /dev/null, so there was no disk output. The second row in Figure 3.12 corresponds to reading a disk file and writing to another disk file. This is why the first and second rows in Figure 3.12 are different. The system time increases when we write to a disk file, because the kernel now copies the data from our process and queues the data for writing by the disk driver. We expect the clock time to increase also when we write to a disk file, but it doesn't increase significantly for this test, which indicates that our writes go to the system cache, and we don't measure the cost to actually write the data to disk. When we enable synchronous writes, the system time and the clock time should increase significantly. As the third row shows, the time for writing synchronously is about the same as when we used delayed writes. This implies that the Linux ext2 file system isn't honoring the O_SYNC flag. This suspicion is supported by the sixth line, which shows that the time to do synchronous writes followed by a call to fsync is just as large as calling fsync after writing the file without synchronous writes (line 5). After writing a file synchronously, we expect that a call to fsync will have no effect. Figure 3.13 shows timing results for the same tests on Mac OS X 10.3. Note that the times match our expectations: synchronous writes are far more expensive than delayed writes, and using fsync with synchronous writes makes no measurable difference. Note also that adding a call to fsync at the end of the delayed writes makes no measurable difference. It is likely that the operating system flushed previously written data to disk as we were writing new data to the file, so by the time that we called fsync, very little work was left to be done. Compare fsync and fdatasync, which update a file's contents when we say so, with the O_SYNC flag, which updates a file's contents every time we write to the file. Figure 3.11. Turn on one or more of the file status flags for a descriptor#include "apue.h" #include <fcntl.h> void set_fl(int fd, int flags) /* flags are file status flags to turn on */ { int val; if ((val = fcntl(fd, F_GETFL, 0)) < 0) err_sys("fcntl F_GETFL error"); val |= flags; /* turn on flags */ if (fcntl(fd, F_SETFL, val) < 0) err_sys("fcntl F_SETFL error"); }
With this example, we see the need for fcntl. Our program operates on a descriptor (standard output), never knowing the name of the file that was opened by the shell on that descriptor. We can't set the O_SYNC flag when the file is opened, since the shell opened the file. With fcntl, we can modify the properties of a descriptor, knowing only the descriptor for the open file. We'll see another need for fcntl when we describe nonblocking pipes (Section 15.2), since all we have with a pipe is a descriptor. |