2.5. Limits

The implementations define many magic numbers and constants. Many of these have been hard coded into programs or were determined using ad hoc techniques. With the various standardization efforts that we've described, more portable methods are now provided to determine these magic numbers and implementation-defined limits, greatly aiding the portability of our software. Two types of limits are needed: compile-time limits and runtime limits.
Compile-time limits can be defined in headers that any program can include at compile time. But runtime limits require the process to call a function to obtain the value of the limit.

Additionally, some limits can be fixed on a given implementation (and could therefore be defined statically in a header) yet vary on another implementation and would require a runtime function call. An example of this type of limit is the maximum number of characters in a filename. Before SVR4, System V historically allowed only 14 characters in a filename, whereas BSD-derived systems increased this number to 255. Most UNIX System implementations these days support multiple file system types, and each type has its own limit. This is the case of a runtime limit that depends on where in the file system the file in question is located. A filename in the root file system, for example, could have a 14-character limit, whereas a filename in another file system could have a 255-character limit.

To solve these problems, three types of limits are provided: compile-time limits (defined in headers), runtime limits that are not associated with a file or directory (obtained with the sysconf function), and runtime limits that are associated with a file or directory (obtained with the pathconf and fpathconf functions).
To further confuse things, if a particular runtime limit does not vary on a given system, it can be defined statically in a header. If it is not defined in a header, however, the application must call one of the three conf functions (which we describe shortly) to determine its value at runtime.

2.5.1. ISO C Limits

All the limits defined by ISO C are compile-time limits. Figure 2.6 shows the limits from the C standard that are defined in the file <limits.h>. These constants are always defined in the header and don't change in a given system. The third column shows the minimum acceptable values from the ISO C standard. This allows for a system with 16-bit integers using one's-complement arithmetic. The fourth column shows the values from a Linux system with 32-bit integers using two's-complement arithmetic. Note that none of the unsigned data types has a minimum value, as this value must be 0 for an unsigned data type. On a 64-bit system, the values for long integer maximums match the maximum values for long long integers.
One difference that we will encounter is whether a system provides signed or unsigned character values. From the fourth column in Figure 2.6, we see that this particular system uses signed characters. We see that CHAR_MIN equals SCHAR_MIN and that CHAR_MAX equals SCHAR_MAX. If the system uses unsigned characters, we would have CHAR_MIN equal to 0 and CHAR_MAX equal to UCHAR_MAX. The floating-point data types in the header <float.h> have a similar set of definitions. Anyone doing serious floating-point work should examine this file. Another ISO C constant that we'll encounter is FOPEN_MAX, the minimum number of standard I/O streams that the implementation guarantees can be open at once. This value is in the <stdio.h> header, and its minimum value is 8. The POSIX.1 value STREAM_MAX, if defined, must have the same value as FOPEN_MAX. ISO C also defines the constant TMP_MAX in <stdio.h>. It is the maximum number of unique filenames generated by the tmpnam function. We'll have more to say about this constant in Section 5.13. In Figure 2.7, we show the values of FOPEN_MAX and TMP_MAX on the four platforms we discuss in this book.
2.5.2. POSIX Limits

POSIX.1 defines numerous constants that deal with implementation limits of the operating system. Unfortunately, this is one of the more confusing aspects of POSIX.1. Although POSIX.1 defines numerous limits and constants, we'll only concern ourselves with the ones that affect the base POSIX.1 interfaces. These limits and constants are divided into the following five categories:
Of these 44 limits and constants, some may be defined in <limits.h>, and others may or may not be defined, depending on certain conditions. We describe the limits and constants that may or may not be defined in Section 2.5.4, when we describe the sysconf, pathconf, and fpathconf functions. The 19 invariant minimum values are shown in Figure 2.8. These values are invariant; they do not change from one system to another. They specify the most restrictive values for these features. A conforming POSIX.1 implementation must provide values that are at least this large. This is why they are called minimums, although their names all contain MAX. Also, to ensure portability, a strictly-conforming application must not require a larger value. We describe what each of these constants refers to as we proceed through the text.
Unfortunately, some of these invariant minimum values are too small to be of practical use. For example, most UNIX systems today provide far more than 20 open files per process. Also, the minimum limit of 255 for _POSIX_PATH_MAX is too small. Pathnames can exceed this limit. This means that we can't use the two constants _POSIX_OPEN_MAX and _POSIX_PATH_MAX as array sizes at compile time.

Each of the 19 invariant minimum values in Figure 2.8 has an associated implementation value whose name is formed by removing the _POSIX_ prefix from the name in Figure 2.8. The names without the leading _POSIX_ were intended to be the actual values that a given implementation supports. (These 19 implementation values are items 2 through 5 from our list earlier in this section: the invariant value, the runtime increasable value, the runtime invariant values, and the pathname variable values.) The problem is that not all of the 19 implementation values are guaranteed to be defined in the <limits.h> header.

For example, a particular value may not be included in the header if its actual value for a given process depends on the amount of memory on the system. If the values are not defined in the header, we can't use them as array bounds at compile time. So, POSIX.1 decided to provide three runtime functions for us to call (sysconf, pathconf, and fpathconf) to determine the actual implementation value at runtime. There is still a problem, however, because some of the values are defined by POSIX.1 as being possibly "indeterminate" (logically infinite). This means that the value has no practical upper bound. On Linux, for example, the number of iovec structures you can use with readv or writev is limited only by the amount of memory on the system. Thus, IOV_MAX is considered indeterminate on Linux. We'll return to this problem of indeterminate runtime limits in Section 2.5.5.

2.5.3. XSI Limits

The XSI also defines constants that deal with implementation limits. They include:
The invariant minimum values are listed in Figure 2.9. Many of these values deal with message catalogs. The last two illustrate the situation in which the POSIX.1 minimums were too small (presumably to allow for embedded POSIX.1 implementations), so the Single UNIX Specification added symbols with larger minimum values for XSI-conforming systems.

2.5.4. sysconf, pathconf, and fpathconf Functions

We've listed various minimum values that an implementation must support, but how do we find out the limits that a particular system actually supports? As we mentioned earlier, some of these limits might be available at compile time; others must be determined at runtime. We've also mentioned that some don't change in a given system, whereas others can change because they are associated with a file or directory. The runtime limits are obtained by calling one of the following three functions.
The difference between the last two functions is that one takes a pathname as its argument and the other takes a file descriptor argument. Figure 2.10 lists the name arguments that sysconf uses to identify system limits. Constants beginning with _SC_ are used as arguments to sysconf to identify the runtime limit. Figure 2.11 lists the name arguments that are used by pathconf and fpathconf to identify system limits. Constants beginning with _PC_ are used as arguments to pathconf and fpathconf to identify the runtime limit.
We need to look in more detail at the different return values from these three functions.
There are some restrictions for the pathname argument to pathconf and the filedes argument to fpathconf. If any of these restrictions isn't met, the results are undefined.
Example

The awk(1) program shown in Figure 2.12 builds a C program that prints the value of each pathconf and sysconf symbol. The awk program reads two input files (pathconf.sym and sysconf.sym) that contain lists of the limit name and symbol, separated by tabs. Not all symbols are defined on every platform, so the awk program surrounds each call to pathconf and sysconf with the necessary #ifdef statements.

For example, the awk program transforms a line in the input file that looks like

    NAME_MAX    _PC_NAME_MAX

into the following C code:

    #ifdef NAME_MAX
        printf("NAME_MAX defined to be %ld\n", (long)NAME_MAX+0);
    #else
        printf("no symbol for NAME_MAX\n");
    #endif
    #ifdef _PC_NAME_MAX
        pr_pathconf("NAME_MAX =", argv[1], _PC_NAME_MAX);
    #else
        printf("no symbol for _PC_NAME_MAX\n");
    #endif

The program in Figure 2.13, generated by the awk program, prints all these limits, handling the case in which a limit is not defined.

Figure 2.14 summarizes results from Figure 2.13 for the four systems we discuss in this book. The entry "no symbol" means that the system doesn't provide a corresponding _SC_ or _PC_ symbol to query the value of the constant. Thus, the limit is undefined in this case. In contrast, the entry "unsupported" means that the symbol is defined by the system but unrecognized by the sysconf or pathconf functions. The entry "no limit" means that the system defines no limit for the constant, but this doesn't mean that the limit is infinite.

We'll see in Section 4.14 that UFS is the SVR4 implementation of the Berkeley fast file system. PCFS is the MS-DOS FAT file system implementation for Solaris.

Figure 2.12. Build C program to print all supported configuration limits

    BEGIN   {
        printf("#include \"apue.h\"\n")
        printf("#include <errno.h>\n")
        printf("#include <limits.h>\n")
        printf("\n")
        printf("static void pr_sysconf(char *, int);\n")
        printf("static void pr_pathconf(char *, char *, int);\n")
        printf("\n")
        printf("int\n")
        printf("main(int argc, char *argv[])\n")
        printf("{\n")
        printf("\tif (argc != 2)\n")
        printf("\t\terr_quit(\"usage: a.out <dirname>\");\n\n")
        FS="\t+"
        # Each limit is cast to long so one format suits int and long constants.
        while ((getline < "sysconf.sym") > 0) {
            printf("#ifdef %s\n", $1)
            printf("\tprintf(\"%s defined to be %%ld\\n\", (long)%s+0);\n", $1, $1)
            printf("#else\n")
            printf("\tprintf(\"no symbol for %s\\n\");\n", $1)
            printf("#endif\n")
            printf("#ifdef %s\n", $2)
            printf("\tpr_sysconf(\"%s =\", %s);\n", $1, $2)
            printf("#else\n")
            printf("\tprintf(\"no symbol for %s\\n\");\n", $2)
            printf("#endif\n")
        }
        close("sysconf.sym")
        while ((getline < "pathconf.sym") > 0) {
            printf("#ifdef %s\n", $1)
            printf("\tprintf(\"%s defined to be %%ld\\n\", (long)%s+0);\n", $1, $1)
            printf("#else\n")
            printf("\tprintf(\"no symbol for %s\\n\");\n", $1)
            printf("#endif\n")
            printf("#ifdef %s\n", $2)
            printf("\tpr_pathconf(\"%s =\", argv[1], %s);\n", $1, $2)
            printf("#else\n")
            printf("\tprintf(\"no symbol for %s\\n\");\n", $2)
            printf("#endif\n")
        }
        close("pathconf.sym")
        exit
    }
    END {
        printf("\texit(0);\n")
        printf("}\n\n")
        printf("static void\n")
        printf("pr_sysconf(char *mesg, int name)\n")
        printf("{\n")
        printf("\tlong val;\n\n")
        printf("\tfputs(mesg, stdout);\n")
        printf("\terrno = 0;\n")
        printf("\tif ((val = sysconf(name)) < 0) {\n")
        printf("\t\tif (errno != 0) {\n")
        printf("\t\t\tif (errno == EINVAL)\n")
        printf("\t\t\t\tfputs(\" (not supported)\\n\", stdout);\n")
        printf("\t\t\telse\n")
        printf("\t\t\t\terr_sys(\"sysconf error\");\n")
        printf("\t\t} else {\n")
        printf("\t\t\tfputs(\" (no limit)\\n\", stdout);\n")
        printf("\t\t}\n")
        printf("\t} else {\n")
        printf("\t\tprintf(\" %%ld\\n\", val);\n")
        printf("\t}\n")
        printf("}\n\n")
        printf("static void\n")
        printf("pr_pathconf(char *mesg, char *path, int name)\n")
        printf("{\n")
        printf("\tlong val;\n")
        printf("\n")
        printf("\tfputs(mesg, stdout);\n")
        printf("\terrno = 0;\n")
        printf("\tif ((val = pathconf(path, name)) < 0) {\n")
        printf("\t\tif (errno != 0) {\n")
        printf("\t\t\tif (errno == EINVAL)\n")
        printf("\t\t\t\tfputs(\" (not supported)\\n\", stdout);\n")
        printf("\t\t\telse\n")
        printf("\t\t\t\terr_sys(\"pathconf error, path = %%s\", path);\n")
        printf("\t\t} else {\n")
        printf("\t\t\tfputs(\" (no limit)\\n\", stdout);\n")
        printf("\t\t}\n")
        printf("\t} else {\n")
        printf("\t\tprintf(\" %%ld\\n\", val);\n")
        printf("\t}\n")
        printf("}\n")
    }

Figure 2.13. Print all possible sysconf and pathconf values

    #include "apue.h"
    #include <errno.h>
    #include <limits.h>

    static void pr_sysconf(char *, int);
    static void pr_pathconf(char *, char *, int);

    int
    main(int argc, char *argv[])
    {
        if (argc != 2)
            err_quit("usage: a.out <dirname>");

    #ifdef ARG_MAX
        printf("ARG_MAX defined to be %ld\n", (long)ARG_MAX+0);
    #else
        printf("no symbol for ARG_MAX\n");
    #endif
    #ifdef _SC_ARG_MAX
        pr_sysconf("ARG_MAX =", _SC_ARG_MAX);
    #else
        printf("no symbol for _SC_ARG_MAX\n");
    #endif

        /* similar processing for all the rest of the sysconf symbols... */

    #ifdef MAX_CANON
        printf("MAX_CANON defined to be %ld\n", (long)MAX_CANON+0);
    #else
        printf("no symbol for MAX_CANON\n");
    #endif
    #ifdef _PC_MAX_CANON
        pr_pathconf("MAX_CANON =", argv[1], _PC_MAX_CANON);
    #else
        printf("no symbol for _PC_MAX_CANON\n");
    #endif

        /* similar processing for all the rest of the pathconf symbols... */

        exit(0);
    }

    static void
    pr_sysconf(char *mesg, int name)
    {
        long val;

        fputs(mesg, stdout);
        errno = 0;                      /* distinguish "no limit" from an error */
        if ((val = sysconf(name)) < 0) {
            if (errno != 0) {
                if (errno == EINVAL)
                    fputs(" (not supported)\n", stdout);
                else
                    err_sys("sysconf error");
            } else {
                fputs(" (no limit)\n", stdout);
            }
        } else {
            printf(" %ld\n", val);
        }
    }

    static void
    pr_pathconf(char *mesg, char *path, int name)
    {
        long val;

        fputs(mesg, stdout);
        errno = 0;                      /* distinguish "no limit" from an error */
        if ((val = pathconf(path, name)) < 0) {
            if (errno != 0) {
                if (errno == EINVAL)
                    fputs(" (not supported)\n", stdout);
                else
                    err_sys("pathconf error, path = %s", path);
            } else {
                fputs(" (no limit)\n", stdout);
            }
        } else {
            printf(" %ld\n", val);
        }
    }
2.5.5. Indeterminate Runtime Limits

We mentioned that some of the limits can be indeterminate. The problem we encounter is that if these limits aren't defined in the <limits.h> header, we can't use them at compile time. But they might not be defined at runtime if their value is indeterminate! Let's look at two specific cases: allocating storage for a pathname and determining the number of file descriptors.

Pathnames

Many programs need to allocate storage for a pathname. Typically, the storage has been allocated at compile time, and various magic numbers (few of which are the correct value) have been used by different programs as the array size: 256, 512, 1024, or the standard I/O constant BUFSIZ. The 4.3BSD constant MAXPATHLEN in the header <sys/param.h> is the correct value, but many 4.3BSD applications didn't use it.

POSIX.1 tries to help with the PATH_MAX value, but if this value is indeterminate, we're still out of luck. Figure 2.15 shows a function that we'll use throughout this text to allocate storage dynamically for a pathname.

Figure 2.15. Dynamically allocate space for a pathname

    #include "apue.h"
    #include <errno.h>
    #include <limits.h>

    #ifdef  PATH_MAX
    static int  pathmax = PATH_MAX;
    #else
    static int  pathmax = 0;
    #endif

    #define SUSV3   200112L

    static long posix_version = 0;

    /* If PATH_MAX is indeterminate, no guarantee this is adequate */
    #define PATH_MAX_GUESS  1024

    char *
    path_alloc(int *sizep)  /* also return allocated size, if nonnull */
    {
        char    *ptr;
        int     size;

        if (posix_version == 0)
            posix_version = sysconf(_SC_VERSION);

        if (pathmax == 0) {     /* first time through */
            errno = 0;
            if ((pathmax = pathconf("/", _PC_PATH_MAX)) < 0) {
                if (errno == 0)
                    pathmax = PATH_MAX_GUESS;   /* it's indeterminate */
                else
                    err_sys("pathconf error for _PC_PATH_MAX");
            } else {
                pathmax++;      /* add one since it's relative to root */
            }
        }
        if (posix_version < SUSV3)
            size = pathmax + 1;
        else
            size = pathmax;

        if ((ptr = malloc(size)) == NULL)
            err_sys("malloc error for pathname");

        if (sizep != NULL)
            *sizep = size;
        return(ptr);
    }

If the constant PATH_MAX is defined in <limits.h>, then we're all set. If it's not, we need to call pathconf. The value returned by pathconf is the maximum size of a relative pathname when the first argument is the working directory, so we specify the root as the first argument and add 1 to the result. If pathconf indicates that PATH_MAX is indeterminate, we have to punt and just guess a value.

Standards prior to SUSv3 were unclear as to whether or not PATH_MAX included a null byte at the end of the pathname. If the operating system implementation conforms to one of these prior versions, we need to add 1 to the amount of memory we allocate for a pathname, just to be on the safe side.

The correct way to handle the case of an indeterminate result depends on how the allocated space is being used.
If we were allocating space for a call to getcwd, for example (to return the absolute pathname of the current working directory; see Section 4.22), and if the allocated space is too small, an error is returned and errno is set to ERANGE. We could then increase the allocated space by calling realloc (see Section 7.8 and Exercise 4.16) and try again. We could keep doing this until the call to getcwd succeeded.

Maximum Number of Open Files

A common sequence of code in a daemon process (a process that runs in the background, not connected to a terminal) is one that closes all open files. Some programs have the following code sequence, assuming the constant NOFILE was defined in the <sys/param.h> header:

    #include <sys/param.h>

    for (i = 0; i < NOFILE; i++)
        close(i);

Other programs use the constant _NFILE that some versions of <stdio.h> provide as the upper limit. Some hard code the upper limit as 20.

We would hope to use the POSIX.1 value OPEN_MAX to determine this value portably, but if the value is indeterminate, we still have a problem. If we wrote the following and if OPEN_MAX was indeterminate, the loop would never execute, since sysconf would return -1:

    #include <unistd.h>

    for (i = 0; i < sysconf(_SC_OPEN_MAX); i++)
        close(i);

Our best option in this case is just to close all descriptors up to some arbitrary limit, say 256. As with our pathname example, this is not guaranteed to work for all cases, but it's the best we can do. We show this technique in Figure 2.16.

Figure 2.16. Determine the number of file descriptors

    #include "apue.h"
    #include <errno.h>
    #include <limits.h>

    #ifdef  OPEN_MAX
    static long openmax = OPEN_MAX;
    #else
    static long openmax = 0;
    #endif

    /*
     * If OPEN_MAX is indeterminate, we're not
     * guaranteed that this is adequate.
     */
    #define OPEN_MAX_GUESS  256

    long
    open_max(void)
    {
        if (openmax == 0) {     /* first time through */
            errno = 0;
            if ((openmax = sysconf(_SC_OPEN_MAX)) < 0) {
                if (errno == 0)
                    openmax = OPEN_MAX_GUESS;   /* it's indeterminate */
                else
                    err_sys("sysconf error for _SC_OPEN_MAX");
            }
        }
        return(openmax);
    }

We might be tempted to call close until we get an error return, but the error return from close (EBADF) doesn't distinguish between an invalid descriptor and a descriptor that wasn't open. If we tried this technique and descriptor 9 was not open but descriptor 10 was, we would stop on 9 and never close 10. The dup function (Section 3.12) does return a specific error when OPEN_MAX is exceeded, but duplicating a descriptor a couple of hundred times is an extreme way to determine this value.

Some implementations will return LONG_MAX for limit values that are effectively unlimited. Such is the case with the Linux limit for ATEXIT_MAX (see Figure 2.14). This isn't a good idea, because it can cause programs to behave badly.

For example, we can use the ulimit command built into the Bourne-again shell to change the maximum number of files our processes can have open at one time. This generally requires special (superuser) privileges if the limit is to be effectively unlimited. But once set to infinite, sysconf will report LONG_MAX as the limit for OPEN_MAX. A program that relies on this value as the upper bound of file descriptors to close, as shown in Figure 2.16, will waste a lot of time trying to close 2,147,483,647 file descriptors, most of which aren't even in use.

Systems that support the XSI extensions in the Single UNIX Specification will provide the getrlimit(2) function (Section 7.11). It can be used to return the maximum number of descriptors that a process can have open. With it, we can detect that there is no configured upper bound to the number of open files our processes can open, so we can avoid this problem.