8.12. Interpreter Files

All contemporary UNIX systems support interpreter files. These files are text files that begin with a line of the form
#! pathname [ optional-argument ]
The space between the exclamation point and the pathname is optional. The most common of these interpreter files begin with the line
#!/bin/sh
The pathname is normally an absolute pathname, since no special operations are performed on it (i.e., PATH is not used). The recognition of these files is done within the kernel as part of processing the exec system call. The actual file that gets executed by the kernel is not the interpreter file, but the file specified by the pathname on the first line of the interpreter file. Be sure to differentiate between the interpreter file (a text file that begins with #!) and the interpreter, which is specified by the pathname on the first line of the interpreter file.

Be aware that systems place a size limit on the first line of an interpreter file. This limit includes the #!, the pathname, the optional argument, the terminating newline, and any spaces.
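To make the format of the first line concrete, here is a minimal user-space sketch of how such a line could be split into the interpreter pathname and the optional argument. The function name parse_interp_line is hypothetical and this is not the kernel's actual code. It also assumes that everything after the pathname is treated as a single argument; systems differ on this point, and the size limit mentioned above is not modeled.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a kernel interface): split the first line of an
 * interpreter file into the interpreter pathname and one optional argument.
 * The buffer is modified in place. Returns 0 on success, or -1 if the line
 * does not begin with "#!" or names no interpreter.
 */
int
parse_interp_line(char *line, char **interp, char **opt_arg)
{
    char *p, *end;

    *interp = NULL;
    *opt_arg = NULL;

    if (strncmp(line, "#!", 2) != 0)
        return -1;

    /* Cut the line at the terminating newline, if there is one. */
    line[strcspn(line, "\n")] = '\0';

    /* The white space between "#!" and the pathname is optional. */
    p = line + 2;
    p += strspn(p, " \t");
    if (*p == '\0')
        return -1;
    *interp = p;

    /* The pathname ends at the next white space. */
    p += strcspn(p, " \t");
    if (*p == '\0')
        return 0;               /* no optional argument */
    *p++ = '\0';

    /* Skip the white space before the optional argument. */
    p += strspn(p, " \t");
    if (*p == '\0')
        return 0;

    /* Assumption: the rest of the line, minus trailing white space,
       forms exactly one argument. */
    end = p + strlen(p);
    while (end > p && (end[-1] == ' ' || end[-1] == '\t'))
        *--end = '\0';
    *opt_arg = p;
    return 0;
}
```

Called on a writable buffer holding the line "#!/bin/awk -f", this sketch would set interp to /bin/awk and opt_arg to -f; for "#!/bin/sh" it would leave opt_arg as a null pointer.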
Example

Let's look at an example to see what the kernel does with the arguments to the exec function, and with the optional argument on the first line of the interpreter file, when the file being executed is an interpreter file. The program in Figure 8.20 execs an interpreter file.

The following shows the contents of the one-line interpreter file that is executed and the result from running the program in Figure 8.20:

$ cat /home/sar/bin/testinterp
#!/home/sar/bin/echoarg foo
$ ./a.out
argv[0]: /home/sar/bin/echoarg
argv[1]: foo
argv[2]: /home/sar/bin/testinterp
argv[3]: myarg1
argv[4]: MY ARG2

The program echoarg (the interpreter) just echoes each of its command-line arguments. (This is the program from Figure 7.4.) Note that when the kernel execs the interpreter (/home/sar/bin/echoarg), argv[0] is the pathname of the interpreter, argv[1] is the optional argument from the interpreter file, and the remaining arguments are the pathname (/home/sar/bin/testinterp) and the second and third arguments from the call to execl in the program shown in Figure 8.20 (myarg1 and MY ARG2). Both argv[1] and argv[2] from the call to execl have been shifted right two positions. Note that the kernel takes the pathname from the execl call instead of the first argument (testinterp), on the assumption that the pathname might contain more information than the first argument.

Figure 8.20. A program that execs an interpreter file

#include "apue.h"
#include <sys/wait.h>

int
main(void)
{
    pid_t   pid;

    if ((pid = fork()) < 0) {
        err_sys("fork error");
    } else if (pid == 0) {          /* child */
        /* exec the interpreter file; returns only on error */
        if (execl("/home/sar/bin/testinterp",
                  "testinterp", "myarg1", "MY ARG2", (char *)0) < 0)
            err_sys("execl error");
    }
    if (waitpid(pid, NULL, 0) < 0)  /* parent */
        err_sys("waitpid error");
    exit(0);
}

Example

A common use for the optional argument following the interpreter pathname is to specify the -f option for programs that support this option.
For example, an awk(1) program can be executed as
awk -f myfile
which tells awk to read the awk program from the file myfile.
Using the -f option with an interpreter file lets us write
#!/bin/awk -f
(awk program follows in the interpreter file)
For example, Figure 8.21 shows /usr/local/bin/awkexample (an interpreter file). If one of the path prefixes is /usr/local/bin, we can execute the program in Figure 8.21 (assuming that we've turned on the execute bit for the file) as
$ awkexample file1 FILENAME2 f3
ARGV[0] = awk
ARGV[1] = file1
ARGV[2] = FILENAME2
ARGV[3] = f3
When /bin/awk is executed, its command-line arguments are
/bin/awk -f /usr/local/bin/awkexample file1 FILENAME2 f3
The pathname of the interpreter file (/usr/local/bin/awkexample) is passed to the interpreter. The filename portion of this pathname (what we typed to the shell) isn't adequate, because the interpreter (/bin/awk in this example) can't be expected to use the PATH variable to locate files. When it reads the interpreter file, awk ignores the first line, since the pound sign is awk's comment character.

We can verify these command-line arguments with the following commands:

$ /bin/su                               become superuser
Password:                               enter superuser password
# mv /bin/awk /bin/awk.save             save the original program
# cp /home/sar/bin/echoarg /bin/awk     and replace it temporarily
# suspend                               suspend the superuser shell using job control
[1] + Stopped        /bin/su
$ awkexample file1 FILENAME2 f3
argv[0]: /bin/awk
argv[1]: -f
argv[2]: /usr/local/bin/awkexample
argv[3]: file1
argv[4]: FILENAME2
argv[5]: f3
$ fg                                    resume superuser shell using job control
/bin/su
# mv /bin/awk.save /bin/awk             restore the original program
# exit                                  and exit the superuser shell

In this example, the -f option for the interpreter is required. As we said, this tells awk where to look for the awk program. If we remove the -f option from the interpreter file, an error message usually results when we try to run it. The exact text of the message varies, depending on where the interpreter file is stored and whether the remaining arguments represent existing files. This is because the command-line arguments in this case are
/bin/awk /usr/local/bin/awkexample file1 FILENAME2 f3
and awk is trying to interpret the string /usr/local/bin/awkexample as an awk program. If we couldn't pass at least a single optional argument to the interpreter (-f in this case), these interpreter files would be usable only with the shells.

Figure 8.21. An awk program as an interpreter file

#!/bin/awk -f
BEGIN {
    for (i = 0; i < ARGC; i++)
        printf "ARGV[%d] = %s\n", i, ARGV[i]
    exit
}

Are interpreter files required? Not really. They provide an efficiency gain for the user at some expense in the kernel (since it's the kernel that recognizes these files). Interpreter files are useful for the following reasons.
None of this would work as we've shown if the three shells and awk didn't use the pound sign as their comment character.
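The argument rearrangement seen in both examples of this section (the interpreter pathname first, then the optional argument, then the pathname of the interpreter file, then the caller's remaining arguments) can be summarized in a short sketch. The function build_interp_argv is a hypothetical illustration of that ordering, not the kernel's implementation. It assumes a single optional argument and reuses the caller's strings rather than copying them.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: build the argument vector that the interpreter
 * receives. interp and opt_arg come from the first line of the interpreter
 * file (opt_arg may be NULL); path is the pathname given to exec; argv is
 * the caller's NULL-terminated argument list. Returns a malloc'ed,
 * NULL-terminated array that the caller must free, or NULL on failure.
 * Only the array is allocated; the strings themselves are shared.
 */
char **
build_interp_argv(const char *interp, const char *opt_arg,
                  const char *path, char *const argv[])
{
    size_t  n = 0, i, k = 0;
    char  **out;

    while (argv[n] != NULL)
        n++;

    /* Room for interp, opt_arg, path, argv[1..n-1], and the final NULL. */
    out = malloc((n + 4) * sizeof(*out));
    if (out == NULL)
        return NULL;

    out[k++] = (char *)interp;      /* new argv[0]: the interpreter */
    if (opt_arg != NULL)
        out[k++] = (char *)opt_arg; /* optional argument from the #! line */
    out[k++] = (char *)path;        /* pathname replaces the caller's argv[0] */
    for (i = 1; i < n; i++)
        out[k++] = argv[i];         /* remaining arguments, shifted right */
    out[k] = NULL;
    return out;
}
```

With the values from Figure 8.20 (interpreter /home/sar/bin/echoarg, optional argument foo, pathname /home/sar/bin/testinterp, and the execl arguments testinterp, myarg1, and MY ARG2), the sketch yields the same five strings that echoarg printed earlier. Note that the caller's argv[0] (testinterp) is dropped in favor of the full pathname.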