
options. Be sure to read the DNF man page.

Chapter summary

Updating software and installing new software are easy with tools like DNF, the DaNdiFied YUM package manager. DNF is a wrapper around the powerful RPM Package Manager, but DNF offers advanced features such as the ability to provide automated handling of dependencies; it will determine the dependencies, download them from the repository on the Internet, and install them. DNF uses the concept of groups to enable installation and removal of large numbers of related packages such as would be used by complex software systems. Using groups to define things like desktops, development environments, office suites, scientific, and related technology packages makes it easy to install complete systems with a single command. DNF and RPM both provide tools that enable exploring the content of RPM packages. It is possible to list the files that will be installed by an RPM package and the other packages upon which it is dependent. We installed some additional repositories beyond the default repos provided by Fedora. Additional repos make it easier to install software that is not part of the distribution.

Exercises

Perform the following exercises to complete this chapter:

1. Back in Experiment 12-7, you removed the utils package, and the mc (Midnight Commander) package was also removed. Provide a detailed explanation for why DNF removed mc, too.


Chapter 12

Installing and Updating Software

2. Did you know that you can browse the Internet, receive and send e-mail, download files from remote servers, and more, all in a terminal command line text-based environment? Identify all of the packages required to perform these tasks and install them.

3. Reboot your student VM, and select one of the older kernels but not the recovery option. Use a few of the tools you have already learned to explore and determine that everything seems to be working fine.

4. On occasion the DNF database and cache may become corrupted or at least out of sync with the system. How would you correct that situation?


CHAPTER 13

Tools for Problem Solving

Objectives

In this chapter you will learn

• A procedure to use for solving problems
• To install some useful problem-solving tools that are not always installed by default
• To select and use the correct tools to investigate the status of various Linux system resources such as CPU, memory, and disk
• To create command line programs that simulate certain problems
• To use available tools to locate and resolve the simulated problems
• To create a FIFO named pipe to illustrate the function of buffers

This chapter introduces a few powerful and important tools that can be used for locating and solving problems. This is a very long chapter because there is so much to know about these tools. I have intentionally grouped these tools into this one chapter because they are all closely related in at least two ways. First, they are some of the most basic and commonly used tools for problem determination. Second, these tools offer significant overlap in the data that they provide, so your choice of which tool to use for a particular purpose can be rather flexible. All of these tools are powerful and flexible and offer many options for how the data they can access is displayed. Rather than cover every possible option, I will try to provide you with enough information about these tools to pique your curiosity and encourage your own explorations into their depths. “Follow your curiosity” is one of the tenets of The Linux Philosophy for SysAdmins.1

1. Both, David, The Linux Philosophy for SysAdmins, Apress, 2018, Chapter 22.

© David Both 2020 D. Both, Using and Administering Linux: Volume 1, https://doi.org/10.1007/978-1-4842-5049-5_13


The art of problem solving

One of the best things that my mentors helped me with was the formulation of a defined process that I could always use for solving problems of nearly any type. This process is very closely related to the scientific method. I find the short article entitled “How the Scientific Method Works”2 to be very helpful. It describes the scientific method using a diagram very much like the one I have created for my five steps of problem solving. So I pass this on as a mentor, and it is my contribution to all of you young SysAdmins. I hope that you find it as useful as I have.

Solving problems of any kind is art, science, and, some would say, perhaps a bit of magic, too. Solving technical problems, such as those that occur with computers, also requires a good deal of specialized knowledge. Any approach to solving problems of any nature, including problems with Linux, must include more than just a list of symptoms and the steps necessary to fix or circumvent the problems which caused the symptoms. This so-called “symptom-fix” approach looks good on paper to the managers – the Pointy-Haired Bosses, the PHBs – but it really sucks in practice. The best way to approach problem solving is with a large base of knowledge of the subject and a strong methodology.

The five steps of problem solving

There are five basic steps involved in the problem-solving process, as shown in Figure 13-1. This algorithm is very similar to that of the scientific method referred to in Footnote 2 but is specifically intended for solving technical problems. You probably already follow these steps when you troubleshoot a problem without even realizing it. These steps are universal and apply to solving most any type of problem, not just problems with computers or Linux. I used these steps for years on various types of problems without realizing it. Having them codified for me made me much more effective at solving problems because when I became stuck, I could review the steps I had taken, verify where I was in the process, and restart at any appropriate step.

2. Harris, William, How the Scientific Method Works, https://science.howstuffworks.com/innovation/scientific-experiments/scientific-method6.htm


Figure 13-1.  The algorithm I use for troubleshooting

You may have heard a couple of other terms applied to problem solving in the past. The first three steps of this process are also known as problem determination, that is, finding the root cause of the problem. The last two steps are problem resolution, which is actually fixing the problem. The next sections cover each of these five steps in more detail.

Knowledge

Knowledge of the subject in which you are attempting to solve a problem is the first step. All of the articles I have seen about the scientific method seem to assume this as a prerequisite. However, the acquisition of knowledge is an ongoing process, driven by curiosity and augmented by the knowledge gained from using the scientific method to explore and extend your existing knowledge through experimentation. This is one of the reasons I use the term “experiment” in this course rather than something like “lab project.”


You must be knowledgeable about Linux at the very least, and even more, you must be knowledgeable about the other factors that can interact with and affect Linux, such as hardware, the network, and even environmental factors such as how temperature, humidity, and the electrical environment in which the Linux system operates can affect it. Knowledge can be gained by reading books and websites about Linux and those other topics. You can attend classes, seminars, and conferences, and you can interact with the other knowledgeable people who can be found there. You can also just set up a number of Linux computers in a networked environment, physical or virtual, as we have done in this course. Knowledge is gained when you resolve a problem and discover a new cause for a particular type of problem. You can also find new knowledge when an attempt to fix a problem results in a temporary failure. Classes are also valuable in providing us with new knowledge. My personal preference is to play – uh, experiment – with Linux or with a particular piece such as networking, name services, DHCP, Chrony, and more, and then take a class or two to help me internalize the knowledge I have gained. Remember, “Without knowledge, resistance is futile,” to paraphrase the Borg. Knowledge is power.

Observation

The second step in solving the problem is to observe the symptoms of the problem. It is important to take note of all of the problem symptoms. It is also important to observe what is working properly. This is not the time to try to fix the problem; merely observe. Another important part of observation is to ask yourself questions about what you see and what you do not see. Aside from the questions you need to ask that are specific to the problem, there are some general questions to ask:

• Is this problem caused by hardware, Linux, application software, or perhaps by lack of user knowledge or training?
• Is this problem similar to others I have seen?
• Is there an error message?
• Are there any log entries pertaining to the problem?
• What was taking place on the computer just before the error occurred?
• What did I expect to happen if the error had not occurred?
• Has anything about the system hardware or software changed recently?

Other questions will reveal themselves as you work to answer these. The important thing to remember here is not the specific questions but rather to gather as much information as possible. This increases the knowledge you have about this specific problem instance and aids in finding the solution. As you gather data, never assume that the information obtained from someone else is correct. Observe everything yourself. The best problem solvers are those who never take anything for granted. They never assume that the information they have is 100% accurate or complete. When the information you have seems to contradict itself or the symptoms, start over from the beginning as if you have no information at all.

In almost all of the jobs I have had in the computer business, we have always tried to help each other out, and this was true when I was at IBM. I have always been very good at fixing things, and there were times when I would show up at a customer site when another customer engineer (CE) was having a particularly difficult time finding the source of a problem. The first thing I would do is assess the situation. I would ask the primary CE what they had done so far to locate the problem. After that I would start over from the beginning. I always wanted to see the results myself. Many times that paid off because I would observe something that others had missed. And, of course, the other CEs – my mentors – would help me out in the same way. In one very strange incident, I fixed a large computer by sitting on it. That is a long story and amounts to the fact that I observed a very brief symptom that was caused by sitting on the workspace that was the top of a very large printer control unit. The complete story can be found in my book, The Linux Philosophy for SysAdmins.3
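Several of the observation questions above (error messages, log entries, recent changes) can be answered directly from the command line. The following is a minimal sketch, not part of the original experiments, assuming a systemd-based host such as our Fedora VMs and run as root:

```shell
# Kernel messages logged around the time of the error can point at
# hardware problems; tail shows only the most recent ones.
dmesg | tail -n 20

# journalctl filters the systemd journal by priority; entries at
# priority err and above are the most likely to pertain to a problem.
journalctl -p err -n 20 --no-pager

# On Fedora, recent software changes show up in the DNF history:
# dnf history list
```

Comparing these logs against the time the symptom appeared often answers the "what changed recently?" question before any deeper digging is needed.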

Reasoning

Use your reasoning skills to combine the information from your observations of the symptoms with your knowledge to determine a probable cause for the problem. We discussed the different types of reasoning in some detail in Chapter 23.4 The process of reasoning through your observations of the problem, your knowledge, and your past experience is

3. Both, David, The Linux Philosophy for SysAdmins, Apress, 2018, 471–472.
4. Op. cit., Chapter 23.


where art and science combine to produce inspiration, intuition, or some other mystical mental process that provides some insight into the root cause of the problem. In some cases this is a fairly easy process. You can see an error code and look up its meaning from the sources available to you. Or perhaps you observe a symptom that is familiar and you know what steps might resolve it. You can then apply the vast knowledge you have gained by reading about Linux, this book, and the documentation provided with Linux to reason your way to the cause of the problem.

In other cases it can be a very difficult and lengthy part of the problem determination process. These are the types of cases that can be the most difficult: perhaps symptoms you have never seen before, or a problem that is not resolved by any of the methods you have used. It is these difficult ones that require more work and especially more reasoning applied to them. It helps to remember that the symptom is not the problem. The problem causes the symptom. You want to fix the true problem, not just the symptom.

Action

Now is the time to perform the appropriate repair action. This is usually the simple part. The hard part is what came before – figuring out what to do. After you know the cause of the problem, it is usually easy to determine the correct repair action to take. The specific action you take will depend upon the cause(s) of the problem. Remember, we are fixing the root cause, not just trying to get rid of or cover up the symptom.

Make only one change at a time. If there are several actions that can be taken that might correct the cause of a problem, make only the one change or take the one action that is most likely to resolve the root cause. The selection of the corrective action with the highest probability of fixing the problem is what you are trying to do here. Whether it is your own experience telling you which action to take or the experiences of others, move down the list from highest to lowest priority, one action at a time. Test the results after each action.

Test

After taking some overt repair action, the repair should be tested. This usually means performing the task that failed in the first place, but it could also be a single, simple command that illustrates the problem.


We make a single change, taking one potential corrective action, and then test the results of that action. This is the only way in which we can be certain which corrective action fixed the problem. If we were to make several corrective actions and then test one time, there would be no way to know which action was responsible for fixing the problem. This is especially important if we want to walk back those ineffective changes we made after finding the solution.

If the repair action has not been successful, you should begin the procedure over again. If there are additional corrective actions you can take, return to that step and continue doing so until you have run out of possibilities or have learned with certainty that you are on the wrong track. Be sure to check the original observed symptoms when testing. It is possible that they have changed due to the action you have taken, and you need to be aware of this in order to make informed decisions during the next iteration of the process. Even if the problem has not been resolved, the altered symptom could be very valuable in determining how to proceed.

As you work through a problem, it will be necessary to iterate through at least some of the steps. If, for example, performing a given corrective action does not resolve the problem, you may need to try another action that has also been known to resolve the problem in the past. Figure 13-1 shows that you may need to iterate to any previous step in order to continue. It may be necessary to go back to the observation step and gather more information about the problem. I have also found that sometimes it was a good idea to go back to the knowledge step and gather more basic knowledge. The latter includes reading or re-reading manuals, man pages, or using a search engine, whatever is necessary to gain the knowledge required to continue past the point where I was blocked. Be flexible. Don't hesitate to step back and start over if nothing else produces some forward progress.

System performance and problem solving

Now let's explore some commands that enable you to observe various configuration and performance aspects of your Linux system. Be sure to use the man pages for each command if you have questions about the syntax or interpreting the data displayed.

There are a large number of Linux commands that are used in the process of analyzing system performance and problem determination. Most of these commands obtain their information from various files in the /proc filesystem, which we will explore


later. You may wish to use multiple terminal sessions side by side in order to make some of the comparisons between commands and their output.

I use top, htop, and atop as my primary tools when starting the process of problem determination. These three tools all display much of the same information, but each does it in its own way and with different emphasis. All three of these tools display system data in near real time. The top and htop utilities are also interactive and allow the SysAdmin to renice and kill processes by sending them signals. The atop tool can kill processes, but it cannot renice them.
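As the chapter objectives note, not all of these tools are installed by default. The following sketch, which is my own addition and not part of the original text, checks for them; the dnf command in the comment assumes a Fedora host like our VMs:

```shell
# Check which of the three monitors is already installed; command -v
# prints the path of a command if the shell can find it.
for cmd in top htop atop; do
    if command -v "$cmd" > /dev/null 2>&1; then
        echo "$cmd: installed at $(command -v "$cmd")"
    else
        echo "$cmd: not installed"
    fi
done

# On Fedora, the missing ones can then be installed as root with DNF:
# dnf -y install htop atop
```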

Note The nice command can be used to change the nice number (renice) of a process in order to modify its priority level and thus how much CPU time it might be allocated by the Linux scheduler. We will explore nice numbers, priority, and scheduling as we proceed through this chapter and also in Volume 2, Chapter 4.

Let's look at each of these three tools in some detail.

top

The top command is my go-to tool when I am solving problems that involve any type of performance issue. I like it because it has been around since forever and is always available while the other tools may not be installed. The top utility is always installed by Fedora and all of the other distributions I have worked with. The top program is a very important and powerful tool for observing memory and CPU usage as well as load averages in a dynamic setting. The information provided by top can be instrumental in helping diagnose an extant problem; it is usually the first tool I use when troubleshooting a new problem. Understanding the information that top is presenting is key to using it to greatest effect.

Let's look at some of the data which can alert us to performance problems and explore their meanings in more depth. Much of this information also pertains to the other system monitors we will study, which also display some of this same information.

The top utility displays system information in near real time, updating (by default) every three seconds. Fractional seconds are allowed, although very small values can place a significant load on the system. It is also interactive, and both the data columns to be displayed and the sort column can be modified.


EXPERIMENT 13-1

Perform this experiment as root on StudentVM1. Start top:

[root@StudentVM1 ~]# top

The results are displayed full screen and are live, updating every three seconds. top is an interactive tool that allows some changes to things like which programs are displayed and how the displayed results are sorted. It also allows some interaction with programs such as renicing them to change their priority and killing them:

top - 21:48:21 up 7 days,  8:50,  7 users,  load average: 0.00, 0.00, 0.00
Tasks: 195 total,   1 running, 136 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.2 sy,  0.0 ni, 99.7 id,  0.0 wa,  0.0 hi,  0.2 si,  0.0 st
KiB Mem :  4038488 total,  2369772 free,   562972 used,  1105744 buff/cache
KiB Swap: 10485756 total, 10485756 free,        0 used.  3207808 avail Mem

  PID USER      PR  NI    VIRT    RES   SHR S %CPU %MEM     TIME+ COMMAND
 5173 student   20   0  316084  33328  2884 S  0.3  0.1   5:16.42 VBoxClient
 7396 root      20   0  257356   4504  3620 R  0.3  0.1   0:00.03 top
    1 root      20   0  237000   9820  6764 S  0.0  0.2   0:23.54 systemd
    2 root      20   0       0      0     0 S  0.0  0.0   0:00.26 kthreadd
    3 root       0 -20       0      0     0 I  0.0  0.0   0:00.00 rcu_gp
    4 root       0 -20       0      0     0 I  0.0  0.0   0:00.00 rcu_par_gp
    6 root       0 -20       0      0     0 I  0.0  0.0   0:00.00 kworker/0:0H-kb
    8 root       0 -20       0      0     0 I  0.0  0.0   0:00.00 mm_percpu_wq
    9 root      20   0       0      0     0 S  0.0  0.0   0:01.40 ksoftirqd/0
   10 root      20   0       0      0     0 I  0.0  0.0   0:10.44 rcu_sched
   11 root      20   0       0      0     0 I  0.0  0.0   0:00.00 rcu_bh

Let that run as you study it. Then press the s (lowercase) key. The top utility displays "Change delay from 3.0 to" and you type 1 and press the Enter key. This sets the display update interval to one second. I find this to be a bit more responsive and more to my liking than the default three seconds. Now press the 1 key to show the statistics for each CPU in this VM on a separate line in the header section. Pressing 1 again would return to the display of the aggregate CPU data. Do that a couple of times to compare the data, but leave it so that top displays both CPUs when you are finished.


After making these changes, we want to make them permanent. The top utility does not automatically save these changes, so we must press W (uppercase) to write the modified configuration to the ~/.toprc file.

Let top run while you read about the various sections in the display. The top display is divided into two sections: the “summary” section, which is the topmost portion of the output, and the “process” section, which is the lower portion of the output; I will use this terminology for top, atop, and htop in the interest of consistency.

The top program has a number of useful interactive commands you can use to manage the display of data and to manipulate individual processes. Use the h command to view a brief help page for the various interactive commands. Be sure to press h twice to see both pages of the help. Use the q key to quit from Help and return to the active display.
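Although the text above focuses on interactive use, top also has a batch mode that is handy for capturing snapshots in scripts. This is a minimal sketch of my own, not part of the experiment:

```shell
# -b selects batch mode and -n 1 limits output to a single iteration,
# so the command prints one snapshot and exits instead of refreshing.
top -b -n 1 | head -n 5

# The same technique can log the summary section over time, e.g.,
# ten samples at one-minute intervals:
# top -b -d 60 -n 10 >> /tmp/top.log
```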

Summary section

The top summary section contains information that provides an excellent overview of the current system status. This section can inform you of the basic facts of overall CPU and memory usage as well as CPU load trends.

The first line shows the system uptime and the 1-, 5-, and 15-minute load averages. In Experiment 13-1 the load averages are all zero because the host is doing very little. Figure 13-2 shows load averages on a system that has some work going on. The second line shows the number of processes currently active and the status of each.

The lines containing CPU statistics are shown next. There can be a single line which combines the statistics for all CPUs present in the system, or, as in Figure 13-2, one line for each CPU, in this case a single quad core CPU. Press the 1 key to toggle between the consolidated display of CPU usage and the display of the individual CPUs. The data in these lines is displayed as percentages of the total CPU time available.

The last two lines in the summary section are memory usage. They show the physical memory usage including both RAM and swap space.

Many of the other tools we will look at present some or all of this same information. The next few sections explore this displayed information in detail, and this will also apply to the same information when it is displayed in all of those other tools.


Load averages

The first line of the output from top contains the current load averages. Load averages represent the 1-, 5-, and 15-minute load averages for a system. In Figure 13-2, which was taken from a host with four CPUs, the load averages are 2.49, 1.37, and 0.60, respectively.

top - 12:21:44 up 1 day, 3:25, 7 users, load average: 2.49, 1.37, 0.60
Tasks: 257 total,   5 running, 252 sleeping,   0 stopped,   0 zombie
Cpu0 : 33.2%us, 32.3%sy, 0.0%ni, 34.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 51.7%us, 24.0%sy, 0.0%ni, 24.2%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2 : 24.6%us, 48.5%sy, 0.0%ni, 27.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 : 67.1%us, 21.6%sy, 0.0%ni, 11.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem:  6122964k total, 3582032k used, 2540932k free,  358752k buffers
Swap: 8191996k total,       0k used, 8191996k free, 2596520k cached

Figure 13-2.  The load averages in this top sample indicate a recent increase in CPU usage

But what does this really mean when I say that the one (or five or fifteen) minute load average is 2.49? Load average can be considered a measure of demand for the CPU; it is a number that represents the average number of instructions waiting for CPU time. Thus in a single processor system, a fully utilized CPU would have a load average of 1. This means that the CPU is keeping up exactly with the demand; in other words, it has perfect utilization. A load average of less than 1 means that the CPU is underutilized, and a load average of greater than 1 means that the CPU is overutilized and that there is pent-up, unsatisfied demand. For example, a load average of 1.5 in a single CPU system indicates that some instructions are forced to wait to be executed until the one preceding them has completed. This is also true for multiple processors. If a 4-CPU system has a load average of 4, then it has perfect utilization. If it has a load average of 3.24, for example, then three of its processors are fully utilized, and one is underutilized by about 76%. In the preceding example, a 4-CPU system has a 1-minute load average of 2.49, meaning that there is still significant capacity available among the four CPUs. A perfectly utilized 4-CPU system would show a load average of 4.00.


The optimum condition for load average in idealistic server environments is for it to equal the total number of CPUs in a system. That would mean that every CPU is fully utilized, and yet no instructions are forced to wait. Also note that the longer-term load averages provide an indication of the overall utilization trend. It appears in the preceding example that the short-term load average is indicative of a short-term peak in utilization but that there is still plenty of capacity available. Linux Journal has an excellent article describing load averages, the theory and the math behind them, and how to interpret them in the December 1, 2006 issue.5
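The load averages that top displays come straight from the kernel and can be read directly from /proc/loadavg. This small sketch, my own addition, compares them to the CPU count as described above:

```shell
# The first three fields of /proc/loadavg are the 1-, 5-, and
# 15-minute load averages shown on the first line of top's output.
read ONE FIVE FIFTEEN REST < /proc/loadavg
echo "Load averages: $ONE (1 min)  $FIVE (5 min)  $FIFTEEN (15 min)"

# nproc reports the number of available CPUs; sustained load averages
# above this number indicate pent-up, unsatisfied demand.
echo "CPUs available: $(nproc)"
```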

CPU usage

CPU usage is a fairly simple measure of how much CPU time is being used by executing instructions. These numbers are displayed as percentages and represent the amount of time that a CPU is being used during the defined time period. The default update time interval is usually three seconds, although this can be changed using the “s” key, and I normally use one second. Fractional seconds are also accepted, down to 0.01 seconds. I do not recommend very short intervals, that is, less than one second, as this adds load to the system and makes it difficult to read the data. However, as with everything Linux and its flexibility, it may occasionally be useful to set the interval to less than one second.

top - 09:47:38 up 13 days, 24 min, 6 users, load average: 0.13, 0.04, 0.01
Tasks: 180 total,   1 running, 179 sleeping,   0 stopped,   0 zombie
Cpu0 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 0.9%us, 0.9%sy, 0.0%ni, 98.1%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 : 1.0%us, 0.0%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem:  2056456k total,  797768k used, 1258688k free,   92028k buffers
Swap: 4095992k total,      88k used, 4095904k free,  336252k cached

Figure 13-3.  The summary section of top contains a comprehensive overview of CPU and memory usage data

5. Walker, Ray, Examining Load Average, Linux Journal, Dec. 1, 2006, https://archive.org/details/Linux-Journal-2006-12/page/n81


There are eight fields which describe CPU usage in more detail. The us, sy, ni, id, wa, hi, si, and st fields subdivide the CPU usage into categories that can provide more insight into what is using CPU time in the system:

• us: User space is CPU time spent performing tasks in user space as opposed to system, or kernel, space. This is where user-level programs run.
• sy: System is CPU time spent performing system tasks. These are mostly kernel tasks such as memory management, task dispatching, and all the other tasks performed by the kernel.
• ni: This is “nice” time, CPU time spent on tasks that have a positive nice number. A positive nice number makes a task nicer, that is, it is less demanding of CPU time, and other tasks may get priority over it.
• id: Idle time is any time that the CPU is free and is not performing any processing or waiting for I/O to occur.
• wa: IO wait time is the amount of time that a CPU is waiting on some I/O such as a disk read or write to occur. The program running on that CPU is waiting for the result of that I/O operation before it can continue and is blocked until then.
• hi: The percentage of CPU time spent servicing hardware interrupts in the time interval. A high number here, especially when IO wait is also high, can be indicative that hardware speed is too slow for the existing load.
• si: The percentage of CPU time spent servicing software interrupts during the time interval. A high number here, especially when IO wait is also high, can be indicative that some software application(s) may be in some sort of tight loop or a race condition.
• st: This is time stolen from “this” VM because it can run but another VM is running, and the VM hypervisor cannot allocate time to “this” VM. This should always be zero for a non-virtual host. In a virtual host, a number significantly greater than zero might mean that more physical CPU power is required for the given real and virtual system load.

These times should usually add up to 100% for each CPU, give or take a bit of rounding error.
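top derives these percentages from cumulative counters in /proc/stat. As a sketch of my own (not from the book), you can read the raw values behind the display like this:

```shell
# The first line of /proc/stat aggregates all CPUs. The columns after
# the "cpu" label are cumulative time (in jiffies) spent in user,
# nice, system, idle, iowait, irq (hi), softirq (si), and steal (st).
read LABEL USER NICE SYS IDLE IOWAIT IRQ SOFTIRQ STEAL REST < /proc/stat
TOTAL=$((USER + NICE + SYS + IDLE + IOWAIT + IRQ + SOFTIRQ + STEAL))
echo "Idle time since boot: $((100 * IDLE / TOTAL))% of total"
```

Note that top shows the change in these counters between two updates, not the since-boot totals computed in this sketch.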


Process section

The process section of the output from top is a listing of the running processes in the system – at least for the number of processes for which there is room on the terminal display. The default columns displayed by top are described in the following. Several other columns are available, and each can usually be added with a single keystroke; refer to the top man page for details:

• PID: The process ID.
• USER: The username of the process owner.
• PR: The priority of the process.
• NI: The nice number of the process.
• VIRT: The total amount of virtual memory allocated to the process.
• RES: Resident size (in kb unless otherwise noted) of non-swapped physical RAM memory consumed by a process.
• SHR: The amount of shared memory in kb used by the process.
• S: The status of the process. This can be R for running, S for sleeping, I for idle, and Z for zombie. Less frequently seen statuses can be T for traced or stopped and D for deep, uninterruptible sleep.
• %CPU: The percentage of CPU cycles used by this process during the last measured time period.
• %MEM: The percentage of physical system memory used by the process.
• TIME+: The cumulative CPU time, to 100ths of a second, consumed by the process since the process was started.
• COMMAND: This is the command that was used to launch the process.

Use the Page Up and Page Down keys to scroll through the list of running processes. You can use the < and > keys to move the sort column to the left or right. The k key can be used to kill a process, and the r key to renice it. You have to know the process ID (PID) of the process you want to kill or renice, and that information is displayed in the process section of the top display. When killing a process, top asks first for the PID and then for the signal number to use in killing the process. Type them in, and press the Enter key after each. Start with signal 15, SIGTERM, and if that does not kill the process, use 9, SIGKILL.
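The same signals that top sends with its k command can also be sent from the command line with the kill command. A disposable sleep process makes a safe target for this sketch of mine:

```shell
# Start a throwaway background process to practice on.
sleep 300 &
PID=$!

# Signal 15 (SIGTERM) asks the process to terminate cleanly; this is
# the same default that top offers first.
kill -15 $PID
wait $PID 2>/dev/null || true

# Signal 9 (SIGKILL) cannot be caught or ignored and would be the
# next step for a process that survives SIGTERM:
# kill -9 $PID
echo "process $PID terminated"
```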


Things to look for with CPU usage

You should check a couple of things with CPU usage when you are troubleshooting a problem. Look for one or more CPUs that have 0% idle time for extended periods. You especially have a problem if all CPUs have zero or very low idle time. You should then look to the task area of the display to determine which process is using the CPU time. Be careful to understand whether the high CPU usage might be normal for a particular environment or program, so you will know whether you might be seeing normal or transient behavior. The load averages discussed earlier can be used to help determine whether the system is overloaded or just very busy. Let's explore the use of top to observe CPU usage when we have programs that suck it up.

EXPERIMENT 13-2

Start a second terminal session as user student, and position it near the root terminal session that is already running top so that both can be seen simultaneously. As the user student, create a file named cpuHog in your home directory, and make it executable with the permissions rwxr-xr-x:

[student@studentvm1 ~]$ touch cpuHog
[student@studentvm1 ~]$ chmod 755 cpuHog

Use the vim editor to add the following content to the file:

#!/bin/bash
# This little program is a cpu hog
X=0;while [ 1 ];do echo $X;X=$((X+1));done

Save this Bash shell script, close vim, and run the cpuHog program with the following command:

[student@studentvm1 ~]$ ./cpuHog

The preceding program simply counts up by one and prints the current value of X to STDOUT. And it sucks up CPU cycles. Observe the effect this has on system performance in top. CPU usage should immediately go up, and the load averages should also start to increase over time. What is the priority of the cpuHog program?


Now open another terminal session as the student user, and run the same program in it. You should now have two instances of this program running. Notice in top that the two processes tend to get about the same amount of CPU time on average. Sometimes one gets more than the other, and sometimes they get about the same amount. Figure 13-4 shows the results in top when two of these CPU hogs are running. Note that I have logged in remotely using SSH and am using the screen program to perform these experiments on the VM, so both of those tools show up with high CPU usage in Figure 13-4. You should not have those two entries in your top output. The results you see are essentially the same.

top - 11:46:13 up 20:55,  6 users,  load average: 3.64, 2.46, 1.14
Tasks: 161 total,   5 running,  97 sleeping,   0 stopped,   0 zombie
%Cpu0  :  3.0 us, 73.7 sy,  0.0 ni,  0.0 id,  0.0 wa, 12.1 hi, 11.1 si,  0.0 st
%Cpu1  : 11.2 us, 85.7 sy,  0.0 ni,  0.0 id,  0.0 wa,  3.1 hi,  0.0 si,  0.0 st
KiB Mem :  4038488 total,  3015548 free,   240244 used,   782696 buff/cache
KiB Swap: 10485756 total, 10485756 free,        0 used.  3543352 avail Mem

  PID USER      PR  NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
15481 student   20   0  214388   1180  1036 R 52.0  0.0  0:19.30 cpuHog
15408 student   20   0  214388   1184  1040 R 33.3  0.0  4:07.18 cpuHog
15217 student   20   0  134336   4944  3768 R 31.4  0.1  2:02.57 sshd
15359 student   20   0  228968   3008  2212 R 31.4  0.1  2:19.63 screen
15017 root      20   0       0      0     0 I 13.7  0.0  0:27.36 kworker/u4:2-ev
15158 root      20   0       0      0     0 I 13.7  0.0  0:22.97 kworker/u4:0-ev
  814 root      20   0   98212   6704  5792 S  1.0  0.2  0:02.01 rngd
13103 root      20   0  257244   4384  3628 R  1.0  0.1  1:16.87 top
    1 root      20   0  171068   9488  6880 S  0.0  0.2  0:04.82 systemd
    2 root      20   0       0      0     0 S  0.0  0.0  0:00.02 kthreadd
    3 root       0 -20       0      0     0 I  0.0  0.0  0:00.00 rcu_gp

Figure 13-4.  The top command showing what happens when two CPU hog programs are running

Notice on your VM, as is illustrated on my VM in Figure 13-4, that the load averages will rise over time until they eventually stabilize. You can also see that one or both CPUs will start to show waits for both hardware and software interrupts.


As the root user, use top to set the nice number for one of these CPU hogs first to +19 and then to -20, and observe the results of each setting for a short time. We will discuss the details of renicing and priorities in Volume 2, Chapter 4, but for now it is sufficient to know that a higher number means more nice and a lower, even negative, number means less nice. A nicer program has a higher number for its priority and will receive fewer CPU cycles than an identical program that has a lower number. If this seems counterintuitive, it is. This is a case of RPL, reverse programmer logic, at least at first glance.
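Renicing can also be done from the command line with the renice command rather than inside top. A minimal sketch of my own, using a disposable sleep process instead of the cpuHog; as noted in the text, only root may set negative nice numbers:

```shell
sleep 300 &                        # stand-in for a cpuHog
pid=$!

renice -n 19 -p "$pid"             # make the process maximally nice
ps -o pid,ni,pri,comm -p "$pid"    # the NI column should now show 19

kill "$pid"                        # clean up the demo process
```

Running `renice -n -20 -p "$pid"` as a non-root user would fail with a permission error, which is the command-line equivalent of what you see in top.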

Tip  Press the r (lowercase) key for renice, and follow the directions on the screen just below the "Swap" line.

To change the nice number for a running program using top, simply type r. When top asks for the PID to renice, enter the PID (process ID) number as shown in Figure 13-5. The PIDs of your running processes will be different from mine. The top utility will then ask what value. Enter 19, and press Enter. I suggest choosing the PID of the cpuHog program that has the most accumulated time, TIME+, so that you can watch the other cpuHog catch up over time. I have highlighted the relevant data lines in bold.

top - 11:46:13 up 20:55,  6 users,  load average: 3.64, 2.46, 1.14
Tasks: 160 total,   5 running,  97 sleeping,   0 stopped,   0 zombie
%Cpu0  :  2.0 us, 64.6 sy,  0.0 ni,  0.0 id,  0.0 wa, 15.2 hi, 18.2 si,  0.0 st
%Cpu1  :  6.1 us, 91.9 sy,  0.0 ni,  0.0 id,  0.0 wa,  2.0 hi,  0.0 si,  0.0 st
KiB Mem :  4038488 total,  3015028 free,   240208 used,   783252 buff/cache
KiB Swap: 10485756 total, 10485756 free,        0 used.  3543356 avail Mem
PID to renice [default pid = 15217] 15408
  PID USER      PR  NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
15217 student   20   0  134336   4944  3768 S 34.7  0.1   6:58.80 sshd
15408 student   20   0  214388   1184  1040 R 34.7  0.0  10:06.25 cpuHog
15481 student   20   0  214388   1180  1036 R 33.7  0.0   7:01.68 cpuHog
15359 student   20   0  228968   3008  2212 R 31.7  0.1   7:11.20 screen
15017 root      20   0       0      0     0 I 13.9  0.0   1:55.58 kworker/u4:2-ev
15158 root      20   0       0      0     0 I 13.9  0.0   1:21.88 kworker/u4:0-ev
    9 root      20   0       0      0     0 R  2.0  0.0   0:12.88 ksoftirqd/0
15505 root      20   0  257244   4256  3504 R  1.0  0.1   0:06.23 top

Figure 13-5.  Renicing one of the cpuHog programs


You will experience very little change in overall system performance and responsiveness despite having these two cpuHogs running, because there are no other programs seriously competing for resources. However, the CPU hog with the highest priority (most negative nice number) will consistently get the most CPU time, even if by just a little bit. You should notice the nice number and the actual priority as displayed by top. Figure 13-6 shows the results after nearly three hours of runtime with PID 15408 at a nice number of +19. Notice that while PID 15408 had the most cumulative time in Figure 13-5, it now has the least of the two CPU hogs.

top - 14:26:44 up 23:36,  6 users,  load average: 4.28, 4.11, 4.10
Tasks: 161 total,   4 running,  98 sleeping,   0 stopped,   0 zombie
%Cpu0  :  6.7 us, 58.9 sy,  5.6 ni,  1.1 id,  0.0 wa, 13.3 hi, 14.4 si,  0.0 st
%Cpu1  :  1.1 us, 77.3 sy, 17.0 ni,  1.1 id,  0.0 wa,  2.3 hi,  1.1 si,  0.0 st
KiB Mem :  4038488 total,  2973868 free,   240528 used,   824092 buff/cache
KiB Swap: 10485756 total, 10485756 free,        0 used.  3541840 avail Mem

  PID USER      PR  NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
15481 student   20   0  214388   1180  1036 R 56.4  0.0  68:13.93 cpuHog
15408 student   39  19  214388   1184  1040 R 40.6  0.0  63:45.60 cpuHog
15217 student   20   0  134336   4944  3768 R 24.8  0.1  52:31.23 sshd
15359 student   20   0  228968   3008  2212 S 33.7  0.1  51:37.26 screen
16503 root      20   0       0      0     0 I  3.0  0.0   5:57.70 kworker/u4:3-ev
16574 root      20   0       0      0     0 I  5.0  0.0   5:21.60 kworker/u4:2-ev
16950 root      20   0       0      0     0 I  8.9  0.0   2:20.38 kworker/u4:1-ev
    9 root      20   0       0      0     0 S  1.0  0.0   1:58.70 ksoftirqd/0
15505 root      20   0  257244   4256  3504 R  1.0  0.1   1:05.85 top

Figure 13-6.  After running for almost three hours with a nice number of +19, cpuHog PID 15408 has fallen behind cpuHog PID 15481 in cumulative CPU time

Now set the nice number for the process with the higher nice number from +19 to -20. We are changing the nice number of one cpuHog from +19 to -20 and will leave the nice number of the other cpuHog at 0 (zero). Figure 13-7 shows the results of that change.


top - 14:39:45 up 23:49,  6 users,  load average: 4.29, 4.14, 4.10
Tasks: 160 total,   5 running,  97 sleeping,   0 stopped,   0 zombie
%Cpu0  :  4.9 us, 61.8 sy,  0.0 ni,  0.0 id,  0.0 wa, 15.7 hi, 17.6 si,  0.0 st
%Cpu1  :  5.9 us, 92.1 sy,  0.0 ni,  0.0 id,  0.0 wa,  2.0 hi,  0.0 si,  0.0 st
KiB Mem :  4038488 total,  2973276 free,   240688 used,   824524 buff/cache
KiB Swap: 10485756 total, 10485756 free,        0 used.  3541672 avail Mem

  PID USER      PR  NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
15481 student   20   0  214388   1180  1036 R 35.3  0.0  73:50.56 cpuHog
15408 student    1 -19  214388   1184  1040 R 37.3  0.0  68:43.16 cpuHog
15217 student   20   0  134336   4944  3768 R 35.3  0.1  56:33.25 sshd
15359 student   20   0  228968   3008  2212 R 30.4  0.1  55:39.90 screen
16503 root      20   0       0      0     0 I 12.7  0.0   7:00.04 kworker/u4:3-ev
16574 root      20   0       0      0     0 I  0.0  0.0   6:30.02 kworker/u4:2-ev

Figure 13-7.  After changing the nice number of PID 15408 from +19 to -19

Eventually cpuHog 15408 will accumulate more time than cpuHog 15481 because of its higher priority. Leave top and the two cpuHog instances running for now. Notice also that the load averages have continued to climb. Be aware that the nice number is only a "suggestion" to the kernel scheduler, as the info page puts it. Thus a very negative nice number may not result in a process receiving more CPU time. It all depends upon the overall load and the many other data points used in calculating which process gets CPU time and when. But our cpuHogs help us understand that just a bit.

Memory statistics

Performance problems can also be caused by lack of memory. Without sufficient memory in which to run all the active programs, the kernel memory management subsystem will spend time moving the contents of memory between swap space on the disk and RAM in order to keep all processes running. This swapping takes CPU time and I/O bandwidth, so it slows down the progress of productive work. Ultimately a state known as "thrashing" can occur, in which the majority of the computer's time is spent moving memory contents between disk and RAM and little or no time is available to


spend on productive work. In Figure 13-8 we can see that there is plenty of free RAM left and that no swap space has been used.

top - 09:04:07 up 1 day, 18:13,  6 users,  load average: 4.02, 4.03, 4.05
Tasks: 162 total,   6 running,  96 sleeping,   0 stopped,   0 zombie
%Cpu0  :  2.0 us, 72.5 sy,  0.0 ni,  0.0 id,  0.0 wa, 12.7 hi, 12.7 si,  0.0 st
%Cpu1  : 12.2 us, 84.7 sy,  0.0 ni,  0.0 id,  0.0 wa,  3.1 hi,  0.0 si,  0.0 st
KiB Mem :  4038488 total,  2940852 free,   243836 used,   853800 buff/cache
KiB Swap: 10485756 total, 10485756 free,        0 used.  3538144 avail Mem

  PID USER      PR  NI    VIRT    RES   SHR S %CPU %MEM     TIME+ COMMAND
15481 student   20   0  214388   1180  1036 R 48.5  0.0 542:17.06 cpuHog
15408 student    1 -19  214388   1184  1040 R 33.7  0.0 484:37.55 cpuHog
15217 student   20   0  134336   4944  3768 R 31.7  0.1 402:08.24 sshd
15359 student   20   0  228968   3008  2212 R 31.7  0.1 396:29.99 screen

Figure 13-8.  The top memory statistics show that we have plenty of virtual and real memory available

The memory total, free, and used amounts for both RAM and swap space are obvious. The number that is not quite so obvious is the buff/cache one. Buff/cache is RAM, but not swap space, that is used for temporary storage.

Buffers are typically a designated area of memory where the operating system stores data that is being transmitted over the network, a serial communications line, or from another program, for a short period of time until the program or utility that is using that data can catch up and process it. Data in the buffer is not altered before it is removed and used. Buffers enable processes that may work at differing speeds to communicate without loss of data due to that speed mismatch.

Linux provides a tool called a named pipe that works as a storage buffer between two (or more) programs. A user, any user, can create a named pipe, which appears as a file in the directory in which it is created. The named pipe is a FIFO (first in, first out) buffer: the data comes out in the same order in which it went in. Named pipes can be used for any number of purposes. They can provide interprocess communication between scripts and other executable programs, as well as a place to store output data for later use by other programs.


EXPERIMENT 13-3

This experiment should be performed as the student user. In this experiment we will look at one type of buffer called a named pipe. Because it is easily created and used by any user, it allows us to illustrate the function of a buffer. You will need two open terminal sessions as the student user for this experiment.

In one terminal, create a named pipe called mypipe in your home directory. Then do a long listing of the contents of your home directory, and look at the entry for mypipe. It should have a "p" as the file type in the first column to indicate that it is a pipe:

[student@studentvm1 ~]$ mkfifo mypipe
[student@studentvm1 ~]$ ll
total 284
-rw-rw-r--  1 student student  130 Sep 15 16:21 ascii-program.sh
-rwxr-xr-x  1 student student   91 Oct 19 11:35 cpuHog
<snip>
drwxr-xr-x. 2 student student 4096 Aug 18 10:21 Music
prw-rw-r--  1 student student    0 Oct 25 21:21 mypipe
-rw-rw-r--. 1 student student    0 Sep  6 10:52 newfile.txt
<snip>
drwxrwxr-x. 2 student student 4096 Sep  6 14:48 testdir7
drwxr-xr-x. 2 student student 4096 Aug 18 10:21 Videos
[student@studentvm1 ~]$

Now let's put some data into the pipe. We could use any command that creates a data stream, but for this experiment, let's use the lsblk command to list the block devices, essentially the disk drives, on the system and redirect the output to the named pipe. Run the following command in one of the terminal sessions:

[student@studentvm1 ~]$ lsblk -i > mypipe

Notice that you do not get returned to the command prompt; you are left with a blank line. Do not press Ctrl-C to return to the command prompt. In the other terminal session, use the cat command to read the data from the named pipe. This simple, standard core command retrieves the data from the pipe and sends it to STDOUT. At that point we could do anything we want with it:

[student@studentvm1 ~]$ cat mypipe


NAME                                 MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                                    8:0    0   60G  0 disk
|-sda1                                 8:1    0    1G  0 part /boot
`-sda2                                 8:2    0   59G  0 part
  |-fedora_studentvm1-pool00_tmeta   253:0    0    4M  0 lvm
  | `-fedora_studentvm1-pool00-tpool 253:2    0    2G  0 lvm
  |   |-fedora_studentvm1-root       253:3    0    2G  0 lvm  /
  |   `-fedora_studentvm1-pool00     253:6    0    2G  0 lvm
  |-fedora_studentvm1-pool00_tdata   253:1    0    2G  0 lvm
  | `-fedora_studentvm1-pool00-tpool 253:2    0    2G  0 lvm
  |   |-fedora_studentvm1-root       253:3    0    2G  0 lvm  /
  |   `-fedora_studentvm1-pool00     253:6    0    2G  0 lvm
  |-fedora_studentvm1-swap           253:4    0   10G  0 lvm  [SWAP]
  |-fedora_studentvm1-usr            253:5    0   15G  0 lvm  /usr
  |-fedora_studentvm1-home           253:7    0    2G  0 lvm  /home
  |-fedora_studentvm1-var            253:8    0   10G  0 lvm  /var
  `-fedora_studentvm1-tmp            253:9    0    5G  0 lvm  /tmp
sr0                                   11:0    1 1024M  0 rom

Note that all of the data in the pipe is sent to STDOUT. Return to the terminal session in which you added data to the pipe. Notice that it has been returned to the command prompt. Add more data to the pipe using some different commands, and then read it again.

Cache is RAM that is allocated to data that may be changing, that may be used at some time in the near future, or that may be discarded if it is not required. Hardware cache is also common in processors. CPU cache is different from the RAM cache monitored by top. It is a separate space located on the processor chip itself which is used to cache (store) data that has been transferred from RAM until it is needed by the CPU. Not all of the data in a CPU cache will necessarily be used, and some may be discarded to make room for data from RAM that has a higher probability of being used by the CPU. Cache in the CPU is faster than normal system RAM, so getting data into cache that has a high probability of being used by the CPU can improve overall processing speeds. This is definitely not the type of cache that is monitored by the top program.

Buffers and cache space are very similar in that they are both allocated in RAM for temporary storage. The difference is in the manner in which they are used.


The task list

The top task list provides a view of the tasks consuming the most of a particular resource. The task list can be sorted by any of the displayed columns, including CPU and memory usage. By default, top is sorted by CPU usage from high to low. This provides a quick way to view the processes consuming the most CPU cycles. If one stands out, such as a process sucking up 90% or more of the available CPU cycles, this could be indicative of a problem. That is not always the case; some applications just gobble huge amounts of CPU time. The task list also presents other data which, if not immediately obvious, can be obtained from the help option or the top man page.

Again, it is imperative that you observe a correctly running system so that you understand what is normal and will know when you see something abnormal. I spend a great deal of time using top and these other tools just observing the activities of my hosts when there are no extant problems. This enables me to understand what is "normal" for these hosts and gives me the knowledge I need to understand when they are not running normally.
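For a non-interactive snapshot of the same kind of sorted task list, ps can sort by the same columns. This is my own addition, not from the text:

```shell
# Top five CPU consumers, header line included:
ps aux --sort=-%cpu | head -n 6

# Top five memory consumers:
ps aux --sort=-%mem | head -n 6
```

A snapshot like this is handy in scripts or when logging the state of a busy system for later review, which an interactive tool such as top cannot easily do.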

Signals

The top, atop, and htop utilities allow you to send signals to running processes. Each of these signals has a specific function, though some of them can be defined by the receiving program using signal handlers. The kill command, which is separate from top, can also be used to send signals to processes outside of the monitors. The kill -l command can be used to list all possible signals that can be sent. The use of the kill command to send signals can be confusing if you do not actually intend to kill the process. The thing to remember is that the kill command is used to send signals to processes and that at least three of those signals can be used to terminate the process with varying degrees of prejudice:

• SIGTERM (15): Signal 15, SIGTERM, is the default signal sent by top and the other monitors when the k key is pressed. It may also be the least effective because the program must have a signal handler built into it. The program's signal handler must intercept incoming signals and act accordingly. So for scripts, most of which do not have signal handlers, SIGTERM is ignored. The idea behind SIGTERM is that by simply telling the program that you want it to terminate itself, it will take advantage of that to clean up things like open files and then terminate itself in a controlled and nice manner.


• SIGKILL (9): Signal 9, SIGKILL, provides a means of killing even the most recalcitrant programs, including scripts and other programs that have no signal handlers. For scripts and other programs with no signal handler, however, it not only kills the running script but also kills the shell session in which the script is running; this may not be the behavior you want. If you want to kill a process and you don't care about being nice, this is the signal you want. This signal cannot be intercepted by a signal handler in the program code.

• SIGINT (2): Signal 2, SIGINT, can be used when SIGTERM does not work and you want the program to die a little more nicely, for example, without killing the shell session in which it is running. SIGINT sends an interrupt to the session in which the program is running. This is equivalent to terminating a running program, particularly a script, with the Ctrl-C key combination.

There are many other signals, but these are the ones I have found that pertain to terminating a program.
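A script can be given a signal handler with the Bash trap builtin, which is exactly what determines whether SIGTERM is effective. This sketch is my own illustration, not one of the book's experiments:

```shell
# worker installs a handler for SIGTERM (signal 15) and then loops forever.
worker() {
    trap 'echo "caught SIGTERM, cleaning up"; exit 0' TERM
    while true; do sleep 1; done
}

worker &            # run it in the background
pid=$!
sleep 1             # give it a moment to install the trap

kill -15 "$pid"     # SIGTERM: the handler runs and the worker exits cleanly
wait "$pid"         # a kill -9 here would bypass the handler entirely
```

Without the trap line, the same SIGTERM would terminate the worker abruptly, with no chance to clean up; SIGKILL can never be trapped at all.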

Consistency

One more thing about top and many of its relatives: they do not need to run continuously in order to display correct and consistent current statistics. For example, data such as TIME+ is cumulative, starting with the time that the system booted or that the process was launched. Starting top or restarting it does not alter the accuracy of the data. This is not due to any intrinsic capability of top; rather, it is because top and other programs like it obtain their information from the /proc virtual filesystem.
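You can read the same /proc sources directly yourself. A quick sketch of my own; the field positions come from the proc(5) man page:

```shell
cat /proc/uptime         # seconds since boot and cumulative idle seconds
cat /proc/loadavg        # the same load averages that top displays
head -n 3 /proc/meminfo  # MemTotal, MemFree, MemAvailable in KiB

# Per-process cumulative CPU time lives in /proc/<PID>/stat; fields 14 and
# 15 are utime and stime, measured in clock ticks.
awk '{print "utime:", $14, "stime:", $15}' /proc/self/stat
```

Because the kernel keeps these counters itself, any tool that reads them, whenever it starts, sees the same cumulative values.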

Other top-like tools

As with all things Linux, there are other programs that work in a top-like manner and which can be used if you prefer them. In this section we will look at three of these alternatives: htop, atop, and iotop. None of these tools is likely to be installed on your Fedora VM, so let's do that now.


PREPARATION 13-1

Perform this preparation step as root. Install the tools we will need for this chapter:

[root@studentvm1 ~]# dnf -y install htop atop iotop

Note that the package name for atop might show up as being packaged for an earlier version of Fedora. This is uncommon, but it can happen when a tool has not yet been repackaged for the most current Fedora release. Jason, my technical reviewer, and I both noted this. It is not a problem; if it were, the older package would not appear in the repository for the current Fedora release.

htop

The htop utility is very much like top but offers a somewhat different way to interact with the running processes. htop allows selection of multiple processes so that they can be acted upon simultaneously. It allows you to kill and renice the selected processes and to send signals to one or more processes at the same time.

EXPERIMENT 13-4

Leave top and the two CPU hog programs running. In another terminal session as the root user, start htop:

[root@studentvm1 ~]# htop

Notice the bar graphs and load average data at the top of the screen. I have removed some lines of data to reduce the page space required, but you can still see that the data displayed is very similar to top. The function key menu at the bottom of the screen provides easy access to many functions:

1[||||||||||||||||||||||||||||||100.0%]   Tasks: 77, 71 thr; 2 running
2[||||||||||||||||||||||||||||||100.0%]   Load average: 3.19 3.39 3.50
Mem[||||||||||||             252M/3.85G]   Uptime: 1 day, 11:43:50
Swp[                           0K/10.00G]

  PID USER     PRI  NI  VIRT   RES   SHR S CPU% MEM%    TIME+ Command
 4691 student   25   5  209M  1200  1056 R 200.  0.0 15h03:36 /bin/bash ./cpuHog
 4692 student   12  -8  209M  1176  1032 R 172.  0.0 14h50:13 /bin/bash ./cpuHog
 1703 student   21   1  224M  3396  2068 R 123.  0.1  9h18:00 SCREEN
 1396 lightdm   22   2  951M 85620 67092 S  0.0  2.1  0:00.00 /usr/sbin/lightdm-gtk-greeter
 1414 lightdm   21   1  951M 85620 67092 S  0.0  2.1  0:00.00 /usr/sbin/lightdm-gtk-greeter
<snip>
 1045 root      20   0  652M 17156 14068 S  0.0  0.4  0:01.06 /usr/sbin/NetworkManager --no
 4700 root      20   0  652M 17156 14068 S  0.0  0.4  0:00.00 /usr/sbin/NetworkManager --no
<snip>
 1441 lightdm   20   0  572M 12264  9112 S  0.0  0.3  0:00.03 /usr/bin/pulseaudio --daemoniz
<snip>
  872 root      20   0  534M 10284  8744 S  0.0  0.3  0:00.03 /usr/sbin/abrtd -d -s
    1 root      20   0  231M  9844  6828 S  0.0  0.2  0:02.68 /usr/lib/systemd/systemd
F1Help F2Setup F3Search F4Filter F5Tree F6SortBy F7Nice- F8Nice+ F9Kill F10Quit

Use the F6 key to display the "Sort By" menu, and select CPU%. Observe the CPU usage data for the two CPU hogs for a few moments.

Press F2 to display the "Setup" menu. In this menu, you can modify the layout of the header information and choose some alternative ways to display the data. Press the Esc key to return to the main display. We will see in Chapter 14 why the F1 and F10 keys don't work as you would expect in this situation and how to fix that problem.

Press h to read the short Help page. You should also take a bit of time to read the man page for htop.


Use the up/down arrow keys to highlight one of the CPU hogs, and then use the F7 and F8 keys to first decrement the nice number to -20 and then increment it to +19, observing both states for a few moments. Watch how the priority of the process changes as the nice number changes.

Highlight first one cpuHog, and press the space bar to select it; then do the same for the second cpuHog. It is OK for the highlight bar to rest on another process while performing this task because only the selected processes are affected. Use the F7 and F8 keys to adjust the nice number for these two processes. Assuming that the cpuHogs started with different nice numbers, what happens when one process reaches the upper or lower limit?

A process can be deselected: highlight it, and press the space bar again. Deselect the cpuHog that has the highest amount of cumulative CPU time (TIME+), and then set the nice number of the other cpuHog process, which should still be selected, to a more negative number than that of the deselected process.

Use the F5 key to display the process tree view. I like this view because it shows the parent/child hierarchy of the running programs. Scroll down the list of processes until you find the CPU hogs.

There is much more to htop than we have explored here. I recommend that you spend some time exploring it and learning its powerful capabilities. Do not terminate the htop tool.

atop

The atop utility provides much of the same data as top and htop.

EXPERIMENT 13-5

Start the atop program in another root terminal session:

[root@studentvm1 ~]# atop

You should now have top, htop, and atop running along with the two CPU hogs. I have reduced the font size of the output shown in the following in order to have room for more data. You can see the additional information displayed by atop. The atop utility provides detailed information on I/O usage, including aggregated, per device, and per process data. It should be easy to pick that data out of the information in the following as well as on your student VM:

ATOP - studentvm1    2018/10/27 09:22:40    1d12h4m53s elapsed
PRC | sys   23h58m | user  16h11m | #proc    169 | #tslpu     0 | #zombie    0 | #exit      0 |
CPU | sys      48% | user     25% | irq       8% | idle    118% | wait      2% | curscal   ?% |
cpu | sys      21% | user     15% | irq       4% | idle     59% | cpu000 w  1% | curscal   ?% |
cpu | sys      27% | user     10% | irq       4% | idle     59% | cpu001 w  1% | curscal   ?% |
CPL | avg1    3.74 | avg5    3.67 | avg15   3.61 | csw 209886e5 | intr 48899e5 | numcpu     2 |
MEM | tot     3.9G | free    2.7G | cache 669.2M | buff  134.6M | slab  136.2M | hptot   0.0M |
SWP | tot    10.0G | free   10.0G |              |              | vmcom 981.5M | vmlim  11.9G |
LVM | udentvm1-var | busy      5% | read   14615 | write 297786 | MBw/s    0.0 | avio 10.4 ms |
LVM | udentvm1-usr | busy      0% | read   30062 | write   6643 | MBw/s    0.0 | avio 3.35 ms |
LVM | dentvm1-root | busy      0% | read    1408 | write   1089 | MBw/s    0.0 | avio 20.0 ms |
LVM | pool00-tpool | busy      0% | read    1265 | write   1090 | MBw/s    0.0 | avio 21.0 ms |
LVM | pool00_tdata | busy      0% | read    1280 | write   1090 | MBw/s    0.0 | avio 20.9 ms |
LVM | udentvm1-tmp | busy      0% | read     254 | write   1257 | MBw/s    0.0 | avio 17.9 ms |
LVM | dentvm1-home | busy      0% | read     153 | write    108 | MBw/s    0.0 | avio 34.9 ms |
LVM | pool00_tmeta | busy      0% | read      66 | write     15 | MBw/s    0.0 | avio 10.6 ms |
LVM | dentvm1-swap | busy      0% | read     152 | write      0 | MBw/s    0.0 | avio 4.89 ms |
DSK | sda          | busy      5% | read   39186 | write 252478 | MBw/s    0.0 | avio 11.5 ms |
NET | transport    | tcpi  221913 | tcpo  235281 | udpi    3913 | udpo    4284 | tcpao     48 |
NET | network      | ipi   226661 | ipo   242445 | ipfrw      0 | deliv 226655 | icmpo   3836 |
NET | enp0s8    0% | pcki  253285 | pcko  244604 | sp 1000 Mbps | si    6 Kbps | so    2 Kbps |
NET | enp0s3    0% | pcki    1459 | pcko    7235 | sp 1000 Mbps | si    0 Kbps | so    0 Kbps |

  PID  SYSCPU  USRCPU   VGROW  RGROW RDDSK WRDSK RUID    EUID    ST EXC THR S CPUNR  CPU CMD
 4691   8h39m   6h38m  209.4M  1200K    4K    0K student student N-   -   1 R     1  81% cpuHog
 4692   8h43m   6h21m  209.4M  1176K    0K    0K student student N-   -   1 R     1  79% cpuHog
 1703   6h18m   3h08m  224.1M  3396K   32K    8K student student N-   -   1 R     0  50% screen
<snip>


The atop program provides some network utilization data as well as combined and individual detailed CPU usage data. By default, it shows only the processes that actually received CPU time during the collection interval. Press the a key to display all processes. atop also shows data in the header space only if there is some activity; you will see this as you watch the output for a time. It can kill a process, but it cannot renice one.

The atop program starts with an interval of ten seconds. To set the interval to one second, type i and then 1. To access the help facility, type h. Scan this help to learn about the many capabilities of this tool, and enter q to exit help. atop provides insight into a great deal of information, and I find it to be very helpful. It has an option to create a log file, so it can be used to monitor long-term system performance that can be reviewed at a later time. Press q to exit from atop.

These three tools are the ones I start with when looking for problems. Between them they can tell me almost everything I need to know about my running system. I find that atop has the most complex interface, and on a terminal that does not have enough width (columns), the output may be misaligned and distorted.

More tools

There are many more tools available to us as SysAdmins. Most of them concentrate on a single aspect of system operation, such as memory or CPU usage. Let's look briefly at a few of them by type.

Memory tools

The free and vmstat utilities look at memory usage. The vmstat tool also provides data about the CPU usage breakdown such as user, system, and idle time.


EXPERIMENT 13-6

You should perform this experiment as root, but these commands can be used with identical results by any non-privileged user as well. Use the free command to display the system memory information:

[root@studentvm1 ~]# free
              total        used        free      shared  buff/cache   available
Mem:        4038488      255292     2865144        6092      918052     3517436
Swap:      10485756           0    10485756
[root@studentvm1 ~]#

Does it agree fairly closely with the output from top? It should, because they both get their data from the /proc filesystem.

The vmstat command shows the virtual memory statistics, including some of the data shown in top and other utilities. The data output from this command may need more explanation than some of the others, so use the man page to interpret it if you need to:

[root@studentvm1 ~]# vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd    free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 3  0      0 2865308 138528 779548    0    0    51   314    9   41 13 28 58  1  0
[root@studentvm1 ~]#

Neither of these commands is continuous; that is, they display data one time and exit. The watch command can help us turn them into repeating tools. Enter the command shown in the following, and watch it for a while. The output actually appears at the top of the terminal:

[root@studentvm1 ~]# watch free
Every 2.0s: free                    studentvm1: Sat Oct 27 10:24:26 2018

              total        used        free      shared  buff/cache   available
Mem:        4038488      255932     2864320        6092      918236     3516804
Swap:      10485756           0    10485756


The data on the screen will update at the default 2-second interval. The interval can be changed, and the differences between refresh instances can be highlighted. Of course, the watch command works with other tools as well. It has a number of interesting capabilities that you can explore using its man page. When finished, use Ctrl-C to exit from the watch program.
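Under the covers, watch behaves much like a simple loop. This rough equivalent is my own simplification; the real watch repaints the screen in place and can highlight the differences between refreshes with its -d option:

```shell
# Run free three times at roughly 1-second intervals, the way
# "watch -n 1 free" would, but scrolling instead of repainting.
for i in 1 2 3; do
    free
    sleep 1
done
```

The loop version is occasionally useful in its own right, for example, to capture successive samples into a log file with a redirection.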

Tools that display disk I/O statistics

Although top and atop both provide some insight into I/O usage, in top this data is limited to I/O waits. The atop utility provides a significant amount of I/O information, including disk reads (RDDSK) and writes (WRDSK). The iostat program provides, like the free command, a point-in-time view of disk I/O statistics, while iotop provides a top-like view of disk I/O statistics.

EXPERIMENT 13-7 Perform this experiment as root. Look first at the results of the iostat tool:

[root@studentvm1 tmp]# iostat
Linux 4.18.9-200.fc28.x86_64 (studentvm1)   10/28/2018   _x86_64_   (2 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           8.55   11.09   42.74    0.54    0.00   37.08

Device     tps   kB_read/s   kB_wrtn/s   kB_read   kB_wrtn
sda       2.08        2.58       15.58    670835   4051880
dm-0      0.00        0.00        0.00       280        44
dm-1      0.01        0.08        0.02     20917      5576
dm-2      0.01        0.08        0.02     20853      5576
dm-3      0.01        0.14        0.02     37397      5956
dm-4      0.00        0.01        0.00      3320         0
dm-5      0.15        1.42        0.13    368072     34108
dm-7      0.00        0.01        0.00      2916       412
dm-8      2.28        1.01       11.59    261985   3014888
dm-9      0.01        0.02        4.10      6252   1065340


The iostat utility provides point-in-time data about disk reads and writes per second as well as cumulative read and write data. The sda device is the entire hard drive, so the data in that row is an aggregate for all filesystems on that entire device. The dm devices are the individual filesystems on the /dev/sda device. You can use the following command to view the filesystem names:

[root@studentvm1 tmp]# iostat -j ID
Linux 4.18.9-200.fc28.x86_64 (studentvm1)   10/28/2018   _x86_64_   (2 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           8.56   11.10   42.79    0.54    0.00   37.01

  tps   kB_read/s   kB_wrtn/s   kB_read   kB_wrtn  Device
 2.09        2.57       15.57    670835   4059184  ata-VBOX_HARDDISK_VBb426cd38-22c9b6be
 0.00        0.00        0.00       280        44  dm-0
 0.01        0.08        0.02     20917      5640  dm-1
 0.01        0.08        0.02     20853      5640  dm-2
 0.01        0.14        0.02     37397      6028  dm-name-fedora_studentvm1-root
 0.00        0.01        0.00      3320         0  dm-name-fedora_studentvm1-swap
 0.15        1.41        0.13    368072     34580  dm-name-fedora_studentvm1-usr
 0.00        0.01        0.00      2916       412  dm-name-fedora_studentvm1-home
 2.28        1.00       11.59    261985   3021780  dm-name-fedora_studentvm1-var
 0.01        0.02        4.09      6252   1065412  dm-name-fedora_studentvm1-tmp

The iostat program has many options that can be used to provide a more dynamic view of this data as well as to create log files for later perusal. The iotop utility displays a two-line header showing the total and actual disk reads and writes for the current interval, which is one second by default. First we start the iotop program in one terminal as the user root:

[root@studentvm1 tmp]# iotop


At first the full-screen output will look like this sample with not much going on. This output includes all of the processes that will fit in the terminal window, regardless of whether any of them are actually performing I/O or not:

Total DISK READ :      0.00 B/s | Total DISK WRITE :      0.00 B/s
Actual DISK READ:      0.00 B/s | Actual DISK WRITE:      0.00 B/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN      IO>    COMMAND
    1  be/4  root      0.00 B/s    0.00 B/s  0.00 %   0.00 %  systemd --switched-root~system --deserialize 32
    2  be/4  root      0.00 B/s    0.00 B/s  0.00 %   0.00 %  [kthreadd]
    3  be/0  root      0.00 B/s    0.00 B/s  0.00 %   0.00 %  [rcu_gp]
    4  be/0  root      0.00 B/s    0.00 B/s  0.00 %   0.00 %  [rcu_par_gp]
<snip>

Although the cpuHog programs should still be running, they do not perform any disk I/O, so we need a little program to do that for us. Keep the iotop utility running in this terminal window. Open another terminal as the student user such that the running iotop program can be seen in the previous terminal window. Run the short command-line program shown in the following. This dd command makes an image backup of the /home filesystem and stores the result in /tmp. If you created the filesystems according to the filesystem sizes I provided in Table 5-1, this should not fill up the 5GB /tmp filesystem with the content of the 2.0GB /home filesystem:

[root@studentvm1 tmp]# time dd if=/dev/mapper/fedora_studentvm1-home of=/tmp/home.bak
4194304+0 records in
4194304+0 records out
2147483648 bytes (2.1 GB, 2.0 GiB) copied, 96.1923 s, 22.3 MB/s

real    1m36.194s
user    0m0.968s
sys     0m14.808s
[root@studentvm1 tmp]#

I used the time utility to get an idea of how long the dd program would run. On my VM, it ran for a little over a minute and a half of real time, but this will vary depending on the specifications of the underlying physical host and its other loads. The output of the iotop command should change to look somewhat like that in the following. Your results will depend upon the details of your system, but you should at least see some disk activity:


Total DISK READ :      3.14 M/s | Total DISK WRITE :      3.14 M/s
Actual DISK READ:      3.14 M/s | Actual DISK WRITE:     19.72 M/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN      IO>    COMMAND
   42  be/4  root      0.00 B/s    0.00 B/s  0.00 %  99.99 %  [kswapd0]
  780  be/3  root      0.00 B/s    0.00 B/s  0.00 %  99.99 %  [jbd2/dm-9-8]
26769  be/4  root      0.00 B/s    0.00 B/s  0.00 %  93.31 %  [kworker/u4:3+flush-253:9]
13810  be/4  root      3.14 M/s    3.14 M/s  0.00 %  87.98 %  dd if=/dev/mapper/fedor~1-home of=/tmp/home.bak
    1  be/4  root      0.00 B/s    0.00 B/s  0.00 %   0.00 %  systemd --switched-root~system --deserialize 32
    2  be/4  root      0.00 B/s    0.00 B/s  0.00 %   0.00 %  [kthreadd]
<snip>

If the backup completes before you are able to observe it in iotop, run it again. I leave it as an exercise for you to determine the option that can be used with iotop to show only processes that are actually performing I/O. Perform the last part of this experiment with that option set.

The /proc filesystem All of the data displayed by the commands in this chapter, and many other tools that let us look into the current status of the running Linux system, are stored by the kernel in the /proc filesystem. Because the kernel already stores this data in an easily accessible location, and in ASCII text format for the most part, it is possible for other programs to access it with no impact upon the performance of the kernel. There are two important points to understand about the /proc filesystem. First, it is a virtual filesystem; it does not exist on any physical hard drive but only in RAM. Second, the /proc filesystem is a direct interface between the internal conditions and configuration settings of the kernel itself and the rest of the operating system. Simple Linux commands enable us humans to view the current state of the kernel and its configuration parameters. It is also possible to alter many kernel configuration items instantly without a reboot. More on that in Volume 2, Chapter 5.
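As a small sketch of that interface, kernel settings appear as ordinary files under /proc/sys. Reading one shows the current value instantly, and root can change many of them just by writing to the file; the tunables shown here are present on essentially every modern kernel:

```shell
# Read a few kernel values directly from the /proc interface.
cat /proc/sys/kernel/ostype       # the kernel type, i.e., Linux
cat /proc/sys/kernel/hostname     # the current hostname
cat /proc/sys/vm/swappiness       # how aggressively the kernel swaps

# As root, a tunable can be changed immediately, with no reboot:
#   echo 10 > /proc/sys/vm/swappiness
```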


EXPERIMENT 13-8 This experiment should be performed as root. First make /proc the PWD and do a short list of the directory contents:

[root@studentvm1 proc]# ls
1      10     100    1013   1375   1549   1757   ...    <snip - one numbered directory per running process>
acpi         asound       buddyinfo    bus           cgroups      cmdline
consoles     cpuinfo      crypto       devices       diskstats    dma
driver       execdomains  fb           filesystems   fs           interrupts
iomem        ioports      irq          kallsyms      kcore        key-users
keys         kmsg         kpagecgroup  kpagecount    kpageflags   latency_stats
loadavg      locks        mdstat       meminfo       misc         modules
mounts       mtrr         net          pagetypeinfo  partitions   sched_debug
schedstat    scsi         self         slabinfo      softirqs     stat
swaps        sys          sysrq-trigger sysvipc      thread-self  timer_list
tty          uptime       version      vmallocinfo   vmstat       zoneinfo
[root@studentvm1 proc]#

First notice the directories with numerical names. Each directory name is the PID (process ID) of a running process. The data contained inside these directories exposes all of the pertinent information about each process. Let's take a look at one to see what that means. Use htop to find the PID of one of the cpuHogs. These are 4691 and 4692 on my VM, but your PIDs probably will be different. Select one of those PIDs, and then make it the PWD. I used PID 4691, so my current PWD is /proc/4691:

[root@studentvm1 4691]# ls


attr             cwd        loginuid    numa_maps      schedstat     status
autogroup        environ    map_files   oom_adj        sessionid     syscall
auxv             exe        maps        oom_score      setgroups     task
cgroup           fd         mem         oom_score_adj  smaps         timers
clear_refs       fdinfo     mountinfo   pagemap        smaps_rollup  timerslack_ns
cmdline          gid_map    mounts      personality    stack         wchan
comm             io         mountstats  projid_map     stat
coredump_filter  latency    net         root           statm
cpuset           limits     ns          sched
[root@studentvm1 4691]#

Now cat the loginuid file. Notice that most of the data in these files – at least the last item – may not have an ending line feed character. This means that sometimes the new command prompt is printed on the same line as the data. That is what is happening here:

[root@studentvm1 4691]# cat loginuid
1000[root@studentvm1 4691]#

The UID of the user who started this process is 1000. Now go to a terminal that is logged in as the student user and enter this command: [student@studentvm1 ~]$ id uid=1000(student) gid=1000(student) groups=1000(student) [student@studentvm1 ~]$

Thus we see that the user ID (UID) of the student user is 1000 so that the user student started the process with PID 4691. Now let’s watch the scheduling data for this process. I won’t reproduce my results here, but you should be able to see the changes in this live data as they occur. In a root terminal session, run the following command: [root@studentvm1 4691]# watch cat sched

Now return to /proc as the PWD.Enter the following commands to view some of the raw data from the /proc filesystem: [root@studentvm1 proc]# cat /proc/meminfo [root@studentvm1 proc]# cat /proc/cpuinfo [root@studentvm1 proc]# cat /proc/loadavg

These are just a few of the files in /proc that contain incredibly useful information. Spend some time exploring more of the data in /proc. Some of the data is in formats that require a bit of manipulation in order to make sense to us humans, and much of it would only be useful to kernel or system developers.
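As one example of that manipulation, a short awk pipeline converts the raw kB counters in /proc/meminfo into a more readable summary:

```shell
# Summarize total, free, and available memory from /proc/meminfo in MiB.
awk '/^MemTotal:|^MemFree:|^MemAvailable:/ {printf "%-14s %9.1f MiB\n", $1, $2/1024}' /proc/meminfo
```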


We have just touched upon a very tiny bit of the /proc filesystem. Having all of this data exposed and available to us as SysAdmins, not to mention the system developers, makes it easy to obtain information about the kernel, hardware, and running programs. This means that it is not necessary to write code that needs to access the kernel or its data structures in order to discover all of the knowable aspects of a Linux system. The CentOS / RHEL 7.2 documentation has a list of many of the more useful files in the /proc filesystem.6 Some older Fedora documentation also contains this information. The Linux Documentation Project has a brief description of some of the data files in /proc.7

Exploring hardware Sometimes – frequently, actually – I find it is nice to know very specific information about the hardware installed in a host, and we have some tools to assist with this. Two that I like are the lshw (list hardware) and dmidecode (Desktop Management Interface8 decode) commands, which both display as much hardware information as is available in SMBIOS.9 The man page for dmidecode states, “SMBIOS stands for System Management BIOS, while DMI stands for Desktop Management Interface. Both standards are tightly related and developed by the DMTF (Desktop Management Task Force).” These utility tools use data stored in SMBIOS, which is a data storage area on system motherboards that allows the BIOS boot process to store data about the system hardware. Because the task of collecting hardware data is performed at BIOS boot time, the operating system does not need to probe the hardware directly in order to collect information that can be used to perform tasks such as determination of which hardware-related kernel modules to load during the Linux kernel portion of the boot and startup process. We will discuss the boot and startup sequence of a Linux computer in some detail in Chapter 16.

6. Chapter 4, The /proc Filesystem, Red Hat Linux 7.2: The Official Red Hat Linux Reference Guide, www.centos.org/docs//2/rhl-rg-en-7.2/ch-proc.html
7. Linux Documentation Project, Linux System Administrators Guide, 3.7. The /proc filesystem, www.tldp.org/LDP/sag/html/proc-fs.html
8. Wikipedia, Desktop Management Interface, https://en.wikipedia.org/wiki/Desktop_Management_Interface
9. Wikipedia, System Management BIOS, https://en.wikipedia.org/wiki/System_Management_BIOS


The data collected into SMBIOS can be easily accessed by tools such as lshw and dmidecode for use by SysAdmins. I use this data when planning upgrades, for example. The last time I needed to install more RAM in a system, I used the dmidecode utility to determine the total amount of memory capacity available on the motherboard and the current memory type. Many times the motherboard vendor, model, and serial number are also available. This makes it easy to obtain the information needed to locate documentation on the Internet. Other tools, such as lsusb (list USB) and lspci (list PCI), do not use the DMI information; they use data from the special filesystems /proc and /sys which are generated during Linux boot. We will explore these special filesystems in Volume 2, Chapter 5. Because these are command-line tools, we have access to the hardware details of systems that are local or halfway around the planet. The value of being able to determine detailed hardware information about systems without having to dismantle them to do so is incalculable.
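For example, when I was planning that RAM upgrade, a command along these lines extracted just the memory details; dmidecode accepts keyword types such as “memory” as well as the numeric DMI types, though it needs root and real hardware, so it returns nothing useful inside the VM:

```shell
# Show the memory array capacity and each installed module (root required).
dmidecode -t memory | grep -E 'Maximum Capacity|Size:|Type:|Speed:'
```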

EXPERIMENT 13-9 Perform this experiment as root. Install the lshw (list hardware) package: [root@studentvm1 ~]# dnf install -y lshw

This program lists data about the motherboard, CPU, and other installed hardware. Run the following command to list the hardware on your host. It may take a few moments to extract and display the data, so be patient. Look through the data to see all of the (virtual) hardware in your VM: [root@studentvm1 ~]# lshw | less

Now run dmidecode and do the same: [root@studentvm1 ~]# dmidecode | less

It is also possible to list hardware information by DMI type. For example, the motherboard is DMI type 2, so you can use the following command to list the hardware information for just the motherboard: [root@studentvm1 ~]# dmidecode -t 2


You can find the type codes for different types of hardware in the dmidecode man page. There are two commands available to list USB and PCI devices. Both should be installed already. Run the following commands, and take some time to review the output: [root@studentvm1 ~]# lsusb -v | less [root@studentvm1 ~]# lspci -v | less

Caution The results for the dmidecode and lshw tools can be questionable. According to both of their man pages, “More often than not, information contained in the DMI tables is inaccurate, incomplete or simply wrong.” In large part this information deficiency exists because hardware vendors do not always cooperate by storing data about their hardware in a way that is useful – when they provide any data at all.

Monitoring hardware temperatures Keeping computers cool is essential for helping to ensure that they have a long life. Large data centers spend a great deal of energy to keep the computers in them cool. Without going into the details, designers need to ensure that the flow of cool air is directed into the data center and specifically into the racks of computers to keep them cool. It is even better if they can be kept at a fairly constant temperature. Proper cooling is essential even in a home or office environment. In fact, it is even more essential in those environments because the ambient temperature, which is set primarily for the comfort of the humans, is so much higher. One can measure the temperature at many different points in a data center as well as within individual racks. But how can the temperature of the internals of a computer be measured? Fortunately, modern computers have many sensors built into various components to enable monitoring of temperatures, fan speeds, and voltages. If you have ever looked at some of the data available when a computer is in BIOS configuration mode, you can see many of these values. But this cannot show what is happening inside the computer when it is in a real-world situation under loads of various types.


Linux has some software tools available to allow system administrators to monitor those internal sensors. Those tools are all based on the lm_sensors, SMART, and hddtemp library modules, which are available on all Red Hat-based distributions such as Fedora and CentOS and most others as well. The simplest tool is the sensors command. Before the sensors command is run, the sensors-detect command is used to detect as many of the sensors installed on the host system as possible. The sensors command then produces output including motherboard and CPU temperatures, voltages at various points on the motherboard, and fan speeds. The sensors command also displays the temperature ranges considered to be normal, high, and critical. The hddtemp command displays temperatures for a specified hard drive. The smartctl command shows the current temperature of the hard drive, various measurements that indicate the potential for hard drive failure, and, in some cases, an ASCII text history graph of the hard drive temperatures. This last output can be especially helpful in some types of problems. There are also a number of good graphical monitoring tools that can be used to monitor the thermal status of your computers. I like GKrellM for my desktop, and there are plenty of others available for you to choose from. I suggest installing these tools and monitoring the outputs on every newly installed system. That way you can learn what temperatures are normal for your computers. Using tools like these allows you to monitor the temperatures in real time and understand how added loads of various types affect those temperatures.

EXPERIMENT 13-10 As root, install the lm_sensors and hddtemp packages. If the physical host for your virtual machine is a Linux system, you may perform these experiments on that system if you have root access: [root@studentvm1 proc]# dnf -y install lm_sensors hddtemp

It is necessary to configure the lm_sensors package before useful data can be obtained. Unfortunately this is a highly interactive process, but you can usually just press the Enter key to take all of the defaults, some of which are “no,” or pipe yes to answer yes to all options: [root@studentvm1 proc]# yes | sensors-detect


Because these utilities require real hardware, they do not produce any results on a virtual machine. So I will illustrate the results with data from one of my own hosts, my primary workstation:

[root@david proc]# sensors
coretemp-isa-0000
Adapter: ISA adapter
Package id 0:  +54.0°C  (high = +86.0°C, crit = +96.0°C)
Core 0:        +44.0°C  (high = +86.0°C, crit = +96.0°C)
Core 1:        +51.0°C  (high = +86.0°C, crit = +96.0°C)
Core 2:        +49.0°C  (high = +86.0°C, crit = +96.0°C)
Core 3:        +51.0°C  (high = +86.0°C, crit = +96.0°C)
Core 4:        +51.0°C  (high = +86.0°C, crit = +96.0°C)
Core 5:        +50.0°C  (high = +86.0°C, crit = +96.0°C)
Core 6:        +47.0°C  (high = +86.0°C, crit = +96.0°C)
Core 7:        +51.0°C  (high = +86.0°C, crit = +96.0°C)
Core 8:        +48.0°C  (high = +86.0°C, crit = +96.0°C)
Core 9:        +51.0°C  (high = +86.0°C, crit = +96.0°C)
Core 10:       +53.0°C  (high = +86.0°C, crit = +96.0°C)
Core 11:       +47.0°C  (high = +86.0°C, crit = +96.0°C)
Core 12:       +52.0°C  (high = +86.0°C, crit = +96.0°C)
Core 13:       +52.0°C  (high = +86.0°C, crit = +96.0°C)
Core 14:       +54.0°C  (high = +86.0°C, crit = +96.0°C)
Core 15:       +52.0°C  (high = +86.0°C, crit = +96.0°C)

radeon-pci-6500
Adapter: PCI adapter
temp1:         +40.5°C  (crit = +120.0°C, hyst = +90.0°C)

asus-isa-0000
Adapter: ISA adapter
cpu_fan:        0 RPM


[root@david proc]# hddtemp
/dev/sda: TOSHIBA HDWE140: 38°C
/dev/sdb: ST320DM000-1BD14C: 33°C
/dev/sdc: ST3000DM001-1CH166: 31°C
/dev/sdd: ST1000DM003-1CH162: 32°C
/dev/sdi: WD My Passport 070A: drive supported, but it doesn't have a temperature sensor.
[root@david proc]#

Monitoring hard drives Hard drives are one of the most common failure points in computers, right after fans. They have moving parts, and those are always more prone to failure than electronic integrated circuit chips. Knowing in advance that a hard drive is likely to fail soon can save much time and aggravation. The Self-Monitoring, Analysis and Reporting Technology10 (SMART) capabilities built into modern hard drives enable SysAdmins like us to identify drives that are likely to fail soon and replace them during a scheduled maintenance. The smartctl command is used to access the data and statistics available from SMART-enabled hard drives. Most hard drives are SMART-enabled these days, but not all, especially very old hard drives.
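Before digging through the full report, a quick check is often enough. The -H option prints only the overall health self-assessment, and the -i option prints the identity section, which confirms whether SMART is available and enabled on the drive; both require root:

```shell
# Quick SMART health check (root required).
smartctl -H /dev/sda

# Verify that the drive supports SMART and that it is enabled.
smartctl -i /dev/sda | grep 'SMART support'
```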

EXPERIMENT 13-11 Perform this experiment as root. If you have root access to a physical Linux host, you might prefer to carefully perform this experiment on that host instead of the VM.You might need to install the smartmontools package on the physical host: [root@david ~]# dnf -y install smartmontools

First verify the device name of your hard drive. There should only be one hard drive, sda, on your VM because that is the way we created it. However you may still see the USB drive from the experiments in Chapter 12; that is OK, just be sure to use the sda device:

10. Wikipedia, S.M.A.R.T., https://en.wikipedia.org/wiki/S.M.A.R.T.


[root@studentvm1 ~]# lsblk -i
NAME                          MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                             8:0    0   60G  0 disk
|-sda1                          8:1    0    1G  0 part /boot
`-sda2                          8:2    0   59G  0 part
  |-fedora_studentvm1-root    253:0    0    2G  0 lvm  /
  |-fedora_studentvm1-swap    253:1    0    6G  0 lvm  [SWAP]
  |-fedora_studentvm1-usr     253:2    0   15G  0 lvm  /usr
  |-fedora_studentvm1-home    253:3    0    4G  0 lvm  /home
  |-fedora_studentvm1-var     253:4    0   10G  0 lvm  /var
  `-fedora_studentvm1-tmp     253:5    0    5G  0 lvm  /tmp
sdb                             8:16   0   20G  0 disk
|-sdb1                          8:17   0    2G  0 part /TestFS
|-sdb2                          8:18   0    2G  0 part
`-sdb3                          8:19   0   16G  0 part
  `-NewVG--01-TestVol1        253:6    0    4G  0 lvm
sdc                             8:32   0    2G  0 disk
`-NewVG--01-TestVol1          253:6    0    4G  0 lvm
sdd                             8:48   0    2G  0 disk
`-sdd1                          8:49   0    2G  0 part [SWAP]
sr0                            11:0    1 1024M  0 rom
[root@studentvm1 ~]#

Use the following command to print all SMART data and pipe it through the less filter. This assumes that your hard drive is /dev/sda, which it probably is in the virtual environment:

[root@studentvm1 proc]# smartctl -x /dev/sda | less

There is not much to see because your VM is using a virtual hard drive. So here are the results from one of the hard drives on my primary workstation: [root@david ~]# smartctl -x /dev/sda smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.18.16-200.fc28.x86_64] (local build) Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org


=== START OF INFORMATION SECTION ===
Model Family:     Toshiba X300
Device Model:     TOSHIBA HDWE140
Serial Number:    46P2K0DZF58D
LU WWN Device Id: 5 000039 6fb783fa0
Firmware Version: FP2A
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Wed Oct 31 08:59:01 2018 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM level is:     128 (minimum power consumption without standby)
Rd look-ahead is: Enabled
Write cache is:   Enabled
DSN feature is:   Unavailable
ATA Security is:  Disabled, frozen [SEC2]
Wt Cache Reorder: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                  (120) seconds.
Offline data collection


capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:           (2) minutes.
Extended self-test routine
recommended polling time:         (469) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS   VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     PO-R--   100   100   050    -    0
  2 Throughput_Performance  P-S---   100   100   050    -    0
  3 Spin_Up_Time            POS--K   100   100   001    -    4146
  4 Start_Stop_Count        -O--CK   100   100   000    -    132
  5 Reallocated_Sector_Ct   PO--CK   100   100   050    -    0
  7 Seek_Error_Rate         PO-R--   100   100   050    -    0
  8 Seek_Time_Performance   P-S---   100   100   050    -    0
  9 Power_On_Hours          -O--CK   051   051   000    -    19898
 10 Spin_Retry_Count        PO--CK   102   100   030    -    0
 12 Power_Cycle_Count       -O--CK   100   100   000    -    132
191 G-Sense_Error_Rate      -O--CK   100   100   000    -    63
192 Power-Off_Retract_Count -O--CK   100   100   000    -    82
193 Load_Cycle_Count        -O--CK   100   100   000    -    162


194 Temperature_Celsius     -O---K   100   100   000    -    36 (Min/Max 24/45)
196 Reallocated_Event_Count -O--CK   100   100   000    -    0
197 Current_Pending_Sector  -O--CK   100   100   000    -    0
198 Offline_Uncorrectable   ----CK   100   100   000    -    0
199 UDMA_CRC_Error_Count    -O--CK   200   253   000    -    0
220 Disk_Shift              -O----   100   100   000    -    0
222 Loaded_Hours            -O--CK   051   051   000    -    19891
223 Load_Retry_Count        -O--CK   100   100   000    -    0
224 Load_Friction           -O---K   100   100   000    -    0
226 Load-in_Time            -OS--K   100   100   000    -    210
240 Head_Flying_Hours       P-----   100   100   001    -    0
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01       SL      R/O      1  Summary SMART error log
0x02       SL      R/O     51  Comprehensive SMART error log
0x03       GPL     R/O     64  Ext. Comprehensive SMART error log
0x04       GPL,SL  R/O      8  Device Statistics log
0x06       SL      R/O      1  SMART self-test log
0x07       GPL     R/O      1  Extended self-test log
0x08       GPL     R/O      2  Power Conditions log
0x09       SL      R/W      1  Selective self-test log
0x10       GPL     R/O      1  NCQ Command Error log
0x11       GPL     R/O      1  SATA Phy Event Counters log
0x24       GPL     R/O  12288  Current Device Internal Status Data log
0x30       GPL,SL  R/O      9  IDENTIFY DEVICE data log
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log
0xa7       GPL     VS       8  Device vendor specific log
0xe0       GPL,SL  R/W      1  SCT Command/Status
0xe1       GPL,SL  R/W      1  SCT Data Transfer


SMART Extended Comprehensive Error Log Version: 1 (64 sectors)
No Errors Logged

SMART Extended Self-test Log Version: 1 (1 sectors)
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Status Version:                  3
SCT Version (vendor specific):       1 (0x0001)
SCT Support Level:                   1
Device State:                        Active (0)
Current Temperature:                    36 Celsius
Power Cycle Min/Max Temperature:     34/45 Celsius
Lifetime    Min/Max Temperature:     24/45 Celsius
Under/Over Temperature Limit Count:   0/0

SCT Temperature History Version:     2
Temperature Sampling Period:         1 minute
Temperature Logging Interval:        1 minute
Min/Max recommended Temperature:      5/55 Celsius
Min/Max Temperature Limit:            5/55 Celsius
Temperature History Size (Index):    478 (197)

Index    Estimated Time   Temperature Celsius
 198    2018-10-31 01:02    37  ******************
 ...    ..( 12 skipped).    ..  ******************
 211    2018-10-31 01:15    37  ******************
 212    2018-10-31 01:16    36  *****************
 ...    ..(137 skipped).    ..  *****************
<snip>


  16    2018-10-31 05:58    35  ****************
  17    2018-10-31 05:59    36  *****************
 ...    ..(179 skipped).    ..  *****************
 197    2018-10-31 08:59    36  *****************

SCT Error Recovery Control:
           Read: Disabled
          Write: Disabled

Device Statistics (GP Log 0x04)
Page  Offset Size         Value Flags Description
0x01  =====  =                =  ===  == General Statistics (rev 2) ==
0x01  0x008  4              132  ---  Lifetime Power-On Resets
0x01  0x010  4            19898  ---  Power-on Hours
0x01  0x018  6      37056039193  ---  Logical Sectors Written
0x01  0x020  6         31778305  ---  Number of Write Commands
0x01  0x028  6      46110927573  ---  Logical Sectors Read
0x01  0x030  6        256272184  ---  Number of Read Commands
0x02  =====  =                =  ===  == Free-Fall Statistics (rev 1) ==
0x02  0x010  4               63  ---  Overlimit Shock Events
0x03  =====  =                =  ===  == Rotating Media Statistics (rev 1) ==
0x03  0x008  4            19897  ---  Spindle Motor Power-on Hours
0x03  0x010  4            19891  ---  Head Flying Hours
0x03  0x018  4              162  ---  Head Load Events
0x03  0x020  4                0  ---  Number of Reallocated Logical Sectors
0x03  0x028  4                0  ---  Read Recovery Attempts
0x03  0x030  4                0  ---  Number of Mechanical Start Failures
0x04  =====  =                =  ===  == General Errors Statistics (rev 1) ==
0x04  0x008  4                0  ---  Number of Reported Uncorrectable Errors
0x04  0x010  4                1  ---  Resets Between Cmd Acceptance and Completion
0x05  =====  =                =  ===  == Temperature Statistics (rev 1) ==
0x05  0x008  1               36  ---  Current Temperature
0x05  0x010  1               37  N--  Average Short Term Temperature
0x05  0x018  1               38  N--  Average Long Term Temperature
0x05  0x020  1               45  ---  Highest Temperature
0x05  0x028  1               24  ---  Lowest Temperature
0x05  0x030  1               41  N--  Highest Average Short Term Temperature
0x05  0x038  1               30  N--  Lowest Average Short Term Temperature


0x05  0x040  1               39  N--  Highest Average Long Term Temperature
0x05  0x048  1               32  N--  Lowest Average Long Term Temperature
0x05  0x050  4                0  ---  Time in Over-Temperature
0x05  0x058  1               55  ---  Specified Maximum Operating Temperature
0x05  0x060  4                0  ---  Time in Under-Temperature
0x05  0x068  1                5  ---  Specified Minimum Operating Temperature
0x06  =====  =                =  ===  == Transport Statistics (rev 1) ==
0x06  0x008  4             1674  ---  Number of Hardware Resets
0x06  0x018  4                0  ---  Number of Interface CRC Errors
0x07  =====  =                =  ===  == Solid State Device Statistics (rev 1) ==
                                 |||_ C monitored condition met
                                 ||__ D supports DSN
                                 |___ N normalized value

Pending Defects log (GP Log 0x0c) not supported

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0001  4            0  Command failed due to ICRC error
0x0002  4            0  R_ERR response for data FIS
0x0003  4            0  R_ERR response for device-to-host data FIS
0x0004  4            0  R_ERR response for host-to-device data FIS
0x0005  4            0  R_ERR response for non-data FIS
0x0006  4            0  R_ERR response for device-to-host non-data FIS
0x0007  4            0  R_ERR response for host-to-device non-data FIS
0x0008  4            0  Device-to-host non-data FIS retries
0x0009  4           15  Transition from drive PhyRdy to drive PhyNRdy
0x000a  4           16  Device-to-host register FISes sent due to a COMRESET
0x000b  4            0  CRC errors within host-to-device FIS
0x000d  4            0  Non-CRC errors within host-to-device FIS
0x000f  4            0  R_ERR response for host-to-device data FIS, CRC
0x0010  4            0  R_ERR response for host-to-device data FIS, non-CRC
0x0012  4            0  R_ERR response for host-to-device non-data FIS, CRC
0x0013  4            0  R_ERR response for host-to-device non-data FIS, non-CRC
[root@david ~]#


One easy-to-understand part of this long and complex result is the START OF READ SMART DATA SECTION. The result shown there is

SMART overall-health self-assessment test result: PASSED

The specific data shown for a particular hard drive will vary depending upon the device vendor and model. And more recent versions of the software can take advantage of additional information stored by newer hard drives. The SMART11 reports contain a great deal of information which can be useful if it can be understood. At first glance the data can be very confusing, but a little knowledge can be very helpful. Contributing to the confusion is the fact that there are no standards for the information being displayed and different vendors implement SMART in different ways. One large cloud storage company has been keeping records of close to 40,000 hard drives over the last few years and posting their data on the Web. According to an article12 on the Computer World web site, the company identified the following five data points that can predict hard drive failures:

• SMART 5: Reallocated_Sector_Count
• SMART 187: Reported_Uncorrectable_Errors
• SMART 188: Command_Timeout
• SMART 197: Current_Pending_Sector_Count
• SMART 198: Offline_Uncorrectable

Each of these attributes is listed in the SMART Attributes section of the output, and low numbers are good. If any one– or especially more than one– of these attributes have high numbers, then it would be a good idea to replace the hard drive.
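That check can easily be scripted. The following is a minimal sketch of my own, not from the book: the sample attribute lines are hypothetical data standing in for real smartctl output, simplified to three columns, and the awk pattern simply flags any of the five predictive attributes whose raw value is nonzero.

```shell
# Hypothetical sample standing in for real "smartctl -A /dev/sda" output.
# A real SMART attribute table has more columns, so the raw-value field
# number would need to be adjusted to match.
cat > /tmp/smart-sample.txt <<'EOF'
ID# ATTRIBUTE_NAME          RAW_VALUE
  5 Reallocated_Sector_Ct   0
187 Reported_Uncorrect      2
188 Command_Timeout         0
194 Temperature_Celsius     34
197 Current_Pending_Sector  1
198 Offline_Uncorrectable   0
EOF

# Flag the five failure-predicting attributes when their raw value is nonzero.
awk '$1 ~ /^(5|187|188|197|198)$/ && $3 > 0 {
        print "WARNING: attribute", $1, $2, "raw value =", $3
     }' /tmp/smart-sample.txt
```

On a real host you would feed the live output of smartctl into the same kind of awk filter rather than a sample file.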

Wikipedia, S.M.A.R.T., https://en.wikipedia.org/wiki/S.M.A.R.T.
Mearian, Lucas, The 5 SMART stats that actually predict hard drive failure, Computer World, www.computerworld.com/article/2846009/the-5-smart-stats-that-actually-predict-hard-drive-failure.html


System statistics with SAR

The sar command is one of my favorite tools when it comes to resolving problems. SAR stands for System Activity Reporter. Its primary function is to collect system performance data for each day and store it in log files for later display. Data is collected as ten-minute averages, but more granular collection can also be configured. Data is retained for one month. The only time I have made any changes to the SAR configuration is when I needed to collect data every minute instead of every ten minutes in order to get a better handle on the exact time a particular problem was occurring.

The SAR data is stored in two files per day in the /var/log/sa directory. Collecting data more frequently than every ten minutes can cause these files to grow very large.

In one place I worked, we had a problem that would start and escalate so quickly that the default ten-minute interval was not very helpful in determining which occurred first: CPU load, high disk activity, or something else. Using a one-minute interval, we determined that not only was CPU activity high but that it was preceded by a short interval of high network activity as well as high disk activity. It was ultimately determined that this was an unintentional denial of service (DoS) attack on the web server that was complicated by the fact that there was too little RAM installed in the computer to handle the temporary overload. Adding 2GB of RAM to the existing 2GB resolved the problem, and further DoS attacks have not caused problems.

Installation and configuration

SAR is installed as part of the sysstat package in Red Hat-based distributions; however, it is not installed by default in at least some of the current Fedora distributions. We installed it in Chapter 7. By now the SAR data collection has been running long enough to accumulate a significant amount of data for us to explore. After installing SAR as part of the sysstat package, there is normally nothing that needs to be done to alter its configuration or to start it collecting data. Data is collected on every ten-minute mark of each hour.

Examining collected data

The output from the sar command can be very detailed. A full day of data on my primary workstation, the one with 16 Intel cores and 32 CPUs, produced 14,921 lines of data.

Chapter 13

Tools forProblem Solving

You can deal with this in multiple ways. You can choose to limit the data displayed by specifying only certain subsets of data, you can grep out the data you want, or you can pipe it through the less tool and page through the data using less’s built-in search feature.
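As a quick sketch of the grep approach (using a few hypothetical sar-style lines in place of a real day's output, which would come from simply running sar):

```shell
# A few sample lines standing in for a full day of sar output.
cat > /tmp/sar-day.txt <<'EOF'
09:00:05 AM  all  0.01  0.03  0.13  1.54  0.00  98.28
09:10:05 AM  all  0.01  0.00  0.09  0.95  0.00  98.95
Average:     all  0.68  0.42  8.89  1.02  0.00  88.98
EOF

# Keep only the summary lines; on a real host: sar | grep '^Average:'
grep '^Average:' /tmp/sar-day.txt
```

The same data could instead be piped through less and searched interactively with less's / command.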

EXPERIMENT 13-12
Perform this experiment as the student user. Root privileges are not required to run the sar command. Because of the very large amount of data that can be emitted by SAR, I will not reproduce it all here except for headers and a few lines of data to illustrate the results.

Note  Some options for the sar command are in uppercase as shown. Using lowercase will result in an error or in incorrect data being displayed.

First, just enter the sar command with no options, which displays only aggregate CPU performance data. The sar command uses the current day by default, starting at midnight or at the time in the current day when the system was booted. If the host was rebooted during the current day, there will be a notification in the results. Note that some of the output of the SAR command can be very wide:

[student@studentvm1 ~]$ sar
Linux 4.18.9-200.fc28.x86_64 (studentvm1)    11/01/2018    _x86_64_    (2 CPU)

08:44:38       LINUX RESTART    (2 CPU)

08:50:01 AM    CPU    %user    %nice  %system  %iowait   %steal    %idle
09:00:05 AM    all     0.01     0.03     0.13     1.54     0.00    98.28
09:10:05 AM    all     0.01     0.00     0.09     0.95     0.00    98.95
09:20:05 AM    all     0.01     0.00     0.08     1.14     0.00    98.77
09:30:02 AM    all     0.02     0.00     0.09     1.17     0.00    98.72
09:40:05 AM    all     0.01     0.00     0.08     0.95     0.00    98.96
09:50:02 AM    all     0.01     0.00     0.09     1.04     0.00    98.86
10:00:01 AM    all     0.01     0.01     0.09     1.29     0.00    98.61
10:10:01 AM    all     0.01     0.00     0.08     0.93     0.00    98.98
10:20:05 AM    all     6.26     3.91    82.39     0.18     0.00     7.26
Average:       all     0.68     0.42     8.89     1.02     0.00    88.98


11:10:03 AM      LINUX RESTART    (2 CPU)

11:20:31 AM    CPU    %user    %nice  %system  %iowait   %steal    %idle
11:30:31 AM    all    18.41    10.15    71.34     0.00     0.00     0.10
11:40:07 AM    all    20.07    10.93    68.83     0.00     0.00     0.17
11:50:18 AM    all    18.68    10.32    70.88     0.00     0.00     0.13
12:00:31 PM    all    17.83    10.09    71.98     0.00     0.00     0.09
12:10:31 PM    all    17.87    10.95    71.07     0.00     0.00     0.11
Average:       all    18.55    10.48    70.84     0.00     0.00     0.12
[student@studentvm1 ~]$

All of this data is an aggregate for all CPUs, in this case two, for each ten-minute time period. It is also the same data you would see in top, htop, and atop for CPU usage. Use the next command to view details for each individual CPU:

[student@studentvm1 ~]$ sar -P ALL
Linux 4.18.9-200.fc28.x86_64 (studentvm1)    11/01/2018    _x86_64_    (2 CPU)

08:44:38       LINUX RESTART    (2 CPU)

08:50:01 AM    CPU    %user    %nice  %system  %iowait   %steal    %idle
09:00:05 AM    all     0.01     0.03     0.13     1.54     0.00    98.28
09:00:05 AM      0     0.02     0.00     0.12     0.24     0.00    99.61
09:00:05 AM      1     0.01     0.05     0.14     2.85     0.00    96.95

09:00:05 AM    CPU    %user    %nice  %system  %iowait   %steal    %idle
09:10:05 AM    all     0.01     0.00     0.09     0.95     0.00    98.95
09:10:05 AM      0     0.02     0.00     0.08     0.10     0.00    99.80
09:10:05 AM      1     0.01     0.00     0.10     1.80     0.00    98.09
<snip>

12:20:31 PM    CPU    %user    %nice  %system  %iowait   %steal    %idle
12:30:31 PM    all    15.4%    13.6%    70.8%     0.0%     0.0%     0.2%
12:30:31 PM      0    16.9%    15.3%    67.7%     0.0%     0.0%     0.1%
12:30:31 PM      1    13.9%    11.8%    73.9%     0.0%     0.0%     0.4%

Average:       CPU    %user    %nice  %system  %iowait   %steal    %idle
Average:       all    18.3%    10.7%    70.9%     0.0%     0.0%     0.1%
Average:         0    18.8%    15.6%    65.6%     0.0%     0.0%     0.0%
Average:         1    17.8%     5.9%    76.1%     0.0%     0.0%     0.2%


Now use the following command to view disk statistics. The -h option makes the data more easily readable by humans and, for block devices (disks), also shows the name of the device. The -d option specifies that SAR is to display disk activity:

[student@studentvm1 ~]$ sar -dh
Linux 4.18.9-200.fc28.x86_64 (studentvm1)    11/01/2018    _x86_64_    (2 CPU)

08:44:38       LINUX RESTART    (2 CPU)

08:50:01 AM    tps    rkB/s    wkB/s  areq-sz   aqu-sz    await    svctm    %util  DEV
09:00:05 AM   0.06     0.0k     0.2k     4.0k     0.00    27.37    23.71     0.1%  fedora_studentvm1-tmp
09:00:05 AM   7.71   154.0k    12.3k    21.6k     0.06     8.39     4.21     3.2%  fedora_studentvm1-var
09:00:05 AM   0.00     0.0k     0.0k     0.0k     0.00     0.00     0.00     0.0%  fedora_studentvm1-home
09:00:05 AM   0.86    14.3k     1.1k    18.1k     0.01    10.25     4.41     0.4%  fedora_studentvm1-usr
09:00:05 AM   0.00     0.0k     0.0k     0.0k     0.00     0.00     0.00     0.0%  fedora_studentvm1-swap
09:00:05 AM   0.09     0.7k     0.2k     9.5k     0.00    15.53     9.13     0.1%  fedora_studentvm1-root
09:00:05 AM   0.09     0.5k     0.1k     7.1k     0.00    15.53     9.13     0.1%  fedora_studentvm1-pool00-tpool
09:00:05 AM   0.09     0.5k     0.1k     7.1k     0.00    15.53     9.13     0.1%  fedora_studentvm1-pool00_tdata
09:00:05 AM   0.00     0.0k     0.0k     0.0k     0.00     0.00     0.00     0.0%  fedora_studentvm1-pool00_tmeta
09:00:05 AM   8.12   168.8k    13.5k    22.5k     0.07     7.88     4.49     3.6%  sda
09:10:05 AM   0.02     0.0k     0.1k     3.7k     0.00    34.25    34.25     0.1%  fedora_studentvm1-pool00-tpool
09:10:05 AM   0.02     0.0k     0.1k     3.7k     0.00    34.25    34.25     0.1%  fedora_studentvm1-pool00_tdata
09:10:05 AM   0.00     0.0k     0.0k     0.0k     0.00     0.00     0.00     0.0%  fedora_studentvm1-pool00_tmeta
09:10:05 AM   1.74     0.4k     8.3k     5.0k     0.10    55.05    14.06     2.4%  sda
<snip>

Try the preceding command without the -h option. Run the following command to view all of the output for the current day or at least since the host was booted for the first time during the current day: [student@studentvm1 ~]$ sar -A | less

Use the man page for the sar command to interpret the results and to get an idea of the many options available. Many of those options allow you to view specific data such as network and disk performance. I typically use the sar -A command because many of the types of data available are interrelated, and sometimes I find something that gives me a clue to a performance problem in a section of the output that I might not have looked at otherwise. You can limit the output to just the aggregate CPU activity with the -u option. Try that, and notice that you only get the composite CPU data, not the data for the individual CPUs. Also try the -r option for memory and -S for swap space. It is also possible to combine these options, so the following command will display CPU, memory, and swap space:

[student@studentvm1 ~]$ sar -urS

If you want only data between certain times, you can use -s and -e to define the start and end times, respectively. The following command displays all CPU data, both individual and aggregate for the time period between 7:50 AM and 8:11 AM today: [student@studentvm1 ~]$ sar -P ALL -s 07:50:00 -e 08:11:00

Note that all times must be specified in 24-hour format. If you have multiple CPUs, each CPU is detailed individually, and the average for all CPUs is also given. The next command uses the -n option to display network statistics for all interfaces: [student@studentvm1 ~]$ sar -n ALL | less

Data collected for previous days can also be examined by specifying the desired log file. Assume that you want to see the data for the second day of the month; the following command displays all collected data for that day. The last two digits of each file name are the day of the month on which the data was collected.

390

Chapter 13

Tools forProblem Solving

I used the file sa02 in the following example, but you should list the contents of the /var/log/sa directory and choose a file that exists there for your host:

[student@studentvm1 ~]$ sar -A -f /var/log/sa/sa02 | less
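If you would rather not work out which file is which, the day-of-month naming lends itself to a small helper. This is a sketch of my own (the function name is made up), using GNU date to compute the file name for a day in the past:

```shell
# Hypothetical helper: print the SAR data file name for N days ago.
# The last two digits of each file name are the day of the month.
sa_file_for() {
    local days_ago="$1"
    printf '/var/log/sa/sa%s\n' "$(date -d "${days_ago} days ago" +%d)"
}

# View yesterday's data (if that file exists on your host):
#   sar -A -f "$(sa_file_for 1)" | less
sa_file_for 1
```

This relies on the GNU date -d option for relative dates, so it is Linux-specific.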

You can also use SAR to display (nearly) real-time data. The following command displays memory usage in five-second intervals for ten iterations: [student@studentvm1 ~]$ sar -r 5 10

This is an interesting option for sar as it can provide a series of data points for a defined period of time that can be examined in detail and compared. The SAR utility is very powerful and has many options. We have merely touched on a few of them, and all are listed in the man page. I suggest you familiarize yourself with SAR because it is very useful for locating those performance problems that occur when no one is around to see them.

If you are not very familiar with Intel and related hardware, some of the output from the sar command may not be particularly meaningful to you. Over time SysAdmins are pretty much bound to learn a great deal about hardware, and you will, too. The best way I can suggest to do this in a relatively safe manner is to use the tools you are learning in this course to explore all of the VMs and physical hosts you have available to you. I also like to build my own computers from parts I purchase at my local computer store and on the Internet, and I fix my own hardware.

Cleanup

A little cleanup may be required at this point. We want to kill the cpuHogs, and you may also want to close many, but not all, of the terminal sessions you opened during the course of the experiments in this chapter. Use top to kill one of the CPU hog processes using signal 2. Now use htop to kill the other CPU hog process with signal 15. Quit the top, htop, and atop programs, and close all but one or two of the terminal sessions.


Chapter summary

This chapter has introduced you to some of the most common tools that SysAdmins use for determining the source of many types of performance problems. Each tool that we explored provides useful information that can help locate the source of a problem. Although I start with top, I also depend upon all of the other tools because they are useful and valuable. Each one has enabled me to resolve a problem when the others could not.

There are many other tools available, and a good number of them can be used on a GUI desktop to display pretty graphs of many types. We have looked at these specific tools because they are the ones that are most likely to be available or easily installed on almost any Linux host. As you progress in your experience as a SysAdmin, you will find other tools that will be useful to you.

In no way should you try to memorize every option of every tool. Just knowing that these tools are there and that they each have useful and interesting capabilities gives you a place to start when trying to solve problems. You can explore more as time permits, and having a specific task, such as fixing a broken system, can focus your efforts on the specifics needed for that problem.

In my opinion it is completely unnecessary to purchase expensive tools that merely repackage the content of the /proc filesystem, because that is exactly what they do. Nothing can give you any more information than what is already available to you using standard Linux tools. Linux even has many GUI tools from which to choose that can display graphs of all of the data we have looked at here and more, and they can do it with both local and remote hosts.

And finally, by now you should be used to viewing the man and info pages as well as the available help options on most commands we are using to learn more about them. So I suspect you are as tired of reading those suggestions as I am of writing them. Let's just stipulate that one thing you should always do when you read about a new command is to use those tools to assist you in learning more.


Exercises

Perform the following exercises to complete this chapter:

1. Can you set the refresh delay for the top command to a sub-second value, such as .2 or .5 seconds?
2. Define the three load average numbers.
3. Using top, how much memory and swap space are free on the StudentVM1 virtual host?
4. List at least three other tools in which you can find memory usage information.
5. What does the TIME+ value in the top display tell you?
6. How much memory and swap space are free on this VM?
7. What is the default sort column for top?
8. Change the top sort column first to PID and then to TIME+. What is the PID of the process with the most CPU time?
9. What is the original source of the data for top and every other tool we have explored in this chapter?
10. Is it possible to buffer data from more than one program in the same named pipe before reading any data from it?
11. Which of the tools discussed in this chapter provides network I/O information?
12. Which of the tools discussed in this chapter allows operations such as renicing to be performed simultaneously on multiple processes?
13. Using htop, on which column would you sort to determine which processes have accumulated the most total CPU time?
14. What is the difference between total and actual disk reads and writes as displayed by iotop?
15. Use the setup feature of htop to add the hostname and the time of day clock to the top of the right-hand header column.


16. What command would you use to obtain a time-domain graph of the internal temperatures of a hard drive?
17. Use SAR to view the network statistics for the current day.
18. View all of the recorded system activity for yesterday, if your VM was running at that time. If not, choose another day in the SAR data collection.
19. What type of CPU is installed in your VM? Make: _________ Model: _____________ Speed: __________ GHz


CHAPTER 14

Terminal Emulator Mania

Objectives

In this chapter you will learn

•	To use multiple different terminal emulators

•	To use advanced features of these terminal emulators to work more efficiently

•	To use advanced Bash shell tools like wildcards, sets, brace expansion, meta-characters, and more to easily locate and act upon single or multiple files

The function of the terminal emulator is to provide us with a window on the GUI desktop that allows us to access the Linux command line where we can have unfettered access to the full power of Linux. In this chapter we will explore several terminal emulators in some detail as a means to better understand how these terminal emulators can make our use of the CLI more productive.

About terminals

To refresh our memories, a terminal emulator is a software program that emulates a hardware terminal. Most terminal emulators are graphical programs that run on any Linux graphical desktop environment like Xfce, KDE, Cinnamon, LXDE, GNOME, and others. In Chapter 7 we explored the command-line interface (CLI) and the concept of the terminal emulator in some detail. We specifically looked at the xfce4-terminal to get us started on the command line, but we did not explore it in much depth. We will look at its features more closely along with several other terminal emulators.

Wikipedia, Terminal Emulator, https://en.wikipedia.org/wiki/Terminal_emulator


© David Both 2020 D. Both, Using and Administering Linux: Volume 1, https://doi.org/10.1007/978-1-4842-5049-5_14



PREPARATION 14-1
Not all distributions install all of the terminal emulators we will use during this chapter, so we will install them now. Do this as root. Enter the following command to install the terminal emulators we will be exploring:

# dnf -y install tilix lxterminal konsole5 rxvt terminator

You will notice that there were lots of dependencies installed in addition to the emulators themselves. All of these terminal emulators should now appear in the System Tools submenu of the system launcher on your desktop.
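One quick way to verify the installation, a sketch of my own rather than a step from the book, is to check that each emulator's executable is now on the PATH. The binary names used here are assumptions and may differ from the package names on your distribution:

```shell
# Report which of the requested terminal emulators are on the PATH.
check_emulators() {
    for emulator in "$@"; do
        if command -v "$emulator" > /dev/null 2>&1; then
            echo "$emulator: installed"
        else
            echo "$emulator: not found"
        fi
    done
}

check_emulators tilix lxterminal konsole rxvt terminator
```

Any emulator reported as "not found" either failed to install or installs its binary under a different name.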

My requirements

As a Linux SysAdmin with many systems to manage in multiple locations, my life is all about simplification and making access to and monitoring of those systems easy and flexible. I have used many different terminal emulators in the past, all the way from the venerable Xterm to Terminator and Konsole. With as many as 25 or 30 terminal sessions open simultaneously much of the time, having a few windows in which to manage those sessions prevents having large numbers of windows open on my desktop. As a person who generally keeps a messy physical desktop (they do say that is the sign of high intelligence, cough, cough) and lots of open windows on my Linux desktop, wrangling all of my terminal sessions into three or four windows is a great step forward in terms of decluttering.

Figure 14-1 shows the desktop of my own primary workstation as I write this chapter. I have three different emulators open. I understand that it is impossible to discern any details in Figure 14-1, but it does give you a good image of the flexibility provided by having multiple terminal emulators open on a single GUI desktop.


Figure 14-1.  My main workstation desktop with multiple terminal emulators open

There are many terminal emulators available for Linux. Their different approaches to this task were defined by the needs, likes, dislikes, and philosophies of the developers who created them. One web site has an article entitled “35 Best Linux Terminal Emulators for 2018,” which should give you an idea of how many options there are. Unfortunately there are too many for us to examine all of them here.

The emulators we will explore in this chapter have features that enable us to massively leverage the power of the command line to become more efficient and effective in performing our jobs. I have used all of these terminal emulators at one time or another, and they all provide powerful features to do that. Sometimes I use more than one terminal emulator at the same time because each may fit the way I work better for a specific task. So while my current favorite terminal emulator happens to be xfce4-terminal and I have multiple instances of it open, I may also have instances of other terminal emulators open, too. But let's look more closely at a few of these terminal emulators.

35 Best Linux Terminal Emulators for 2018, www.slant.co/topics/794/~best-linux-terminal-emulators


rxvt

There are some very minimalistic terminals out there. The rxvt terminal emulator is one of these. It has no features like tabs or multiple panes that can be opened in a single window. Its font support is primitive, and a specific font must be specified on the command line, or the very basic default font will be used.

EXPERIMENT 14-1
Open an rxvt instance on the desktop of your VM. The rxvt window has no menu or icon bars, and right-clicking in the window does nothing. But you can use it as a basic terminal emulator. Experiment with rxvt for a few minutes just to get a feel for a truly old-style but functional terminal emulator. The reason I included this terminal emulator in our exploration is to give you a baseline for comparing the advanced features of some of the other terminal emulators. Also, you may prefer this type of terminal emulator. There are people who do, and that is your choice and perfectly fine.

The rxvt terminal executable is 197,472 bytes in size, and it uses 226MB of virtual memory when running. This is the smallest memory footprint of any of the terminal emulators I looked at for this chapter. But it is also a minimalist project. It has no features of any kind other than the fact that it works as a terminal emulator. It does have some options that can be used as part of the command line used to launch it, but these, too, are very minimal.

xfce4-terminal

The xfce4-terminal emulator is my current favorite. It is a powerful emulator that uses tabs to allow multiple terminals in a single window. It is flexible and easy to use. This terminal emulator is simple compared to emulators like Tilix, Terminator, and Konsole, but it gets the job done. And, yes, xfce4-terminal is the name of the executable for this emulator.


One of my favorite features of the xfce4-terminal is its tabs. You can open many tabs in a single window, and each tab can be logged in as a different user, or as root, or into different hosts. Think of each tab as a separate terminal session. This provides a huge amount of flexibility to run multiple terminal sessions while maintaining a single window on the desktop. I especially like the tabs on the xfce4-terminal emulator because they display the name of the host to which they are connected regardless of how many other hosts are connected through to make that connection; for example, host1 → host2 → host3 → host4 properly shows host4 in the tab. Other emulators show host2 at best.

Like other components of the Xfce desktop, this terminal emulator uses very little in the way of system resources. You can also use the mouse to drag the tabs and change their order; a tab can also be dragged completely out of the window and onto the desktop, which places it in a window of its own where you can then add more tabs if you like. Let's try it now. Because we are using the Xfce desktop, you should have already been using the xfce4-terminal up to this point. You should, therefore, already be somewhat familiar with it.

EXPERIMENT 14-2
Perform this experiment as the student user. If you do not already have an available instance of the xfce4-terminal open on your desktop, open one now. Figure 14-2 shows an xfce4-terminal window with three tabs open.


Figure 14-2.  The xfce4-terminal emulator sports an easy-to-use interface that includes tabs for switching between emulator sessions. Each tab may be logged in as a different user, to a different host, or any combination

There should still be only a single tab open in your instance. Perform a simple task just to have some content in this first terminal session, such as the ll command. In addition to the standard menu bar, the xfce4-terminal emulator also has an icon bar which can be used to open another tab or another emulator window. We need to turn on the icon bar in order to see it. On the menu bar, select View ➤ Show Toolbar. Hover the mouse pointer over the leftmost icon in the icon bar. The tool tip indicates that this icon will launch another tab in the current window. Click the tab icon. The new tab is inserted in the rightmost position of the tab bar, which is created if there was only one terminal session open previously. Open a couple more tabs and su - to root in one of them.


The tab names can be changed, and the tabs can be rearranged by drag and drop or by selecting the options on the menu bar. Double-click one of the tabs to open a small dialog that allows you to specify a new static name for the tab. Type in the name “My Tab.” Drag “My Tab” to a new location in the tab bar. Now drag one tab completely away from the xfce4-terminal window, and drop it somewhere else on the desktop. This creates a new window that contains only that tab. The new window now acts just the same as the original, and you can open new tabs in it as well.

Many aspects of function and appearance can be easily configured to suit your needs. Opening the Terminal Preferences configuration dialog shown in Figure 14-3 gives access to five tabs that enable you to configure various aspects of the xfce4-terminal's look and feel. Open the terminal Edit ➤ Preferences dialog, and select the Appearance tab. Choose different fonts and font sizes to view the differences. The htop utility uses bold text for some types of data, so remove the check mark from the Allow bold text item to see how that looks.

Figure 14-3.  The xfce4-terminal Terminal Preferences dialog allows configuration of many aspects of its look and feel


I sometimes fuss with the options in the Colors tab to enable some colors to be more readable. The Colors tab also has some presets from which you can start your modifications. I usually start with green or white on black and modify some of the individual colors to improve readability. Select the Colors tab. Load a couple of the different presets to view the differences. Feel free to experiment with this tab for a short time. Select the tab with htop running in it. Press the F1 key to see the htop help. Press F1 again to close the htop help page. Close all of the open xfce4-terminal windows.

In my opinion, the xfce4-terminal emulator is the best overall terminal emulator I have used. It just works, and it has the features that work for me. So long as there is horizontal space available in the emulator window, the tabs are wide enough to show the entire host and directory name or certainly enough to figure out the rest. Other terminal emulators with tabs usually have fixed-size tabs that restrict the view of the available information in the tab.

The xfce4-terminal executable is just a little over 255KB in size. This emulator uses 576MB of virtual memory when running, which is the second least of the advanced emulators I tested.

LXTerminal

The LXTerminal emulator uses the least amount of RAM and has the smallest executable file of any of the terminal emulators I have used. It has few extraneous features, but it does have tabs to enable multiple sessions in a single emulator window. The LXTerminal window has no icon bar; it uses only a menu bar and pop-up menus and dialog boxes when you right-click the window.

EXPERIMENT 14-3
Open an instance of LXTerminal as the student user. Run a short command such as ll to show some content in the first session. No tabs are displayed yet. Right-click the existing session to display a pop-up menu, and select New tab to open a second tab. Two tabs should now be visible at the top of the terminal emulator window. Now open the File menu from the menu bar, and open a new tab. There should now be three open tabs as you can see in Figure 14-4.


Figure 14-4.  The LXTerminal window with three tabs open

Use the menu bar, and open Edit ➤ Preferences to display the minimalistic configuration options. You can change the terminal font and adjust the colors and cursor style on the Style tab. I sometimes adjust one or more of these colors to make certain colorized text a bit more readable. Choose a couple of the color palettes to see what is available, and then modify by using one as a starting point. Notice that no preference changes take effect until you click the OK button, which also closes the Preferences dialog. This is one thing I dislike. Save your current changes to see how that looks.

Open Preferences again and select the Display tab. I like having the tabs at the top, but you may prefer to have them at the bottom of the window. Select Bottom and save the change. The Display tab also allows changing the number of scrollback lines, which I usually do not change, and the default window size when a new LXTerminal window is opened. I currently


have this adjusted to 130 columns by 65 lines. I have a lot of screen real estate, so that is fine on my wide screen. Play around with the window size, and start a new session of LXTerminal to see how that works for you. Other options on this tab enable you to hide various tools like the scroll bar and the menu bar. Play around with this to see how you might work in an environment without those tools. I never hide any of them.

Switch to the Advanced tab. The only thing I ever do on this tab is disable the F10 menu shortcut key. The Shortcuts tab provides the ability to change that key to something else, but I never change the defaults there, either. Spend some time exploring LXTerminal on your own so you can get a better feel for how it works for you. When finished, close all instances of LXTerminal.

LXTerminal is a very lightweight terminal emulator, which is reflected in its small size and relatively few configuration options. The important thing with this terminal emulator is that it has all of the things we need as SysAdmins to do our jobs quickly and easily. These two facts make LXTerminal perfect for small systems such as smaller and older laptops with low amounts of RAM, yet it is also powerful enough to be just as much at home on big systems like my primary workstation.

The lxterminal executable is 98,592 bytes in size, and it consumes 457MB of virtual memory when running. Both of these numbers are the smallest of any of the advanced emulators that I have tested.

Tilix

Tilix helps me organize at least a bit by allowing me to keep all, or at least a large number, of my terminal sessions in one very flexible window. I can organize my terminal sessions in many different ways due to the extreme power and flexibility of Tilix. Figure 14-5 shows a typical (at least for me) Tilix window with one of the three active sessions that contains four terminals. Each terminal in this session (session 2 of 2) is connected to a different host using SSH. Note that the title bar in each terminal displays the user, hostname, and current directory for that terminal.


The Tilix instance in Figure 14-5 is running on my personal workstation. I have used SSH to log in to three different hosts in the student virtual network. The left half of the screen is a host that I use for testing, testvm1. The top right terminal is logged in to a VM server, studentvm2, which I have installed on my virtual test network. The terminal at the bottom right is logged into studentvm1. This can make it easy for me to monitor all three hosts in my virtual network. Some of the details may be difficult to see in Figure 14-5, but you can see how this ability to have multiple terminals open in a single emulator window allows easy comparison of multiple systems, or of multiple utilities on a single host, which can be very useful.

Figure 14-5.  This Tilix instance has two sessions active with three terminals open in session 2

Let's ensure that we keep our terminology straight because it can be confusing. In Tilix, a "session" is a page in a Tilix window that contains one or more terminals. Opening a new session opens a new page with a single terminal emulation session. Tilix sessions can be created or subdivided horizontally and vertically, and general configuration can be performed using the tools in the Tilix title bar. Placing the window and session controls in the window title bar saves the space usually used for separate menu and icon bars.

3. The referenced server is created in Volume 3.

EXPERIMENT 14-4

As the student user, start by opening an instance of Tilix on your VM desktop. Like the other terminal emulators that provide for multiple terminal sessions in a single window, only one session is opened when the emulator is launched. Figure 14-6 shows the top portion of the Tilix window with only one emulator session open. You can open another terminal in a new session, as defined earlier, or in this session. For this instance, let's open a new terminal in this session, vertically next to the existing terminal.

Figure 14-6.  The title bar of the Tilix window contains a nonstandard set of icons that are used to help manage the terminal sessions

On the left side of the title bar are the icons that let us open new terminals in various ways. The two icons in Figure 14-7 open a new terminal in the current session.

Figure 14-7.  Use these icons to open new terminal sessions beside or below the current ones


Click the left icon of this pair to open a terminal to the right of the existing one. The session window will be split down the middle and will now contain two terminals, one on the left and one on the right. The result looks like that in Figure 14-8.

Figure 14-8.  The Tilix session after creation of a second terminal

These two side-by-side terminals allow you to do things like run top in one terminal to watch the effects on system resources of commands executed in the other. Now select the terminal on the left, and click the button on the right of Figure 14-9. This opens a new terminal such that terminal 1 is on the top and terminal 3 is on the bottom, with terminal 2 still taking the entire right side of the session.


Figure 14-9.  The Tilix window now has three terminals in this one session

You can move the splitters between the terminals to adjust their relative size. Adjust both the horizontal and vertical splitters to see how they work. So far we have worked with only a single session. To create a second session in this Tilix window, click the plus sign (+) icon shown in Figure 14-10.

Figure 14-10.  Use the + icon to open a new session

The new session is created and is now the focus. The first session with its three terminals is now hidden. The count in the icon now shows "2/2" because we are in the second session. Click anywhere in the left part of this icon to show the sidebar. Displayed on the left of the Tilix window, the sidebar displays smaller images of the open sessions. Click the desired session to switch to it.


The icon on the far left of the title bar looks like a terminal screen, and we would normally expect it to be the standard System menu. For Tilix windows, that is not the case: Tilix places its own menu in that icon. One of the choices on that menu is Preferences. Open the Preferences dialog. I will let you find your own way through it, but I do suggest that you try switching from the sidebar to tabs for moving between sessions. Try that for a while and see which you like better.

There is one default profile for configuring the look and feel of Tilix, and other profiles can be added as needed. Each profile sets alternate values for the functions and appearance of Tilix. Existing profiles can be cloned to provide a starting place for new ones. To select from a list of profiles for an already open window, click the name of the terminal window, select Profiles, and then the profile you want to change to. You can also select one profile to be used when a new Tilix session or terminal is launched.

For me, using a terminal emulator on a GUI desktop adds the power of a GUI to that of the command line. When using a terminal emulator like Tilix, Terminator, or Konsole that allows multiple pages and split screens, my ability to work efficiently increases greatly. Although there are other powerful terminal emulators out there that allow multiple terminal sessions in a single window, I have found that Tilix meets my needs better than any I have tried so far. Tilix offers most of the standard features that xfce4-terminal, LXTerminal, Konsole, Terminator, and other terminal emulators do, while providing some that they do not. It implements those features in a classy interface that is easy to learn, configure, and navigate, and it maximizes the use of onscreen real estate. I find that Tilix fits my desktop working style very nicely, and that is what it is all about, isn't it?
The Tilix executable is 2.9MB in size, and it consumes 675MB of virtual memory when running. There are other options for managing multiple terminal emulator sessions in a single window. We have already explored one of those, the GNU screen utility, and tmux (terminal multiplexer) is another. Both of these tools can be run in any terminal session – a single window, virtual console, or remote connection – to provide creation of and access to multiple terminal emulator sessions in that one window. These two command-line tools are completely navigable by simple – or at least moderately simple – keystrokes, and they do not require a GUI of any kind to run.


The terminal emulators we are discussing in this chapter, as well as many we are not, are GUI tools that use multiple tabs or the ability to split an emulator window into multiple panes, each with a terminal emulator session. Some of these GUI terminal emulators, like Tilix, can divide the screen into multiple panes and use tabs, too. One of the advantages of having multiple panes is that it is easy to place sessions we want to compare or to observe together in a single window. It is easy, however, to split the screen into so many panes that there is not enough space in them to really see what is happening. So we can use the fancy multipaned, tabbed terminal emulators and then run screen or tmux in one or more of those emulator sessions. The only disadvantage I find to any of this is that I sometimes lose track of the sessions that are open and forget that I already have one open for a task I need to do. The combinations can get to be very complex.

All of these interesting features make it possible to manage a large number of terminal sessions in a few windows, which keeps my desktop less cluttered. Finding a particular session might be a bit problematic, though. It can also be easy to type a command into the wrong terminal session, which could create chaos.

Konsole

Konsole is the default terminal emulator for the KDE desktop environment. It can be installed and used with any desktop, but it does install a large number of KDE libraries and packages that are not needed by other terminal emulators.

EXPERIMENT 14-5

Open a Konsole terminal emulator instance. Let's make one configuration change before we go any further. Open Settings ➤ Configure Konsole and choose the Tab Bar tab. Change Tab Bar Visibility to Always Show Tab Bar, and place a check mark in Show 'New Tab' and 'Close Tab' buttons. Click the OK button to make these changes take effect. Konsole does not need to be restarted. Now you can see that Konsole provides icons to open and close the tabs on either side of the tab bar, and it allows us to simply double-click in the empty space in the tab bar to open a new tab. New tabs can also be opened in the File menu. Open a second tab using one of the methods just mentioned. Your Konsole window should now look like Figure 14-11.


Figure 14-11.  The Konsole terminal emulator with two tabs open. A double-click on the empty space in the tab bar opens a new tab

Konsole has a very flexible Profiles capability which can be accessed through Settings ➤ Manage Profiles..., which opens the Configure dialog. Select the Profiles tab, and click New Profile... to create a new profile using this tab, and configure it in different ways to explore the options here. Be sure to place a check mark in the Show column of the Profiles list to enable the new profile(s). Click OK to save the changes. Now open Settings ➤ Switch Profile, and click the name of your new profile. There are many other aspects of Konsole that you can explore. Take some additional time, and let your curiosity take you to some of those interesting places.

I like Konsole very much because it provides tabs for multiple terminal sessions while maintaining a clean and simple user interface. I do have a concern about the KDE


Plasma workspace because it seems to be expanding and becoming bloated and slow in general. I have experienced performance issues and crashes with the KDE Plasma desktop, but I have not had any performance problems with Konsole. An instance of Konsole uses 859MB of virtual memory.

Terminator

Terminator is another powerful and feature-rich terminal emulator. Although it is based upon the GNOME terminal, its objective is to provide a tool for SysAdmins that can be used to manage many simultaneous terminals in tabs and grids within each tab.

EXPERIMENT 14-6

As the student user, open an instance of Terminator. Now right-click the window to open the menu as seen in Figure 14-12. Choose Split Vertically to split the window in half, and open a new terminal in the right half.

Figure 14-12.  All interaction with the Terminator features is through the pop-up menu


You may want to resize the Terminator window to make it larger as you proceed through the rest of this experiment. Start the top program in the right terminal session. Open the man page for Terminator in the left terminal. Split the right-side terminal horizontally. The terminal with top running should be the upper one, and the new terminal should be on the bottom. Run a simple command like ll in the bottom right terminal. Split the bottom right terminal vertically. It may help to adjust the relative sizes of the terminal sessions, making some larger in order to see their contents better. The terminal sessions are delineated by drag bars. Move the mouse pointer over the vertical drag bar between the left and right sides. Then drag the bar to the left to make more room for the terminal sessions on the right.

Note  Terminator uses the double-arrow icons differently from most other applications. When the pointer encounters a vertical drag bar, the up/down double-arrow icon is displayed. Most other terminal emulators would use the right/left arrow to indicate the direction in which movement is possible.

Your Terminator instance should look similar to Figure 14-13. Now open a second tab, and split that tab into at least three terminal sessions.


Figure 14-13.  A Terminator instance with two tabs open and four sessions in the visible tab

Terminal sessions can be rearranged in the window using drag and drop. Select the title bar for one of the windows in the first tab. Drag that terminal session to another location in the window. Move the terminals around in the window to get a feel for how this feature works. Terminal sessions cannot be dragged to the desktop to open another Terminator window; they can only be dragged to other locations within the window in which they already exist. Right-click to open the Terminator menu, and choose Preferences. Here is where you can make configuration changes and create new profiles. Try creating a new profile using a green-on-black color scheme and a slightly larger font. Create a third profile using a color scheme of your own choosing. Switch between profiles. Each open terminal must be switched individually to the new scheme. Spend some time exploring Terminator on your own, especially the various preferences.


I find Terminator very useful when I need to have many terminal sessions open and to have several of them visible at the same time. I do find that I sometimes end up with many small windows, so I need to rearrange them to enable me to view the more important terminals. An instance of Terminator typically consumes 753MB of virtual memory by itself, and programs running in it will consume more.

Chapter summary

As with almost every other aspect of Linux, there are many choices available to users and SysAdmins with respect to terminal emulators. I have tried many, but the ones I discussed in this chapter are those I have used the most and which provide me with the means to work most efficiently. If you already have a favorite terminal emulator and I have not included it here, I apologize – there are just too many to include all of them. I keep using each of these repeatedly because I like them, even if for different features. I also keep searching for other terminal emulators that I have not previously encountered because it is always good to learn about new things, and one of them might be the terminal emulator that I could use to the exclusion of all others. You should spend some time using each of these terminal emulators outside the bounds of the experiments. Use different terminal emulators for the experiments in the rest of this course. That way you will have an opportunity to better understand how they can help leverage your use of the command line. Do not think that you must use any of these terminal emulators if they do not meet your needs. By all means try others you find and use the ones you like best.

Exercises

Perform the following exercises to complete this chapter:

1. Why are there so many choices for terminal emulators?

2. Add a profile to Tilix that configures it so that it meets your needs and wants better than the default. You might want to change colors, fonts, and the default terminal size.


3. Use DNF and the Internet to find new terminal emulators that were not explored in this chapter. Install at least two of them, and explore their capabilities.

4. Of the terminal emulator features we have explored in this chapter, which ones are most important to you at this time?

5. Choose an appropriate terminal emulator, and open terminal sessions in it so that you can start and view the following programs all at the same time: top, iotop, and sar to view network statistics in real time.

6. Have you developed a preference for a particular terminal emulator yet? If so, which one? Why?


CHAPTER 15

Advanced Shell Topics

Objectives

In this chapter you will learn

• The advanced usage of the Bash shell

• The use of shell options

• The difference between internal and external commands

• How to plan for when commands fail

• How to determine whether an internal command or an external command will be used

• How to specify that the external command be used

• The use of globbing to match multiple file names to be acted upon by commands

• How the PATH variable affects which commands can be used

• Where to place shell scripts for use by one user or all users

• The use of compound commands

• The use of basic flow control in simple compound commands

• To use grep advanced pattern matching to extract lines from a data stream

• How to use find to locate files based on simple or complex criteria

In Chapter 7 we looked briefly at the use of the Bash shell and defined some terms to ensure that we have the same understanding of what a terminal emulator is vs. a shell vs. the command line and many more potentially confusing terms. In Chapter 9 we looked at some basic Linux commands and the use of some simple pipelines and redirection.

© David Both 2020
D. Both, Using and Administering Linux: Volume 1, https://doi.org/10.1007/978-1-4842-5049-5_15


In this chapter we look more closely at the Bash shell. We will explore in some detail the Bash internal commands, the environment, and the variables contained there. We explore the effect of the environment on the execution of shell commands. We will also make a start on command-line programming by exploring the capabilities of compound commands and then moving on to some advanced tools, grep and find.

The Bash shell

We have already been using the Bash shell, and it should now seem at least a little familiar in the sense that we know a bit about how it works. A shell – any shell – is a command-line interpreter. The function of a shell is to take commands entered on the command line, expand any file globs – that is, wildcard characters * and ? and sets – into complete file or directory names, convert the result into tokens for use by the kernel, and then pass the resulting command to the kernel for execution. The shell then sends any resulting output from execution of the command to STDOUT.

Bash is both a command interpreter and a programming language. It can be used to create large and complex programs that use all of the common programming language structures such as flow control and procedures. The Bash shell is like any other command-line program. It can be called using command-line options and arguments. It also has an extensive man page that describes those and other aspects, including its internal commands.
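Glob expansion is easy to see in action. The sketch below works in a throwaway directory so it does not touch your files; the file names are arbitrary examples.

```shell
# The shell, not the command, expands globs: echo receives the matching
# file names as arguments and never sees the *, ?, or [] characters.
cd "$(mktemp -d)"                 # work in a throwaway directory
touch file1.txt file2.txt notes.md
echo *.txt                        # expands to: file1.txt file2.txt
echo file?.txt                    # ? matches exactly one character
echo file[12].txt                 # a set: matches file1.txt and file2.txt
```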

Shell options

Bash shell options can be set when the Bash executable is launched, but we as users do not usually have access to the command that launches the shell. So the creators of Bash have provided us with the shopt (shell options) command that lets us view and alter many of the options that define the details of the shell's behavior while the shell is running. The shopt command allows the user access to a superset of the options available with the Bash set command. I have not found it necessary to change any of the options accessible to the shopt command, but I do use the set command to set command-line editing to vi mode.
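As a sketch of both commands, the lines below switch the current shell to vi editing mode (the one change mentioned above) and toggle one harmless shopt option. Both changes affect only the running shell and vanish when it exits.

```shell
set -o vi               # edit the command line with vi keystrokes
set -o | grep '^vi'     # confirm the option is now on
shopt -s checkwinsize   # -s sets (enables) a shopt option
shopt checkwinsize      # show the option's current state
shopt -u checkwinsize   # -u unsets (disables) it again
set -o emacs            # return to the default editing mode
```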


The shopt command can be used without options to list the current state of the Bash options that have been explicitly set to enabled or disabled. It does not list all of the options available. The Bash man page has details of both set and shopt, including all of the options they can be used to set.

EXPERIMENT 15-1

Perform this experiment as the student user. We will just take a quick look at the shell options but won't change any of them. List the shell options by using the shopt command without any options or arguments:

[student@studentvm1 ~]$ shopt
autocd          off
cdable_vars     off
cdspell         off
checkhash       off
checkjobs       off
checkwinsize    on
cmdhist         on
compat31        off
<snip>
nullglob        off
progcomp        on
promptvars      on
restricted_shell        off
shift_verbose   off
sourcepath      on
xpg_echo        off

I have pruned the preceding list, so you should see more output than shown here. As I mentioned, I have never had the need to change any of these shell options.


Shell variables

We will explore environment and shell variables in more detail in Chapter 17, but let's take a quick look now. A variable is a named entity that represents a location in memory which contains a value. The value of a variable is not fixed and can be changed as a result of various numeric or string operations. Bash shell variables are not typed; that is, they can be manipulated as a number or a string.

EXPERIMENT 15-2

Perform this experiment as the student user. First let's print the value of the $HOSTNAME variable in the shell because it already exists. Any time we wish to access the value of a variable in a script or from a CLI command, we use the $ sign to refer to it. The $ sign indicates to the Bash shell that the name that follows (with no intervening spaces) is the name of a variable:

[student@studentvm1 ~]$ echo $HOSTNAME
studentvm1

Now let's look at another variable – one that does not already exist – that we will name MYVAR:

[student@studentvm1 ~]$ echo $MYVAR

[student@studentvm1 ~]$

Because this variable does not yet exist, it is null, so the shell prints an empty line. Let's assign a value to this variable and then print the variable again:

[student@studentvm1 ~]$ MYVAR="Hello World!"
[student@studentvm1 ~]$ echo $MYVAR
Hello World!
[student@studentvm1 ~]$

So you can see that we use the variable name without the preceding $ sign to set a value into a variable. In this case the Bash shell can infer from the context that the name preceding the equal sign is a variable name.


Tip  The Bash shell syntax is very strict; in some places it requires spaces, and in others it prohibits them. In the case of a variable assignment, there must be no spaces on either side of the equal sign. I sometimes use "PATH" or "path" as a reference to the path as a general concept, but when I use $PATH, it will always refer to the variable or its value.
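Here is that whitespace rule in action. The second assignment below is deliberately wrong and produces an error, because Bash parses MYVAR as a command name rather than as a variable.

```shell
MYVAR="Hello World!"    # correct: no spaces around the equal sign
echo "$MYVAR"           # prints: Hello World!
MYVAR = "oops" 2>/dev/null \
  || echo "assignment with spaces fails: Bash looked for a command named MYVAR"
```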

Commands

The purpose of the shell is to make human interaction with the computer easy and efficient. Shells take the commands we type, modify them so the kernel will understand them, and pass them to the operating system which then executes them. Shells provide the tools to enable this interaction.

Commands fall into two categories. There are internal commands that are an integral part of the shell program and external commands that have a separate existence, with their own executable files, such as the GNU and Linux core utilities. Other external commands are tools provided separately or by various Linux components such as logical volume management (LVM). This distinction is important because shell internal commands are executed in preference to an external command with the same name. For example, there is a Bash internal echo command and an external echo command. Unless you specify the path to the external command as part of the command line, the Bash internal echo command will be used. This may be a problem if the commands work a bit differently.

Let's get very specific about how the Bash shell works when a command is entered:

1. Type in the command and press Enter.

2. Bash parses the command to see if there is a path prepended to the command name. If there is, skip to step 4.

3. Bash checks to see if the command is internal. If it is, the Bash shell runs the command immediately.

4. If a path is used as part of the command, Bash forks a new subprocess in which to execute the command and then runs the command. This forking takes time as well as system resources such as CPU, I/O, and RAM.


5. If no path to the command is specified, and this is not an internal command, Bash searches the list of aliases and shell functions – system- and user-created procedures. If one is found, it forks a new shell subprocess and executes the function or alias. Again, this all takes time, although very small amounts.

6. If no alias or function is located, Bash then searches the list of directories specified in the $PATH shell variable to locate the command. When the command is located, Bash forks a new subshell to execute the command. More time is consumed.

7. If a command is run in a subshell, the subshell terminates and execution returns to the parent shell.
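You can watch this search order at work with the type command. The sketch below defines a throwaway function named echo purely for illustration; because functions are found before builtins and external commands, it temporarily shadows both.

```shell
type -a echo        # lists the builtin first, then /usr/bin/echo
echo() {            # define a function with the same name
    builtin echo "the function runs instead of the builtin"
}
type echo           # now reports that echo is a function
echo hello          # the function wins, whatever the arguments
unset -f echo       # remove the function; the builtin wins again
type echo           # back to: echo is a shell builtin
```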

The PATH

The $PATH is a very important environment variable for the shell. It defines a colon-separated list of directories in which the system and the shell look for executable files. The shell looks in each directory listed in $PATH for executable files when a non-internal command is entered. The $PATH environment variable can be altered for the current shell or for all shell instances for a specific user or even for all users. This is usually neither necessary nor desirable because the default $PATH takes into consideration the need of individual users to maintain executable files like shell scripts in their own home directory tree, as we will see.
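If you ever do need to extend $PATH, the usual pattern appends a directory for the current session only; ~/mytools below is a hypothetical directory, and the change disappears when the shell exits. (To make such a change permanent, it would typically go in a shell startup file such as ~/.bash_profile.)

```shell
echo "$PATH"                       # the current search path
export PATH="$PATH:$HOME/mytools"  # append a (hypothetical) directory
echo "$PATH" | tr ':' '\n' | tail -n 1   # the last entry is now ~/mytools
```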

EXPERIMENT 15-3

Perform this experiment as the student user. Let's start by discovering the default value of $PATH:

[student@studentvm1 ~]$ echo $PATH
/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/student/.local/bin:/home/student/bin

Consider the elements of this PATH. The first is /usr/local/bin, which is a specifically defined location for storing locally created executable files, such as shell scripts, for SysAdmins or


use by all users. The /usr/local/etc directory is used for storing configuration files for the executables in /usr/local/bin. The second element is /usr/bin. This is for most user-executable binary files and is intended for use by all users. The third is /usr/sbin, which is for standard but nonessential system binaries for use by the SysAdmin. The last two directory specifiers are in the user's directory tree. So if a user has some private executables, again such as personal shell scripts, those would usually be stored in ~/bin, where the shell will search for them because ~/bin is in the user's $PATH. The $PATH saves a good bit of typing. Remember how we were required to start the cpuHog program?

./cpuHog

The reason we had to precede the command with ./ (dot-slash) is that the cpuHog executable shell script is in the student user's home directory, /home/student/, which is not part of $PATH. Try it with the student user's home directory as the PWD and without specifying the home directory in some manner:

[student@studentvm1 ~]$ cpuHog
bash: /home/student/bin/cpuHog: No such file or directory

We receive an error, so we need to specify the path using, in this case, the relative path of the current directory. The dot (.) notation is a shortcut for the current directory. We could have issued this command in the following ways:

• ./cpuHog

• ~/cpuHog

• /home/student/cpuHog

Terminate any currently running instances of the cpuHog. Ensure that the PWD for the student user is the home directory (~). Then let's try the two methods we have not yet used. Method #1 assumes that the cpuHog script is in the PWD. Method #2 makes no assumptions about the current PWD and uses the ~ (tilde) shortcut for the user's home directory. Switch to a different directory, and start the cpuHog using method #2:

[student@studentvm1 ~]$ cd /tmp ; ~/cpuHog


Use Ctrl-C to terminate this instance of the cpuHog. Remain in the /tmp/ directory and use method #3:

[student@studentvm1 tmp]$ /home/student/cpuHog

This method also works, but it requires much more typing. All of these methods require more typing than simply placing the cpuHog file in the user's private executable file directory, ~/bin. Don't forget that the lazy SysAdmin does everything possible to type as few keystrokes as necessary. Change the PWD to the home directory and look for ~/bin. It is not there, so we have to create it. We can do that, move the cpuHog into it, and launch the program, all in a single compound command:

[student@studentvm1 ~]$ cd ; mkdir ~/bin ; mv cpuHog ./bin ; cpuHog

The function of the $PATH is to provide defined locations in which executable files can be stored so that it is not necessary to type out the path to them. We will talk more about compound commands later in this chapter.
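Compound commands pair naturally with basic flow control, a preview of what is coming later: && runs the next command only if the previous one succeeded, and || runs only on failure. The echo messages below are just illustrative.

```shell
# mkdir -p succeeds even if ~/bin already exists, so this chain is
# safe to rerun; the || branch fires only if something went wrong.
mkdir -p ~/bin \
  && echo "~/bin exists and is in the default \$PATH" \
  || echo "could not create ~/bin"
```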

Internal commands

Linux shells have a large number of internal, built-in commands of their own. The Bash shell is no exception. The man and info pages provide a list of these commands, but it can be a bit difficult to dig out which ones are the internal commands from all the other information. These internal commands are part of the shell itself and do not have an existence outside the Bash shell. This is why they are defined as "internal."

EXPERIMENT 15-4

Perform this experiment as the student user. The help command is the easiest way to list the internal Bash commands:

[student@studentvm1 ~]$ help
GNU bash, version 4.4.23(1)-release (x86_64-redhat-linux-gnu)
These shell commands are defined internally.  Type `help' to see this list.
Type `help name' to find out more about the function `name'.


Use `info bash' to find out more about the shell in general.
Use `man -k' or `info' to find out more about commands not in this list.

A star (*) next to a name means that the command is disabled.

 job_spec [&]                                  history [-c] [-d offset] [n] or history -anr>
 (( expression ))                              if COMMANDS; then COMMANDS; [ elif COMMANDS;>
 . filename [arguments]                        jobs [-lnprs] [jobspec ...] or jobs -x comma>
 :                                             kill [-s sigspec | -n signum | -sigspec] pid>
 [ arg... ]                                    let arg [arg ...]
 [[ expression ]]                              local [option] name[=value] ...
 alias [-p] [name[=value] ... ]                logout [n]
 bg [job_spec ...]                             mapfile [-d delim] [-n count] [-O origin] [->
 bind [-lpsvPSVX] [-m keymap] [-f filename] [->popd [-n] [+N | -N]
 break [n]                                     printf [-v var] format [arguments]
 builtin [shell-builtin [arg ...]]             pushd [-n] [+N | -N | dir]
 caller [expr]                                 pwd [-LP]
 case WORD in [PATTERN [| PATTERN]...) COMMAND>read [-ers] [-a array] [-d delim] [-i text] >
 cd [-L|[-P [-e]] [-@]] [dir]                  readarray [-n count] [-O origin] [-s count] >
 command [-pVv] command [arg ...]              readonly [-aAf] [name[=value] ...] or readon>
 compgen [-abcdefgjksuv] [-o option] [-A actio>return [n]
 complete [-abcdefgjksuv] [-pr] [-DE] [-o opti>select NAME [in WORDS ... ;] do COMMANDS; do>
 compopt [-o|+o option] [-DE] [name ...]       set [-abefhkmnptuvxBCHP] [-o option-name] [->
 continue [n]                                  shift [n]
 coproc [NAME] command [redirections]          shopt [-pqsu] [-o] [optname ...]
 declare [-aAfFgilnrtux] [-p] [name[=value] ..>source filename [arguments]
 dirs [-clpv] [+N] [-N]                        suspend [-f]
 disown [-h] [-ar] [jobspec ... | pid ...]     test [expr]


 echo [-neE] [arg ...]                         time [-p] pipeline
 enable [-a] [-dnps] [-f filename] [name ...]  times
 eval [arg ...]                                trap [-lp] [[arg] signal_spec ...]
 exec [-cl] [-a name] [command [arguments ...]>true
 exit [n]                                      type [-afptP] name [name ...]
 export [-fn] [name[=value] ...] or export -p  typeset [-aAfFgilnrtux] [-p] name[=value] .>
 false                                         ulimit [-SHabcdefiklmnpqrstuvxPT] [limit]
 fc [-e ename] [-lnr] [first] [last] or fc -s >umask [-p] [-S] [mode]
 fg [job_spec]                                 unalias [-a] name [name ...]
 for NAME [in WORDS ... ] ; do COMMANDS; done  unset [-f] [-v] [-n] [name ...]
 for (( exp1; exp2; exp3 )); do COMMANDS; don> until COMMANDS; do COMMANDS; done
 function name { COMMANDS ; } or name () { COM>variables - Names and meanings of some shell>
 getopts optstring name [arg]                  wait [-n] [id ...]
 hash [-lr] [-p pathname] [-dt] [name ...]     while COMMANDS; do COMMANDS; done
 help [-dms] [pattern ...]                     { COMMANDS ; }
[student@studentvm1 ~]$

Note  The greater-than character (>) at the end of some lines in each column of the help output indicates that the line was truncated for lack of space.

For details on each command, use the man page for Bash, or just type help with the name of the internal command. For example:

[student@studentvm1 ~]$ help echo
echo: echo [-neE] [arg ...]
    Write arguments to the standard output.

    Display the ARGs, separated by a single space character and followed by a
    newline, on the standard output.
<snip>


The man pages provide information for external commands only. The information for the internal commands is located only in the man and info pages for Bash itself:

[student@studentvm1 ~]$ man bash

To find the shell internal commands, use the following search. Yes, in all caps:

/^SHELL BUILTIN

The forward slash (/) starts the search. The caret (^) is an anchor character which indicates that the search should only find this string if it starts at the beginning of a line. The string appears in many places, but those all refer to the single location where it does start a line, at the beginning of the section, saying: "see SHELL BUILTIN COMMANDS below."

Each internal command is listed in the SHELL BUILTIN COMMANDS section along with its syntax and possible options and arguments. Many of the Bash internal commands, such as for, continue, break, declare, getopts, and others, are for use in scripts or command-line programs rather than as stand-alone commands on the command line. We will look at some of these later in this chapter. Scroll through the SHELL BUILTIN COMMANDS section of the Bash man page. Let's take three of these commands and use the type utility to identify them:

[student@studentvm1 ~]$ type echo getopts egrep
echo is a shell builtin
getopts is a shell builtin
egrep is aliased to `egrep --color=auto'

The type command enables us to easily identify those commands that are shell internals. Like many Linux commands, it can take a list of arguments.

External commands External commands are those that exist as executable files and which are not part of the shell. The executable files are stored in locations like /bin, /usr/bin, /sbin, and so on.


EXPERIMENT 15-5 First, make /bin the PWD and do a long list of the files there: [student@studentvm1 bin]$ ll | less

Scroll through the list and locate some familiar commands. You will also find both echo and getopts among these external commands. Why did the type command not show us this? It can if we use the -a option, which locates commands in any form, even aliases:

[student@studentvm1 bin]$ type -a echo getopts egrep
echo is a shell builtin
echo is /usr/bin/echo
getopts is a shell builtin
getopts is /usr/bin/getopts
egrep is aliased to `egrep --color=auto'
egrep is /usr/bin/egrep
[student@studentvm1 bin]$

The type command searches for executables in the same sequence as the shell would search if it were going to execute the command. Without the -a option, type stops at the first instance, thus showing the executable that would run if the command were to be executed. The -a option tells it to display all instances. What about our cpuHog shell script? What does type tell us about that? Try it and find out.

Forcing the use of external commands As we have seen, it is possible for both internal and external versions of some commands to be present at the same time. When this occurs, one command may work a bit differently from the other, despite having the same name, and we need to be aware of that possibility in order to use the command that provides the desired result. If it becomes necessary to ensure that the external command runs and that the internal command with the same name does not, simply add the path to the command name, as in /usr/bin/echo. This is where an understanding of how the Bash shell searches for and executes commands is helpful.


Compound commands We have already used some very simple compound commands. The simplest form of compound command is just stringing several commands together in a sequence on the command line; such commands are separated by a semicolon, which defines the end of a command. You can build up compound commands in the same way as you built complex pipelines of commands. To create a simple series of commands on a single line, simply separate each command using a semicolon, like this:

command1 ; command2 ; command3 ; command4 ; ... etc. ;

No final semicolon is required because pressing the Enter key implies the end of the final command, but adding that last semicolon for consistency is fine. This list of several commands might be something like what we did at the end of Experiment 15-1, in which we created a new directory, moved the cpuHog file into that directory, and then executed the cpuHog. In such a case, the later commands depend upon the correct result of the preceding commands:

cd ; mkdir ~/bin ; mv cpuHog ./bin ; cpuHog

Those commands will all run without a problem so long as no errors occur. But what happens when an error occurs? We can anticipate and allow for errors using the && and || built-in Bash control operators. These two control operators provide us with some flow control and enable us to alter the sequence of code execution. The semicolon is also considered to be a Bash control operator, as is the newline character.

The && operator simply says that if command1 is successful, then run command2. If command1 fails for any reason, then command2 is skipped. That syntax looks like this:

command1 && command2

This works because every command sends a return code (RC) to the shell that indicates whether it completed successfully or whether there was some type of failure during execution. By convention, a return code of zero (0) indicates success, while any positive number indicates some type of failure.
Some of the tools we use as SysAdmins return only a one (1) to indicate a failure, but many can return other codes as well to further define the type of failure that occurred. The Bash shell has a variable, $?, which can be checked very easily by a script, by the next command in a list of commands, or even by us SysAdmins.
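The return code mechanism can be sketched like this; the nonexistent directory name is made up for the demonstration:

```shell
# Every command leaves its return code in $?; read it before running anything else.
true  ; echo "true  returned $?"    # 0 means success by convention
false ; echo "false returned $?"    # 1 is the generic failure code
ls /no/such/directory 2>/dev/null
rc=$?                               # save it; the next command would overwrite $?
echo "ls returned $rc"              # a nonzero code describing the failure
```

Saving $? into a variable immediately, as in the third example, is the safe habit: even an echo used for debugging replaces the previous return code.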


EXPERIMENT 15-6 First let's look at return codes. We can run a simple command and then immediately check the return code. The return code will always be for the last command that was run before we look at it:

[student@studentvm1 ~]$ ll ; echo "RC = $?"
total 284
-rw-rw-r--  1 student student   130 Sep 15 16:21 ascii-program.sh
drwxrwxr-x  2 student student  4096 Nov 10 11:09 bin
drwxr-xr-x. 2 student student  4096 Aug 18 17:10 Desktop
-rw-rw-r--. 1 student student  1836 Sep  6 09:08 diskusage.txt
-rw-rw-r--. 1 student student 44297 Sep  6 10:52 dmesg1.txt
<snip>
drwxrwxr-x. 2 student student  4096 Sep  6 14:48 testdir7
drwxr-xr-x. 2 student student  4096 Aug 18 10:21 Videos
RC = 0
[student@studentvm1 ~]$

The return code (RC) is zero (0), which means the command completed successfully. Now try the same command on a directory for which we do not have permissions:

[student@studentvm1 ~]$ ll /root ; echo "RC = $?"
ls: cannot open directory '/root': Permission denied
RC = 2
[student@studentvm1 ~]$

Where can you find the meaning of this return code? Let's try the && control operator as it might be used in a command-line program. We start with something simple. Our objective is to create a new directory and create a new file in it. We only want to do this if the directory can be created successfully. We can use ~/testdir, which was created in a previous chapter, for this experiment. The following command is intended to create a new directory in ~/testdir, which should currently be empty:

[student@studentvm1 ~]$ mkdir ~/testdir/testdir8 && touch ~/testdir/testdir8/testfile1
[student@studentvm1 ~]$ ll ~/testdir/testdir8/


total 0
-rw-rw-r-- 1 student student 0 Nov 12 14:13 testfile1
[student@studentvm1 ~]$

Everything worked as it should because the testdir directory is accessible and writable. Change the permissions on testdir so it is no longer accessible to the student user. We will explore file ownership and permissions in Chapter 18 of this volume:

[student@studentvm1 ~]$ chmod 076 testdir ; ll | grep testdir
d---rwxrw-. 3 student student 4096 Nov 12 14:13 testdir
drwxrwxr-x. 3 student student 4096 Sep  6 14:48 testdir1
drwxrwxr-x. 2 student student 4096 Sep  6 14:48 testdir6
drwxrwxr-x. 2 student student 4096 Sep  6 14:48 testdir7
[student@studentvm1 ~]$

Using the grep command after the long list (ll) shows us the listing for all directories with testdir in their names. You can see that the user student no longer has any access to the testdir directory.1 Now let's run almost the same commands as before but with a different directory name to create in testdir:

[student@studentvm1 ~]$ mkdir ~/testdir/testdir9 && touch ~/testdir/testdir9/testfile1
mkdir: cannot create directory '/home/student/testdir/testdir9': Permission denied
[student@studentvm1 ~]$

Using the && control operator prevents the touch command from running because there was an error in creating testdir9. This type of command-line program flow control can prevent errors from compounding and making a real mess of things. But let's get a little more complicated. The || control operator allows us to add another program statement that executes when the initial program statement returns a code greater than zero. The basic syntax looks like this:

command1 || command2

Using the && control operator prevents the touch command from running because there was an error in creating testdir9. This type of command-line program flow control can prevent errors from compounding and making a real mess of things. But let’s get a little more complicated. The || control operator allows us to add another program statement that executes when the initial program statement returns a code larger than zero. The || control operator allows us to add another program statement that executes when the initial program statement returns a code greater than zero. The basic syntax looks like this: command1 || command2

1 We will explore file and directory permissions in detail in Chapter 17.


This syntax reads, "If command1 fails, execute command2." That implies that if command1 succeeds, command2 is skipped. Let's try this with our attempt to create a new directory:

[student@testvm1 ~]$ mkdir ~/testdir/testdir9 || echo "testdir9 was not created."
mkdir: cannot create directory '/home/student/testdir/testdir9': Permission denied
testdir9 was not created.
[student@testvm1 ~]$

This is exactly what we expected. Because the new directory could not be created, the first command failed, which resulted in execution of the second command. Combining these two operators gives us the best of both:

[student@studentvm1 ~]$ mkdir ~/testdir/testdir9 && touch ~/testdir/testdir9/testfile1 || echo "."
mkdir: cannot create directory '/home/student/testdir/testdir9': Permission denied
[student@studentvm1 ~]$

Now reset the permissions on ~/testdir to 775, and try this last command again. Our compound command syntax using some flow control now takes this general form when we use both the && and || control operators:

preceding commands ; command1 && command2 || command3 ; following commands

This syntax can be stated like so: if command1 exits with a return code of 0, then execute command2; otherwise execute command3. The compound command using the control operators may be preceded and followed by other commands that can be related to the ones in the flow control section but which are unaffected by the flow control. All of the preceding and following commands will execute without regard to anything that takes place inside the flow control compound command.
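As a hedged sketch of that general form (the scratch directory is hypothetical), plus one caveat worth knowing: command3 also runs when command1 succeeds but command2 then fails, so when strict if/else behavior matters, a real if statement is safer:

```shell
# preceding ; command1 && command2 || command3 ; following
workdir=$(mktemp -d)                 # hypothetical scratch location for the demo
cd "$workdir" ; mkdir testdir && touch testdir/testfile1 || echo "creation failed" ; ls testdir

# Caveat: in c1 && c2 || c3, c3 also runs if c1 succeeds but c2 then fails.
# A real if statement only takes the else branch when the tested command fails:
if mkdir testdir2 ; then
    touch testdir2/testfile1
else
    echo "mkdir failed"
fi
```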


Time-saving tools There are some additional tools that we have available, both as SysAdmins and as non-privileged users, that give us a lot of flexibility when performing a wide range of tasks. The use of globbing and sets enables us to match character strings in file names and data streams in order to perform further transformations or actions on them. Brace expansion lets us expand strings that have some commonalities into multiple but different strings. We have already seen several of the meta-characters available in Bash; they provide programming capabilities that greatly enhance the functionality of the shell.

Brace expansion Let’s start with brace expansion because we will use this tool to create a large number of files to use in experiments with special pattern characters. Brace expansion can be used to generate lists of arbitrary strings and insert them into a specific location within an enclosing static string or at either end of a static string. This may be hard to visualize, so let’s just do it.

EXPERIMENT 15-7 First let’s just see what a brace expansion does: [student@studentvm1 ~]$ echo {string1,string2,string3} string1 string2 string3

Well, that is not very helpful, is it? But look what happens when we use it just a bit differently: [student@studentvm1 ~]$ echo "Hello "{David,Jen,Rikki,Jason}. Hello David. Hello Jen. Hello Rikki. Hello Jason.

That looks like something we might be able to use because it can save a good deal of typing. Now try this: [student@studentvm1 ~]$ echo b{ed,olt,ar}s beds bolts bars


Here is how we can generate file names for testing: [student@studentvm1 ~]$ echo testfile{0,1,2,3,4,5,6,7,8,9}.txt testfile0.txt testfile1.txt testfile2.txt testfile3.txt testfile4.txt testfile5.txt testfile6.txt testfile7.txt testfile8.txt testfile9.txt

And here is an even better method for creating sequentially numbered files: [student@studentvm1 ~]$ echo test{0..9}.file test0.file test1.file test2.file test3.file test4.file test5.file test6.file test7.file test8.file test9.file

The {x..y} syntax, where x and y are integers, expands to be all integers between and including x and y. The following is a little more illustrative of that: [student@studentvm1 ~]$ echo test{20..54}.file test20.file test21.file test22.file test23.file test24.file test25. file test26.file test27.file test28.file test29.file test30.file test31. file test32.file test33.file test34.file test35.file test36.file test37. file test38.file test39.file test40.file test41.file test42.file test43. file test44.file test45.file test46.file test47.file test48.file test49.file test50.file test51.file test52.file test53.file test54.file

Now try this one: [student@studentvm1 ~]$ echo test{0..9}.file{1..4}

And this one: [student@studentvm1 ~]$ echo test{0..20}{a..f}.file

And this one which prepends leading zeros to keep the length of the numbers and thus the length of the file names equal. This makes for easy searching and sorting: [student@studentvm1 ~]$ echo test{000..200}{a..f}.file

So far all we have done is to create long lists of strings. Before we do something more or less productive, let's move into a directory in which we can play around ... I mean experiment with creating and working with files. If you have not already done so, make the directory ~/testdir7 the PWD. Verify that there are no other files in this directory, and delete any that you find.


Now let's change the format just a bit and then actually create files using the results as file names:

[student@studentvm1 testdir7]$ touch {my,your,our}.test.file.{000..200}{a..f}.{txt,asc,file,text}

That was fast. I want to know just how fast, so let's delete the files we just created and use the time command to, well, time how long it takes:

[student@studentvm1 testdir7]$ rm * ; time touch {my,your,our}.test.file.{000..200}{a..f}.{txt,asc,file,text}

real    0m0.154s
user    0m0.038s
sys     0m0.110s
[student@studentvm1 testdir7]$

That 0.154 seconds of real time really is fast for creating 14,472 empty files. Verify that count using the wc command. If you get 14,473 as the result, why? Can you find a simple way to obtain the correct result? We will use these files in some of the following experiments. Do not delete them.

Special pattern characters Although most SysAdmins talk about file globbing,2 we really mean special pattern characters that allow us significant flexibility in matching file names and other strings when performing various actions. These special pattern characters allow matching single, multiple, or specific characters in a string:

?   Matches any single character in the specified location within the string.
*   Matches zero or more of any character in the specified location within the string.

In all likelihood you have used these before. Let's experiment with some ways we can use these effectively.

2 Wikipedia, Glob, https://en.wikipedia.org/wiki/Glob_(programming)


EXPERIMENT 15-8 You might have used file globbing to answer the question posed in Experiment 15-7:

[student@studentvm1 testdir7]$ ls *test* | wc
  14472   14472  340092
[student@studentvm1 testdir7]$

In order to achieve this result, we must understand the structure of the file names we created. They all contain the string “test,” so we can use that. The command uses the shell’s built-in file globbing to match all files that contain the string “test” anywhere in their names, and that can have any number of any character both before and after that one specific string. Let’s just see what that looks like without counting the number of lines in the output: [student@studentvm1 testdir7]$ ls *test*

I am sure that “you” don’t want any of “my” files in your home directory. First see how many of “my” files there are, and then delete them all and verify that there are none left: [student@studentvm1 testdir7]$ ls my* | wc ; rm -v my* ; ls my*

The -v option of the rm command lists every file as it deletes it. This information could be redirected to a log file for keeping a record of what was done. This file glob enables the ls command to list every file that starts with “my” and perform actions on them. Find all of “our” files that have txt as the ending extension: [student@studentvm1 testdir7]$ ls our*txt | wc

Locate all files that contain 6 in the tens position of the three-digit number embedded in the file names and that end with asc:

[student@studentvm1 testdir7]$ ls *e.?6?*.asc

We must do this with a little extra work to ensure that we specify the positioning of the "6" carefully, to prevent listing all of the files that contain a 6 only in the hundreds or ones position but not in the tens position of the three-digit number. We know that none of the file names contains 6 in the hundreds position, but this makes our glob a bit more general so that it would work in both of those cases.


We do not care whether the file name starts with our or your, but we use the final "e." of "file." (with the dot) to anchor the next three characters. After "e." in the file name, all of the files have three digits. We do not care about the first and third digits, just the second one. So we use the ? to explicitly define that we have one and only one character before and after the 6. We then use the * to specify that we don't care how many or which characters we have after that, but that we do want to list files that end with "asc".

We want to add some content to some of the files. The file pattern specification we have now is almost where we want it. Let's add content to all files that have a 6 in the middle position of the three-digit number but which also have an "a" after the number, as in x6xa. We want all files that match this pattern regardless of the trailing extension: asc, txt, text, or file. First, let's make certain that our pattern works correctly:

[student@studentvm1 testdir7]$ ls *e.?6?a.*
our.test.file.060a.asc    our.test.file.163a.text   your.test.file.067a.asc
our.test.file.060a.file   our.test.file.163a.txt    your.test.file.067a.file
our.test.file.060a.text   our.test.file.164a.asc    your.test.file.067a.text
our.test.file.060a.txt    our.test.file.164a.file   your.test.file.067a.txt
our.test.file.061a.asc    our.test.file.164a.text   your.test.file.068a.asc
our.test.file.061a.file   our.test.file.164a.txt    your.test.file.068a.file
our.test.file.061a.text   our.test.file.165a.asc    your.test.file.068a.text
<snip>
our.test.file.162a.file   your.test.file.065a.txt   your.test.file.169a.file
our.test.file.162a.text   your.test.file.066a.asc   your.test.file.169a.text
our.test.file.162a.txt    your.test.file.066a.file  your.test.file.169a.txt
our.test.file.163a.asc    your.test.file.066a.text
our.test.file.163a.file   your.test.file.066a.txt

That looks like what we want. The full list is 160 files. We want to store some arbitrary data in these files, so we need to install a little program to generate random passwords, pwgen. Normally this tool would be used to generate decent passwords, but we can just as easily use this random data for other things, too: [root@studentvm1 ~]# dnf -y install pwgen

Test the pwgen tool. The following CLI command generates 50 lines of 80 random characters each: [root@studentvm1 ~]# pwgen 80 50


Now we will build a short command-line program to place a little random data into each existing file that matches the pattern: [student@studentvm1 testdir7]$ for File in `ls *e.?6?a.*` ; do pwgen 80 50 > $File ; done
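A note on the design: the backticks around ls work here, but letting the shell expand the glob directly in the for statement is the more robust idiom, because the file names are not re-parsed by word splitting. This self-contained sketch substitutes /dev/urandom and base64 for pwgen; the tiny stand-in file set is illustrative only:

```shell
# Same loop without parsing ls: the glob expands right in the for statement.
cd "$(mktemp -d)"                    # scratch directory so nothing real is touched
touch our.test.file.{060..062}a.txt  # a tiny stand-in for the book's file set
for File in *e.?6?a.* ; do
    head -c 80 /dev/urandom | base64 > "$File"   # stand-in for pwgen's random data
done
wc -c *a.txt                         # each file now holds some random data
```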

To verify that these files contain some data, we check the file sizes: [student@studentvm1 testdir7]$ ll *e.?6?a.*

Use cat to view the content of a few of the files. File globbing (the use of special pattern characters to select file names from a list) is a powerful tool. However, there is an extension of these special patterns that gives us more flexibility and makes things we could do with complex patterns much easier. This tool is the set.

Sets Sets are a form of special pattern characters. They give us a means of specifying that a particular one-character location in a string contains any character from the list inside the square braces []. Sets can be used alone or in conjunction with other special pattern characters. A set can consist of one or more characters that will be compared against the characters in a specific, single position in the string for a match. The following list shows some typical example sets and the string characters they match:

[0-9]      Any numerical character
[a-z]      Lowercase alpha
[A-Z]      Uppercase alpha
[a-zA-Z]   Any uppercase or lowercase alpha
[abc]      The three lowercase alpha characters, a, b, and c
[!a-z]     No lowercase alpha
[!5-7]     No numbers 5, 6, or 7
[a-gxz]    Lowercase a through g, x, and z
[A-F0-9]   Uppercase A through F, or any numeric

Once again, this will be easier to explain if we just go right to the experiment.


EXPERIMENT 15-9 Perform this experiment as the student user. The PWD should still be ~/testdir7. Start by finding the files that contain a 6 in the center of the three-digit number in the file name:

[student@studentvm1 testdir7]$ ls *[0-9][6][0-9]*

We could use this alternate pattern because we know that the leftmost digit must be 0 or 1. Count the number of file names returned for both cases to verify this: [student@studentvm1 testdir7]$ ls *[01][6][0-9]*

Now let's look for the file names that contain a 6 in only the center position, but not in either of the other two digits:

[student@studentvm1 testdir7]$ ls *[!6][6][!6]*

Find the files that match the pattern we have so far but which also end in t: [student@studentvm1 testdir7]$ ls *[!6][6][!6]*t

Now find all of the files that match the preceding pattern but which also have “a” or “e” immediately following the number: [student@studentvm1 testdir7]$ ls *[!6][6][!6][ae]*t

These are just a few examples of using sets. Continue to experiment with them to enhance your understanding even further. Sets provide a powerful extension to pattern matching that gives us even more flexibility in searching for files. It is important to remember, however, that the primary use of these tools is not merely to “find” these files so we can look at their names. It is to locate files that match a pattern so that we can perform some operation on them, such as deleting, moving, adding text to them, searching their contents for specific character strings, and more.
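As a small illustrative sketch of that idea (the file names here are made up, not the book's set), we can move every set match rather than merely listing it:

```shell
# Act on set matches instead of just listing them: move files whose embedded
# number has a 6 in the tens place only.
cd "$(mktemp -d)"                          # scratch directory for the demo
touch report{058..062}.txt report066.txt   # illustrative sample files
mkdir archive
mv *[!6][6][!6]*.txt archive/              # 6 in the tens place, no 6 beside it
ls archive                                 # 060, 061, 062 moved; 058, 059, 066 stay
```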


Meta-characters Meta-characters are ones that have special meaning to the shell. The Bash shell has defined a number of these meta-characters, many of which we have already encountered in our explorations:

$          Shell variable
~          Home directory variable
&          Run command in background
;          Command termination/separation
>, >>, <   I/O redirection
|          Command pipe
', ", \    Meta quotes
`...`      Command substitution
(), {}     Command grouping
&&, ||     Shell control operators; conditional command execution

As we progress further through this course, we will explore the meta-characters we already know in more detail, and we will learn about the few we do not already know.
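A few of these can be demonstrated together; this sketch only assumes a readable /etc and a writable /tmp:

```shell
# A quick tour of several meta-characters from the list above.
entries=$(ls /etc | wc -l)      # `$( )` is command substitution; $ expands a variable
echo "/etc contains $entries entries"
( cd /tmp && pwd )              # ( ) groups commands in a subshell
pwd                             # our own working directory is unchanged afterward
sleep 1 &                       # & runs the command in the background
wait                            # pause until the background job completes
```

The subshell lines illustrate a point that trips up many new users: a cd inside ( ) affects only the subshell, never the invoking shell.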

Using grep Using file globbing patterns can be very powerful, as we have seen. We have been able to perform many tasks on large numbers of files very efficiently. As its name implies, however, file globbing is intended for use on file names, so it does not work on the content of those files. It is also somewhat limited in its capabilities. There is a tool, grep, that can be used to extract and print to STDOUT all of the lines from a data stream based on matching patterns. Those patterns can range from simple text patterns to very complex regular expressions (regex). Written by Ken Thompson3 and first released in 1974, the grep utility is provided by the GNU Project4 and is installed by default on every version of Unix and every Linux distribution I have ever used.

3 Wikipedia, Ken Thompson, https://en.wikipedia.org/wiki/Ken_Thompson
4 The GNU Project, www.gnu.org


In terms of globbing characters, which grep does not understand, the default search pattern for the grep command is *PATTERN*. There is an implicit wildcard match before and after the search pattern, so you can assume that any pattern you specify will be found no matter where it exists in the lines being scanned: at the beginning, anywhere in the middle, or at the end. It is therefore not necessary to explicitly state that there are characters in the string before and/or after the string for which we are searching.

EXPERIMENT 15-10 Perform this experiment as root. Although non-privileged users have access to some of the data we will be searching, only root has access to all of it. One of the most common tasks I do that requires the use of the grep utility is scanning through log files to find information pertaining to specific things. For example, I may need to determine information about how the operating system sees the network interface cards (NICs), starting with their BIOS names,5 ethX. Information about the NICs installed in the host can be found using the dmesg command as well as in the messages log files in /var/log. We'll start by looking at the output from dmesg. First, just pipe the output through less and use the search facility built into less:

[root@studentvm1 ~]# dmesg | less

You can page through the screens generated by less and use the Mark I Eyeball6 to locate the "eth" string, or you can use the search. Initiate the search facility by typing the slash (/) character and then the string for which you are searching: /eth. The search will highlight the string, and you can use the "n" key to find the next instance of the string and the "N" key (uppercase n) to search backward for the previous instance. Searching through pages of data, even with a good search facility, is easier than eyeballing it, but not as easy as using grep. The -i option tells grep to ignore case and display the "eth" string regardless of the case of its letters. It will find the strings eth, ETH, Eth, eTh, and so on, which are all different in Linux:

5 Most modern Linux distributions rename the NICs from the old BIOS names, ethX, to something like enp0s3. That is a discussion we will encounter in Chapters 33 and 36.
6 Wikipedia, Visual Inspection, https://en.wikipedia.org/wiki/Visual_inspection


[root@studentvm1 ~]# dmesg | grep -i eth
[    1.861192] e1000 0000:00:03.0 eth0: (PCI:33MHz:32-bit) 08:00:27:a9:e6:b4
[    1.861199] e1000 0000:00:03.0 eth0: Intel(R) PRO/1000 Network Connection
[    2.202563] e1000 0000:00:08.0 eth1: (PCI:33MHz:32-bit) 08:00:27:50:58:d4
[    2.202568] e1000 0000:00:08.0 eth1: Intel(R) PRO/1000 Network Connection
[    2.205334] e1000 0000:00:03.0 enp0s3: renamed from eth0
[    2.209591] e1000 0000:00:08.0 enp0s8: renamed from eth1
[root@studentvm1 ~]#

These results show data about the BIOS names, the PCI bus on which they are located, the MAC addresses, and the new names that Linux has given them. Now look for instances of the string that begins the new NIC names, “enp.” Did you find any?

Note These numbers enclosed in square braces, [ 2.205334], are timestamps that indicate the log entry was made that number of seconds after the kernel took over control of the computer.

In this first example of usage, grep takes the incoming data stream using STDIN and then sends the output to STDOUT. The grep utility can also use a file as the source of the data stream. We can see that in this next example, in which we grep through the message log files for information about our NICs:

[root@studentvm1 ~]$ cd /var/log ; grep -i eth messages*
<snip>
messages-20181111:Nov 6 09:27:36 studentvm1 dbus-daemon[830]: [system] Rejected send message, 2 matched rules; type="method_call", sender=":1.89" (uid=1000 pid=1738 comm="/usr/bin/pulseaudio --daemonize=no ") interface="org.freedesktop.DBus.ObjectManager" member="GetManagedObjects" error name="(unset)" requested_reply="0" destination="org.bluez" (bus)
messages-20181111:Nov 6 09:27:36 studentvm1 pulseaudio[1738]: E: [pulseaudio] bluez5-util.c: GetManagedObjects() failed: org.freedesktop.DBus.Error.AccessDenied: Rejected send message, 2 matched rules; type="method_call", sender=":1.89" (uid=1000 pid=1738 comm="/usr/bin/pulseaudio --daemonize=no ") interface="org.freedesktop.DBus.ObjectManager" member="GetManagedObjects" error name="(unset)" requested_reply="0" destination="org.bluez" (bus)

messages-20181118:Nov 16 07:41:00 studentvm1 kernel: e1000 0000:00:03.0 eth0: (PCI:33MHz:32-bit) 08:00:27:a9:e6:b4
messages-20181118:Nov 16 07:41:00 studentvm1 kernel: e1000 0000:00:03.0 eth0: Intel(R) PRO/1000 Network Connection
messages-20181118:Nov 16 07:41:00 studentvm1 kernel: e1000 0000:00:08.0 eth1: (PCI:33MHz:32-bit) 08:00:27:50:58:d4
messages-20181118:Nov 16 07:41:00 studentvm1 kernel: e1000 0000:00:08.0 eth1: Intel(R) PRO/1000 Network Connection
<snip>

The first part of each line in our output data stream is the name of the file in which the matched lines were found. If you do a little exploration of the current messages file, which is named just that with no appended date, you may or may not find any lines matching our search pattern. I did not with my VM, so using the file glob to create the pattern "messages*" searches all of the files starting with messages. This file glob matching is performed by the shell and not by the grep tool.

You will notice also that on this first try, we found more than we wanted. Some lines that have the "eth" string in them were found as part of the word "method." So let's be a little more explicit and use a set as part of our search pattern:

[root@studentvm1 log]# grep -i eth[0-9] messages*

This is better, but what we really care about are the lines that pertain to our NICs after they were renamed. We now know the names that our old NIC names were changed to, so we can also search for those. Fortunately for us, grep allows multiple search patterns, using the -e option to specify each search expression. Each search expression must be specified using a separate instance of the -e option:

[root@studentvm1 log]# grep -i -e eth[0-9] -e enp0 messages*

That does work, but there is also an extension that allows us to search using extended regular expressions.7 The grep patterns we have been using so far are basic regular expressions (BRE). To get more complex, we can use extended regular expressions (ERE) by adding the -E option, which turns on EREs:

[root@studentvm1 log]# grep -Ei "eth[0-9]|enp0" messages*

7 Chapter 26 explores this subject in detail.


You may wish to use the wc (word count) command to verify that both of the last two commands produce the same number of lines in their results. This is functionally the same as using the egrep command, which is deprecated and may not be available in the future. For now, egrep is still available for backward compatibility with scripts that use it and which have not been updated to use grep -E. Note that the extended regular expression is enclosed in double quotes.

Now make /etc the PWD. Sometimes I have needed to list all of the configuration files in the /etc directory. These files typically end with a .conf or .cnf extension or with rc. To do this, we need an anchor to specify that the search string is at the end of the string being searched. We use the dollar sign ($) for that. The syntax of the search string in the following command finds all the configuration files with the listed endings. The -R option for the ll or ls command causes the command to recurse into all of the subdirectories:

[root@studentvm1 etc]# ls -aR | grep -E "conf$|cnf$|rc$"

We can also use the caret (^) to anchor the beginning of the string. Suppose that we want to locate all files in /etc that begin with kde because they are used in the configuration of the KDE desktop:

[root@studentvm1 etc]# ls -R | grep -E "^kde"
kde
kde4rc
kderc
kde.csh
kde.sh
kdebugrc

One of the advanced features of grep is the ability to read the search patterns from a file containing one or more patterns. This is very useful if the same complex searches must be performed on a regular basis. The grep tool is powerful and complex. The man page offers a good amount of information, and the GNU Project provides a free 36-page manual8 to assist with learning and using grep. That document is available as HTML that can be read online with a web browser, as ASCII text, as an info document, as a downloadable PDF file, and more.

8. GNU Project, GNU grep, www.gnu.org/software/grep/manual/


Finding files

The ls command and its aliases such as ll are designed to list all of the files in a directory. Special pattern characters and the grep command can be used to narrow down the list of files sent to STDOUT. But there is still something missing. There is a bit of a problem with the command ls -R | grep -E "^kde", which we used in Experiment 15-8. Some of the files it found were in subdirectories of /etc, but the ls command does not display the names of the subdirectories in which those files are stored. Fortunately the find command is designed explicitly to search for files in a directory tree using patterns and either to list the files and their directories or to perform some operation on them. The find command can also use attributes such as the date and time a file was created or accessed, whether it was created or modified before or after a date and time, its size, permissions, user ID, group ID, and much more. These attributes can be combined to become very explicit, such as all files that are larger than 12M in size, that were created more than five years ago, that have not been accessed in more than a year, that belong to the user with UID XXXX, and that are regular files– in other words, not directories, symbolic links, sockets, named pipes, and so on. Once these files are found, the find command has built-in options to perform actions such as listing, deleting, or printing them, or even executing system commands that use the file name as an argument, such as to move or copy them. This is a very powerful and flexible command.
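Combining attribute tests works as described above; here is a small sketch run in a scratch directory, with all file names and sizes invented for illustration:

```shell
# Work in a scratch directory so nothing real is touched.
mkdir -p /tmp/findattr && cd /tmp/findattr

# Create one 8 KiB file and one empty file.
dd if=/dev/zero of=big.dat bs=1024 count=8 2>/dev/null
touch empty.dat

# Combine two tests: regular files (-type f) that are larger than 4 KiB (-size +4k).
find . -type f -size +4k
```

Only big.dat satisfies both tests, so it is the only name printed.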

EXPERIMENT 15-11

Perform this experiment as the root user. The following command finds all files in /etc and its subdirectories that start with “kde” and, because it uses -iname instead of -name, the search is not case sensitive:

[root@studentvm1 ~]# find /etc -iname "kde*"
/etc/xdg/kdebugrc
/etc/profile.d/kde.csh
/etc/profile.d/kde.sh
/etc/kde4rc
/etc/kderc
/etc/kde
[root@studentvm1 ~]#


Perform the rest of these commands as the student user. Make the student user’s home directory (~) the PWD. Suppose you want to find all of the empty (zero-length) files that were created in your home directory as part of our earlier experiments. The next command does this. It starts looking in the home (~) directory for files (type f) that are empty and that contain the string “test.file” in their names:

[student@studentvm1 ~]$ find . -type f -empty -name "*test.file*" | wc -l
9488
[student@studentvm1 ~]$

I have 9,488 empty files from previous experiments in my home directory, but your number may be different. This large number is to be expected since we created a very large number of empty files in some earlier experiments. Run this same command, except do not run the data stream through the wc command; just list the names. Notice that the file names are not sorted.

But let’s also see whether any of these files are not part of our previous experiments, so we want to look for empty files whose names do not contain the string “test.file”. The “bang” (!) character inverts the meaning of the -name option so that only files that do not match the string we supply for the file name are displayed:

[student@studentvm1 ~]$ find . -type f -empty ! -name "*test.file*"
./link3
./.local/share/ranger/tagged
./.local/share/vifm/Trash/000_file02
./.local/share/vifm/Trash/000_file03
./.local/share/orage/orage_persistent_alarms.txt
./.local/share/mc/filepos
./.local/share/user-places.xbel.tbcache
./.cache/abrt/applet_dirlist
./file005
./newfile.txt
./testdir/file006
./testdir/file077
./testdir/link2
./testdir/file008
./testdir/file055
./testdir/file007
<snip>


Let’s also find the files that are not empty:

[student@studentvm1 ~]$ find . -type f ! -empty -name "*test.file*" | wc -l
160
[student@studentvm1 ~]$

We now know that 160 files that contain the string “test.file” in their names are not empty. We also know that performing an action, such as deletion, on the files found by the previous command will not affect any other important files. So let’s delete all of the empty files with the string “test.file” in their names. Then verify that none of these empty files remain and that the non-empty files are still there:

[student@studentvm1 ~]$ find . -type f -empty -name "*test.file*" -delete
[student@studentvm1 ~]$ find . -type f -empty -name "*test.file*"
[student@studentvm1 ~]$ find . -type f ! -empty -name "*test.file*" | wc -l
160

Here are a couple more interesting things to try. First, create a file that is quite large for our next example– over 1GB in size and containing random data. It took about 15 minutes to generate this file on my VM, so be patient:

[student@studentvm1 ~]$ pwgen -s 80 14000000 > testdir7/bigtestfile.txt

Use the -ls option to provide a listing of the files found, with information like that of the ls -dils command. Note that the iNode9 number is in the leftmost column, which means that the data is sorted by iNode number:

[student@studentvm1 ~]$ find . -type f ! -empty -name "*test.file*" -ls

We must do something a bit different to sort the results by size. This is where the -exec option of the find command is useful. This next command finds all files larger than 3K in size, generates a listing of them, and then pipes that data stream through the sort command, which uses the option -n for a numeric sort and -k 7 to sort on the 7th field of the output lines, which is the file size in bytes. White space is the default field separator:

[student@studentvm1 ~]$ find -type f -size +3k -ls | sort -nk 7
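The -exec action mentioned above runs a command once for each file found, with {} standing in for the file name and \; terminating the command. A minimal sketch, with an invented scratch directory and throwaway file names:

```shell
# Scratch directory with a few throwaway files (invented names).
mkdir -p /tmp/findexec && cd /tmp/findexec
touch a.bak b.bak c.txt

# Run echo once per .bak file found; {} is replaced by each file name.
find . -type f -name "*.bak" -exec echo found: {} \;
```

Each .bak file produces one "found:" line; c.txt is skipped because it does not match the -name test.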

We will see more of the find command later.

9. iNodes will be covered in Chapter 18.


I use the find command frequently because of its ability to locate files based on very exacting criteria. This gives me exacting yet flexible control over the files on which I perform SysAdmin tasks, and it allows me to automate choosing them.

Chapter summary

This chapter has provided an exploration of the Bash shell and the use of shell tools such as file globbing, brace expansion, control operators, and sets. It has also introduced some important and frequently used command-line tools. We have looked at many aspects of using the Bash shell and learned how to perform some powerful and amazing things. For even more detail on the Bash shell, gnu.org has the complete GNU Bash Manual10 available in several formats, including PDF and HTML.

This is most certainly not a complete exploration of Bash and some of the advanced command-line tools available to us as SysAdmins, but it should be enough to get you started and to interest you in learning more.

Exercises

Perform the following exercises to complete this chapter:

1. In Chapter 7 we installed some other shells. Choose one of those, and spend a little time performing simple tasks with it to gain a little knowledge of its grammar and syntax. Read the man page for the shell you chose to determine which commands are internal.

2. Do Bash and the shell you chose in exercise 1 have some of the same internal commands?

3. What does the type command do if the cpuHog shell script is located in your home directory rather than ~/bin?

4. What is the function of the $PATH environment variable?

10. Free Software Foundation, GNU Bash Manual, www.gnu.org/software/Bash/manual/


5. Generally speaking, why might you want to use an external command instead of a shell internal command that performs the same function and has the same name?

6. Locate all of the configuration files in your home directory and all of its subdirectories.

7. What is the largest file in the /etc directory?

8. What is the largest file in the entire filesystem (/)?


CHAPTER 16

Linux Boot and Startup

Objectives

In this chapter you will learn

•	The difference between Linux boot and startup

•	What happens during the hardware boot sequence

•	What happens during the Linux boot sequence

•	What happens during the Linux startup sequence

•	How to manage and modify the Linux boot and startup sequences

•	The function of the display and window managers

•	How the login process works for both virtual consoles and a GUI

•	What happens when a user logs off

This chapter explores the hardware boot sequence, the bootup sequence using the GRUB2 bootloader, and the startup sequence as performed by the systemd initialization system. It covers in detail the sequence of events required to change the state of the computer from off to fully up and running with a user logged in. This chapter is about modern Linux distributions like Fedora and other Red Hat–based distributions that use systemd for startup, shutdown, and system management. systemd is the modern replacement for init and the SystemV init scripts.

© David Both 2020. D. Both, Using and Administering Linux: Volume 1, https://doi.org/10.1007/978-1-4842-5049-5_16

Overview

The complete process that takes a Linux host from an off state to a running state is complex, but it is open and knowable. Before we get into the details, a quick overview of the time from when the host hardware is turned on until the system is ready for a user to log in will help orient us. Most of the time we hear about “the boot process” as a single entity, but it is not. There are, in fact, three parts to the complete boot and startup process:

•	Hardware boot, which initializes the system hardware

•	Linux boot, which loads the Linux kernel and systemd

•	Linux startup, in which systemd makes the host ready for productive work

It is important to separate the hardware boot, the Linux boot process, and the Linux startup, and to explicitly define the demarcation points between them. Understanding these differences and what part each plays in getting a Linux system to a state where it can be productive makes it possible to manage these processes and to better determine the portion in which a problem is occurring during what most people refer to as “boot.”

Hardware boot

The first step of the Linux boot process really has nothing whatever to do with Linux. This is the hardware portion of the boot process, and it is the same for any Intel-based operating system. When power is first applied to the computer, or the VM we have created for this course, it runs the power-on self-test (POST)1, which is part of BIOS2 or the much newer Unified Extensible Firmware Interface3 (UEFI). BIOS stands for Basic I/O System, and POST stands for power-on self-test. When IBM designed the first PC back in 1981, BIOS was designed to initialize the hardware components. POST is the part of BIOS whose task is to ensure that the computer hardware functions correctly. If POST fails, the computer may not be usable, and so the boot process does not continue.

Most modern motherboards provide the newer UEFI as a replacement for BIOS. Many motherboards also provide legacy BIOS support. Both BIOS and UEFI perform the same functions– hardware verification and initialization, and loading the boot loader. The VM we created for this course uses a BIOS interface, which is perfectly fine for our purposes.

1. Wikipedia, Power On Self Test, http://en.wikipedia.org/wiki/Power-on_self-test
2. Wikipedia, BIOS, http://en.wikipedia.org/wiki/BIOS
3. Wikipedia, Unified Extensible Firmware Interface, https://en.wikipedia.org/wiki/Unified_Extensible_Firmware_Interface


BIOS/UEFI POST checks basic operability of the hardware. Then it locates the boot sectors on all attached bootable devices, including rotating or SSD hard drives, DVD or CD-ROM, or bootable USB memory sticks like the live USB device we used to install the StudentVM1 virtual machine. The first boot sector it finds that contains a valid master boot record (MBR)4 is loaded into RAM, and control is then transferred to the RAM copy of the boot sector.

The BIOS/UEFI user interface can be used to configure the system hardware for things like overclocking, specifying CPU cores as active or inactive, specific devices from which the system might boot, and the sequence in which those devices are to be searched for a bootable sector. I do not create or boot from bootable CD or DVD devices any more; I only use bootable USB thumb drives to boot from external, removable devices. Because I sometimes do boot from an external USB drive– or in the case of a VM, a bootable ISO image like that of the live USB device– I always configure my systems to boot first from the external USB device and then from the appropriate internal disk drive. This is not considered secure in most commercial environments, but then I do a lot of boots to external USB drives. If someone steals the whole computer, or if it is destroyed in a natural disaster, I can revert to the backups5 I keep in my safe deposit box. In most environments you will want to be more secure and set the host to boot from the internal boot device only. Use a BIOS password to prevent unauthorized users from accessing BIOS and changing the default boot sequence.

Hardware boot ends when the boot sector assumes control of the system.

Linux boot

The boot sector that is loaded by BIOS is really stage 1 of the GRUB6 boot loader. The Linux boot process itself is composed of multiple stages of GRUB. We consider each stage in this section.

4. Wikipedia, Master Boot Record, https://en.wikipedia.org/wiki/Master_boot_record
5. Backups are discussed in Chapter 18 of Volume 2.
6. GNU, GRUB, www.gnu.org/software/grub/manual/grub


GRUB

GRUB2 is the newest version of the GRUB bootloader and is used much more frequently these days. We will not cover GRUB1 or LILO in this course because they are much older than GRUB2. Because it is easier to write and say GRUB than GRUB2, I will use the term GRUB in this chapter, but I will be referring to GRUB2 unless specified otherwise.

GRUB2 stands for “GRand Unified Bootloader, version 2,” and it is now the standard bootloader for most current Linux distributions. GRUB is the program that makes the computer just smart enough to find the operating system kernel and load it into memory, but it takes three stages of GRUB to do this. Wikipedia has an excellent article on GNU GRUB.7

GRUB has been designed to be compatible with the multiboot specification, which allows GRUB to boot many versions of Linux and other free operating systems. It can also chain load the boot record of proprietary operating systems. GRUB allows the user to choose to boot from among several different kernels of your Linux distribution if more than one is present due to system updates. This affords the ability to boot to a previous kernel version if an updated one fails somehow or is incompatible with an important piece of software. GRUB can be configured using the /boot/grub/grub.conf file.

GRUB1 is now considered to be legacy and has been replaced in most modern distributions with GRUB2, which is a complete rewrite of GRUB1. Red Hat–based distros upgraded to GRUB2 around Fedora 15 and CentOS/RHEL 7. GRUB2 provides the same boot functionality as GRUB1, but GRUB2 also provides a mainframe-like command-based pre-OS environment and allows more flexibility during the pre-boot phase. The primary function of GRUB is to get the Linux kernel loaded into memory and running. The use of GRUB2 commands within the pre-OS environment is outside the scope of this chapter.
Although GRUB does not officially use the stage terminology for its three stages, it is convenient to refer to them in that way, so I will.

GRUB stage 1

As mentioned in the BIOS/UEFI POST section, at the end of POST, BIOS/UEFI searches the attached disks for a boot record, which is located in the master boot record (MBR); it loads the first one it finds into memory and then starts execution of the boot record.

7. Wikipedia, GNU GRUB, www.gnu.org/software/grub/grub-documentation.html


The bootstrap code, that is, GRUB stage 1, is very small because it must fit into the first 512-byte sector on the hard drive along with the partition table.8 The total amount of space allocated for the actual bootstrap code in a classic, generic MBR is 446 bytes. The 446-byte file for stage 1 is named boot.img and does not contain the partition table. The partition table is created when the device is partitioned and is overlaid onto the boot record starting at byte 447. In UEFI systems, the partition table has been moved out of the MBR and into the space immediately following the MBR. This provides more space for defining partitions, so it allows a larger number of partitions to be created.

Because the boot record must be so small, it is also not very smart and does not understand filesystem structures such as EXT4. Therefore the sole purpose of stage 1 is to load GRUB stage 1.5. In order to accomplish this, stage 1.5 of GRUB must be located in the space between the boot record (along with the UEFI partition data) and the first partition on the drive. After loading GRUB stage 1.5 into RAM, stage 1 turns control over to stage 1.5.
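The 446/512-byte split described above can be sketched with dd on a synthetic file standing in for the real device; do not experiment on a real boot drive. The file names and contents here are invented:

```shell
# Build a fake 512-byte "boot record" and stamp the 0x55AA boot signature
# at bytes 511-512, as a real MBR would have (synthetic data, not a real disk).
dd if=/dev/zero of=/tmp/fakembr.bin bs=512 count=1 2>/dev/null
printf '\125\252' | dd of=/tmp/fakembr.bin bs=1 seek=510 conv=notrunc 2>/dev/null

# Extract only the 446-byte bootstrap-code area, leaving the partition-table
# area (bytes 447-512) behind.
dd if=/tmp/fakembr.bin bs=446 count=1 2>/dev/null | wc -c
```

The byte count printed is 446, the size of the bootstrap-code area and of the boot.img file.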

EXPERIMENT 16-1

Log in to a terminal session as root if there is not one already available. As root in a terminal session, run the following command to verify the identity of the boot drive on your VM. It should be the drive containing the /boot partition:

[root@studentvm1 ~]# lsblk -i
NAME                       MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda                          8:0    0  60G  0 disk
|-sda1                       8:1    0   1G  0 part /boot
`-sda2                       8:2    0  59G  0 part
  |-fedora_studentvm1-root 253:0    0   2G  0 lvm  /
  |-fedora_studentvm1-swap 253:1    0   6G  0 lvm  [SWAP]
  |-fedora_studentvm1-usr  253:2    0  15G  0 lvm  /usr
  |-fedora_studentvm1-home 253:3    0   4G  0 lvm  /home
  |-fedora_studentvm1-var  253:4    0  10G  0 lvm  /var
  `-fedora_studentvm1-tmp  253:5    0   5G  0 lvm  /tmp
[root@studentvm1 ~]#

8. Wikipedia, GUID Partition Table, https://en.wikipedia.org/wiki/GUID_Partition_Table


Use the dd command to view the boot record of the boot drive. For this experiment I assume it is assigned to the /dev/sda device. The bs= argument in the command specifies the block size, and the count= argument specifies the number of blocks to dump to STDIO. The if= argument (InFile) specifies the source of the data stream, in this case, the boot drive:

[root@studentvm1 ~]# dd if=/dev/sda bs=512 count=1

This prints the text of the boot record, which is the first block on the disk– any disk. In this case, there is information about the filesystem and, although it is unreadable because it is stored in binary format, the partition table. Stage 1 of GRUB or some other boot loader is located in this sector, but that, too, is mostly unreadable by us mere humans. We can see a couple of messages in ASCII text that are stored in the boot record.

It might be easier to read these messages if we do this a bit differently. The od command (octal display) displays the data stream piped to it in octal format in a nice matrix that makes the content a bit easier to read. The -a option tells the command to convert into readable ASCII format characters where possible. The dash (-) at the end of the command tells od to take input from the STDIN stream rather than from a file:

[root@studentvm1 ~]# dd if=/dev/sda bs=512 count=1 | od -a -
1+0 records in
1+0 records out
0000000kc dle dlesoP< nul08 nul nulsoXso@
0000020{> nul|? nul ack9 nul stxs$j! ack nul
0000040 nul>> bel8 eotuvt etxF dle soh~~ belu
0000060sk syn4 stx0 soh; nul|2 nulnlt sohvt
0000100L stxM dc3j nul| nul nulk~ nul nul nul nul nul
0000120 nul nul nul nul nul nul nul nul nul nul nul nul soh nul nul nul
0000140 nul nul nul nul delz dle dlevB nult enqvBp
0000160t stx2 nuljy| nul nul1@soXsoP<


0000200 nulsp {sp d | < del t stxbs B R > enq |
00002201@htD eot@bsD delhtD stxG eot dle nul
0000240fvtrs\|fht\bsfvtrs`|fht
0000260\ffGD ack nulp4BM dc3r enq; nulp
0000300k stxkK`rs9 nul sohso[1v? nul nul
0000320so F | s %us a ` 8 nul ; M sub f enq @
0000340ugs8 bel;? nul nulf1vf;TCP
0000360Af9 nul stx nul nulf:bs nul nul nulM suba
0000400 del&Z|>us}k etx>.}h4 nul>
00004203}h. nulM cank~GRUBsp nulG
0000440eom nulHardspDisk nulRe
0000460ad nulspErrorcrnl nul; soh nul4
0000500so M dle , < nul u t C nul nul nul nul nul nul nul
0000520 nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
*
0000660 nul
0000700 soh
0000720B
0000740 nul
0000760 nul
0001000

nul nul nul nul eot etx~B delso~B nul nul nul nul nul nul nul nul

nul del del nul nul

nul nul\;^. nul nulbs nul nul nul nulsp nulbssp nul nulx_ nul nul nul nul nul nul nul nul nul nul nul nul nul nul

nul nul eot nul nul~ bel nul nul nul nul nul nulU*

Note the star (*) (splat/asterisk) between addresses 0000520 and 0000660. This indicates that all of the data in that range is the same as the last line before it, 0000520, which is all null characters. This saves space in the output stream. The addresses are in octal, which is base 8.

A generic boot record that does not contain a partition table is located in the /boot/grub2/i386-pc directory. Let’s look at the content of that file. It is not necessary to specify the block size and the count because we are looking at a file that already has a limited length. We can also use od directly and specify the file name rather than piping the data stream through it from the dd command, although we could do that, too.

Note In Fedora 30 and above, the boot.img files are located in the /usr/lib/grub/i386-pc/ directory. Be sure to use that location when performing the next part of this experiment.


[root@studentvm1 ~]# od -a /boot/grub2/i386-pc/boot.img
0000000kc dle nul nul nul nul nul nul nul nul nul nul nul nul nul
0000020 nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
*
0000120 nul nul nul nul nul nul nul nul nul nul nul nul soh nul nul nul
0000140 nul nul nul nul delzk enqvB nult enqvBp
0000160t stx2 nuljy| nul nul1@soXsoP<
0000200 nulsp {sp d | < del t stxbs B R > enq |
00002201@htD eot@bsD delhtD stxG eot dle nul
0000240fvtrs\|fht\bsfvtrs`|fht
0000260\ffGD ack nulp4BM dc3r enq; nulp
0000300k stxkK`rs9 nul sohso[1v? nul nul
0000320so F | s %us a ` 8 nul ; M sub f enq @
0000340ugs8 bel;? nul nulf1vf;TCP
0000360Af9 nul stx nul nulf:bs nul nul nulM suba
0000400 del&Z|>us}k etx>.}h4 nul>
00004203}h. nulM cank~GRUBsp nulG
0000440eom nulHardspDisk nulRe
0000460ad nulspErrorcrnl nul; soh nul4
0000500so M dle , < nul u t C nul nul nul nul nul nul nul
0000520 nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
*
0000760 nul nul nul nul nul nul nul nul nul nul nul nul nul nulU*
0001000

There is a second area of duplicated data in this output, between addresses 0000020 and 0000120. Because that area is different from the actual boot record and is all null in this file, we can infer that this is where the partition table is located in the actual boot record. There is also an interesting utility that enables us to look at just the ASCII text strings contained in a file:

[root@studentvm1 ~]# strings /boot/grub2/i386-pc/boot.img
TCPAf
GRUB
Geom
Hard Disk
Read
Error


This tool makes it easier to locate actual text strings than sorting through many lines of occasional random ASCII characters to find meaningful strings. But note that, like the first line of the preceding output, not all text strings have meaning to humans. The point here is that the GRUB boot record is installed in the first sector of the hard drive or other bootable media, using the boot.img file as the source. The partition table is then superimposed on the boot record in its specified location.

GRUB stage 1.5

As mentioned earlier, stage 1.5 of GRUB must be located in the space between the boot record (along with the UEFI partition data) and the first partition on the disk drive. This space was left unused historically for technical and compatibility reasons and is sometimes called the “boot track” or the “MBR gap.” The first partition on the hard drive begins at sector 63, and with the MBR in sector 0, that leaves 62 512-byte sectors– 31,744 bytes– in which to store stage 1.5 of GRUB, which is distributed as the core.img file. The core.img file is 28,535 bytes as of this writing, so there is plenty of space available between the MBR and the first disk partition in which to store it.
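The MBR-gap arithmetic above can be checked directly in the shell:

```shell
# Sectors 1-62 lie between the MBR (sector 0) and the first partition (sector 63),
# so the gap holds 62 sectors of 512 bytes each.
echo $(( 62 * 512 ))
```

This prints 31,744, comfortably more than the 28,535 bytes of core.img.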

EXPERIMENT 16-2

The file containing stage 1.5 of GRUB is stored as /boot/grub2/i386-pc/core.img. You can verify this, as we did earlier with stage 1, by comparing the code in the file with that stored in the MBR gap of the boot drive:

[root@studentvm1 ~]# dd if=/dev/sda bs=512 count=1 skip=1 | od -a
1+0 records in
1+0 records out
512 bytes copied, 0.000132697 s, 3.9 MB/s
0000000R?t sohf1@vtEbsfA`htf#
0000020l sohfvt- etx}bs nulsi eotd nul nul| del
0000040 nultFfvtgsfvtM eotf1@0 del9
0000060Ebs del etxvtEbs)Ebsf soh enqf etxU
0000100 eot nulG eot dle nulhtD stxfht\bsfhtL
0000120ff G D ack nul p P G D eot nul nul 4 B M dc3
0000140si stx \ nul ; nul p k h fvt E eot fht @


0000160si enq D nul fvt enq f 1 R f w 4bs Tnl
0000200f1Rfwt eotbsTvthtDff;Dbs
0000220sicr $ nulvt eot * Dnl 9 Ebs del etxvt E
0000240bs ) Ebs f soh enq f etx U eot nulnl Tcr @


0000260b acknlLnl~AbsQnllffZRnlt
0000300vt P ; nul pso C 1 [ 4 stx M dc3 r qff
0000320CsoEnlXA` enq sohEnl`rsA` etx
0000340ht A 1 del 1 vso [ | s %us > V soh h
0000360 ack nula etx}bs nulsi enq" del etxoffi dc4
0000400 del`8 bel;; nul nulsoCf1 del? nul stx
0000420f;TCPAf>l soh nul nulgfvtso
0000440f1vf:ht nul nul nulM suba>X sohh
0000460F nulZj nul stx nul nul>[ sohh: nulk ack
0000500>` sohh2 nul>e sohh, nulk~lo
0000520ading nul. nulcrnl nulGeom nul
0000540Read nulspError nul nul nul nul nul
0000560; soh nul4soM dleFnl eot< nulurC nul
0000600 nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
*
0000760 nul nul nul nul stx nul nul nul nul nul nul nulo nulspbs
0001000
[root@studentvm1 ~]# dd if=/boot/grub2/i386-pc/core.img bs=512 count=1 | od -a
1+0 records in
1+0 records out
512 bytes copied, 5.1455e-05 s, 10.0 MB/s
0000000R?t sohf1@vtEbsfA`htf#
0000020l sohfvt- etx}bs nulsi eotd nul nul| del
0000040 nultFfvtgsfvtM eotf1@0 del9
0000060Ebs del etxvtEbs)Ebsf soh enqf etxU
0000100 eot nulG eot dle nulhtD stxfht\bsfhtL
0000120ff G D ack nul p P G D eot nul nul 4 B M dc3
0000140si stx \ nul ; nul p k h fvt E eot fht @
0000160si enq D nul fvt enq f 1 R f w 4bs Tnl
0000200f1Rfwt eotbsTvthtDff;Dbs
0000220sicr $ nulvt eot * Dnl 9 Ebs del etxvt E
0000240bs ) Ebs f soh enq f etx U eot nulnl Tcr @
0000260b acknlLnl~AbsQnllffZRnlt
0000300vt P ; nul pso C 1 [ 4 stx M dc3 r qff
0000320CsoEnlXA` enq sohEnl`rsA` etx
0000340ht A 1 del 1 vso [ | s %us > V soh h
0000360 ack nula etx}bs nulsi enq" del etxoffi dc4
0000400 del`8 bel;; nul nulsoCf1 del? nul stx
0000420f;TCPAf>l soh nul nulgfvtso
0000440f1vf:ht nul nul nulM suba>X sohh
0000460F nulZj nul stx nul nul>[ sohh: nulk ack
0000500>` sohh2 nul>e sohh, nulk~lo
0000520ading nul. nulcrnl nulGeom nul
0000540Read nulspError nul nul nul nul nul
0000560; soh nul4soM dleFnl eot< nulurC nul
0000600 nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
*
0000760 nul nul nul nul stx nul nul nul nul nul nul nul7 nulspbs
0001000
[root@studentvm1 ~]#

The first sector of each will do for verification, but you should feel free to explore more of the code if you like. There are tools that we could use to compare the file with the data in GRUB stage 1.5 on the hard drive, but it is obvious that these two sectors of data are identical.

At this point we know which files contain stages 1 and 1.5 of the GRUB bootloader and where they are located on the hard drive in order to perform their function as the Linux bootloader. Because a larger amount of code can be accommodated for stage 1.5 than for stage 1, it can have enough code to contain a few common filesystem drivers, such as those for the standard EXT and XFS Linux filesystems, as well as for FAT and NTFS. The GRUB2 core.img is much more complex and capable than the older GRUB1 stage 1.5. This means that stage 2 of GRUB2 can be located on a standard EXT filesystem, but it cannot be located on a logical volume because it needs to be read from a specific location on the bootable volume before the filesystem drivers have been loaded.
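One such comparison tool is cmp, which reports whether two files (or extracted sectors) are byte-for-byte identical. A sketch using synthetic files standing in for the real device and the real core.img; the file names and data here are invented:

```shell
# Synthetic stand-ins for core.img and the copy embedded in the MBR gap.
printf 'pretend GRUB stage 1.5 code' > /tmp/core.sample
dd if=/tmp/core.sample of=/tmp/gap.sample bs=512 count=1 2>/dev/null

# cmp -s is silent and returns 0 (success) when the files are identical.
cmp -s /tmp/gap.sample /tmp/core.sample && echo identical || echo different
```

On a real system you would extract the MBR-gap sectors with dd and compare them against /boot/grub2/i386-pc/core.img in the same way.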


Note that the /boot directory must be located on a filesystem that is supported by GRUB, such as EXT4. Not all filesystems are. The function of stage 1.5 is to load the filesystem drivers necessary to locate the stage 2 files in the /boot filesystem and then to load those files.

GRUB stage 2

All of the files for GRUB stage 2 are located in the /boot/grub2 directory and its subdirectories. GRUB2 does not have an image file like stage 1 (boot.img) and stage 1.5 (core.img). Instead, it consists of files and runtime kernel modules that are loaded as needed from the /boot/grub2 directory and its subdirectories. Some Linux distributions may store these files in the /boot/grub directory.

The function of GRUB stage 2 is to locate and load a Linux kernel into RAM and turn control of the computer over to the kernel. The kernel and its associated files are located in the /boot directory. The kernel files are identifiable as they are all named starting with vmlinuz. You can list the contents of the /boot directory to see the currently installed kernels on your system.

EXPERIMENT 16-3

Your list of Linux kernels should be similar to the ones on my VM, but the kernel versions and probably the releases will be different. You should be using the most recent release of Fedora on your VM, so it should be release 29 or even higher by the time you install your VMs. That should make no difference to these experiments:


In the following listing, you can see that there are four kernels and their supporting files. The System.map files are symbol tables that map the physical addresses of symbols such as variables and functions. The initramfs files are used early in the Linux boot process, before the filesystem drivers have been loaded and the filesystems mounted.

[root@studentvm1 ~]# ll /boot
total 187716
-rw-r--r--. 1 root root   196376 Apr 23  2018 config-4.16.3-301.fc28.x86_64
-rw-r--r--. 1 root root   196172 Aug 15 08:55 config-4.17.14-202.fc28.x86_64
-rw-r--r--  1 root root   197953 Sep 19 23:02 config-4.18.9-200.fc28.x86_64
drwx------. 4 root root     4096 Apr 30  2018 efi
-rw-r--r--. 1 root root   184380 Jun 28 10:55 elf-memtest86+-5.01
drwxr-xr-x. 2 root root     4096 Apr 25  2018 extlinux
drwx------. 6 root root     4096 Sep 23 21:52 grub2
-rw-------. 1 root root 72032025 Aug 13 16:23 initramfs-0-rescue-7f12524278bd40e9b10a085bc82dc504.img
-rw-------. 1 root root 24768511 Aug 13 16:24 initramfs-4.16.3-301.fc28.x86_64.img
-rw-------. 1 root root 24251484 Aug 18 10:46 initramfs-4.17.14-202.fc28.x86_64.img
-rw-------  1 root root 24313919 Sep 23 21:52 initramfs-4.18.9-200.fc28.x86_64.img
drwxr-xr-x. 3 root root     4096 Apr 25  2018 loader
drwx------. 2 root root    16384 Aug 13 16:16 lost+found
-rw-r--r--. 1 root root   182704 Jun 28 10:55 memtest86+-5.01
-rw-------. 1 root root  3888620 Apr 23  2018 System.map-4.16.3-301.fc28.x86_64
-rw-------. 1 root root  4105662 Aug 15 08:55 System.map-4.17.14-202.fc28.x86_64
-rw-------  1 root root  4102469 Sep 19 23:02 System.map-4.18.9-200.fc28.x86_64
-rwxr-xr-x. 1 root root  8286392 Aug 13 16:23 vmlinuz-0-rescue-7f12524278bd40e9b10a085bc82dc504
-rwxr-xr-x. 1 root root  8286392 Apr 23  2018 vmlinuz-4.16.3-301.fc28.x86_64
-rwxr-xr-x. 1 root root  8552728 Aug 15 08:56 vmlinuz-4.17.14-202.fc28.x86_64
-rwxr-xr-x  1 root root  8605976 Sep 19 23:03 vmlinuz-4.18.9-200.fc28.x86_64
[root@studentvm1 ~]#


GRUB supports booting from one of a selection of installed Linux kernels. The DNF package manager supports keeping multiple versions of the kernel so that if a problem occurs with the newest one, an older version of the kernel can be booted. As shown in Figure 16-1, GRUB provides a pre-boot menu of the installed kernels, including a rescue option and, if configured, a recovery option for each kernel.

Figure 16-1.  The GRUB boot menu allows selection of a different kernel

The default kernel is always the most recently installed one, and it will boot automatically after a short timeout of five seconds. If the up or down arrow key is pressed, the countdown stops, and the highlight bar moves to another kernel. Press Enter to boot the selected kernel. If almost any key other than the up and down arrow keys or the "e" or "c" keys is pressed, the countdown stops and GRUB waits for more input. You can then take your time to use the arrow keys to select a kernel to boot and press the Enter key to boot from it. Stage 2 of GRUB loads the selected kernel into memory and turns control of the computer over to the kernel. The rescue boot option is intended as a last resort when attempting to resolve severe boot problems, ones that prevent the Linux system from completing the boot process. When some types of errors occur during boot, GRUB will automatically fall back to booting from the rescue image.


The GRUB menu entries for installed kernels have been useful to me. Before I became aware of VirtualBox, I used some commercial virtualization software that sometimes experienced problems when the Linux kernel was updated. Although the company tried to keep up with kernel variations, it eventually stopped updating its software to run with every kernel version. Whenever it did not support a kernel version to which I had updated, I used the GRUB menu to select an older kernel that I knew would work. I did discover that maintaining only three older kernels was not always enough, so I configured the DNF package manager to save up to ten kernels. DNF package manager configuration is covered in Volume 1, Chapter 12.

Configuring GRUB

GRUB is configured with /boot/grub2/grub.cfg, but we do not change that file directly because it gets overwritten when the kernel is updated to a new version. Instead, we make modifications to the /etc/default/grub file.

EXPERIMENT 16-4

Let's start by looking at the unmodified version of the /etc/default/grub file:

[root@studentvm1 ~]# cd /etc/default ; cat grub
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="resume=/dev/mapper/fedora_studentvm1-swap rd.lvm.lv=fedora_studentvm1/root rd.lvm.lv=fedora_studentvm1/swap rd.lvm.lv=fedora_studentvm1/usr rhgb quiet"
GRUB_DISABLE_RECOVERY="true"
[root@studentvm1 default]#

Chapter 6 of the GRUB documentation referenced in footnote 6 contains a complete listing of all the possible entries in the /etc/default/grub file, but there are three that we should look at here. I always change GRUB_TIMEOUT, the number of seconds for the GRUB menu countdown, from five to ten, which gives a bit more time to respond to the GRUB menu before the countdown hits zero.


I also change GRUB_DISABLE_RECOVERY from "true" to "false", which is a bit of reverse programmer logic. I have found that the rescue boot option does not always work. To circumvent this problem, I change this statement to allow the grub2-mkconfig command to generate a recovery option for each installed kernel; I have found that when the rescue option fails, these recovery options do work. This also provides recovery kernels for use in case a particular tool or software package needs to run on a specific kernel version.

Note  Changing GRUB_DISABLE_RECOVERY in the GRUB default configuration no longer works starting in Fedora 30. The other changes, GRUB_TIMEOUT and removing "rhgb quiet" from the GRUB_CMDLINE_LINUX variable, still work.

The GRUB_CMDLINE_LINUX line can be changed, too. This line lists the command-line parameters that are passed to the kernel at boot time. I usually delete the last two parameters on this line. The rhgb parameter stands for Red Hat Graphical Boot, and it causes the little graphical animation of the Fedora icon to display during kernel initialization instead of showing boot-time messages. The quiet parameter prevents the display of the startup messages that document the progress of the startup and any errors that might occur. Delete both of these entries because SysAdmins need to be able to see these messages. If something goes wrong during boot, the messages displayed on the screen can point us to the cause of the problem. Change these three lines as described so that your grub file looks like this:

[root@studentvm1 default]# cat grub
GRUB_TIMEOUT=10
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="resume=/dev/mapper/fedora_studentvm1-swap rd.lvm.lv=fedora_studentvm1/root rd.lvm.lv=fedora_studentvm1/swap rd.lvm.lv=fedora_studentvm1/usr"
GRUB_DISABLE_RECOVERY="false"
[root@studentvm1 default]#
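For those who prefer to script such edits, the same three changes can be made non-interactively with sed. This is a sketch, not from the book; it operates on an abbreviated sample file in /tmp so nothing on the real system is changed. On a real host you would edit /etc/default/grub itself as root and then regenerate grub.cfg:

```shell
# Create an abbreviated, hypothetical sample of /etc/default/grub to work on.
cat > /tmp/grub.sample <<'EOF'
GRUB_TIMEOUT=5
GRUB_CMDLINE_LINUX="resume=/dev/mapper/fedora_studentvm1-swap rhgb quiet"
GRUB_DISABLE_RECOVERY="true"
EOF

# Apply the three edits: longer timeout, verbose boot messages,
# and recovery menu entries enabled.
sed -i -e 's/^GRUB_TIMEOUT=.*/GRUB_TIMEOUT=10/' \
       -e 's/ rhgb quiet//' \
       -e 's/^GRUB_DISABLE_RECOVERY=.*/GRUB_DISABLE_RECOVERY="false"/' \
       /tmp/grub.sample

cat /tmp/grub.sample
```

Working on a copy first lets you diff the result against the original before committing the change.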

Check the current content of the /boot/grub2/grub.cfg file. Run the following command to update the /boot/grub2/grub.cfg configuration file:



[root@studentvm1 grub2]# grub2-mkconfig > /boot/grub2/grub.cfg
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-4.18.9-200.fc28.x86_64
Found initrd image: /boot/initramfs-4.18.9-200.fc28.x86_64.img
Found linux image: /boot/vmlinuz-4.17.14-202.fc28.x86_64
Found initrd image: /boot/initramfs-4.17.14-202.fc28.x86_64.img
Found linux image: /boot/vmlinuz-4.16.3-301.fc28.x86_64
Found initrd image: /boot/initramfs-4.16.3-301.fc28.x86_64.img
Found linux image: /boot/vmlinuz-0-rescue-7f12524278bd40e9b10a085bc82dc504
Found initrd image: /boot/initramfs-0-rescue-7f12524278bd40e9b10a085bc82dc504.img
done
[root@studentvm1 grub2]#

Recheck the content of /boot/grub2/grub.cfg, which should reflect the changes we made. You can grep for the specific lines we changed to verify that the changes occurred. We could also use an alternative form of this command that specifies the output file:

grub2-mkconfig -o /boot/grub2/grub.cfg

Either form works, and the results are the same. Reboot the StudentVM1 virtual machine. Press the Esc key when the GRUB menu is displayed. The first difference you should notice in the GRUB menu is that the countdown timer starts at ten seconds. The GRUB menu should now appear similar to that shown in Figure 16-2, with a recovery option for each kernel version. The details of your menu will be different from these.
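One way to do that grep verification is sketched below. The grub.cfg fragment here is a made-up, heavily abbreviated sample so the commands stand alone; on the VM you would grep /boot/grub2/grub.cfg itself:

```shell
# Hypothetical, abbreviated grub.cfg fragment for demonstration only.
cat > /tmp/grub.cfg.sample <<'EOF'
set timeout=10
menuentry 'Fedora (4.18.9-200.fc28.x86_64) 28 (Twenty Eight)' ...
menuentry 'Fedora (4.18.9-200.fc28.x86_64) 28 (Twenty Eight) (recovery mode)' ...
EOF

grep 'timeout' /tmp/grub.cfg.sample          # shows the new ten-second countdown
grep -c 'recovery mode' /tmp/grub.cfg.sample # counts the recovery menu entries
```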

Figure 16-2.  After changing /etc/default/grub and running grub2-mkconfig, the GRUB menu now contains a recovery mode option for each kernel


Use the down arrow key to highlight the recovery option for the default kernel, the second option, and press the Enter key to complete the boot and startup process. This will take you into recovery mode using that kernel. You will also notice many messages displayed on the screen as the system boots and goes through startup. Some of these messages can be seen in Figure 16-3, along with messages pertaining to the rescue shell. Based on these messages, we can conclude that "recovery" mode is a rescue mode in which we get to choose the kernel version. The system displays a login message:

Give root password for maintenance
(or press Control-D to continue):

Type the root password to log in. There are also instructions on the screen in case you want to reboot or continue into the default runlevel target. Notice also at the bottom of the screen in Figure 16-3 that the little trail of messages we will embed in the bash startup configuration files in Chapter 17 shows here that the /etc/bashrc and /etc/profile.d/myBashConfig.sh files, along with all of the other bash configuration files in /etc/profile.d, were run at login. I have skipped ahead a bit with this, but I will show you how to test it yourself in Chapter 17. This is good information to have because you will know what to expect in the way of shell configuration while working in recovery mode. While in recovery mode, explore the system while it is in the equivalent of what used to be called single user mode. The lsblk utility will show that all of the filesystems are mounted in their correct locations, and the ip addr command will show that networking has not been started. The computer is up and running, but it is in a very minimal mode of operation. Only the most essential services are available to enable problem solving. The runlevel command will show that the host is in the equivalent of the old SystemV runlevel 1.



Figure 16-3.  After booting to a recovery mode kernel, you use the root password to enter maintenance mode

Before completing this experiment, reboot your VM to one of the older regular kernels, and log in to the desktop. Test a few programs, and then open a terminal session to test some command-line utilities. Everything should work without a problem because the kernel version is not bound to specific versions of the rest of the Linux operating system. Running an alternate kernel is easy and commonplace. To end this experiment, reboot the system and allow the default kernel to boot. No intervention will be required. You will see all of the kernel boot and startup messages during this normal boot. There are three different terms that are typically applied to recovery mode: recovery, rescue, and maintenance. These are all functionally the same. Maintenance mode is typically used when the Linux host fails to boot to its default target due to some error that


occurs during the boot and startup. Being able to see the boot and startup messages when an error occurs can also provide clues as to where the problem might exist. I have found that the rescue kernel, the option at the bottom of the GRUB menu in Figures 16-1, 16-2, and 16-3, almost never works. I have tried it on a variety of physical hardware and virtual machines, and it always fails, so I use the recovery kernels instead, and that is why I configure GRUB to create those recovery menu options. In Figure 16-2, after configuring GRUB and running the grub2-mkconfig -o /boot/grub2/grub.cfg command, there are two rescue mode menu options. In my testing I have discovered that the top rescue mode menu option fails but that the bottom one, the one we just created, does work. But it really does not seem to matter because, as I have said, both rescue and recovery modes provide exactly the same function. This problem is a bug, probably in GRUB, so I reported it to Red Hat using Bugzilla.9 Part of our responsibility as SysAdmins, and part of giving back to the open source community, is to report bugs when we encounter them. Anyone can create an account and log in to report bugs. Updates will be sent to you by e-mail whenever a change is made to the bug report.

The Linux kernel

All Linux kernels are in a self-extracting, compressed format to save space. The kernels are located in the /boot directory, along with an initial RAM disk image and symbol maps. After the selected kernel is loaded into memory by GRUB and begins executing, it must first extract itself from the compressed file before it can perform any useful work. Once the kernel has extracted itself, it loads systemd and turns control over to it. This is the end of the boot process. At this point, the Linux kernel and systemd are running but unable to perform any productive tasks for the end user because nothing else is running: there is no shell to provide a command line, no background processes to manage the network or other communication links, and nothing that enables the computer to perform any productive function.

9 Red Hat Bugzilla, https://bugzilla.redhat.com



Linux startup

The startup process follows the boot process and brings the Linux computer up to an operational state in which it is usable for productive work. The startup process begins when the kernel transfers control of the host to systemd.

systemd

systemd10,11 is the mother of all processes, and it is responsible for bringing the Linux host up to a state in which productive work can be done. Some of its functions, which are far more extensive than those of the old SystemV12 init program, include managing many aspects of a running Linux host, such as mounting filesystems and starting and managing the system services required for a productive Linux host. Any of systemd's tasks that are not related to the startup sequence are outside the scope of this chapter, but we will explore them in Volume 2, Chapter 13.

First, systemd mounts the filesystems as defined by /etc/fstab, including any swap files or partitions. At this point, it can access the configuration files located in /etc, including its own. It uses its configuration link, /etc/systemd/system/default.target, to determine the state, or target, into which it should boot the host. The default.target file is a symbolic link to the true target file. For a desktop workstation, this is typically going to be graphical.target, which is equivalent to runlevel 5 in SystemV. For a server, the default is more likely to be multi-user.target, which is like runlevel 3 in SystemV. The emergency.target is similar to single user mode. Targets and services are systemd units.

Figure 16-4 is a comparison of the systemd targets with the old SystemV startup runlevels. The systemd target aliases are provided by systemd for backward compatibility. The target aliases allow scripts, and many SysAdmins like myself, to use SystemV commands like init 3 to change runlevels. Of course, the SystemV commands are forwarded to systemd for interpretation and execution.
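The way default.target selects the boot target is plain symlink mechanics, which can be demonstrated safely in a scratch directory. The paths in this sketch are hypothetical stand-ins under /tmp rather than the real /etc/systemd/system:

```shell
# Build a miniature stand-in for the systemd unit directory.
mkdir -p /tmp/sysd-demo
printf '[Unit]\nDescription=Graphical Interface\n' > /tmp/sysd-demo/graphical.target

# default.target is just a symbolic link to the real target unit file.
ln -sfn graphical.target /tmp/sysd-demo/default.target

readlink /tmp/sysd-demo/default.target   # reports which target is the default
cat /tmp/sysd-demo/default.target        # reading the link follows it to the unit file
```

On a real Fedora host, the systemctl get-default command reports the same information by resolving this link.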

10 Wikipedia, systemd, https://en.wikipedia.org/wiki/Systemd
11 Yes, systemd should always be spelled like this, without any uppercase, even at the beginning of a sentence. The documentation for systemd is very clear about this.
12 Wikipedia, Runlevel, https://en.wikipedia.org/wiki/Runlevel


SystemV runlevel   systemd target      systemd target aliases   Description

                   halt.target                                  Halts the system without powering it down.
0                  poweroff.target     runlevel0.target         Halts the system and turns the power off.
S                  emergency.target                             Single user mode. No services are running; filesystems are not mounted. This is the most basic level of operation with only an emergency shell running on the main console for the user to interact with the system.
1                  rescue.target       runlevel1.target         A basic system including mounting the filesystems with only the most basic services running and a rescue shell on the main console.
2                                      runlevel2.target         Multiuser, without NFS but all other non-GUI services running.
3                  multi-user.target   runlevel3.target         All services running but command-line interface (CLI) only.
4                                      runlevel4.target         Unused but identical to multi-user.target. This target could be customized to start local services without changing the default multi-user.target.
5                  graphical.target    runlevel5.target         multi-user.target with a GUI.
6                  reboot.target       runlevel6.target         Reboot.
                   default.target                               This target is always aliased with a symbolic link to either multi-user.target or graphical.target. systemd always uses the default.target to start the system. The default.target should never be aliased to halt.target, poweroff.target, or reboot.target.

Figure 16-4.  Comparison of SystemV runlevels with systemd targets and some target aliases

Each target has a set of dependencies described in its configuration file. systemd starts the required dependencies. These dependencies are the services required to run the Linux host at a specific level of functionality. When all of the dependencies listed in the target configuration files are loaded and running, the system is running at that target level.
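As a concrete illustration of such a dependency description, a minimal custom target unit might look like the following sketch. The unit name and the wanted services here are hypothetical, not from the book:

```
[Unit]
Description=My Custom Runtime Level
Requires=basic.target
Wants=sshd.service chronyd.service
After=basic.target
AllowIsolate=yes
```

Units listed in Requires must be activated successfully for this target to be reached, while units listed in Wants are started but may fail without blocking the target.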


systemd also looks at the legacy SystemV init directories to see if any startup files exist there. If so, systemd uses those files as configuration to start the services they describe. The deprecated network service is a good example of a service that still uses SystemV startup files in Fedora.

Figure 16-5 is copied directly from the bootup man page. It shows a map of the general sequence of events during systemd startup and the basic ordering requirements to ensure a successful startup. The sysinit.target and basic.target targets can be considered checkpoints in the startup process. Although one of systemd's design goals is to start system services in parallel, certain services and functional targets must still be started before other services and targets can be started. These checkpoints cannot be passed until all of the services and targets required by a checkpoint have been fulfilled. The sysinit.target is reached when all of the units on which it depends are completed. All of those tasks, mounting filesystems, setting up swap files, starting udev, setting the random generator seed, initiating low-level services, and setting up cryptographic services if one or more filesystems are encrypted, must be completed, but within the sysinit.target those tasks can be performed in parallel. The sysinit.target starts up all of the low-level services and units required for the system to be marginally functional and to enable moving on to the basic.target. After the sysinit.target is fulfilled, systemd next starts the basic.target, starting all of the units required to fulfill it. The basic target provides some additional functionality by starting units that are required for all of the next targets. These include setting up things like paths to various executable directories, communication sockets, and timers. Finally, a user-level target, multi-user.target or graphical.target, can be initialized.
The multi-user.target must be reached before the graphical target dependencies can be met. The underlined targets in Figure 16-5 are the usual startup targets. When one of these targets is reached, startup has completed. If the multi-user.target is the default, you should see a text mode login on the console. If graphical.target is the default, you should see a graphical login; the specific GUI login screen you see will depend upon the default display manager.



Figure 16-5.  The systemd startup map


The bootup man page also describes and provides maps of the boot into the initial RAM disk and the systemd shutdown process.

EXPERIMENT 16-5

So far we have only booted to the graphical.target, so let's change the default target to multi-user.target to boot into a console interface rather than a GUI interface. As the root user on StudentVM1, change to the directory in which the systemd configuration is maintained, and do a quick listing:

[root@studentvm1 ~]# cd /etc/systemd/system/ ; ll
drwxr-xr-x. 2 root root 4096 Apr 25  2018 basic.target.wants
<snip>
lrwxrwxrwx. 1 root root   36 Aug 13 16:23 default.target -> /lib/systemd/system/graphical.target
lrwxrwxrwx. 1 root root   39 Apr 25  2018 display-manager.service -> /usr/lib/systemd/system/lightdm.service
drwxr-xr-x. 2 root root 4096 Apr 25  2018 getty.target.wants
drwxr-xr-x. 2 root root 4096 Aug 18 10:16 graphical.target.wants
drwxr-xr-x. 2 root root 4096 Apr 25  2018 local-fs.target.wants
drwxr-xr-x. 2 root root 4096 Oct 30 16:54 multi-user.target.wants
<snip>
[root@studentvm1 system]#

I have shortened this listing to highlight a few important things that will help us understand how systemd manages the boot process. You should be able to see the entire list of directories and links on your VM. The default.target entry is a symbolic link13 (symlink, soft link) to the file /lib/systemd/system/graphical.target. List that directory to see what else is there:

[root@studentvm1 system]# ll /lib/systemd/system/ | less

13 Hard and soft links are covered in detail in Chapter 18 in this volume. A symlink is the same as a soft link.



You should see files, directories, and more links in this listing, but look for multi-user.target and graphical.target. Now display the contents of default.target, which is a link to /lib/systemd/system/graphical.target:

[root@studentvm1 system]# cat default.target
#  SPDX-License-Identifier: LGPL-2.1+
#
#  This file is part of systemd.
#
#  systemd is free software; you can redistribute it and/or modify it
#  under the terms of the GNU Lesser General Public License as published by
#  the Free Software Foundation; either version 2.1 of the License, or
#  (at your option) any later version.

[Unit]
Description=Graphical Interface
Documentation=man:systemd.special(7)
Requires=multi-user.target
Wants=display-manager.service
Conflicts=rescue.service rescue.target
After=multi-user.target rescue.service rescue.target display-manager.service
AllowIsolate=yes
[root@studentvm1 system]#

This link to the graphical.target file describes all of the prerequisites and needs of the graphical user interface. To enable the host to boot to multi-user mode, we need to delete the existing link and then create a new one that points to the correct target. Make /etc/systemd/system the PWD if it is not already:

[root@studentvm1 system]# rm -f default.target
[root@studentvm1 system]# ln -s /lib/systemd/system/multi-user.target default.target

List the default.target link to verify that it links to the correct file:

[root@studentvm1 system]# ll default.target
lrwxrwxrwx 1 root root 37 Nov 28 16:08 default.target -> /lib/systemd/system/multi-user.target
[root@studentvm1 system]#



If your link does not look exactly like that, delete it and try again. List the content of the default.target link:

[root@studentvm1 system]# cat default.target
#  SPDX-License-Identifier: LGPL-2.1+
#
#  This file is part of systemd.
#
#  systemd is free software; you can redistribute it and/or modify it
#  under the terms of the GNU Lesser General Public License as published by
#  the Free Software Foundation; either version 2.1 of the License, or
#  (at your option) any later version.

[Unit]
Description=Multi-User System
Documentation=man:systemd.special(7)
Requires=basic.target
Conflicts=rescue.service rescue.target
After=basic.target rescue.service rescue.target
AllowIsolate=yes
[root@studentvm1 system]#

The default.target has different requirements in the [Unit] section. It does not require the graphical display manager. Reboot. Your VM should boot to the console login for virtual console 1, which is identified on the display as tty1. Now that you know what is necessary to change the default target, change it back to the graphical.target using a command designed for the purpose. Let's first check the current default target:

[root@studentvm1 ~]# systemctl get-default
multi-user.target
[root@studentvm1 ~]# systemctl set-default graphical.target
Removed /etc/systemd/system/default.target.
Created symlink /etc/systemd/system/default.target → /usr/lib/systemd/system/graphical.target.
[root@studentvm1 ~]#



Type the following command to go directly to the display manager login page without having to reboot: [root@studentvm1 system]# systemctl isolate default.target

I am unsure why the term "isolate" was chosen for this subcommand by the developers of systemd. However, the effect is to switch from one run target to another, in this case from the multi-user target to the graphical target. The preceding command is equivalent to the old init 5 command in the days of SystemV start scripts and the init program. Log in to the GUI desktop. We will explore systemd in more detail in Chapter 13 of Volume 2. GRUB and the systemd init system are key components in the boot and startup phases of most modern Linux distributions. These two components work together smoothly to first load the kernel and then to start up all of the system services required to produce a functional GNU/Linux system. Although I do find both GRUB and systemd more complex than their predecessors, they are also just as easy to learn and manage. The man pages have a great deal of information about systemd, and freedesktop.org has a web site that describes the complete startup process14 and a complete set of systemd man pages15 online.

Graphical login screen

There are still two components that figure into the very end of the boot and startup process for the graphical.target: the display manager (dm) and the window manager (wm). These two programs, regardless of which ones you use on your Linux GUI desktop system, work closely together to make your GUI login experience smooth and seamless before you even get to your desktop.

14 Freedesktop.org, systemd bootup process, www.freedesktop.org/software/systemd/man/bootup.html
15 Freedesktop.org, systemd index of man pages, www.freedesktop.org/software/systemd/man/index.html



Display manager

The display manager16 is a program with the sole function of providing the GUI login screen for your Linux desktop. After you log in to a GUI desktop, the display manager turns control over to the window manager. When you log out of the desktop, the display manager is given control again to display the login screen and wait for another login. There are several display managers; some are provided with their respective desktops. For example, the kdm display manager is provided with the KDE desktop. Many display managers are not directly associated with a specific desktop. Any of the display managers can be used for your login screen regardless of which desktop you are using. And not all desktops have their own display managers. Such is the flexibility of Linux and well-written, modular code.

The typical desktops and display managers are shown in Figure 16-6. The display manager for the first desktop that is installed, that is, GNOME, KDE, etc., becomes the default one. For Fedora, this is usually gdm, which is the display manager for GNOME. If GNOME is not installed, then the display manager for the installed desktop is the default. If the desktop selected during installation does not have a default display manager, then gdm is installed and used. If you use KDE as your desktop, the new SDDM17 will be the default display manager.

16 Wikipedia, X Display Manager, https://en.wikipedia.org/wiki/X_display_manager_(program_type)
17 Wikipedia, Simple Desktop Display Manager, https://en.wikipedia.org/wiki/Simple_Desktop_Display_Manager



Desktop   Display Manager   Comments
GNOME     gdm               GNOME Display Manager
KDE       kdm               KDE Display Manager (up through Fedora 20)
          lightdm           Lightweight Display Manager
LXDE      lxdm              LXDE Display Manager
KDE       sddm              Simple Desktop Display Manager (Fedora 21 and above)
          xdm               Default X Window System Display Manager
Figure 16-6.  A short list of display managers

Regardless of which display manager is configured as the default at installation time, later installation of additional desktops does not automatically change the display manager used. If you want to change the display manager, you must do it yourself from the command line. Any display manager can be used, regardless of which window manager and desktop are used.

Window manager

The function of a window manager18 is to manage the creation, movement, and destruction of windows on a GUI desktop, including the GUI login screen. The window manager works with the X Window System19 or the newer Wayland20 to perform these tasks. The X Window System provides all of the graphical primitives and functions to generate the graphics for a Linux or Unix graphical user interface. The window manager also controls the appearance of the windows it generates. This includes the functional decorative aspects of the windows, such as the look of buttons, sliders, window frames, pop-up menus, and more. As with almost every other component of Linux, there are many different window managers from which to choose. The list in Figure 16-7 represents only a sample of the available window managers. Some of these window managers are stand-alone, that is,

18 Wikipedia, X Window Manager, https://en.wikipedia.org/wiki/X_window_manager
19 Wikipedia, X Window System, https://en.wikipedia.org/wiki/X_Window_System
20 Wikipedia, Wayland, https://en.wikipedia.org/wiki/Wayland_(display_server_protocol)



they are not associated with a desktop and can be used to provide a simple graphical user interface without the more complex, feature-rich, and more resource-intensive overhead of a full desktop environment. Stand-alone window managers should not be used with any of the desktop environments.

Desktop   Window Manager   Comments
Unity     Compiz
          Fluxbox
          FVWM
          IceWM
KDE       Kwin             Starting with KDE Plasma 4 in 2008
GNOME     Metacity         Default for GNOME 2
GNOME     Mutter           Default starting with GNOME 3
LXDE      Openbox
          twm              A very old and simple window manager. Some distros like Fedora use it as a fallback in case no other window manager or desktop is available.
Xfce      xfwm4
Figure 16-7.  A short list of window managers

Most window managers are not directly associated with any specific desktop. In fact, some window managers can be used without any type of desktop software, such as KDE or GNOME, to provide a very minimalist GUI experience for users. Many desktop environments support the use of more than one window manager.



How do I deal with all these choices?

In most modern distributions, the choices are made for you at installation time, based on your selection of desktops and the preferences of the packagers of your distribution. The desktop and window managers and the display manager can be easily changed, however. Now that systemd has become the standard startup system in many distributions, you can set the preferred display manager in /etc/systemd/system, which is where the basic system startup configuration is located. There is a symbolic link (symlink) named display-manager.service that points to one of the display manager service units in /usr/lib/systemd/system. Each installed display manager has a service unit located there. To change the active display manager, remove the existing display-manager.service link, and replace it with the one you want to use.
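Replacing such a symlink can be sketched safely in a scratch directory first. The paths below are hypothetical stand-ins for /etc/systemd/system and /usr/lib/systemd/system; nothing on the real system is touched:

```shell
# Miniature stand-ins for the two systemd unit directories.
mkdir -p /tmp/dm-demo/etc /tmp/dm-demo/lib
touch /tmp/dm-demo/lib/lightdm.service /tmp/dm-demo/lib/sddm.service

# The active dm is whatever display-manager.service points at.
ln -sfn /tmp/dm-demo/lib/lightdm.service /tmp/dm-demo/etc/display-manager.service

# Switching display managers means re-pointing the link; -f replaces
# the existing link in a single step.
ln -sfn /tmp/dm-demo/lib/sddm.service /tmp/dm-demo/etc/display-manager.service
readlink /tmp/dm-demo/etc/display-manager.service
```

On a real host you would then restart the display manager service, as Experiment 16-6 below does.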

EXPERIMENT 16-6

Perform this experiment as root. We will install additional display managers and stand-alone window managers, then switch between them. Check to see which display managers are already installed. The RPMs in which the window managers are packaged have inconsistent naming, so it is difficult to locate them using a simple DNF search unless you already know their RPM package names which, after a bit of research, I do:

[root@studentvm1 ~]# dnf list compiz fluxbox fvwm icewm xorg-x11-twm xfwm4
Last metadata expiration check: 1:00:54 ago on Thu 29 Nov 2018 11:31:21 AM EST.
Installed Packages
xfwm4.x86_64            4.12.5-1.fc28        @updates
Available Packages
compiz.i686             1:0.8.14-5.fc28      fedora
compiz.x86_64           1:0.8.14-5.fc28      fedora
fluxbox.x86_64          1.3.7-4.fc28         fedora
fvwm.x86_64             2.6.8-1.fc28         updates
icewm.x86_64            1.3.8-15.fc28        fedora
xorg-x11-twm.x86_64     1:1.0.9-7.fc28       fedora
[root@studentvm1 ~]#



Now let's look for the display managers:

[root@studentvm1 ~]# dnf list gdm kdm lightdm lxdm sddm xfdm xorg-x11-xdm
Last metadata expiration check: 2:15:20 ago on Thu 29 Nov 2018 11:31:21 AM EST.
Installed Packages
lightdm.x86_64          1.28.0-1.fc28                          @updates
Available Packages
gdm.i686                1:3.28.4-1.fc28                        updates
gdm.x86_64              1:3.28.4-1.fc28                        updates
kdm.x86_64              1:4.11.22-22.fc28                      fedora
lightdm.i686            1.28.0-2.fc28                          updates
lightdm.x86_64          1.28.0-2.fc28                          updates
lxdm.x86_64             0.5.3-10.D20161111gita548c73e.fc28     fedora
sddm.i686               0.17.0-3.fc28                          updates
sddm.x86_64             0.17.0-3.fc28                          updates
xorg-x11-xdm.x86_64     1:1.1.11-16.fc28                       fedora
[root@studentvm1 ~]#

Each dm is started as a systemd service, so another way to determine which ones are installed is to check the /usr/lib/systemd/system/ directory. The lightdm display manager shows up as both installed and available because there is an update for it:

[root@studentvm1 ~]# cd /usr/lib/systemd/system/ ; ll *dm.service
-rw-r--r-- 1 root root 1059 Sep  1 11:38 lightdm.service
[root@studentvm1 system]#

Like my VM, yours should have only a single dm, the lightdm. Let’s install lxdm and xorg-x11-xdm as the additional display managers, with FVWM, fluxbox, and icewm for window managers:

[root@studentvm1 ~]# dnf install -y lxdm xorg-x11-xdm compiz fvwm fluxbox icewm

Now we must restart the display manager service so that the newly installed window managers appear in the display manager selection tool. The simplest way is to log out of the desktop and restart the dm from a virtual console session:

[root@studentvm1 ~]# systemctl restart display-manager.service


Or we could do this by switching to the multiuser target and then back to the graphical target. Do this, too, just to see what switching between these targets looks like:

[root@studentvm1 ~]# systemctl isolate multi-user.target
[root@studentvm1 ~]# systemctl isolate graphical.target

But this second method is a lot more typing. Switch back to the lightdm login on vc1, and look in the upper right corner of the lightdm login screen. The leftmost icon, which on my VM looks like a sheet of paper with a wrench,21 allows us to choose the desktop or window manager we want to use before we log in. Click this icon and choose FVWM from the menu in Figure 16-8, then log in.

Figure 16-8.  The lightdm display manager menu now shows the newly installed window managers

Explore this window manager. Open an Xterm instance, and locate the menu option that gives access to application programs. Figure 16-9 shows the Fvwm desktop (this is not a desktop environment like KDE or GNOME) with an open Xterm instance and a menu tree that is opened with a left-click on the display. A different menu is opened with a right-click. Fvwm is a very basic but usable window manager. Like most window managers, it provides menus to access various functions and a graphical display that supports simple windowing functionality. Fvwm also provides multiple windows in which to run programs for some task management capabilities. Notice that the XDGMenu in Figure 16-9 also contains Xfce applications. The Start Here menu item leads to the Fvwm menus that include all of the standard Linux applications that are installed on the host.

21 The icon on your version of lightdm might be different. It changed for me at least once after installing updates.


Figure 16-9.  The Fvwm window manager with an Xterm instance and some of the available menus

After spending a bit of time exploring the Fvwm interface, log out. Can’t find the way to do that? Neither could I, as it is very nonintuitive. Left-click the desktop and open the FvwmConsole. Then type in the command Quit (yes, with the uppercase Q) and press Enter. We could also open an Xterm session and use the following command, which kills all instances of the Fvwm window manager belonging to the student user:

[student@studentvm1 ~]$ killall fvwm

Try each of the other window managers, exploring the basic functions of launching applications and a terminal session. When you have finished that, exit whichever window manager you are in, and log in again using the Xfce desktop environment.


Now let’s change the display manager to one of the new ones we have installed. Each dm has the same function: to provide a GUI for login and some configuration such as the desktop environment or window manager to start as the user interface. Change into the /etc/systemd/system/ directory, and list the link for the display manager service:

[root@studentvm1 ~]# cd /etc/systemd/system/ ; ll display-manager.service
total 60
lrwxrwxrwx. 1 root root 39 Apr 25  2018 display-manager.service -> /usr/lib/systemd/system/lightdm.service

Locate all of the display manager services in /usr/lib/systemd/system/:

[root@studentvm1 system]# ll /usr/lib/systemd/system/*dm.service
-rw-r--r-- 1 root root 1059 Sep 26 11:04 /usr/lib/systemd/system/lightdm.service
-rw-r--r-- 1 root root  384 Feb 14  2018 /usr/lib/systemd/system/lxdm.service
-rw-r--r-- 1 root root  287 Feb 10  2018 /usr/lib/systemd/system/xdm.service

And make the change:

[root@studentvm1 system]# rm -f display-manager.service
[root@studentvm1 system]# ln -s /usr/lib/systemd/system/xdm.service display-manager.service
[root@studentvm1 system]# ll display-manager.service
lrwxrwxrwx 1 root root 35 Nov 30 09:03 display-manager.service -> /usr/lib/systemd/system/xdm.service
[root@studentvm1 system]#

As far as I can tell at this point, rebooting the host is the only way to reliably activate the new dm. Go ahead and reboot your VM now to do that. There is a tool, system-switch-displaymanager, which is supposed to make the necessary changes, and it does seem to work sometimes. But this tool does not restart the dm, and many times that step fails when performed. Unfortunately, my own experiments have determined that restarting the display manager service does not activate the new dm. The following steps are supposed to work; try them to see if they work for you as you switch back to the lightdm display manager:

[root@studentvm1 ~]# dnf -y install system-switch-displaymanager
[root@studentvm1 ~]# system-switch-displaymanager lightdm
[root@studentvm1 ~]# systemctl restart display-manager.service


If the second two steps in this sequence do not work, then reboot. Jason Baker, my technical reviewer, says, “This seemed to work for me, but then it failed to actually log in to lightdm, so I had to reboot.” Different distributions and desktops have various means of changing the window manager, but, in general, changing the desktop environment also changes the window manager to the default one for that desktop. For current releases of Fedora Linux, the desktop environment can be changed on the display manager login screen. If stand-alone window managers are also installed, they also appear in the list with the desktop environments. There are many different choices for display and window managers available. When you install most modern distributions with any kind of desktop, the choices of which ones to install and activate are usually made by the installation program. For most users, there should never be any need to change these choices. For others who have different needs, or for those who are simply more adventurous, there are many options and combinations from which to choose. With a little research, you can make some interesting changes.

About the login

After a Linux host is turned on, it boots and goes through the startup process. When the startup process is completed, we are presented with a graphical or command-line login screen. Without a login prompt, it is impossible to log in to a Linux host. How the login prompt is displayed and how a new one is displayed after a user logs out form the final stage of understanding the Linux startup.

CLI login screen

The CLI login screen is initiated by a program called a getty, which stands for GET TTY. The historical function of a getty was to wait for a connection from a remote dumb terminal to come in on a serial communications line. The getty program would spawn the login screen and wait for a login to occur. When the remote user would log in, the getty would terminate, and the default shell for the user account would launch and allow the user to interact with the host on the command line. When the user would log out, the init program would spawn a new getty to listen for the next connection.


Today’s process is much the same with a few updates. We now use an agetty, which is an advanced form of getty, in combination with the systemd service manager, to handle the Linux virtual consoles as well as the increasingly rare incoming modem lines. The steps listed in the following show the sequence of events in a modern Linux computer:

1. systemd starts the systemd-getty-generator daemon.
2. The systemd-getty-generator spawns an agetty on each of the virtual consoles using the getty@.service unit.
3. The agettys wait for a virtual console connection, which is the user switching to one of the VCs.
4. The agetty presents the text mode login screen on the display.
5. The user logs in.
6. The shell specified in /etc/passwd is started.
7. Shell configuration scripts run.
8. The user works in the shell session.
9. The user logs off.
10. The systemd-getty-generator spawns an agetty on the logged-out virtual console.
11. Go to step 3.

Starting with step 3, this is a circular process that repeats as long as the host is up and running. New login screens are displayed on a virtual console immediately after the user logs out of the old session.

GUI login screen

The GUI login screen as displayed by the display manager is handled in much the same way as the systemd-getty-generator handles the text mode login:

1. The specified display manager (dm) is launched by systemd at the end of the startup sequence.
2. The display manager displays the graphical login screen, usually on virtual console 1.


3. The dm waits for a login.
4. The user logs in.
5. The specified window manager is started.
6. The specified desktop GUI, if any, is started.
7. The user performs work in the window manager/desktop.
8. The user logs out.
9. systemd respawns the display manager.
10. Go to step 2.

The steps are almost the same, and the display manager functions as a graphical version of the agetty.

Chapter summary We have explored the Linux boot and startup processes in some detail. This chapter explored reconfiguration of the GRUB bootloader to display the kernel boot and startup messages as well as to create recovery mode entries, ones that actually work, for the GRUB menu. Because there is a bug when attempting to boot to the rescue mode kernel, we discussed our responsibility as SysAdmins to report bugs through the appropriate channels. We installed and explored some different window managers as an alternative to more complex desktop environments. The desktop environments do depend upon at least one of the window managers for their low-level graphical functions while providing useful, needed, and sometimes fun features. We also discovered how to change the default display manager to provide a different GUI login screen as well as how the GUI and command-line logins work. This chapter has also been about learning the tools like dd that we used to extract the data from files and from specific locations on the hard drive. Understanding those tools and how they can be used to locate and trace data and files provides SysAdmins with skills that can be applied to exploring other aspects of Linux.


Exercises

1. Describe the Linux boot process.
2. Describe the Linux startup process.
3. What does GRUB do?
4. Where is stage 1 of GRUB located on the hard drive?
5. What is the function of systemd during startup?
6. Where are the systemd startup target files and links located?
7. Configure the StudentVM1 host so that the default.target is reboot.target and reboot the system. After watching the VM reboot a couple times, reconfigure the default.target to point to the graphical.target again and reboot.
8. What is the function of an agetty?
9. Describe the function of a display manager.
10. What Linux component attaches to a virtual console and displays the text mode login screen?
11. List and describe the Linux components involved and the sequence of events that take place when a user logs in to a virtual console until they log out.
12. What happens when the display manager service is restarted from a root terminal session on the desktop using the command systemctl restart display-manager.service?


CHAPTER 17

Shell Configuration

Objectives

In this chapter you will learn

• How the Bash shell is configured
• How to modify the configuration of the Bash shell so that your changes won’t be overwritten during updates
• The names and locations of the files used to configure Linux shells at both global and user levels
• Which shell configuration files should not be changed
• How to set shell options
• The locations in which to place or find supplementary configuration files
• How to set environment variables from the command line
• How to set environment variables using shell configuration files
• The function of aliases and how to set them

In this chapter we will learn to configure the Bash shell because it is the default shell for almost every Linux distribution. Other shells have very similar configuration files, and many of them coexist with the Bash configuration files in both the /etc directory for global configuration and in the users’ home directories for local configuration. We will explore environment variables and shell variables and how they contribute to the behavior of the shell itself and the programs that run in a shell. We will discover the files that can be used to configure the Bash shell globally and for individual users. This chapter is not about learning every possible environment variable. It is more about learning where the files used to configure the Bash shell are located and how to manage them.

© David Both 2020 D. Both, Using and Administering Linux: Volume 1, https://doi.org/10.1007/978-1-4842-5049-5_17


We have looked at the $PATH and $? environment variables, but there are many more variables than just those. The $EDITOR variable, for example, defines the name of the default text mode editor to be used when programs call for an editor, and, as we have already seen, the $PATH environment variable defines a list of directories in which the shell will look for commands. Most of these variables are used to help define how the shell and the programs running in the shell behave. Running programs, whether command line or GUI, can extract the values of one or more environment variables in order to determine specific behaviors.

Starting the shell

The sequence of events that takes place when we start a shell provides us with the information we need to understand its configuration. This sequence begins with global configuration files and then proceeds to the local configuration files which allow users to override global configuration settings. All of the files we encounter in this section are ASCII text files, so they are open and knowable. Some of these files should not be changed, but their content can be overridden in local configuration files. Before we can explore any further, we need to define a couple of terms. There are multiple ways that one can start a shell, and this results in multiple sets of circumstances under which the shell might be started. There are two circumstances that we are concerned about here, and they do result in different environments and a somewhat different sequence in which the shell initialization is performed:

• Login shell: A login shell is one that requires a user ID and password to gain access. This is the case with a virtual console or when you log in remotely using SSH. The GUI desktop1 constitutes a login shell.

1 In many ways, a GUI desktop can be considered a shell, and its login sequence is very similar to that of a login to a virtual console.


• Non-login shell: A non-login shell is one that is spawned or launched from within another, already running shell. This parent shell can be a login shell or another non-login shell. Non-login shells can be launched from within a GUI desktop, by the screen command, or from within a terminal emulator where multiple tabs or windows can each contain a shell instance.
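One way to see which kind of shell you are in: Bash sets the read-only login_shell option at startup, and a login shell's $0 conventionally begins with a dash (e.g., "-bash"). This short check is a sketch of my own, not from the book:

```shell
# Report whether the current Bash instance is a login shell.
if shopt -q login_shell; then
    echo "This is a login shell"
else
    echo "This is a non-login shell"
fi

# A login shell's $0 conventionally starts with a dash, such as "-bash".
echo "\$0 is: $0"
```

Running `bash -c 'shopt -q login_shell && echo yes || echo no'` prints no, while adding -l to force a login shell (`bash -lc ...`) prints yes.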

There are five main files and one directory that are used to configure the Bash environment. We will look at each of these in a bit more detail, but they are listed here along with their main functions:

• /etc/profile: System-wide environment and startup programs.
• /etc/bashrc: System-wide functions and aliases.
• /etc/profile.d/: This directory contains system-wide scripts for configuring various CLI tools such as vim and mc. The SysAdmin can also place custom configuration scripts in this directory.
• ~/.bash_profile: User-specific environment and startup programs.
• ~/.bashrc: User-specific aliases and functions.
• ~/.bash_logout: User-specific commands to execute when the user logs out.
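A quick way to see which of these files exist on a given host is a simple existence loop. This is a convenience sketch of my own; not every distribution provides all of these files:

```shell
# Check which of the main Bash configuration files exist on this host.
for f in /etc/profile /etc/bashrc ~/.bash_profile ~/.bashrc ~/.bash_logout; do
    if [ -e "$f" ]; then
        echo "present: $f"
    else
        echo "absent:  $f"
    fi
done
```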

All user shell configuration files that are located in the /etc/skel directory, such as ~/.bash_profile and ~/.bashrc, are copied into the new account home directory when each new user account is created. We will explore managing users and the creation of new accounts in Volume 2, Chapter 16. The sequence of execution for all of the Bash configuration files is shown in Figure 17-1. It can seem convoluted, and it is. But once we unravel it, you will understand how Bash is configured. You will know where you can make changes that can override the defaults, add to the $PATH, and prevent future updates from overwriting the changes you have made. Note that the global configuration files are located in /etc or a subdirectory, and local Bash configuration files are located in the login user’s home directory (~). Let’s walk through the sequence using the flowchart in Figure 17-1 and then do a couple experiments that will enable you to understand how to follow the sequence yourself if that should ever be necessary. Note that the dashed lines in Figure 17-1 indicate that the script calls an external script and then control returns to the calling


script. So /etc/profile and /etc/bashrc both call the scripts located in /etc/profile.d, and ~/.bash_profile calls ~/.bashrc; when those scripts have completed, control returns to the script that called them.

Figure 17-1.  The Bash shell configuration sequence of shell programs


Non-login shell startup

We start with the non-login shell because it is a bit simpler. Starting in the upper left-hand corner of Figure 17-1, we launch the shell. A determination is made that it is a non-login shell, so we take the No path out of the decision diamond. This is the case because we are already logged in to the desktop. This path leads us through execution of ~/.bashrc, which calls /etc/bashrc. The /etc/bashrc program contains code that calls each of the files ending with *.sh and the file sh.local that are located in /etc/profile.d. These are not the only files in that directory, as other shells also store configuration files there. After all of the Bash configuration files in /etc/profile.d complete their execution, control returns to /etc/bashrc, which performs a bit of cleanup and then exits. At this point the Bash shell is fully configured.

Login shell startup

The startup and configuration sequence through these shell scripts is more complex for a login shell than it is for a non-login shell. Yet almost all of the same configuration takes place. This time we take the Yes path out of the first decision point in the upper left corner of Figure 17-1. This causes the /etc/profile script to execute. The /etc/profile script contains some of its own code that executes all of the files ending with *.sh and the file sh.local that are located in /etc/profile.d. After these files have finished running, control returns to /etc/profile, which finishes its own execution. The shell now looks for three files in sequence: ~/.bash_profile, ~/.bash_login, and ~/.profile. It runs the first one it finds and ignores the others. Fedora home directories typically contain the ~/.bash_profile file, so that is the one which is executed. The other two files do not exist because there is no point to that. These two files, ~/.bash_login and ~/.profile, are considered by some to be possible alternate files that might exist in some old legacy hosts, so the shell continues to look for them so as to maintain backward compatibility. Some software such as the Torch machine learning framework stores its environment variables in ~/.profile, and other software might also use these legacy files. The ~/.bash_profile configuration file also calls the ~/.bashrc file, which is executed, and then control returns to ~/.bash_profile. When ~/.bash_profile finishes execution, the shell configuration is complete.
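The "first one it finds" behavior can be emulated with a short loop. In this sketch a throwaway temporary directory stands in for the home directory, so the result does not depend on your real dotfiles:

```shell
# Emulate Bash's login-shell search: only the FIRST existing file of the
# three candidates is sourced. A temp directory stands in for $HOME.
fakehome=$(mktemp -d)
touch "$fakehome/.bash_login" "$fakehome/.profile"   # no .bash_profile here

for f in .bash_profile .bash_login .profile; do
    if [ -r "$fakehome/$f" ]; then
        echo "would source: $f"    # → would source: .bash_login
        break
    fi
done
rm -rf "$fakehome"
```

Because .bash_profile is absent in the fake home, the loop stops at .bash_login and never considers .profile, which mirrors the order Bash uses.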


Exploring the global configuration scripts

The scripts in /etc, /etc/profile and /etc/bashrc, as well as all of the *.sh scripts in /etc/profile.d, are the global configuration scripts for the Bash shell. Global configuration is inherited by all users. A little knowledge of the content of these scripts helps us better understand how it all fits together.

EXPERIMENT 17-1

Perform this experiment as the student user. Set /etc as the PWD, then look at the permissions of /etc/profile:

[student@studentvm1 ~]$ cd /etc ; ll profile
-rw-r--r--. 1 root root 2078 Apr 17  2018 profile

It is readable by all but can only be modified by root. Note that its execute permission is not set. In fact, none of these configuration files are marked as executable despite the fact that the commands in them must be executed in order to set up the environment. That is because the shell “sources” /etc/profile, which then sources other setup files. After sourcing the file, which can be done with the source command or the much shorter alternative, dot (.), the instructions in the file are executed.

Use less to look at the content of /etc/profile to see what it does. It is not necessary to analyze this entire file in detail. But you should be able to see where some of the environment variables are set programmatically. Search for instances of PATH to see how the $PATH is set. The first thing you see after the comments describing the file is a procedure named “pathmunge” which is called by code further down when the initial path needs to be modified:

pathmunge () {
    case ":${PATH}:" in
        *:"$1":*)
            ;;
        *)
            if [ "$2" = "after" ] ; then
                PATH=$PATH:$1
            else
                PATH=$1:$PATH
            fi
    esac
}
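To see what pathmunge actually does, you can paste the procedure into a throwaway shell and call it yourself. The directory names below are arbitrary examples chosen for the demonstration, not values from /etc/profile:

```shell
# Demonstrate pathmunge: it prepends (or, with "after", appends) a
# directory to $PATH, but only if that directory is not already present.
pathmunge () {
    case ":${PATH}:" in
        *:"$1":*)
            ;;
        *)
            if [ "$2" = "after" ] ; then
                PATH=$PATH:$1
            else
                PATH=$1:$PATH
            fi
    esac
}

PATH=/usr/bin
pathmunge /usr/local/bin         # prepended
pathmunge /opt/tools after       # appended
pathmunge /usr/bin               # already present; no change
echo "$PATH"                     # → /usr/local/bin:/usr/bin:/opt/tools
```

The case pattern `*:"$1":*` is the duplicate check: the $PATH is wrapped in colons so that every entry, including the first and last, can be matched as `:dir:`.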

After this there is some code that determines the effective user ID, $EUID, of the user launching the shell. Then here is the code that sets the first elements of the $PATH environment variable based on whether the $EUID is root with a value of zero (0), or another non-root, nonzero user:

# Path manipulation
if [ "$EUID" = "0" ]; then
    pathmunge /usr/sbin
    pathmunge /usr/local/sbin
else
    pathmunge /usr/local/sbin after
    pathmunge /usr/sbin after
fi

The path is different for root than it is for other users, and this is the Bash shell code that makes that happen. Now let’s look at some code down near the bottom of this file. The next bit of code is the part that locates and executes the Bash configuration scripts in /etc/profile.d:

for i in /etc/profile.d/*.sh /etc/profile.d/sh.local ; do
    if [ -r "$i" ]; then
        if [ "${-#*i}" != "$-" ]; then
            . "$i"
        else
            . "$i" >/dev/null
        fi
    fi
done
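The effect of that loop can be reproduced with a scratch directory standing in for /etc/profile.d. The file names and variables here are invented purely for the demonstration:

```shell
# Source every readable *.sh file found in a directory, as /etc/profile
# does with /etc/profile.d -- demonstrated on a temporary directory.
d=$(mktemp -d)
echo 'GREETING="hello"'   > "$d/one.sh"
echo 'FAREWELL="goodbye"' > "$d/two.sh"

for i in "$d"/*.sh ; do
    if [ -r "$i" ]; then
        . "$i"            # source the script into the current shell
    fi
done

echo "$GREETING $FAREWELL"    # → hello goodbye
rm -rf "$d"
```

Because the scripts are sourced rather than executed, the variables they set land in the current shell's environment, which is exactly why the files in /etc/profile.d do not need execute permission.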

List the files in the /etc/profile.d directory:

[student@studentvm1 ~]$ ll /etc/profile.d/*.sh
-rw-r--r--. 1 root root  664 Jun 18 06:41 /etc/profile.d/bash_completion.sh
-rw-r--r--. 1 root root  201 Feb  7  2018 /etc/profile.d/colorgrep.sh
-rw-r--r--. 1 root root 1706 May 29 12:30 /etc/profile.d/colorls.sh
-rw-r--r--. 1 root root   56 Apr 19  2018 /etc/profile.d/colorsysstat.sh
-rw-r--r--. 1 root root  183 May  9  2018 /etc/profile.d/colorxzgrep.sh
-rw-r--r--. 1 root root  220 Feb  9  2018 /etc/profile.d/colorzgrep.sh
-rw-r--r--. 1 root root  757 Dec 14  2017 /etc/profile.d/gawk.sh
-rw-r--r--  1 root root   70 Aug 31 08:25 /etc/profile.d/gnome-ssh-askpass.sh
-rw-r--r--  1 root root  288 Mar 12  2018 /etc/profile.d/kde.sh
-rw-r--r--. 1 root root 2703 May 25 07:04 /etc/profile.d/lang.sh
-rw-r--r--. 1 root root  253 Feb 17  2018 /etc/profile.d/less.sh
-rwxr-xr-x  1 root root  153 Aug  3  2017 /etc/profile.d/mc.sh
-rw-r--r--  1 root root  488 Oct  3 13:49 /etc/profile.d/myBashConfig.sh
-rw-r--r--. 1 root root  248 Sep 19 04:31 /etc/profile.d/vim.sh
-rw-r--r--. 1 root root 2092 May 21  2018 /etc/profile.d/vte.sh
-rw-r--r--. 1 root root  310 Feb 17  2018 /etc/profile.d/which2.sh
[student@studentvm1 ~]$

Can you see the file I added? It is myBashConfig.sh, which does not exist on your VM. Here is the content of myBashConfig.sh. I have set some aliases, set vi editing mode for my Bash shell command line, and set a couple of environment variables:

#############################################################################
# The following global changes to Bash configuration added by me            #
#############################################################################
alias lsn='ls --color=no'
alias vim='vim -c "colorscheme desert" '
alias glances='glances -t1'
# Set vi for Bash editing mode
set -o vi
# Set vi as the default editor for all apps that check this
# Set some shell variables
EDITOR=vi
TERM=xterm

You should also look at the content of some of the other Bash configuration files in /etc/profile.d to see what they do. And this last bit of code in /etc/profile is to source and run the /etc/bashrc file if it exists and if the $BASH_VERSION- variable is not null:

if [ -n "${BASH_VERSION-}" ] ; then
  if [ -f /etc/bashrc ] ; then
    # Bash login shells run only /etc/profile
    # Bash non-login shells run only /etc/bashrc
    # Check for double sourcing is done in /etc/bashrc.
    . /etc/bashrc
  fi
fi

So now look at the content of /etc/bashrc. As the first comment in this file states, its function is to set system-wide functions and aliases. This includes setting the terminal emulator type, the command prompt string, the umask, which defines the default permissions of new files when they are created, and, very importantly, the $SHELL variable, which defines the fully qualified path and name of the Bash shell executable. We will explore umask in Chapter 18 in this volume. None of the default files used for global configuration of the Bash shell should be modified. To modify or add to the global configuration, you should add a custom file to the /etc/profile.d directory that contains the configuration changes you wish to make. The name of the file is unimportant other than that it must end in “.sh”, but I suggest naming it something noticeable.
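As a concrete illustration, such a custom global file might look like the following. The name my-custom.sh and all of the settings in it are invented examples for this sketch, not from the book; only the .sh suffix matters:

```shell
# Hypothetical contents of /etc/profile.d/my-custom.sh.
# Everything below is an example of the kinds of settings that belong here.

# A couple of aliases made available to all users
alias df='df -h'
alias free='free -h'

# A global environment variable
export EDITOR=vi
```

Because /etc/profile and /etc/bashrc both source every *.sh file in /etc/profile.d, this file takes effect for all users at their next login without modifying any distribution-owned file, so package updates cannot overwrite it.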

Exploring the local configuration scripts

The local Bash configuration files are located in each user’s home directory. Each user can modify these files in order to configure the shell environment to their own preferences. The local configuration files, .bashrc and .bash_profile, contain some very basic configuration items.

EXPERIMENT 17-2

When a login shell is started, Bash first runs /etc/profile, and when that finishes, the shell runs ~/.bash_profile. View the ~/.bash_profile file. The local files we are viewing in this experiment are small enough to reproduce here in their entirety:

[student@studentvm1 ~]$ cat .bash_profile
# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi

# User specific environment and startup programs

PATH=$PATH:$HOME/.local/bin:$HOME/bin

export PATH

First, ~/.bash_profile runs ~/.bashrc to set the aliases and functions into the environment. It then sets the path and exports it. That means that the path is then available to all future non-login shells. The ~/.bashrc config file is called by ~/.bash_profile. This file, as shown in the following, calls /etc/bashrc:

[student@studentvm1 ~]$ cat .bashrc
# .bashrc

# Source global definitions
if [ -f /etc/bashrc ]; then
        . /etc/bashrc
fi

# Uncomment the following line if you don't like systemctl's auto-paging feature:
# export SYSTEMD_PAGER=

# User specific aliases and functions
[student@studentvm1 ~]$

The comments in these files inform the users where they can insert any local configuration such as environment variables or aliases.

Testing it

That explanation is all nice and everything, but what does it really mean? There is one way to find out, and this is a technique I use frequently to test for the sequence of execution of a complex and interrelated system of shell programs or of procedures within shell programs. I just add an echo statement at the beginning of each of the programs in question stating which shell program is running.
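Here is the same technique scaled down to two throwaway scripts so you can watch the execution order in isolation. The script names and directory are arbitrary choices for this sketch:

```shell
# Demonstrate tracing execution order with echo statements.
dir=$(mktemp -d)

cat > "$dir/outer.sh" <<'EOF'
echo "Running outer.sh"
. "$(dirname "$0")/inner.sh"
echo "Control returned to outer.sh"
EOF

cat > "$dir/inner.sh" <<'EOF'
echo "Running inner.sh"
EOF

bash "$dir/outer.sh"
# → Running outer.sh
# → Running inner.sh
# → Control returned to outer.sh

rm -rf "$dir"
```

The output shows both the call into the sourced script and the return of control to the caller, which is exactly the pattern the experiment below reveals in the real Bash startup files.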


EXPERIMENT 17-3

Edit each of the following shell programs, and add one line to the beginning of the program. I have highlighted the lines to be added in bold so you know where to place them. For this experiment, it is safe to ignore the warning comments embedded in each program against changing them. These first three programs need to be modified by root:

1. Edit /etc/profile:

# /etc/profile

# System wide environment and startup programs, for login setup
# Functions and aliases go in /etc/bashrc

# It's NOT a good idea to change this file unless you know what you
# are doing. It's much better to create a custom.sh shell script in
# /etc/profile.d/ to make custom changes to your environment, as this
# will prevent the need for merging in future updates.

pathmunge () {
    case ":${PATH}:" in
        *:"$1":*)
            ;;
        *)
            if [ "$2" = "after" ] ; then
                PATH=$PATH:$1
            else
                PATH=$1:$PATH
            fi
    esac
}

echo "Running /etc/profile"

if [ -x /usr/bin/id ]; then
    if [ -z "$EUID" ]; then
        # ksh workaround
        EUID=`id -u`
        UID=`id -ru`
    fi
    USER="`id -un`"
    LOGNAME=$USER
    MAIL="/var/spool/mail/$USER"
fi

Note that in the case of /etc/profile, we add our bit of code after the pathmunge procedure. This is because all procedures must appear before any in-line code.2

2. Edit /etc/bashrc:

# /etc/bashrc

# System wide functions and aliases
# Environment stuff goes in /etc/profile

# It's NOT a good idea to change this file unless you know what you
# are doing. It's much better to create a custom.sh shell script in
# /etc/profile.d/ to make custom changes to your environment, as this
# will prevent the need for merging in future updates.

echo "Running /etc/bashrc"

# Prevent doublesourcing
if [ -z "$BASHRCSOURCED" ]; then
  BASHRCSOURCED="Y"

3. Add a new program, /etc/profile.d/myBashConfig.sh, and add the following two lines to it:

# /etc/profile.d/myBashConfig.sh
echo "Running /etc/profile.d/myBashConfig.sh"

The files .bash_profile and .bashrc should be altered by the student user for the student user’s account.

2 We will discuss Bash coding, procedures, and program structure in Volume 2, Chapter 10.


4. Edit ~/.bash_profile:

# .bash_profile
echo "Running ~/.bash_profile"

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi

# User specific environment and startup programs

PATH=$PATH:$HOME/.local/bin:$HOME/bin

export PATH

5. Edit ~/.bashrc:

# .bashrc
echo "Running ~/.bashrc"

# Source global definitions
if [ -f /etc/bashrc ]; then
        . /etc/bashrc
fi

# Uncomment the following line if you don't like systemctl's auto-paging feature:
# export SYSTEMD_PAGER=

# User specific aliases and functions

After all of the files have been modified as shown earlier, open a new terminal session on the desktop. Each file that executes should print its name on the terminal. That should look like this:

Running ~/.bashrc
Running /etc/bashrc
Running /etc/profile.d/myBashConfig.sh
[student@studentvm1 ~]$


So you can see by the sequence of shell config scripts that were run that this is a non-login shell as shown in Figure 17-1. Switch to virtual console 2, and log in as the student user. You should see the following data:

Last login: Sat Nov 24 11:20:41 2018 from 192.178.0.1
Running /etc/profile
Running /etc/profile.d/myBashConfig.sh
Running /etc/bashrc
Running ~/.bash_profile
Running ~/.bashrc
Running /etc/bashrc
[student@studentvm1 ~]$

This experiment shows exactly which files are run and in what sequence. It verifies most of what I have read in other documents and my analysis of the code in each of these files. However, I have intentionally left one error in my analysis and the diagram in Figure 17-1. Can you figure out what the difference is and why?3

Exploring the environment

We have already looked at some environment variables and learned that they affect how the shell behaves under certain circumstances. Environment variables are just like any other variable: a variable name and a value. The shell, or programs running under the shell, check the content of certain variables and use their values to determine how to respond to specific input, data values, or other triggering factors. A typical variable assignment looks like this:

VARIABLE_NAME=value

The actual content of the environment can be explored and manipulated with simple tools. Permanent changes need to be made in the configuration files, but temporary changes can be made with basic commands from the command line.

3. Hint: Look for duplicates.



EXPERIMENT 17-4

Perform this experiment in a terminal session as the student user. Close all currently open terminal sessions, and then open a new terminal session. View the current environment variables using the printenv command:

[student@studentvm1 ~]$ printenv | less

Some environment variables such as LS_COLORS and TERMCAP contain very long strings of text. The LS_COLORS string defines the colors used for display of specific text when various commands are run, if the terminal is capable of displaying color. The TERMCAP (TERMinal CAPabilities) variable defines the capabilities of the terminal emulator. Look at some individual values. What is the value of HOME?

[student@studentvm1 ~]$ echo $HOME
/home/student

Do you think that this might be how the shell knows which directory to make the PWD when you use the command cd ~? What are the values of LOGNAME, HOSTNAME, PWD, OLDPWD, and USER? Why is OLDPWD empty, that is, null? Make /tmp the PWD, and recheck the values of PWD and OLDPWD. What are they now?

User shell variables

Shell variables are part of the local environment; that is, they are accessible to programs, scripts, and user commands running in that shell. Users can create variables within a shell, and those variables become part of the environment for that one shell only; no other shell has access to these local user variables. If a shell variable is changed or a new one is created, it must be explicitly "exported" in order for any subprocess forked after the export to see the change. Recall that shell variables are local to the shell in which they were defined. To make a shell variable available as an environment variable for shells launched afterward, use the export VARNAME command, without the dollar ($) sign.
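This behavior can be demonstrated in a few lines of Bash. The following is a minimal sketch (MyVar is just an example name) that captures what a child shell sees before and after the export:

```shell
unset MyVar                            # start from a clean slate
MyVar="MyVariable"                     # a plain shell variable, local to this shell
before=$(bash -c 'echo "[$MyVar]"')    # a child bash does not inherit it
export MyVar                           # promote it to the environment
after=$(bash -c 'echo "[$MyVar]"')     # now a child bash does inherit it
echo "before export: $before  after export: $after"
```

The single quotes around the child's command matter: they prevent the parent shell from expanding $MyVar before the child runs, so the child reports its own view of the variable.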



Note  By convention, environment variable names are all uppercase, but they can be mixed or all lowercase if that works for you. Just remember that Linux is case sensitive, so Var1 is not the same as VAR1 or var1. Let’s now look at setting new user shell variables.
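The case sensitivity is easy to verify; the three assignments in this small sketch create three distinct variables, not one:

```shell
Var1="alpha"      # three names that differ only in case...
VAR1="bravo"
var1="charlie"
echo "$Var1 $VAR1 $var1"   # ...hold three independent values
```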

EXPERIMENT 17-5

In the existing terminal session as the student user, start by ensuring that a new shell variable named MyVar does not exist, and set it. Then verify that it now exists and contains the correct value:

[student@studentvm1 ~]$ echo $MyVar ; MyVar="MyVariable" ; echo $MyVar

MyVariable
[student@studentvm1 ~]$

Open another Bash terminal session as the student user, and verify that the new variable you created does not exist in this shell:

[student@studentvm1 ~]$ echo $MyVar

[student@studentvm1 ~]$

Exit from this second shell. In the first terminal session, in which the $MyVar variable exists, verify that it still exists and start a screen session:

[student@studentvm1 ~]$ echo $MyVar
MyVariable
[student@studentvm1 ~]$ screen

Now check for $MyVar:

[student@studentvm1 ~]$ echo $MyVar

[student@studentvm1 ~]$

Note that $MyVar does not exist in this screen instance of the Bash shell. Type the exit command once to exit from the screen session. Now run the export command, and then start another screen session:


[student@studentvm1 ~]$ export MyVar="MyVariable" ; echo $MyVar
MyVariable
[student@studentvm1 ~]$ screen

Now check for $MyVar again while in the screen session:

[student@studentvm1 ~]$ echo $MyVar
MyVariable
[student@studentvm1 ~]$

Exit from the screen session again, and unset MyVar:

[student@studentvm1 ~]$ exit
[screen is terminating]
[student@studentvm1 ~]$ unset MyVar
[student@studentvm1 ~]$ echo $MyVar

[student@studentvm1 ~]$

Let's try one last thing. The env utility allows us to set an environment variable temporarily for a program, or in this case a subshell. The bash command must be an argument of the env command in order for this to work:

[student@studentvm1 ~]$ env MyVar=MyVariable bash
[student@studentvm1 ~]$ echo $MyVar
MyVariable
[student@studentvm1 ~]$ exit
exit
[student@studentvm1 ~]$

This last tool can be useful when testing scripts or other tools that require an environment a bit different from the one in which you normally work. Perform a little cleanup – exit from all terminal sessions. We have now discovered empirically that when local variables are set, they become part of the environment for that shell only. Even after exporting the variable, it becomes part of the environment only of shells launched from the current shell, such as those started with the screen command.
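A related, non-interactive use of env is to set the variable for a single command rather than an interactive subshell. This sketch substitutes bash -c for the interactive bash of Experiment 17-5:

```shell
unset MyVar                                             # make sure it is not set here
result=$(env MyVar=MyVariable bash -c 'echo "$MyVar"')  # only the child process sees it
echo "child saw: $result"
echo "parent sees: [${MyVar}]"                          # still empty in this shell
```

This is handy in scripts: the temporary variable vanishes along with the child process, leaving the parent environment untouched.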



I have very seldom had any reason to temporarily create a local user environment variable. I usually add my variable creation statements to the ~/.bashrc file if it is for my login account only, or I add it to a custom shell configuration script in /etc/profile.d if it is intended for all users of the system.

Aliases

I dislike typing. I grew up and went to school in a time when boys did not learn typing, so I have really horrible typing skills. Therefore I prefer to type as little as possible. Of course, lazy SysAdmins like to minimize typing just to save time, regardless of the state of their typing skills. Aliases are a good way to reduce typing, which will, therefore, reduce errors. An alias substitutes a short command, one with fewer characters that is easier to type, for a longer one. Aliases commonly make it unnecessary to type long strings of options that we use constantly, by building them into the alias.

EXPERIMENT 17-6

As the student user, enter the alias command to view the current list of aliases. I did not know until I looked at these aliases that the ls command was already aliased. So when I enter "ls" on the command line, the shell expands that to "ls --color=auto", which would be a lot of extra typing:

[student@testvm1 ~]$ alias
alias egrep='egrep --color=auto'
alias fgrep='fgrep --color=auto'
alias glances='glances -t1'
alias grep='grep --color=auto'
alias l.='ls -d .* --color=auto'
alias ll='ls -l --color=auto'
alias ls='ls --color=auto'
alias lsn='ls --color=no'
alias mc='. /usr/libexec/mc/mc-wrapper.sh'
alias vi='vim'
alias vim='vim -c "colorscheme desert" '


alias which='(alias; declare -f) | /usr/bin/which --tty-only --read-alias --read-functions --show-tilde --show-dot'
alias xzegrep='xzegrep --color=auto'
alias xzfgrep='xzfgrep --color=auto'
alias xzgrep='xzgrep --color=auto'
alias zegrep='zegrep --color=auto'
alias zfgrep='zfgrep --color=auto'
alias zgrep='zgrep --color=auto'

Your results should look similar to mine, but I have added some additional aliases. One is for the glances utility which is not a part of most distributions. Since vi has been replaced by vim, and a lot of SysAdmins like myself have legacy muscle memory and continue to type vi, vi is aliased to vim. Another alias is for vim to use the “desert” color scheme. So when I type vi on the command line and press the Enter key, the Bash shell first expands vi to vim, and then it expands vim to vim -c "colorscheme desert" and then executes that command.

Note  For the root user in Fedora, vi is not automatically aliased to vim.

Although these aliases are almost all added to the global environment by the shell configuration files in /etc/profile.d, you can add your own using your local configuration files as well as by adding them at the command line. The command-line syntax is identical to that shown earlier. The aliases shown in Experiment 17-6 are primarily intended to set up default behavior such as color and some standard options. I particularly like the ll alias because I like the long listing of directory contents; instead of typing ls -l, I can just type ll. I use the ll command a lot, and it saves typing three characters every time I use it. For slow typists like me, that can amount to a lot of time. Aliases also enable me to use complex commands without the need to learn and remember a long command with lots of options and arguments.

I strongly recommend that you do not alias Linux commands to the commands you used in another operating system, as some people have done. You will never learn Linux that way.
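The syntax is the same whether an alias is typed at the prompt or placed in a configuration file. A small sketch, using the made-up name lsl so it does not collide with the stock aliases:

```shell
alias lsl='ls -l --color=auto'   # define the alias in the current shell
defined=$(alias lsl)             # the alias builtin prints the recorded definition
echo "$defined"
unalias lsl                      # remove it again
```

One wrinkle worth knowing: in a non-interactive script, Bash records alias definitions but does not expand them unless shopt -s expand_aliases is set; at an interactive prompt, expansion is on by default.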



In Experiment 17-6 the alias for the vim editor sets a color scheme which is not the default. I happen to like the desert color scheme better than the default, so aliasing the vim command to the longer command that also specifies my favorite color scheme is one way to get what I want with less typing. You can add your own new aliases to the ~/.bashrc file to make them permanent across reboots and logouts. To make the aliases available to all users on a host, add them to a customization file in /etc/profile.d, as discussed earlier. The syntax in either case is the same as from the command line.

Chapter summary

Does shell startup and configuration seem arcane and confusing to you? I would not be surprised, because I was – and sometimes still am – confused. I learned and relearned a lot during my research for this chapter. The primary thing to remember is that there are specific files used for permanent configuration and that they are executed in different sequences depending upon whether a login or non-login shell is launched. We have explored the shell startup sequence, looked at the content of the Bash configuration files, and examined the proper methods for changing the environment. We have also learned to use aliases to reduce the amount of typing we need to do.

Exercises

Perform the following exercises to complete this chapter:

1. What is the difference between shell and environment variables? Why is this distinction important?

2. When starting a non-login Bash shell, which configuration file is run first?

3. Can a non-privileged user set or change their own shell variables?

4. Which configuration file is the first one to be executed by a newly launched shell on the desktop?



5. What is the value of the COLUMNS variable in each of the open terminal sessions on your current desktop? If you don't see a difference, resize one or more terminal windows and recheck the values. What might this variable be used for?

6. What is the sequence of shell configuration files run when you log in using a virtual console?

7. Why is it important to understand the sequence in which Bash configuration files are executed?

8. Add an alias that launches vim with a different color scheme and that is used only for the student user. The color schemes and an informative README.txt file are located in the directory /usr/share/vim/vim81/colors. Try a couple of different color schemes, and test them by opening one of the Bash configuration files.

9. Where did you add the alias in question 8?

10. What sequence of Bash configuration files is run when you use the su command to switch to the root user?

11. What sequence of Bash configuration files is run when you use the sudo command?

12. You have an environment variable to add so that it becomes part of the environment for all users. In what file do you add it?

13. Which shell configuration files are executed when the system is booted into recovery mode for the latest kernel?


CHAPTER 18

Files, Directories, and Links

Objectives

In this chapter you will learn:

To define the term “file”

To describe the purpose of files

To read and describe file permissions

How the umask command and settings affect the creation of files by users

To set file permissions

The structure of the metadata for files including the directory entry and the inode

To describe the three types of Linux file timestamps

To find, use, and set the three timestamps of a Linux file

The easy way to identify what type a file is, binary or text

To obtain the metadata for a file

To define hard and soft links

How to use and manage links

© David Both 2020 D. Both, Using and Administering Linux: Volume 1, https://doi.org/10.1007/978-1-4842-5049-5_18



Introduction

We usually think of files as those things that contain data and that are stored on some form of storage media such as a magnetic or solid-state hard drive. And this is true – as far as it goes – in a Linux environment. The Free On-line Dictionary of Computing¹ provides a good definition for "computer file" that I will paraphrase here in a way that refers specifically to Linux files.

A computer file is a unit of storage consisting of a single sequence of data with a finite length that is stored on a nonvolatile storage medium. Files are stored in directories and are accessed using a file name and an optional path. Files also support various attributes such as permissions and timestamps for creation, last modification, and last access.

Although this definition is basically what I said, it provides more detail about the characteristics that are an intrinsic part of Linux files. I would amend the FOLDOC definition to say that files are usually stored on some nonvolatile medium. Files can also be stored on volatile media such as virtual filesystems, which we will explore in Volume 2, Chapter 5. In this chapter we will explore these characteristics, the data meta-structures that provide these capabilities, and more.

Preparation

We did create a few directories and files in Chapter 7, but because there are no user files in the ~/Documents directory for us to experiment with during this chapter, let's create some there.

1. Free On-line Dictionary of Computing, http://foldoc.org/, Editor Denis Howe



EXPERIMENT 18-1

We will create some new files and a new user to help illustrate some aspects of file permissions. Start this experiment as the student user. Make the PWD the ~/Documents directory. Enter the following command on a single line:

[student@studentvm1 Documents]$ for I in `seq -w 20` ; do dmesg > testfile$I ; touch test$I file$I ; done

The seq utility prints a sequence of numbers, in this case from 1 to 20. The backticks (`) around that command cause the results to be expanded into a list that can be used by the for command. The -w option specifies that all numbers will have the same length, so if the largest number is two digits in length, the single-digit numbers are padded with zeros so that 1 becomes 01 and so on. Display a long list of the files, with their sizes in human-readable format rather than an exact byte count:

[student@studentvm1 Documents]$ ll -h
total 880K
-rw-rw-r-- 1 student student   0 Dec  4 09:47 file01
-rw-rw-r-- 1 student student   0 Dec  4 09:47 file02
-rw-rw-r-- 1 student student   0 Dec  4 09:47 file03
-rw-rw-r-- 1 student student   0 Dec  4 09:47 file04
-rw-rw-r-- 1 student student   0 Dec  4 09:47 file05
-rw-rw-r-- 1 student student   0 Dec  4 09:47 file06
<snip>
-rw-rw-r-- 1 student student   0 Dec  4 09:47 test18
-rw-rw-r-- 1 student student   0 Dec  4 09:47 test19
-rw-rw-r-- 1 student student   0 Dec  4 09:47 test20
-rw-rw-r-- 1 student student 44K Dec  4 09:47 testfile01
-rw-rw-r-- 1 student student 44K Dec  4 09:47 testfile02
-rw-rw-r-- 1 student student 44K Dec  4 09:47 testfile03
<snip>
-rw-rw-r-- 1 student student 44K Dec  4 09:47 testfile19
-rw-rw-r-- 1 student student 44K Dec  4 09:47 testfile20
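The zero-padding behavior of seq -w is easy to see in isolation. This small sketch is illustrative only, not part of the experiment:

```shell
# -w pads every number to the width of the largest one
seq -w 10 | head -n 3    # prints 01, 02, 03, each on its own line
```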



Now we have a few files to work with. But we will also need another user for testing, so log in to a terminal session as root, if one is not already available, and add a new user. Using a simple password is fine:

[root@studentvm1 ~]# useradd -c "Student user 1" student1
[root@studentvm1 ~]# passwd student1
Changing password for user student1.
New password: <Enter the password>
BAD PASSWORD: The password is shorter than 8 characters
Retype new password: <Enter the password again>
passwd: all authentication tokens updated successfully.

Now we are ready.

User accounts and security

User accounts are the first line of security on your Linux computer. They are used in the Linux world to provide access to the computer, to keep out people who should not have access, and to keep valid users from interfering with each other's data and usage of the computer. We will explore more aspects of user accounts in Chapter 16 of Volume 2.

The security of the computer and the data stored there are based on the user accounts created by the Linux system administrator or some form of centralized authorization system.² A user cannot access any resources on a Linux system without logging on with an account ID and password. The administrator creates an account for each authorized user and assigns an initial password.

The attributes of permissions and file ownership are one aspect of security provided by Linux. Each file and directory on a Linux system has an owner and a set of access permissions. Setting the ownership and permissions correctly allows users to access the files that belong to them, but not files belonging to others.

2. Centralized authentication systems are beyond the scope of this course.



File attributes

The listing of the files created in Experiment 18-1 shows a number of file attributes that are important to security and access management. The file permissions, the number of hard links, the user and group³ ownership (both shown here as "student"), the file size, the date and time the file was last modified, and the file name itself are all shown in this listing. There are more attributes that are not displayed in this listing, but we will explore all of them as we proceed through this chapter.

File ownership

The sample file listing shown in Figure 18-1, extracted from the listing in Experiment 18-1, shows the details of a single file. We will use this file to explore the structure and attributes of a file. File ownership is one of the attributes that is part of the Linux file security protocols.

-rw-rw-r-- 1 student student 44K Dec  4 09:47 testfile09

Figure 18-1.  The long listing of a sample file

There are two owners associated with every file: the User who owns the file and the Group ownership. The user who created the file is always the owner of the file – at least until the ownership is changed. In Red Hat–based distributions, each user has their own private group, and files created by them also belong to that group. This is the Red Hat Private Group method, and it is used to improve security.

In many older Unix and some Linux systems, all users, and thus the files they created, belonged to a common group, usually group 100, "users." This meant that all users could, in theory at least, access files belonging to other users, so long as directory permissions allowed it. This is a holdover from a time when data security and privacy on computers were much less of an issue than they are now. The Red Hat Private Group scheme is intended to improve security by reducing the number of users who have access to a file by default to one – the file owner.
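Both ownerships can also be read directly with the GNU stat command. In this hedged sketch the file name is arbitrary; for a file you just created, the User owner matches your own user name, and on a Red Hat–style system the Group owner normally does too:

```shell
touch /tmp/owndemo                       # create a scratch file
owners=$(stat -c '%U:%G' /tmp/owndemo)   # %U = User owner, %G = Group owner
echo "owners: $owners"
rm /tmp/owndemo                          # clean up
```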

3. I capitalize User, Group, and Other here and in many places throughout this course in order to refer explicitly to the ownership classes shown in Figure 18-2.



So the file in Figure 18-1 is owned by the user student, and the group ownership is student. The user and group ownership can be expressed using the notation User.Group. The root user can always change user and group ownership – or anything else. The User (owner) of a file can change the Group ownership only under certain circumstances.

There are some standards that we need to consider when adding users and groups. When adding group IDs for things like shared directories and files, I like to choose numbers starting at 5000 and above. This leaves space for 4,000 users with identical UID and GID numbers. We will explore UID and GID assignments and standards in Chapter 16 of Volume 2. Let's explore file ownership and its implications in Experiment 18-2.

EXPERIMENT 18-2

Perform this experiment as the student user. Look at one of the files we created in our ~/Documents directory in Experiment 18-1, file09:

[student@studentvm1 Documents]$ ll file09
-rw-rw-r-- 1 student student 0 Dec  4 09:47 file09

This file, like all of the others in our Documents directory, has the ownership student.student. Let's try to change the user ownership to student1 using the chown (change ownership) command:

[student@studentvm1 Documents]$ chown student1 file09
chown: changing ownership of 'file09': Operation not permitted

The student user does not have the authority to change the user ownership of a file to any other user. Now let's try to change the group ownership. If you are changing the User ownership of the file and not the group ownership, it is not necessary to specify the group with the chown command. To attempt changing only the group ownership, we can use the chgrp (change group) command:

[student@studentvm1 Documents]$ chgrp student1 file09
chgrp: changing group of 'file09': Operation not permitted

Once again we are not authorized to change the ownership on this file. Linux prevents users from changing the ownership of files to protect us from other users and to protect those other users from us. The root user can change the ownership of any file.



It looks like the user cannot change the file’s user and group ownership at all. This is a security feature. It prevents one user from creating files in the name of another user. But what if I really do want to share a file with someone else? There is one way to circumvent the ownership issue. Copy the file to /tmp. Let’s see how that works.

EXPERIMENT 18-3

As the student user, let's first add a bit of data to file09:

[student@studentvm1 Documents]$ echo "Hello world." > file09
[student@studentvm1 Documents]$ cat file09
Hello world.

Now copy the file to /tmp:

[student@studentvm1 Documents]$ cp file09 /tmp

Open a terminal session, and use the su command to switch user to student1:

[student@studentvm1 ~]$ su - student1
Password: <Enter password for student1>
Running /etc/profile
Running /etc/profile.d/myBashConfig.sh
Running /etc/bashrc
Running /etc/bashrc
[student1@studentvm1 ~]$

Now view the contents of the file that is located in /tmp. Then copy the file from /tmp to the student1 home directory, and view it again:

[student1@studentvm1 ~]$ cat /tmp/file09
Hello world.
[student1@studentvm1 ~]$ cp /tmp/file09 . ; cat file09
Hello world.

Why does this work? Let's look at the file permissions to find out:

[student1@studentvm1 ~]$ ll /tmp/file09 file09
-rw-rw-r-- 1 student1 student1 13 Apr  1 09:00 file09
-rw-rw-r-- 1 student  student  13 Apr  1 08:56 /tmp/file09
[student1@studentvm1 ~]$



File permissions

The file permissions, also called the file mode, along with file ownership, provide a means of defining which users and groups have specific types of access to files and directories. For now we will just look at files and will examine directory permissions later. Figure 18-2 shows the three types of permissions and their representation in symbolic (rwx) and octal (421) formats. Octal is only a bit different from hex – literally: hex characters are composed of four binary bits, and octal digits are composed of three binary bits.

User, Group, and Other define the classes of users that the permissions affect. The User is the primary owner of the file. So the User student owns all files with user ownership of student. Those files may or may not have group ownership of student, but in most circumstances they will. So the User permissions define the access rights of the User who "owns" the file. The Group permissions define the access rights of the Group that owns the file, if it is different from the User ownership. And Other is everyone else: all other users fall into the Other category, so access by all other users on the system is defined by the Other permissions.

              User      Group     Other
Permissions   r w x     r w x     r w x
Bits          1 1 1     1 1 1     1 1 1
Octal value   4 2 1     4 2 1     4 2 1

Figure 18-2.  File permission representations and their octal values

There are three permission bits for each class: User, Group, and Other. Each bit has a meaning – (r)ead, (w)rite, and e(x)ecute – and a corresponding octal positional value. We can simplify the class notation by using "UGO" either together or separately in this text. These classes are expressed in lowercase in the commands that affect them:


• Read means that the file can be read by members of that class.

• Write means that the file can be written by members of the class.

• Execute means that the file is executable by members of that class.


Using file09 from Experiment 18-3 as our example, the permissions shown for that file in Figure 18-3 should now be easier to decipher. The permissions of rw-rw-r-- (4+2+0, 4+2+0, 4+0+0, which equals 664) mean that the student user can read and write the file and it is not executable. The student group can also read and write this file. And all other users can read the file but cannot write to it, which means they cannot alter it in any way.

-rw-rw-r-- 1 student student 0 Dec  4 09:47 file09

Figure 18-3.  The long listing of file09

Do you see what is possible here? The file is readable by any user. That means that copying it from the /tmp directory, which is universally accessible, to the student1 home directory by student1 will work so long as the file has the read permission set for Other.

EXPERIMENT 18-4

As the user student, change the permissions on /tmp/file09 to rw-rw---- so that Other does not have permission to read the file:

[student@studentvm1 ~]$ cd /tmp ; ll file*
-rw-rw-r-- 1 student student 13 Dec  6 14:05 file09
[student@studentvm1 tmp]$ chmod 660 file09 ; ll file*
-rw-rw---- 1 student student 13 Dec  6 14:05 file09

Now, as the student1 user, try to read the file:

[student1@studentvm1 ~]$ cat /tmp/file09
cat: /tmp/file09: Permission denied

Even though the file is located in a directory that is accessible by all users, users other than student no longer have access to the file. They cannot view its content, and they cannot copy it.

In Experiment 18-4 we changed the file using the octal representation of the permissions we want, which is the shortest command and so the least amount of typing. How did we get 660 for the permissions? Let's start with the permissions for the User, which is one octal digit.


Each octal digit can be represented by three bits, r, w, and x, with the positional values 4, 2, and 1. So if we want read and write but not execute, that is 110 in binary, which translates to 4+2+0=6. We perform the same operation for the Group ownership. Full read, write, and execute translates to 111 in binary, which becomes 4+2+1=7 in octal. We will discuss file permissions and methods for changing them a bit later in this chapter.
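The arithmetic can be checked with chmod and GNU stat. This sketch uses a scratch file name of my own choosing and stat's %a (octal) and %A (symbolic) format sequences:

```shell
touch /tmp/permdemo
chmod 660 /tmp/permdemo                  # rw-rw---- : (4+2+0, 4+2+0, 0)
mode=$(stat -c '%a %A' /tmp/permdemo)    # octal and symbolic forms of the same mode
echo "$mode"
chmod u=rw,g=rw,o= /tmp/permdemo         # the symbolic-notation equivalent of 660
mode2=$(stat -c '%a' /tmp/permdemo)
rm /tmp/permdemo                         # clean up
```

The two chmod invocations produce the identical mode, which is the point: octal and symbolic notation are two spellings of the same nine permission bits.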

Directory permissions

Directory permissions are not all that different from file permissions:

• The read permission on a directory allows access to list the content of the directory.

• Write allows the users of a class to create, change, and delete files in the directory.

• Execute allows the users of a class to make the directory the present working directory (PWD).

There are two additional permissions, called special mode bits, which are used extensively by the system but are usually functionally invisible to non-root users. These are the setuid and setgid bits. We will use the setgid permission later in this chapter.
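As a preview, the special mode bits occupy a fourth, leading octal digit (4 = setuid, 2 = setgid, 1 = sticky). This hedged sketch marks a scratch directory of my own naming as setgid; the "s" in the group triad of the symbolic mode shows the bit is set:

```shell
mkdir -p /tmp/devdemo
chmod 2770 /tmp/devdemo             # setgid (2) plus rwxrwx--- (770)
mode=$(stat -c '%A' /tmp/devdemo)   # the group "x" position shows "s" when setgid is on
echo "$mode"
rmdir /tmp/devdemo                  # clean up
```

On a setgid directory, new files created inside it inherit the directory's group rather than the creator's primary group, which is what makes the bit useful for shared group directories.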

Implications of Group ownership

We still need a way for users to share files with some other users but not all users. This is where groups can provide an answer. Group ownership in Linux is all about security while still allowing users to share access to files with other users. One of the Unix legacies that Linux has inherited is file ownership and permissions. This is good, but a bit of explanation is in order.

A group is an entity defined in the /etc/group file with a meaningful name, such as "development" or "dev," that lists the user IDs, like "student," of the members of that group. So by setting the group ownership of a file to "development," all members of the development group can access the file based on its Group permissions. Let's see how this is done in Experiment 18-5 and learn a few other things along the way.


EXPERIMENT 18-5

This experiment requires working as different users, including root. We will create a new user to use for testing and a group for developers. We will use the short version, dev, for the group name. We will then create a directory, also called dev, where shared files can be stored, and add two of our now three non-root users to the dev group.

Start as root and create the new user. Again, it is fine to use a short password on your VM for these experiments:

[root@studentvm1 ~]# useradd -c "Student User 2" student2
[root@studentvm1 ~]# passwd student2
Changing password for user student2.
New password: <Enter new password>
BAD PASSWORD: The password is shorter than 8 characters
Retype new password: <Enter new password>
passwd: all authentication tokens updated successfully.

Add the new group. There are some loose standards for group ID numbers, which we will explore in a later chapter, but the bottom line is that we will use GID (Group ID) 5000 for this experiment:

[root@studentvm1 ~]# groupadd -g 5000 dev

We now add two of the existing users, student and student1, to the dev group using the usermod (user modify) utility. The -G option takes a list of the groups to which we are adding the user. In this case the list is only one group long, but we could add a user to more than one group at a time:

[root@studentvm1 ~]# usermod -G 5000 student
[root@studentvm1 ~]# usermod -G 5000 student1

Another option for adding the users to the new group would be to use gpasswd instead of usermod. Either method creates the same result, with both users added to the dev group:

[root@studentvm1 ~]# gpasswd -M student,student1 dev



Look at the /etc/group file. The tail command shows the last ten lines of the data stream:

[root@studentvm1 ~]# tail /etc/group
vboxsf:x:981:
dnsmasq:x:980:
tcpdump:x:72:
student:x:1000:
screen:x:84:
systemd-timesync:x:979:
dictd:x:978:
student1:x:1001:
student2:x:1002:
dev:x:5000:student,student1

As the root user, create the shared directory /home/dev, and set the group ownership to dev and the permissions to 770 (rwxrwx---), which will prevent users who are not members of the dev group from accessing the directory:

[root@studentvm1 ~]# cd /home ; mkdir dev ; ll
total 32
drwxr-xr-x   2 root     root      4096 Dec  9 10:04 dev
drwx------.  2 root     root     16384 Aug 13 16:16 lost+found
drwx------. 22 student  student   4096 Dec  9 09:35 student
drwx------   4 student1 student1  4096 Dec  9 09:26 student1
drwx------   3 student2 student2  4096 Dec  7 12:37 student2
[root@studentvm1 home]# chgrp dev dev ; chmod 770 dev ; ll
total 32
drwxrwx---   2 root     dev       4096 Dec  9 10:04 dev
drwx------.  2 root     root     16384 Aug 13 16:16 lost+found
drwx------. 22 student  student   4096 Dec  9 09:35 student
drwx------   4 student1 student1  4096 Dec  9 09:26 student1
drwx------   3 student2 student2  4096 Dec  7 12:37 student2

As the student user, make /home/dev the PWD:

[student@studentvm1 ~]$ cd /home/dev
-bash: cd: /home/dev: Permission denied

This fails because the new group membership has not been initialized: [student@studentvm1 ~]$ id


uid=1000(student) gid=1000(student) groups=1000(student)

Group memberships are read and set by the shell when it is started in a terminal session or a virtual console. To pick up the change, you need to exit from all of your terminal sessions, log out, log back in, and start new terminal sessions so that the shells initialize the new group settings. After starting a new shell, verify that the new group has been initialized for your user ID.

The key to understanding this is that Linux reads the /etc/group file only when a login shell is started. The GUI desktop is the login shell, and the terminal emulator sessions that you start on the desktop are not login shells. Remote access using SSH is a login shell, as are the virtual consoles. The shells that run in screen sessions are not login shells. Remember the startup sequences we followed in Chapter 17 of this volume? Login shells run a different set of shell configuration scripts during their startup. Refer to Figure 17-1:

[student@studentvm1 ~]$ id
uid=1000(student) gid=1000(student) groups=1000(student),5000(dev)
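If a full logout is inconvenient, a single shell can pick up the change. This is a minimal sketch, assuming the dev group from this experiment already exists in /etc/group:

```shell
# id -nG lists the group names known to the current process; if 'dev'
# is missing, this shell predates the group change.
if id -nG | grep -qw dev; then
    echo "dev is active in this shell"
else
    # newgrp starts a new shell with 'dev' as the effective group without
    # a full logout (you must already be listed for it in /etc/group).
    echo "dev not active yet; try: newgrp dev"
fi
```

Note that newgrp only helps the shell it starts; every other existing shell still needs to be restarted.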

Make /home/dev the PWD and verify that the directory is empty:

[student@studentvm1 ~]$ cd /home/dev ; ll -a
total 8
drwxrwx---  2 root dev  4096 Dec  9 10:04 .
drwxr-xr-x. 7 root root 4096 Dec  9 10:04 ..

As the student user, create a file in the /home/dev directory, change the group ownership to dev, and set the permissions to 660 to prevent Other users from having access to the file:

[student@studentvm1 dev]$ echo "Hello World" > file01 ; chgrp dev file01 ; ll
total 4
-rw-rw-r-- 1 student dev 12 Dec  9 13:09 file01

Now open a new terminal session and switch user to student1:

[student@studentvm1 ~]$ su - student1
Password: <Enter password for student1>
Running /etc/profile
Running /etc/profile.d/myBashConfig.sh
Running /etc/bashrc
Running /etc/bashrc
[student1@studentvm1 ~]$


As the student1 user, make /home/dev the PWD, and add some text to the file: [student1@studentvm1 ~]$ cd ../dev ; echo "Hello to you, too" >> file01 ; cat file01 Hello World Hello to you, too

Now we have a way to share files among users. But there is still one more thing we can do to make it even easier. When we created the file in the shared dev directory, it had the group ID that belonged to the user that created it, and we changed that to the group dev. We can add the setgid (Set Group ID) bit, or SGID, on the directory, which tells Linux to create files in the /home/dev directory with the same GID as the directory. Set the SGID bit using symbolic notation. It can be done with octal mode, but this is easier:

[root@studentvm1 home]# chmod g+s dev ; ll
total 36
drwxrws---   2 root     dev       4096 Dec  9 13:09 dev
drwx------.  2 root     root     16384 Aug 13 16:16 lost+found
drwx------. 22 student  student   4096 Dec  9 15:16 student
drwx------   4 student1 student1  4096 Dec  9 12:56 student1
drwx------   4 student2 student2  4096 Dec  9 13:03 student2
[root@studentvm1 home]#

Notice the lowercase s in the group permissions of the dev directory. A lowercase s means that both the setgid and execute bits are on, while an uppercase S means that the setgid bit is on but the execute bit is off.

For those who want to try this using octal mode: the octal mode settings we usually use consist of three octal digits, from 0 to 7, for the User, Group, and Other sets of permissions. A fourth octal digit can precede these three more common digits; if it is not specified, it is ignored. The SGID bit is octal 2 (010 in binary), so we want the octal permissions to be 2770 on the dev directory. That can be set like this:

[root@studentvm1 home]# ll | grep dev ; chmod 2770 dev ; ll | grep dev
drwxrwx---  2 root dev 4096 Apr  1 13:39 dev
drwxrws---  2 root dev 4096 Apr  1 13:39 dev
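The special-bit behavior can also be tried without root privileges, since the owner of a directory may set its setgid bit. A sketch using a throwaway path rather than /home/dev:

```shell
# Create a scratch directory, set mode 2770 (setgid + rwxrwx---),
# and confirm both the symbolic and octal forms with stat.
d=$(mktemp -d)
mkdir "$d/shared"
chmod 2770 "$d/shared"
stat -c '%A %a' "$d/shared"    # -> drwxrws--- 2770
rm -rf "$d"
```

The group-inheritance effect only becomes visible when the directory's group differs from the creating user's primary group, as it does with /home/dev in the experiment above.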


As both student and student1 users, make /home/dev the PWD, and create some new files. Notice that the files are created with dev as the group owner, so it is not necessary to change it with the chgrp command.

In a terminal session, switch user to student2, and make /home/dev the PWD:

[student2@studentvm1 ~]$ cd /home/dev
-bash: cd: /home/dev: Permission denied
[student2@studentvm1 ~]$

Permission is denied because student2 is not a member of the dev group and the directory permissions do not allow access to the directory by nonmembers. We now have an easy way for the users in the group to share files securely. This could be one group that shares files on a host. Other groups might be accounting, marketing, transportation, test, and so on.

umask

When a user creates a new file, using commands like touch, by redirecting the output of a command to a file, or with an editor like Vim, the permissions of the file are -rw-rw-r--. Why? Because of umask. The umask is a setting that Linux uses to specify the default permissions of all new files. The umask is set in /etc/profile, one of the Bash shell configuration files that we covered in Chapter 17. The umask for root is 0022, and the umask for unprivileged users is 0002.

The tricky element of umask is that it is a form of reverse logic. It does not specify the bits of the file permissions we want to set on; it specifies the ones we want to set off when the file is created. The execute bit is never set on for new files, so the umask setting only applies to the read and write permissions. With a umask of 000, and considering that the execute bit is never set on for new files, the default permissions of a new file would be rw-rw-rw-; but with the umask 2 (write) bit on for Other, the permissions become rw-rw-r-- so that Other users can read the file but not delete or change it. The umask command is used to set the umask value.
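The reverse logic can be expressed as arithmetic: new files start from 666, and the umask bits are switched off. A small sketch (the function name is made up for illustration):

```shell
# Print the default file mode produced by a given umask.
# Files start from 666; directories would start from 777 instead.
default_file_mode() {
    printf '%03o\n' $(( 0666 & ~0$1 ))
}
default_file_mode 022    # -> 644  (rw-r--r--)
default_file_mode 002    # -> 664  (rw-rw-r--)
default_file_mode 006    # -> 660  (rw-rw----)
```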


EXPERIMENT 18-6

This experiment should be performed as the student user. Since we have already seen the permissions on plenty of new files using the default umask, we start by viewing the current umask value:

[student@studentvm1 ~]$ umask
0002

There are four digits there, and the three rightmost ones are User, Group, and Other. What is the first one? Although it is meaningless for Linux files when using this command, the leading digit can be used in some commands to specify the special mode bits, setgid and setuid, as we have just seen. It can be safely ignored when using the umask command. The info setgid command can provide a link to more information about these special mode bits.

Now let's change the umask and run a quick test. There is probably already a file01 in your home directory, so we will create the file umask.test as a test of the new umask:

[student@studentvm1 ~]$ umask 006 ; umask
0006
[student@studentvm1 ~]$ touch umask.test ; ll umask.test
-rw-rw---- 1 student student 0 Apr  2 08:50 umask.test
[student@studentvm1 ~]$

The umask is only set for the shell in which the command is issued. To make it persistent across all new shell sessions and reboots, it would need to be changed in /etc/profile. The new file was created with permissions that do not allow any access for users in the Other class. Set the umask back to 002.

I have never personally encountered a situation in which changing the umask for any of my Linux systems made sense for me, but I know of situations in which it did for some other users. For example, it might make sense to set the umask to 006 to prevent Other users from any access to the file even when it is located in a commonly accessible directory, as we did in Experiment 18-6. It might also make sense to change it before performing operations on many files in a script so that it would not be necessary to perform a chmod on every file.
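The per-shell scope is easy to demonstrate with a subshell; a quick sketch:

```shell
# The umask builtin with no argument prints the current value.
umask 022
( umask 077 ; umask )    # subshell prints 0077 (or 077 in some shells)
umask                    # parent shell still prints 0022
```

A change made inside the subshell never propagates back to the parent, which is why a persistent umask has to live in a startup file such as /etc/profile.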


Changing file permissions You have probably noticed that the methods for setting file and directory permissions are flexible. When setting permissions, there are two basic ways of doing so: symbolic and octal numeric. We have used both in setting permissions, but it is necessary to delve a little further into the chmod command to fully understand its limitations as well as the flexibility it provides.

EXPERIMENT 18-7

Perform this experiment as the student user. Let's first look at setting permissions using numeric notation. Suppose we want to set a single file's permissions to rw-rw-r--. This is simple. Let's use ~/umask.test for this. Verify the current permissions, and then set the new ones:

[student@studentvm1 ~]$ ll umask.test ; chmod 664 umask.test ; ll umask.test
-rw-rw---- 1 student student 0 Apr  2 08:50 umask.test
-rw-rw-r-- 1 student student 0 Apr  2 08:50 umask.test
[student@studentvm1 ~]$

This method of setting permissions ignores any existing permissions. Regardless of what they were before the command, they are now what was specified in the command. There is no means to change only one or some permissions, which may not be what we want if we need to add a single permission to multiple files. In order to test this, we need to create some additional files and set some differing permissions on them. Make ~/testdir the PWD:

[student@studentvm1 ~]$ cd ~/testdir
[student@studentvm1 testdir]$ for I in `seq -w 100` ; do touch file$I ; done

You can list the directory content to verify that the new files all have permissions of rw-rw-r--. If the width of your terminal is 130 columns or more, you can pipe the output like this:

[student@studentvm1 testdir]$ ll | column
total 0                                                 -rw-rw---- 1 student student 0 Dec 12 21:56 file051
-rw-rw---- 1 student student 0 Dec 12 21:56 file001     -rw-rw---- 1 student student 0 Dec 12 21:56 file052
-rw-rw---- 1 student student 0 Dec 12 21:56 file002     -rw-rw---- 1 student student 0 Dec 12 21:56 file053
-rw-rw---- 1 student student 0 Dec 12 21:56 file003     -rw-rw---- 1 student student 0 Dec 12 21:56 file054
-rw-rw---- 1 student student 0 Dec 12 21:56 file004     -rw-rw---- 1 student student 0 Dec 12 21:56 file055
<snip>

We could also do something like this to display just the file names and their permissions, which leaves enough space to format the output data stream into columns:

[student@studentvm1 testdir]$ ll | awk '{print $1" "$9}' | column
total                 -rw-rw---- file026    -rw-rw---- file052    -rw-rw---- file078
-rw-rw---- file001    -rw-rw---- file027    -rw-rw---- file053    -rw-rw---- file079
-rw-rw---- file002    -rw-rw---- file028    -rw-rw---- file054    -rw-rw---- file080
-rw-rw---- file003    -rw-rw---- file029    -rw-rw---- file055    -rw-rw---- file081
<snip>
-rw-rw---- file019    -rw-rw---- file045    -rw-rw---- file071    -rw-rw---- file097
-rw-rw---- file020    -rw-rw---- file046    -rw-rw---- file072    -rw-rw---- file098
-rw-rw---- file021    -rw-rw---- file047    -rw-rw---- file073    -rw-rw---- file099
-rw-rw---- file022    -rw-rw---- file048    -rw-rw---- file074    -rw-rw---- file100
-rw-rw---- file023    -rw-rw---- file049    -rw-rw---- file075
-rw-rw---- file024    -rw-rw---- file050    -rw-rw---- file076
-rw-rw---- file025    -rw-rw---- file051    -rw-rw---- file077
[student@studentvm1 testdir]$

The awk command uses whitespace to determine the fields in the original data stream from the ll command. We then print a list of the fields we want, in this case fields $1 and $9. Then we pipe the result through the column utility to make better use of the terminal width.

Let's change the permissions on some of these files. First we change all of them. Be sure to verify the results after each change:

[student@studentvm1 testdir]$ chmod 760 *

Now let's add read to Other for a subset of the files, and then make a few more changes:

[student@studentvm1 testdir]$ chmod 764 file06* ; ll
[student@studentvm1 testdir]$ chmod 764 file0*3 ; ll
[student@studentvm1 testdir]$ chmod 700 file0[2-5][6-7] ; ll


[student@studentvm1 testdir]$ chmod 640 file0[4-7][2-4] ; ll

There should be several differing sets of permissions. So far we have mostly been using brute force to change all of the permissions on various files filtered by file globbing and sets. This is the best we can do using numeric formats for our changes. Now we become a bit more targeted. Suppose we want to turn on the Group execute bit for files file013, file026, file027, file036, file053, and file092. A file also cannot be executed if the read bit for the Group class is not set, so we need to turn that bit on, too, for these files. Note that some of these files already have some of these bits set, but that is OK; setting them to the same value again does not cause any problems. We also want to ensure that the write bit is off for all of these files so that users in the same group cannot change them. We can do all of this in one command without changing any of the other permissions on these files or any other files:

[student@studentvm1 testdir]$ chmod g+rx,g-w file013 file026 file027 file036 file053 file092
[student@studentvm1 testdir]$ ll | awk '{print $1" "$9}' | column

We have used the symbolic mode to both add and remove permissions from a list of files having a range of existing permissions that we needed to keep unchanged.
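The same targeted approach works on any file. Here is a sketch on a scratch file showing the three symbolic operators: + adds bits, - removes them, and = sets a class exactly, all without disturbing the other classes:

```shell
# Scratch file so nothing in the experiment directories is touched.
f=$(mktemp)
chmod 640 "$f"               # start from rw-r-----
chmod g+x,o=r "$f"           # add group execute; set Other to exactly r--
stat -c '%A' "$f"            # -> -rw-r-xr--
chmod a-x,u=rw,g=,o= "$f"    # strip execute everywhere, then rw-------
stat -c '%A' "$f"            # -> -rw-------
rm -f "$f"
```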

Applying permissions

Permissions can sometimes be tricky. Given a file with ownership of student.student and the permissions --- rw- rw-, would you expect the student user to be able to read this file? You probably would, but permissions do not work like that. The permissions are scanned from left to right, with the first match in the sequence determining access. In this case, the student user attempts to read the file, but the scan of the permissions finds --- for the User of the file. This means that the User has no access to this file.


EXPERIMENT 18-8

As the student user in ~/testdir, change the permissions of file001 to 066, and then try to read it:

[student@studentvm1 testdir]$ chmod 066 file001 ; ll file001 ; cat file001
----rw-rw- 1 student student 0 Dec 12 21:56 file001
cat: file001: Permission denied

Despite the fact that Group and Others have read and write access to the file, the User cannot access it. The user can, however, restore their own access with chmod u+rw. Now, as the student user, make /home/dev the PWD, create a file with a bit of content there, set the permissions to 066, and read the file:

[student@studentvm1 dev]$ echo "Hello World" > testfile-01.txt ; ll ; cat testfile-01.txt
total 4
-rw-rw-r-- 1 student dev 12 Apr  2 09:19 testfile-01.txt
Hello World

Note that the group ownership of this file is dev. Then, as the student1 user, make /home/dev the PWD, and read the file:

[student1@studentvm1 ~]$ cd /home/dev ; cat testfile-01.txt
Hello World

This shows that we can create a file to which the owner has no access but which members of a common group (dev in this case), or anyone else, can read and write.

Timestamps

All files are created with three timestamps: access (atime), modify (mtime), and change (ctime). These three timestamps can be used to determine the last time a file was accessed, its permissions or ownership changed, or its content modified. Note that the time displayed in a long file listing is the mtime, the time that a file or directory was last modified. This time in the listing is truncated to the nearest second, but all of the timestamps are maintained to the nanosecond. We will look at this information in more detail in the “File information” section.
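The three timestamps can be read individually with stat's format strings. A sketch on a scratch file (the exact times will of course differ on your system):

```shell
f=$(mktemp)
echo "data" >> "$f"      # content change: mtime (and ctime) advance
chmod 600 "$f"           # attribute change: only ctime advances
stat -c 'atime: %x' "$f"
stat -c 'mtime: %y' "$f"
stat -c 'ctime: %z' "$f"
rm -f "$f"
```

Whether a plain read advances the atime depends on mount options such as relatime, which defers atime updates for performance.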


File meta-structures

All of these file attributes are stored in the various meta-structures on the hard drive. Each file has a directory entry that points to an inode for the file. The inode contains most of the information pertaining to the file, including the location of the data on the hard drive. We will look in detail at the meta-structures of the EXT4 filesystem, which is the default for many distributions, in Chapter 19 of this volume.

The directory entry

The directory entry is very simple. It resides in a directory such as your home directory and contains the name of the file and the pointer to the inode belonging to the file. This pointer is the inode number.

The inode

The inode is more complex than the directory entry because it contains all of the other metadata pertaining to the file. This metadata includes the User and Group IDs, the timestamps, the access permissions, the type of file such as ASCII text or binary executable, pointers to the data on the hard drive, and more. Each inode in a filesystem (a partition or logical volume) is identified with a unique inode number. We will discuss the inode in more detail later in this chapter because it is a very important part of the EXT filesystem meta-structure.

File information

There are a number of different types of files that you can run into in a Linux environment. Linux has some commands to help you determine a great deal of information about files. Most of the information provided by these tools is stored in the file's inode.


EXPERIMENT 18-9

The file command tells us what type a file is. The following command tells us that the .bash_profile file is an ASCII text file:

[student@studentvm1 ~]$ file .bash_profile
.bash_profile: ASCII text

And the following command tells us that /bin/ls is a compiled executable binary file that is dynamically linked:

[student@studentvm1 ~]$ file /bin/ls
/bin/ls: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically
linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0,
BuildID[sha1]=d6d0ea6be508665f5586e90a30819d090710842f, stripped, too many notes (256)

The strings command extracts all of the text strings from any file including binary executables. Use the following command to view the text strings in the ls executable. You may need to pipe the output through the less filter: [student@studentvm1 ~]$ strings /bin/ls

The strings command produces a lot of output from a binary file like ls. Much of the ASCII plain text is just random text strings that appear in the binary file, but some are actual messages.

The stat command provides a great deal of information about a file. The following command shows the atime, ctime, and mtime, the file size in bytes and blocks, its inode, the number of (hard) links, and more:

[student@studentvm1 ~]$ stat /bin/ls
  File: /bin/ls
  Size: 157896      Blocks: 312        IO Block: 4096   regular file
Device: fd05h/64773d    Inode: 787158      Links: 1
Access: (0755/-rwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2018-12-13 08:17:37.728474461 -0500
Modify: 2018-05-29 12:33:21.000000000 -0400
Change: 2018-08-18 10:35:22.747214543 -0400
 Birth: -


Look at one of the files in ~/testdir that we just changed the permissions for:

[student@studentvm1 testdir]$ stat file013
  File: file013
  Size: 0           Blocks: 0          IO Block: 4096   regular empty file
Device: fd07h/64775d    Inode: 411         Links: 1
Access: (0754/-rwxr-xr--)  Uid: ( 1000/ student)   Gid: ( 1000/ student)
Access: 2018-12-12 21:56:04.645978454 -0500
Modify: 2018-12-12 21:56:04.645978454 -0500
Change: 2018-12-13 09:56:19.276624646 -0500
 Birth: -

This shows that the ctime (Change) records the date and time that the file attributes, such as permissions or other data stored in the inode, were changed. Now let's change the content by adding some text to the file and check the metadata again:

[student@studentvm1 testdir]$ echo "Hello World" > file013 ; stat file013
  File: file013
  Size: 12          Blocks: 8          IO Block: 4096   regular file
Device: fd07h/64775d    Inode: 411         Links: 1
Access: (0754/-rwxr-xr--)  Uid: ( 1000/ student)   Gid: ( 1000/ student)
Access: 2018-12-12 21:56:04.645978454 -0500
Modify: 2018-12-13 12:33:29.544098913 -0500
Change: 2018-12-13 12:33:29.544098913 -0500
 Birth: -

The mtime has changed because the file content was changed. The number of blocks assigned to the file has changed, and because these changes are stored in the inode, the ctime has changed, too. Note that the empty file had 0 data blocks assigned to it, and after adding 12 characters, 8 blocks have been assigned, which is far more than needed. This illustrates that file space on the hard drive is preallocated when the file is created in order to help reduce file fragmentation, which can reduce file access efficiency. Let's read the data in the file and check the metadata one more time:

[student@studentvm1 testdir]$ cat file013 ; stat file013
Hello World
  File: file013
  Size: 12          Blocks: 8          IO Block: 4096   regular file
Device: fd07h/64775d    Inode: 411         Links: 1
Access: (0754/-rwxr-xr--)  Uid: ( 1000/ student)   Gid: ( 1000/ student)
Access: 2018-12-13 12:44:47.425748206 -0500
Modify: 2018-12-13 12:33:29.544098913 -0500
Change: 2018-12-13 12:33:29.544098913 -0500
 Birth: -

First we see the content of the file; then we can see that this access to the file changed the atime. Spend some time exploring the results from other files, including some of the ones in your home directory and in ~/testdir.

Links

Links are an interesting feature of Linux filesystems that can make some tasks easier by providing access to files from multiple locations in the filesystem directory tree without the need for typing long pathnames. There are two types of links: hard and soft. The difference between the two types is significant, but both are used to solve similar problems. Both types of links provide multiple directory entries, that is, references, to a single file, but they do it quite differently. Links are powerful and add flexibility to Linux filesystems.

I have found in the past that some application programs required a particular version of a library. When an upgrade to that library replaced the old version, the program would crash with an error specifying the name of the old library that was missing. Usually the only change in the library name was the version number. Acting on a hunch, I simply added a link to the new library but named the link after the old library name. I tried the program again, and it worked perfectly. And, OK, the program was a game, and everyone knows the lengths that gamers will go to to keep their games running.

In fact, almost all applications are linked to libraries using a generic name with only a major version number in the link name, while the link points to the actual library file that also has a minor version number. In other instances, required files have been moved from one directory to another in order to comply with the Linux Filesystem Hierarchy Standard (FHS), which we will learn about in Chapter 19. In this circumstance, links have been provided in the old directories to provide backward compatibility for those programs that have not yet caught up with the new locations. If you do a long listing of the /lib64 directory, you can find many examples of both. A shortened listing can be seen in Figure 18-4.


lrwxrwxrwx. 1 root root     36 Dec  8  2016 cracklib_dict.hwm -> ../../usr/share/cracklib/pw_dict.hwm
lrwxrwxrwx. 1 root root     36 Dec  8  2016 cracklib_dict.pwd -> ../../usr/share/cracklib/pw_dict.pwd
lrwxrwxrwx. 1 root root     36 Dec  8  2016 cracklib_dict.pwi -> ../../usr/share/cracklib/pw_dict.pwi
lrwxrwxrwx. 1 root root     27 Jun  9  2016 libaccountsservice.so.0 -> libaccountsservice.so.0.0.0
-rwxr-xr-x. 1 root root 288456 Jun  9  2016 libaccountsservice.so.0.0.0
lrwxrwxrwx  1 root root     15 May 17 11:47 libacl.so.1 -> libacl.so.1.1.0
-rwxr-xr-x  1 root root  36472 May 17 11:47 libacl.so.1.1.0
lrwxrwxrwx. 1 root root     15 Feb  4  2016 libaio.so.1 -> libaio.so.1.0.1
-rwxr-xr-x. 1 root root   6224 Feb  4  2016 libaio.so.1.0.0
-rwxr-xr-x. 1 root root   6224 Feb  4  2016 libaio.so.1.0.1
lrwxrwxrwx. 1 root root     30 Jan 16 16:39 libakonadi-calendar.so.4 -> libakonadi-calendar.so.4.14.26
-rwxr-xr-x. 1 root root 816160 Jan 16 16:39 libakonadi-calendar.so.4.14.26
lrwxrwxrwx. 1 root root     29 Jan 16 16:39 libakonadi-contact.so.4 -> libakonadi-contact.so.4.14.26

Figure 18-4.  This very short listing of the /lib64 directory contains many examples of the use of symbolic links

The leftmost character of some of the entries in the long file listing in Figure 18-4 is an “l”, which means that the entry is a soft or symbolic link, but the arrow syntax in the file name section is even more noticeable. So, to select one file as an example, libacl.so.1 is the name of the link, and -> libacl.so.1.1.0 points to the actual file. Short listings using ls do not show any of this. On most modern terminals, links are color coded. This figure does not show hard links, but let's start with hard links as we go deeper.
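The library pattern just described can be imitated with made-up names in a scratch directory; ln -s creates a soft link:

```shell
# All names here are hypothetical, for illustration only.
d=$(mktemp -d)
echo "pretend library" > "$d/libexample.so.1.2.3"   # the "real" file
ln -s libexample.so.1.2.3 "$d/libexample.so.1"      # generic-name link
ls -l "$d"                       # the link line starts with 'l' and shows ->
readlink "$d/libexample.so.1"    # -> libexample.so.1.2.3
cat "$d/libexample.so.1"         # reads through the link to the real file
rm -rf "$d"
```

A program that opens libexample.so.1 never knows it is really reading libexample.so.1.2.3, which is exactly why the trick with the renamed library link worked.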

Hard links

A hard link is a directory entry that points to the inode for a file. Each file has one inode that contains information about that file, including the location of the data belonging to that file. Each inode is referenced by at least one, and sometimes more, directory entries. In Figure 18-5, multiple directory entries point to a single inode. These are all hard links. I have abbreviated the locations of three of the directory entries using the tilde (~) convention for the home directory, so that ~ is equivalent to /home/user in this example.


Note that the fourth directory entry is in a completely different directory, /home/shared, which represents a location for sharing files between users of the computer. Figure 18-5 provides a good illustration of the meta-structures that contain the metadata for a file and provide the operating system with the data needed to access the file for reading and writing.

Figure 18-5.  For hard links, multiple directory entries point to the same inode using the inode number that is unique for the filesystem In figure 18-6 we see from a long listing with the -i option, which lists the inode numbers, which all of these directory entries point to the same inode.


[student@studentvm1 ~]$ ll -i Documents/TextFiles/file.txt ~/tmp/file* /home/shared/file.txt
434 -rw-rw-r-- 4 student student 12 Apr  2 12:32 Documents/TextFiles/file.txt
434 -rw-rw-r-- 4 student student 12 Apr  2 12:32 /home/shared/file.txt
434 -rw-rw-r-- 4 student student 12 Apr  2 12:32 /home/student/tmp/file2.txt
434 -rw-rw-r-- 4 student student 12 Apr  2 12:32 /home/student/tmp/file.txt

Figure 18-6.  A long listing of the files shown in Figure 18-5. The inode number, 434, is the first field. All of these directory entries share the same inode

We will explore this figure in detail in Chapter 19. For now we will learn about links.

EXPERIMENT 18-10

As the student user, make ~/testdir the PWD, and delete all of the files contained there:

[student@studentvm1 testdir]$ cd ~/testdir ; rm -rf * ; ll
total 0

Create a single file with a bit of plain text content, and list the directory contents:

[student@studentvm1 testdir]$ echo "Hello World" > file001 ; ll
total 4
-rw-rw---- 1 student student 12 Dec 13 18:43 file001

Notice the number 1 between the permissions and the user and group owners. This is the number of hard links to this file. Because there is only one directory entry pointing to this file, there is only one link. Use the stat command to verify this:

[student@studentvm1 testdir]$ stat file001
  File: file001
  Size: 12          Blocks: 8          IO Block: 4096   regular file
Device: fd07h/64775d    Inode: 157         Links: 1
Access: (0660/-rw-rw----)  Uid: ( 1000/ student)   Gid: ( 1000/ student)
Access: 2018-12-13 18:43:48.199515467 -0500
Modify: 2018-12-13 18:43:48.199515467 -0500
Change: 2018-12-13 18:43:48.199515467 -0500
 Birth: -


The inode number for this file on my VM is 157, but it will probably be different on your VM. Now create a hard link to this file. The ln utility defaults to creating a hard link:

[student@studentvm1 testdir]$ ln file001 link1 ; ll
total 8
-rw-rw---- 2 student student 12 Dec 13 18:43 file001
-rw-rw---- 2 student student 12 Dec 13 18:43 link1

The link count is now 2 for both directory entries. Display the content of both files, and then stat them both:

[student@studentvm1 testdir]$ cat file001 link1
Hello World
Hello World
[student@studentvm1 testdir]$ stat file001 link1
  File: file001
  Size: 12          Blocks: 8          IO Block: 4096   regular file
Device: fd07h/64775d    Inode: 157         Links: 2
Access: (0660/-rw-rw----)  Uid: ( 1000/ student)   Gid: ( 1000/ student)
Access: 2018-12-13 18:51:27.103658765 -0500
Modify: 2018-12-13 18:43:48.199515467 -0500
Change: 2018-12-13 18:49:35.499380712 -0500
 Birth: -
  File: link1
  Size: 12          Blocks: 8          IO Block: 4096   regular file
Device: fd07h/64775d    Inode: 157         Links: 2
Access: (0660/-rw-rw----)  Uid: ( 1000/ student)   Gid: ( 1000/ student)
Access: 2018-12-13 18:51:27.103658765 -0500
Modify: 2018-12-13 18:43:48.199515467 -0500
Change: 2018-12-13 18:49:35.499380712 -0500
 Birth: -

All of the metadata for both files is identical, including the inode number and the number of links. Create another link in the same directory. It does not matter which existing directory entry we use to create the new link because they both point to the same inode:

[student@studentvm1 testdir]$ ln link1 link2 ; ll
total 12
-rw-rw---- 3 student student 12 Dec 13 18:43 file001
-rw-rw---- 3 student student 12 Dec 13 18:43 link1


-rw-rw---- 3 student student 12 Dec 13 18:43 link2
[student@studentvm1 testdir]$

You should stat all three of these files to verify that the metadata for them is identical. Let's create a link to this inode in your home directory:

[student@studentvm1 testdir]$ ln link1 ~/link3 ; ll ~/link*
-rw-rw---- 4 student student 12 Dec 13 18:43 /home/student/link3

You can see from the listing that we now have 4 hard links to this file. It is possible to view the inode number with the ls -li or ll -i command. The number 157 at the left side of each file listing is the inode number:

[student@studentvm1 testdir]$ ll -i
total 12
157 -rw-rw---- 4 student student 12 Dec 13 18:43 file001
157 -rw-rw---- 4 student student 12 Dec 13 18:43 link1
157 -rw-rw---- 4 student student 12 Dec 13 18:43 link2

Let's create another link from /tmp:

[student@studentvm1 testdir]$ link file001 /tmp/link4
link: cannot create link '/tmp/link4' to 'file001': Invalid cross-device link

This attempt to create a hard link from /tmp to a file in /home fails because these directories are in separate filesystems. Hard links are limited to files contained within a single filesystem. Filesystem is used here in the sense of a partition or logical volume that is mounted on a specified mount point, in this case /home. This is because inode numbers are unique only within each filesystem, and a different filesystem, /var or /opt, for example, may have inodes with the same number as the inode for our file.

Because all of the hard links point to the single inode that contains the metadata about the file, all of these attributes are part of the file, such as ownerships, permissions, and the total number of hard links to the inode, and cannot be different for each hard link. It is one file with one set of attributes. The only attribute that can be different is the file name, which is not contained in the inode. Hard links to a single file/inode that are located in the same directory must have different names, because there can be no duplicate file names within a single directory.
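Within a single filesystem, the link count stored in the inode is easy to watch grow; a sketch in a scratch directory:

```shell
d=$(mktemp -d)
echo "hi" > "$d/a"
stat -c '%h' "$d/a"           # -> 1 (one directory entry)
ln "$d/a" "$d/b"              # same filesystem, so this succeeds
stat -c '%h' "$d/a"           # -> 2 (two entries, one inode)
stat -c '%i' "$d/a" "$d/b"    # both lines print the same inode number
rm -rf "$d"
```

The stat %h format prints the hard link count, and %i prints the inode number, the same fields shown in the long stat output above.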


One of the interesting consequences of hard links is that deleting the file's inode and data requires deleting all of the links. The problem is that it may not be obvious where all of the links are located, and a normal file listing does not make this immediately apparent. So we need a way to search for all of the links to a specific file.

Locating files with several hard links

The find command can locate files with multiple hard links. It can also locate all files with a given inode number, which means we can find all of the hard links to a file.

EXPERIMENT 18-11

As root let’s look for all files with 4 hard links. We could also use +4 or -4 to find all files with more or fewer than 4 hard links, respectively, but we will look for exactly 4:

[root@studentvm1 ~]# find / -type f -links 4
/home/student/link3
/home/student/testdir/link2
/home/student/testdir/file001
/home/student/testdir/link1
/usr/sbin/fsck.ext2
/usr/sbin/mkfs.ext3
/usr/sbin/mke2fs
/usr/sbin/mkfs.ext4
/usr/sbin/e2fsck
/usr/sbin/fsck.ext3
/usr/sbin/mkfs.ext2
/usr/sbin/fsck.ext4
<snip>

This shows the hard links we created in Experiment 18-9, as well as some other interesting files such as the programs for creating filesystems like EXT3 and EXT4. Exploring this a little further, we look for the inode numbers of the mkfs files. The -exec option executes the command that follows. The curly braces, {}, in this command substitute the found file names into the ls -li command so that we get a long listing of just the found files. The -i option displays the inode number. The last part of this command is an escaped semicolon (\;) which is used to terminate the -exec command list. An unescaped semicolon would separate individual commands for the -exec option if there were more:

[root@studentvm1 ~]# find / -type f -name mkfs*[0-9] -links 4 -exec ls -li {} \;
531003 -rwxr-xr-x. 4 root root 133664 May 24 2018 /usr/sbin/mkfs.ext3
531003 -rwxr-xr-x. 4 root root 133664 May 24 2018 /usr/sbin/mkfs.ext4
531003 -rwxr-xr-x. 4 root root 133664 May 24 2018 /usr/sbin/mkfs.ext2

All three of these files have the same inode (531003), so they are really the same file with multiple links. But there are 4 hard links to this file, so let’s find all of them by searching for files with the inode number 531003. Be sure to use the inode number that matches the one for this file on your VM; it will be different from the one shown here:

[root@studentvm1 ~]# find /usr -inum 531003
/usr/sbin/mkfs.ext3
/usr/sbin/mke2fs
/usr/sbin/mkfs.ext4
/usr/sbin/mkfs.ext2

We could also use the -samefile option to accomplish the same thing without knowing the inode number. This option finds both hard and soft links:

[root@studentvm1 ~]# find /usr -samefile /usr/sbin/mkfs.ext3
/usr/sbin/mkfs.ext3
/usr/sbin/mke2fs
/usr/sbin/mkfs.ext4
/usr/sbin/mkfs.ext2

The result shows that the name search we were doing previously would not find the fourth link.

Symbolic (soft) links

In Experiment 18-11 we found experimentally that hard links do not work across filesystem boundaries. Soft links, also known as symbolic links or symlinks, circumvent that problem. A symlink can be used in most of the same places as a hard link, and more. The difference between a hard link and a soft link is that while a hard link points directly to the inode belonging to the file, a soft link points to a directory entry, that is, to one of the hard links. Because soft links point to a hard link for the file and not to the inode, they are not dependent upon the inode number and can work across filesystems, spanning partitions and logical volumes. And, unlike hard links, soft links can point to a directory itself, which is a common use case for soft links.
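For instance, a symlink to a directory behaves like the directory itself for most purposes. The following short sketch demonstrates the idea; the /tmp/linkdemo path and file contents are invented for this example:

```shell
# Create a real directory with a file in it, then a symlink to the directory.
mkdir -p /tmp/linkdemo/realdir
echo "Hello World" > /tmp/linkdemo/realdir/file001
ln -s /tmp/linkdemo/realdir /tmp/linkdemo/dirlink

# The symlink can be traversed just like the directory it points to.
cat /tmp/linkdemo/dirlink/file001
```

Commands such as cd also follow directory symlinks transparently, which is why they are often used to provide stable path names for relocated directories.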


The downside to this is that if the hard link to which the symlink points is deleted or renamed, the symlink is broken. The symbolic link is still there, but it points to a hard link that no longer exists. Fortunately, the ls command highlights broken links, with flashing white text on a red background, in a long listing.
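Color highlighting only helps when you are looking at the right directory. A quicker way to hunt down broken symlinks is find's -xtype test, sketched here with invented paths:

```shell
# Make a symlink, then break it by deleting its target.
mkdir -p /tmp/brokendemo
touch /tmp/brokendemo/target
ln -s /tmp/brokendemo/target /tmp/brokendemo/deadlink
rm /tmp/brokendemo/target

# -xtype l matches symbolic links whose targets no longer resolve.
find /tmp/brokendemo -xtype l
```

Running find with -xtype l over /home or /tmp is a handy periodic check for link rot.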

EXPERIMENT 18-12

As the student user in a terminal session, make the ~/testdir directory the PWD. There are three hard links there, so let’s create a symlink to one of the hard links and then list the directory:

[student@studentvm1 testdir]$ ln -s link1 softlink1 ; ll
total 12
-rw-rw---- 4 student student 12 Dec 13 18:43 file001
-rw-rw---- 4 student student 12 Dec 13 18:43 link1
-rw-rw---- 4 student student 12 Dec 13 18:43 link2
lrwxrwxrwx 1 student student  5 Dec 14 14:57 softlink1 -> link1

The symbolic link is just a file that contains a pointer to the target file to which it is linked. This can be further tested by the following command:

[student@studentvm1 testdir]$ stat softlink1 link1
  File: softlink1 -> link1
  Size: 5          Blocks: 0          IO Block: 4096   symbolic link
Device: fd07h/64775d    Inode: 159    Links: 1
Access: (0777/lrwxrwxrwx)  Uid: ( 1000/ student)   Gid: ( 1000/ student)
Access: 2018-12-14 14:58:00.136339034 -0500
Modify: 2018-12-14 14:57:57.290332274 -0500
Change: 2018-12-14 14:57:57.290332274 -0500
 Birth: -
  File: link1
  Size: 12         Blocks: 8          IO Block: 4096   regular file
Device: fd07h/64775d    Inode: 157    Links: 4
Access: (0660/-rw-rw----)  Uid: ( 1000/ student)   Gid: ( 1000/ student)
Access: 2018-12-14 15:00:36.706711371 -0500
Modify: 2018-12-13 18:43:48.199515467 -0500
Change: 2018-12-13 19:02:05.190248201 -0500
 Birth: -


The first file is the symlink, and the second is the hard link. The symlink has a different set of timestamps, a different inode number, and even a different size than the hard links, which are all still the same because they all point to the same inode. Now we can create a link from /tmp to one of these files and verify the content:

[student@studentvm1 testdir]$ cd /tmp ; ln -s ~/testdir/file001 softlink2 ; ll /tmp
total 92
<snip>
drwx------. 2 root    root    16384 Aug 13 16:16 lost+found
lrwxrwxrwx  1 student student    29 Dec 14 15:18 softlink2 -> /home/student/testdir/file001
<snip>
[student@studentvm1 tmp]$ cat softlink2
Hello World

This enables us to access the file by placing a link to it in /tmp, but, unlike a copy of the file, the current version of the file is always there. Now let’s delete the original file and see what happens:

[student@studentvm1 testdir]$ rm file001 ; ll
total 8
-rw-rw---- 3 student student 12 Dec 13 18:43 link1
-rw-rw---- 3 student student 12 Dec 13 18:43 link2
lrwxrwxrwx 1 student student  5 Dec 14 14:57 softlink1 -> link1
[student@studentvm1 testdir]$ ll /tmp/soft*
lrwxrwxrwx 1 student student 29 Dec 14 15:18 /tmp/softlink2 -> /home/student/testdir/file001

Notice what happens to the soft link. Deleting the hard link to which the soft link points leaves a broken link in /tmp. On my system the broken link is highlighted, and the target hard link is flashing. If the broken link needs to be fixed, you can create another hard link in the same directory with the same name as the old one. If the soft link is no longer needed, it can be deleted with the rm command.


The unlink command can also be used to delete files and links. It is very simple and, unlike the rm command, has no options. Its name more accurately reflects the underlying process of deletion, in that it removes the link, that is, the directory entry, to the file being deleted.
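A minimal illustration of unlink; the file names here are invented for the example:

```shell
# Set up a scratch file and a symlink to it.
mkdir -p /tmp/unlinkdemo
cd /tmp/unlinkdemo
touch afile
ln -s afile alink

unlink alink    # removes only the symlink; the target file survives
unlink afile    # removes the directory entry for the file itself
ls /tmp/unlinkdemo
```

Because unlink takes exactly one operand and no options, it cannot recurse into directories or prompt for confirmation, which makes it a safe choice in scripts.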

Chapter summary

This chapter has explored files, directories, and links in detail. We looked at file and directory ownership and permissions, file timestamps, the Red Hat Private Group concept and its security implications, umask for setting the default permissions on new files, and how to obtain information about files. We also created a directory in which users can easily share files, with enough security to prevent other users from accessing them.

We learned about file metadata, its locations, and the metadata structures like the directory entry and the file inode. We explored hard and soft links, how they differ, how they relate to the metadata structures, and some uses for them.

Don’t forget that permissions and ownership are mostly irrelevant to the root user. The root user can do anything, even if that sometimes takes a bit of hacking such as changing permissions.

Exercises

Complete these exercises to finish this chapter:

1. If the student user, who is a member of the ops group, sets the permissions of file09 in the /tmp or other shared directory to 066 and group ownership to ops, who has what type of access to it and who does not? Explain the logic of this in detail.

2. If the development group uses a shared directory, /home/dev, to share files, what specific permission needs to be set on the dev directory to ensure that files created in that directory are accessible by the entire group without additional intervention?

3. Why are the permissions for your home directory, /home/student, set to 700?


4. For file09 in exercise 1, how can the student user regain access to the file?

5. Why did we set the shared directory permissions to 770 in Experiment 18-5?

6. What would be different if we set the permissions of the shared directory to 774?

7. Given that the directory, ~/test, has ownership of student.student and the file permissions are set to --xrwxrwx (177), which of the following tasks can the student user perform? Listing the content of the directory? Creating and deleting files in the directory? Making the directory the PWD?

8. Create a file in a publicly accessible directory such as /tmp, and give it permissions so that all users except those belonging to the dev group can access it for read and write. Users in the dev group should have no access at all.

9. Create a file as the student user, and set the permissions on the file such that the root user has no access but the student user, who created the file, has full read/write access and other users can read the file.

10. Which type of link is required when linking from one filesystem to another? Why?

11. The umask for the root user is 022. What are the permissions for new files created by root?

12. Why does a hard link not break if one of the links is moved to another directory in the same filesystem? Demonstrate this.

13. Fix the symlink in /tmp that we broke when we deleted file001.


CHAPTER 19

Filesystems

Objectives

In this chapter you will learn

•	Three definitions for the term “filesystem”

•	The meta-structures of the EXT4 filesystem

•	How to obtain information about an EXT4 filesystem

•	To resolve problems that prevent a host from booting due to errors in configuration files

•	To detect and repair filesystem inconsistencies that might result in data loss

•	To describe and use the Linux Filesystem Hierarchical Standard (FHS)

•	To create a new partition and install an EXT4 filesystem on it

•	To configure /etc/fstab to mount a new partition on boot

Overview

Every general purpose computer needs to store data of various types on a hard disk drive (HDD), a solid-state drive (SSD), or some equivalent such as a USB memory stick. There are a couple of reasons for this. First, RAM loses its contents when the computer is switched off, so everything stored in RAM is lost. There are nonvolatile types of RAM that can maintain the data stored there after power is removed, such as the flash RAM that is used in USB memory sticks and solid-state drives (SSDs).

© David Both 2020 D. Both, Using and Administering Linux: Volume 1, https://doi.org/10.1007/978-1-4842-5049-5_19


The second reason that data needs to be stored on hard drives is that even standard RAM is still less expensive than disk space. Both RAM and disk costs have been dropping rapidly, but RAM still leads the way in terms of cost per byte. A quick calculation of the cost per byte, based on costs for 16GB of RAM vs. a 2TB hard drive, shows that the RAM is about 71 times more expensive per unit than the hard drive. A typical cost for RAM is around $0.0000000043743750 per byte as of this writing. For a quick historical note to put present RAM costs in perspective, in the very early days of computing, one type of memory was based on dots on a CRT screen. This was very expensive at about $1.00 per bit!

Definitions

You may hear people talk about the term “filesystems” in a number of different and possibly confusing ways. The word itself can have multiple meanings, and you may have to discern the correct meaning from the context of a discussion or document. So I will attempt to define the various meanings of the word “filesystem” based on how I have observed it being used in different circumstances. Note that while attempting to conform to standard “official” meanings, my intent is to define the term based on its various usages. These meanings will be explored in more detail in the following sections of this chapter:

1. A specific type of data storage format such as EXT3, EXT4, BTRFS, XFS, and so on. Linux supports almost 100 types of filesystems, including some very old ones as well as some of the newest. Each of these filesystem types uses its own metadata structures to define how the data is stored and accessed.

2. The entire Linux hierarchical directory structure starting at the top (/) root directory.

3. A partition or logical volume formatted with a specific type of filesystem that can be mounted on a specified mount point on a Linux filesystem.

This chapter covers all three meanings of the term “filesystem.”


Filesystem functions

Disk storage is a necessity that brings with it some interesting and inescapable details. Disk filesystems are designed to provide space for nonvolatile storage of data, and many other important functions flow from that requirement. A filesystem is all of the following:

1. Data storage: A structured place to store and retrieve data; this is the primary function of any filesystem.

2. Namespace: A naming and organizational methodology that provides rules for naming and structuring data.

3. Security model: A scheme for defining access rights.

4. Application programming interface (API): System function calls to manipulate filesystem objects like directories and files.

5. Implementation: The software to implement the above.

All filesystems need to provide a namespace, that is, a naming and organizational methodology. This defines how a file can be named, specifically the length of a file name and the subset of characters that can be used for file names out of the total set of characters available. It also defines the logical structure of the data on a disk, such as the use of directories for organizing files instead of just lumping them all together in a single, huge data space.

Once the namespace has been defined, a metadata structure is necessary to provide the logical foundation for that namespace. This includes the data structures required to support a hierarchical directory structure; structures to determine which blocks of space on the disk are used and which are available; structures that maintain the names of the files and directories; information about the files such as their size and the times they were created, modified, or last accessed; and the location or locations of the data belonging to the file on the disk. Other metadata is used to store high-level information about the subdivisions of the disk, such as logical volumes and partitions. This higher-level metadata and the structures it represents contain the information describing the filesystem stored on the drive or partition but are separate from and independent of the filesystem metadata.
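Much of this metadata can be inspected directly. As a sketch, the following creates a small file-backed EXT4 filesystem and dumps its superblock metadata; it assumes the mkfs.ext4 and dumpe2fs tools from the e2fsprogs package are installed, and no root privileges are needed because the target is an ordinary file:

```shell
# Create an 8 MB file and format it with an EXT4 filesystem.
# -F forces mke2fs to operate on a regular file instead of a block device.
dd if=/dev/zero of=/tmp/demo.img bs=1M count=8 status=none
mkfs.ext4 -q -F /tmp/demo.img

# -h prints only the superblock: block and inode counts, block size,
# UUID, enabled features, and other filesystem-level metadata.
dumpe2fs -h /tmp/demo.img 2>/dev/null
```

The superblock shown here is exactly the kind of metadata structure described above: it belongs to the filesystem as a whole rather than to any one file.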


Filesystems also require an application programming interface (API) that provides access to system function calls which manipulate filesystem objects like files and directories. APIs provide for tasks such as creating, moving, and deleting files. They also provide functions that determine things like where a file is placed on a filesystem; such functions may account for objectives such as speed or minimizing disk fragmentation.

Modern filesystems also provide a security model, which is a scheme for defining access rights to files and directories. The Linux filesystem security model helps to ensure that users only have access to their own files and not to those of others or of the operating system itself.

The final building block is the software required to implement all of these functions. Linux uses a two-part software implementation as a way to improve both system and programmer efficiency, which is illustrated in Figure 19-1. The first part of this two-part implementation is the Linux virtual filesystem. The virtual filesystem provides a single set of commands for the kernel, and for developers, to access all types of filesystems. The virtual filesystem software calls the specific device driver required to interface to the various types of filesystems. The filesystem-specific device drivers are the second part of the implementation. The device driver interprets the standard set of filesystem commands to ones specific to the type of filesystem on the partition or logical volume.

Figure 19-1.  The Linux two-part filesystem structure


The Linux Filesystem Hierarchical Standard As a usually very organized Virgo, I like things stored in smaller, organized groups rather than in one big bucket. The use of directories helps me to store and then locate the files I want when I want them. Directories are also known as folders because they can be thought of as folders in which files are kept in a sort of physical desktop analogy. In Linux, and many other operating systems, directories can be structured in a tree-like hierarchy. The Linux directory structure is well defined and documented in the Linux Filesystem Hierarchy Standard (FHS).1 This standard has been put in place to ensure that all distributions of Linux are consistent in their directory usage. Such consistency makes writing and maintaining shell and compiled programs easier for SysAdmins because the programs, their configuration files, and their data, if any, should be located in the standard directories.

The standard

The latest Filesystem Hierarchical Standard (3.0)2 is defined in a document maintained by the Linux Foundation.3 The document is available in multiple formats from their web site, as are historical versions of the FHS. I suggest that you set aside some time and at least scan the entire document in order to better understand the roles played by the many subdirectories of these top-level ones.

Figure 19-2 provides a list of the standard, well-known, and defined top-level Linux directories and their purposes. These directories are listed in alphabetical order.

1. Linux Foundation, Linux Filesystem Hierarchical Standard, http://refspecs.linuxfoundation.org/fhs.shtml
2. http://refspecs.linuxfoundation.org/fhs.shtml
3. The Linux Foundation maintains documents defining many Linux standards. It also sponsors the work of Linus Torvalds.


Directory            Part of /  Description

/ (root filesystem)  Yes        The root filesystem is the top-level directory of the directory structure.
/bin                 Yes        The /bin directory contains user executable files.4
/boot                           Contains the static boot loader and kernel executable and configuration files.
/dev                 Yes        This directory contains the device files for every hardware device attached to the system.
/etc                 Yes        Contains a wide variety of system configuration files for the host.
/home                           Home directory storage for user files. Each user has a subdirectory in /home.
/lib                 Yes        Contains shared library files that are required to boot the system.
/media                          A place to mount external removable media devices that may be connected to the host.
/mnt                            A temporary mount point for regular filesystems (as in not removable media).
/opt                            Optional files such as vendor-supplied application programs should be located here.
/proc                Virtual    Virtual filesystem used to expose access to internal kernel information and process information.
/root                Yes        This is not the root (/) filesystem; it is the home directory for the root user.

Figure 19-2.  The top level of the Linux Filesystem Hierarchical Standard

4. Note that /bin and /sbin are now just links to /usr/bin and /usr/sbin, respectively. They are no longer generally split into “essential” and “non-essential” as they used to be.


Directory            Part of /  Description

/sbin                Yes        System binary files. These are executables used for system administration.
/selinux             Virtual    This filesystem is only used when SELinux is enabled.
/sys                 Virtual    This virtual filesystem contains information about the USB and PCI busses and the devices attached to each.
/tmp                            Temporary directory. Used by the operating system and many programs to store temporary files. Users may also store files here temporarily. Note that files stored here may be deleted at any time without prior notice.
/usr                            These are shareable, read-only files including executable binaries and libraries, man[ual] files, and other types of documentation.
/usr/local                      These are typically shell programs or compiled programs and their supporting configuration files that are written locally and used by the SysAdmin or other users of the host.
/var                            Variable data files are stored here. This can include things like log files, MySQL and other database files, web server data files, e-mail inboxes, and much more.
Figure 19-2.  (continued)

The directories shown in Figure 19-2, along with their subdirectories, that have a Yes in column 2 are considered to be an integral part of the root filesystem. That is, they cannot be created as a separate filesystem and mounted at startup time. This is because they, or more specifically their contents, must be present at boot time in order for the system to boot properly.

The /media and /mnt directories are part of the root filesystem, but they should never contain any data. Rather, they are simply temporary mount points. The rest of the directories do not need to be present during the boot sequence but will be mounted later, during the startup sequence that prepares the host to perform useful work.
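You can check how your own distribution lays out these top-level directories; on recent Fedora releases, for example, /bin, /sbin, and /lib are expected to be symlinks into /usr, though the exact results depend on the distribution:

```shell
# A long listing shows which top-level entries are real directories
# and which are symlinks into /usr.
ls -ld /bin /sbin /lib /usr /etc /var
```

Entries whose listing begins with an "l" are symbolic links; the "->" at the end of the line shows where they point.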


Wikipedia also has a good description of the FHS.5 This standard should be followed as closely as possible to ensure operational and functional consistency. Regardless of the filesystem types used on a host, that is, EXT4, XFS, etc., this hierarchical directory structure is the same.

Problem solving

One of the best reasons I can think of for adhering to the Linux FHS is that of making the task of problem solving as easy as possible. Many applications expect things to be in certain places, or they won’t work. Where you store your cat pictures and MP3s doesn’t matter, but where your system configuration files are located does. Using the Linux Filesystem Hierarchical Standard promotes consistency and simplicity, which makes problem solving easier. Knowing where to find things in the Linux filesystem directory structure has saved me from endless flailing about on more than just a few occasions.

I find that most of the core utilities, Linux services, and servers provided with the distributions I use are consistent in their usage of the /etc directory and its subdirectories for configuration files. This means that finding the configuration file for a misbehaving program or service supplied by the distribution should be easy. I typically use a number of the ASCII text files in /etc to configure Sendmail, Apache, DHCP, NFS, NTP, DNS, and more. I always know where to find the files I need to modify for those services, and they are all open and accessible because they are in ASCII text, which makes them readable by both computers and humans.

5. Wikipedia, Filesystem Hierarchy Standard, https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard

Using the filesystem incorrectly

One situation involving the incorrect usage of the filesystem occurred while I was working as a lab administrator at a large technology company. One of our developers had installed an application in the wrong location, /var. The application was crashing because the /var filesystem was full, and the log files, which are stored in /var/log on that filesystem, could not be appended with new messages that would indicate that the /var filesystem was full, due to the lack of space in /var. However, the system remained up and running because the critical / (root) and /tmp filesystems did not fill up. Removing the offending application and reinstalling it in the /opt filesystem, where it was supposed to be, resolved that problem. I also had a little discussion with the developer who did the original installation.

Adhering to the standard

So how do we as SysAdmins adhere to the Linux FHS? It is actually pretty easy, and there is a hint way back in Figure 19-2. The /usr/local directory is where locally created executables and their configuration files should be stored. By local programs, the FHS means those that we create ourselves as SysAdmins to make our work or the work of other users easier. This includes all of those powerful and versatile shell programs we write. Our programs should be located in /usr/local/bin and their configuration files, if any, in /usr/local/etc. There is also a /var/local directory in which the database files for local programs can be stored.

I have written a fair number of shell programs over the years, and it took me at least five years before I understood the appropriate places to install my own software on host computers. In some cases I had even forgotten where I installed them. In other cases, I installed the configuration files in /etc instead of /usr/local/etc, and my file was overwritten during an upgrade. It took a few hours to track that down the first time it happened.

By adhering to these standards when writing shell programs, it is easier for me to remember where I have installed them. It is also easier for other SysAdmins to find things, by searching only the directories in which we as SysAdmins would have installed those programs and their files.
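As a sketch of this practice, the following stages a locally written script the way it would be installed under /usr/local/bin. The script name "mytool" and its content are invented for the example, and a directory under /tmp stands in for /usr/local so the example does not require root privileges:

```shell
# Stage a locally written admin script per the FHS layout.
PREFIX=/tmp/stage/usr/local        # in real use, this would be /usr/local
mkdir -p "$PREFIX/bin" "$PREFIX/etc"

cat > "$PREFIX/bin/mytool" <<'EOF'
#!/usr/bin/env bash
# A trivial locally written admin script.
echo "mytool: all checks passed"
EOF
chmod 755 "$PREFIX/bin/mytool"

# Run the installed script.
"$PREFIX/bin/mytool"
```

With root privileges the same steps against /usr/local/bin put the script on the default PATH of most distributions, and any configuration it needs belongs in /usr/local/etc so upgrades cannot clobber it.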

Linux unified directory structure

The Linux filesystem unifies all physical hard drives and partitions into a single directory structure. It all starts at the top, the root (/) directory. All other directories and their subdirectories are located under the single Linux root directory. This means that there is only one single directory tree in which to search for files and programs.
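You can see this single tree, and which filesystem backs each branch of it, with everyday tools; the output will of course differ on every host:

```shell
# One line per mounted filesystem: device, type, size, and mount point.
df -hT

# -P gives portable one-line-per-filesystem output; the last field of
# the last line is the mount point that backs the queried path.
df -P / | awk 'END {print $NF}'
```

If /home is a separate filesystem, df will show it on its own line with its own device; if not, files under /home are counted against the root filesystem's line.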


This can work only because a filesystem such as /home, /tmp, /var, /opt, or /usr can be created on a separate physical hard drive, a different partition, or a different logical volume from the / (root) filesystem and then be mounted on a mount point (directory) as part of the root filesystem tree. Even removable drives such as a USB thumb drive or an external USB or eSATA hard drive will be mounted onto the root filesystem and become an integral part of that directory tree.

One reason to do this is apparent during an upgrade from one version of a Linux distribution to another, or when changing from one distribution to another. In general, and aside from any upgrade utilities like dnf-upgrade in Fedora, it is wise to occasionally reformat the hard drive(s) containing the operating system during an upgrade to positively remove any cruft that has accumulated over time. If /home is part of the root filesystem, it will be reformatted as well and would then have to be restored from a backup. By having /home as a separate filesystem, the installation program will recognize it as such, and formatting it can be skipped. This can also apply to /var, where database, e-mail inbox, web site, and other variable user and system data are stored.

You can also be intentional about which files reside on which disks. If you have a smaller SSD and a large piece of spinning rust, put the important, frequently accessed files necessary for booting on the SSD. Or your favorite game, or whatever. Similarly, don’t waste SSD space on archival storage of large files that you rarely access.

As another example, a long time ago, when I was not yet aware of the potential issues surrounding having all of the required Linux directories as part of the / (root) filesystem, I managed to fill up my home directory with a large number of very big files.
Since neither the /home directory nor the /tmp directory was a separate filesystem, but simply a subdirectory of the root filesystem, the entire root filesystem filled up. There was no room left for the operating system to create temporary files or to expand existing data files. At first the application programs started complaining that there was no room to save files, and then the OS itself started to act very strangely. Booting to single-user mode and clearing out the offending files in my home directory allowed me to get going again; I then reinstalled Linux using a pretty standard multi-filesystem setup and was able to prevent complete system crashes from occurring again.

I once had a situation where a Linux host continued to run but prevented the user from logging in using the GUI desktop. I was able to log in using the command-line interface (CLI) locally, using one of the virtual consoles, and remotely using SSH. The problem was that the /tmp filesystem had filled up, and some temporary files required by the GUI desktop could not be created at login time. Because the CLI login did not require files to be created in /tmp, the lack of space there did not prevent me from logging in using the CLI. In this case the /tmp directory was a separate filesystem, and there was plenty of space available in the volume group of which the /tmp logical volume was a part. I simply expanded the /tmp logical volume to a size that accommodated my fresh understanding of the amount of temporary file space needed on that host, and the problem was solved. Note that this solution did not require a reboot, and as soon as the /tmp filesystem was enlarged, the user was able to log in to the desktop.
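The commands involved look roughly like the following hedged sketch. The volume group and logical volume names (fedora_studentvm1, tmp) and the size are assumptions for illustration; substitute the names from your own host, and note that these commands require root and an actual LVM setup, so they are shown here only as a transcript:

```
# Check free space in the volume group first.
[root@studentvm1 ~]# vgs fedora_studentvm1

# Grow the /tmp logical volume by 2 GB; the -r option also resizes
# the filesystem it contains, online, with no reboot required.
[root@studentvm1 ~]# lvextend -r -L +2G /dev/fedora_studentvm1/tmp
```

The ability to extend a mounted filesystem in place is one of the major practical advantages of putting directories like /tmp on logical volumes rather than raw partitions.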

Filesystem types

Linux supports reading around 100 partition types; it can create and write to only a few of these. But it is possible, and very common, to mount filesystems of different types on the same root filesystem. In this context, we are talking about filesystems in terms of the structures and metadata required to store and manage the user data on a partition of a hard drive or a logical volume. The complete list of filesystem partition types recognized by the Linux fdisk command is provided in Figure 19-3, so that you can get a feel for the high degree of compatibility that Linux has with very many types of systems.


Figure 19-3.  The list of filesystems supported by Linux

The main purpose in supporting the ability to read so many partition types is to allow for compatibility and at least some interoperability with other filesystems. The choices available when creating a new filesystem with Fedora are shown in the following list:


btrfs      cramfs     ext2       ext3       ext4
fat        gfs2       hfsplus    minix      msdos
ntfs       reiserfs   vfat       xfs

Other Linux distributions support creating different filesystem types. For example, CentOS 6 supports creating only some of the filesystems in the preceding list.

Mounting

The term to “mount” a filesystem in Linux refers back to the early days of computing when a tape or removable disk pack would need to be physically mounted on an appropriate drive device. After being physically placed on the drive, the filesystem on the disk pack would be “mounted” by the operating system to make the contents available for access by the OS, application programs, and users.

A mount point is simply an empty directory, like any other, which is created as part of the root filesystem. So, for example, the home filesystem is mounted on the directory /home. Filesystems can be mounted at mount points on non-root filesystems in the directory tree, but this is less common.

The Linux root filesystem is mounted on the root directory (/) very early in the boot sequence. Other filesystems are mounted later, by the Linux startup programs, either rc under SystemV or systemd in newer Linux releases. Mounting of filesystems during the startup process is managed by the /etc/fstab configuration file. An easy way to remember that is that fstab stands for “filesystem table,” and it is a list of filesystems that are to be mounted, their designated mount points, and any options that might be needed for specific filesystems.

Filesystems are mounted on an existing directory/mount point using the mount command. In general, any directory that is used as a mount point should be empty and not have any other files contained in it. Linux will not prevent users from mounting one filesystem over one that is already there or on a directory that contains files. If you mount a filesystem on an existing directory or filesystem, the original contents will be hidden, and only the content of the newly mounted filesystem will be visible.
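Each /etc/fstab line has six fields: the device (often identified by UUID or an LVM device-mapper path), the mount point, the filesystem type, the mount options, and the dump and fsck-order flags. The entries below are an illustrative sketch only; the UUID and volume names are invented and must be replaced with the ones from your own system:

```
# <device>                                  <mount point> <type> <options> <dump> <fsck>
UUID=a1b2c3d4-0000-0000-0000-000000000000   /             ext4   defaults  1 1
/dev/mapper/fedora_studentvm1-home          /home         ext4   defaults  1 2
/dev/mapper/fedora_studentvm1-swap          swap          swap   defaults  0 0
```

Identifying filesystems by UUID rather than by device name such as /dev/sdb1 protects the fstab from breaking when drives are added or enumerate in a different order.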


The Linux EXT4 filesystem

Although written for Linux, the EXT filesystem has its roots in the Minix operating system and the Minix filesystem, which predate Linux by about five years, having been first released in 1987. When writing the original Linux kernel, Linus Torvalds needed a filesystem and did not want to write one at that point. So he simply included the Minix filesystem6, which had been written by Andrew S. Tanenbaum7 and which was a part of Tanenbaum’s Minix8 operating system. Minix was a Unix-like operating system written for educational purposes. Its code was freely available and was appropriately licensed to allow Torvalds to include it in his first version of Linux.

The original EXT filesystem9 (Extended) was written by Rémy Card10 and released with Linux in 1992 in order to overcome some size limitations of the Minix filesystem. The primary structural changes were to the metadata of the filesystem, which was based on the Unix filesystem, UFS, also known as the Berkeley Fast File System or FFS. The EXT2 filesystem quickly replaced the EXT filesystem; EXT3 and EXT4 followed with additional fixes and features. The current default filesystem for Fedora is EXT4.

The EXT4 filesystem has the following meta-structures:

• A boot sector11 in the first sector of the hard drive on which it is installed. The boot block includes a very small boot record and a partition table that supports up to four primary partitions.

• Some reserved space after the boot sector, spanning the space between the boot record and the first partition on the hard drive, which is usually on the next cylinder boundary. The GRUB212 boot loader uses this space for the majority of its boot code.

6. Wikipedia, Minix Filesystem, https://en.wikipedia.org/wiki/MINIX_file_system
7. Wikipedia, Andrew S. Tanenbaum, https://en.wikipedia.org/wiki/Andrew_S._Tanenbaum
8. Wikipedia, Minix, https://en.wikipedia.org/wiki/MINIX
9. Wikipedia, Extended Filesystem, https://en.wikipedia.org/wiki/Extended_file_system
10. Wikipedia, Rémy Card, https://en.wikipedia.org/wiki/Rémy_Card
11. Wikipedia, Boot sector, https://en.wikipedia.org/wiki/Boot_sector
12. Both, David, Opensource.com, An introduction to the Linux boot and startup processes, https://opensource.com/article/17/2/linux-boot-and-startup


The space in each EXT4 partition is divided into cylinder groups that allow for more granular management of the data space. In my experience, the group size usually amounts to about 8 MB.

Each cylinder group contains:

• A superblock which contains the metadata that defines the other filesystem structures and locates them on the physical disk assigned to the group.

• An inode bitmap block that is used to determine which inodes are used and which are free.

• The inodes which have their own space on the disk. Each inode contains information about one file, including the locations of the data blocks, that is, zones belonging to the file.

• A zone bitmap to keep track of the used and free data zones.

• A journal13 which records in advance the changes that will be performed to the filesystem and which helps to eliminate data loss due to crashes and power failures.

Cylinder groups

The space in each EXT4 filesystem is divided into cylinder groups that allow for more granular management of the data space. In my experience, the group size can vary from about 8 MiB for older systems and software versions, with newer hosts, larger hard drives, and newer versions of the EXT filesystem creating cylinder groups of about 34 MiB. Figure 19-4 shows the basic structure of a cylinder group. The data allocation unit in a cylinder group is the block, which is usually 4K in size.

13. Wikipedia, Journaling file system, https://en.wikipedia.org/wiki/Journaling_file_system


Figure 19-4.  The structure of a cylinder group

The first block in the cylinder group is a superblock which contains the metadata that defines the other filesystem structures and locates them on the physical disk. Some of the additional groups in the partition will have backup superblocks, but not all. A damaged superblock can be replaced by using a disk utility such as dd to copy the contents of a backup superblock to the primary superblock. It does not happen often, but I have experienced a damaged superblock once many years ago, and I was able to restore its contents using that of one of the backup superblocks. Fortunately, I had been foresighted and used the dumpe2fs command to dump the descriptor information of the partitions on my system.

Each cylinder group has two types of bitmaps. The inode bitmap is used to determine which inodes are used and which are free within that group. The inodes have their own space, the inode table, in each group. Each inode contains information about one file, including the locations of the data blocks belonging to the file. The block bitmap keeps track of the used and free data blocks within the filesystem. On very large filesystems, the group data can run to hundreds of pages in length. The group metadata includes a listing of all of the free data blocks in the group.

For both types of bitmaps, one bit represents one specific data zone or one specific inode. If the bit is zero, the zone or inode is free and available for use, while if the bit is one, the data zone or inode is in use.

Let’s take a look at the metadata for the root filesystem of our VMs. The details and values of yours will probably be different from mine.


EXPERIMENT 19-1

Perform this experiment as root. We use the dumpe2fs utility to dump the data from the primary superblock of the root (/) filesystem. You may need to run the output data stream from the dumpe2fs command through the less utility to see it all:

[root@studentvm1 ~]# dumpe2fs -h /dev/mapper/fedora_studentvm1-root
dumpe2fs 1.44.3 (10-July-2018)
Filesystem volume name:   root
Last mounted on:          /
Filesystem UUID:          f146ab03-1469-4db0-8026-d02192eab170
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery extent 64bit flex_bg sparse_super large_file huge_file dir_nlink extra_isize metadata_csum
Filesystem flags:         signed_directory_hash
Default mount options:    user_xattr acl
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              131072
Block count:              524288
Reserved block count:     26214
Free blocks:              491265
Free inodes:              129304
First block:              0
Block size:               4096
Fragment size:            4096
Group descriptor size:    64
Reserved GDT blocks:      255
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8192
Inode blocks per group:   512
Flex block group size:    16
Filesystem created:       Sat Dec 22 11:01:11 2018


Last mount time:          Thu Dec 27 10:54:26 2018
Last write time:          Thu Dec 27 10:54:21 2018
Mount count:              9
Maximum mount count:      -1
Last checked:             Sat Dec 22 11:01:11 2018
Check interval:           0 (<none>)
Lifetime writes:          220 MB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               256
Required extra isize:     32
Desired extra isize:      32
Journal inode:            8
Default directory hash:   half_md4
Directory Hash Seed:      838c2ec7-0945-4614-b7fd-a671d8a40bbd
Journal backup:           inode blocks
Checksum type:            crc32c
Checksum:                 0x2c27afaa
Journal features:         journal_64bit journal_checksum_v3
Journal size:             64M
Journal length:           16384
Journal sequence:         0x000001fa
Journal start:            1
Journal checksum type:    crc32c
Journal checksum:         0x61a70146

There is a lot of information here, and what you see on your VM should be similar. There are some specific data that are of special interest.

The first two entries give the filesystem label and the last mount point. That makes it easy to see that this is the root (/) filesystem. If your /etc/fstab uses UUIDs to mount one or more partitions, such as /boot, this is that UUID as it is stored in the filesystem’s primary superblock.

The current filesystem state is “clean,” which means that all of the data has been written from buffers and the journal to the data space and the filesystem is consistent. If the filesystem were not clean, then not all of the data has yet been written to the data area of the hard drive. Note that this and some other data in the superblock may not be current if the filesystem is mounted.


This also tells us that the filesystem type is “Linux” which is type 83 as shown in Figure 19-3. This is a non-LVM partition. Type 8e would be a Linux LVM partition.

You can also see the inode and block counts which tell us how many files and how much total data can be stored on this filesystem. Since each file uses one inode, this filesystem can hold 131,072 files. Along with the block size of 4,096 bytes, the total block count gives 2,147,483,648 total bytes of storage with 107,372,544 bytes in reserved blocks. When a data block is found by various error detection mechanisms to have errors, the data is moved to one of the reserved blocks, and the regular data block is marked as defective and unavailable for future data storage. The number of free blocks tells us that 2,012,221,440 bytes are free and available.

The directory hash and hash seed are used by the HTree14 directory tree structure implementation to hash directory entries so that they can be easily found during file seek operations. Much of the rest of the superblock information is relatively easy to extract and understand. The man page for EXT4 has some additional information about the filesystem features listed near the top of this output.
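These capacity figures follow directly from the superblock values. A quick check, using the block size, block counts, and inode count reported by dumpe2fs above:

```python
# Recompute the filesystem capacity figures from the dumpe2fs superblock
# values shown in Experiment 19-1.
block_size = 4096
block_count = 524_288
reserved_blocks = 26_214
free_blocks = 491_265
inode_count = 131_072

total_bytes = block_count * block_size
reserved_bytes = reserved_blocks * block_size
free_bytes = free_blocks * block_size

print(f"Maximum files (one inode each): {inode_count:,}")
print(f"Total storage:   {total_bytes:,} bytes")    # 2,147,483,648
print(f"Reserved blocks: {reserved_bytes:,} bytes") # 107,372,544
print(f"Free space:      {free_bytes:,} bytes")     # 2,012,221,440
```

Plug in the values from your own VM's superblock; the arithmetic is the same for any EXT filesystem.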
Now use the following command to view both the superblock and the group data for this partition:

[root@studentvm1 ~]# dumpe2fs /dev/mapper/fedora_studentvm1-root | less
<snip>
Group 0: (Blocks 0-32767) csum 0x6014 [ITABLE_ZEROED]
  Primary superblock at 0, Group descriptors at 1-1
  Reserved GDT blocks at 2-256
  Block bitmap at 257 (+257), csum 0xa86c6430
  Inode bitmap at 273 (+273), csum 0x273ddfbb
  Inode table at 289-800 (+289)
  23898 free blocks, 6438 free inodes, 357 directories, 6432 unused inodes
  Free blocks: 8870-32767
  Free inodes: 598, 608, 1661, 1678, 1683, 1758, 1761-8192
Group 1: (Blocks 32768-65535) csum 0xa5fe [ITABLE_ZEROED]
  Backup superblock at 32768, Group descriptors at 32769-32769
  Reserved GDT blocks at 32770-33024
  Block bitmap at 258 (bg #0 + 258), csum 0x21a5f734

14. This Wikipedia entry needs a lot of work but can give you a slightly more accurate description of HTree. https://en.wikipedia.org/wiki/HTree


  Inode bitmap at 274 (bg #0 + 274), csum 0x951a9172
  Inode table at 801-1312 (bg #0 + 801)
  28068 free blocks, 8190 free inodes, 2 directories, 8190 unused inodes
  Free blocks: 33039, 33056-33059, 33067, 33405, 33485, 33880-33895, 34240-34255, 34317-34318, 34374-34375, 34398-34415, 34426-34427, 34432-34447, 34464-34479, 34504-34507, 34534-34543, 34546-34681, 34688-34820, 34822-36071, 36304-36351, 36496-36529, 36532-36546, 36558-36575, 36594-36697, 36704, 36706-36708, 36730, 36742, 36793, 36804-36807, 36837, 36840, 36844-37889, 37895-38771, 38776-38779, 38839-38845, 38849-38851, 38855, 38867, 38878, 38881-38882, 38886, 38906-38910, 38937, 38940-38941, 38947, 38960-39423, 39440-39471, 39473, 39483-39935, 39938-39939, 39942-39951, 39954-39955, 39957-39959, 39964-40447, 40454-40965, 40971-41472, 41474-45055, 47325-47615, 47618-47620, 47622-65535
  Free inodes: 8195-16384
Group 2: (Blocks 65536-98303) csum 0x064f [ITABLE_ZEROED]
  Block bitmap at 259 (bg #0 + 259), csum 0x2737c1ef
  Inode bitmap at 275 (bg #0 + 275), csum 0x951a9172
  Inode table at 1313-1824 (bg #0 + 1313)
  30727 free blocks, 8190 free inodes, 2 directories, 8190 unused inodes
  Free blocks: 67577-98303
  Free inodes: 16387-24576
<snip>

I have pruned the output from this command to show data for the first three groups. Each group has its own block and inode bitmaps and an inode table. The listing of free blocks in each group enables the filesystem to easily locate free space in which to store new files or to add to existing ones. If you compare the block number range for the entire group against the free blocks, you will see that the file data is spread through the groups rather than being jammed together starting from the beginning. We will see more about this later in this chapter in the section, “Data allocation strategies.” Group 2 in the preceding output has no data stored in it because all of the data blocks assigned to this group are free. If you scroll down toward the end of the data for this filesystem, you will see that the remaining groups have no data stored in them either.
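The block and inode bitmaps shown in this group data are conceptually simple: one bit per block or inode, with 1 meaning in use and 0 meaning free. The following toy model (the class name and method names are illustrative, not kernel code) mimics how a block bitmap answers the allocator's basic questions:

```python
# Toy model of an EXT-style allocation bitmap: one bit per data block,
# 1 = in use, 0 = free. Illustrative only, not kernel code.
class BlockBitmap:
    def __init__(self, nblocks):
        self.nblocks = nblocks
        self.bits = bytearray((nblocks + 7) // 8)  # packed, 8 blocks per byte

    def set_used(self, n):
        self.bits[n // 8] |= 1 << (n % 8)

    def is_used(self, n):
        return bool(self.bits[n // 8] & (1 << (n % 8)))

    def first_free(self):
        # Scan for the first 0 bit, as an allocator must when placing data
        for n in range(self.nblocks):
            if not self.is_used(n):
                return n
        return None  # no free blocks left

bm = BlockBitmap(32768)      # one group: 32,768 blocks, as in the output above
for n in range(257):         # mark blocks 0-256 (metadata) as used
    bm.set_used(n)
print(bm.first_free())       # -> 257, the first free data block
```

A 32,768-block group needs only 4,096 bytes of bitmap, which is why one 4K block per group suffices for each bitmap.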


The inode

What is an inode? Short for index node, an inode is one 256-byte block on the disk that stores data about a file. This includes the size of the file; the user IDs of the user and group owners of the file; the file mode, that is, the access permissions; and three timestamps specifying the time and date that the file was last accessed and modified and that the data in the inode itself was last modified.

The inode has been mentioned previously as a key component of the metadata of the Linux EXT filesystems. Figure 19-5 shows the relationship between the inode and the data stored on the hard drive. This diagram is the directory and inode for a single file which, in this case, is highly fragmented. The EXT filesystems work actively to reduce fragmentation, so it is very unlikely you will ever see a file with this many indirect data blocks or extents. In fact, fragmentation is extremely low in EXT filesystems, so most inodes will use only one or two direct data pointers and none of the indirect pointers.

Figure 19-5.  The inode stores information about each file and enables the EXT filesystem to locate all data belonging to it


The inode does not contain the name of the file. Access to a file is via the directory entry, which itself is the name of the file and which contains a pointer to the inode. The value of that pointer is the inode number. Each inode in a filesystem has a unique ID number, but inodes in other filesystems on the same computer and even hard drive can have the same inode number. This has implications for links that were discussed in Chapter 18.

For files that have significant fragmentation, it becomes necessary to have some additional capabilities in the form of indirect nodes. Technically these are not really inodes, so I use the name node here for convenience. An indirect node is a normal data block in the filesystem that is used only for describing data and not for storage of metadata. Thus more than 15 entries can be supported. For example, a block size of 4K can hold 1,024 4-byte indirect pointers, thus allowing 12 (direct) + 1,024 (indirect) = 1,036 data blocks for a single file. Double and triple indirect node support is also available, but files requiring that many data blocks are unlikely to be encountered in most environments.

In Minix and the EXT1-3 filesystems, the pointers to the data are in the form of a list of data zones or blocks. For EXT4, the inode lists the extents that belong to the file. An extent is a list of contiguous data blocks that belong to a file. Files may be composed of more than one extent. The only limit on the number of data blocks in a single extent is the total size of a cylinder group. Practically, the limit is the amount of contiguous free space available in a group at the time the file is created.
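The reach of the classic pointer scheme is easy to compute. This sketch assumes the traditional EXT2/3 layout of 4K blocks and 4-byte block pointers (which, as noted above, EXT4 replaces with extents):

```python
# How many data blocks the classic EXT2/3 inode pointer scheme can address,
# assuming 4K blocks and 4-byte block pointers (1,024 pointers per block).
block_size = 4096
ptrs_per_block = block_size // 4   # 1,024 pointers fit in one 4K block

direct = 12                        # direct pointers in the inode itself
single = ptrs_per_block            # single indirect: 1,024
double = ptrs_per_block ** 2       # double indirect: 1,048,576
triple = ptrs_per_block ** 3       # triple indirect: 1,073,741,824

total_blocks = direct + single + double + triple
max_file_bytes = total_blocks * block_size   # about 4 TiB

print(f"Addressable blocks: {total_blocks:,}")
print(f"Theoretical max file size: {max_file_bytes:,} bytes")
```

The huge jump from single to double and triple indirection shows why even very large files rarely need more than the first level or two.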

Journal

The journal, introduced in the EXT3 filesystem, had the singular objective of overcoming the massive amounts of time that the fsck program required to fully recover a disk structure damaged by an improper shutdown that occurred during a file update operation. The only structural addition to the EXT filesystem to accomplish this was the journal15 which records in advance the changes that will be performed to the filesystem.

Instead of writing data to the disk data areas directly, the journal provides for writing of file data to a specified area on the disk along with its metadata. Once the data is safe on the hard drive, it can be merged in or appended to the target file with almost zero chance of losing data. As this data is committed to the data area of the disk, the journal is updated so that the filesystem will still be in a consistent state in the event of a system

15. Wikipedia, Journaling File System, https://en.wikipedia.org/wiki/Journaling_file_system


failure before all of the data in the journal is committed. On the next boot, the filesystem will be checked for inconsistencies, and data remaining in the journal will then be committed to the data areas of the disk to complete the updates to the target file.

Journaling does impact data write performance; however, there are three options available for the journal that allow the user to choose between performance and data integrity and safety. The EXT4 man page has a description of these settings:

• Journal: Both metadata and file contents are written to the journal before being committed to the main filesystem. This offers the greatest reliability with a performance penalty because the data is written twice.

• Writeback: The metadata is written to the journal, but the file contents are not. This is a faster option but subject to possible out-of-order writes in which files being appended to during a crash may gain a tail of garbage on the next mount.

• Ordered: This option is a bit like writeback, but it forces file contents to be written before associated metadata is marked as committed in the journal. It is an acceptable compromise between reliability and performance and is the default for new EXT3 and EXT4 filesystems.
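These modes are selected with the data= mount option in the options column of /etc/fstab. As a sketch only, using our VM's root entry with writeback purely as an example, not a recommendation:

```
/dev/mapper/fedora_studentvm1-root  /  ext4  defaults,data=writeback  1 1
```

For the root filesystem the same option is typically passed on the kernel command line instead, since root is mounted before fstab is processed.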

My personal preference is the middle ground because my environments do not require heavy disk write activity, so performance should not normally be an issue. I go with the default, which provides reliability with a bit of a performance hit. This choice can be set in /etc/fstab as a mount option or as a boot parameter by passing the option to the kernel by editing the GRUB2 kernel options line.

The journaling function reduces the time required to check the hard drive for inconsistencies after a failure from hours or even days to mere minutes at the most. Of course these times may vary significantly depending upon many factors, especially the size and type of drives. I have had many issues over the years that have crashed my systems. The details could fill another chapter, but suffice it to say that most were self-inflicted, like kicking out a power plug. Fortunately, the EXT journaling filesystems have reduced that boot up recovery time to two or three minutes. In addition, I have never had a problem with lost data since I started using EXT3 with journaling.

The journaling feature of EXT4 may be turned off, and it then functions as an EXT2 filesystem. The journal itself still exists, empty and unused. Simply remount the partition with the mount command using the type parameter to specify EXT2. You may be able to do this from the command line, depending upon which filesystem you are working with,


but you can change the type specifier in the /etc/fstab file and then reboot. I strongly recommend against mounting an EXT3 or EXT4 filesystem as EXT2 because of the additional potential for lost data and extended recovery times.

An existing EXT2 filesystem can be upgraded with the addition of a journal using the following command, where /dev/sda1 is the drive and partition identifier. Be sure to change the file type specifier in /etc/fstab and remount the partition to have the change take effect:

tune2fs -j /dev/sda1

This should seldom be necessary because the EXT2 filesystem was superseded by EXT3 with a journal in 2001.16

Data allocation strategies

The EXT filesystem implements several data allocation strategies that ensure minimal file fragmentation. Reducing fragmentation results in improved filesystem performance.

Data allocation for the EXT4 filesystem is managed using extents. An extent is described by its starting and ending place on the hard drive. This makes it possible to describe very long, physically contiguous files in a single inode pointer entry, which can significantly reduce the number of pointers required to describe the location of all the data in larger files.

Other allocation strategies have been implemented in EXT4 to further reduce fragmentation. EXT4 reduces fragmentation by scattering newly created files across the disk so that they are not bunched up in one location at the beginning of the disk, as many early PC filesystems such as FAT did. The file allocation algorithms attempt to spread the files as evenly as possible among the cylinder groups and, when fragmentation is necessary, to keep the discontinuous file extents close to the others belonging to the same file to minimize head seek and rotational latency as much as possible.

Additional strategies are used to preallocate extra disk space when a new file is created or when an existing file is extended. This helps to ensure that extending the file will not automatically result in its becoming fragmented. New files are never allocated immediately following the end of existing files, which also reduces or prevents fragmentation of the existing files.

16. Wikipedia, EXT3, https://en.wikipedia.org/wiki/Ext3


Aside from the actual location of the data on the disk, EXT4 uses functional strategies such as delayed allocation to allow the filesystem to collect all of the data being written to the disk before allocating the space to it. This can improve the likelihood that the allocated data space will be contiguous.

Data fragmentation

For many older PC filesystems such as FAT and all its variants and NTFS, fragmentation has been a significant problem resulting in degraded hard drive performance. Defragmentation became an industry in itself, with different brands of defragmentation software that ranged from very effective to only marginally so.

Hard drives use magnetic disks that rotate at high speed and moving heads to position the data read/write transducers over the correct track. It is this wait for the heads to seek to a specific track, and then the wait for the desired data block to be read by the read/write heads, that causes the delays when files are fragmented. Although SSD drives can experience file fragmentation, there is no performance penalty because, like all solid-state memory, even though SSDs emulate a hard drive, they do not have the spinning platters and moving heads of a traditional hard drive.

Linux’s Extended filesystems use data allocation strategies that help to minimize fragmentation of files on the hard drive and reduce the effects of fragmentation when it does occur. You can use the fsck command on EXT filesystems to check the total filesystem fragmentation. The following example is to check the home directory of my main workstation, which was only 1.5% fragmented. Jason, my diligent technical reviewer, reports 1.2% fragmentation on his home desktop workstation:

fsck -fn /dev/mapper/vg_01-home

Let’s see how fragmented our VM home directories are.

EXPERIMENT 19-2

Let’s look at the amount of file fragmentation on the hard drive of your VM. Perform this experiment as root.

The fsck (filesystem check) command is usually used to repair filesystems after a crash or other incident which may make them inconsistent. It can also be used to report on fragmentation. The -f option forces checking of the filesystem even if it is marked as clean, and


the -n option tells fsck to not fix problems it finds. This results in a report, hopefully short, of the current state of the filesystem:

[root@studentvm1 ~]# fsck -fn /dev/mapper/fedora_studentvm1-home
fsck from util-linux 2.32.1
e2fsck 1.44.3 (10-July-2018)
Warning!  /dev/mapper/fedora_studentvm1-home is mounted.
Warning: skipping journal recovery because doing a read-only filesystem check.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
home: 289/131072 files (0.0% non-contiguous), 26578/524288 blocks
[root@studentvm1 ~]#

Some problems may occasionally be reported, such as inconsistencies in the inode or data block counts. This can occur during normal operation on a virtual hard drive just as it can on a physical one. I have on occasion simply powered off the VM without a proper shutdown. It is unlikely that you will have errors like this.

For now, look at the last line of the output from fsck. This shows that there are 0.0% noncontiguous blocks, which implies that there is 0% fragmentation. Jason reported 1.9% fragmentation on his StudentVM1 host. That may not be exactly true because the actual number may be very small and not within the granularity of a single decimal place. From a practical standpoint, 0.0% is essentially zero fragmentation.

The other numbers on this line are rather obscure. After reading the man page for fsck and many online searches, I have found that these numbers are not explicitly defined. I think that the first pair means that 289 inodes from a total of 131,072 have been used. This would mean that there are 289 files and directories; directories are just files with directory entries contained in them. Cross-checking with the output of dumpe2fs in Experiment 19-1, the number 131,072 is correct for the total number of inodes, and the difference between that and the free inode count is the number of inodes in use. The total block count of 524,288 also matches up, as does the relationship between that and the free blocks, so we can conclude that my initial assumptions were correct. Check all of these numbers on your own VM to verify that they are correct.


I once performed some theoretical calculations to determine whether disk defragmentation might result in any noticeable performance improvement. While I did make some assumptions, the disk performance data I used were from a then new 300GB Western Digital hard drive with a 2.0ms track to track seek time. The number of files in this example was the actual number that existed in the filesystem on the day I did the calculation. I did assume that a fairly large amount of the fragmented files would be touched each day, 20%.

Total files                                        271,794
% Fragmentation                                    5.00%
Discontinuities                                    13,590
% fragmented files touched per day (assumption)    20%
Number of additional seeks                         2,718
Average seek time                                  10.90 ms
Total additional seek time per day                 29.63 sec
Track to track seek time                           2.00 ms
Total additional seek time per day                 5.44 sec

Figure 19-6.  The theoretical effects of fragmentation on disk performance

I have done two calculations for the total additional seek time per day, one based on the track to track seek time, which is the more likely scenario for most files due to the EXT file allocation strategies, and one for the average seek time which I assumed would make a fair worst-case scenario.
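The figures in Figure 19-6 can be reproduced with a few lines of arithmetic; only the 20% touched-per-day figure is an assumption, as noted above:

```python
# Recompute the theoretical seek-time penalty shown in Figure 19-6.
total_files = 271_794
fragmentation = 0.05            # 5% of files fragmented
touched_per_day = 0.20          # assumption: 20% of fragmented files touched

discontinuities = round(total_files * fragmentation)      # 13,590
extra_seeks = round(discontinuities * touched_per_day)    # 2,718

avg_seek_ms = 10.90             # average seek time
t2t_seek_ms = 2.00              # track to track seek time

worst_case_s = extra_seeks * avg_seek_ms / 1000           # 29.63 sec/day
likely_case_s = extra_seeks * t2t_seek_ms / 1000          # 5.44 sec/day

print(f"Additional seeks per day: {extra_seeks:,}")
print(f"Worst case (average seek):    {worst_case_s:.2f} seconds per day")
print(f"Likely case (track to track): {likely_case_s:.2f} seconds per day")
```

Substitute the file count, fragmentation percentage, and seek times from your own environment to build the equivalent of your own spreadsheet.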


You can see from Figure 19-6 that the impact of fragmentation on a modern EXT filesystem with a hard drive of even modest performance would be minimal and negligible for the vast majority of applications. You can plug the numbers from your environment into your own similar spreadsheet to see what you might expect in the way of performance impact. This type of calculation most likely will not represent actual performance, but it can provide a bit of insight into fragmentation and its theoretical impact on a system.

Jason reports noticeable impact from fragmentation with very large files that are very near continually accessed, usually databases or datastores, for which the application itself is also reading nonsequentially, meaning there was enough jumping around to begin with that disk I/O was already a limiting factor.

Most of the partitions on my primary workstation are around 1.5% or 1.6% fragmented; I do have one 128GB filesystem on a logical volume (LV) that is 3.3% fragmented. That is a filesystem with fewer than 100 very large ISO image files, and I have had to expand the LV several times over the years as it got too full. This resulted in more fragmentation than if I had been able to allocate a larger amount of space to the LV in the beginning.

Some application environments require greater assurance of even less fragmentation. The EXT filesystem can be tuned with care by a knowledgeable admin who can adjust the parameters to compensate for specific workload types. This can be done when the filesystem is created or later using the tune2fs command. The results of each tuning change should be tested, meticulously recorded, and analyzed to ensure optimum performance for the target environment. In the worst case where performance cannot be improved to desired levels, other filesystem types are available that may be more suitable for a particular workload. And remember that it is common to mix filesystem types on a single host system to match the load placed on each filesystem.
Due to the low amount of fragmentation on most EXT filesystems, it is not necessary to defragment. In any event, there is no safe defragmentation tool for EXT filesystems. There are a few tools that allow you to check the fragmentation of an individual file or the fragmentation of the remaining free space in a filesystem. There is one tool, e4defrag, which will defragment a single file, directory, or filesystem as much as the remaining free space will allow. As its name implies, it only works on files in an EXT4 filesystem, and it does have some limitations.


EXPERIMENT 19-3

Perform this experiment as root. Run the following command to check the fragmentation status of the filesystem:

[root@studentvm1 ~]# e4defrag -c /dev/mapper/fedora_studentvm1-home
e4defrag 1.44.3 (10-July-2018)
                                            now/best      size/ext
1. /home/student/dmesg2.txt                   1/1            44 KB
2. /home/student/.xsession-errors             1/1             4 KB
3. /home/student/dmesg3.txt                   1/1            44 KB
4. /home/student/.bash_history                1/1             4 KB
5. /home/student/.ssh/authorized_keys         1/1             4 KB

 Total/best extents                           87/85
 Average size per extent                      17 KB
 Fragmentation score                          4
 [0-30 no problem: 31-55 a little bit fragmented: 56- needs defrag]
 This device (/dev/mapper/fedora_studentvm1-home) does not need defragmentation.
 Done.

This output shows a list of fragmented files, a score, and information about how to interpret that score. It also contains a recommendation about whether to defrag or not. It is not clear why these files are shown as fragmented because they each only have a single extent so are 100% contiguous by definition. Let’s just defrag one of these files to see what that would look like. Choose a file with the most fragmentation for your test:

[root@studentvm1 ~]# e4defrag -v /home/student/dmesg2.txt
e4defrag 1.44.3 (10-July-2018)
ext4 defragmentation for /home/student/dmesg2.txt
[1/1]/home/student/dmesg2.txt:  100%  extents: 1 -> 1  [ OK ]
 Success:  [1/1]

Read the man page for e4defrag for more information on its limitations.


There are no safe tools for defragmenting EXT1, 2, and 3 filesystems. And, according to its own man page, the e4defrag utility is not guaranteed to perform complete defragmentation. It may be able to “reduce” file fragmentation. Based on the inconsistency in its report shown in Experiment 19-3, I am disinclined to use it, and, in any event, there is seldom any necessity to do so.

If it does become necessary to perform a complete defragmentation on an EXT filesystem, there is only one method that will work reliably. You must move all of the files from the filesystem to be defragmented, ensuring that they are deleted after being safely copied to another location. If possible, you could then increase the size of the filesystem to help reduce future fragmentation. Then copy the files back onto the target filesystem. Even this does not guarantee that all of the files will be completely defragmented.

Repairing problems

We can repair problems that cause the host not to boot, such as a misconfigured /etc/fstab file, but in order to do so, the filesystem on which the configuration file being repaired resides must be mounted. That presents a problem if the filesystem in question cannot be mounted during Linux startup. This means that the host must be booted into recovery mode to perform the repairs.

The /etc/fstab file

How does Linux know where to mount the filesystems on the directory tree? The /etc/fstab file defines the filesystems and the mount points on which they are to be mounted. Since I have already mentioned /etc/fstab as a potential problem, let's look at it to see what it does. Then we will break it in order to see how to fix it. Figure 19-7 shows the /etc/fstab from our VM, StudentVM1. Your fstab should look almost identical to this one with the exception of the value of the UUID for the boot partition. The function of fstab is to specify the filesystems that should be mounted during startup and the mount points on which they are to be mounted, along with any options that might be necessary. Each filesystem has at least one attribute that we can use in /etc/fstab to identify it to the startup process. Each of the filesystem line entries in this simple fstab contains six columns of data.
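Those six columns are plain whitespace-separated text, so they are easy to examine with awk. A short sketch that parses a sample fstab-format file (the temp file and its entries are illustrative; point the same awk at /etc/fstab on a live system):

```shell
# Print "device -> mount point (type)" for every real entry in an
# fstab-format file, skipping comment and blank lines.
fstab=$(mktemp)
cat > "$fstab" <<'EOF'
# comment line
/dev/mapper/fedora_studentvm1-root        /     ext4 defaults 1 1
UUID=9948ca04-c03c-4a4a-a9ca-a688801555c3 /boot ext4 defaults 1 2
/dev/mapper/fedora_studentvm1-swap        swap  swap defaults 0 0
EOF

awk '!/^[[:space:]]*(#|$)/ { printf "%s -> %s (%s)\n", $1, $2, $3 }' "$fstab"
```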


#
# /etc/fstab
# Created by anaconda on Sat Dec 22 11:05:37 2018
#
# Accessible filesystems, by reference, are maintained under '/dev/disk/'.
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info.
#
# After editing this file, run 'systemctl daemon-reload' to update systemd
# units generated from this file.
#
/dev/mapper/fedora_studentvm1-root          /      ext4  defaults  1 1
UUID=9948ca04-c03c-4a4a-a9ca-a688801555c3   /boot  ext4  defaults  1 2
/dev/mapper/fedora_studentvm1-home          /home  ext4  defaults  1 2
/dev/mapper/fedora_studentvm1-tmp           /tmp   ext4  defaults  1 2
/dev/mapper/fedora_studentvm1-usr           /usr   ext4  defaults  1 2
/dev/mapper/fedora_studentvm1-var           /var   ext4  defaults  1 2
/dev/mapper/fedora_studentvm1-swap          swap   swap  defaults  0 0

Figure 19-7.  The filesystem table (fstab) for StudentVM1

The first column is an identifier that identifies the filesystem so that the startup process knows which filesystem to work with in this line. There are multiple ways to identify the filesystem, two of which are shown here. The /boot partition in Figure 19-7 is identified using the UUID, or Universally Unique IDentifier. This is an ID that is guaranteed to be unique so that no other partition can have the same one. The UUID is generated when the filesystem is created and is located in the superblock for the partition. All of the other partitions on our VMs are identified using the path to the device special files in the /dev directory. Another option would be to use the labels we entered when we created the filesystems during the installation process. A typical entry in fstab would look like that in Figure 19-8.
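On a live system, the blkid command reports the UUID and label stored in a partition's superblock, and the value can be pulled out with sed when building an fstab entry. This sketch uses a captured sample line so it runs anywhere; the device and UUID mirror Figure 19-7, and on a real host you would capture the output of blkid /dev/sda1 instead:

```shell
# Extract the UUID from a blkid-style output line. The sample string is
# hardcoded so the sketch is self-contained; on a real system use:
#   blkid /dev/sda1
line='/dev/sda1: LABEL="boot" UUID="9948ca04-c03c-4a4a-a9ca-a688801555c3" TYPE="ext4"'

uuid=$(printf '%s\n' "$line" | sed -n 's/.* UUID="\([^"]*\)".*/\1/p')
echo "UUID=$uuid"
```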

LABEL=boot  /boot  ext4  defaults  1 2

Figure 19-8.  Using a label to identify the filesystem in /etc/fstab


The filesystem label is also stored in the partition superblock. Let’s change the /boot partition entry in the fstab to use the label we have already created to identify it.

EXPERIMENT 19-4 Perform this experiment as root. Be sure to verify the device special file for the boot partition, and then dump the content of the superblock for the /boot partition:

[root@studentvm1 ~]# lsblk
NAME                       MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda                          8:0    0  60G  0 disk
├─sda1                       8:1    0   1G  0 part /boot
└─sda2                       8:2    0  59G  0 part
  ├─fedora_studentvm1-root 253:0    0   2G  0 lvm  /
<snip>
[root@studentvm1 ~]# dumpe2fs /dev/sda1
dumpe2fs 1.44.3 (10-July-2018)
Filesystem volume name:   boot
Last mounted on:          /boot
Filesystem UUID:          9948ca04-c03c-4a4a-a9ca-a688801555c3
<snip>

The Filesystem Volume Name is the label. We can test this. Change the label and then check the superblock:

[root@studentvm1 ~]# e2label /dev/sda1 MyBoot
[root@studentvm1 ~]# dumpe2fs /dev/sda1
Filesystem volume name:   MyBoot
Last mounted on:          /boot
Filesystem UUID:          9948ca04-c03c-4a4a-a9ca-a688801555c3
<snip>

Notice that the Filesystem UUID in the superblock is identical to that shown in the /etc/fstab file in Figure 19-7. Use the Vim editor to comment out the current entry for the /boot partition, and create a new entry using the label. The fstab should now look like this; I have modified it to be a bit more tidy by aligning the columns better:


<snip>
#
/dev/mapper/fedora_studentvm1-root          /      ext4  defaults  1 1
# UUID=9948ca04-c03c-4a4a-a9ca-a688801555c3 /boot  ext4  defaults  1 2
LABEL=boot                                  /boot  ext4  defaults  1 2
/dev/mapper/fedora_studentvm1-home          /home  ext4  defaults  1 2
/dev/mapper/fedora_studentvm1-tmp           /tmp   ext4  defaults  1 2
/dev/mapper/fedora_studentvm1-usr           /usr   ext4  defaults  1 2
/dev/mapper/fedora_studentvm1-var           /var   ext4  defaults  1 2
/dev/mapper/fedora_studentvm1-swap          swap   swap  defaults  0 0

Reboot StudentVM1 to ensure that the change works as expected. Oops! It did not.

Figure 19-9.  An error occurred during the reboot of StudentVM1 after changing fstab


If you have followed my instructions carefully, this problem shows up during startup (after boot; see Chapter 16) with the message shown on the last line in Figure 19-9. This indicates that the boot device (/dev/sda1) cannot be mounted. Can you think of any reason that might be the case? I can: I intentionally skipped the step of setting the filesystem label from MyBoot back to just boot. We can wait until the 1-minute and 30-second timeout completes, and then the system, having determined that the filesystem cannot be mounted, will automatically proceed to "emergency" mode. Type in your root password, and press the Enter key to continue. Verify the current filesystem label, then change it back to "boot":

[root@studentvm1 ~]# e2label /dev/sda1
MyBoot
[root@studentvm1 ~]# e2label /dev/sda1 boot
[root@studentvm1 ~]# [  188.3880009] EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)

As soon as the label is changed, the filesystem is mounted, as shown by the resulting kernel message. Now bring the system up to the graphical target (run level 5):

[root@studentvm1 ~]# systemctl isolate graphical.target

Note that it was not necessary to reboot to make the repair or to raise the system from the emergency target to the graphical target. Let's get back to deconstructing the fstab file. The second column in the /etc/fstab file in Figure 19-7 is the mount point on which the filesystem identified by the data in column 1 is mounted. These mount points are empty directories to which the filesystem is mounted. The third column specifies the filesystem type, in this case, EXT4 for most of the entries. The one different entry in Figure 19-7 is for the swap partition. Figure 19-10 shows an entry for a VFAT device, which is usually how USB memory sticks are formatted. The mount point for this device is located at /media/SS-R100.



LABEL=SS-R100  /media/SS-R100  vfat  user,noauto,defaults  0 0

Figure 19-10.  An fstab entry for a USB memory stick showing some alternate configuration possibilities

The fourth column of data in the fstab file is a list of options. The mount command has many options, and each option has a default setting. In Figure 19-7 the fourth column of fstab indicates that the filesystem is to be mounted using all defaults. In Figure 19-10, some of the defaults are overridden. The "user" option means that any user can mount or unmount the filesystem even if another user has already mounted it. The "noauto" option means that this filesystem is not automatically mounted during the Linux startup. It can be manually mounted and unmounted after startup. This is ideal for a removable device like a USB memory stick that may be used for sharing files or transporting them to work on at another location.

The last two columns are numeric. In Figure 19-7, the entries for /home are 1 and 2, respectively. The first number is used by the dump command, which is one possible option for making backups. The dump command is seldom used for backups anymore, so this column is usually ignored. If by some chance someone is still using dump to make backups, a one (1) in this column means to back up this entire filesystem, and a zero (0) means to skip this filesystem.

The last column is also numeric. It specifies the sequence in which fsck is run against filesystems during startup. Zero (0) means do not run fsck on the filesystem. One (1) means to run fsck on this filesystem first. The root partition is always checked first, as you can see from the numbers in this column in Figure 19-7. The rest of the entries in this column have a value of 2, which means that fsck will not begin running against those filesystems until it has finished checking the root filesystem. Then all of the filesystems that have a value of 2 can be checked in parallel rather than sequentially so that the overall check can be finished sooner.
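The six-field layout just described lends itself to a quick scripted sanity check before rebooting with an edited fstab. This validator is my own illustrative sketch, not a standard tool; the real authority is mount -a, which attempts every entry in fstab and reports errors:

```shell
# Validate one fstab entry: six whitespace-separated fields, with the
# dump (5th) and fsck-order (6th) fields numeric. Returns 0 if the line
# looks sane, 1 otherwise. Illustrative only, not an exhaustive check.
check_fstab_line() {
    set -- $1                       # split the line into positional fields
    [ $# -eq 6 ] || return 1        # must have exactly six fields
    case $5$6 in
        *[!0-9]*) return 1 ;;       # dump and fsck order must be digits
    esac
    return 0
}

check_fstab_line "/dev/sdb1 /TestFS ext4 defaults 1 2" && echo "ok"
check_fstab_line "/dev/sdb1 /TestFS ext4 defaults" || echo "bad: wrong field count"
```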
Although it is generally considered best practice to mount filesystems on mount points directly on the / (root) filesystem, it is also possible to use multilevel mount points. Figure 19-11 shows what multilevel mounts look like. For example, the /usr filesystem is mounted on the /usr directory. In Figure 19-2 the /usr/local directory is listed. It contains locally created executables, especially scripts in /usr/local/bin and configuration files in /usr/local/etc, as well as libraries, man pages, and more. I have encountered installations where a filesystem, "local", was mounted on /usr/local. This gives additional flexibility during Linux upgrades because the /usr/local filesystem did not need to be formatted during an upgrade or reinstallation like the rest of the /usr filesystem.


The root filesystem:

.
├── bin -> usr/bin
├── boot
├── dev
├── etc
├── home
├── lib -> usr/lib
├── lib64 -> usr/lib64
├── lost+found
├── media
├── mnt
├── opt
├── proc
├── root
├── run
├── sbin -> usr/sbin
├── srv
├── sys
├── tmp
├── usr    <-- The /usr filesystem is mounted on the /usr mount point
└── var

The /usr filesystem:

.
├── bin
├── games
├── include
├── lib
├── lib64
├── libexec
├── local  <-- The local filesystem is mounted on the /usr/local mount point
├── lost+found
├── sbin
├── share
├── src
└── tmp -> ../var/tmp

The /usr/local filesystem:

.
├── bin
├── etc
├── games
├── include
├── lib
├── lib64
├── libexec
├── sbin
├── share
└── src

Figure 19-11.  It is possible to do multilevel mounts although this is not considered a good practice. Note that this illustration shows only the top-level directories of each filesystem


Repairing damaged filesystems Sometimes the filesystem itself is damaged due to improper shutdown or hardware failures, and we need to fix the meta-structure inconsistencies. As mentioned in Experiment 19-2, these may be in the form of incorrect inode or data block counts. You may also encounter orphaned inodes. An orphaned inode is one that has become disconnected from the list of inodes belonging to a directory or cylinder group so that it cannot be found for use. The best and easiest way to run fsck on all filesystems is to reboot the host. Systemd, the system and service manager, is configured to run fsck on all filesystems at startup if there is a nonzero number in the last column of the filesystem entry in /etc/fstab. The fsck program first checks to see if there are any detectable problems which takes very little time. If fsck detects a problem, it then resolves the problems.

EXPERIMENT 19-5 It is not necessary to reboot to perform this experiment, but it is necessary to do it as root. The /var/log/messages files contain entries that record the fact that fsck was run on each filesystem at boot time:

[root@studentvm1 log]# cd /var/log ; grep fsck messages
<snip>
Jan 8 17:34:39 studentvm1 audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-fsck-root comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jan 8 17:34:39 studentvm1 audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-fsck-root comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
<snip>

This pair of messages tells us that fsck was started on the root filesystem and then, presumably because there were no errors or inconsistencies detected, stopped. You should see a pair of messages like these for every filesystem at each boot.


Because fsck runs at every startup, there should seldom, if ever, be a reason to run it from the command line. Despite this, we SysAdmins sometimes find the need to do things that "should never be necessary." So there is a way to enter rescue mode and run fsck on most filesystems manually.
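When you do run fsck manually, its exit status is worth checking from a script: per the fsck(8) man page it is a bitwise OR of condition flags (1 = errors corrected, 2 = system should be rebooted, 4 = errors left uncorrected, 8 = operational error). A small decoder sketch:

```shell
# Decode an fsck exit status, which is a bitwise OR of the conditions
# documented in fsck(8).
decode_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    [ $((status & 1)) -ne 0 ] && echo "errors corrected"
    [ $((status & 2)) -ne 0 ] && echo "system should be rebooted"
    [ $((status & 4)) -ne 0 ] && echo "errors left uncorrected"
    [ $((status & 8)) -ne 0 ] && echo "operational error"
    return 0
}

# Typical use after a manual check of an unmounted filesystem:
#   fsck -f /dev/sdb1; decode_fsck_status $?
decode_fsck_status 3
```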

EXPERIMENT 19-6 On the physical host, download the latest Fedora Server ISO image from the Fedora download page at https://getfedora.org/. Click the Server image; then, on the "Download Fedora Server" page, select the proper image architecture, which should be the x86_64 DVD ISO image. Download this image to the same location as you did the original Fedora live image from which you installed Fedora on StudentVM1 back in Chapter 5. Power off StudentVM1. Using the VirtualBox Manager, open the Settings dialog for the StudentVM1 virtual machine, and click Storage. Select the Optical device, which is probably located on the IDE controller, and then use the Optical Drive icon in the Attributes section of the dialog box to select the new server ISO image. Click the OK button. On the System dialog of Settings, verify that the optical drive is bootable and at the top of the list of boot devices. Click the OK button and then OK again on the main Settings dialog. Boot StudentVM1. On the initial menu, shown in Figure 19-12, you can see that this is an installation image and not a live image. This is how you can tell that this is the server ISO image.


Figure 19-12.  Choose the “Troubleshooting” menu item on the Fedora Server boot menu

Although I am using the Fedora 29 Server image with a Fedora 29 installation, it is possible to perform a rescue with one Fedora release installed on the host and a different release for the ISO image, so long as the two are reasonably close. In that case I would recommend using an ISO image of a higher release than the one installed on the host. Of course it is always best to use the same release for rescue as is installed on the host.


Use the down arrow key to select the "Troubleshooting" menu item, and press the Enter key. This opens a menu, shown in Figure 19-13, which provides us with several troubleshooting options. The Install Fedora selection would allow installation of Fedora in a very basic graphical mode in the event that the graphics adapter encounters problems with video drivers. The memory test can help identify a failing memory DIMM, and I have used it on a couple of occasions. You could also boot from the local drive, that is, the operating system installed on the hard drive or SSD, or simply return to the main menu.

Figure 19-13.  Select “Rescue a Fedora system,” and press Enter


Select "Rescue a Fedora system," and press Enter to proceed with the boot process. In Figure 19-14 we see the Rescue menu. Menu item 1 causes the Rescue environment to locate all of the accessible filesystems on the hard drive and mount them on the /mnt/sysimage/ directory. This makes it possible to explore the content and integrity of the filesystems and to modify configuration files that may be causing problems. In this Rescue environment, the hard drive filesystems are exposed through the /dev filesystem just as they would be when booting directly from the hard drive. Therefore it is possible to run fsck -n to identify filesystems with problems. With the exception of the root (/) filesystem, you can then unmount those filesystems with inconsistencies, run fsck to correct the problems, and then remount them. After all problematic filesystems have been corrected, rebooting from the hard drive presents a system with no filesystem inconsistencies.

Figure 19-14.  Select menu item 1 to continue to a rescue environment


Read the information on the screen as shown in Figure 19-14. This tells us what the menu options will do. Menu item 3 would take you to a shell, but the filesystems on the HDD or SSD would not be accessible because they would not have device files in the /dev directory. To continue to the rescue shell, type 1. It is not necessary to press the Enter key. Figure 19-15 shows the message telling where the system filesystems will be located (/mnt/sysimage), how to reboot the system when you are finished, and how to use the chroot command to make the /mnt/sysimage directory the top-level system directory. More about chroot later.

Figure 19-15.  Read the information about the rescue shell and then press Enter to get to the rescue shell


The rescue shell is limited. Many tools available in a Bash shell, the man pages, the normal $PATH, and other environment variables are not available. A limited version of Vim that corresponds to the old vi is the only editor available. Note that the PWD is not displayed as part of the command prompt. Command-line recall and editing are available. From this rescue shell, we can run fsck against all of the filesystems except for root (/). Before we do that, let's look at another utility, one that enabled me to record the steps I took while in the rescue environment: the script command. On a host that uses a console or rescue shell and not a GUI desktop, and on which the SSHD server cannot be run, it is very difficult to copy the screen in a text format so that the text can be pasted into a book or article. The script utility, which is part of the util-linux package and thus one of the core utilities, allows us to record the complete session and store the results in a text file which can later be copied into a document. Now that we are in the rescue shell, let's start the script utility. The output file is specified on the /tmp filesystem, which is currently mounted on /mnt/sysimage/tmp. By placing the text file output from the script program there during rescue mode, it will be available in /tmp after a normal startup. Here we have another good reason to make /tmp a separate filesystem rather than part of the root (/) filesystem:

bash-4.4# script /mnt/sysimage/tmp/chapter-19.txt
Script started on 2019-01-12 08:36:33+00:00
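The same technique works outside the rescue environment. A minimal sketch using script's -c option (util-linux) to record a single command's output into a transcript file; the file name here is arbitrary:

```shell
# Record a command with the script utility and confirm that its output
# landed in the transcript file.
log=$(mktemp)
script -q -c 'echo recorded by script' "$log" > /dev/null
grep 'recorded by script' "$log"
```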

We are now running in the script command's recording environment; everything typed at the command line and sent to STDOUT is recorded in the file we specified. Take a quick look at the filesystem directory tree structure while in rescue mode:

bash-4.4# lsblk -i
NAME                       MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
loop0                        7:0    0 481.2M  1 loop
loop1                        7:1    0     2G  1 loop
|-live-rw                  253:0    0     2G  0 dm   /
`-live-base                253:1    0     2G  1 dm
loop2                        7:2    0    32G  0 loop
`-live-rw                  253:0    0     2G  0 dm   /
sda                          8:0    0    60G  0 disk
|-sda1                       8:1    0     1G  0 part /mnt/sysimage/boot
`-sda2                       8:2    0    59G  0 part
  |-fedora_studentvm1-root 253:2    0     2G  0 lvm  /mnt/sysimage
  |-fedora_studentvm1-home 253:3    0     2G  0 lvm  /mnt/sysimage/home
  |-fedora_studentvm1-tmp  253:4    0     5G  0 lvm  /mnt/sysimage/tmp
  |-fedora_studentvm1-usr  253:5    0    15G  0 lvm  /mnt/sysimage/usr
  |-fedora_studentvm1-var  253:6    0    10G  0 lvm  /mnt/sysimage/var
  `-fedora_studentvm1-swap 253:7    0     4G  0 lvm  [SWAP]
sr0                         11:0    1   2.9G  0 rom  /run/install/repo

Unmount the /home filesystem, and verify that it is no longer mounted:

bash-4.4# umount /mnt/sysimage/home/
bash-4.4# lsblk -i
NAME                       MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
loop0                        7:0    0 481.2M  1 loop
loop1                        7:1    0     2G  1 loop
|-live-rw                  253:0    0     2G  0 dm   /
`-live-base                253:1    0     2G  1 dm
loop2                        7:2    0    32G  0 loop
`-live-rw                  253:0    0     2G  0 dm   /
sda                          8:0    0    60G  0 disk
|-sda1                       8:1    0     1G  0 part /mnt/sysimage/boot
`-sda2                       8:2    0    59G  0 part
  |-fedora_studentvm1-root 253:2    0     2G  0 lvm  /mnt/sysimage
  |-fedora_studentvm1-home 253:3    0     2G  0 lvm
  |-fedora_studentvm1-tmp  253:4    0     5G  0 lvm  /mnt/sysimage/tmp
  |-fedora_studentvm1-usr  253:5    0    15G  0 lvm  /mnt/sysimage/usr
  |-fedora_studentvm1-var  253:6    0    10G  0 lvm  /mnt/sysimage/var
  `-fedora_studentvm1-swap 253:7    0     4G  0 lvm  [SWAP]
sr0                         11:0    1   2.9G  0 rom  /run/install/repo

Run fsck on the /home filesystem. We need to use the -f option to force fsck to perform a complete check even though it appears to be clean. We also use the -V option to produce verbose output. Your results may be different from these:

bash-4.4# fsck -fV /dev/mapper/fedora_studentvm1-home
fsck from util-linux 2.32.1
e2fsck 1.44.3 (10-July-2018)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information

         289 inodes used (0.22%, out of 131072)
           0 non-contiguous files (0.0%)
           0 non-contiguous directories (0.0%)
             # of inodes with ind/dind/tind blocks: 0/0/0
             Extent depth histogram: 279
       26578 blocks used (5.07%, out of 524288)
           0 bad blocks
           1 large file

         225 regular files
          53 directories
           0 character device files
           0 block device files
           0 fifos
           3 links
           2 symbolic links (2 fast symbolic links)
           0 sockets
------------
         283 files

And exit from the script command's recording environment:

bash-4.4# exit
exit
Script done on 2019-01-12 08:40:37+00:00

Had there been any errors or inconsistencies in the /home filesystem, they would have been corrected by fsck. Power off the VM, remove the server DVD ISO image from the virtual optical drive, and reboot StudentVM1.


Finding lost files

Files can get lost by the filesystem and by the user. This can also happen during fsck regardless of when or how it is initiated. One reason this happens is that the directory entry that points to the file's inode is damaged and no longer points to the inode for the file. You would probably see messages about orphaned inodes during startup when this occurs. These files are not really lost. The fsck utility has found the inode, but there is no corresponding directory entry for that file. The fsck utility does not know the name of the file or in what directory it was listed. It can recover the file; all it needs to do is make up a name and add the name to a directory, along with a pointer to the inode. But where does it place the directory entry? Look in the lost+found directory of each filesystem to locate recovered files that belong to that filesystem. These lost files are moved to the lost+found directory simply by creating a directory entry for them in lost+found. The file names are seemingly random and give no indication of the types of files they are. You will have to use other tools such as file, stat, cat, and strings to make some sort of determination so that you can rename the file with a meaningful name and extension and move it to an appropriate directory.
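That triage can be partly automated. Here is a sketch that renames recovered files according to the type that file(1) reports; a temp directory with a fabricated fsck-style name stands in for a real lost+found, since the real one requires root and an actual recovery:

```shell
# Give recovered files meaningful extensions based on file(1)'s guess.
lf=$(mktemp -d)                          # stands in for /some-fs/lost+found
printf 'hello, world\n' > "$lf/#12345"   # fabricated fsck-style recovered name

for f in "$lf"/\#*; do
    case $(file --brief --mime-type "$f") in
        text/plain) mv "$f" "$f.txt" ;;  # plain text: add a .txt extension
        *)          : ;;                 # leave types we don't handle alone
    esac
done

ls "$lf"
```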

Creating a new filesystem

I have had many occasions when it has become necessary to create a new filesystem. This can be simply because I need a completely new filesystem for some specific purpose, or it can be due to the need to replace an existing filesystem that is too small or damaged. This exercise takes you through the process of creating a new partition on an existing hard drive, creating a filesystem and a mount point, and mounting the new filesystem. This is a common task, and you should become familiar with how to perform it. In many cases you will do this by adding a new hard drive with plenty of space. In this exercise we will use some space left free for this purpose. This exercise is about raw partitions and filesystems and not about using logical volume management. We will cover LVM and adding space to logical volumes in Chapter 1 of Volume 2.


Finding space Before we can add a raw partition to our host, we need to identify some available disk space. We currently have a single virtual hard drive available on our VM, /dev/sda. Let’s see if there is some space available for a new partition on this device.

EXPERIMENT 19-7 Perform this experiment as root on StudentVM1. Use the fdisk command to determine whether any free space exists on /dev/sda:

[root@studentvm1 ~]# fdisk -l /dev/sda
Disk /dev/sda: 60 GiB, 64424509440 bytes, 125829120 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xb449b58a

Device     Boot   Start       End   Sectors Size Id Type
/dev/sda1  *       2048   2099199   2097152   1G 83 Linux
/dev/sda2       2099200 125829119 123729920  59G 8e Linux LVM
[root@studentvm1 ~]#

We can do a quick calculation using the number of sectors shown in the preceding data. The first line of output shows that the total number of sectors on the device is 125,829,120, and the ending sector of /dev/sda2 is 125,829,119, which is a difference of one sector, not nearly enough to create a new partition. We need another option if we want to add a new partition. Notice the partition types in the Id column shown in Experiment 19-7. Partition type 83 is a standard Linux partition. Type 82 would be a Linux swap partition. Type 5 is an extended partition, and type 8e is a Linux LVM partition. The fdisk program does not provide any direct information on the total size of each partition in bytes, but that can be calculated from the available information.
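That calculation is simple shell arithmetic on the sector counts from the fdisk output (sector size 512 bytes; the numbers below are copied from Experiment 19-7):

```shell
# Free space after the last partition, and a partition size, from sector counts.
sector_size=512
total_sectors=125829120      # from the fdisk summary line
sda2_end=125829119           # ending sector of /dev/sda2

echo "sectors beyond the end of sda2: $((total_sectors - sda2_end))"

# /dev/sda1 is 2097152 sectors; convert to bytes and GiB:
sda1_sectors=2097152
sda1_bytes=$((sda1_sectors * sector_size))
echo "sda1: $sda1_bytes bytes = $((sda1_bytes / 1024 / 1024 / 1024)) GiB"
```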


Add a new virtual hard drive

Because the existing virtual hard drive has no room for a new partition, we need to create a new virtual hard drive. This is easy to do with VirtualBox but may require that the virtual machine be shut down to reconfigure the SATA controller.

EXPERIMENT 19-8 On the physical host desktop, open the VirtualBox Manager if it is not already open. In Figure 19-16, check to see if there is a SATA port available so we can add a new virtual disk drive while the VM is running. We did set the number of SATA ports to 5 in Chapter 4, but verify this anyway.

Figure 19-16.  Verify that the port count for the SATA controller is 5 or more


We will need some additional drives in Chapter 1 of Volume 2, as well. Let's add the new virtual disk device while the VM is up and running. This procedure is equivalent to installing a new hot-plug hard drive in a physical hardware system while it is running. Power on the VM, and log in to the GUI desktop as the student user. Open the Storage Settings menu, and click the Add hard disk icon as shown in Figure 19-17 to create a new disk device on the SATA controller.

Figure 19-17.  Click the Add hard disk icon to add a new drive to the SATA controller

Click the OK button, and then the Create new disk button. The next dialog is a choice of hard disk file type. Use the default of VDI, which is a VirtualBox Disk Image. Press the Next button. We want this disk to be dynamically allocated per the default, so do not make any changes on this dialog, and press Next to continue. Use the dialog in Figure 19-18 to set the virtual disk name to StudentVM1-1 and the disk size to 20GB.


Figure 19-18.  Enter the name of the virtual disk as StudentVM1-1, and set the size to 20GB

Press the Create button to create the new virtual hard drive. The new device now shows up on the list of storage devices in the Storage Settings dialog box. Press OK to close the Settings dialog. We have now added a second virtual hard drive to the StudentVM1 virtual host. In Experiment 19-8 we created a new 20GB virtual hard drive. The drive is now ready for us to partition and format.


EXPERIMENT 19-9 Open a terminal session and su - to root. Display the list of current hard drives and partitions:

[root@studentvm1 ~]# lsblk -i
NAME                       MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                          8:0    0   60G  0 disk
|-sda1                       8:1    0    1G  0 part /boot
`-sda2                       8:2    0   59G  0 part
  |-fedora_studentvm1-root 253:0    0    2G  0 lvm  /
  |-fedora_studentvm1-swap 253:1    0    4G  0 lvm  [SWAP]
  |-fedora_studentvm1-usr  253:2    0   15G  0 lvm  /usr
  |-fedora_studentvm1-home 253:3    0    2G  0 lvm  /home
  |-fedora_studentvm1-var  253:4    0   10G  0 lvm  /var
  `-fedora_studentvm1-tmp  253:5    0    5G  0 lvm  /tmp
sdb                          8:16   0   20G  0 disk
sr0                         11:0    1 1024M  0 rom

The new virtual hard drive is /dev/sdb. Even though it is not physical hardware, we can get more detail about the device in order to further verify that it is the correct one:

[root@studentvm1 ~]# smartctl -x /dev/sdb
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.10-300.fc29.x86_64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     VBOX HARDDISK
Serial Number:    VB99cc7ab2-512a8e44
Firmware Version: 1.0
User Capacity:    21,474,836,480 bytes [21.4 GB]
Sector Size:      512 bytes logical/physical
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ATA/ATAPI-6 published, ANSI INCITS 361-2002
Local Time is:    Sun Jan 13 15:55:00 2019 EST
SMART support is: Unavailable - device lacks SMART capability.
AAM feature is:   Unavailable
APM feature is:   Unavailable
Rd look-ahead is: Enabled


Write cache is:   Enabled
DSN feature is:   Unavailable
ATA Security is:  Unavailable
Wt Cache Reorder: Unavailable

A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

We have determined that we have a 20GB (virtual) hard drive, /dev/sdb. The next step is to create a partition, format it, and add a partition label. We use the fdisk utility to create a new partition:

[root@studentvm1 ~]# fdisk /dev/sdb

Welcome to fdisk (util-linux 2.32.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Device does not contain a recognized partition table.
Created a new DOS disklabel with disk identifier 0xd1acbaf8.

Command (m for help):

Because this device was just created, it has no partition table. Let's create a single new partition of 2GB in size. We do not need a lot of space for this experiment, so the partition is small. Press the n key to begin creation of a new partition:

Command (m for help): n
Partition type
   p   primary (0 primary, 0 extended, 4 free)
   e   extended (container for logical partitions)

Enter p to create a primary partition: Select (default p): p

Just press Enter to create this as partition number 1: Partition number (1-4, default 1):

Press Enter to accept the default first sector, and then enter +2G as the last sector to set the partition size:

First sector (2048-41943039, default 2048):
Last sector, +sectors or +size{K,M,G,T,P} (2048-41943039, default 41943039): +2G


Created a new partition 1 of type 'Linux' and of size 2 GiB.

Now enter the p command to print the current partition table:

Command (m for help): p
Disk /dev/sdb: 20 GiB, 21474836480 bytes, 41943040 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xd1acbaf8

Device     Boot Start     End Sectors Size Id Type
/dev/sdb1        2048 4196351 4194304   2G 83 Linux

Press w to write the revised partition table to the disk. The existing partition table, if any, is not altered until the data is written to the disk: Command (m for help): w The partition table has been altered. Calling ioctl() to re-read partition table. Syncing disks. [root@studentvm1 ~]#

Create an EXT4 filesystem on the new partition. This won’t take long because of the small size of the partition. By default the EXT4 filesystem fills the partition; however it is possible to specify a size smaller than the partition for the size of the filesystem: [root@studentvm1 ~]# mkfs -t ext4 /dev/sdb1 mke2fs 1.44.3 (10-July-2018) Creating filesystem with 524288 4k blocks and 131072 inodes Filesystem UUID: ee831607-5d5c-4d54-b9ba-959720bfdabd Superblock backups stored on blocks: 32768, 98304, 163840, 229376, 294912 Allocating group tables: done Writing inode tables: done Creating journal (16384 blocks): done Writing superblocks and filesystem accounting information: done [root@studentvm1 ~]#

Let's add a filesystem label:


[root@studentvm1 ~]# e2label /dev/sdb1
[root@studentvm1 ~]# e2label /dev/sdb1 TestFS
[root@studentvm1 ~]# e2label /dev/sdb1
TestFS
[root@studentvm1 ~]#

Create a mount point on the filesystem directory tree:

[root@studentvm1 ~]# mkdir /TestFS
[root@studentvm1 ~]# ll /

Mount the new filesystem:

[root@studentvm1 ~]# mount /TestFS/
mount: /TestFS/: can't find in /etc/fstab.
[root@studentvm1 ~]#

This error occurred because we did not create an entry for the new filesystem in /etc/fstab. But let's mount it manually first:

[root@studentvm1 ~]# mount -t ext4 /dev/sdb1 /TestFS/
[root@studentvm1 ~]# lsblk -i
NAME                       MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                          8:0    0   60G  0 disk
|-sda1                       8:1    0    1G  0 part /boot
`-sda2                       8:2    0   59G  0 part
  |-fedora_studentvm1-root 253:0    0    2G  0 lvm  /
  |-fedora_studentvm1-swap 253:1    0    4G  0 lvm  [SWAP]
  |-fedora_studentvm1-usr  253:2    0   15G  0 lvm  /usr
  |-fedora_studentvm1-home 253:3    0    2G  0 lvm  /home
  |-fedora_studentvm1-var  253:4    0   10G  0 lvm  /var
  `-fedora_studentvm1-tmp  253:5    0    5G  0 lvm  /tmp
sdb                          8:16   0   20G  0 disk
`-sdb1                       8:17   0    2G  0 part /TestFS
sr0                         11:0    1 1024M  0 rom
[root@studentvm1 ~]#

It is not necessary to specify the filesystem type as we did here, because the mount command can determine the common filesystem types automatically. You may need to specify the type explicitly if the filesystem is one of the more obscure ones.


Unmount the filesystem:

[root@studentvm1 ~]# umount /TestFS

Now add the following entry for our new filesystem to the bottom of the /etc/fstab file:

/dev/sdb1  /TestFS  ext4  defaults  1 2

Now mount the new filesystem:

[root@studentvm1 ~]# mount /TestFS
[root@studentvm1 ~]# ll /TestFS/
total 16
drwx------. 2 root root 16384 Jan 14 08:54 lost+found
[root@studentvm1 ~]# lsblk -i
NAME                       MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                          8:0    0   60G  0 disk
|-sda1                       8:1    0    1G  0 part /boot
`-sda2                       8:2    0   59G  0 part
  |-fedora_studentvm1-root 253:0    0    2G  0 lvm  /
  |-fedora_studentvm1-swap 253:1    0    4G  0 lvm  [SWAP]
  |-fedora_studentvm1-usr  253:2    0   15G  0 lvm  /usr
  |-fedora_studentvm1-home 253:3    0    2G  0 lvm  /home
  |-fedora_studentvm1-var  253:4    0   10G  0 lvm  /var
  `-fedora_studentvm1-tmp  253:5    0    5G  0 lvm  /tmp
sdb                          8:16   0   20G  0 disk
`-sdb1                       8:17   0    2G  0 part /TestFS
sr0                         11:0    1 1024M  0 rom
[root@studentvm1 ~]#

All of the pertinent data about the filesystem is recorded in fstab, and options specific to this filesystem can be specified as well. For example, we may not want this filesystem to mount automatically at startup, so we would set that option as noauto,defaults. Unmount the TestFS filesystem:

[root@studentvm1 ~]# umount /TestFS

Change the line for this new filesystem in /etc/fstab so it looks like the following:

/dev/sdb1  /TestFS  ext4  noauto,defaults  1 2
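Device names like /dev/sdb1 can change between boots when drives are added or removed, so fstab entries are often written with the filesystem UUID instead of the device name. The sketch below builds such an entry with the blkid tool from util-linux; to stay runnable without root, it probes a small file-backed ext4 filesystem (the file name is illustrative), but the same two blkid and echo lines work against /dev/sdb1.

```shell
# Build a throwaway ext4 filesystem in a file so blkid has something to probe.
truncate -s 32M /tmp/uuidtest.img
mkfs -t ext4 -q -F /tmp/uuidtest.img

# Extract just the UUID value.
UUID=$(blkid -s UUID -o value /tmp/uuidtest.img)

# Compose an fstab line equivalent to the /dev/sdb1 entry used above.
echo "UUID=$UUID  /TestFS  ext4  noauto,defaults  1 2"
```

The composed line can be appended to /etc/fstab in place of the device-name form; the other five fields keep exactly the same meanings.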

Mount the filesystem manually to verify that it works as expected. Now reboot the VM and verify that the /TestFS filesystem does not mount automatically. It should not.
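A quick way to check whether a filesystem is mounted — for example, after rebooting with the noauto option in place — is findmnt, which prints the mount entry and exits non-zero when the given mount point is not in the mount table. A small sketch, using / as an always-mounted control alongside the /TestFS mount point from this experiment:

```shell
# findmnt succeeds for a mounted filesystem...
findmnt / >/dev/null && echo "/ is mounted"

# ...and fails for one that is not mounted, which is what we expect
# for /TestFS after a reboot with the noauto option in fstab.
findmnt /TestFS >/dev/null || echo "/TestFS is not mounted"
```

This is more script-friendly than visually scanning lsblk output, because the exit code can drive a conditional directly.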


Other filesystems

There are many filesystems besides EXT4 and its predecessors, and each has its own advantages and drawbacks. I have tried several, such as XFS, ReiserFS, and BTRFS, but I have found that the EXT filesystems have always been perfect for my needs. Our student virtual machines will not provide a real test to help determine which filesystem might be better for our needs, but let's create a filesystem with BTRFS just to experiment with it.

EXPERIMENT 19-10

Perform this experiment as root. We still have space on the /dev/sdb virtual drive, so add another partition, /dev/sdb2, with a size of 2GB on that drive. Then format the new partition as BTRFS:

[root@studentvm1 ~]# fdisk /dev/sdb

Welcome to fdisk (util-linux 2.32.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Command (m for help): n
Partition type
   p   primary (1 primary, 0 extended, 3 free)
   e   extended (container for logical partitions)
Select (default p):

Partition number (2-4, default 2):
First sector (4196352-41943039, default 4196352):
Last sector, +sectors or +size{K,M,G,T,P} (4196352-41943039, default 41943039): +2G

Created a new partition 2 of type 'Linux' and of size 2 GiB.

Command (m for help): p
Disk /dev/sdb: 20 GiB, 21474836480 bytes, 41943040 sectors
Units: sectors of 1 * 512 = 512 bytes



Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x0c2e07ab

Device     Boot Start   End     Sectors Size Id Type
/dev/sdb1       2048    4196351 4194304 2G   83 Linux
/dev/sdb2       4196352 8390655 4194304 2G   83 Linux

Command (m for help): w
The partition table has been altered.
Syncing disks.

[root@studentvm1 ~]# mkfs -t btrfs /dev/sdb2
btrfs-progs v4.17.1
See http://btrfs.wiki.kernel.org for more information.

Label:              (null)
UUID:               54c2d286-caa9-4a44-9c12-97600122f0cc
Node size:          16384
Sector size:        4096
Filesystem size:    2.00GiB
Block group profiles:
  Data:             single            8.00MiB
  Metadata:         DUP             102.38MiB
  System:           DUP               8.00MiB
SSD detected:       no
Incompat features:  extref, skinny-metadata
Number of devices:  1
Devices:
   ID        SIZE  PATH
    1     2.00GiB  /dev/sdb2
[root@studentvm1 ~]#

Mount the new BTRFS filesystem on the temporary mount point, /mnt. Create or copy some files to /mnt. After you have experimented with this filesystem for a bit, unmount it.
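Like mke2fs, mkfs.btrfs will also operate on a regular file, which is another safe way to experiment without touching a real device. This sketch assumes btrfs-progs is installed and skips quietly if it is not; the file name and 256MB size are illustrative (BTRFS refuses to create a filesystem below a minimum size of roughly 110MB, so the image is made comfortably larger):

```shell
# Skip gracefully on systems without btrfs-progs installed.
if ! command -v mkfs.btrfs >/dev/null 2>&1; then
    echo "btrfs-progs not installed; skipping"
    exit 0
fi

# 256MB backing file, comfortably above the BTRFS minimum size.
truncate -s 256M /tmp/btrfs.img
mkfs.btrfs -f /tmp/btrfs.img

# Inspect the result without mounting it: blkid reports TYPE="btrfs".
blkid /tmp/btrfs.img
```

Mounting the image (mount -o loop) would still require root, but creation and inspection do not.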



From a functional standpoint, the BTRFS filesystem works the same way as the EXT4 filesystem. They both store data in files, use directories for file organization, provide security using the same file attributes, and use the same file management tools.

Chapter summary

In this chapter we have looked at the three meanings of the term "filesystem" and explored each in detail. A filesystem can be a system of metadata structures, such as EXT4, used to store data on a partition or logical volume of some storage medium; a well-defined, logical structure of directories that establishes an organizational methodology for data storage, as set forth in the Linux Filesystem Hierarchical Standard (LFHS); or a unit of data storage created on a partition or logical volume, which may be mounted on a specific, defined directory as part of the LFHS. These three uses of the term are commonly applied with overlapping meanings, which contributes to potential confusion. This chapter separates and defines the various uses of the term and its application to specific functions and data structures.

Exercises

Perform these exercises to complete this chapter:

1. What information about a file is contained in the inode?

2. What information about a file is contained only in the directory entry?

3. What is the block size in the partitions on StudentVM1?

4. Calculate the size of a cylinder group on all partitions of StudentVM1. Are they all the same?

5. How would you discover filesystem inconsistencies such as orphaned inodes or incorrect counts of free inodes and data blocks?

6. Describe the complete process required to resolve filesystem inconsistencies.


7. Where should well-designed application software be installed in the Linux filesystem?

8. When installing locally created scripts, in which directory should the script itself be installed?

9. When installing locally created scripts, in which directory should the configuration files, if any, be installed?

10. We still should have some free space on the second virtual hard drive, /dev/sdb, which we added to the StudentVM1 host. Use 1GB of that to create a new partition with an XFS filesystem on it. Create a mount point, /var/mystuff, and configure it to mount automatically on boot. Ensure that it mounts manually, and then reboot to verify that it mounts on boot.

11. What happens if we unmount the /TestFS filesystem and create a file in the /TestFS directory, which is a mount point for that filesystem? Can the file be created, some content added, and then be viewed?

12. What happens to the test file created in the previous exercise when the /TestFS filesystem is mounted?

13. How does the "user" option differ from the "users" option for the mount command?


© David Both 2020 D. Both, Using and Administering Linux: Volume 1, https://doi.org/10.1007/978-1-4842-5049-5

609

BIBLIOGRAPHY

W eb sites BackBlaze, Web site, What SMART Stats Tell Us About Hard Drives, www.backblaze.com/ blog/what-smart-stats-indicate-hard-drive-failures/ Both, David, 8 reasons to use LXDE, https://opensource.com/article/17/3/8reasons-use-lxde Both, David, 9 reasons to use KDE, https://opensource.com/life/15/4/9reasons-to-use-kde Both, David, 10 reasons to use Cinnamon as your Linux desktop environment, https://opensource.com/article/17/1/cinnamon-desktop-environment Both, David, 11 reasons to use the GNOME 3 desktop environment for Linux, https:// opensource.com/article/17/5/reasons-gnome Both, David, An introduction to Linux network routing, https://opensource.com/ business/16/8/introduction-linux-network-routing Both, David, Complete Kickstart, www.linux-databook.info/?page_id=9 Both, David, Making your Linux Box Into a Router, www.linux-databook. info/?page_id=697 Both, David, Network Interface Card (NIC) name assignments, Both, David, Using hard and soft links in the Linux filesystem, www.linux-databook. info/?page_id=5087 Both, David, Using rsync to back up your Linux system, https://opensource.com/ article/17/1/rsync-backup-linux Bowen, Rich, RTFM? How to write a manual worth reading, https://opensource. com/business/15/5/write-better-docs Charity, Ops: It's everyone's job now, https://opensource.com/article/17/7/ state-systems-administration Dartmouth University, Biography of Douglas McIlroy, www.cs.dartmouth. edu/~doug/biography DataBook for Linux, www.linux-databook.info/ Digital Ocean, How To Use journalctl to View and Manipulate Systemd Logs, www.digitalocean.com/community/tutorials/how-to-use-journalctl-to-viewand-manipulate-systemd-logs Edwards, Darvin, Electronic Design, PCB Design And Its Impact On Device Reliability, www.electronicdesign.com/boards/pcb-design-and-its-impact-devicereliability Engineering and Technology Wiki, IBM 1800, http://ethw.org/IBM_1800 610

Bibliography

Fedora Magazine, Tilix, https://fedoramagazine.org/try-tilix-new-terminalemulator-fedora/ Fogel, Kark, Producing Open Source Software, https://producingoss.com/en/ index.html Free On-Line Dictionary of Computing, Instruction Set, http://foldoc.org/ instruction+set Free Software Foundation, Free Software Licensing Resources, www.fsf.org/ licensing/education gnu.org, Bash Reference Manual– Command Line Editing, www.gnu.org/software/ bash/manual/html_node/Command-Line-Editing.html Harris, William, How the Scientific Method Works, https://science. howstuffworks.com/innovation/scientific-experiments/scientific-method6.htm Heartbleed web site, http://heartbleed.com/ How-two Forge, Linux Basics: How To Create and Install SSH Keys on the Shell, www.howtoforge.com/linux-basics-how-to-install-ssh-keys-on-the-shell Kroah-Hartman, Greg , Linux Journal, Kernel Korner– udev– Persistent Naming in User Space, www.linuxjournal.com/article/7316 Krumins, Peter, Bash emacs editing, www.catonmat.net/blog/bash-emacs-editingmode-cheat-sheet/ Krumins, Peter, Bash history, www.catonmat.net/blog/the-definitive-guide-tobash-command-line-history/ Krumins, Peter, Bash vi editing, www.catonmat.net/blog/bash-vi-editing-modecheat-sheet/ Kernel.org, Linux allocated devices (4.x+ version), www.kernel.org/doc/html/ v4.11/admin-guide/devices.html Linux Foundation, Filesystem Hierarchical Standard (3.0), http://refspecs. 
linuxfoundation.org/fhs.shtml Linux Foundation, MIT License, https://spdx.org/licenses/MIT The Linux Information Project, GCC Definition, www.linfo.org/gcc.html Linuxtopia, Basics of the Unix Philosophy, www.linuxtopia.org/online_books/ programming_books/art_of_unix_programming/ch01s06.html LSB Work group- The Linux Foundation, Filesystem Hierarchical Standard V3.0, 3, https://refspecs.linuxfoundation.org/FHS_3.0/fhs-3.0.pdf Opensource.com, https://opensource.com/ Opensource.com, Appreciating the full power of open, https://opensource.com/ open-organization/16/5/appreciating-full-power-open 611

BIBLIOGRAPHY

Opensource.com, David Both, SpamAssassin, MIMEDefang, and Procmail: Best Trio of 2017, Opensource.com, https://opensource.com/article/17/11/spamassassinmimedefang-and-procmail Opensource.org, Licenses, https://opensource.org/licenses opensource.org, The Open Source Definition (Annotated), https://opensource. org/osd-annotated OSnews, Editorial: Thoughts on Systemd and the Freedom to Choose, www.osnews. com/story/28026/Editorial_Thoughts_on_Systemd_and_the_Freedom_to_Choose Peterson, Christine, Opensource.com, How I coined the term ‘open source’, https:// opensource.com/article/18/2/coining-term-open-source-software Petyerson, Scott K, The source code is the license, Opensource.com, https:// opensource.com/article/17/12/source-code-license Princeton University, Interview with Douglas McIlroy, www.princeton.edu/~hos/ frs122/precis/mcilroy.htm Raspberry Pi Foundation, www.raspberrypi.org/ Raymond, Eric S., The Art of Unix Programming, www.catb.org/esr/writings/ taoup/html/index.html/ Wikipedia, The Unix Philosophy, Section: Eric Raymond’s 17 Unix Rules, https:// en.wikipedia.org/wiki/Unix_philosophy#Eric_Raymond%E2%80%99s_17_Unix_Rules Raymond, Eric S., The Art of Unix Programming, Section The Rule of Separation, www.catb.org/~esr/writings/taoup/html/ch01s06.html#id2877777 Understanding SMART Reports, https://lime-technology.com/wiki/ Understanding_SMART_Reports Unnikrishnan A, Linux.com, Udev: Introduction to Device Management In Modern Linux System, www.linux.com/news/udev-introduction-device-management-modernlinux-system Venezia, Paul, Nine traits of the veteran Unix admin, InfoWorld, Feb 14, 2011, www.infoworld.com/t/unix/nine-traits-the-veteran-unix-admin276?page=0,0&source=fssr Wikipedia, Alan Perlis, https://en.wikipedia.org/wiki/Alan_Perlis Wikipedia, Christine Peterson, https://en.wikipedia.org/wiki/Christine_ Peterson Wikipedia, Command Line Completion, https://en.wikipedia.org/wiki/Commandline_completion Wikipedia, Comparison of command shells, 
https://en.wikipedia.org/wiki/ Comparison_of_command_shells 612

Bibliography

Wikipedia, Dennis Ritchie, https://en.wikipedia.org/wiki/Dennis_Ritchie Wikipedia, Device File, https://en.wikipedia.org/wiki/Device_file Wikipedia, Gnome-terminal, https://en.wikipedia.org/wiki/Gnome-terminal Wikipedia, Hard Links, https://en.wikipedia.org/wiki/Hard_link Wikipedia, Heartbleed, https://en.wikipedia.org/wiki/Heartbleed Wikipedia, Initial ramdisk, https://en.wikipedia.org/wiki/Initial_ramdisk Wikipedia, Ken Thompson, https://en.wikipedia.org/wiki/Ken_Thompson Wikipedia, Konsole, https://en.wikipedia.org/wiki/Konsole Wikipedia, Linux console, https://en.wikipedia.org/wiki/Linux_console Wikipedia, List of Linux-supported computer architectures, https://en.wikipedia. org/wiki/List_of_Linux-supported_computer_architectures Wikipedia, Maslow's hierarchy of needs, https://en.wikipedia.org/wiki/ Maslow%27s_hierarchy_of_needs Wikipedia, Open Data, https://en.wikipedia.org/wiki/Open_data Wikipedia, PHP, https://en.wikipedia.org/wiki/PHP Wikipedia, PL/I, https://en.wikipedia.org/wiki/PL/I Wikipedia, Programma 101, https://en.wikipedia.org/wiki/Programma_101 Wikipedia, Richard M.Stallman, https://en.wikipedia.org/wiki/Richard_ Stallman Wikipedia, Rob Pike, https://en.wikipedia.org/wiki/Rob_Pike Wikipedia, rsync, https://en.wikipedia.org/wiki/Rsync Wikipedia, Rxvt, https://en.wikipedia.org/wiki/Rxvt Wikipedia, SMART, https://en.wikipedia.org/wiki/SMART Wikipedia, Software testing, https://en.wikipedia.org/wiki/Software_testing Wikipedia, Terminator, https://en.wikipedia.org/wiki/Terminator_(terminal_ emulator) Wikipedia, Tony Hoare, https://en.wikipedia.org/wiki/Tony_Hoare Wikipedia, Unit Record Equipment, https://en.wikipedia.org/wiki/Unit_ record_equipment Wikipedia, Unix, https://en.wikipedia.org/wiki/Unix Wikipedia, Windows Registry, https://en.wikipedia.org/wiki/Windows_Registry Wikipedia, Xterm, https://en.wikipedia.org/wiki/Xterm WikiQuote, C._A._R._Hoare, https://en.wikiquote.org/wiki/C._A._R._Hoare WordPress, Home page, https://wordpress.org/

613

Index A Alias, 103, 298, 425, 442, 498, 508–511 command, 230 host, 298 user, 298 ASCII, 235, 268, 274, 355, 369, 375, 430, 444, 456, 458, 459, 492, 533, 534, 556 ASCII plain text, 273, 534 Automate everything, 48, 54–55 Automation, 54, 55, 182, 448

B Backblaze study of hard drive failure rates, 385 Backup shell script, 349 Bash, 20, 53, 54, 59, 63, 150, 182, 183, 198, 200, 201, 212, 220, 274–276, 280, 296, 297, 418–429, 440, 448, 493– 499, 509, 527 tab completion, 212–214 configuration files /.bash_history, 191, 247, 577 /.bash_logout, 191, 247, 493 /.bash_profile, 191, 247, 493–495, 499, 500, 502–504, 534 /.bashrc, 191, 247, 280, 495, 496, 499, 503, 504, 508, 510

/etc/bashrc, 468, 493, 494, 498–504, 519, 525 /etc/profile, 496–499, 501–504, 510, 519, 525, 527, 528 environment, 493 external commands, 421, 427 global configuration directory /etc/profile.d, 496 history, 220 internal commands, 418, 424, 427 shell options, 418–419 sourcing files, 496 syntax, 20, 198, 276 user configuration, 191, 212, 280, 468, 491, 493–498, 510, 511 variables, 420, 429, 491, 499, 506 Bash commands compound, 418, 429 external, 27 internal, 421 Basic I/O System (BIOS), 188, 372, 374, 441, 452–454 POST, 452–454 Bell Labs, 37, 38, 42, 225, 226 Binary executable, 311, 423, 534 Books “Just for Fun”, 6, 41 “Linux and the Unix Philosophy”, 3, 5, 40

© David Both 2020 D. Both, Using and Administering Linux: Volume 1, https://doi.org/10.1007/978-1-4842-5049-5

615

Index

Books (cont.) “The Art of Unix Programming”, 5, 40, 50, 241 “The Unix Philosophy”, 3, 5, 40, 47, 50, 57, 62, 241, 263 Boot, 8, 52, 104, 109, 110, 118–123, 129–132, 136, 146–149, 162, 188, 197, 256, 323, 330, 372, 373, 451–489, 555, 561, 562, 571, 578–580, 587 Boot record, 52, 197, 254–256, 453–459, 462 Bourne again shell, 53, 183, 198 Bowen, Rich, 64 Brace expansion, 395, 433–435, 448 BSD, 38, 39 Bug reports, 67, 470

C CD-ROM, 26, 36, 96, 453 Characters meta-, 433, 440 sets, 418, 438–439 special pattern, 433, 435–438, 445 Cisco, 15, 331 Classroom, 14, 71, 326 CLI, 31, 51, 53, 181–185, 202, 204–205, 212, 223, 239, 254, 395, 420, 437, 487, 493, 558, 559 Code proprietary, 2–4, 7, 42, 226, 454 sharing, 186, 228, 522, 538, 583 source, 3, 4, 6, 7, 12, 38, 44, 58, 60, 242, 311, 332 Command, 185 Command line history, 184 interface, 5, 52, 184–185 recall and editing, 220–223 616

Command prompt, 75, 125, 184, 190, 193, 199, 203, 207, 297, 355, 356, 371, 499, 591 Comments, 298, 480, 496, 500, 501 Configuration files and directories /.bash_history, 191, 247, 577 /.bash_logout, 191, 247, 493 /.bash_profile, 191, 247, 493–495, 499, 500, 502–504, 534 /.bashrc, 191, 247, 280, 495, 496, 499, 503, 504, 508, 510 /.ssh, 577 /etc/, 298, 300, 445 /etc/aliases, 65 /etc/bashrc, 468, 493, 494, 498–504, 519, 525 /etc/default, 465 /etc/default/grub, 464, 465, 467 /etc/fstab, 101, 131, 471, 561, 566, 571, 572, 578–585, 602, 603 /etc/group, 103, 298, 300, 522, 524, 525 /etc/passwd, 219, 286, 488 /etc/profile, 496–499, 501–504, 510, 519, 525, 527, 528 /etc/profile.d, 280 /etc/profile.d/bash_completion.sh, 497 /etc/profile.d/colorgrep.sh, 497 /etc/profile.d/colorls.sh, 497 /etc/profile.d/colorsysstat.sh, 498 /etc/profile.d/colorxzgrep.sh, 498 /etc/profile.d/colorzgrep.sh, 498 /etc/profile.d/less.sh, 498 /etc/profile.d/mc.sh, 498 /etc/profile.d/myBashConfig.sh, 498 /etc/profile.d/vim.sh, 498 /etc/selinux/config, 279 /etc/selinux/targeted/seusers, 279

Index

/etc/shadow, 303 /etc/skel, 493 /etc/sudoers, 221, 294, 297 /etc/sysconfig/network-scripts/ ifcfg-enp0s3, 294, 295, 363, 441, 442 /etc/sysconfig/network-scripts/ ifcfg-enp0s8, 362, 442 /etc/systemd, 471, 476, 482, 486 /etc/systemd/system, 475, 476, 482 /etc/systemd/system/default.target, 471, 477 /etc/systemd/system/ multi-user.target.wants/ sysstat.service, 475 /etc/system-release, 465, 466 /etc/xdg/xfce4/xinitrc, 193 /etc/yum.repos.d, 329 Console, 21, 75, 184, 187–189, 192, 247, 253, 304, 473, 477 virtual, 21, 53, 182, 184, 188–195, 197, 201, 202, 223, 224, 264, 409, 477, 483, 488, 490, 492, 504, 511, 525, 558 CPU, 9, 14, 25, 26, 28–31, 33, 34, 136, 234, 343, 346, 375, 387, 390, 453 usage, 154, 342, 344–347, 349–356, 364, 388 cron, 57 crontab, 63, 280 Cruft, 331, 558 cleaning code in scripts, old, 331 files, old, 331 packages, 331 programs, old or unused, 331 Customer Engineer, 39

D Data, 10, 11, 25, 26, 29, 34, 385–387, 572, 573 center, 9, 374 loss, 65, 563 open format, 48, 57–58 random, 261, 262, 269, 271, 437, 438, 447 stream, 42, 48, 50–52, 57, 62, 218–220, 225, 239–271, 320, 322, 355, 417, 433, 440, 442, 443, 446, 447, 456, 524, 530, 565 DEC, 3 PDP-7, 38, 39, 44 VAX/VMS, 3 VT100, 185, 186 Dependency, 277, 310 hell, 309–310, 315, 318 Desktop GNOME, 94, 104, 153, 154, 159, 196, 204, 395, 413, 479, 481, 484, 498 KDE, 10, 11, 104, 153, 154, 156–158, 170, 195, 204, 230, 277, 395, 410, 411, 444, 445, 479, 481, 484 LXDE, 10, 11, 158, 195, 395 Xfce, 11, 24, 104, 117, 145, 149, 153– 179, 196, 207, 226, 399, 485 Developer, 3, 5–7, 37, 40, 42, 47, 48, 55, 63, 64, 182, 187, 204, 275, 277, 278, 281, 371, 397, 478, 523, 552, 556, 557 Device data flow, 457, 458, 497, 498, 505, 582 disk, 97, 100, 112, 243, 597 special file null, 420, 457, 458, 498, 505, 582, 605 pts, 193, 194, 196, 197, 200, 229 random, 247 617

Index

Device (cont.) stty, 221, 227, 228 tty2, 192–195 tty3, 192–194 urandom, 247, 262, 267, 270 zero, 247 DevOps, 15, 40 Display manager, 157, 203, 284, 480, 482, 486–488 gdm, 479, 480, 483 kdm, 479, 483 lightdm, 157, 209, 360, 363, 475, 483, 484, 486 lxdm, 480, 483, 486 sddm, 480 xdm, 480, 486 DNF, 79, 162, 177, 183, 221, 269, 309–312, 315–318, 320–322, 324–327, 329–333, 359, 375, 396, 416, 437, 463, 464, 482, 483, 486, 558 Documentation philosophy, 49, 64, 228 process, 340 template, 200, 208, 216 Drive hard, 13, 14, 25, 26, 29, 31, 32, 35, 51, 52, 57, 59, 65, 81, 90, 91, 93, 95, 97, 98, 103, 104, 107, 108, 117, 123, 124, 126, 127, 129, 137, 144, 147, 197, 261, 262, 271, 366, 369, 375, 461, 489, 533, 541, 550, 557, 558, 562, 563, 566, 569–575, 588, 589, 594–600 optical, 118, 120, 121, 146, 586 partitioning, 123, 126 solid state, 26, 136, 514, 549 SSD, 14, 129, 136, 453, 549, 558, 573, 588, 590, 605 618

USB, 98, 100, 243, 244, 246, 250, 252, 254, 255, 258, 259, 261, 270, 296, 377, 453 DVD, 26, 122, 453, 486, 593

E Editor, 63, 273, 276, 280, 492 emacs, 63, 276 favorite, 63, 280 gedit, 276 Kate, 277 Leafpad, 277 text, 273–280 vi, 63, 275, 277, 280, 295, 298, 509 Vim, 63, 274–278, 280, 509 xed, 277 xfw, 277 Elegance, 8, 49, 61, 153 computer, 8, 156 hardware, 61 power and grounding, 62 software, 61 Elegant, 8, 49, 61, 133, 156 End User License Agreement, 470 Environment variables, 177, 210, 228, 284, 299, 491, 492, 495, 496, 504, 591

F Fedora, 12, 14, 20, 64, 75, 114, 117, 121, 123–125, 138, 140, 146, 175, 275, 309, 315, 320, 327, 358, 375, 462, 466, 473, 487, 495, 558, 560, 562 release, 104, 124, 328, 359, 587 30, 457, 465 29, 74, 117, 139, 149, 328, 587

Index

FHS (Filesystem Hierarchical Structure), 52, 53, 95, 536, 553, 556, 557, 606 File compatibilty, 52 cpuinfo, 370, 371 device, 51, 52, 186, 195–197, 240, 245, 250, 262 device special, 52, 197, 229, 244, 579 driver, 52, 461, 463 finding lost, 594 format ASCII text, 274, 369, 375, 444, 456, 458, 492, 533, 534, 556 binary, 192, 255, 423, 456, 534 closed, 57 open, 48, 57 globbing, 321, 435, 436, 438, 440, 448, 531 handle, 51, 241, 242 meminfo, 370, 371 meta-structures, 533, 538, 562 multiple hard links, 542 naming, 54 ownership, 290, 431, 516–520, 522 permissions, 209, 514, 517, 519–522, 529 sudoers, 295–301 timestamps atime, 216, 217, 532, 534, 536 ctime, 216, 532, 534, 535 mtime, 215, 532, 534, 535 File manager Dolphin, 156 Konqueror, 156 Midnight Commander, 312, 313, 318, 325, 332 Thunar, 155, 156, 169, 173, 174

Filesystem creating, 542, 560, 561, 594–595 definition, 58, 130, 550 directory structure /dev, 589 /etc, 101, 298, 445, 471, 491, 498, 522, 556 /home, 59, 130, 368, 561, 583, 592, 593 /mnt, 245, 249, 252, 253, 555, 605 /opt, 541, 557, 558 /proc, 341, 358, 365, 369–373, 392 / (root), 129, 557, 558, 583 /sys, 373 /tmp, 104, 105, 119, 174, 207, 217, 253, 291, 312, 317, 368, 423, 505, 519, 545, 557, 558, 591 /usr, 290, 557, 558, 583 /usr/local/bin, 206, 300, 314, 422, 423, 557, 583 /usr/local/etc, 423, 557, 583 /var, 541, 556–558 full, 132, 556, 557 Hierarchical Standard, 52, 53, 536, 553, 554, 556, 606 inode, 533, 541, 546, 567, 569–570, 594 journal, 570–572 Linux, 48, 52, 58, 130, 207, 461, 536, 552, 553, 557 namespace, 551 separate for data, 58, 129, 541, 555, 558, 591 types, 36, 131, 550, 556, 559–561, 576, 582, 602 BTRFS, 130, 550, 560, 604, 606 CDFS, 36 EXT3, 130, 550, 560, 562, 570–572 619

Index

Filesystem (cont.) EXT4, 100, 130–132, 455, 461, 533, 550, 556, 560, 562, 563, 570–572, 576, 582, 604, 606 FAT32, 245, 255, 256 HPFS, 36 NFS, 36, 556 VFAT, 36, 242, 561, 582, 583 XFS, 130, 461, 550, 556, 561, 604, 607 Filter, 51, 240, 263, 268, 269, 378, 534 Finding files, 445–448 Firewall, 10, 14 FOSS, 2, 12 Fragmentation, 535, 552, 569, 570, 572–578 effects on disk performance, 575 Free open source software, 2, 12 Free Software Foundation, 42

G Gancarz, Mike, 5, 40 Getty, 487, 488 GID, 194, 286, 287, 293, 518, 523, 526 GNU core utilities, 36, 41–43, 182, 205, 225, 226, 230, 236, 247 coreutils, 42, 43, 225–230 General Public License, 44 GPL, 5, 44 GNU/Linux, 205, 478 GPU, see Graphics Processing Unit (GPU) Graphical User Interface (GUI), 181, 204, 476, 480, 481 desktop Cinnamon, 10, 195, 395 GNOME, 104, 153, 154, 157, 158, 195, 196, 204, 395, 412, 479, 481, 484 620

KDE, 10, 11, 104, 153, 156, 157, 170, 195, 204, 277, 395, 410, 412, 479, 481, 484 LXDE, 10, 11, 158, 195, 395 Graphics Processing Unit (GPU), 25, 28 Group, 209, 286, 517, 520, 522, 525, 526, 528, 532, 546 wheel, 300, 301 Group ID, 194, 286, 445, 518, 523, 526, 533 GRUB, 178, 188, 255, 322, 323, 330, 453, 454, 456, 459, 461, 463–468, 470, 478, 489 stage 1, 454–459 stage 1.5, 455, 459, 461 stage 2, 461–464 GRUB2, 451, 454, 461, 462, 571 GUI, see Graphical User Interface (GUI)

H Hard drive, 13, 26, 35, 65, 90, 93, 98, 124, 126, 136, 254, 261, 369, 375, 377, 385, 533, 558, 572, 573, 589, 594, 596 crashes, 10, 11, 412, 558, 563 Hardware architecture, 59 Help facility, 178, 278, 364 option (-h), 102, 236, 251, 390 Hex, 274, 520 Hierarchy, 48, 49, 361, 553 Linux Philosophy for SysAdmins, 49 of needs, 48 Hiring the right people, 8

Index

Host naming, 157 StudentVM1, 106, 109, 125, 146, 148, 343, 405, 475, 578, 595, 598 StudentVM2, 405 HTML, 64, 444, 448 Hyper-Threading, 29, 73

I, J IBM Customer Engineer, 339 1401 mainframe, 44 PC DOS, 44, 386 training course, 24, 66, 67 inode, 533, 537, 541, 542, 564, 567, 569–570 direct, 569 indirect, 569, 570 Intel Core i7, 14, 73 Core i9, 27, 234 Interface captive user, 182 non-restrictive, 181 restrictive, 4, 181 ISO Fedora Xfce, 104, 117 file, 121 image, 52, 104–105, 118–120, 146, 177, 453, 576, 586, 587, 593

K Kernel, 10, 30–32, 34, 35, 41, 43, 67, 79, 131, 136, 177, 188, 226, 243, 322, 330, 347, 353, 369, 371, 418, 454, 462, 464, 467, 469, 470, 489, 552, 571

Konsole, 156, 196, 396, 398, 409–412 Kromhout, Bridget, 68 KVM, 184, 188, 189

L Languages compiled, 59 interpreted, 418 scripting, 59, 275, 276 shell, 198, 418 Libre Office, 6, 7, 12, 14, 32, 60, 64, 129, 140, 173, 205, 274, 275, 310, 327 Link, 78, 201, 290, 471, 476, 486, 536–537 hard, 517, 534, 537–545 multiple hard links, 542 soft, 475, 513, 543–546 symbolic, 445, 471, 475, 482, 537, 544 symlink, 475, 482, 543–546 Linux, 1–2, 5–6 boot, 188, 323, 373, 451–489 command line, 1, 22, 32, 52, 53, 181–223, 240, 395 directory tree, 52, 132, 206 distribution CentOS, 64, 117, 375, 454, 561 Fedora, 10, 12, 14, 20, 64, 74, 75, 77, 79, 104, 106, 114–126, 144, 188, 320, 454, 479, 558, 587 Red Hat, 64, 117, 129, 139, 188, 309, 315, 375, 451, 454, 517 RHEL, 64, 117, 315, 327, 372 history, 6, 37, 41 installation, 137, 184 kernel, 7, 30, 31, 34, 35, 41, 43, 44, 60, 140, 226, 230, 372, 452, 454, 462, 463, 470, 562 621

Index

Linux (cont.) startup systemd, 471–478 SystemV, 471–473 supercomputers, 11, 41 unified directory structure, 557–559 Live image Fedora, 118–123 Log files maillog, 209, 210 messages, 36, 441, 442 Logical Volume Management (LVM), 132, 133, 234, 421, 594 volume group, 132, 133 logical, 132, 594 physical, 132 LVM, see Logical Volume Management (LVM)

M Man pages, 20, 150, 178, 230, 231, 301, 311, 324, 341, 374, 427, 478, 579, 583, 591 Maslow, 48 Master boot record (MBR), 453–455, 459 McIlroy, Doug, 42, 50, 225, 241, 262, 263 Memory cost, 550 CRT, 550 RAM, 13, 14, 25, 26, 30, 32, 44, 73, 106, 112, 136–138, 353, 354, 373, 386, 404, 475, 549–550 type, 136 virtual, 30, 32, 137, 138, 140, 348, 365, 398, 402, 404, 409, 412 622

Mentor BRuce, 378, 599 Meta-characters, 395, 433, 440 Microsoft windows, 3 Midnight Commander, 312, 313, 318, 325, 332 MINIX, 5, 41, 230, 560, 562, 570 MOTD, 314 Motherboard, 14, 25–26, 28, 30, 44, 73, 372, 373, 375, 452 Mount point, 58, 129–132, 136, 140, 245, 250, 541, 550, 555, 558, 561, 566, 578, 582, 583, 594, 602 Multitasking, 31–35, 37, 42, 225

N Namespace, 551 Network interface, 84 interface card (NIC), 25, 26, 73, 441 interface configuration file, 84 Network Address Translation (NAT), 88, 89, 113, 114 Network, 88, 89, 113, 114 NTP, 556

O Octal, 267, 291, 456, 457, 520–522, 526, 529 Open data, 58 Open Document Format (ODF), 326 Open Source definition, 60 GPL2, 44 initiative, 13

Index

license, 5, 61 project, 49, 67 software, 1, 2, 6, 7, 12–14, 22, 58, 60–61 Opensource.com, 13, 60, 64, 67 Operating system, 2–5, 7, 8, 10–18, 22–45, 48, 50, 53, 55, 59, 63, 66, 74, 81, 99, 106, 107, 136, 175, 183, 196–198, 204, 205, 226, 276, 280, 284, 286, 354, 369, 372, 421, 441, 452, 454, 469, 538, 552, 553, 558, 561, 562 DEC VAX/VMS, 3 definition, 30–31 distributions CentOS, 64, 117, 188, 309, 315, 327, 372, 375, 454, 561 Fedora, 10, 12, 14, 20, 64, 74, 75, 77, 79, 104, 106, 114–126, 144, 188, 320, 454, 479, 558, 587 RHEL, 64, 117, 140, 309, 315, 327, 372, 454 Ubuntu, 300–302, 304 flexibility, 9–10

P, Q Packages, 60, 79, 82, 126, 183, 287, 299, 310, 311, 315–320, 323–331, 410 installing, 317–320 orphan, 324 removing, 324–326 RPM, 299, 309–312, 315–317, 320, 323, 332 Partition size, 128, 139 Path, 92, 497 PCI, 373, 374, 376, 442, 443 PCI Express (PCIe), 25

Permissions, 209, 227, 288, 290, 430, 431, 520–522, 529–531, 569 applying, 531–532 directory, 522 file, 520–522 group, 520, 522, 526 user, 520 Philosophy Linux, 13, 37, 40, 47–69, 103, 182, 185, 275, 280, 335, 339 Unix, 5, 40–41, 47, 50, 57, 62, 241, 263 Unix and Linux, 37, 48 Pike, Rob, 305 Pipe, 35, 242, 256, 262–264, 354, 387 Pipeline the pipeline challenge, 51, 240, 264– 265, 270, 429 Plain text, 55, 57, 58, 273 Pointy-Haired Bosses (PHB), 42, 47, 64, 182, 226, 293, 336 Portability, 59 Portable, 59–60 Power unleashing vs. harnessing, 66 Power-on self-test (POST), 452–454 Present working directory (PWD), 78, 79, 103, 184, 206, 207, 215, 249, 284, 290, 522 Printer driver, 35, 51, 240 USB, 30 Privilege, 72, 141, 142, 284, 293, 301, 302, 305 Privilege escalation, 293 Problem determination, 205, 263, 285, 335, 337, 340–342 resolution, 337 623


Problem solving, 7, 41, 189, 335–392, 556
  five steps
    action, 340
    knowledge, 337–338
    observation, 338–339
    reasoning, 339–340
    test, 340–341
  symptom-fix, 336
  using the scientific method, 336
Procedure, 8, 74, 82, 140, 311, 320, 341, 418, 422, 500, 597
  naming, 496
Process(es), 7, 29, 31, 32, 34, 40, 42, 93, 105, 125, 132, 136, 141, 188, 197, 201, 205, 248, 264, 270, 283, 311, 336, 337, 339–342, 348, 353, 354, 372, 452, 453, 464, 470, 471, 473, 478, 487, 561, 578, 579, 589, 594
  interprocess communication (IPC), 35
Processor, 9, 26–29
  AMD, 9, 25
  Intel, 29, 73, 77
Python, 59, 326

R
Random, 56, 163, 164, 261, 262, 437, 438, 447, 459, 473, 534, 594
Random-access memory (RAM), 13, 14, 25, 26, 30, 32, 44, 73, 106, 111, 112, 136–140, 348, 353, 354, 356, 369, 373, 386, 402, 404, 415, 421, 453, 455, 462, 470, 475, 549, 550
Randomness, 261–262
Raspberry Pi, 9, 11, 16, 25, 60
Raymond, Eric S., 5, 40

Recovery, 8, 59, 65, 73, 188, 275, 463, 465, 467, 468, 470, 571, 572, 578
  mode, 8, 188, 275, 467–470, 489, 578
Redirection, 51, 62, 239, 240, 265–268, 270
Repository
  adding, 327–330
  EPEL, 327
  Fedora, 327–329
  RPMFusion, 327–329
Requirements, 13, 56, 66, 73, 74, 242, 277, 313, 396–415, 473
Ritchie, Dennis, 5, 17, 38–40, 42, 61, 187, 225
Router
  StudentVM2, 405
  virtual, 88
RPM, 63, 73, 299, 309–316, 320, 323, 329
  groups, 326–327
  smartmontools, 377, 378
  sysstat, 386
  utils-1.0.0-1.noarch.rpm, 312–314

S
SAR, 386, 390
SATA, 26, 112, 113, 379, 596, 597
  ports, 113
  setting, 113
Scientific Method, 336, 337
screen, 201–202
Screen saver, 163, 164
Script, 6, 55, 57, 201, 275, 358, 429, 493–495, 508, 528, 591
  cpuHog, 349, 351, 352, 361, 423, 429
Secure Shell (SSH), 12, 21, 53, 173, 184, 188, 193, 201, 202, 292, 404, 405, 492, 558


Self-Monitoring, Analysis and Reporting Technology (SMART), 65, 375, 377, 385
  Reallocated_Sector_Ct, 380, 381, 385
  Reported_Uncorrect, 385
  reports, 385
  self-assessment test, 379, 385
SELinux, 194, 278–279, 286
Sets, 438
Settings manager, 157, 165, 167, 171
Shell, 198, 201, 417–448, 491–510
  Bash, 20, 150, 198, 200, 274, 395, 417, 418, 424, 448, 491, 494, 527
  Korn, 53, 183, 198, 199
  ksh, 36, 53, 183, 198, 199
  login, 284, 492, 495, 525
  nologin, 231
  non-login, 493, 495, 510
  program, 48, 54, 198, 421, 494, 557
  scripts
    comments, 200
    cpuHog, 349, 423, 428, 448
    doUpdates, 177
    maintenance, 273
    mymotd, 314
    naming, 54
    test1, 434
  secure, 21, 184, 201
  tcsh, 198
  Z, 198–200
  zsh, 183, 198, 199
Signals, 342, 357–359
  SIGINT (2), 358
  SIGKILL (9), 348, 358
  SIGTERM (15), 348, 357, 358

Snapshot, 146–148
Software
  open source, 2, 6, 7, 12–14, 58, 60–61
  proprietary, 2–4
  rights, 12–13
Solaris, 63, 280
Special pattern characters, 433, 435–438
Spock, 66
Standard Input/Output (STDIO), 48, 50–52, 62, 182, 239–242, 247, 263, 265, 270, 271
  STDERR, 241, 242, 263
  STDIN, 241, 263
  STDOUT, 51, 239, 241, 242, 263
Storage devices
  hard drive, 13, 14, 25, 26, 32, 52, 57, 65, 73, 90, 91, 103, 107, 108, 126, 129, 130, 136, 137, 261, 262, 377–386, 455, 533, 550, 558, 562, 569, 571–576, 596–603
  HDD, 549, 590
  RAM, 13, 14, 25, 26, 32, 44, 73, 106, 112, 136–140, 344, 353, 354, 369, 386, 402, 404, 421, 453, 462, 470, 549, 550
  SSD, 14, 26, 129, 136, 453, 549, 558, 573, 588, 590
  USB external drive, 35, 81, 90, 91, 93, 95, 114, 453, 558
  USB thumb drive, 20, 52, 104, 118, 122, 242–246, 254, 296, 558
Stream
  data, 51, 241
  standard, 62
  text, 50, 239, 241
StudentNetwork, 89, 90, 113, 114
sudo
  bypass, 302–304


Supercomputers, 10, 11, 25, 41
Swap
  file, 137, 471, 473
  partition, 32, 136–138, 140, 582, 595
  space, 32, 136–140, 353, 354, 390
Switch user, 176, 284, 285, 519, 525, 527
SysAdmin
  lazy, 54, 56, 59, 166, 214, 280, 296, 304, 305, 424, 508
  productivity, 2
System Activity Reporter, 386
System Administrator, 2, 5, 15, 40, 42, 48, 63, 68, 225, 275, 280, 283, 293, 295, 301, 302, 375, 516
systemd, 63, 136, 470–478, 482, 488, 561, 585
  default target, 471, 475–478
  service, 471–473, 478, 483, 488
  targets, 471, 472
SystemV, 63, 451, 468, 471–473, 478, 561

T
Tab completion, 212, 214
Tanenbaum, Andrew S., 5, 562
tar
  tarball, 311
Teletype ASR 33
Tenets
  Always use shell scripts, 35
  Automate everything, 54–55
  Follow your curiosity, 65
  Test early, test often, 55–56
  There is no should, 66–67
  Use the Linux FHS, 52–53
Terminal, 21, 32, 50–53, 79, 94, 140, 155–157, 168, 169, 173, 175, 183–187, 195–198, 201–205, 240, 275, 305, 348, 364, 391, 395–415, 487, 493
  console, 21, 53, 75, 184, 187, 247
  dumb, 185, 186, 487
  emulator
    Konsole, 410–412
    LXTerminal, 402–404
    rxvt, 398
    Terminator, 412–415
    Tilix, 404–410
    xfce4-terminal, 398–402
    Xterm, 396
  pseudo, 51, 196–197, 240
  session, 21, 32, 33, 50, 52, 78, 81, 173, 175, 176, 184, 197, 201, 202, 205, 240, 305, 342, 391, 396, 399, 404, 406, 409, 410
  Teletype, 187
  TTY, 186, 487
Test
  plan, 56
  sample, 6
Testing
  automated, 9, 332
  fuzzy, 56
  in production, 56
Thinking
  critical, 67
Thompson, Ken, 5, 37–40, 42, 50, 187, 225, 241, 440
Thrashing, 137–139, 353
Tilix, 21, 156, 196, 204, 398, 404–410
Time sharing, 186
Torvalds, Linus, 1, 2, 5, 6, 41, 43, 44, 52, 226, 562
Transformer, 51, 52, 239, 240, 242, 263, 264, 268


U
udev, 473
UID, 286, 287, 293, 371, 445, 518
Unics, 37–39, 42, 44, 226
Universal interface, 50–51, 241
Unix, 5, 7, 12, 17, 37–42, 44, 47, 48, 50, 57, 61, 66, 183, 185, 186, 188, 225, 226, 240, 241, 275, 305, 480, 517, 522, 562
Updates, 2, 33, 79, 156, 161, 162, 177, 250, 299, 470, 491, 571
  installing, 79, 175–178, 299, 315, 320–323
Upgrade, 13, 14, 59, 310, 320, 536, 537, 583
USB
  bus, 25
  external backup drive, 26, 90
  Live, 104, 118, 317, 453
  thumb drive
    prepare, 242–247
User
  ID, 75, 194, 205, 286, 304, 445, 492, 497, 522, 525, 569
  non-root, 8, 19, 75, 76, 103, 140, 142, 150, 191, 256, 284, 287–289, 292, 293, 302–305, 522, 523
  privileged, 72
  root, 8, 72, 285, 290, 367
  student, 157, 199, 207, 215, 284, 288, 294, 295, 303, 349, 371, 431, 520, 521
  UID, 286
  unprivileged, 224, 293, 527
Utilities
  core, 7, 36, 41–45, 51, 52, 62, 73, 182, 198, 205, 225–236, 239, 240, 275, 421, 556, 591
  GNU, 42, 43, 45, 226, 230

V
Variables
  $?, 429, 492
  content, 57
  environment, 177, 206, 210, 284, 299, 422, 491, 492, 495, 504, 505, 508, 591
  $HOSTNAME, 420
  $MYVAR, 420, 506, 507
  naming, 57
  $PATH, 177, 206, 284, 285, 421, 422, 493, 496, 503
  $PWD, 207, 210, 212, 215
  $SHELL, 440
VirtualBox Manager, 81, 85–87, 105, 121, 146, 148, 586, 596
Virtual drive, 146, 593, 604
Virtual Machine (VM), 14, 18, 44, 73, 86, 90, 93, 95, 105, 106, 109, 114, 121, 132, 146, 147, 159, 287, 453, 470, 596, 604
Virtual Memory, 30, 32, 137, 138, 140, 348, 365, 402, 404, 409, 412
Virtual Network, 18, 71, 72, 74, 86, 88–90, 405
VM, see Virtual Machine (VM)
Volume
  group, 129, 131–135, 234, 559
  logical, 58, 130–134, 234, 236, 421, 461, 533, 541, 550–552, 558, 559, 576, 594, 606

W
Window manager, 172, 478–481, 484–487, 489
  Compiz, 482, 483
  Fluxbox, 482, 483
  FVWM, 482–485
  twm, 482
  xfwm4, 482
Windows
  closed, 3, 4, 9
Workspace switcher, 154, 174, 175

X
Xfce
  desktop, 11, 24, 104, 117, 145, 153–179, 196, 206, 399, 485
  panel, 145
Xterm, 196, 396, 484, 485

Y, Z
YUM, 315–316, 332

Command list
adventure, 44
alias, 103, 199, 298, 299, 422, 445, 491, 493, 508–510
atop, 342, 344, 357, 358, 361, 364, 366, 391
awk, 263, 530
bash, 20, 36, 43, 53, 54, 63, 150, 181, 182, 198, 212, 226, 275, 276, 280, 418, 420, 421, 424, 427, 429, 498, 507
cal, 219, 231, 233
cat, 210, 217, 222, 268, 281, 331
cd, 207, 211, 215
chgrp, 103, 518, 527
chmod, 103, 291, 529
chown, 518
column, 230
cp, 36, 52, 215
date, 219, 297
dd, 52, 254–256, 259, 271, 456, 457

df, 95, 102, 235, 250, 251
dmesg, 97, 218, 243, 246, 441
dmidecode, 372, 373
dnf, 79, 315, 316, 322, 332
du, 227
dumpe2fs, 564, 565
e2label, 100
echo, 52, 267, 421, 428
egrep, 444
emacs, 63, 276, 280
exit, 285, 506
export, 506
fdisk, 244, 559, 595, 600
file, 534
find, 445, 447, 448, 542
for, 515
free, 365, 366
fsck, 573, 585, 594
getopts, 427, 428
gpasswd, 523
grep, 102, 210, 222, 268, 269, 431, 440–442, 445
groupadd, 523
grub2-mkconfig, 465, 467, 470
hddtemp, 375
history, 220, 222
htop, 342, 357, 359, 401
hwclock, 231
id, 194, 286
info, 205, 226
iostat, 366, 367
iotop, 367, 368
killall, 485
ksh, 36, 53, 198
ll, 103, 199, 400, 509, 530
ln, 540
ls, 36, 57, 185, 206, 217, 222, 436, 444, 445, 508


lsblk, 234, 236, 355, 468
lscpu, 26–28, 44, 234
lshw, 372, 373
lspci, 373, 374
lsusb, 373, 374
lvs, 134
make, 311
man, 20, 150, 178, 205, 233
mandb, 178, 323
mkdir, 211, 212
mkfs, 230
mount, 43, 230, 231, 561, 571, 583, 602
nice, 342
od, 456
passwd, 219, 286
printenv, 505
pwd, 206, 215
pwgen, 437
renice, 231
rm, 217, 249, 436, 545, 546
rpm, 310–315
runlevel, 468
sar, 386, 387, 390, 391
screen, 189, 202, 203, 493, 507
script, 591
sed, 465, 466
sensors, 375
sensors-detect, 375
seq, 249, 515
set, 418, 419, 438
shred, 262
smartctl, 375, 377
sort, 263, 275, 447
stat, 217, 534
strings, 534
su, 78, 177, 284, 285, 305, 306, 519
sudo, 285, 293, 294, 297, 299–302, 305
systemctl, 477, 478, 483, 490

tail, 263, 524
tar, 311
time, 435
top, 342, 350, 357, 358, 366, 393, 413
touch, 431, 527
tune2fs, 576
type, 17, 53, 183, 185, 427, 428
umask, 527, 528
umount, 43, 230, 231, 270
uniq, 263, 264
unlink, 546
unset, 426, 507
useradd, 516, 523
usermod, 523
vgs, 295, 307
vi, 63, 298, 509
vim, 218, 275, 278, 509, 510, 527
w, 99, 192, 229
watch, 250, 251, 365, 366
which, 316
who, 193, 194, 229
who am i, 194
yes, 248, 250
yum, 315–316
zsh, 198

List of operators
#!, 274, 349
&, 440
&&, 429–432
*, 321
;, 440
>>, 101, 266, 268, 440
?, 438
[], 438
{}, 440, 542
|, 263
||, 429, 431, 432
