2-2: The Linux Filesystem – Bioinformatics Web Development

When approaching Linux systems for the first time, it is essential to understand how files and directories are organized and how to refer to a particular file or directory in order to perform actions on it. To refer to a file or directory we need to know its relative or absolute path in the filesystem.

The Linux root directory

Everything in the Linux filesystem is included within a top directory (you can think of a directory just as a folder that contains other folders or files) that is called the Root directory.

Table 2-2-1: Subdirectories of the root directory in Linux systems
Source: This page of The Linux Documentation Project

Directory	Content
/bin	Common programs, shared by the system, the system administrator and the users.
/boot	The startup files and the kernel, `vmlinuz`. In some recent distributions also `grub` data. Grub is the GRand Unified Boot loader and is an attempt to get rid of the many different boot-loaders we know today.
/dev	Contains references to all the CPU peripheral hardware, which are represented as files with special properties.
/etc	The most important system configuration files are in `/etc`. This directory contains data similar to those in the Control Panel in Windows.
/home	Home directories of the common users.
/initrd	(on some distributions) Information for booting. Do not remove!
/lib	Library files, includes files for all kinds of programs needed by the system and the users.
/lost+found	Every partition has a `lost+found` in its upper directory. Files that were saved during failures are here.
/misc	For miscellaneous purposes.
/mnt	Standard mount point for external file systems, e.g. a CD-ROM or a digital camera.
/net	Standard mount point for entire remote file systems
/opt	Typically contains extra and third party software.
/proc	A virtual file system containing information about system resources. More information about the meaning of the files in `proc` is obtained by entering the command `man proc` in a terminal window. The file `proc.txt` discusses the virtual file system in detail.
/root	The administrative user’s home directory. Mind the difference between /, the root directory and /root, the home directory of the root user.
/sbin	Programs for use by the system and the system administrator.
/tmp	Temporary space for use by the system, cleaned upon reboot, so don’t use this for saving any work!
/usr	Programs, libraries, documentation etc. for all user-related programs.
/var	Storage for all variable files and temporary files created by users, such as log files, the mail queue, the print spooler area, space for temporary storage of files downloaded from the Internet, or to keep an image of a CD before burning it.

Absolute paths

For path purposes, the root directory is indicated with a slash. The absolute path of the root directory is:

/

An absolute path is a path that indicates the position of a file or directory with respect to the root directory. Therefore, absolute paths always start with a slash. For example, the absolute path of the folder “etc”, contained directly within the root folder in all Linux systems, is:

/etc

where the slash indicates the root folder.

The Linux Shell – Using the Terminal

Let’s start using a Linux shell to explore the filesystem. In Ubuntu, a Terminal window can be opened with the following keyboard shortcut:

Ctrl + Alt + T

Or, in Ubuntu Unity: Dash -> Search Terminal

Or, in Gnome: Applications menu -> Accessories -> Terminal

Figure 2-2-1: Linux Terminal to access the Shell

When we open the Terminal, we see a line that ends with a dollar sign $. This line is called the prompt; this is where we type commands to interact with the computer.

Typing Shell Commands – The pwd Print Working Directory command

As a first command, let us type “pwd”:

andrea@ubuntu:~$ pwd

and press the enter key.

pwd stands for “Print Working Directory” and returns the absolute path of the current directory, the folder in which we are currently located within the file system, as shown in figure 2-2-2. In this case, we are in the home directory of the user “andrea”. In Linux, every user is assigned a home folder within the

/home

directory, with the same name as the user. So the user “andrea” has a folder called “andrea”, located in a “home” folder, contained in the root directory “/”:

/home/andrea
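
As a minimal sketch of the same idea outside an interactive session, the lines below run pwd and keep its output in a variable. The variable name current_dir is just a choice made for this example. Whatever directory we happen to be in, the value should start with a slash, because pwd reports an absolute path.

```shell
# Print the absolute path of the directory we are currently in.
pwd

# Keep the same value in a variable (the name current_dir is arbitrary).
current_dir=$(pwd)
echo "We are in: $current_dir"
```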

Absolute Path Anatomy

In this absolute path of the “andrea” folder, as in all absolute paths, the first slash indicates the root folder, while the subsequent slashes are just separators that delimit the directory names.

Figure 2-2-3: The meaning of the slashes in an absolute path. While the first slash indicates the root directory, the subsequent slashes are separators for the directory or file names

When a user opens a terminal window, either locally or from a remote location (by connecting to the server over SSH, which will be discussed later in this chapter), the default location within the file system is the user’s home directory, /home/username.

The ls shell command to list directory contents

Let’s have a look at the contents of the Linux root directory by using the “ls” (LiSt) command. The “ls” command takes as an optional argument the name of the directory whose contents we wish to list. So to list the contents of the Root directory we can type:

andrea@ubuntu:~$ ls /

Without arguments, it will list the contents of the present working directory, the one we “are in” when issuing the ls command.
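
The two forms of the command can be tried one after the other, as in the short sketch below. It also captures the listing of the root directory in a variable (root_entries, a name chosen for this example) to look for the etc entry, which we assume is present as on any standard Linux installation.

```shell
# List the contents of the root directory by passing its path as the argument.
ls /

# With no argument, ls lists the present working directory instead.
ls

# The output can also be captured, for example to look for one entry.
root_entries=$(ls /)
echo "$root_entries" | grep -x etc
```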

Figure 2-2-4: Listing the contents of the Root directory with the ls command

You can check that the listed directories and files are the same ones we see in the graphical user interface on the same machine. In the graphical user interface, the Root folder is called “File System”.

Figure 2-2-5: Listing the contents of the Root directory with the Ubuntu graphical interface

As you can see, the Root directory contains a number of files and folders. These are created at the time of system installation. During this course we will become familiar with some of these locations. For now, recall that the home folder in the root directory contains the users’ folders, one for each user, with the same name as the user. The etc folder contains important configuration files for the various software packages installed on Linux. For example, the configuration files for the Apache web server software are located in:

/etc/apache2/

while the php.ini file, important for the configuration of PHP, is located in a directory whose name is, or starts with, “php”, located within the /etc directory. The exact location varies according to the PHP version installed on your Linux OS. For example, for PHP 5 the location was:

/etc/php5/apache2/php.ini

while for PHP 7, the location is:

/etc/php/7.0/apache2/php.ini

Note that the configuration files for Apache and PHP will only be present on the machine after we have installed a LAMP server (Linux, Apache, MySQL, PHP).
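
To see which of these locations exist on a given machine, a small loop such as the following can help. This is only a sketch: the three paths are the ones quoted above, the variable names candidate and checked are made up for the example, and on a system without Apache or PHP every line will simply report “not present”.

```shell
# Paths taken from the text above; which of them exist depends on the machine.
checked=0
for candidate in /etc/apache2 /etc/php5/apache2/php.ini /etc/php/7.0/apache2/php.ini
do
    if [ -e "$candidate" ]; then
        echo "present:     $candidate"
    else
        echo "not present: $candidate"
    fi
    checked=$((checked + 1))
done
echo "Checked $checked locations"
```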

The cd shell command to change directory

Let us introduce the Change Directory “cd” command, that allows us to change our present working directory and move around in the file system. If called with no arguments, the cd command will return us to our default directory, our home directory. Lost in the file system? Here is a neat trick: just type cd and hit the return key to go back home.

andrea@ubuntu:~$ cd

We can call “cd” by supplying an absolute or relative (we’ll come to those in a moment) path to a directory in the filesystem. For example:

andrea@ubuntu:~$ cd /etc

will move us to the /etc directory.

We can then list the current directory contents with “ls”, with no arguments. This shell session is shown in the next figure. Note how, when we move to the /etc directory with the cd command, the prompt line changes from:

andrea@ubuntu:~$

to:

andrea@ubuntu:/etc$

to reflect our new current directory location.
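
The same session can be written without the prompt, as a sequence of commands to type one after the other. This is a minimal sketch that uses only absolute paths, assuming the standard /etc directory exists.

```shell
cd /etc     # move to /etc using its absolute path
pwd         # confirm where we are
ls          # list the contents of /etc, our new working directory

cd /        # move to the root directory
pwd         # confirm again
```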

Please note, in the listing of the /etc content, the apache2 folder (only available after the Apache web server installation). As already mentioned, this is the directory that contains all the Apache web server configuration files. We will spend some time discussing those files and the Apache configuration later in this chapter.

Linux Relative paths

The absolute path is the path of a file with respect to the root (/) folder. All absolute paths start with a slash. If a path starts with a slash, it is an absolute path. It’s good to have some fixed reference points in life, this is one.
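
The “starts with a slash” rule is simple enough to be checked mechanically. The function below is a sketch written for this page (classify_path is a made-up name, not a standard command): it looks only at the first character of its argument and reports anything that does not start with a slash as relative.

```shell
# Report whether a path is absolute or relative, judging only by its first character.
# (classify_path is a name made up for this sketch.)
classify_path() {
    case "$1" in
        /*) echo "absolute" ;;
        *)  echo "relative" ;;
    esac
}

classify_path /etc/apache2      # prints: absolute
classify_path seqs/protein      # prints: relative
classify_path ../carl           # prints: relative
```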

A relative path is the position of a file or directory with respect to a directory other than root. This reference directory could be our present working directory or the directory a script is executing from, for example. Relative paths never start with a slash. Let us consider the example situation in the next figure.

The figure represents a Linux filesystem with a root directory “/” and a number of child directories, including “etc” and “home”, which we are already somewhat familiar with: the first contains configuration files for a number of Linux applications, including the Apache web server (when installed), and the second contains the users’ home directories.

In this example we have four users: andrea (home directory /home/andrea), joe (home directory /home/joe), anne and carl. Each user’s home directory also contains a folder called “seqs” that in turn contains two subfolders called “dna” and “protein”. In the figure we have only represented the “seqs” folders of joe and carl.

Let’s imagine joe logs into the machine remotely via SSH. He will see a prompt similar to this one:

joe@ubuntu:~$

The absolute path of his present working directory will be:

/home/joe

The relative path of joe’s present working directory will be, instead, just a dot:

.

The relative path of the parent directory of /home/joe, that is the /home directory, relative to joe’s home directory /home/joe, is indicated by two dots:

..

The command

cd ..

brings us “up one directory” in the filesystem hierarchy.

So, back to joe’s example, if joe now types:

joe@ubuntu:~$ cd ..

he will change his working directory from /home/joe to /home. This will be reflected in the prompt:

joe@ubuntu:/home$ pwd

/home

joe@ubuntu:/home$

joe can now type “cd” without arguments to go back to his home directory /home/joe:

joe@ubuntu:/home$ cd

joe@ubuntu:~$

So we now know how to use the “cd” command in combination with a relative path in order to move up one directory (cd ..) or stay where we are (cd .). You may notice that the latter is exquisitely useless in practice, although it makes sense to mention it here for educational purposes. Please note that each directory can contain several other directories, called the child directories, but can only have one single parent directory.
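
Both notations can be tried on any machine, without needing joe’s account. The sketch below assumes that the directory /usr/bin exists, as it does on standard Linux installations, and uses it as a starting point.

```shell
cd /usr/bin     # start somewhere with a known parent directory
cd ..           # move up one directory
pwd             # we are now in /usr

cd .            # "move" to the current directory: nothing changes
pwd             # still /usr
```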

How can joe move to his seqs folder (see figure 2-2-7)? What is the relative path of /home/joe/seqs with respect to /home/joe?

This is especially easy; it is simply:

seqs

or, which is equivalent:

./seqs

in which the dot stands for the current directory, the slash is a separator, and seqs is the seqs folder.
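
To check that the two spellings really lead to the same place, we can recreate a small piece of the example tree in a temporary directory. This is a sketch under a few assumptions: the mktemp command is available (it is on common Linux distributions), and the names sandbox, first and second are chosen only for this example.

```shell
# Recreate a small piece of the example tree in a temporary directory.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/home/joe/seqs"

cd "$sandbox/home/joe"
cd seqs                 # relative path without the leading dot
first=$(pwd)

cd "$sandbox/home/joe"
cd ./seqs               # same target, written with the dot
second=$(pwd)

# Both lines show the same directory.
echo "$first"
echo "$second"
```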

You should start to get the feel of what a relative path is. Let’s further clarify this through some examples, in which pwd is our present location (present working directory) and target is the directory whose relative path we need to know, with respect to our present location in the filesystem.

pwd: /home/joe
target: /home/joe/seqs/protein
relative path: seqs/protein

pwd: /home/joe
target: /home/carl
relative path: ../carl

pwd: /home/joe/seqs/protein
target: /home/carl/seqs/protein
relative path: ../../../carl/seqs/protein

pwd: /
target: /etc/apache2
relative path: etc/apache2

pwd: /home/joe/seqs/dna
target: /home/joe/seqs/protein
relative path: ../protein

We encourage you to carefully understand each one of these examples, also by referring to figure 2-2-7 to get a “visual” scheme of what is going on.
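
One way to test these examples is to rebuild the tree of the figure inside a temporary directory and follow the relative paths with cd. The following is a sketch, not part of the original exercise: it assumes mktemp is available, and the temporary directory (stored in the made-up variable sandbox) plays the role of the root directory.

```shell
# Build the home/joe and home/carl branches of the example tree.
sandbox=$(mktemp -d)
for user in joe carl; do
    mkdir -p "$sandbox/home/$user/seqs/dna" "$sandbox/home/$user/seqs/protein"
done

cd "$sandbox/home/joe"
cd seqs/protein                   # now in home/joe/seqs/protein
cd ../../../carl/seqs/protein     # now in home/carl/seqs/protein
after_carl=$(pwd)

cd "$sandbox/home/joe/seqs/dna"
cd ../protein                     # now in home/joe/seqs/protein
after_sibling=$(pwd)

echo "$after_carl"
echo "$after_sibling"
```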

Since the absolute path uniquely locates a file or directory in the filesystem, you may wonder why there is a need for such a thing as a relative path and all this dot notation.

The answer is that relative paths are extremely useful when you want to make things “portable”, for example from one computer to another, from one user account to another, or simply to change the location of a folder within your own account.

Portability: “Relating to or being software that can run on two or more kinds of computers or with two or more kinds of operating systems.”

It is quite common to have all the files composing a software project grouped in a folder. For example, a folder “website1” could contain a number of .html files for a website, and a subfolder called “images” with all the image files for the same website. The html files contain pointers to image files in the “images” folder. If these pointers are coded as relative paths

images/image1.png

instead of as absolute paths

/home/joe/public_html/website1/images/image1.png

and similarly, all the links between the web pages in the folder are typically coded as relative

page1.html

instead of as absolute

/home/joe/public_html/website1/page1.html

when you move the folder around in the filesystem, or even to a different PC, all the pointers and links between the various html files will still be valid, while the same is obviously not true if the links are coded as absolute paths.
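
The effect can be reproduced in the shell with a throwaway folder. In this sketch the folder is created under a temporary directory, an empty file stands in for the image, and the folder is then renamed to website2 (all of these names are assumptions made for the example). The shell resolves relative paths from the present working directory, much as a browser resolves them from the location of the page.

```shell
work=$(mktemp -d)
mkdir -p "$work/website1/images"
touch "$work/website1/images/image1.png"      # empty stand-in for a real image
old_absolute="$work/website1/images/image1.png"

# Move (rename) the whole project folder.
mv "$work/website1" "$work/website2"
cd "$work/website2"

# The relative path is resolved from the folder's new location, so it still works.
if [ -e images/image1.png ]; then relative_ok=yes; else relative_ok=no; fi

# The absolute path still points at the old location, which no longer exists.
if [ -e "$old_absolute" ]; then absolute_ok=yes; else absolute_ok=no; fi

echo "relative path still valid: $relative_ok"
echo "absolute path still valid: $absolute_ok"
```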

By now you should have a solid basic understanding of what the Linux filesystem is, and how you can move around with the cd command and list directories and files with ls. You can use this knowledge to start exploring your Linux filesystem. If you are lost, just type pwd to know your current directory. Type cd and hit the return key to go back home.

More on Linux files: invisible files, ownership and permission

We have seen that the ls command returns a list of the names of the files contained within our present working directory (figure 2-2-6) or in the directory whose path we supply as an argument to ls.

As with many commands in Linux, we can however call “ls” with some additional options. Calling “ls” with options will modify the kind of output we get from the command. There are two options that are particularly useful:

– the “l” option, which gives an output with one file per line, along with a number of details about each file
– the “a” option, which lists the invisible files in addition to the visible files. In Linux, an invisible file is a file whose name starts with a dot. You can make a file “invisible” by adding a dot at the beginning of the file name.

We can call “ls” with both the “l” and the “a” options, like this:

andrea@ubuntu:~$ ls -la

Let’s have a look at a sample output:

We now have an output organized in several lines, one for each file, and several columns that list different properties of the file. In order to be able to administer our Linux machine we need to fully understand this output.

Figure 2-2-9: Understanding the ls -la output

The last column contains the file name. Please note, in figure 2-2-9, the first two files: “.” and “..”. These indicate, with the relative path notation we are now familiar with, the current directory and its parent directory.

We then have a few invisible files listed (because we used the “a” option with the “ls” command) such as .bash_history and .bash_logout. They are invisible because their names start with a dot. This has nothing to do with the dot that indicates the current directory, by the way.
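
The difference between the plain listing and the “a” listing is easy to observe in an empty temporary directory. The sketch below assumes mktemp is available and that the commands are run as an ordinary user. The file names visible.txt and .hidden.txt are invented for the example.

```shell
demo=$(mktemp -d)
cd "$demo"
touch visible.txt .hidden.txt

ls        # shows only visible.txt
ls -a     # also shows ., .. and .hidden.txt
ls -la    # same entries, one per line, with details

# Keep both listings to compare them.
plain_listing=$(ls)
full_listing=$(ls -a)
```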

File types

We also see a file called “Desktop”. This is a directory. Indeed, in Linux a directory is just a particular type of file.

Table 2-2-2: Linux file types

- Regular file
d Directory
l Link
c Special file
s Socket
p Pipe
b Block device

To know the type of one of the files listed with ls -l we can look at the very first character of each line. See the following two examples:

drwxr-xr-x 6 andrea andrea 4096 2011-03-23 06:23 Desktop

-rw-r--r-- 1 andrea andrea 3103 2010-10-31 20:06 .bashrc

The first file is a directory, because the first character is a “d”. The second file is a regular file, because the first character is a “-”. For the purposes of this course, regular files and directories are the only file types we will need to deal with.
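
The first character can also be extracted with the cut command, which makes the check easy to repeat on any file. In this sketch a directory and a regular file are created in a temporary location. The names demo, dir_type and file_type are assumptions made for the example.

```shell
demo=$(mktemp -d)
mkdir "$demo/Desktop"
touch "$demo/notes.txt"

# ls -ld describes the entry itself; cut -c1 keeps only the first character.
dir_type=$(ls -ld "$demo/Desktop" | cut -c1)
file_type=$(ls -ld "$demo/notes.txt" | cut -c1)

echo "Desktop:   $dir_type"
echo "notes.txt: $file_type"
```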

File permissions

Following the first character, we have nine more characters that define the file’s read, write and execute permissions for the owner, the group, and everybody else. These “rwxr-xr-x” notations might seem complicated at first. The goal of the next paragraphs is to explain how this works, in simple terms.

Figure 2-2-10: Understanding Linux files permissions – The 777 permission

As we can see in figure 2-2-10, there are also some numeric notations we need to deal with. We’ll come to those in a moment. For now, let us concentrate on the first two lines of figure 2-2-10. The orange line shows the nine characters. They are in fact three groups of three characters each. The first group of three characters defines the permissions for the file’s owner. Let’s look closely at these three characters, one by one.

– The first can be either “r”, which indicates read permission, or “-”, which indicates no read permission.
– The second can be “w”, which indicates write permission, or “-”, which indicates no write permission.
– The third can be “x”, which indicates execute permission, or “-”, which indicates no execute permission.

In the example in figure 2-2-10, the owner has the “rwx” permission, which means the owner can read, write and execute the file. If the owner could only read and write, but not execute, this would have been “rw-”, as in the example in figure 2-2-11.

Figure 2-2-11: Understanding Linux files permissions – The 644 permission

Numeric scores are associated with these read, write and execute permissions. Read is worth 4, write 2, execute 1. The final score is a single number, the sum of the three individual scores. Here are some examples:

r w x = 4 + 2 + 1 = 7

r - x = 4 + 0 + 1 = 5

r w - = 4 + 2 + 0 = 6

r - - = 4 + 0 + 0 = 4

This three-character scheme is repeated for the owner of the file, the group, and everyone else (others). This is why we have 9 characters to define permissions in the output of the ls -l command.

For each group of three, we can summarize the permission with a single number, as just shown. So the final permission for a file becomes a group of three numbers, such as “777” (figure 2-2-10), “644” (figure 2-2-11) or “755” (figure 2-2-12).

Figure 2-2-12: Understanding Linux files permissions – The 755 permission
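
The arithmetic can be spelled out as a small shell sketch. The two functions below, triplet_score and permission_number, are names invented for this page and are not standard commands. The first scores one group of three characters, the second splits the nine characters into owner, group and others and joins the three scores.

```shell
# Score one three-character group such as "rwx" or "r-x": r = 4, w = 2, x = 1.
triplet_score() {
    score=0
    case "$1" in r??) score=$((score + 4)) ;; esac
    case "$1" in ?w?) score=$((score + 2)) ;; esac
    case "$1" in ??x) score=$((score + 1)) ;; esac
    echo "$score"
}

# Turn nine permission characters into the three-digit notation.
permission_number() {
    owner=$(printf '%s' "$1" | cut -c1-3)
    group=$(printf '%s' "$1" | cut -c4-6)
    others=$(printf '%s' "$1" | cut -c7-9)
    echo "$(triplet_score "$owner")$(triplet_score "$group")$(triplet_score "$others")"
}

permission_number rwxrwxrwx     # prints 777
permission_number rw-r--r--     # prints 644
permission_number rwxr-xr-x     # prints 755
```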

Permissions for a file can be changed with the

chmod

command. This will be discussed in the next section Basic Linux Shell Commands.
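
As a brief preview, and only as a sketch, the numeric notation can be passed directly to chmod. The example works on an empty file in a temporary directory (example.txt and the variable names are made up here) and uses cut to show just the nine permission characters from the ls -l output.

```shell
demo=$(mktemp -d)
touch "$demo/example.txt"

# Apply the 644 permission and read back the nine permission characters.
chmod 644 "$demo/example.txt"
mode_644=$(ls -l "$demo/example.txt" | cut -c2-10)

# Apply the 755 permission and read them back again.
chmod 755 "$demo/example.txt"
mode_755=$(ls -l "$demo/example.txt" | cut -c2-10)

echo "644 -> $mode_644"
echo "755 -> $mode_755"
```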

If you want to learn more about Linux file permissions, we suggest you visit the excellent file permissions page on Tuxfiles.

Moving around in the filesystem is not enough. You need to learn a few commands in order to be able to change things on your computer, such as creating and editing files, renaming files and directories and moving them around, creating and deleting users, changing file permissions and ownership, installing and updating software and more. Let’s move on and learn the Basic Linux Shell Commands.
