Learn the Art of Linux Troubleshooting

Here are some valuable tips to help you recover important system files in RHEL 5 that have been deleted by accident.

Everything seems fine when your Linux machine works just the way you want it to. But that feeling changes dramatically when your machine starts creating problems that you find really difficult to sort out. Not everyone can troubleshoot a Linux machine efficiently… but you can, if you are ready to stay with us for the next few minutes. Let’s look at what to do when some of your important system files are deleted or corrupted under Red Hat Enterprise Linux 5. The action begins now!

Scene 1: /etc/passwd is deleted
This is an important file in Linux as it contains information about user accounts (the password hashes themselves are kept in /etc/shadow). If it’s missing in your system and you try to log in to a user account, you get the error message Login incorrect, and the problem persists after restarting the system.

Now that you have seen the problem and its consequences, it’s time to solve it. Boot into single user mode. At the start of booting, press any key to enter the GRUB menu. Here you will see a list of the operating systems installed. Just select the one you are working with and press e.

It’s time to have some fun with kernel parameters. So highlight the kernel and again press e to edit its parameters.
Next, instruct the kernel to boot into single user mode, which is also known as maintenance mode. Go to the end of the kernel line, type a space followed by 1, and press the Enter key. Now press b to continue the booting process.
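Once the shell prompt appears, you may want to confirm that the kernel really received the parameter you typed. The sketch below is one way to do that; the function name booted_single_user is made up for this example, and it assumes that init treats a standalone 1, single, s or S on the kernel command line as a request for single user mode.

```shell
# Return success if a kernel command line asks init for single user mode.
# Assumption: a standalone "1", "single", "s" or "S" is such a request.
booted_single_user() {
    cmdline=$1
    # $cmdline is left unquoted on purpose so the shell splits it into words
    for word in $cmdline; do
        case "$word" in
            1|single|s|S) return 0 ;;
        esac
    done
    return 1
}

# Typical use on the running system:
#   booted_single_user "$(cat /proc/cmdline)" && echo "single user mode requested"
```

The function only inspects the text it is given, so it can be tried on any sample command line before you rely on it.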
Now that you have booted into single user mode, you are probably asking yourself what comes next. The tricky portion of this exercise is now over and it takes just one command to put your passwd file back in its place. There is a file /etc/passwd-, which is nothing but the backup copy of /etc/passwd. So all you need to do is issue the following command:

cp /etc/passwd- /etc/passwd

…and you are done. Now you can issue the init 5 command to switch to the graphical mode. Everything is fine now. You can also find the backup of /etc/shadow and /etc/gshadow as /etc/shadow- and /etc/gshadow- respectively.
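If you would rather not type the copy by hand for each file, the same step can be wrapped in a small function. This is only a sketch: restore_from_backup and its ROOT argument are hypothetical names, added so that you can rehearse the copy against a scratch directory before pointing it at the real /etc. It refuses to act when the backup is missing or empty, and it leaves an existing target alone.

```shell
# restore_from_backup ROOT NAME
# Copies ROOT/etc/NAME- to ROOT/etc/NAME, the same copy as the cp command
# in the article, with two safety checks added for this sketch.
restore_from_backup() {
    root=${1%/}            # "/" becomes "", so paths start with /etc
    name=$2
    target="$root/etc/$name"
    backup="$target-"

    if [ ! -s "$backup" ]; then
        echo "no usable backup at $backup" >&2
        return 1
    fi
    if [ -e "$target" ]; then
        echo "$target already exists, leaving it alone" >&2
        return 1
    fi
    cp "$backup" "$target"
}

# Typical use in single user mode:
#   restore_from_backup / passwd
#   restore_from_backup / shadow
```

Refusing to overwrite an existing file is a cautious choice made for this example; if your file is corrupted rather than deleted, move it aside first.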

Scene 2: /etc/pam.d/login is deleted
If your /etc/pam.d/login file is deleted and you try to log in, it won’t ask you to enter your password after entering your username. Instead, it will continuously show the localhost login prompt. Here again, there is a single command that will solve the problem for you:

cp /etc/pam.d/system-auth /etc/pam.d/login

Just boot into the single user mode as done earlier, type this command and you’ll be able to log in normally. There is also a second solution to this problem, which we’ll look at after a while.
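Before you leave single user mode, it can help to check that the files discussed so far are really back. The following sketch prints every file from a list that is absent or empty; list_missing and its ROOT argument are names invented for this example, not part of any RHEL tool.

```shell
# list_missing ROOT PATH...
# Prints each PATH that is absent or empty under ROOT, one per line.
list_missing() {
    root=${1%/}            # "/" becomes "", so "$root$path" stays a clean path
    shift
    for path in "$@"; do
        # -s is true only for a file that exists and has a size above zero
        [ -s "$root$path" ] || echo "$path"
    done
}

# Typical use on the live system:
#   list_missing / /etc/passwd /etc/shadow /etc/gshadow /etc/pam.d/login
```

No output means every file in the list exists and has some content; it does not prove that the content is correct.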

Scene 3: /etc/inittab is deleted
We know that in Linux, init is the first process to be started and it starts all the other processes. The /etc/inittab file contains instructions for the init process and if it’s missing, then no further process can be launched. On starting a system with no inittab file, it will show the following message:

INIT:No inittab file found

…and will ask you to enter a runlevel. When you do that, it again shows the message that no more processes are left in this runlevel.
Fixing this problem is not as easy, because single user mode doesn’t help in this case. Here, you need the Linux rescue environment. So set your first boot device to CD and boot with the RHEL 5 CD. At the boot prompt, type linux rescue to enter the rescue environment.
Once you have entered into the rescue environment, your system will be mounted under /mnt/sysimage. Here, reinstall the package that provides the /etc/inittab file. The overall process is given below:

chroot /mnt/sysimage
rpm -q --whatprovides /etc/inittab
mkdir /a
mount /dev/hdc /a

Here /dev/hdc is the path of the CD. It may vary on your system, though.

rpm -Uvh --force /a/Server/initscripts-8.45.25-1.el5.i386.rpm

You can also hit the Tab key after init to auto complete the name.
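If Tab completion is not convenient, a shell glob can find the package file for you. The sketch below assumes the CD is mounted on /a as above and that the file name starts with the package name followed by a dash and a version digit; find_package_file is a made-up helper name.

```shell
# find_package_file DIR NAME
# Prints the .rpm files in DIR named NAME-<digit>...rpm, one per line.
# Returns non-zero when nothing matches.
find_package_file() {
    dir=${1%/}
    name=$2
    found=1
    for f in "$dir/$name"-[0-9]*.rpm; do
        # An unmatched glob stays a literal string, so check that it exists
        if [ -e "$f" ]; then
            echo "$f"
            found=0
        fi
    done
    return $found
}

# Typical use inside the rescue chroot:
#   find_package_file /a/Server initscripts
```

Requiring a digit after the dash keeps packages whose names merely begin with the same word out of the result. If more than one file is printed, pick the one that matches your architecture before passing it to rpm.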
Now you’ll get your /etc/inittab file back. The same procedure can be applied to recover the /etc/pam.d/login file. In this case, you’ll have to install the util-linux package. Once you are done with it, type exit (once to leave the chroot, and once more to leave the rescue environment), set your first boot device to hard disk and boot normally.
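Since the CD device is not always /dev/hdc, here is one way to look it up without extra tools. The Linux kernel lists the optical drives it knows about in /proc/sys/dev/cdrom/info, on a line that starts with "drive name:". The sketch assumes that file is available in your rescue shell and that each kernel name has a matching node under /dev, which is usual but not guaranteed; cdrom_devices is a hypothetical function name.

```shell
# cdrom_devices [INFO_FILE]
# Prints a /dev path for every drive on the "drive name:" line.
# INFO_FILE defaults to the kernel's own list.
cdrom_devices() {
    info=${1:-/proc/sys/dev/cdrom/info}
    awk -F: '/^drive name:/ {
        # The names follow the colon, separated by tabs or spaces
        n = split($2, names, " ")
        for (i = 1; i <= n; i++) print "/dev/" names[i]
    }' "$info"
}

# Typical use:
#   cdrom_devices
```

Whatever path it prints can take the place of /dev/hdc in the mount command above, provided that node exists on your system.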

Scene 4: /boot/grub/grub.conf is deleted
This file is the configuration file of the GRUB boot loader. If it is deleted and you start your machine, you will see a GRUB prompt that indicates that grub.conf is missing and there is no further instruction for GRUB to carry on its operation.
But don’t worry, as we’ll solve this problem, too, in the next few minutes. You don’t even need to enter single user mode or the Linux rescue environment for this. At the GRUB prompt, you can enter a few commands that will make your system boot. So here we go: type root ( and hit Tab to list the disks attached to the system. In my case, I got hd0 and fd0, the hard disk and floppy disk, respectively. On my system the /boot partition, where the GRUB files and kernel images live, is the first partition of the first hard disk, which GRUB calls (hd0,0). So the complete command would be root (hd0,0). Enter this command and press the Enter key to carry on.
You now need to find out the kernel image file. So enter kernel /v and hit Tab to auto complete it. In my system, it’s vmlinuz-2.6.18-128.el5. Please note it down as we’ll need this information later, and then press Enter.
Next, let’s figure out the initrd image file. So enter initrd /i and press Tab to auto complete it. For me, it’s initrd-2.6.18-128.el5.img. Again note it down and press Enter.
Type boot and press Enter, and the system will boot normally.
Now it’s time to create a grub.conf file manually. So create the /boot/grub/grub.conf file and enter the following data in it:

 
splashimage=(hd0,0)/grub/splash.xpm.gz 
default=0 
timeout=5 
title Red Hat 
root (hd0,0) 
kernel /vmlinuz-2.6.18-128.el5 
initrd /initrd-2.6.18-128.el5.img

Save the file and quit the editor. You have created a grub.conf file manually to resolve the problem. Don’t forget that the kernel and the initrd image file names may vary on your system. That’s why I asked you to note them down earlier. If you forgot, it’s not a big issue: you can also find them in the /boot folder once you are logged in.
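If you would rather not retype the file names, the entries can be generated from what is actually in /boot. The sketch below mirrors the layout of the file shown above and makes the same assumption, namely that /boot is a separate partition at (hd0,0). The function name print_grub_conf and the version shown in each title are additions for this example. Only kernels that have a matching initrd image are listed.

```shell
# print_grub_conf BOOT_DIR
# Prints a grub.conf with one entry per vmlinuz-* file in BOOT_DIR that
# has a matching initrd-*.img. Assumes /boot is the partition (hd0,0).
print_grub_conf() {
    boot=${1%/}
    echo "splashimage=(hd0,0)/grub/splash.xpm.gz"
    echo "default=0"
    echo "timeout=5"
    for kernel in "$boot"/vmlinuz-*; do
        [ -e "$kernel" ] || continue          # glob matched nothing
        version=${kernel##*/vmlinuz-}         # e.g. 2.6.18-128.el5
        [ -e "$boot/initrd-$version.img" ] || continue
        echo "title Red Hat ($version)"
        echo "root (hd0,0)"
        echo "kernel /vmlinuz-$version"
        echo "initrd /initrd-$version.img"
    done
}

# Typical use, reviewing the output before saving it:
#   print_grub_conf /boot
#   print_grub_conf /boot > /boot/grub/grub.conf
```

Entries appear in the order the shell sorts the file names, so with several kernels installed, check that default=0 points at the one you want.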
So we have looked at solutions to four different problems. I hope this information assists you in learning Linux troubleshooting. Keep at it and acquire more troubleshooting skills, because that’s what makes you a true Linux geek.

7 COMMENTS

  1. How can I create the grub.conf file after typing boot and hitting Enter? After
    I execute the boot command, it shows a call trace:
    system_call_fastpath+0X16/0X1b

  2. This is really a great tutorial on troubleshooting the Linux boot process. But in scene 3, how did you find out the device name for the CD-ROM?

    • You can use the wodim command, for example:

      [root@localhost Desktop]# wodim --devices
      wodim: Overview of accessible drives (2 found) :
      -------------------------------------------------------------------------
      0 dev='/dev/scd0' rwrw-- : 'NECVMWar' 'VMware SATA CD00'
      1 dev='/dev/scd1' rwrw-- : 'NECVMWar' 'VMware SATA CD01'
      -------------------------------------------------------------------------

    • Mind you, wodim will only show results for “accessible” devices. So if you have your CD-ROM already mounted, unmount it first and then run the wodim command to see any results.

      If wodim is not installed on your system, you can get it by installing the cdrecord package.

      # yum install cdrecord
