Common Issues – Main Drive Full (Mounting Issues)

The recent problems with a server at my job, along with a few similar issues I’ve seen lately, gave me an idea for a new Common Issues post. This one covers the main drive being full and the mounting issues that tend to go with it. As always, this email comes with all the disclaimers previously mentioned, as well as the biggie below:

Issue (Note: This only applies to our shared and reseller servers so we can assume they have a separate backup drive):

A client contacts us saying that they cannot receive email, their site is down, and FTP is throwing errors. A good first check is the drive usage of the server, to make sure no partition is full. This can be done with the command “df -h”, which returns output similar to the following:

[email protected] [~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/hda5             104G   11G   88G  11% /
/dev/hda1              99M   32M   63M  34% /boot
/dev/hda2             2.9G   80M  2.7G   3% /tmp
/dev/mapper/nvidia_dbechcfep1     917G   27G  844G   4% /home
/dev/mapper/nvidia_dbechcfep2     917G   72G  799G   9% /backup

This output can be interpreted as follows:

  • Filesystem – The block device the partition lives on. A name such as /dev/hda1 reads as follows:
    • /dev/ (device directory) h (drive interface; h = IDE/PATA, s = SCSI/SATA) d (disk) a (drive letter; a, b, c, etc. depending on the number of drives) 1 (partition number; 1, 2, 3, etc. depending on the number of partitions)
  • Size – Total size of the partition
  • Used – Amount of disk space used on the partition
  • Avail – Amount of disk space still available on the partition
  • Use% – Disk space used, expressed as a percentage
  • Mounted on – Where the partition is mounted in the filesystem
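The Use% column described above can also be checked with a short shell function rather than by eye. This is only a minimal sketch: the function name full_mounts and the default threshold of 95 are assumptions made for illustration, and it expects each filesystem on a single line of input (df -P guarantees that, even for long device names such as the /dev/mapper ones above).

```shell
# Read df output on stdin and print every mount point whose Use%
# is at or above the threshold given as $1 (default 95).
# Assumes the usual six columns, with Use% fifth and the mount point sixth.
full_mounts() {
    threshold="${1:-95}"
    awk -v t="$threshold" 'NR > 1 {
        use = $5
        sub(/%/, "", use)          # strip the percent sign
        if (use + 0 >= t) print $6
    }'
}
```

On a live server this would be run as "df -P | full_mounts 95". If / shows up in the output, the main drive is the one to look at.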

If you see 0 in Avail or 100% in Use% next to the partition mounted on /, then you have found your issue. Now to solve it. The first thing to check is backups: if backups were written while the backup drive was not mounted, they ended up on the / partition instead. First, check whether there is a backup drive on the server using the following command:

grep backup /etc/fstab

If this command outputs anything at all, as in the example below, then there is a backup drive:

[email protected] [~]# grep backup /etc/fstab
/dev/mapper/nvidia_dbechcfep2        /backup                  ext3    defaults,noauto        0 0

Also, check whether the backup drive is currently mounted; if it is, it will appear in the df output. If it is mounted, go ahead and unmount it so you can get a better look at the / filesystem. You can do this with the ‘umount’ command as below:

umount /backup
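The two checks above (is a backup drive configured, and is it mounted right now) can be sketched as one small helper. This assumes a Linux server where /proc/mounts lists the currently mounted filesystems; the function names in_mount_table and backup_drive_status are hypothetical, not existing tools.

```shell
# Succeed if mount point $1 appears as the second field of an
# fstab-style file $2 (comment lines are ignored).
in_mount_table() {
    awk -v mp="$1" '$1 !~ /^#/ && $2 == mp { found = 1 } END { exit !found }' "$2"
}

# Report on a mount point (default /backup) without changing anything.
backup_drive_status() {
    mp="${1:-/backup}"
    if in_mount_table "$mp" /etc/fstab; then
        echo "$mp is configured in /etc/fstab"
    else
        echo "$mp is not in /etc/fstab"
    fi
    if in_mount_table "$mp" /proc/mounts; then
        echo "$mp is currently mounted"
    else
        echo "$mp is not currently mounted"
    fi
}
```

Matching on the mount point field rather than on the word "backup" anywhere in the line avoids false hits from comments or device names.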

If this fails with an error such as “umount: /backup: device is busy”, then you get to the more complex part: finding out why it’s busy. To do this, we will use the ‘lsof’ command, run against a block device as below:

[email protected] [~]# lsof /dev/hda2
COMMAND  PID   USER   FD   TYPE DEVICE  SIZE   NODE NAME
mysqld  5294  mysql    4u   REG    3,2     0     13 /tmp/ibb2mUbP (deleted)
mysqld  5294  mysql    5u   REG    3,2    69     14 /tmp/ibWiEoN3 (deleted)
mysqld  5294  mysql    6u   REG    3,2     0     15 /tmp/ib0O7Soi (deleted)
mysqld  5294  mysql    7u   REG    3,2     0     16 /tmp/ibuN6K0w (deleted)
mysqld  5294  mysql   11u   REG    3,2     0     17 /tmp/ibxm1GDL (deleted)
jsvc    6106 tomcat  mem    REG    3,2 32768 351650 /tmp/hsperfdata_tomcat/6106
cpdavd  8729   root    0r   REG    3,2 16671     72 /tmp/sh-thd-1238113224 (deleted)

As you can see, in my example I used my /tmp partition, and lsof told me that three separate processes are using it: MySQL (mysqld), jsvc (the Apache Commons Daemon launcher, here running as the tomcat user), and cpdavd (cPanel). In order to unmount this partition successfully, I will need to end these processes. Where a stop script is available, use it, such as the following in my case:

/etc/init.d/mysql stop
/etc/init.d/jsvc stop
/etc/init.d/cpanel stop
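When lsof prints many lines, it helps to condense them to one line per process before deciding what to stop. The following is a minimal sketch; the function name lsof_holders is hypothetical, and it assumes the column layout shown in the lsof output above (COMMAND first, PID second).

```shell
# Read lsof output on stdin and print "PID COMMAND" once per process,
# skipping the header line and any repeated PIDs.
lsof_holders() {
    awk 'NR > 1 && !seen[$2]++ { print $2, $1 }'
}
```

Used as "lsof /dev/hda2 | lsof_holders", this turns the seven lines of my example into three: one each for mysqld, jsvc, and cpdavd.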

Once these processes are stopped, you should be able to unmount. If the process doesn’t have a stop script, or if it is cpbackup or a similar script that is running, you will need to kill it with the ‘kill’ command. Find the PID of the process in the output of lsof and use it as follows (-9 gives the process no chance to shut down cleanly, so it is worth trying a plain ‘kill’ first):

kill -9 5294

In my example I killed the MySQL process that was running on the server with PID 5294. Obviously, this number will differ depending on the server, time, etc. Once lsof shows no output, you should be able to unmount and proceed with cleaning the main drive. This can be done by removing the backups that ended up on the main drive with rm -rf. Once the rm has been running for a few minutes, open a second SSH session and start restarting the services that had been failing for lack of disk space, such as cPanel, MySQL, Exim, and HTTPD. Once the rm is complete, remount the /backup drive.
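Before running rm -rf, it is worth confirming that the directory you are about to clean really is on the main drive and not on a backup drive that is still mounted. A minimal sketch, assuming GNU stat (as found on most Linux servers) and assuming the stray backups sit under /backup; the function name same_filesystem is hypothetical:

```shell
# Succeed only when both paths exist and live on the same filesystem,
# i.e. they report the same device ID.
same_filesystem() {
    a=$(stat -c %d "$1") || return 1
    b=$(stat -c %d "$2") || return 1
    [ "$a" = "$b" ]
}
```

For example, "same_filesystem / /backup && du -sh /backup" only reports the size of /backup when it is part of the / partition, which is the case you want before deleting anything from it.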

Please note: The drive mounting and unmounting portion of this email applies to more cases than just a full drive. Sometimes /tmp has an issue that unmounting and remounting can solve, and a process similar to the one above is required.

Also note that this is just one solution for a full main drive. Though stray backups are the most common cause, they are not the only one. Core dumps from PHP can quickly fill a drive, as can just plain usage. For core dumps, you can use a command such as the one below to locate them for removal:

find /home/ -iname "core.*"

This will locate all core dumps in user directories so you can delete them and inform the client of their broken script. Also, if the drive is, as far as you can see, legitimately full (backups are on the backup drive and there are no core dumps), please do not hesitate to email [email protected] and CC the office and the NOC about the issue, so Abuse can check for abusive users to remove or the NOC can look into restoring the server as soon as possible.
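One caution on the find pattern: "core.*" also matches ordinary application files such as core.php, so the list should be reviewed before anything is deleted. A narrower search can be sketched as below. It assumes core dumps are named core.<PID> (a common kernel setting, but not a universal one), and the function name find_core_dumps is hypothetical.

```shell
# List regular files under $1 (default /home) whose name is "core."
# followed only by digits, e.g. core.12345.
find_core_dumps() {
    find "${1:-/home}" -type f -name 'core.[0-9]*' 2>/dev/null |
        grep -E '/core\.[0-9]+$'
}
```

The find pattern does the cheap first pass and the grep drops names such as core.1x, which start with a digit but are not purely numeric.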

As with all of my Common Issues posts, please do not hesitate to ask any questions on this email by replying all to it so that everyone can benefit from the question and answer. If you are unsure about anything in this email, do not hesitate to ask me or any other senior staff about the issue and possible resolutions.

Thanks for reading and have a great day!

Common Issues – MySQL Connectivity Issues

This week on Common Issues and their Solutions, we will be tackling the common, yet annoying, MySQL connection issues that many of our clients see daily. This issue is typically much simpler than it may seem, so let’s see if we can hammer it out together.

A client submits a ticket saying their WordPress installation is not working correctly and is just showing a very plain error: “WordPress cannot connect to Database”. The error is nondescript, but we can follow a set route to find the solution quickly.

Step 1: Check the MySQL service. You would be surprised how often the MySQL service has simply died or is overloaded and just needs to be restarted. The restart is typically quick and painless.

Check MySQL Status with: /etc/init.d/mysql status -OR- Through Service Status in WHM

Restart MySQL with: /scripts/restartsrv_mysql -OR- Through Restart MySQL in WHM

If the status is okay, or restarting the service did not fix the problem, move on to step 2.

Step 2: Check the user account and ensure they are connecting to the database correctly.

To check this in the fewest steps and with the most accuracy, first check the configuration file of the software the client is having an issue with. Since our example is WordPress, we will go with that. WordPress stores all of its database connection information in a flat file named wp-config.php in the root directory of the WordPress installation. You will find the following variables:

define('DB_NAME', 'dennis_wrdp1');
define('DB_USER', 'dennis_wrdp1');
define('DB_PASSWORD', 's3kr3t!1');
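The values above can also be pulled out on the command line, which saves opening the file in an editor. This is a minimal sketch rather than a robust PHP parser: it assumes each define() sits on a single line with quoted string literals, as in the lines above, and the function name wp_config_value is hypothetical.

```shell
# Print the value of constant $1 (e.g. DB_USER) from the
# wp-config.php file given as $2. Only the first match is printed.
wp_config_value() {
    sed -n "s/^[[:space:]]*define([[:space:]]*['\"]$1['\"][[:space:]]*,[[:space:]]*['\"]\(.*\)['\"][[:space:]]*)[[:space:]]*;.*$/\1/p" "$2" |
        head -n 1
}
```

For example, "wp_config_value DB_USER wp-config.php" run from the WordPress root prints the username to test the connection with.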

These three variables are named clearly enough to tell you what they do. Once you have them, test the connection using the following command, substituting the DB_USER value for USERNAME:

mysql -u USERNAME -p

MySQL will then prompt you for the password. Provide that and if it shows the following error:

ERROR 1045 (28000): Access denied for user 'dennis_wrdp1'@'localhost' (using password: YES)

Then we have located the issue. To fix it, we need to reset the password MySQL is expecting so that it matches the one in the configuration file. The easiest way to do this is with the following commands:

mysql # enter mysql as root
USE mysql; # select the mysql database to manipulate
UPDATE user SET password=PASSWORD('INSERTNEWPASSHERE') WHERE User='USERNAMEHERE'; # updates the password
FLUSH PRIVILEGES; # push all changes live
exit;

-OR- in WHM you can use the “Change a User or Database Password” tool: select the user in the drop-down and enter the correct password in the “New Password” box.

Please note: These steps should only be taken when the user is connecting with a MySQL user account OTHER than their main account. If they are connecting with the main account (i.e. if in the example I had been using “dennis” to connect to my database), you will need to get the user’s cPanel password and update the WordPress configuration to reflect it rather than taking this route, as the user has most likely just updated their cPanel password but not the configuration for their WordPress installation.

This should solve 99% of MySQL connection problems, and the hardest part will just be locating the DB configuration file. At times the error message says it can’t connect and points at the mysql_connect line, but hunting down the actual variables it uses is a bit of an Easter egg hunt, as the user may have multiple useless files and variables scattered around for no apparent reason. Following the trail of included files, and their included files, to find the missing information can be complex but valuable. My one big tip is: if there is no clear ‘variables.php’, ‘vars.php’, or ‘db.php’ but rather just ‘header.php’, ‘body.php’, and ‘footer.php’, check header.php for an include of a more obvious file.
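The hunt described above can be shortened with grep. The sketch below assumes GNU grep (for -r and --include) and assumes the application either calls mysql_connect/mysqli_connect directly or uses a constant named like DB_PASSWORD; both function names are hypothetical.

```shell
# List PHP files under directory $1 (default: current directory) that
# contain a connection call or a typical credential name.
find_db_config() {
    grep -rlE --include='*.php' 'mysqli?_connect|DB_PASSWORD' "${1:-.}"
}

# Show the include/require lines of one file, with line numbers,
# so the trail of included files can be followed by hand.
show_includes() {
    grep -nE '(include|require)(_once)?' "$1"
}
```

Running find_db_config from the site's document root usually narrows things down to a handful of files. If it turns up nothing, show_includes on header.php is the next place to look.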

Note: If you cannot find the configuration file in about 5 minutes, then the user is hiding it far too well for a novice, and they can likely help you find it. Ask them for more information. Also, most commercial PHP applications such as WordPress and Drupal have a set file such as Configuration.php or wp-config.php that contains all of this information. Check these files if you know the PHP application is a typical commercial application.

That wraps up this week’s Common Issues and their Solutions. If you have any questions on this email, please feel free to contact me and I’ll do my best to help you out.