Server not responding? Use the Emergency Console

As you probably already know, our platform protects you from hardware failures that might occur on your server.

If a problem occurs on the physical machine, or if we suspect that one is about to occur (abnormal temperature, corrupted memory, etc.), your server is automatically migrated to another machine. However, if the problem is internal to your server rather than due to the physical machine, and the server no longer responds, you will need to take action yourself.

First, make sure that your server’s status is shown as “Running” in your Gandi interface. The status may instead be “Stopped” or “Paused” if, for example, the server has not been renewed. If the server is “Running” but not responding, here is what to do.

There are three different cases:

You can still connect to your server via SSH.

In this case, the following commands will help you analyze the situation:

  • uptime shows the current load of the machine,
  • free shows the amount of memory used by your applications in the “used” column,
  • top (we recommend that you install htop) ranks applications in real time by their use of resources (memory, CPU),
  • dmesg shows messages from your Linux server’s kernel,
  • logs such as /var/log/messages or /var/log/daemons, read with the tail command (e.g. tail /var/log/daemons), also provide valuable information,
  • df -h shows the amount of space available on your disks.
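
The commands listed above can be gathered into a small triage script. What follows is only a sketch: the helper names run_check and triage are hypothetical, the number of lines kept from top, dmesg and the log is an arbitrary choice, and the log path is the one quoted above, which may differ on your distribution.

```shell
#!/bin/sh
# Run one diagnostic command under a labelled header.
# If the tool is not installed, say so instead of failing.
run_check() {
    label=$1
    shift
    printf '== %s ==\n' "$label"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1
    else
        printf '(%s not installed, skipped)\n' "$1"
    fi
    printf '\n'
}

# Run the checks from the list above, one after the other.
triage() {
    run_check "load average" uptime
    run_check "memory" free -m
    # top in batch mode, one iteration, first lines only
    run_check "top processes" top -b -n 1 | head -n 15
    # only the most recent kernel messages
    run_check "kernel messages" dmesg | tail -n 20
    run_check "recent system log" tail -n 20 /var/log/messages
    run_check "disk space" df -h
}
```

Calling triage in an SSH session prints each section in turn, and redirecting the output to a file gives you something to attach to a support request.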

The most frequent causes of error are:

  • No space left on your system disk: this is often caused by a lack of appropriate log management on the server, or by a database that grows too quickly. The solution is to free up space or enlarge the disk.
  • Not enough RAM on the server, or too much memory used: the simple solution is to add more RAM by adding shares. If you have an expert server, you can also try changing how Linux allocates memory with the command sysctl -w vm.overcommit_memory=2 (no spaces around the “=” sign), which makes the kernel refuse allocations beyond its commit limit instead of overcommitting. Note: for the change to persist after a reboot, you must also add “vm.overcommit_memory = 2” to the ”/etc/sysctl.conf” and ”/etc/gandi/sysctl.conf” files.
  • Too many processes running simultaneously on your machine: you will need to lower the values in the configuration files of your applications (the number of simultaneous connections on Apache, for example) or increase the power of your server by adding shares.
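
The first two causes above lend themselves to a quick check from the shell. The sketch below is illustrative only: full_filesystems and persist_sysctl are hypothetical helper names, the 90% threshold in the usage comment is arbitrary, and the parsing assumes mount points without spaces.

```shell
#!/bin/sh
# Print the mount point and usage of every filesystem at or above a
# threshold (in percent). Reads `df -P` style output on stdin.
full_filesystems() {
    threshold=$1
    awk -v t="$threshold" 'NR > 1 {
        use = $5
        sub(/%/, "", use)
        if (use + 0 >= t) print $6 " " $5
    }'
}

# Append "key = value" to a sysctl configuration file, unless the
# key is already set in that file.
persist_sysctl() {
    key=$1
    value=$2
    file=$3
    if awk -F= -v k="$key" '
        { name = $1; gsub(/[ \t]/, "", name); if (name == k) found = 1 }
        END { exit !found }' "$file" 2>/dev/null; then
        printf '%s already set in %s, left unchanged\n' "$key" "$file"
    else
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}

# Example usage (run as root for the sysctl files):
#   df -P | full_filesystems 90
#   persist_sysctl vm.overcommit_memory 2 /etc/sysctl.conf
#   persist_sysctl vm.overcommit_memory 2 /etc/gandi/sysctl.conf
```

Once a full filesystem has been identified, a command such as du -sh on its largest directories usually shows where the space went.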

You can no longer connect to your server via SSH, and it either does not respond to ping or responds very slowly.

The emergency console, which you can activate from your account, gives you direct access to your machine, as if a (virtual) monitor and keyboard were attached directly to the server.

From the console, you can stop the applications that are causing problems and regain access to the server.

The 'sysreq' (magic SysRq) shortcuts are available from your server's console. Press Ctrl and “o” (as in Oscar) to enter sysreq mode, then type the command key. For example, Ctrl+o then i kills all processes, and Ctrl+o then h displays quick help listing all of the available sysreq commands.
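
Whether a given sysreq command is accepted depends on the kernel.sysrq setting, which the text above does not mention and which is added here as an assumption about a standard Linux kernel. There, 0 disables the feature, 1 enables every command, and larger values are a bitmask in which bit 64 covers process signalling such as the kill command. The helper name sysrq_allows_kill is hypothetical.

```shell
#!/bin/sh
# Given a kernel.sysrq value, report whether the "i" (kill all
# processes) command would be accepted: either everything is
# enabled (1) or the process-signalling bit (64) is set.
sysrq_allows_kill() {
    mask=$1
    if [ "$mask" -eq 1 ] || [ $(( mask & 64 )) -ne 0 ]; then
        echo yes
    else
        echo no
    fi
}

# Example usage while you still have a shell on the server:
#   sysrq_allows_kill "$(cat /proc/sys/kernel/sysrq)"
```

Checking this while the server is healthy tells you in advance whether the kill shortcut can be relied on in an emergency.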

Your server may be “unreachable” but may nonetheless be working normally, and without any technical problem.

This may happen if, for example, your server is the victim of a DDoS attack. In that case, your server is isolated from the rest of the network in order to protect our infrastructure and the quality of service for other customers.

You can check whether your server is in this state by running a traceroute to its IP address. If the trace stops once it reaches Gandi's network, on one of our routers for example, it is likely that your server has been isolated from the network. You can still connect to your server via the console, but you will need to contact the support team to resolve the problem (they will usually contact you first). This sort of isolation happens only very rarely.
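
To see where a trace stops, it helps to pull out the last hop that actually answered. The sketch below assumes the usual output layout of the Linux traceroute command with -n (hop number first, then addresses or “*” for timeouts). The helper name last_hop is hypothetical, and 192.0.2.10 is a documentation address standing in for your server's IP.

```shell
#!/bin/sh
# Read `traceroute -n` output on stdin and print the number and
# address of the last hop that returned at least one reply.
last_hop() {
    awk 'NR > 1 {
        for (i = 2; i <= NF; i++) {
            if ($i != "*") { hop = $1; addr = $i; break }
        }
    }
    END { if (hop != "") print hop, addr }'
}

# Example usage from a machine outside Gandi:
#   traceroute -n 192.0.2.10 | last_hop
```

The script cannot tell you who owns the last answering address. A reverse DNS or whois lookup on it will show whether it is one of Gandi's routers.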

If you are still blocked after all that, then you are in the fourth case: your very own case ;-). Please email our support team, explaining that your server is blocked, and you will get an answer as soon as possible.

Last modified: 02/17/2014 at 13:14 by Gilles L. (Gandi)