Ubuntu Server Keeps Shutting Down After 15 Minutes: Troubleshooting Guide

This guide covers how to diagnose and fix an Ubuntu server that shuts down prematurely, specifically after approximately 15 minutes. It takes a step-by-step approach: confirm the configuration, read the logs, and then rule out services and hardware.

Initial Assessment and Problem Definition

The core problem, as described, is that an Ubuntu 24.04 server, running on desktop hardware, unexpectedly shuts down after a consistent 15-minute interval. This behavior started after transitioning to a headless configuration, where the server is operated without a directly connected monitor. While a GUI is installed, the server is primarily accessed via SSH and RDP. Changes to logind.conf have already been tried, but the problem persists.

Before trying specific fixes, establish what actually happens at the 15-minute mark. A clean power-off, a suspend that looks like a shutdown on a headless machine, and an abrupt loss of power each point to different causes.

Investigating the Root Cause: Diagnostic Steps

The provided information gives us a good starting point for our investigation. Let’s break down the core areas we need to examine.

Examining systemd-logind Configuration

Your initial approach of modifying /etc/systemd/logind.conf is a valid one, and it is often the first place to look. However, let’s make sure that these settings are correctly implemented and understood.

Reviewing logind.conf Settings

The settings you’ve configured in /etc/systemd/logind.conf are designed to prevent actions triggered by idle events, lid closures, or power-related events. Here’s a critical review of these settings and their implications:

  • IdleAction=ignore: This setting is crucial. It should prevent the system from going into a sleep or power-off state due to user inactivity.

  • HandleSuspendKey=ignore, HandleHibernateKey=ignore, HandleLidSwitch=ignore, HandleLidSwitchExternalPower=ignore, HandleLidSwitchDocked=ignore: These settings are designed to prevent the system from shutting down based on key presses or lid actions. While you mentioned this is a desktop, it’s still worth making sure they are all in place, as sometimes system default configuration can be misleading.
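
One way to keep these settings together, without editing the main file, is a drop-in file. The sketch below is illustrative only: the file name 10-headless.conf is hypothetical, and CONF_DIR defaults to a scratch directory so that nothing on the system changes until you point it at /etc/systemd/logind.conf.d (as root).

```shell
#!/bin/sh
# Sketch: write the logind settings discussed above into a drop-in file.
# CONF_DIR and the file name 10-headless.conf are assumptions for illustration.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"
mkdir -p "$CONF_DIR"

cat > "$CONF_DIR/10-headless.conf" <<'EOF'
[Login]
IdleAction=ignore
HandleSuspendKey=ignore
HandleHibernateKey=ignore
HandleLidSwitch=ignore
HandleLidSwitchExternalPower=ignore
HandleLidSwitchDocked=ignore
EOF

echo "Wrote $CONF_DIR/10-headless.conf"
```

Keeping local changes in a drop-in makes it easy to see exactly what was changed and to undo it by deleting one file.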

Confirming Effective Settings

After making changes to /etc/systemd/logind.conf, it is essential to ensure that those changes are properly applied. The command sudo systemctl restart systemd-logind reloads the configuration without a full system reboot, although on a machine with a graphical session it may end that session. A reboot is the most reliable way to be certain the settings are in effect.

  • Verify the Configuration: To confirm the configuration that logind actually reads, including any drop-in files, you can use the following command:

    systemd-analyze cat-config systemd/logind.conf | grep -E "IdleAction|HandleSuspendKey|HandleHibernateKey|HandleLidSwitch"
    

    This command prints the matching lines from the main file and every drop-in. Uncommented lines are the settings in effect; lines starting with # only document the defaults. Ensure that the values you set in /etc/systemd/logind.conf appear uncommented in this output.

  • Check for Overriding Configurations: Settings in /etc/systemd/logind.conf can be overridden by drop-in files. Check /etc/systemd/logind.conf.d/, /run/systemd/logind.conf.d/, and /usr/lib/systemd/logind.conf.d/ for .conf files that set the same keys, because files in these directories take precedence over the main file. Unit files in /etc/systemd/system/ and /usr/lib/systemd/system/ are also worth checking for anything related to power management, sleep, or hibernation.
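
The search for conflicting settings can be sketched as a small helper. The function name logind_overrides is hypothetical; it simply greps the files and directories you pass in for uncommented idle, suspend, and lid keys.

```shell
# Sketch: print every uncommented line that sets an idle/suspend/lid-related
# key, prefixed with the file it came from. The function name is an assumption.
logind_overrides() {
    grep -rHE '^[[:space:]]*(IdleAction|IdleActionSec|Handle[A-Za-z]+)=' "$@" 2>/dev/null
}

# Typical invocation (paths that do not exist are silently skipped):
# logind_overrides /etc/systemd/logind.conf /etc/systemd/logind.conf.d \
#                  /run/systemd/logind.conf.d /usr/lib/systemd/logind.conf.d
```

If the same key shows up in more than one file, the drop-in that sorts last by file name is normally the one that wins.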

Analyzing Power Management Settings

The presence of a GUI introduces another layer of power management that needs to be evaluated. These settings can significantly influence how the system behaves when idle.

Inspecting GNOME Power Settings

As the system runs a GUI, it’s crucial to examine the GNOME power settings. The command gsettings list-recursively org.gnome.settings-daemon.plugins.power provides insight into these settings.

Interpreting gsettings Output

Let’s analyze the provided output from gsettings.

  • sleep-inactive-ac-timeout and sleep-inactive-battery-timeout: These settings determine how long the system waits before entering a sleep state when running on AC power or battery power, respectively. Your output shows 7200 seconds (2 hours) for AC and 900 seconds (15 minutes) for battery. The 900-second battery timeout matches the observed 15-minute interval, which makes it worth a closer look even though a desktop machine should always report AC power.

  • sleep-inactive-ac-type and sleep-inactive-battery-type: These settings define the action to take when the idle timeouts are reached. In your output, they appear to be set to nothing and suspend, respectively. On AC power this should prevent idle sleep. If the battery values are ever applied, however, the machine would suspend after 15 minutes, and a suspended headless server is easy to mistake for one that has shut down.

  • lid-close-ac-action and lid-close-battery-action: These settings control what happens when a laptop lid is closed. They should have no effect on a desktop machine, but confirm that they are not set to suspend or hibernate.

  • idle-dim: If enabled, this may dim the screen after a period of inactivity. It should not be linked to shutdown, but it is still worth investigating.
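
To make the timeouts easier to read, the gsettings output can be piped through a small filter. This is a sketch; the function name summarize_sleep_settings is hypothetical, and it assumes the usual "schema key value" layout of gsettings list-recursively.

```shell
# Sketch: summarize the idle-sleep keys from `gsettings list-recursively` output
# read on stdin. Timeouts are shown in seconds and minutes; any sleep type other
# than "nothing" is flagged.
summarize_sleep_settings() {
    tr -d "'" | awk '
        $2 ~ /^sleep-inactive-(ac|battery)-timeout$/ {
            printf "%s: %d s (%d min)\n", $2, $3, $3 / 60
        }
        $2 ~ /^sleep-inactive-(ac|battery)-type$/ {
            note = ($3 == "nothing") ? "no action" : "ACTION ON IDLE"
            printf "%s: %s (%s)\n", $2, $3, note
        }'
}

# Usage:
# gsettings list-recursively org.gnome.settings-daemon.plugins.power | summarize_sleep_settings
```

Run it as each user whose settings matter, since gsettings values are stored per user.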

Troubleshooting Power Settings

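  • Verify Power Mode: Although you mentioned setting the power mode to “Balanced”, double-check this setting via the GUI. The power mode governs performance and energy use rather than idle suspend, so it is unlikely to be the cause on its own. If a separate power-management service is installed, make sure it does not change these settings.
  • GUI Interference: The gsettings values are stored per user. A session opened over RDP uses that user’s settings, while the GDM login screen runs as its own gdm user with separate power settings. On a headless machine where nobody is logged in at the console, the login screen’s idle timeout may be the one that fires, so check the settings for that user as well.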

Examining System Logs: The journalctl Command

System logs are invaluable for pinpointing the cause of the shutdowns. The output from sudo journalctl -b -1 -ex offers critical clues, but it needs careful interpretation.

Understanding the journalctl Output

Let’s analyze the log entries:

  • The log indicates a power-off sequence initiated by systemd. The final entries show the final.target reached, followed by the termination of services and shutdown.

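  • The journal is from the previous boot (-b -1). Once the server has shut down and been started again, this is the right boot to inspect, because it contains everything that happened in the run-up to the shutdown. Scroll back past the final.target entries to the first message of the power-off sequence and read what was logged just before it. If the machine actually suspended and resumed instead of rebooting, the relevant entries are in the current boot, which you can view with sudo journalctl -b -ex.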

Interpreting Log Entries and Identifying Root Causes

  • Look for Warnings and Errors: Scan the logs for any warning or error messages that appear before the shutdown entries. These messages might provide clues about the cause of the shutdowns.
  • Check for Service Failures: Search for any services that might have failed before the shutdown. These failures might be related to the power issue.
  • Hardware-Related Errors: Keep an eye out for errors related to hardware, such as the hard drive, memory, or CPU. These could indicate hardware problems that might be causing the shutdowns.
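
A long journal is easier to scan when it is reduced to power-related lines first. The filter below is a sketch: the function name power_events is hypothetical, and the keyword list is a guess at useful terms, not a list of exact systemd messages.

```shell
# Sketch: keep only journal lines that mention power-related keywords.
# The keyword list is an assumption; extend it to suit your logs.
power_events() {
    grep -iE 'suspend|hibernat|power key|powering|shutdown|shutting down|thermal|idle'
}

# Usage against the boot that ended in the shutdown:
# sudo journalctl -b -1 --no-pager | power_events | tail -n 40
```

The first matching line before the power-off sequence usually says more about who requested the shutdown than the sequence itself.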

Reviewing Additional System Information

Operating System Version

Verify that you are running Ubuntu 24.04 (Noble Numbat). Use lsb_release -a to confirm.

Hardware Configuration

Although you mention a desktop machine, detailed hardware information might be helpful. For example, you can use the lshw command:

sudo lshw -short

Examine the output for critical components like the CPU, RAM, storage, and power supply. A faulty power supply can lead to unexpected shutdowns.

Installed Services

List all running services using:

systemctl list-units --type=service

Pay attention to services related to power management, network management, and any custom scripts you might have installed. It’s possible that a service is interfering with power management.
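
To narrow the list down, the unit names can be filtered for power-related terms. This is a sketch with a hypothetical function name, power_related_units; the pattern is an assumption and will miss services whose names give no hint of what they do.

```shell
# Sketch: from `systemctl list-units` output on stdin, print unit names that
# look related to power management. The pattern is an illustrative guess.
power_related_units() {
    awk '{print $1}' | grep -iE 'power|sleep|suspend|hibernat|thermal|tlp|acpi'
}

# Usage:
# systemctl list-units --type=service --no-legend --plain | power_related_units
```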

Advanced Troubleshooting Steps

If the initial steps don’t resolve the issue, consider these more advanced approaches.

Investigating the Headless Configuration

Since the problem manifested after going headless, examine any potential conflicts arising from the transition.

Virtual Console and TTYs

Even in a headless configuration, the virtual consoles (TTYs) can still be active. Although you’re primarily accessing the server remotely, problems with TTYs can still impact the system.

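  • Check TTY Settings: Look into the settings of the active TTYs. Console blanking only affects the display, so TTYs are an unlikely cause of a shutdown, but they are quick to rule out.
  • Disable Unnecessary TTYs: You can disable unused TTYs by masking their services with sudo systemctl mask getty@tty{2..6}.service (the brace expansion requires bash). Leave tty1 available so that you still have a local console for recovery.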

Monitor Emulation

Some systems may detect the lack of a connected monitor and, by default, take certain actions.

  • Dummy Plug: Consider using a “dummy plug” or a VGA/HDMI emulator. These devices can trick the system into thinking that a monitor is connected, which can alter its power management behavior.
  • Xorg Configuration: If using Xorg, review the configuration to ensure the monitor setup is appropriate for a headless environment.

Analyzing Process Resource Usage

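High resource utilization does not normally power a system off by itself, but sustained load can overheat the CPU and trigger a thermal shutdown. A shutdown at a fixed 15-minute mark regardless of load points away from this, so treat these checks mainly as a way to rule it out.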

Monitoring CPU and Memory Usage

Use the top command or htop to monitor CPU and memory usage in real time. Identify any processes consuming excessive resources.
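
Because an interactive monitor is lost when the server goes down, it can help to record periodic snapshots to a file that survives the shutdown. The sketch below assumes a Linux /proc filesystem; the function name log_snapshot and the log path in the usage comment are hypothetical.

```shell
# Sketch: append one timestamped line with the load averages and available
# memory to the file given as $1. Reads /proc, so it is Linux-only.
log_snapshot() {
    printf '%s load=%s mem_available_kb=%s\n' \
        "$(date '+%F %T')" \
        "$(cut -d' ' -f1-3 /proc/loadavg)" \
        "$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)" >> "$1"
}

# Usage: record a snapshot every 30 seconds until the machine goes down.
# while true; do log_snapshot /var/tmp/pre-shutdown.log; sync; sleep 30; done
```

After the next shutdown, the last few lines of the file show whether load or memory pressure was rising beforehand.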

Disk I/O Analysis

Use iotop to monitor disk I/O activity. Heavy disk activity will not power the system off by itself, but it can reveal a failing drive or a runaway process that is worth investigating.

Network Traffic Analysis

Network traffic should not cause a shutdown on its own, but if a remote command or script is suspected, use iftop or tcpdump to look for unusual activity around the time the server goes down.

Hardware Diagnostics

Hardware faults are a common cause of intermittent system issues.

SMART Data Monitoring

Use the smartctl command to check the health of your hard drives.

sudo smartctl -a /dev/sda

Replace /dev/sda with the correct device name for your hard drive. Look for errors or warnings.
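
For ATA drives, a few attributes are commonly watched for early signs of failure. The filter below is a sketch that assumes the ten-column attribute table printed for ATA devices (NVMe drives report health in a different layout); the function name smart_nonzero is hypothetical.

```shell
# Sketch: from a smartctl attribute table on stdin, print selected attributes
# whose raw value (10th column) is greater than zero.
smart_nonzero() {
    awk '$2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable)$/ && $10 + 0 > 0 {
        print $2 "=" $10
    }'
}

# Usage:
# sudo smartctl -A /dev/sda | smart_nonzero
```

No output means none of the selected attributes has a non-zero raw value; it does not prove the drive is healthy.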

Memory Testing

Run a memory test using memtest86+ to check for any memory-related issues. This is a more intrusive test: the server must be taken offline and booted into the tester, typically from a USB drive.

Power Supply Inspection

A failing power supply is a frequent culprit in shutdown problems.

  • Test with Another Power Supply: If possible, replace the power supply with a known good unit to determine if it is the source of the problem.
  • Voltage Monitoring: Monitor the voltages being supplied to various components. This typically requires specialized equipment.

Advanced Systemd Configuration

Systemd provides extensive control over system behavior. Further adjustments might be necessary.

Investigating Systemd Timers

Systemd timers are a way to trigger actions at specified intervals. These could potentially be interfering with power management.

  • List Active Timers: Use systemctl list-timers to see which timers are currently active.
  • Examine Timer Units: Inspect the configuration of any suspicious timers for clues on the cause.

Custom Systemd Units

If you have custom systemd units, review them carefully for any power-related settings or actions that might be causing a shutdown.
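
Reviewing units by hand gets tedious when there are many of them. As a sketch, the helper below lists Exec lines that mention a power action; the function name exec_power_actions is hypothetical and the keyword list is an assumption.

```shell
# Sketch: print ExecStart=/ExecStop=/etc. lines in unit files that mention a
# power action, prefixed with the file name. Only Exec* lines are matched, so
# ordinary dependencies such as Before=shutdown.target are not reported.
exec_power_actions() {
    grep -rHE '^Exec[A-Za-z]*=.*(poweroff|shutdown|suspend|hibernate|rtcwake)' "$@" 2>/dev/null
}

# Usage:
# exec_power_actions /etc/systemd/system
```

Any hit belonging to a timer-activated service is a strong candidate for a shutdown that recurs at a fixed interval.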

Power Management Configuration Files

Systemd reads its power-related settings from logind.conf and sleep.conf, plus any drop-in files. Examine these under /etc/systemd/ and /usr/lib/systemd/, including the logind.conf.d and sleep.conf.d subdirectories.

Step-by-Step Troubleshooting Plan

Here’s a structured plan to systematically diagnose and resolve the shutdown issue:

Phase 1: Initial Verification

  1. Reboot and Re-Check logind.conf: Ensure the changes in /etc/systemd/logind.conf are applied.
  2. Confirm gsettings: Verify that the GUI power settings are as expected.

Phase 2: Log Analysis

  1. Previous Boot Logs: After the server has shut down and been started again, review the output from sudo journalctl -b -1 -ex.
  2. Identify Shutdown Cause: Look for errors, warnings, or service failures.

Phase 3: Resource Monitoring

  1. Run top or htop: Monitor CPU and memory usage.
  2. Use iotop: Monitor disk I/O.

Phase 4: Hardware Checks

  1. SMART Data: Check hard drive health using smartctl.
  2. Power Supply Evaluation: If possible, swap the power supply.

Phase 5: Advanced Investigations

  1. Headless Configuration: Investigate TTY settings and consider a dummy plug.
  2. Systemd Timers: Examine active timers.

Conclusion and Further Action

By following this comprehensive guide, you should be well-equipped to identify and resolve the issue of your Ubuntu server shutting down after 15 minutes. Remember that thorough investigation is key. System logs, configuration files, and resource utilization are essential tools in this process.

If the issue persists after performing these steps, it’s helpful to:

  1. Document Your Findings: Keep a detailed record of each step taken and the corresponding results.
  2. Seek Expert Assistance: If you can’t resolve the problem, consider seeking assistance from experienced Linux system administrators. They may be able to offer more specific solutions based on your detailed documentation.
  3. Provide Detailed Information: When seeking help, provide as much information as possible, including your hardware configuration, logs, and a detailed account of the steps you’ve taken.

We hope that this article has helped you understand and solve the problem. Good luck!