Ubuntu Server Keeps Shutting Down After 15 Minutes: Troubleshooting Guide
This guide covers an Ubuntu server that shuts down prematurely, specifically after approximately 15 minutes. Because server uptime is critical, we take a step-by-step approach to diagnosing and fixing the problem, based on how Linux systems and the Ubuntu environment handle idle and power events.
Initial Assessment and Problem Definition
The core problem, as described, is that an Ubuntu 24.04 server, running on desktop hardware, unexpectedly shuts down after a consistent 15-minute interval. This behavior started after transitioning to a headless configuration, where the server is operated without a directly connected monitor. While a GUI is installed, the server is primarily accessed via SSH and RDP. We have already identified attempts to address the issue through logind.conf modifications, but the problem persists.
Before diving into specific solutions, note that the shutdown could have several underlying causes: an idle or power action taken by systemd-logind, a desktop power setting, a scheduled job, or a hardware fault. Detailed diagnostics are what tell these apart.
Investigating the Root Cause: Diagnostic Steps
The provided information gives us a good starting point for our investigation. Let’s break down the core areas we need to examine.
Examining systemd-logind Configuration
Your initial approach of modifying /etc/systemd/logind.conf is a valid one, and it is often the first place to look. However, let’s make sure that these settings are correctly implemented and understood.
Reviewing logind.conf Settings
The settings you’ve configured in /etc/systemd/logind.conf are designed to prevent actions triggered by idle events, lid closures, or power-related events. Here’s a critical review of these settings and their implications:
- IdleAction=ignore: This setting is crucial. It should prevent the system from going into a sleep or power-off state due to user inactivity.
- HandleSuspendKey=ignore, HandleHibernateKey=ignore, HandleLidSwitch=ignore, HandleLidSwitchExternalPower=ignore, HandleLidSwitchDocked=ignore: These settings are designed to prevent the system from suspending or shutting down based on key presses or lid actions. While you mentioned this is a desktop, it’s still worth making sure they are all in place, as the system defaults are not always what you expect.
Confirming Effective Settings
After making changes to /etc/systemd/logind.conf
, it is essential to ensure that those changes are properly applied. The command sudo systemctl restart systemd-logind
is the correct method for reloading the configuration without a full system reboot. However, sometimes, a reboot is required to be certain.
Verify the Configuration: To confirm which settings logind actually reads, print the merged configuration:
systemd-analyze cat-config systemd/logind.conf
This command prints /etc/systemd/logind.conf followed by any drop-in files. Ensure that the values you set are present, uncommented, and not overridden by a later file. Note that systemctl show systemd-logind is not suitable for this check: it lists properties of the service unit, not the logind.conf settings, so filtering it for IdleAction or HandleLidSwitch returns nothing.
Check for Overriding Configurations: Systemd can have multiple configuration files. For logind, drop-in files in /etc/systemd/logind.conf.d/, /run/systemd/logind.conf.d/, and /usr/lib/systemd/logind.conf.d/ can override the main file. Separately, check /etc/systemd/system/ and /usr/lib/systemd/system/ for any units related to power management, sleep, or hibernation that might trigger the action by another route.
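When the main file and its drop-ins disagree, it helps to reduce the merged configuration to the last assignment of each key, since a later assignment replaces an earlier one. The following is a minimal sketch, assuming a systemd version that provides systemd-analyze cat-config; last_value_per_key is a hypothetical helper name, not a systemd tool.

```shell
# Hypothetical helper: read "Key=Value" lines on stdin and print the last
# value seen for each key, sorted by key. Comment lines (starting with #)
# and section headers such as [Login] are skipped.
last_value_per_key() {
    awk '
        /^[A-Za-z][A-Za-z0-9]*=/ {
            i = index($0, "=")
            value[substr($0, 1, i - 1)] = substr($0, i + 1)
        }
        END { for (k in value) print k "=" value[k] }
    ' | sort
}

# On the server, feed it the merged configuration (main file plus drop-ins):
#   systemd-analyze cat-config systemd/logind.conf | last_value_per_key
```

If a key you set does not appear in the result, it is probably still commented out in the file.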
Analyzing Power Management Settings
The presence of a GUI introduces another layer of power management that needs to be evaluated. These settings can significantly influence how the system behaves when idle.
Inspecting GNOME Power Settings
As the system runs a GUI, it’s crucial to examine the GNOME power settings. The command gsettings list-recursively org.gnome.settings-daemon.plugins.power provides insight into these settings.
Interpreting gsettings Output
Let’s analyze the provided output from gsettings.
- sleep-inactive-ac-timeout and sleep-inactive-battery-timeout: These settings determine how long the system waits before entering a sleep state when running on AC power or battery power, respectively. Your output shows 7200 seconds (2 hours) for AC and 900 seconds (15 minutes) for battery. The battery value matches the 15-minute interval exactly, so although the server is running on AC power, it is worth ruling out that the battery setting is the one being applied.
- sleep-inactive-ac-type and sleep-inactive-battery-type: These settings define the action to take when the idle timeouts are reached. In your output they are set to nothing and suspend. Only nothing disables the idle action; suspend still suspends the machine once the matching timeout expires.
- lid-close-ac-action and lid-close-battery-action: These settings apply to lid closure. A desktop normally has no lid switch, but they can still affect power behavior if the hardware reports one.
- idle-dim: If enabled, this may dim the screen after a period of inactivity. It should not be linked to shutdown, but it is still worth checking.
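Because the symptom is a fixed 15-minute interval, one quick check is to search the power settings for any idle timeout of exactly 900 seconds. This is a sketch under the assumption that gsettings prints one "schema key value" triple per line; timeouts_matching is a hypothetical helper name.

```shell
# Hypothetical helper: read `gsettings list-recursively` output on stdin and
# print the name of every sleep-inactive-*-timeout key whose value equals the
# number of seconds given as $1.
timeouts_matching() {
    awk -v want="$1" '
        $2 ~ /^sleep-inactive-.*-timeout$/ && $NF == want { print $2 }
    '
}

# 15 minutes = 15 * 60 = 900 seconds. On the server, as the desktop user:
#   gsettings list-recursively org.gnome.settings-daemon.plugins.power | timeouts_matching 900
```

Keep in mind that gsettings reports the settings of the user who runs it, so the result may differ between accounts.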
Troubleshooting Power Settings
- Verify Power Mode: Although you mentioned setting the power mode to “Balanced”, double-check this setting via the GUI. If you are running a service, ensure it does not affect power settings.
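- GUI Interference: GNOME applies its own idle timeouts within each graphical session, including a remote desktop (RDP) session, independently of IdleAction in logind.conf. A desktop power setting can therefore still suspend the machine even when the system-level configuration says to ignore idleness.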
Examining System Logs: The journalctl Command
System logs are invaluable for pinpointing the cause of the shutdowns. The output from sudo journalctl -b -1 -ex offers critical clues, but it needs careful interpretation.
Understanding the journalctl Output
Let’s analyze the log entries:
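- The log indicates a power-off sequence initiated by systemd. The final entries show final.target reached, followed by the termination of services and shutdown. This points to an orderly shutdown requested through systemd rather than an abrupt loss of power.
- The journal is from the previous boot (-b -1). Once the server has come back up, that is the boot that contains the shutdown, so look at the entries just before the power-off sequence begins to see what requested it. The current boot’s journal (sudo journalctl -b -ex) only covers events since the restart; it is useful for watching the system in the run-up to the next shutdown.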
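The 15-minute interval is the main clue, so it is worth confirming it from the logs rather than from memory. journalctl --list-boots prints the first and last journal entry of each boot; the sketch below turns two such timestamps into a duration. minutes_between is a hypothetical helper name, and the code assumes a date command that accepts -d with a "YYYY-MM-DD HH:MM:SS" timestamp, as GNU coreutils does.

```shell
# Hypothetical helper: print the whole minutes between two timestamps given
# as "YYYY-MM-DD HH:MM:SS". The body runs in a subshell so its variables do
# not leak into the calling shell.
minutes_between() (
    start=$(date -d "$1" +%s) || exit 1
    end=$(date -d "$2" +%s) || exit 1
    echo $(( (end - start) / 60 ))
)

# Copy the first and last entry times of one boot from `journalctl --list-boots`:
#   minutes_between "<first entry>" "<last entry>"
```

If several boots in a row lasted almost exactly the same time, a timeout is a more likely cause than a random hardware fault.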
Interpreting Log Entries and Identifying Root Causes
- Look for Warnings and Errors: Scan the logs for any warning or error messages that appear before the shutdown entries. These messages might provide clues about the cause of the shutdowns.
- Check for Service Failures: Search for any services that might have failed before the shutdown. These failures might be related to the power issue.
- Hardware-Related Errors: Keep an eye out for errors related to hardware, such as the hard drive, memory, or CPU. These could indicate hardware problems that might be causing the shutdowns.
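Scanning a full boot log by eye is slow, so a keyword filter can narrow it down first. The following is a rough sketch: power_hints is a hypothetical helper name, and the keyword list is an assumption about which words are worth looking for, not an exhaustive set of systemd messages.

```shell
# Hypothetical helper: keep only the lines on stdin that mention sleep,
# suspend, hibernation, power keys, idle actions, lid switches, or shutdown
# (case-insensitive).
power_hints() {
    grep -Ei 'suspend|hibernat|sleep|power key|power(ing)?[ -]?(off|down)|idle|shut(ting)?[ -]?down|lid switch'
}

# On the server, after it has come back up:
#   sudo journalctl -b -1 --no-pager | power_hints | tail -n 40
```

The first matching line before the power-off sequence is usually the most informative one, because it tends to name the component that asked for the action.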
Reviewing Additional System Information
Operating System Version
Verify that you are running Ubuntu 24.04 (Noble Numbat). Use lsb_release -a to confirm.
Hardware Configuration
Although you mention a desktop machine, detailed hardware information might be helpful.
For example, you can use the lshw command:
sudo lshw -short
Examine the output for critical components like the CPU, RAM, storage, and power supply. A faulty power supply can lead to unexpected shutdowns.
Installed Services
List all running services using:
systemctl list-units --type=service
Pay attention to services related to power management, network management, and any custom scripts you might have installed. It’s possible that a service is interfering with power management.
Advanced Troubleshooting Steps
If the initial steps don’t resolve the issue, consider these more advanced approaches.
Investigating the Headless Configuration
Since the problem manifested after going headless, examine any potential conflicts arising from the transition.
Virtual Console and TTYs
Even in a headless configuration, the virtual consoles (TTYs) can still be active. Although you’re primarily accessing the server remotely, problems with TTYs can still impact the system.
- Check TTY Settings: Look into the settings of the active TTYs. These can influence power management behavior.
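- Disable Unnecessary TTYs: You can disable unused TTYs by masking their services using sudo systemctl mask getty@tty{1..6}.service (in bash, the braces expand to one unit name per TTY). Masking every TTY also removes local console login, so consider leaving tty1 available for recovery.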
Monitor Emulation
Some systems may detect the lack of a connected monitor and, by default, take certain actions.
- Dummy Plug: Consider using a “dummy plug” or a VGA/HDMI emulator. These devices can trick the system into thinking that a monitor is connected, which can alter its power management behavior.
- Xorg Configuration: If using Xorg, review the configuration to ensure the monitor setup is appropriate for a headless environment.
Analyzing Process Resource Usage
High resource utilization can sometimes lead to unexpected shutdowns, especially if combined with power management settings.
Monitoring CPU and Memory Usage
Use the top command or htop to monitor CPU and memory usage in real time. Identify any processes consuming excessive resources.
Disk I/O Analysis
Use iotop to monitor disk I/O activity. Excessive disk activity could be related to a process or a storage issue that could trigger a shutdown.
Network Traffic Analysis
If network traffic is involved, use iftop or tcpdump to monitor network traffic for any unusual activity that may be causing a problem.
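Interactive monitors lose their view the moment the machine goes down, so it can help to write periodic samples to a file and read them after the next boot. The sketch below is one way to do that on Linux; resource_sample and the log path are hypothetical names chosen for illustration.

```shell
# Hypothetical helper: print one timestamped sample of load average and
# available memory. The two file arguments default to the Linux /proc files
# and exist mainly so the function can be tried against saved copies.
resource_sample() (
    loadavg_file=${1:-/proc/loadavg}
    meminfo_file=${2:-/proc/meminfo}
    load=$(cut -d' ' -f1-3 "$loadavg_file")
    avail_kb=$(awk '/^MemAvailable:/ { print $2 }' "$meminfo_file")
    printf '%s load=%s mem_available_kb=%s\n' \
        "$(date '+%Y-%m-%dT%H:%M:%S')" "$load" "$avail_kb"
)

# On the server, sample once a minute into a file that survives the shutdown
# (the log path is only an example):
#   while true; do resource_sample >> "$HOME/resource-samples.log"; sleep 60; done
```

If the last samples before a shutdown look unremarkable, resource exhaustion is an unlikely cause.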
Checking for Hardware-Related Issues
Hardware faults are a common cause of intermittent system issues.
SMART Data Monitoring
Use the smartctl command to check the health of your hard drives.
sudo smartctl -a /dev/sda
Replace /dev/sda with the correct device name for your hard drive. Look for errors or warnings.
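The full smartctl report is long, so a small filter can surface the attributes that most often signal a failing disk. This sketch assumes an ATA/SATA drive, where smartctl -A prints an attribute table with the raw value in the tenth column; NVMe drives use a different layout. smart_warnings is a hypothetical helper name, and the three attributes are a common shortlist rather than a complete health check.

```shell
# Hypothetical helper: read `smartctl -A` output on stdin and print any of
# three sector-related attributes whose raw value (column 10) is above zero.
smart_warnings() {
    awk '
        $2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable)$/ && $10 + 0 > 0 {
            print $2 "=" $10
        }
    '
}

# On the server (replace /dev/sda with your device):
#   sudo smartctl -A /dev/sda | smart_warnings
```

No output means none of the three counters is above zero; it does not prove the drive is healthy.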
Memory Testing
Run a memory test using memtest86+ to check for any memory-related issues. This is a more intrusive test and requires booting from a USB drive.
Power Supply Inspection
A failing power supply is a frequent culprit in shutdown problems.
- Test with Another Power Supply: If possible, replace the power supply with a known good unit to determine if it is the source of the problem.
- Voltage Monitoring: Monitor the voltages being supplied to various components. This typically requires specialized equipment.
Advanced Systemd Configuration
Systemd provides extensive control over system behavior. Further adjustments might be necessary.
Investigating Systemd Timers
Systemd timers are a way to trigger actions at specified intervals. These could potentially be interfering with power management.
- List Active Timers: Use systemctl list-timers to see which timers are currently active.
- Examine Timer Units: Inspect the configuration of any suspicious timers for clues on the cause.
Custom Systemd Units
If you have custom systemd units, review them carefully for any power-related settings or actions that might be causing a shutdown.
Power Management Configuration Files
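Systemd reads its power-related settings from configuration files. Examine logind.conf and sleep.conf in /etc/systemd/, their drop-in directories (logind.conf.d and sleep.conf.d), and the vendor defaults under /usr/lib/systemd/.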
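As an alternative to editing /etc/systemd/logind.conf directly, the same settings can live in a drop-in file, which keeps local changes separate from the packaged file. The sketch below writes such a file with the values discussed in this guide; write_logind_dropin and the file name 10-headless.conf are hypothetical names chosen for illustration.

```shell
# Hypothetical helper: write a logind drop-in into the directory given as $1.
# On a real system the directory is /etc/systemd/logind.conf.d, which needs
# root; the file name 10-headless.conf is an arbitrary example.
write_logind_dropin() {
    mkdir -p "$1" && cat > "$1/10-headless.conf" <<'EOF'
[Login]
IdleAction=ignore
HandleSuspendKey=ignore
HandleHibernateKey=ignore
HandleLidSwitch=ignore
HandleLidSwitchExternalPower=ignore
HandleLidSwitchDocked=ignore
EOF
}

# Try it against a scratch directory first, then inspect the result:
#   write_logind_dropin /tmp/logind-test && cat /tmp/logind-test/10-headless.conf
```

After placing the file in the real directory, logind still needs to be restarted, or the system rebooted, before the settings apply.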
Step-by-Step Troubleshooting Plan
Here’s a structured plan to systematically diagnose and resolve the shutdown issue:
Phase 1: Initial Verification
- Reboot and Re-Check logind.conf: Ensure the changes in /etc/systemd/logind.conf are applied.
- Confirm gsettings: Verify that the GUI power settings are as expected.
Phase 2: Log Analysis
- Previous Boot Logs: After the server comes back up, review the output from sudo journalctl -b -1 -ex, which covers the boot that ended in the shutdown.
- Identify Shutdown Cause: Look for errors, warnings, or service failures.
Phase 3: Resource Monitoring
- Run top or htop: Monitor CPU and memory usage.
- Use iotop: Monitor disk I/O.
Phase 4: Hardware Checks
- SMART Data: Check hard drive health using smartctl.
- Power Supply Evaluation: If possible, swap the power supply.
Phase 5: Advanced Investigations
- Headless Configuration: Investigate TTY settings and consider a dummy plug.
- Systemd Timers: Examine active timers.
Conclusion and Further Action
By following this comprehensive guide, you should be well-equipped to identify and resolve the issue of your Ubuntu server shutting down after 15 minutes. Remember that thorough investigation is key. System logs, configuration files, and resource utilization are essential tools in this process.
If the issue persists after performing these steps, it’s helpful to:
- Document Your Findings: Keep a detailed record of each step taken and the corresponding results.
- Seek Expert Assistance: If you can’t resolve the problem, consider seeking assistance from experienced Linux system administrators. They may be able to offer more specific solutions based on your detailed documentation.
- Provide Detailed Information: When seeking help, provide as much information as possible, including your hardware configuration, logs, and a detailed account of the steps you’ve taken.
We hope that this article has helped you understand and solve the problem. Good luck!