When your Linux server acts up, sifting through endless logs for a crucial clue can be a frustrating and time-consuming ordeal. What if you could instantly pinpoint the exact services causing trouble, cutting through the noise of hundreds of healthy processes? This article delves into the power of systemctl --failed, a vital command for any Linux administrator or DevOps engineer. Discover how this simple yet potent tool, integral to effective Linux service management, can streamline your systemd troubleshooting workflow, helping you rapidly identify, diagnose, and resolve issues on any modern Linux distribution.
Before diving deep, let’s establish some foundational context. systemd is the init system on most modern Linux distributions; it bootstraps the machine and supervises its background services. Its companion, systemctl, is the command-line interface for interacting with systemd. Every command demonstrated here has been tested on Ubuntu and Rocky Linux and should behave the same way on any modern distribution running systemd version 230 or later.
Mastering `systemctl --failed` for Linux Service Troubleshooting
Instantly Spotting Failed Linux Services with `systemctl --failed`
The systemctl list-units command is designed to display every unit that systemd is actively tracking. A “unit” is systemd’s generic term for any managed entity, be it a service, a mount point, or a timer. By default, this list is extensive and often overwhelming, as it includes everything functioning correctly.
Enter the game-changer: the --failed flag. This powerful filter instantly narrows the output to only those units currently in a “failed” state. This means it identifies services that either attempted to start or maintain operation but were unsuccessful. For seasoned sysadmins, this is often the first command executed when a server misbehaves, offering an immediate answer to “what’s broken right now?”
While systemctl --failed won’t tell you the *reason* for the failure or automatically fix anything, it’s an indispensable triage tool. It provides the exact service name you need to investigate further, saving precious time.
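In its simplest form, you just type the command and review the output. Because list-units is systemctl’s default command, systemctl --failed is shorthand for the full form shown here: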
systemctl list-units --failed
Output:
UNIT LOAD ACTIVE SUB DESCRIPTION
● nginx.service loaded failed failed A high performance web server
● mysql.service loaded failed failed MySQL Community Server
LOAD = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e., generalization of SUB.
SUB = The low-level unit activation state, values depend on unit type.
2 loaded units listed.
In this example, both the nginx and mysql services have failed, visually indicated by the red dot at the start of each line. If your output shows “0 loaded units listed,” congratulations: your system is currently free of failed units. Listing units does not normally require root, but if you do encounter a “Permission denied” error on a locked-down system, rerun the command with sudo.
Advanced Filtering and Diagnosis for Failed Linux Services
Pinpointing Failed `systemd` Services with `--type=service`
By default, systemctl --failed includes all unit types: sockets, timers, mounts, and services. However, during an outage, your primary concern is almost always services. To narrow your focus and reduce clutter, simply add the --type=service filter:
sudo systemctl list-units --failed --type=service
Output:
UNIT LOAD ACTIVE SUB DESCRIPTION
● nginx.service loaded failed failed A high performance web server
1 loaded units listed.
Now the output is limited to failed services only, which gives a clearer picture during live incidents. A common beginner’s pitfall is typing --type=services (with an ‘s’ at the end), which systemctl rejects as an unknown unit type. The correct value is always singular: service.
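As a minimal sketch of guarding against that typo in a script, the hypothetical helper below checks the requested type against the unit types systemd documents before it calls systemctl. The function name failed_units_of_type and the error message are illustrative assumptions, not part of systemd.

```shell
# Hypothetical helper: print the names of failed units of one type.
# Defaults to "service" when no argument is given.
failed_units_of_type() {
    local type="${1:-service}"
    case "$type" in
        service|socket|target|device|mount|automount|swap|timer|path|slice|scope)
            ;;  # known unit type, carry on
        *)
            echo "unknown unit type: $type (use the singular form, e.g. service)" >&2
            return 1
            ;;
    esac
    # --no-legend and --plain leave one unit per line; awk keeps the name column.
    systemctl list-units --failed --type="$type" --no-legend --plain | awk '{print $1}'
}
```

Calling failed_units_of_type timer would list failed timers, while failed_units_of_type services returns an error without touching systemctl.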
Uncovering the Root Cause: Diagnosing Failed Services on Linux
Once you have identified the list of failed services, the next logical step in debugging failed services is to determine *why* they failed. You can directly run systemctl status on a specific service name from the list. Alternatively, for multiple failures, you can leverage the power of xargs to fetch the status of every failed service in a single command.
The xargs command is a versatile utility that takes input from the left side of a pipe (|) and transforms it into arguments for the command on the right. This is incredibly useful when dealing with a dynamic list of failed services.
To check a single service:
sudo systemctl status nginx.service
Output:
× nginx.service - A high performance web server
Loaded: loaded (/lib/systemd/system/nginx.service; enabled; preset: enabled)
Active: failed (Result: exit-code) since Wed 2026-04-22 09:14:22 UTC; 3min ago
Process: 4821 ExecStartPre=/usr/sbin/nginx -t -q (code=exited, status=1/FAILURE)
Apr 22 09:14:22 web01 nginx[4821]: nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
The crucial line here is usually the last log entry: “bind() to 0.0.0.0:80 failed (98: Address already in use),” indicating that port 80 is already occupied. If the status output truncates the log line, add --no-pager -l to force the full text. The -l flag prevents ellipses, and --no-pager bypasses the less viewer, which can hide the tail of the output.
For situations with multiple failed services, you can chain commands to get status reports for all of them in one go:
systemctl list-units --failed --no-legend --plain | awk '{print $1}' | xargs -r sudo systemctl status --no-pager
Each component of this pipeline serves a specific purpose:
systemctl list-units --failed --no-legend --plain: Prints one line per failed unit, stripping the header, footer, and status dots.
awk '{print $1}': Extracts just the first column, which contains the unit name.
xargs -r sudo systemctl status --no-pager: Passes the unit names as arguments to systemctl status so that all reports are displayed without pagination. The -r flag (GNU xargs) skips the command entirely when no units have failed.
Output:
× nginx.service - A high performance web server
Loaded: loaded (/lib/systemd/system/nginx.service; enabled; preset: enabled)
Active: failed (Result: exit-code) since Wed 2026-04-22 09:14:22 UTC; 3min ago
Process: 4821 ExecStartPre=/usr/sbin/nginx -t -q (code=exited, status=1/FAILURE)
Apr 22 09:14:22 web01 nginx[4821]: nginx: [emerg] bind() to 0.0.0.0:80 failed
× mysql.service - MySQL Community Server
Loaded: loaded (/lib/systemd/system/mysql.service; enabled; preset: enabled)
Active: failed (Result: exit-code) since Wed 2026-04-22 09:15:01 UTC; 2min ago
Process: 4902 ExecStart=/usr/sbin/mysqld (code=exited, status=1/FAILURE)
Apr 22 09:15:01 web01 mysqld[4902]: [ERROR] Could not open file '/var/log/mysql/error.log'
Now both failed services report their issues in sequence: a port conflict for Nginx and an error log file that could not be opened for MySQL. This immediate context allows for quick delegation of troubleshooting tasks. A common mistake here is omitting --no-pager, which can drop you into the less viewer and turn a quick command into an interactive chore.
Unique Tip: For deeper investigation into why a service failed, pair systemctl status with journalctl. Once you have the service name (e.g., nginx.service), run sudo journalctl -xeu nginx.service to view that unit’s log entries: -u restricts the output to the unit, -e jumps to the most recent entries, and -x adds explanatory text where available. This often provides more context than systemctl status alone.
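When several units have failed, the same idea can be looped. The sketch below is one possible way to do it, not a standard tool: the function name failed_unit_logs and the default of 20 lines are assumptions chosen for illustration. Depending on your setup, reading the system journal may require sudo or membership in a journal-reading group.

```shell
# Hypothetical helper: show the last few journal lines for every failed unit.
# Usage: failed_unit_logs [number_of_lines]
failed_unit_logs() {
    local lines="${1:-20}"
    systemctl list-units --failed --no-legend --plain |
        awk '{print $1}' |
        while read -r unit; do
            echo "=== $unit ==="
            # -u limits output to one unit, -n caps the number of lines.
            journalctl -u "$unit" -n "$lines" --no-pager
        done
}
```

Running failed_unit_logs 5 prints a short header for each failed unit followed by its five most recent journal lines, which is usually enough to decide where to look next.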
Automating Health Checks: Counting Failed Units in Linux
For automated health checks and robust Linux server monitoring scripts, you often need a simple numerical count of failed services rather than a human-readable table. This can be achieved by stripping headers and footers and then piping the output to wc -l.
systemctl list-units --failed --no-legend --plain | wc -l
Output:
2
Let’s break down the flags:
systemctl list-units --failed: Lists failed units with the standard header and footer.
--no-legend: Removes the column header and the summary line.
--plain: Drops the colored status dot, ensuring the output is ASCII-only and script-safe.
wc -l: Counts the remaining lines, with each line representing one failed unit.
This provides a clean ‘2’ (or ‘0’ if no failures), which is perfect for alerting systems via cron jobs, monitoring agents, or custom shell scripts. Forgetting the --plain flag is a common beginner’s error; while wc -l might still work, the presence of the colored dot can complicate later parsing attempts with tools like awk.
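For a cron job or monitoring agent, the count can be turned into an exit status. The following is a minimal sketch under stated assumptions: the function name check_failed_units, the message wording, and the use of exit status 2 for a critical result (a convention borrowed from common monitoring plugins) are all illustrative choices.

```shell
# Hypothetical health check: report the number of failed units and
# return non-zero when at least one unit has failed.
check_failed_units() {
    local count
    count=$(systemctl list-units --failed --no-legend --plain | wc -l)
    if [ "$count" -gt 0 ]; then
        echo "CRITICAL: $count failed unit(s)"
        return 2
    fi
    echo "OK: no failed units"
    return 0
}
```

A wrapper script could call check_failed_units and send an alert whenever the return status is non-zero; the one-line message is short enough to use as an email subject or chat notification.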
Revealing All: Listing Inactive and Failed Units
By default, list-units shows only units that systemd currently has in memory and that are active, have a pending job, or have failed. The --all flag lifts that restriction so that loaded but inactive units are included as well. Pairing it with the explicit --state=failed filter gives you a form that is easy to adapt to other states:
sudo systemctl list-units --all --state=failed
Output:
UNIT LOAD ACTIVE SUB DESCRIPTION
● nginx.service loaded failed failed A high performance web server
● mysql.service loaded failed failed MySQL Community Server
Because --state=failed still filters on the failed state, this output contains only failed units. Loaded but inactive units, such as a timer that has stopped firing, come into view when you drop the state filter or ask for a different state, for example --all --state=inactive. While --state=failed and --failed perform the same function, --state= is the more versatile option because it accepts many other values (e.g., active, inactive, activating), making it a valuable flag to master for broader system introspection.
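To illustrate how the --state= filter generalizes, here is a small sketch of a helper that prints unit names for whatever state you pass in. The function name units_in_state is an assumption made for this example.

```shell
# Hypothetical helper: print the names of all loaded units in a given state.
# Usage: units_in_state failed | units_in_state inactive | units_in_state activating
units_in_state() {
    local state="$1"
    systemctl list-units --all --state="$state" --no-legend --plain | awk '{print $1}'
}
```

For instance, units_in_state inactive lists loaded units that are not running. If you are unsure which values are accepted, systemctl --state=help prints the allowed states.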
Essential `systemctl` Flags for Linux System Administrators
Several systemctl flags prove invaluable during server outages and routine maintenance. Keep these in your muscle memory:
--no-pager: Sends output directly to the terminal, bypassing the less pager.
--no-legend: Strips the header and footer, ideal for script parsing.
--plain: Removes the visual status dot for clean, ASCII-only output.
--type=service: Filters the output to show only services, ignoring other unit types.
--state=failed: The explicit version of --failed, highly flexible when used with --all or other state filters.
Conclusion: Streamline Your Linux Troubleshooting Workflow
You now possess the knowledge to ask systemd the most critical question during an outage: “Which services are actually broken?” Furthermore, you can refine this answer to target only services, count failures for automated scripts, and swiftly retrieve the full failure reason. The --failed flag is one of the fastest ways to triage a sick server. When combined with --no-legend and --plain, it becomes a dependable building block for monitoring scripts, with output that is straightforward to parse.
We encourage you to try this immediately on your own machine. Open a terminal and execute:
systemctl list-units --failed
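If your system is healthy, challenge yourself by intentionally creating a failed unit (on a test machine, of course!). Stopping a service cleanly does not mark it as failed, so one option, assuming systemd-run is available, is to start a transient service that exits with an error. The unit name demo-fail is just a placeholder:
sudo systemd-run --unit=demo-fail /bin/false
Then rerun the first command; demo-fail.service should now appear in the list. When you are done, clear it with sudo systemctl reset-failed demo-fail.service. This hands-on experience will transform this command from something you’ve read about into an intuitive tool your fingers remember.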
Have you used systemctl --failed to catch a production outage before your monitoring system did? What was the trickiest failure you traced with it? Share your experiences and insights in the comments below!

