How to Check Disk Space Usage in Linux with du and df
Graphic: df -h filesystem summary vs du -sh /path directory sizes, with example outputs, key columns, and tips for locating the largest files.
Managing disk space effectively is one of the most critical responsibilities for anyone working with Linux systems. When servers slow down, applications crash, or backups fail, the culprit is often insufficient disk space that went unnoticed until it became a crisis. Understanding how to monitor and analyze disk usage isn't just a technical skill—it's essential preventive maintenance that keeps systems running smoothly and helps avoid costly downtime.
In the Linux ecosystem, two powerful command-line utilities stand at the forefront of disk space management: du (disk usage) and df (disk free). While both tools provide insights into storage consumption, they approach the problem from different angles—du examines individual files and directories in granular detail, while df offers a bird's-eye view of filesystem capacity and availability. Together, they form a complete toolkit for understanding where your storage is going and how much remains available.
Throughout this comprehensive guide, you'll discover practical techniques for using both utilities, learn to interpret their output accurately, and master advanced options that transform these simple commands into powerful diagnostic tools. Whether you're troubleshooting a full partition, planning capacity upgrades, or simply maintaining healthy systems, you'll gain the knowledge to confidently manage disk space on any Linux distribution.
Understanding the df Command: Your Filesystem Overview Tool
The df command provides a comprehensive snapshot of filesystem disk space usage across all mounted partitions. When executed without arguments, it displays information about every mounted filesystem, showing total size, used space, available space, and the percentage of capacity consumed. This high-level perspective makes df invaluable for quickly identifying partitions that are approaching capacity limits.
The basic syntax is straightforward: simply type df in your terminal. However, the default output displays sizes in 1K blocks, which can be difficult to interpret at a glance. For human-readable output, the -h flag transforms those numbers into familiar units like megabytes, gigabytes, and terabytes. Running df -h presents information in a format that's immediately understandable, showing "15G" instead of "15728640" blocks.
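For example, the two forms of the command look like this (the device name and sizes below are illustrative, not taken from a real system):

```bash
# Default output: sizes reported in 1K blocks
df /

# Human-readable output: sizes in K, M, G, T
df -h /
# Filesystem      Size  Used Avail Use% Mounted on
# /dev/sda1        15G  9.2G  5.0G  65% /
```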
"Regular monitoring of filesystem capacity prevents emergency situations where critical systems fail due to full disks. Prevention is always easier than recovery."
Beyond basic usage, df offers several specialized options that provide deeper insights. The -T flag adds filesystem type information, revealing whether each partition uses ext4, xfs, btrfs, or another filesystem format. This detail becomes particularly important when troubleshooting performance issues or planning migrations, as different filesystem types have distinct characteristics and limitations.
| Option | Description | Example Output |
|---|---|---|
| -h | Human-readable format (KB, MB, GB) | 15G instead of 15728640 |
| -T | Display filesystem type | ext4, xfs, btrfs |
| -i | Show inode information instead of blocks | Inode usage percentages |
| -t | Limit output to a specific filesystem type | Only ext4 filesystems |
| -x | Exclude a specific filesystem type | Hide tmpfs or devtmpfs |
| --total | Add a total line at the bottom | Grand total of all filesystems |
Interpreting df Output Columns
Understanding what each column represents is essential for accurate analysis. The Filesystem column identifies the device or partition name, such as /dev/sda1 or /dev/mapper/vg-root. The Size column shows total capacity, while Used indicates how much space is currently occupied. The Available column reveals remaining free space, and Use% displays the percentage of capacity consumed.
One common point of confusion involves the relationship between these numbers. You might notice that Used + Available doesn't exactly equal Size. This discrepancy occurs because filesystems reserve a small percentage of space for system processes and the root user, preventing regular users from completely filling the disk and causing system failures. Typically, this reserved space amounts to about 5% of total capacity on ext-based filesystems.
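On ext-based filesystems you can check, and if appropriate adjust, this reserve with tune2fs. The device name below is an example, and lowering the reserve is generally best limited to data-only partitions rather than the root filesystem:

```bash
# Show the reserved block count for an ext2/3/4 filesystem
sudo tune2fs -l /dev/sda1 | grep -i "reserved block count"

# Lower the reserve to 1% on a non-root data partition
sudo tune2fs -m 1 /dev/sda1
```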
Monitoring Specific Filesystems
Rather than viewing all mounted filesystems, you can target specific ones by providing the mount point as an argument. The command df -h /home displays information exclusively about the partition containing the /home directory. This focused approach is particularly useful in scripts or when you're concerned about a specific partition's capacity.
For systems with numerous temporary filesystems like tmpfs or devtmpfs that clutter the output, the exclusion option proves invaluable. Running df -h -x tmpfs -x devtmpfs filters out these virtual filesystems, presenting only physical storage devices. This cleaner view helps you concentrate on the filesystems that actually consume disk space.
Mastering the du Command: Detailed Directory Analysis
While df provides filesystem-level insights, du excels at examining disk usage within directories and files. This command recursively analyzes directory structures, calculating the total space consumed by each directory and its contents. When a partition fills up, du becomes your investigative tool for identifying which directories and files are consuming the most space.
The basic du command without options displays disk usage for the current directory and all subdirectories, with sizes shown in kilobytes. However, this default output can be overwhelming for large directory trees. Adding the -h flag once again provides human-readable sizes, while -s (summarize) shows only the total for each specified directory without descending into subdirectories.
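Two common invocations look like this (the paths are examples):

```bash
# Single-line total for one directory tree
du -sh /var/log

# Human-readable sizes for the current directory and every subdirectory
du -h .
```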
"The difference between du and df often reveals hidden space consumption from deleted files still held open by processes—a common troubleshooting scenario that catches many administrators off guard."
Essential du Options for Practical Use
- -h — Displays sizes in human-readable format (K, M, G)
- -s — Shows summary totals only, without subdirectory details
- -c — Produces a grand total at the end of the output
- -a — Includes individual files in the output, not just directories
- --max-depth=N — Limits recursion to N levels deep
- --exclude=PATTERN — Skips files and directories matching the pattern
- --time — Shows the last modification time for each entry
One of the most powerful combinations for identifying space hogs is du -h --max-depth=1 | sort -hr. This command analyzes the current directory to one level deep, displays results in human-readable format, and pipes the output to sort for arranging entries by size in descending order. The result is an immediately actionable list showing which subdirectories consume the most space.
Finding the Largest Files and Directories
When investigating disk space issues, you typically want to identify the biggest consumers quickly. The command du -ah /var | sort -rh | head -20 scans the /var directory (including all files with the -a flag), sorts results by size in reverse order, and displays the top 20 entries. This technique rapidly pinpoints large log files, database dumps, or accumulated cache data that may need attention.
For targeting specific file types, combining du with find creates powerful search capabilities. The command find /home -name "*.log" -exec du -h {} + | sort -rh locates all log files under /home and displays their sizes sorted from largest to smallest. This approach is particularly useful when you suspect a particular file type is consuming excessive space.
Comparing du and df: Understanding the Differences
A frequent source of confusion arises when du and df report different numbers for the same filesystem. These discrepancies are normal and occur for several legitimate reasons. Understanding why these differences exist helps you interpret the data correctly and avoid misdiagnosis of disk space issues.
The primary difference lies in their measurement approach. The df command reports space usage at the filesystem level, querying the filesystem's metadata structures directly. In contrast, du walks through the directory tree, summing up the sizes of files and directories it encounters. This fundamental distinction means they can see different aspects of space consumption.
"When du and df disagree significantly, the most common culprit is deleted files that remain open in running processes. The filesystem still allocates space for these files even though they're no longer visible in the directory structure."
Common Reasons for Discrepancies
🔍 Deleted but Open Files: When a process opens a file and that file is subsequently deleted, the filesystem maintains the file's data blocks until the process closes the file or terminates. During this time, df counts this space as used (because it is), but du doesn't see the file (because it's been removed from the directory structure). This scenario frequently occurs with log files that are deleted while applications still write to them.
📊 Reserved Blocks: As mentioned earlier, ext-based filesystems reserve space for privileged processes. The df command accounts for this reserved space in its calculations, while du only sees the actual files present. This reserved space typically amounts to 5% of total capacity, which can represent gigabytes on large filesystems.
💾 Sparse Files: Some applications create sparse files—files with "holes" where no data is actually written to disk. These files report a large size but consume less physical space. The du command, depending on options used, may report apparent size rather than actual disk usage, leading to differences from df's accounting.
🗂️ Metadata Overhead: Filesystems maintain various metadata structures including inodes, directory entries, journals, and allocation bitmaps. The df command includes this overhead in its calculations, while du only counts user-visible files and directories. On filesystems with millions of small files, this metadata can consume substantial space.
⚙️ Mount Point Overlays: When filesystems are mounted over existing directories, the files in those directories become hidden but still consume space. The df command reports usage for the mounted filesystem, while du might traverse the hidden directory structure depending on how the scan was initiated.
Identifying Deleted but Open Files
When you suspect deleted files are consuming space, the lsof command reveals processes holding deleted files open. Running lsof +L1 lists all open files that have been deleted (link count less than 1). The output shows the process ID, the user, and the file descriptor, allowing you to identify which processes need to be restarted or signaled to release the space.
Alternatively, examining the /proc filesystem provides similar information. The command find /proc/*/fd -ls 2>/dev/null | grep deleted searches through process file descriptors for deleted files. Once identified, you can either restart the offending process or, in some cases, truncate the file through the /proc interface to immediately reclaim space.
| Scenario | df Shows | du Shows | Resolution |
|---|---|---|---|
| Deleted open files | Space as used | Files not present | Restart process or truncate via /proc |
| Reserved blocks | Includes reserved space | Only user files | Normal behavior, no action needed |
| Sparse files | Actual disk usage | May show apparent size | Use du --apparent-size for comparison |
| Filesystem metadata | Includes metadata overhead | Only file data | Normal overhead, consider in capacity planning |
| Hidden mount points | Mounted filesystem only | May include hidden files | Unmount to access underlying data |
Advanced Techniques for Disk Space Analysis
Beyond basic usage, combining these commands with other utilities creates powerful disk space management workflows. Shell scripting enables automated monitoring, alerting, and reporting that keeps you informed about storage trends before they become problems.
Creating Automated Monitoring Scripts
A simple monitoring script can check disk usage and send alerts when thresholds are exceeded. This script uses df to check usage percentages and sends notifications when any filesystem exceeds 80% capacity:
```bash
#!/bin/bash
# Alert when any real filesystem exceeds the threshold percentage
THRESHOLD=80

# Skip the header line and virtual filesystems, then emit "use% device" pairs
df -H | grep -vE '^Filesystem|tmpfs|cdrom' | awk '{ print $5 " " $1 }' | while read -r output; do
    usage=$(echo "$output" | awk '{ print $1 }' | sed 's/%//g')
    partition=$(echo "$output" | awk '{ print $2 }')
    if [ "$usage" -ge "$THRESHOLD" ]; then
        echo "Alert: $partition is ${usage}% full"
    fi
done
```

This script can be scheduled via cron to run at regular intervals, providing proactive monitoring that catches capacity issues before they impact services. For production environments, integrating with monitoring systems like Nagios, Zabbix, or Prometheus provides more sophisticated alerting and historical trending.
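A minimal cron entry for such a check might look like the following, assuming the script is saved as /usr/local/bin/disk-alert.sh (the path and schedule are assumptions, not requirements):

```bash
# Run the disk space check every 15 minutes (add via `crontab -e` as root)
*/15 * * * * /usr/local/bin/disk-alert.sh
```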
Analyzing Growth Patterns Over Time
Understanding how disk usage changes over time helps with capacity planning and identifying unusual growth patterns. Creating a simple logging mechanism that records daily usage enables trend analysis. The command df -h | grep "^/dev" >> /var/log/disk-usage-$(date +%Y-%m).log appends current usage to a monthly log file.
"Capacity planning isn't about reacting to full disks—it's about predicting when they'll fill based on historical growth patterns and taking action well in advance."
For more sophisticated analysis, tools like sar (System Activity Reporter) can track disk usage metrics over extended periods. Enabling sar's data collection through the sysstat package provides detailed historical data that reveals usage patterns, growth rates, and capacity trends across all monitored systems.
Excluding Directories from Analysis
When analyzing disk usage, certain directories may contain data you want to exclude from your analysis. Network-mounted filesystems, temporary directories, or backup locations might skew results. The du command's --exclude option filters out unwanted paths: du -h --exclude=/proc --exclude=/sys --exclude=/mnt / | sort -rh | head -20 analyzes the root filesystem while skipping virtual and mounted filesystems.
For complex exclusion patterns, creating an exclude file simplifies command syntax. List patterns in a text file (one per line) and reference it with du -h --exclude-from=exclude-list.txt /. This approach is particularly useful for standardized audits across multiple systems where the same directories should always be excluded.
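A hypothetical version of that workflow, with example patterns, might look like this:

```bash
# Create a reusable exclusion list (the patterns here are examples)
cat > exclude-list.txt <<'EOF'
/proc
/sys
/mnt
/media
EOF

# Run the audit while skipping every pattern in the list
du -h --exclude-from=exclude-list.txt / | sort -rh | head -20
```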
Troubleshooting Common Disk Space Issues
Even with regular monitoring, disk space problems occasionally occur. Knowing how to quickly diagnose and resolve these issues minimizes downtime and prevents cascading failures. The combination of df and du, along with supporting utilities, provides a complete troubleshooting toolkit.
When a Filesystem Shows 100% Full
A filesystem reporting 100% capacity requires immediate attention, as it can cause application failures, prevent logging, and even impact system stability. The first step is identifying what's consuming space. Start with df to confirm which filesystem is full, then use du to investigate: du -hx --max-depth=2 /full/filesystem | sort -rh | head -20. The -x flag prevents du from crossing filesystem boundaries, ensuring you're analyzing only the problematic partition.
Common culprits include log files that have grown unexpectedly, core dumps from crashed applications, temporary files that weren't cleaned up, or database files that have expanded. Once identified, you can take appropriate action—rotating logs, removing old backups, cleaning temporary directories, or expanding the filesystem if growth is legitimate.
Inode Exhaustion: The Hidden Capacity Problem
Filesystems can become unusable even when plenty of space remains if they run out of inodes. Each file and directory consumes an inode regardless of size, so filesystems with millions of tiny files can exhaust inodes while showing available space. The command df -i displays inode usage instead of block usage, revealing whether inode exhaustion is the problem.
"Inode exhaustion is often overlooked because most administrators only check space usage. Systems with extensive log directories, mail spools, or cache files are particularly susceptible to this issue."
When facing inode exhaustion, the solution involves removing files rather than freeing space. Use find /path -type f | wc -l to count files in suspect directories. Common sources include mail queues with thousands of messages, log directories with one file per event, or application caches that create numerous small files. Implementing file rotation, archiving, or consolidation resolves the immediate issue, while adjusting inode allocation during filesystem creation prevents recurrence.
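A rough sketch for locating inode-heavy directories on a full root filesystem (adjust the starting path for other filesystems):

```bash
# Count files per top-level directory, staying on the root filesystem only (-xdev)
find / -xdev -type f 2>/dev/null | awk -F/ '{print "/" $2}' | sort | uniq -c | sort -rn | head -10
```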
Rapidly Growing Directories
Sometimes directories grow unexpectedly fast, filling filesystems before monitoring alerts trigger. Identifying rapid growth requires comparing usage over short time intervals. Create a baseline with du -sh /suspect/directory > /tmp/baseline.txt, wait a few minutes, then compare with du -sh /suspect/directory. Significant differences indicate active growth that needs investigation.
For real-time monitoring of directory growth, the watch command provides continuous updates: watch -n 60 'du -sh /suspect/directory' refreshes the display every 60 seconds, allowing you to observe growth as it happens. This technique is invaluable when troubleshooting runaway processes or misconfigured applications that generate excessive data.
Optimizing Performance for Large Filesystems
On systems with massive directory structures containing millions of files, both du and df can take considerable time to complete. Understanding performance implications and optimization techniques ensures these tools remain practical even on large-scale systems.
Limiting du Recursion Depth
The most effective performance optimization for du is limiting recursion depth. Rather than analyzing entire directory trees, restrict scanning to a few levels: du -h --max-depth=3 /large/directory. This approach provides sufficient detail for identifying problem areas without the overhead of traversing millions of files.
For initial investigations, start with --max-depth=1 to identify which top-level subdirectories consume the most space, then drill down into specific directories with deeper scans. This iterative approach is far more efficient than immediately scanning the entire tree, especially when you're searching for a specific problem area.
Using Parallel Processing
On systems with multiple cores, parallel processing can significantly accelerate du operations. The parallel utility, available in most distributions, enables concurrent directory scanning. The command find /path -maxdepth 1 -type d | parallel du -sh analyzes multiple subdirectories simultaneously, leveraging available CPU cores to reduce total execution time.
For extremely large filesystems, dedicated disk usage analysis tools like ncdu (NCurses Disk Usage) provide optimized scanning with interactive navigation. These tools cache results, enabling quick exploration without rescanning, and offer visual interfaces that make identifying space consumers more intuitive than command-line output.
Integration with System Monitoring and Automation
Professional system administration requires integrating disk space monitoring into broader infrastructure management practices. Modern monitoring ecosystems collect metrics, generate alerts, and provide historical analysis that transforms reactive troubleshooting into proactive capacity management.
Exporting Metrics to Monitoring Systems
Most monitoring platforms can execute custom scripts and collect their output as metrics. A simple script that outputs df statistics in a parseable format enables integration with monitoring systems. This example generates output suitable for Prometheus or similar time-series databases:
```bash
#!/bin/bash
# Emit one gauge per mounted filesystem in Prometheus exposition format,
# e.g. disk_usage{mount="/",device="/dev/sda1"} 65
df -P | awk 'NR>1 {print "disk_usage{mount=\""$6"\",device=\""$1"\"} "$5}' | sed 's/%//'
```

This script produces metrics in a format that monitoring systems can scrape, store, and visualize, enabling dashboards that show disk usage trends across entire server fleets. Historical data reveals growth rates, seasonal patterns, and anomalies that inform capacity planning decisions.
Automated Cleanup Strategies
Preventive automation reduces manual intervention by implementing cleanup policies that execute automatically. Simple cron jobs can remove old files from temporary directories, rotate logs, or archive data based on age. The command find /tmp -type f -mtime +7 -delete removes files from /tmp older than seven days, while find /var/log -name "*.gz" -mtime +30 -delete purges compressed logs older than 30 days.
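As a sketch, these cleanups could be scheduled in root's crontab; the retention periods are examples, and paths should be verified before enabling automatic deletion:

```bash
# Daily at 02:30: purge /tmp files untouched for more than 7 days
30 2 * * * find /tmp -type f -mtime +7 -delete

# Weekly on Sunday at 03:00: remove compressed logs older than 30 days
0 3 * * 0 find /var/log -name "*.gz" -mtime +30 -delete
```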
More sophisticated automation uses configuration management tools like Ansible, Puppet, or Chef to enforce consistent cleanup policies across infrastructure. These tools ensure that all systems maintain appropriate retention policies, preventing individual servers from developing unique disk space issues due to configuration drift.
Best Practices for Disk Space Management
Effective disk space management combines proactive monitoring, regular maintenance, and thoughtful system design. Implementing these practices prevents most disk space emergencies and ensures systems remain stable and performant.
Establishing Baseline Monitoring
📌 Set Alert Thresholds Appropriately: Configure alerts at 75% capacity to provide warning, and escalate at 85% to ensure urgent attention before reaching critical levels. Different filesystems may warrant different thresholds based on their growth patterns and criticality.
📌 Monitor Both Space and Inodes: Don't rely solely on space usage metrics. Include inode monitoring in your alerting strategy to catch exhaustion scenarios before they impact operations.
📌 Track Growth Rates: Historical data reveals how quickly filesystems fill, enabling predictive alerting that warns when capacity will be reached based on current trends rather than waiting for threshold breaches.
📌 Document Normal Patterns: Understanding typical usage patterns for each system helps identify anomalies quickly. Sudden deviations from baseline behavior often indicate problems that need investigation.
📌 Regular Audits: Schedule periodic comprehensive audits using du to identify accumulating data that may not trigger alerts but gradually consumes capacity. Old backups, forgotten archives, and orphaned files often hide in corners of the filesystem.
Implementing Log Rotation and Retention Policies
Log files are among the most common causes of unexpected disk space consumption. Implementing robust log rotation through logrotate ensures logs are compressed, archived, and eventually deleted according to defined policies. Standard configurations rotate logs daily or weekly, compress old logs, and retain them for a specified period before deletion.
Application-specific logs require particular attention, as not all applications automatically rotate their logs. Custom logrotate configurations can manage these logs, while monitoring ensures rotation occurs successfully. Failed rotation often goes unnoticed until logs fill the filesystem.
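A sketch of a custom logrotate policy for such an application might look like this, assuming its logs live under /var/log/myapp (the path and retention values are hypothetical):

```
# /etc/logrotate.d/myapp -- example policy; adjust path and retention as needed
/var/log/myapp/*.log {
    daily
    rotate 14
    compress
    delaycompress
    missingok
    notifempty
    copytruncate
}
```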
Capacity Planning and Growth Projections
Reactive disk space management—adding capacity only when filesystems fill—creates unnecessary emergencies and service disruptions. Proactive capacity planning uses historical growth data to predict when additional capacity will be needed, allowing for planned expansions during maintenance windows rather than emergency interventions.
"The best disk space problem is the one that never happens because you saw it coming months in advance and took preventive action during scheduled maintenance."
Calculate growth rates by comparing usage over consistent intervals. If a filesystem grows 10GB per month, and 100GB remains available, you have approximately 10 months before intervention is needed. Factor in growth acceleration—systems rarely grow linearly—and plan capacity additions with comfortable margins.
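As a trivial worked example (the figures are hypothetical):

```bash
available_gb=100          # free space remaining today
growth_gb_per_month=10    # observed average monthly growth
echo "Months until full: $(( available_gb / growth_gb_per_month ))"   # prints 10
```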
Platform-Specific Considerations
While du and df are standard across Linux distributions, subtle differences in behavior, available options, and output formatting exist between platforms. Understanding these variations ensures scripts and procedures work reliably across diverse environments.
GNU vs BSD Implementations
Linux systems typically use GNU versions of du and df, which offer extensive options and consistent behavior. BSD-based systems (including macOS) use different implementations with slightly different option flags and output formats. For example, GNU df's --output option for custom column selection isn't available in BSD df.
When writing portable scripts, stick to POSIX-standard options that work across implementations. The -h flag for human-readable output is widely supported, while more exotic options may require conditional logic based on detected platform.
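One hedged way to branch on the implementation is to probe for a GNU-only flag and fall back to portable options otherwise:

```bash
# GNU df accepts --version; BSD/macOS df does not, so the probe fails there
if df --version >/dev/null 2>&1; then
    df -h --output=target,pcent    # GNU-only: custom column selection
else
    df -h                          # POSIX-friendly fallback
fi
```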
Container and Cloud Environments
Containerized environments introduce additional complexity to disk space management. Containers share the host's kernel and filesystem, making traditional monitoring approaches less effective. Container-specific tools like docker system df provide visibility into space consumed by images, containers, and volumes.
Cloud environments often abstract storage behind network-attached volumes and object storage services. While du and df remain relevant for analyzing instance-local storage, comprehensive monitoring requires integration with cloud provider APIs to track network storage consumption, which may not be visible through traditional filesystem commands.
Practical Examples and Real-World Scenarios
Theoretical knowledge becomes practical skill through application to real-world scenarios. These examples demonstrate how to apply du and df in common situations administrators face regularly.
Scenario: Identifying What Filled the Root Partition
Your monitoring alerts that the root partition has reached 90% capacity. Starting with df -h / confirms the issue. Next, du -hx --max-depth=1 / | sort -rh reveals which top-level directories consume the most space. Suppose /var dominates the output. Drill deeper with du -hx --max-depth=1 /var | sort -rh, which shows /var/log as the culprit.
Further investigation with ls -lhS /var/log lists log files by size, revealing a 20GB application log that hasn't rotated. Examining the log confirms it's current and active, so truncating it would lose data. Instead, configure logrotate for this application, manually rotate the current log, and compress old entries to immediately reclaim space.
Scenario: Preparing for Database Migration
Before migrating a database to a new server, you need to determine how much space the database currently consumes. Using du -sh /var/lib/mysql provides a summary, while du -h --max-depth=1 /var/lib/mysql | sort -rh shows which databases are largest. This information guides decisions about storage allocation on the destination server.
Additionally, running df -h /var/lib/mysql shows how much free space remains on the current system, helping determine whether a backup can be created locally or requires external storage. This comprehensive view of current usage and available capacity ensures the migration proceeds without space-related complications.
Scenario: Investigating Discrepancy Between du and df
The df command shows /home at 85% capacity, but du -sh /home indicates only 60% of the filesystem size is accounted for. This 25% discrepancy suggests deleted files held open by processes. Running lsof +L1 | grep /home reveals a process holding a deleted 50GB file open.
Investigating the process shows it's a long-running application that logs to a file that was deleted during manual cleanup. Rather than restarting the application immediately, you truncate the file through the /proc filesystem: : > /proc/[PID]/fd/[FD], where [PID] is the process ID and [FD] is the file descriptor number from lsof output. This immediately reclaims space without interrupting the application.
Advanced Filtering and Reporting Techniques
Beyond basic usage, combining du and df with text processing utilities creates sophisticated reporting capabilities that provide exactly the information you need in precisely the format you want.
Creating Custom Reports with awk and sed
The awk programming language excels at processing columnar data like df output. This example generates a report of filesystems exceeding 70% capacity:
```bash
df -h | awk 'NR>1 && $5+0 > 70 {print $6 " is " $5 " full with " $4 " available"}'
```

This command skips the header line (NR>1), converts the percentage to a number ($5+0), filters for usage above 70%, and formats a readable message. The output might read: "/var is 85% full with 15G available", providing actionable information at a glance.
Generating HTML Reports
For regular reporting to management or team dashboards, converting command output to HTML creates professional-looking reports. This script generates an HTML table of disk usage:
```bash
#!/bin/bash
# Build a simple HTML table of physical filesystem usage
echo "<table border='1'>"
echo "<tr><th>Filesystem</th><th>Size</th><th>Used</th><th>Available</th><th>Use%</th><th>Mounted On</th></tr>"
df -h | grep "^/dev" | awk '{print "<tr><td>"$1"</td><td>"$2"</td><td>"$3"</td><td>"$4"</td><td>"$5"</td><td>"$6"</td></tr>"}'
echo "</table>"
```

This HTML can be embedded in emails, displayed on monitoring dashboards, or published to internal wikis, making disk usage information accessible to non-technical stakeholders who need visibility into system health.
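One possible use, with illustrative paths: save the script as disk-report.sh and publish its output where an internal web server or wiki can pick it up:

```bash
./disk-report.sh > /var/www/html/disk-report.html
```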
Security Considerations in Disk Space Monitoring
Disk space analysis can inadvertently expose sensitive information or create security vulnerabilities if not implemented thoughtfully. Understanding these risks and implementing appropriate safeguards protects both data and systems.
Permissions and Access Control
The du command requires read permissions on directories and files it analyzes. Running du as a regular user may encounter permission denied errors when accessing system directories or other users' files. While running as root provides complete access, it also risks exposing sensitive information in reports or logs.
For automated monitoring, create a dedicated service account with minimal necessary permissions rather than using root. Grant this account read access only to directories that require monitoring, using group memberships or ACLs to provide targeted access without excessive privileges.
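A sketch using POSIX ACLs, assuming a dedicated diskmon account (the account name and paths are examples):

```bash
# Grant the monitoring account read access plus directory traversal under /var/log
sudo setfacl -R -m u:diskmon:rX /var/log

# Add a default ACL on directories so files created later inherit the same access
sudo find /var/log -type d -exec setfacl -d -m u:diskmon:rX {} +
```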
Protecting Sensitive Information in Reports
Disk usage reports may reveal sensitive information through filenames, directory structures, or usage patterns. Before distributing reports broadly, consider what information they expose. A report showing a sudden 100GB increase in /home/username might reveal that user's activities in ways they didn't intend.
Implement filtering and aggregation that provides necessary information without excessive detail. Summary reports showing department-level or project-level usage are often sufficient for capacity planning without exposing individual user data. Reserve detailed reports for administrators with legitimate need-to-know.
Troubleshooting Command Errors and Unexpected Behavior
Even straightforward commands occasionally produce errors or unexpected results. Understanding common issues and their resolutions prevents frustration and wasted troubleshooting time.
Permission Denied Errors
When du encounters directories it cannot read, it displays "Permission denied" errors and continues processing. These errors can clutter output and make results difficult to read. Redirect error messages to /dev/null to suppress them: du -h /path 2>/dev/null. However, be aware that suppressing errors also hides legitimate problems, so use this technique judiciously.
Alternatively, if you need to see errors but want them separated from normal output, redirect stderr to a separate file: du -h /path 2>errors.txt. This approach preserves error information for review while keeping standard output clean.
Stale NFS Mounts
Network filesystems that become unavailable can cause du and df to hang indefinitely while waiting for responses from unresponsive servers. The df command's -x option excludes specific filesystem types: df -x nfs -x nfs4 skips NFS mounts entirely, preventing hangs.
For du, preventing traversal into network mounts requires the -x flag, which restricts scanning to a single filesystem. The command du -hx /path won't cross filesystem boundaries, effectively excluding mounted network filesystems from analysis.
Future-Proofing Your Disk Space Management Strategy
Technology evolves continuously, and disk space management practices must adapt to new storage technologies, architectures, and scale requirements. Building flexibility into your monitoring and management approach ensures it remains effective as infrastructure changes.
Adapting to New Filesystem Technologies
Modern filesystems like btrfs and zfs introduce features that traditional tools don't fully understand. Snapshots, compression, and deduplication affect how space is consumed and reported. While du and df continue working, they may not accurately represent actual disk usage on these advanced filesystems.
Filesystem-specific tools provide more accurate information. For btrfs, btrfs filesystem usage /path shows detailed space allocation including metadata and system chunks. For zfs, zfs list -o space displays space usage accounting for compression and deduplication. Incorporate these specialized tools alongside traditional utilities for comprehensive monitoring.
Scaling Monitoring for Large Infrastructures
As infrastructure grows from dozens to hundreds or thousands of servers, manual monitoring becomes impractical. Centralized monitoring platforms aggregate data from all systems, providing fleet-wide visibility and alerting. Investing in these platforms early—before they become absolutely necessary—prevents scrambling to implement monitoring during rapid growth.
Cloud-native environments with ephemeral instances require different approaches. Infrastructure-as-code ensures consistent monitoring configuration across all instances, while containerized monitoring agents deploy automatically with new instances. This automation ensures complete coverage without manual configuration for each system.
How often should I monitor disk space on production servers?
Production servers should have automated monitoring that checks disk space at least every 5-15 minutes, with alerts configured at appropriate thresholds. This frequency catches rapidly growing issues before they become critical. Additionally, implement weekly or monthly manual audits using du to identify gradual accumulation that might not trigger immediate alerts but represents long-term capacity concerns.
What's the best way to handle a filesystem that's completely full and preventing normal operations?
When a filesystem reaches 100% capacity, immediate action is required. First, use df to identify the full filesystem, then find the largest files with: find /path -type f -exec du -h {} + | sort -rh | head -20. Look for obvious candidates like old logs, core dumps, or temporary files that can be safely deleted. If no obvious files exist, check for deleted files held open by processes using lsof +L1. As a last resort, temporarily move files to another filesystem to restore operation, then investigate the root cause.
Why does du take so long to complete on large directories, and how can I speed it up?
The du command must traverse the entire directory tree and stat every file, which becomes time-consuming with millions of files. Speed up analysis by limiting depth with --max-depth=N to avoid unnecessary recursion. For initial investigations, start with --max-depth=1 to identify problem areas, then drill down into specific directories. Alternatively, use specialized tools like ncdu that cache results for faster navigation, or implement parallel processing with find and parallel to leverage multiple CPU cores.
What should I do when du and df show significantly different space usage numbers?
Discrepancies between du and df are normal and usually result from deleted files held open by processes, reserved filesystem blocks, or filesystem metadata overhead. To investigate, use lsof +L1 to identify deleted but open files. Check for sparse files with du --apparent-size versus regular du. Consider the 5% reserved space on ext filesystems. If the discrepancy is large and unexplained, filesystem corruption might be involved, warranting a filesystem check with fsck during maintenance.
How can I prevent disk space issues before they impact services?
Prevention requires multiple layers of protection. Implement automated monitoring with alerts at 75% and 85% capacity. Configure log rotation for all applications using logrotate. Establish retention policies that automatically remove old files from temporary directories, archives, and backups. Track growth rates to predict when capacity additions will be needed. Implement capacity planning that adds storage proactively rather than reactively. Regular audits with du identify accumulating data before it becomes problematic. This multi-layered approach catches issues at various stages before they impact operations.
Are there situations where df or du might provide inaccurate information?
Several scenarios can produce misleading results. Filesystem corruption may cause df to report incorrect statistics until fsck repairs the filesystem. On filesystems with compression or deduplication (btrfs, zfs), standard tools don't account for these features, requiring filesystem-specific utilities. When filesystems are mounted over existing directories, hidden files still consume space but aren't visible to du. Network filesystems that become unavailable may cause commands to hang or report stale data. In these cases, use filesystem-specific tools and verify results through multiple methods.