How to Find Large Files in Linux Quickly

Quick guide: find large files on Linux fast with du, find, and ncdu. Use du -ah --max-depth=1 | sort -hr or find / -type f -size +100M -exec ls -lh {} + to identify candidates to delete or archive and free disk space.

Running out of disk space on your Linux system can bring productivity to a grinding halt. Whether you're managing a personal workstation, maintaining a server, or troubleshooting performance issues, identifying which files are consuming precious storage is often the first critical step toward resolving the problem. Large files accumulate silently over time—log files grow unchecked, forgotten downloads pile up, and temporary files become permanent residents on your filesystem.

Finding large files in Linux involves using command-line tools and utilities that scan your filesystem to identify space-consuming culprits. This process encompasses multiple approaches, from simple one-line commands to sophisticated file analysis tools, each suited to different scenarios and user preferences. The right method depends on your specific needs, system configuration, and comfort level with terminal operations.

Throughout this comprehensive guide, you'll discover practical techniques for locating large files across your Linux system, understand the tools available at your disposal, learn how to interpret the results effectively, and develop strategies for managing disk space proactively. You'll gain hands-on knowledge of commands like find, du, and specialized utilities, complete with real-world examples and performance considerations that will transform how you approach storage management.

Understanding Disk Space Challenges in Linux Environments

Before diving into specific commands and techniques, it's essential to understand why disk space management matters and how files accumulate on Linux systems. Unlike desktop operating systems with graphical storage analyzers readily available, Linux environments—particularly servers—often require command-line proficiency to diagnose and resolve space issues efficiently.

Linux filesystems organize data hierarchically, starting from the root directory (/) and branching into numerous subdirectories. System files, user data, application logs, temporary files, and cached content all compete for available space. Without regular monitoring, certain directories can balloon unexpectedly: /var/log might fill with application logs, /tmp could retain abandoned temporary files, or /home directories might accumulate user downloads and media files.

"The most common cause of unexpected disk space exhaustion isn't a single massive file, but rather the gradual accumulation of medium-sized files that escape regular attention and maintenance routines."

Performance implications extend beyond simple storage capacity. When a filesystem approaches full capacity, system performance degrades, applications fail to write data, logs stop recording critical information, and in severe cases, the entire system may become unstable. Identifying large files quickly becomes not just a maintenance task but a critical troubleshooting skill.

Common Scenarios Requiring Large File Identification

Several situations typically prompt the need to locate large files:

  • Disk space alerts: System monitoring tools or manual checks reveal critically low available space
  • Application failures: Programs crash or malfunction due to inability to write temporary or permanent data
  • Performance degradation: System slowdowns related to disk I/O bottlenecks or near-capacity filesystems
  • Backup failures: Insufficient space prevents backup operations from completing successfully
  • Routine maintenance: Proactive identification of space consumption patterns before problems emerge
  • Security investigations: Detecting unusual file growth that might indicate compromise or unauthorized activity

Each scenario benefits from different approaches to file discovery. Emergency situations demand quick, targeted searches, while routine maintenance allows for comprehensive analysis and reporting.

Essential Command-Line Tools for File Discovery

Linux provides several powerful native utilities for locating large files. Understanding each tool's strengths, limitations, and optimal use cases enables you to select the most appropriate approach for your specific situation.

The Find Command: Precision and Flexibility

The find command represents the most versatile tool for locating files based on various criteria, including size. Its flexibility allows complex queries combining multiple parameters, making it ideal for targeted searches.

Basic syntax for finding large files:

find /path/to/search -type f -size +100M

This command searches the specified path for regular files (-type f) larger than 100 megabytes (-size +100M). The size parameter accepts various units:

  • b - 512-byte blocks (the default if no unit is given)
  • c - bytes
  • k - kibibytes (1,024 bytes)
  • M - mebibytes (1,024 KiB)
  • G - gibibytes (1,024 MiB)
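
A few illustrative thresholds using different units (the paths and limits here are arbitrary examples to adapt to your system):

# Files larger than 1 GiB anywhere under /home
find /home -type f -size +1G

# Files larger than 500 KiB in the current directory tree
find . -type f -size +500k

# Files smaller than 100 bytes under /etc (note the minus sign for "smaller than")
find /etc -type f -size -100c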

Advanced find examples with practical applications:

find / -type f -size +500M -exec ls -lh {} \; 2>/dev/null

This enhanced version searches the entire filesystem (/), finds files exceeding 500MB, executes ls -lh on each result to display human-readable sizes, and suppresses error messages by redirecting stderr to /dev/null.

For sorted output showing the largest files first:

find /home -type f -size +100M -exec ls -lh {} \; | sort -k5 -hr

This command searches the /home directory, lists files over 100MB, and sorts results by size in descending order. The sort command uses -k5 to sort by the fifth column (file size), -h for human-readable number comparison, and -r for reverse (largest first) ordering.

"When searching the entire filesystem, always redirect error messages to avoid cluttering output with permission denied warnings from directories your user account cannot access."

The Du Command: Directory-Level Analysis

While find excels at locating individual files, the du (disk usage) command provides directory-level analysis, revealing which folders consume the most space. This perspective often proves more actionable for cleanup operations.

Basic du usage for directory analysis:

du -h --max-depth=1 /var | sort -hr | head -20

This command analyzes the /var directory with human-readable sizes (-h), limits depth to immediate subdirectories (--max-depth=1), sorts results by size in descending order, and displays the top 20 largest directories.

For a comprehensive system-wide analysis:

du -hx --max-depth=2 / | sort -hr | head -30

The -x flag restricts the scan to a single filesystem, preventing du from crossing into mounted filesystems, which is particularly useful on systems with multiple partitions or network mounts.

Option | Purpose | Use Case
-h | Human-readable sizes (K, M, G) | Improves readability for manual inspection
-s | Summary only (total for each argument) | Quick overview without subdirectory details
-x | Stay on current filesystem | Prevents scanning mounted network drives
--max-depth=N | Limit directory traversal depth | Controls output verbosity and scan time
-c | Produce grand total | Shows cumulative size of all arguments
--exclude | Skip specified patterns | Ignore known large directories like backups
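
These options combine naturally; as a minimal sketch (the excluded pattern and path are arbitrary examples), the following reports a summary for each home directory while skipping ISO images and appending a grand total:

# Per-user totals under /home, ignoring ISO images, with a grand total at the end
du -shc --exclude='*.iso' /home/*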

Combining Commands for Powerful Queries

The true power of Linux command-line tools emerges when combining utilities through pipes and command substitution. These combinations create sophisticated queries tailored to specific discovery needs.

Finding the largest files across the entire system:

find / -type f -printf '%s %p\n' 2>/dev/null | sort -nr | head -20 | awk '{print $1/1024/1024 "MB", $2}'

This comprehensive command chain performs several operations: find locates all regular files and prints their size in bytes followed by the path; sort -nr numerically sorts results in descending order; head -20 limits output to the top 20 results; and awk converts byte sizes to megabytes for readability.

Locating files modified within a specific timeframe:

find /var/log -type f -size +50M -mtime -7 -exec ls -lh {} \;

This targeted search finds files in /var/log larger than 50MB that were modified within the last 7 days, helping identify actively growing log files that require attention.

"Combining size criteria with modification time filters helps distinguish between static large files that have existed for extended periods and actively growing files that represent ongoing space consumption."

Specialized Tools for Enhanced File Analysis

Beyond standard utilities, several specialized tools provide enhanced functionality, visual interfaces, and more intuitive interaction for file discovery and disk space analysis.

NCurses Disk Usage (ncdu): Interactive Exploration

The ncdu tool offers an interactive, ncurses-based interface for exploring disk usage. Unlike command-line utilities that produce static output, ncdu allows real-time navigation through directory structures, making it exceptionally user-friendly for identifying space consumption patterns.

Installation on various distributions:

# Debian/Ubuntu
sudo apt install ncdu

# Red Hat/CentOS/Fedora (use yum on older releases)
sudo dnf install ncdu

# Arch Linux
sudo pacman -S ncdu

Basic usage:

ncdu /

After scanning completes, ncdu presents an interactive interface where you can navigate directories using arrow keys, sort by various criteria, and even delete files directly from the interface. The tool displays size information, item counts, and percentage of parent directory space consumed.

Key features and navigation:

  • Arrow keys: Navigate between directories and files
  • Enter: Descend into selected directory
  • d: Delete selected item (with confirmation)
  • n: Sort by name
  • s: Sort by size
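
On servers, ncdu can also separate scanning from browsing: export the scan to a file once, then explore it later or copy it to another machine. A minimal sketch, assuming ncdu 1.x (the export path is an arbitrary example):

# Scan the root filesystem (staying on one filesystem) and save the results
ncdu -x -o /tmp/root-scan.ncdu /

# Browse the saved scan without rescanning
ncdu -f /tmp/root-scan.ncdu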

Disk Usage Analyzer (Baobab): Graphical Visualization

For desktop Linux environments, Baobab (also known as Disk Usage Analyzer) provides a graphical interface with visual representations of disk space consumption. The tool presents data as treemaps or ring charts, making patterns immediately apparent.

Installation:

# Debian/Ubuntu
sudo apt install baobab

# Fedora
sudo dnf install baobab

Launch from your application menu or via command line with baobab. The graphical interface allows scanning specific directories, viewing results as lists or charts, and accessing files directly through your file manager.

Dutree: Modern Visualization in Terminal

The dutree utility combines the functionality of du with visual tree representations directly in the terminal, providing an intuitive middle ground between pure command-line tools and full graphical applications.

Installation typically requires cargo (Rust package manager):

cargo install dutree

Usage example:

dutree -d 2 /var

The output displays a color-coded tree structure showing directory sizes with visual bars indicating relative space consumption, making patterns immediately recognizable even in terminal environments.

"Interactive tools like ncdu significantly reduce the cognitive load of disk space analysis by allowing exploration rather than requiring perfect query construction upfront."

Performance Considerations and Optimization Strategies

File discovery operations, particularly on large filesystems, can be resource-intensive and time-consuming. Understanding performance implications and optimization techniques ensures efficient searches without unnecessarily impacting system operations.

Limiting Search Scope for Faster Results

The most effective optimization strategy is narrowing the search scope to relevant directories rather than scanning entire filesystems. If you suspect log files are consuming space, searching only /var/log completes far faster than scanning from the root directory.

Strategic search locations based on common space consumers:

  • /var/log - Application and system logs
  • /tmp - Temporary files that may not be automatically cleaned
  • /home - User data, downloads, and personal files
  • /var/cache - Package manager caches and application caches
  • /var/spool - Print queues, mail queues, and cron jobs
  • /opt - Third-party application installations
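
A quick sweep of the locations above often pinpoints the problem area in seconds; a minimal sketch (adjust the list to match your system; errors for missing or unreadable paths are suppressed):

du -sh /var/log /tmp /home /var/cache /var/spool /opt 2>/dev/null | sort -hr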

Using Filesystem Boundaries Effectively

When multiple filesystems are mounted, tools like find and du will traverse into mounted filesystems by default, potentially scanning network shares or external drives unnecessarily.

Restricting searches to single filesystems:

# Using find
find / -xdev -type f -size +100M

# Using du
du -hx --max-depth=2 /

The -xdev flag for find and -x flag for du prevent crossing filesystem boundaries, significantly improving performance on systems with multiple mount points.

Background Execution for Non-Urgent Searches

For comprehensive filesystem scans that aren't time-critical, running searches in the background with adjusted priority prevents interference with normal system operations.

Using nice to reduce process priority:

nice -n 19 find / -type f -size +500M > /tmp/large_files.txt 2>&1 &

This command runs find with the lowest priority (nice -n 19), redirects output to a file, and executes in the background (&), allowing you to continue working while the search progresses.
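
Where util-linux's ionice is available, you can additionally lower the scan's I/O priority so it yields to other disk activity; a sketch combining both (the output path is an arbitrary example):

ionice -c 3 nice -n 19 find / -xdev -type f -size +500M > /tmp/large_files.txt 2>&1 &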

Optimization Technique | Performance Impact | Trade-offs
Limiting search scope | High - dramatically reduces scan time | May miss files outside specified paths
Using -xdev/-x flags | Medium - avoids network and external drives | Won't find files on other filesystems
Adjusting max-depth | Medium - reduces directory traversal | May not reveal deeply nested large files
Background execution | Low - doesn't speed the search, but preserves responsiveness | Results not immediately available
Excluding directories | Variable - depends on excluded content size | Requires knowing which directories to exclude

Caching Results for Repeated Analysis

When performing multiple queries or analyses on the same data, generating a comprehensive file listing once and querying that cached data proves far more efficient than repeated filesystem scans.

Creating a searchable file database:

# Generate comprehensive file listing
find / -type f -printf "%s\t%p\n" 2>/dev/null > /tmp/all_files.txt

# Query the cached data
sort -nr /tmp/all_files.txt | head -50 | awk '{print $1/1024/1024 "MB\t" $2}'

This approach separates the expensive filesystem scanning operation from the analysis phase, enabling rapid experimentation with different queries and size thresholds.
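
Once the listing exists, follow-up questions become cheap one-liners against the cached file instead of fresh scans; for example (the subtree and threshold are arbitrary):

# Largest cached entries under a specific subtree
awk -F'\t' '$2 ~ /^\/var\/log\//' /tmp/all_files.txt | sort -nr | head -10

# Count how many cached entries exceed 1 GiB
awk -F'\t' '$1 > 1073741824' /tmp/all_files.txt | wc -l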

"For systems with millions of files, the difference between repeated filesystem scans and querying cached data can mean the difference between minutes and seconds for each analysis iteration."

Practical Workflows for Different Scenarios

Different situations call for different approaches to file discovery. Understanding these scenario-specific workflows helps you respond effectively to various disk space challenges.

Emergency Response: Rapidly Freeing Critical Space

When a system runs critically low on disk space, immediate action is necessary. This workflow prioritizes speed and impact over comprehensive analysis.

Step 1: Identify the filesystem under pressure

df -h

This command reveals which mounted filesystems are approaching capacity, allowing you to focus subsequent searches on the relevant partition.

Step 2: Quick identification of largest files

find /var -xdev -type f -size +100M -exec ls -lh {} \; | sort -k5 -hr | head -20

This targeted search focuses on the problematic filesystem (in this example, /var), quickly identifying the largest files for immediate review.

Step 3: Examine log rotation and cleanup

du -sh /var/log/*

Log directories frequently contain the most easily reclaimable space through rotation, compression, or deletion of old entries.
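
A few reclaim actions that are usually safe at this step, assuming a systemd-based system for the journal commands (adjust the retention values to your own policy):

# Cap the systemd journal by size or age
sudo journalctl --vacuum-size=200M
sudo journalctl --vacuum-time=2weeks

# Compress already-rotated logs that were left uncompressed
sudo find /var/log -name "*.log.[0-9]" -type f -exec gzip {} \;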

Step 4: Clear known temporary locations

du -sh /tmp /var/tmp

Temporary directories often contain abandoned files safe for deletion, providing quick space recovery.

Routine Maintenance: Proactive Space Management

Regular maintenance prevents emergency situations by identifying and addressing space consumption trends before they become critical.

Monthly comprehensive analysis workflow:

# Generate baseline disk usage report
df -h > /var/log/disk_usage_$(date +%Y%m%d).txt

# Identify top 50 largest files system-wide
find / -xdev -type f -printf "%s\t%p\n" 2>/dev/null | sort -nr | head -50 > /var/log/large_files_$(date +%Y%m%d).txt

# Analyze directory-level consumption
du -hx --max-depth=3 / | sort -hr | head -100 > /var/log/directory_usage_$(date +%Y%m%d).txt

These commands generate dated reports allowing trend analysis over time, revealing gradual space consumption patterns that might otherwise go unnoticed.

Security Investigation: Detecting Anomalous Files

Unusual file growth can indicate security compromises, unauthorized activity, or application malfunctions requiring investigation.

Finding recently created large files:

find / -xdev -type f -size +50M -ctime -7 -exec ls -lh {} \;

This search identifies files larger than 50MB whose status changed (ctime) within the last week. Most Linux filesystems don't expose a true creation time, so a recent change time serves as a practical proxy for recently created files and highlights unusual or unexpected file generation.

Locating files with suspicious ownership:

find /var/www -type f -size +10M ! -user www-data -exec ls -lh {} \;

In web server environments, this command finds large files not owned by the web server user, potentially indicating uploaded malicious content.

Application-Specific Investigation

When specific applications exhibit issues or consume excessive space, targeted investigation focuses on application-specific directories and file types.

Database file analysis:

find /var/lib/mysql -type f \( -name "*.ibd" -o -name "*.MYD" -o -name "*.MYI" \) -exec ls -lh {} + | sort -k5 -hr

This command specifically targets MySQL/MariaDB data files, identifying which tables consume the most space.
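
To map those files back to tables, you can also ask the database server itself; a minimal sketch assuming a local MySQL/MariaDB client with credentials configured (for example via ~/.my.cnf):

mysql -e "SELECT table_schema, table_name,
                 ROUND((data_length + index_length) / 1024 / 1024) AS size_mb
          FROM information_schema.tables
          ORDER BY size_mb DESC LIMIT 10;"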

Docker container and image analysis:

docker system df
du -sh /var/lib/docker/*

These commands reveal Docker-related space consumption, often a significant source of disk usage on container-based systems.

Interpreting Results and Making Informed Decisions

Discovering large files represents only the first step; interpreting results correctly and making informed decisions about which files to keep, compress, archive, or delete requires understanding file purposes and potential consequences.

Categorizing Discovered Files

Large files generally fall into several categories, each requiring different handling approaches:

  • Log files: Candidates for rotation, compression, or deletion based on age and retention policies
  • Database files: Require careful analysis; deletion may cause data loss; consider optimization or archival
  • Media files: Often legitimate but may be duplicates or forgotten downloads suitable for removal
  • Backup files: Verify backup integrity elsewhere before deletion; consider moving to dedicated backup storage
  • Cache files: Generally safe to delete; applications will regenerate as needed

Safe Deletion Practices

Before deleting any file, especially large ones that might be important, follow these safety practices:

Verify file purpose and ownership:

ls -lh /path/to/large/file
file /path/to/large/file
lsof /path/to/large/file

These commands reveal file details, type, and whether any process currently has the file open, preventing deletion of actively used files.

Move before deleting:

mkdir /tmp/files_to_delete
mv /path/to/large/file /tmp/files_to_delete/

Moving files to a temporary location rather than immediate deletion provides a safety window for verification before permanent removal.

Compress rather than delete:

gzip /var/log/old_application.log

For log files and text-based content, compression often reduces size by 90% or more while preserving data for potential future reference.

"The most dangerous command in system administration isn't the one that fails loudly, but the one that succeeds silently in deleting something you'll need tomorrow."

Establishing Retention Policies

Rather than reactive deletion, establishing clear retention policies prevents space issues from recurring:

  • Log retention: Define how long logs should be retained based on compliance, debugging needs, and available space
  • Backup rotation: Implement grandfather-father-son or similar schemes ensuring old backups are automatically removed
  • Temporary file cleanup: Configure automated cleanup of temporary directories through cron jobs or systemd timers
  • User quotas: Implement filesystem quotas preventing individual users from consuming excessive space

Example automated cleanup script for old logs:

#!/bin/bash
# Remove logs older than 30 days
find /var/log -name "*.log" -type f -mtime +30 -delete

# Compress logs older than 7 days
find /var/log -name "*.log" -type f -mtime +7 -exec gzip {} \;

Schedule this script via cron to run daily, maintaining consistent log retention without manual intervention.
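
For instance, if the script is saved as /usr/local/bin/cleanup-logs.sh (the same path referenced by the systemd unit later in this guide), a crontab entry such as the following runs it every night at 02:30:

# Added via crontab -e (root's crontab)
30 2 * * * /usr/local/bin/cleanup-logs.sh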

Advanced Techniques and Specialized Scenarios

Beyond basic file discovery, advanced techniques address specialized scenarios and provide deeper insights into storage consumption patterns.

Identifying Duplicate Files

Duplicate files waste storage space without providing value. Tools like fdupes identify identical files based on content comparison.

Installation and usage:

# Installation
sudo apt install fdupes

# Find duplicates in directory tree
fdupes -r /home

# Find and delete duplicates (interactive)
fdupes -rd /home

The -r flag enables recursive searching, while -d provides interactive deletion prompts for duplicate files.

Analyzing Sparse Files

Sparse files contain large regions of zeros that don't actually consume disk space, but appear large in listings. Understanding sparse files prevents mistaken attempts to "free" space that isn't actually used.

Identifying sparse files:

find / -type f -printf "%S\t%s\t%p\n" 2>/dev/null | awk '$1 < 1.0 {print}'

This command uses find's %S format specifier (sparseness ratio) to identify files where actual disk usage is less than apparent size.
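
To check an individual suspect, compare its apparent size with the space it actually occupies; a minimal sketch (the path is an arbitrary example, sparse VM disk images being a common case):

# Apparent size versus allocated size for a single file
du -h --apparent-size /var/lib/libvirt/images/example.qcow2
du -h /var/lib/libvirt/images/example.qcow2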

Tracking Space Consumption Over Time

Understanding how space consumption changes over time helps identify trends and predict future capacity needs.

Automated daily logging:

#!/bin/bash
# Add to daily cron job
echo "$(date +%Y-%m-%d) $(df -Pk / | tail -1 | awk '{print $3}')" >> /var/log/disk_usage_history.txt

This simple script logs the used space on the root filesystem (in kilobytes) each day, creating a historical record for trend analysis.

Graphing historical data:

# Using gnuplot for visualization
gnuplot -e "set terminal png; set output '/tmp/disk_usage.png'; set xdata time; set timefmt '%Y-%m-%d'; set format x '%Y-%m-%d'; plot '/var/log/disk_usage_history.txt' using 1:2 with lines title 'Used KB on /'"

Network Filesystem Considerations

When working with NFS, SMB, or other network filesystems, additional considerations apply:

  • Network latency significantly impacts scan performance
  • File metadata may not accurately reflect actual storage location
  • Permissions and access controls may differ from local filesystems
  • Deletion operations may require different privileges

Excluding network mounts from scans:

find / -xdev -type f -size +100M

The -xdev flag prevents find from traversing into mounted network filesystems, avoiding performance penalties and irrelevant results.
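
If you want to scan other local mounts but skip only network shares, GNU find's -fstype test can prune specific filesystem types instead; a sketch (add or remove types to match your mounts):

find / \( -fstype nfs -o -fstype nfs4 -o -fstype cifs \) -prune -o -type f -size +100M -print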

Container and Virtual Machine Storage

Modern infrastructure often involves containers and virtual machines, each with unique storage characteristics.

Docker space analysis:

# Overall Docker space usage
docker system df

# Detailed container space usage
docker ps -as

# Remove unused containers, images, and volumes
docker system prune -a --volumes

Virtual machine disk images:

find /var/lib/libvirt/images -name "*.qcow2" -exec qemu-img info {} \;

This command locates QEMU virtual machine disk images and displays their actual vs. virtual size, revealing space-saving opportunities through image compression or unused VM removal.

Automation and Monitoring Best Practices

Transforming reactive file discovery into proactive monitoring prevents space crises and maintains optimal system performance through automation.

Implementing Automated Monitoring

Automated monitoring alerts you to space issues before they become critical, allowing planned intervention rather than emergency response.

Simple shell script for space monitoring:

#!/bin/bash
THRESHOLD=80
CURRENT=$(df / | tail -1 | awk '{print $5}' | sed 's/%//')

if [ $CURRENT -gt $THRESHOLD ]; then
    echo "Disk usage on / is ${CURRENT}% - exceeds threshold of ${THRESHOLD}%" | mail -s "Disk Space Alert" admin@example.com
    
    # Generate detailed report
    find / -xdev -type f -size +100M -exec ls -lh {} \; | sort -k5 -hr | head -20 > /tmp/large_files_alert.txt
fi

Schedule this script via cron to run hourly or daily, adjusting the threshold based on your environment's characteristics.

Integrating with System Monitoring Tools

Enterprise environments benefit from integration with comprehensive monitoring solutions like Nagios, Prometheus, or Zabbix.

Prometheus node exporter example:

The Prometheus node exporter automatically collects and exposes filesystem metrics, enabling sophisticated alerting rules:

# Prometheus alert rule
- alert: HighDiskUsage
  expr: (node_filesystem_avail_bytes / node_filesystem_size_bytes) < 0.1
  for: 10m
  annotations:
    summary: "High disk usage on {{ $labels.instance }}"

Scheduled Cleanup Automation

Beyond monitoring, automated cleanup maintains space availability without manual intervention.

Systemd timer for automated cleanup:

# /etc/systemd/system/cleanup-logs.service
[Unit]
Description=Clean old log files

[Service]
Type=oneshot
ExecStart=/usr/local/bin/cleanup-logs.sh

# /etc/systemd/system/cleanup-logs.timer
[Unit]
Description=Run log cleanup daily

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target

Enable with systemctl enable --now cleanup-logs.timer to establish regular automated maintenance.
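
To confirm the timer is registered and check when it last fired and will next run:

systemctl list-timers cleanup-logs.timer
systemctl status cleanup-logs.service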

Documentation and Runbook Development

Documenting your file discovery and cleanup procedures ensures consistent handling across team members and reduces response time during incidents.

Essential runbook components:

  • Step-by-step procedures for identifying large files on each critical filesystem
  • Decision trees for determining which files are safe to delete
  • Contact information for application owners when application-specific files are involved
  • Escalation procedures when standard cleanup doesn't free sufficient space
  • Post-incident review templates for improving future responses

Common Pitfalls and How to Avoid Them

Understanding common mistakes in file discovery and space management helps you avoid costly errors and inefficient approaches.

Deleting Files Still in Use

One of the most dangerous mistakes involves deleting files that applications currently have open. On Linux, deleted files with open file handles continue consuming space until the application closes them.

Checking for open file handles before deletion:

lsof /var/log/large_application.log

If this command returns results, processes are actively using the file. Deleting it won't free space immediately and may cause application errors.

Proper procedure for rotating active log files:

# Copy file, truncate original (preserves file handle)
cp /var/log/app.log /var/log/app.log.old
> /var/log/app.log

# Or use logrotate for automated handling
sudo logrotate -f /etc/logrotate.d/application

Ignoring Filesystem Boundaries

Searching across filesystem boundaries wastes time and may produce misleading results when mounted filesystems have different capacity constraints.

Always use filesystem-limiting flags:

find / -xdev -type f -size +100M
du -hx --max-depth=2 /

Overlooking Hidden Files and Directories

By default, some commands and tools skip hidden files (those beginning with a dot). These files can accumulate significant space, particularly in user home directories.

Ensuring hidden files are included:

du -sh /home/*/.cache
du -sh /home/*/.[!.]*

The glob .[!.]* matches hidden entries while skipping the . and .. directory references (a plain .* would also match .., causing du to count the entire parent directory). Common space-consuming hidden directories include .cache, .local/share, and application-specific directories like .mozilla or .config.

Focusing Only on File Size

While individual large files attract attention, numerous smaller files can collectively consume significant space and create performance issues through inode exhaustion.

Checking inode usage:

df -i

High inode usage percentage with available space indicates numerous small files. Address this by identifying and removing unnecessary files:

find / -xdev -type f | cut -d "/" -f 2 | sort | uniq -c | sort -nr

This command counts files by top-level directory, revealing where small file accumulation occurs.
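
Once a suspect top-level directory emerges, a small loop like the following (the path is an arbitrary example) counts files per immediate subdirectory to narrow things down further:

for d in /var/spool/*/; do
    printf '%8d %s\n' "$(find "$d" -xdev -type f | wc -l)" "$d"
done | sort -nr | head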

Neglecting Compression Opportunities

Immediately deleting large files eliminates recovery options. Compression often reduces size dramatically while preserving data.

Compression comparison for log files:

ls -lh /var/log/application.log
gzip /var/log/application.log
ls -lh /var/log/application.log.gz

Text-based logs commonly compress to 10% or less of original size, providing substantial space savings without data loss.

Frequently Asked Questions

What is the fastest way to find the largest files on my Linux system?

The fastest approach depends on your specific needs, but for quick results, use: find / -xdev -type f -size +500M -exec ls -lh {} \; 2>/dev/null | sort -k5 -hr | head -20. This command searches from root while staying on the current filesystem (-xdev), finds files larger than 500MB, sorts by size, and shows the top 20 results. For interactive exploration, install and use ncdu /, which provides a navigable interface after initial scanning.

How can I find large files without having root access?

Without root privileges, you can only search directories where you have read permissions, typically your home directory and world-readable locations. Use: find ~ -type f -size +100M -exec ls -lh {} \; to search your home directory, or du -h ~ | sort -hr | head -20 for directory-level analysis. You won't be able to scan system directories like /var or other users' home directories without appropriate permissions.

Why does deleting a large file not free up disk space immediately?

This occurs when a process still has the file open. On Linux, deleting a file with open file handles removes the directory entry but doesn't actually free the space until all processes close their handles to the file. Use lsof | grep deleted to identify processes holding deleted files open. Restart the application or use lsof +L1 to find processes with handles to deleted files, then restart those services to reclaim the space.

What's the difference between du and df, and which should I use?

The df command reports filesystem-level space usage by querying filesystem metadata, showing total, used, and available space for mounted filesystems. The du command calculates space by traversing directories and summing file sizes. Use df -h to quickly identify which filesystems are full, then use du to drill down into specific directories to find what's consuming space. Discrepancies between the two often indicate deleted files still held open by processes.

How do I exclude certain directories from my search for large files?

Use the -prune option with find to exclude directories: find / -path /mnt -prune -o -path /media -prune -o -type f -size +100M -print. This example excludes /mnt and /media directories from the search. For du, use --exclude: du -h --exclude=/proc --exclude=/sys / | sort -hr. Excluding directories like /proc, /sys, and mounted network shares significantly improves performance.

Can I safely delete files in the /tmp directory to free space?

Generally, yes, but with caution. The /tmp directory is designed for temporary files, and most systems automatically clean it on reboot. However, running applications may have active temporary files there. Check for recent modification times with find /tmp -type f -mtime +7 to find files older than 7 days, which are usually safe to delete. Avoid deleting files modified recently or those owned by running processes. Better yet, let the system's automatic cleanup handle it, or configure tmpwatch or tmpreaper for automated maintenance.

How can I find files that have grown recently rather than just large static files?

Combine size criteria with modification time: find /var/log -type f -size +50M -mtime -7 -exec ls -lh {} \; finds files larger than 50MB modified in the last 7 days. For more sophisticated tracking, create baseline reports and compare: find / -type f -size +100M > /tmp/baseline.txt, then later run the same command and use diff to identify newly large files. This approach helps identify actively growing files versus static large files that have existed for extended periods.

What tools can I use if I need a graphical interface instead of command-line?

For desktop Linux environments, install Baobab (Disk Usage Analyzer) with sudo apt install baobab on Debian/Ubuntu or sudo dnf install baobab on Fedora. This GNOME application provides visual treemaps and ring charts showing space consumption. Another option is Filelight (sudo apt install filelight), which offers similar visualization with a different interface. For a middle ground between command-line and GUI, use ncdu, which provides an interactive terminal interface combining the accessibility of command-line tools with navigable exploration.