How to Measure Script Execution Time in Python

In the world of software development, understanding how long your code takes to execute isn't just about satisfying curiosity—it's about building applications that respect your users' time and system resources. Whether you're optimizing a data processing pipeline, debugging performance bottlenecks, or simply ensuring your application meets response time requirements, measuring script execution time provides the empirical evidence needed to make informed decisions about your code's efficiency.

Script execution time measurement refers to the process of quantifying how long a particular piece of code takes to run from start to finish. This encompasses everything from simple function calls to complex algorithmic operations. Python offers multiple approaches to accomplish this task, each with distinct advantages depending on your specific use case, precision requirements, and the scope of what you're measuring. Understanding these different methods allows you to select the most appropriate tool for your particular situation.

This guide covers practical techniques for measuring execution time in Python, from basic timing methods suited to quick checks to profiling tools designed for in-depth performance analysis. It explains when to use each approach, how to interpret the results, and which common pitfalls to avoid.

Understanding the Importance of Execution Time Measurement

Performance optimization begins with measurement. Without concrete data about how long different parts of your application take to execute, you're essentially navigating in the dark. Execution time measurement provides the foundation for identifying bottlenecks, comparing alternative implementations, and validating that optimizations actually improve performance rather than merely changing how the code looks.

Modern applications face increasing demands for responsiveness and efficiency. Users expect near-instantaneous feedback, while backend systems must handle growing data volumes without proportional increases in processing time. In this environment, understanding execution time becomes critical for maintaining competitive advantage and user satisfaction.

"The single biggest problem in performance optimization is not knowing where the actual bottlenecks are. Measurement transforms guesswork into science."

Beyond immediate performance concerns, execution time metrics serve as valuable indicators of code health over time. Tracking how execution times change across versions helps identify performance regressions before they reach production, while establishing baseline measurements enables meaningful comparison when refactoring or introducing new features.

Basic Time Measurement Using the time Module

The simplest approach to measuring execution time in Python uses the built-in time module, which provides functions for capturing timestamps before and after code execution. This method works well for quick measurements and requires nothing beyond Python's standard library.

Using time.time() for Wall-Clock Measurement

The time.time() function returns the current time in seconds since the Unix epoch (January 1, 1970). By capturing this value before and after a code block, you can calculate the elapsed wall-clock time—the actual time that passed from a human perspective, including any time the system spent on other processes.

import time

start_time = time.time()

# Your code to measure
result = sum(range(1000000))

end_time = time.time()
execution_time = end_time - start_time

print(f"Execution time: {execution_time:.6f} seconds")

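This approach is adequate for coarse, general-purpose timing. Its resolution depends on your operating system and hardware, and because time.time() follows the system clock, a clock adjustment during the measurement (for example an NTP correction) can distort the result or even make it negative. For measuring intervals, a monotonic clock such as time.perf_counter() or time.monotonic() is the safer choice.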
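To see how system-dependent these clocks are, the standard library can report the properties of each one. The sketch below uses time.get_clock_info(); the helper name clock_summary is only illustrative, and the values printed will differ between machines.

```python
import time

def clock_summary(names=("time", "perf_counter", "process_time", "monotonic")):
    """Return resolution and behaviour flags for each named clock."""
    summary = {}
    for name in names:
        info = time.get_clock_info(name)
        summary[name] = {
            "resolution": info.resolution,   # seconds, as reported by the OS
            "monotonic": info.monotonic,     # True if the clock cannot go backward
            "adjustable": info.adjustable,   # True if the clock can be changed externally
        }
    return summary

for name, details in clock_summary().items():
    print(f"{name:<13} {details}")
```

A clock reported as adjustable can jump when the system time changes, which is why monotonic clocks are usually preferred for measuring intervals.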

Leveraging time.perf_counter() for Higher Precision

For more precise measurements, Python provides time.perf_counter(), which uses the highest-resolution clock available on your system. It is monotonic and designed for measuring short durations, making it well suited to benchmarking. Note that it measures elapsed real time, so it includes any time the process spends sleeping or waiting. Its reference point is undefined, so only the difference between two readings is meaningful.

import time

start = time.perf_counter()

# Code block to measure
data = [x**2 for x in range(100000)]

end = time.perf_counter()

print(f"Execution time: {(end - start):.9f} seconds")
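When very short durations matter, the integer variant time.perf_counter_ns() avoids the rounding that comes with floating-point seconds. The following is a minimal sketch; the helper name time_block_ns is an assumption made for illustration.

```python
import time

def time_block_ns(func):
    """Run func once and return (result, elapsed nanoseconds as an int)."""
    start_ns = time.perf_counter_ns()
    result = func()
    elapsed_ns = time.perf_counter_ns() - start_ns
    return result, elapsed_ns

result, elapsed_ns = time_block_ns(lambda: [x**2 for x in range(100000)])
print(f"Execution time: {elapsed_ns} ns ({elapsed_ns / 1_000_000:.3f} ms)")
```

The value is still limited by the resolution of the underlying clock, so the trailing digits should not be read as exact.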
"Precision in measurement directly correlates with confidence in optimization decisions. Choose your timing method based on the level of precision your analysis requires."

Understanding time.process_time() for CPU Time

Unlike wall-clock measurements, time.process_time() measures only the CPU time consumed by the current process, excluding time spent sleeping or waiting for I/O operations. This distinction becomes crucial when measuring CPU-intensive operations where you want to isolate actual computation time from system-level delays.

import time

start_cpu = time.process_time()

# CPU-intensive operation
result = [i * 2 for i in range(1000000)]

end_cpu = time.process_time()
cpu_time = end_cpu - start_cpu

print(f"CPU time: {cpu_time:.6f} seconds")

| Function | What It Measures | Best Use Case | Resolution |
| --- | --- | --- | --- |
| time.time() | Wall-clock time (real-world elapsed time, follows the system clock) | General timing, user-facing operations | System-dependent |
| time.perf_counter() | High-resolution elapsed time, including sleep | Precise benchmarking, performance testing | Highest available on the system (often sub-microsecond) |
| time.process_time() | CPU time (user + system) for the current process, excluding sleep | CPU-bound operations, algorithm comparison | System-dependent |
| time.monotonic() | Monotonic clock (never goes backward) | Measuring intervals, avoiding clock adjustments | System-dependent |
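The difference between wall-clock time and CPU time is easiest to see side by side. The sketch below times a waiting task and a computing task with both clocks; wall_and_cpu is a hypothetical helper, and the exact numbers depend on the machine and its load.

```python
import time

def wall_and_cpu(func):
    """Return (wall_seconds, cpu_seconds) for a single call to func."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    func()
    cpu_elapsed = time.process_time() - cpu_start
    wall_elapsed = time.perf_counter() - wall_start
    return wall_elapsed, cpu_elapsed

# Waiting: wall-clock time passes, but almost no CPU time is consumed
wait_wall, wait_cpu = wall_and_cpu(lambda: time.sleep(0.2))
print(f"sleep:   wall={wait_wall:.3f}s  cpu={wait_cpu:.3f}s")

# Computing: CPU time is usually close to wall-clock time on an idle machine
work_wall, work_cpu = wall_and_cpu(lambda: sum(i * i for i in range(500_000)))
print(f"compute: wall={work_wall:.3f}s  cpu={work_cpu:.3f}s")
```

A large gap between the two readings suggests the code is waiting (on I/O, sleep, or other processes) rather than computing.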

Advanced Timing with the timeit Module

While basic time measurements work for simple cases, the timeit module provides a more robust framework specifically designed for benchmarking small code snippets. This module addresses several challenges inherent in performance measurement, including garbage collection interference and the need for multiple executions to obtain statistically meaningful results.

Basic timeit Usage from the Command Line

The timeit module can be invoked directly from the command line, making it convenient for quick measurements without writing a separate script. This approach is particularly useful for testing simple expressions or comparing different approaches to the same problem.

python -m timeit "'-'.join(str(n) for n in range(100))"
python -m timeit "'-'.join([str(n) for n in range(100)])"
python -m timeit "'-'.join(map(str, range(100)))"

The command-line interface automatically runs the code multiple times and reports the best result, eliminating the need to manually implement repetition logic. This automation ensures consistent methodology across different measurements.

Using timeit in Python Code

For more complex scenarios or when you need to integrate timing into your application, the timeit module offers programmatic access through its timeit() and repeat() functions. These allow fine-grained control over the number of executions and repetitions.

import timeit

# Setup code runs once and is excluded from the timing
setup_code = """
def calculate_sum():
    return sum(range(1000))
"""

# Only this statement is timed
code_to_test = "calculate_sum()"

# Measure the total time for 10,000 executions
execution_time = timeit.timeit(code_to_test, setup=setup_code, number=10000)
print(f"Average time per execution: {execution_time/10000:.9f} seconds")
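The repeat() function mentioned above runs the whole measurement several times and returns one total per trial. A minimal sketch, with the statement and counts chosen only for illustration:

```python
import timeit

# Five independent trials of 10,000 executions each
trial_totals = timeit.repeat(
    stmt="sum(range(1000))",
    repeat=5,
    number=10000,
)

# Each entry is the total for one trial, so divide by `number` for a per-call figure
best_per_call = min(trial_totals) / 10000
print(f"Trial totals: {[f'{t:.4f}' for t in trial_totals]}")
print(f"Best time per execution: {best_per_call:.9f} seconds")
```

Taking the minimum of the trials is a common convention, since slower trials are usually caused by other activity on the machine rather than by the code itself.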
"Running code multiple times and taking the minimum result, rather than the average, often provides the most accurate representation of the code's true performance potential by minimizing the impact of system-level interference."

Comparing Multiple Implementations

One of the most valuable applications of timeit is comparing different approaches to solving the same problem. By measuring multiple implementations under identical conditions, you can make data-driven decisions about which approach offers the best performance.

import timeit

# Setup code that runs once
setup = "data = list(range(1000))"

# Different implementations to compare
method1 = "result = [x * 2 for x in data]"
method2 = "result = list(map(lambda x: x * 2, data))"
method3 = "result = []\nfor x in data:\n    result.append(x * 2)"

# Compare execution times
time1 = timeit.timeit(method1, setup=setup, number=100000)
time2 = timeit.timeit(method2, setup=setup, number=100000)
time3 = timeit.timeit(method3, setup=setup, number=100000)

print(f"List comprehension: {time1:.6f} seconds")
print(f"Map with lambda: {time2:.6f} seconds")
print(f"Append in loop: {time3:.6f} seconds")
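Choosing the number of executions by hand can be awkward when the candidates differ a lot in speed. As one possible alternative, timeit.Timer accepts a zero-argument callable and its autorange() method picks a loop count automatically. The function double_all below is a hypothetical stand-in for the code under test.

```python
import timeit

def double_all(values):
    return [x * 2 for x in values]

data = list(range(1000))

# A zero-argument callable can be timed directly, without source strings
timer = timeit.Timer(lambda: double_all(data))

# autorange() raises the loop count until one run takes at least 0.2 seconds
loops, total = timer.autorange()
print(f"{loops} loops, {total / loops * 1e6:.2f} microseconds per loop")
```

Timing through a lambda adds the cost of one extra function call per loop, which matters only for very fast statements.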

Context Managers for Clean Time Measurement

Creating reusable timing utilities through Python's context manager protocol provides an elegant solution for measuring execution time without cluttering your code with repetitive timing logic. This approach leverages the with statement to automatically handle start and end timing, ensuring clean and readable code.

Building a Custom Timer Context Manager

A custom timer context manager encapsulates the timing logic, making it easy to measure any code block consistently throughout your application. This pattern promotes code reusability and maintains a clear separation between your business logic and performance measurement concerns.

import time
from contextlib import contextmanager

@contextmanager
def timer(label="Code block"):
    start = time.perf_counter()
    try:
        yield
    finally:
        end = time.perf_counter()
        print(f"{label}: {(end - start):.6f} seconds")

# Usage example
with timer("Data processing"):
    data = [x**2 for x in range(1000000)]
    result = sum(data)

The context manager pattern ensures that timing measurements are always completed correctly, even if the measured code raises an exception. The finally block guarantees that the end time is captured and the result is reported regardless of execution flow.
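Printing is convenient, but sometimes the calling code needs the measured value itself. One way to sketch this is a class-based context manager that stores the duration on the object; the class name ElapsedTimer and its elapsed attribute are assumptions for illustration.

```python
import time

class ElapsedTimer:
    """Context manager that stores the elapsed time instead of printing it."""

    def __enter__(self):
        self.elapsed = None
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.elapsed = time.perf_counter() - self._start
        return False  # do not suppress exceptions from the timed block

with ElapsedTimer() as t:
    total = sum(x**2 for x in range(100000))

print(f"Elapsed: {t.elapsed:.6f} seconds")
```

Because __exit__ runs even when the block raises, the elapsed attribute is set in the failure case as well.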

Enhanced Timer with Statistics Collection

Extending the basic timer to collect statistics across multiple measurements provides deeper insights into performance characteristics, including variability and consistency. This enhanced version stores timing data for later analysis rather than just printing immediate results.

import time
from contextlib import contextmanager
from typing import List

class TimerStats:
    def __init__(self):
        self.measurements: List[float] = []
    
    @contextmanager
    def measure(self, label: str = ""):
        start = time.perf_counter()
        try:
            yield
        finally:
            end = time.perf_counter()
            duration = end - start
            self.measurements.append(duration)
            if label:
                print(f"{label}: {duration:.6f} seconds")
    
    def summary(self):
        if not self.measurements:
            return "No measurements recorded"
        
        avg = sum(self.measurements) / len(self.measurements)
        minimum = min(self.measurements)
        maximum = max(self.measurements)
        
        return f"""
        Measurements: {len(self.measurements)}
        Average: {avg:.6f} seconds
        Minimum: {minimum:.6f} seconds
        Maximum: {maximum:.6f} seconds
        """

# Usage
stats = TimerStats()

for i in range(5):
    with stats.measure(f"Iteration {i+1}"):
        result = sum(range(100000))

print(stats.summary())

Decorators for Function-Level Timing

Decorators provide another elegant approach to measuring execution time, particularly when you want to monitor specific functions throughout your application. This technique allows you to add timing functionality without modifying the function's internal code, adhering to the principle of separation of concerns.

Creating a Basic Timing Decorator

A timing decorator wraps a function and measures its execution time each time it's called. This approach is particularly useful for monitoring frequently-called functions or identifying which functions consume the most time in your application.

import time
from functools import wraps

def timing_decorator(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        end = time.perf_counter()
        print(f"{func.__name__} executed in {(end - start):.6f} seconds")
        return result
    return wrapper

@timing_decorator
def process_data(n):
    return sum(range(n))

@timing_decorator
def calculate_squares(n):
    return [x**2 for x in range(n)]

# Functions are automatically timed when called
process_data(1000000)
calculate_squares(10000)
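The basic decorator reports nothing if the wrapped function raises, because the print statement is skipped. A possible variant, sketched below under the hypothetical name timed, uses try/finally and also keeps the last duration on the wrapper.

```python
import time
from functools import wraps

def timed(func):
    """Hypothetical variant: record elapsed time even if func raises."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs on both normal return and exception
            wrapper.last_elapsed = time.perf_counter() - start
            print(f"{func.__name__} ran for {wrapper.last_elapsed:.6f} seconds")
    wrapper.last_elapsed = None
    return wrapper

@timed
def parse_number(text):
    return int(text)

parse_number("42")
try:
    parse_number("not a number")
except ValueError:
    print(f"Failed call still timed: {parse_number.last_elapsed:.6f} seconds")
```

The exception still propagates to the caller; the decorator only observes how long the call took.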
"Decorators transform timing from an intrusive modification into a declarative enhancement, allowing you to add performance monitoring with a single line of code."

Advanced Decorator with Conditional Logging

Enhancing the basic decorator to support conditional logging and threshold-based reporting makes it more suitable for production environments where you might only want to log functions that exceed certain execution time thresholds.

import time
import logging
from functools import wraps

logging.basicConfig(level=logging.INFO)

def timing_decorator(threshold=None, log_level=logging.INFO):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = func(*args, **kwargs)
            end = time.perf_counter()
            duration = end - start
            
            if threshold is None or duration > threshold:
                logging.log(
                    log_level,
                    f"{func.__name__} executed in {duration:.6f} seconds"
                )
            
            return result
        return wrapper
    return decorator

@timing_decorator(threshold=0.1, log_level=logging.WARNING)
def slow_function():
    time.sleep(0.15)
    return "Done"

@timing_decorator(threshold=0.1)
def fast_function():
    return sum(range(1000))

slow_function()  # Will log a warning
fast_function()  # Won't log anything

Profiling for Comprehensive Performance Analysis

While measuring individual functions or code blocks provides valuable insights, comprehensive performance analysis requires understanding how different parts of your application interact and where time is actually spent across the entire execution. Python's profiling tools offer this holistic view, revealing the complete performance landscape of your application.

Using cProfile for Deterministic Profiling

The cProfile module provides deterministic profiling, meaning it tracks every function call and records detailed statistics about execution time, call counts, and cumulative time spent in each function. This comprehensive data helps identify not just slow functions, but also functions that are called excessively.

import cProfile
import pstats
from pstats import SortKey

def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-2)

def calculate_multiple():
    results = []
    for i in range(20):
        results.append(fibonacci(i))
    return results

# Profile the function
profiler = cProfile.Profile()
profiler.enable()

calculate_multiple()

profiler.disable()

# Print statistics
stats = pstats.Stats(profiler)
stats.sort_stats(SortKey.CUMULATIVE)
stats.print_stats(10)  # Show top 10 functions

The output from cProfile includes several important metrics: ncalls (number of calls), tottime (total time spent in the function itself, excluding calls to subfunctions), cumtime (cumulative time including subfunctions), and percall, which appears twice: once as tottime divided by ncalls and once as cumtime divided by the number of primitive (non-recursive) calls. Understanding these metrics is crucial for distinguishing genuine bottlenecks from functions that simply get called frequently.

Command-Line Profiling

For quick profiling of entire scripts, cProfile can be invoked directly from the command line without modifying your code. This approach is particularly useful for profiling existing applications or when you want to profile a complete script execution.

python -m cProfile -o profile_output.prof your_script.py
python -m pstats profile_output.prof

The profiling data can be saved to a file and analyzed later using the pstats module, enabling you to generate different reports and sort results by various criteria without re-running the expensive profiling process.
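The same save-then-analyze workflow can be driven from Python. The sketch below profiles a small hypothetical workload, writes the data to a temporary file, and then loads it with pstats as a separate step; the function names and the file name are illustrative only.

```python
import cProfile
import io
import os
import pstats
import tempfile
from pstats import SortKey

def build_squares(n):
    return [i * i for i in range(n)]

def workload():
    return sum(build_squares(50_000))

# Profile once and save the raw data
profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

profile_path = os.path.join(tempfile.mkdtemp(), "workload.prof")
profiler.dump_stats(profile_path)

# Later: load the saved data and build a report without re-profiling
buffer = io.StringIO()
stats = pstats.Stats(profile_path, stream=buffer)
stats.strip_dirs().sort_stats(SortKey.TIME).print_stats(10)
report = buffer.getvalue()
print(report)
```

The saved file can be reloaded and sorted with a different key, such as SortKey.CUMULATIVE, as many times as needed.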

Line-by-Line Profiling with line_profiler

While cProfile provides function-level insights, sometimes you need to understand which specific lines within a function consume the most time. The line_profiler package (available via pip) offers this granular view, showing execution time for each line of code.

# Install: pip install line_profiler

from line_profiler import LineProfiler

def process_data(n):
    # Initialize data structures
    squares = []
    cubes = []
    
    # Process numbers
    for i in range(n):
        squares.append(i ** 2)
        cubes.append(i ** 3)
    
    # Calculate sums
    square_sum = sum(squares)
    cube_sum = sum(cubes)
    
    return square_sum, cube_sum

# Profile the function
profiler = LineProfiler()
profiler.add_function(process_data)
profiler.enable()

process_data(100000)

profiler.disable()
profiler.print_stats()

"Line-level profiling reveals the performance story that function-level metrics can't tell: where within your logic the actual computation time is being spent."

| Profiling Tool | Granularity | Overhead | Best For |
| --- | --- | --- | --- |
| cProfile | Function-level | Low to moderate | Overall application profiling, identifying hot functions |
| line_profiler | Line-level | High | Detailed analysis of specific functions, pinpointing exact bottlenecks |
| memory_profiler | Line-level (memory) | Very high | Memory usage analysis, detecting memory leaks |
| py-spy | Function-level (sampling) | Very low | Production profiling, running applications |

Measuring Asynchronous Code Execution Time

Asynchronous programming introduces unique challenges for execution time measurement. Traditional timing approaches may not accurately capture the actual work being done when your code involves coroutines, async functions, and event loops. Understanding how to properly measure async code execution is essential for optimizing modern Python applications.

Timing Individual Async Functions

Measuring async functions requires using the same timing mechanisms but within an async context. The key difference is ensuring that you await the async function properly while capturing the timing information around the actual execution.

import asyncio
import time

async def async_operation(duration):
    await asyncio.sleep(duration)
    return f"Completed after {duration} seconds"

async def measure_async_function():
    start = time.perf_counter()
    result = await async_operation(0.5)
    end = time.perf_counter()
    
    print(f"Async operation took {(end - start):.6f} seconds")
    return result

# Run the async function
asyncio.run(measure_async_function())

Creating an Async Timer Context Manager

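Just as with synchronous code, context managers provide a clean way to measure async code blocks. An async context manager implements the __aenter__ and __aexit__ methods; the asynccontextmanager decorator from contextlib generates both from a single async generator function, which keeps the timer short.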

import asyncio
import time
from contextlib import asynccontextmanager

@asynccontextmanager
async def async_timer(label="Async operation"):
    start = time.perf_counter()
    try:
        yield
    finally:
        end = time.perf_counter()
        print(f"{label}: {(end - start):.6f} seconds")

async def fetch_data():
    await asyncio.sleep(0.3)
    return "Data fetched"

async def process_data():
    await asyncio.sleep(0.2)
    return "Data processed"

async def main():
    async with async_timer("Fetch operation"):
        data = await fetch_data()
    
    async with async_timer("Process operation"):
        result = await process_data()
    
    async with async_timer("Total concurrent operations"):
        results = await asyncio.gather(
            fetch_data(),
            process_data()
        )

asyncio.run(main())
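The decorator approach carries over to coroutine functions as well, as long as the wrapper is itself async and awaits the call. A minimal sketch, where async_timed and load_report are hypothetical names:

```python
import asyncio
import time
from functools import wraps

def async_timed(func):
    """Hypothetical decorator: time one awaited call of a coroutine function."""
    @wraps(func)
    async def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return await func(*args, **kwargs)
        finally:
            wrapper.last_elapsed = time.perf_counter() - start
            print(f"{func.__name__}: {wrapper.last_elapsed:.6f} seconds")
    wrapper.last_elapsed = None
    return wrapper

@async_timed
async def load_report(delay):
    await asyncio.sleep(delay)  # stands in for real I/O
    return "report ready"

outcome = asyncio.run(load_report(0.1))
```

The measured time covers everything between the first await and completion, including time during which the event loop was running other tasks.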

Measuring Concurrent Async Operations

One of the main benefits of async programming is the ability to run multiple operations concurrently. Measuring the execution time of concurrent operations requires understanding the difference between individual operation time and total elapsed time for the group.

import asyncio
import time

async def async_task(task_id, duration):
    start = time.perf_counter()
    await asyncio.sleep(duration)
    end = time.perf_counter()
    return {
        'task_id': task_id,
        'duration': duration,
        'actual_time': end - start
    }

async def measure_concurrent_tasks():
    tasks = [
        async_task(1, 0.3),
        async_task(2, 0.5),
        async_task(3, 0.2)
    ]
    
    start = time.perf_counter()
    results = await asyncio.gather(*tasks)
    end = time.perf_counter()
    
    total_time = end - start
    individual_times = sum(r['actual_time'] for r in results)
    
    print(f"Total elapsed time: {total_time:.6f} seconds")
    print(f"Sum of individual times: {individual_times:.6f} seconds")
    print(f"Time saved by concurrency: {(individual_times - total_time):.6f} seconds")
    
    return results

asyncio.run(measure_concurrent_tasks())
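Summing the individual task times estimates what a sequential run would cost. To measure it directly instead, the same workload can be run both ways. The sketch below assumes a placeholder coroutine simulated_io and arbitrary delays.

```python
import asyncio
import time

async def simulated_io(delay):
    await asyncio.sleep(delay)  # placeholder for a network or disk wait
    return delay

async def compare(delays):
    # Sequential: each wait starts only after the previous one finishes
    start = time.perf_counter()
    for delay in delays:
        await simulated_io(delay)
    sequential = time.perf_counter() - start

    # Concurrent: all waits overlap, so the longest one dominates
    start = time.perf_counter()
    await asyncio.gather(*(simulated_io(d) for d in delays))
    concurrent = time.perf_counter() - start

    return sequential, concurrent

sequential, concurrent = asyncio.run(compare([0.1, 0.2, 0.15]))
print(f"Sequential: {sequential:.3f}s")
print(f"Concurrent: {concurrent:.3f}s")
print(f"Speedup: {sequential / concurrent:.2f}x")
```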
"In asynchronous programming, the difference between sequential execution time and concurrent execution time represents the actual benefit of your async implementation."

Best Practices and Common Pitfalls

Accurate execution time measurement requires more than just knowing which tools to use. Understanding common pitfalls and following best practices ensures that your measurements reflect genuine performance characteristics rather than measurement artifacts or environmental factors.

Warming Up Before Measurement

The first execution of code often takes longer than subsequent executions due to various initialization overhead, including Python's bytecode compilation, module imports, and system-level caching. Running your code several times before taking measurements helps ensure you're measuring steady-state performance rather than initialization costs.

import time

def benchmark_with_warmup(func, warmup_runs=3, measurement_runs=10):
    # Warm-up phase
    for _ in range(warmup_runs):
        func()
    
    # Measurement phase
    times = []
    for _ in range(measurement_runs):
        start = time.perf_counter()
        func()
        end = time.perf_counter()
        times.append(end - start)
    
    avg_time = sum(times) / len(times)
    min_time = min(times)
    max_time = max(times)
    
    return {
        'average': avg_time,
        'minimum': min_time,
        'maximum': max_time,
        'range': max_time - min_time  # spread between slowest and fastest run
    }

def sample_function():
    return sum(range(100000))

results = benchmark_with_warmup(sample_function)
print(f"Average: {results['average']:.6f}s")
print(f"Minimum: {results['minimum']:.6f}s")
print(f"Maximum: {results['maximum']:.6f}s")

Accounting for Garbage Collection

Python's garbage collector can introduce unpredictable pauses during execution, potentially skewing timing measurements. For consistent benchmarking, consider disabling garbage collection during measurements or explicitly triggering it before starting your timing.

import time
import gc

def benchmark_with_gc_control(func, runs=10):
    # Disable garbage collection
    gc.disable()
    
    times = []
    try:
        for _ in range(runs):
            # Explicitly collect before each run
            gc.collect()
            
            start = time.perf_counter()
            func()
            end = time.perf_counter()
            
            times.append(end - start)
    finally:
        # Re-enable garbage collection
        gc.enable()
    
    return min(times)  # Return minimum as most representative

def memory_intensive_operation():
    data = [i * 2 for i in range(1000000)]
    return sum(data)

execution_time = benchmark_with_gc_control(memory_intensive_operation)
print(f"Execution time: {execution_time:.6f} seconds")
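Calling gc.enable() unconditionally can change behavior for a program that had garbage collection turned off on purpose. One way to avoid that, sketched here with a hypothetical gc_paused context manager, is to remember the previous state and restore it afterwards.

```python
import gc
import time
from contextlib import contextmanager

@contextmanager
def gc_paused():
    """Pause automatic garbage collection, then restore the previous state."""
    was_enabled = gc.isenabled()
    gc.collect()   # start from a clean heap
    gc.disable()
    try:
        yield
    finally:
        if was_enabled:
            gc.enable()

with gc_paused():
    enabled_inside = gc.isenabled()
    start = time.perf_counter()
    data = [[i] for i in range(200_000)]  # allocates many container objects
    elapsed = time.perf_counter() - start

print(f"GC enabled inside the block: {enabled_inside}")
print(f"Elapsed with GC paused: {elapsed:.6f} seconds")
```

Note that the timeit module already disables garbage collection during its own measurements by default.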

Understanding Measurement Overhead

Every timing mechanism introduces some overhead—the act of measuring affects what's being measured. For very fast operations (microseconds or less), this overhead can become significant relative to the actual execution time. When measuring extremely fast code, subtract the timing overhead or use specialized tools designed for minimal intrusion.

import time

def measure_timing_overhead(iterations=100000):
    # Measure the overhead of timing itself
    overhead_times = []
    
    for _ in range(iterations):
        start = time.perf_counter()
        end = time.perf_counter()
        overhead_times.append(end - start)
    
    avg_overhead = sum(overhead_times) / len(overhead_times)
    return avg_overhead

def fast_operation():
    x = 1 + 1
    return x

# Calculate timing overhead
overhead = measure_timing_overhead()
print(f"Average timing overhead: {overhead:.9f} seconds")

# Measure actual operation
start = time.perf_counter()
result = fast_operation()
end = time.perf_counter()
raw_time = end - start

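# Adjust for overhead (clamped at zero: a single sample of such a fast
# operation can come out smaller than the average overhead)
adjusted_time = max(0.0, raw_time - overhead)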
print(f"Raw measurement: {raw_time:.9f} seconds")
print(f"Adjusted time: {adjusted_time:.9f} seconds")
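Subtracting an average overhead from a single sample is noisy. A common alternative is to run the operation many times inside one timing window, so the two clock reads are spread over all iterations. This is a rough sketch; time_per_call is an illustrative helper, and its result still includes the cost of the loop and the function call.

```python
import time

def fast_operation():
    x = 1 + 1
    return x

def time_per_call(func, iterations=1_000_000):
    """Time many calls in one window and return the mean seconds per call."""
    start = time.perf_counter()
    for _ in range(iterations):
        func()
    elapsed = time.perf_counter() - start
    return elapsed / iterations

per_call = time_per_call(fast_operation)
print(f"Approximate time per call: {per_call * 1e9:.1f} ns")
```

This is essentially what the timeit module does internally, which is why it is usually the better tool for operations this small.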

Considering System Load and Background Processes

External factors like system load, background processes, and resource contention can significantly impact timing measurements. For reliable benchmarking, run measurements multiple times, consider the system state, and use statistical measures like median or minimum time rather than relying solely on averages.
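To illustrate why the choice of statistic matters, the sketch below uses a small set of made-up timings in which one run was disturbed. The numbers are hypothetical and chosen only to show the effect.

```python
import statistics

# Hypothetical timings in seconds; the 0.2500 entry stands for a run that
# was interrupted by unrelated system activity.
timings = [0.0101, 0.0099, 0.0102, 0.0100, 0.2500, 0.0098, 0.0103]

mean_time = statistics.mean(timings)
median_time = statistics.median(timings)
min_time = min(timings)

print(f"Mean:    {mean_time:.4f}s")
print(f"Median:  {median_time:.4f}s")
print(f"Minimum: {min_time:.4f}s")
```

With this data the single outlier pulls the mean to more than four times the median, while the median and minimum stay close to the typical run.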

"The minimum execution time across multiple runs often provides the most accurate representation of your code's true performance potential, as it reflects execution with minimal interference from external factors."

Practical Applications and Real-World Scenarios

Understanding how to apply execution time measurement techniques in real-world scenarios transforms theoretical knowledge into practical skill. Different situations call for different measurement approaches, and recognizing which technique suits your specific needs is crucial for effective performance optimization.

API Response Time Monitoring

When building web services or APIs, monitoring response times helps ensure you meet service level agreements and maintain good user experience. Combining timing measurements with logging creates a performance monitoring system that tracks how your API performs under real-world conditions.

import time
import logging
from functools import wraps

logging.basicConfig(level=logging.INFO)

def monitor_api_performance(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        status = "error"  # stays "error" unless the call returns normally
        try:
            result = func(*args, **kwargs)
            status = "success"
            return result
        finally:
            end = time.perf_counter()
            duration = end - start
            
            logging.info(
                f"API: {func.__name__} | "
                f"Duration: {duration:.3f}s | "
                f"Status: {status}"
            )
    
    return wrapper

@monitor_api_performance
def fetch_user_data(user_id):
    # Simulate database query
    time.sleep(0.05)
    return {"user_id": user_id, "name": "John Doe"}

@monitor_api_performance
def process_payment(amount):
    # Simulate payment processing
    time.sleep(0.1)
    return {"status": "completed", "amount": amount}

# Simulate API calls
fetch_user_data(123)
process_payment(99.99)

Database Query Optimization

Database operations frequently represent performance bottlenecks in applications. Measuring query execution times helps identify slow queries and validate that optimizations actually improve performance. This approach works with any database library by wrapping query execution with timing logic.

import time
from contextlib import contextmanager

class DatabaseProfiler:
    def __init__(self):
        self.query_times = {}
    
    @contextmanager
    def profile_query(self, query_name):
        start = time.perf_counter()
        try:
            yield
        finally:
            end = time.perf_counter()
            duration = end - start
            
            if query_name not in self.query_times:
                self.query_times[query_name] = []
            
            self.query_times[query_name].append(duration)
    
    def report(self):
        print("\n=== Database Query Performance Report ===")
        for query_name, times in sorted(
            self.query_times.items(),
            key=lambda x: sum(x[1]),
            reverse=True
        ):
            total = sum(times)
            avg = total / len(times)
            count = len(times)
            print(f"\n{query_name}:")
            print(f"  Total time: {total:.3f}s")
            print(f"  Average: {avg:.3f}s")
            print(f"  Executions: {count}")

# Usage example
profiler = DatabaseProfiler()

def simulate_query(query_name, duration):
    time.sleep(duration)
    return f"Results for {query_name}"

# Simulate various queries
with profiler.profile_query("SELECT users"):
    result = simulate_query("SELECT users", 0.05)

with profiler.profile_query("SELECT orders"):
    result = simulate_query("SELECT orders", 0.15)

with profiler.profile_query("SELECT users"):
    result = simulate_query("SELECT users", 0.04)

profiler.report()
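The same wrapping idea applies to a real driver. As a sketch, the example below times two queries against an in-memory SQLite database from the standard library. The table, the labels, and the timed_query helper are assumptions for illustration, not part of any particular application.

```python
import sqlite3
import time
from collections import defaultdict
from contextlib import contextmanager

query_times = defaultdict(list)  # query label -> list of durations

@contextmanager
def timed_query(label):
    start = time.perf_counter()
    try:
        yield
    finally:
        query_times[label].append(time.perf_counter() - start)

# Throwaway in-memory database with a hypothetical schema
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO users (name) VALUES (?)",
    [(f"user{i}",) for i in range(10_000)],
)

with timed_query("count users"):
    (user_count,) = conn.execute("SELECT COUNT(*) FROM users").fetchone()

with timed_query("name lookup"):
    rows = conn.execute(
        "SELECT id FROM users WHERE name = ?", ("user9999",)
    ).fetchall()

conn.close()

for label, times in query_times.items():
    print(f"{label}: {sum(times):.6f}s over {len(times)} execution(s)")
```

Fetching the rows inside the timed block matters, because many drivers produce results lazily and the execute call alone may return before the work is done.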

Batch Processing Performance Tracking

When processing large datasets in batches, tracking execution time per batch and overall progress helps estimate completion times and identify performance degradation as the process continues. This information is valuable for capacity planning and optimization efforts.

import time
from typing import List, Any

class BatchProcessor:
    def __init__(self, batch_size: int):
        self.batch_size = batch_size
        self.batch_times: List[float] = []
        self.total_items = 0
    
    def process_batches(self, items: List[Any], process_func):
        total_items = len(items)
        num_batches = (total_items + self.batch_size - 1) // self.batch_size
        
        print(f"Processing {total_items} items in {num_batches} batches...")
        
        overall_start = time.perf_counter()
        
        for i in range(0, total_items, self.batch_size):
            batch = items[i:i + self.batch_size]
            batch_num = (i // self.batch_size) + 1
            
            batch_start = time.perf_counter()
            process_func(batch)
            batch_end = time.perf_counter()
            
            batch_time = batch_end - batch_start
            self.batch_times.append(batch_time)
            self.total_items += len(batch)
            
            # Calculate progress metrics
            avg_batch_time = sum(self.batch_times) / len(self.batch_times)
            remaining_batches = num_batches - batch_num
            estimated_remaining = remaining_batches * avg_batch_time
            
            print(
                f"Batch {batch_num}/{num_batches}: "
                f"{batch_time:.3f}s | "
                f"Avg: {avg_batch_time:.3f}s | "
                f"Est. remaining: {estimated_remaining:.1f}s"
            )
        
        overall_end = time.perf_counter()
        total_time = overall_end - overall_start
        
        print(f"\nCompleted in {total_time:.2f}s")
        print(f"Average batch time: {sum(self.batch_times)/len(self.batch_times):.3f}s")
        print(f"Throughput: {self.total_items/total_time:.1f} items/second")

# Example usage
def process_batch(batch):
    # Simulate processing
    time.sleep(0.02 * len(batch))

processor = BatchProcessor(batch_size=100)
data = list(range(1000))
processor.process_batches(data, process_batch)

Statistical Analysis of Timing Data

Single measurements rarely tell the complete performance story. Statistical analysis of multiple measurements provides insights into performance consistency, identifies outliers, and helps distinguish genuine performance differences from random variation. This analytical approach transforms raw timing data into actionable intelligence.

Calculating Performance Statistics

Beyond simple averages, comprehensive statistical measures including median, standard deviation, and percentiles reveal the distribution of execution times and help identify performance patterns that averages alone might obscure.

import time
import statistics
from typing import List, Callable

class PerformanceAnalyzer:
    def __init__(self):
        self.measurements: List[float] = []
    
    def measure(self, func: Callable, iterations: int = 100):
        self.measurements = []
        
        for _ in range(iterations):
            start = time.perf_counter()
            func()
            end = time.perf_counter()
            self.measurements.append(end - start)
    
    def analyze(self) -> dict:
        if not self.measurements:
            return {}
        
        sorted_times = sorted(self.measurements)
        n = len(sorted_times)
        
        return {
            'count': n,
            'mean': statistics.mean(sorted_times),
            'median': statistics.median(sorted_times),
            'stdev': statistics.stdev(sorted_times) if n > 1 else 0,
            'min': min(sorted_times),
            'max': max(sorted_times),
            'p95': sorted_times[int(n * 0.95)],
            'p99': sorted_times[int(n * 0.99)],
        }
    
    def report(self):
        stats = self.analyze()
        if not stats:
            print("No measurements available")
            return
        
        print("\n=== Performance Analysis Report ===")
        print(f"Iterations: {stats['count']}")
        print(f"Mean: {stats['mean']*1000:.3f}ms")
        print(f"Median: {stats['median']*1000:.3f}ms")
        print(f"Std Dev: {stats['stdev']*1000:.3f}ms")
        print(f"Min: {stats['min']*1000:.3f}ms")
        print(f"Max: {stats['max']*1000:.3f}ms")
        print(f"95th percentile: {stats['p95']*1000:.3f}ms")
        print(f"99th percentile: {stats['p99']*1000:.3f}ms")

# Example usage
def sample_operation():
    result = sum(range(10000))
    return result

analyzer = PerformanceAnalyzer()
analyzer.measure(sample_operation, iterations=200)
analyzer.report()
"Percentile measurements, particularly the 95th and 99th percentiles, reveal the performance experience of your slowest users—the ones most likely to notice and complain about poor performance."

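The index-based percentiles above are a simple approximation. As an alternative sketch, the standard library's statistics.quantiles() (Python 3.8+) computes interpolated cut points and needs at least two samples. The helper name percentile_summary and the synthetic timings below are illustrative assumptions, not part of the analyzer class.

```python
import statistics

def percentile_summary(samples):
    """Return median, p95 and p99 using interpolated cut points."""
    # quantiles(n=100) returns 99 cut points; index k-1 holds the k-th percentile
    cuts = statistics.quantiles(samples, n=100)
    return {"median": cuts[49], "p95": cuts[94], "p99": cuts[98]}

# Synthetic timings in seconds (1 ms, 2 ms, ..., 100 ms), used only for illustration
timings = [i / 1000 for i in range(1, 101)]
summary = percentile_summary(timings)
print({name: f"{value * 1000:.2f}ms" for name, value in summary.items()})
```

With real measurements, pass the list collected by your timing loop instead of the synthetic data.
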
Detecting Performance Regressions

Comparing timing measurements across different code versions or implementations helps detect performance regressions and validate optimizations. Statistical significance testing ensures that observed differences represent genuine performance changes rather than random variation.

import time
import statistics
from typing import Callable, List

class PerformanceComparator:
    def __init__(self, iterations: int = 100):
        self.iterations = iterations
    
    def compare(self, func1: Callable, func2: Callable, 
                name1: str = "Version 1", name2: str = "Version 2"):
        print(f"\nComparing {name1} vs {name2}...")
        
        times1 = self._measure_multiple(func1)
        times2 = self._measure_multiple(func2)
        
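        mean1 = statistics.mean(times1)
        mean2 = statistics.mean(times2)
        
        # Positive values mean func2 is faster than func1
        improvement = ((mean1 - mean2) / mean1) * 100 if mean1 else 0.0
        
        # Rough noise check: treat the difference as real only if it exceeds
        # twice its standard error (a Welch-style heuristic, not a formal test)
        std_err = sum(
            statistics.variance(t) / len(t) for t in (times1, times2) if len(t) > 1
        ) ** 0.5
        significant = abs(mean1 - mean2) > 2 * std_err
        
        print(f"\n{name1}: {mean1*1000:.3f}ms (avg)")
        print(f"{name2}: {mean2*1000:.3f}ms (avg)")
        
        if not significant:
            print("➡️ No significant difference")
        elif improvement > 0:
            print(f"✅ {name2} is {improvement:.1f}% faster")
        else:
            print(f"⚠️ {name2} is {abs(improvement):.1f}% slower")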
        
        return {
            'version1': {'name': name1, 'mean': mean1, 'times': times1},
            'version2': {'name': name2, 'mean': mean2, 'times': times2},
            'improvement_percent': improvement
        }
    
    def _measure_multiple(self, func: Callable) -> List[float]:
        times = []
        for _ in range(self.iterations):
            start = time.perf_counter()
            func()
            end = time.perf_counter()
            times.append(end - start)
        return times

# Example: comparing two implementations
def implementation_v1():
    result = []
    for i in range(1000):
        result.append(i ** 2)
    return result

def implementation_v2():
    return [i ** 2 for i in range(1000)]

comparator = PerformanceComparator(iterations=500)
results = comparator.compare(
    implementation_v1, 
    implementation_v2,
    "Loop with append",
    "List comprehension"
)
What is the most accurate way to measure execution time in Python?

The most accurate method depends on your specific needs, but time.perf_counter() is the usual choice for elapsed (wall-clock) time because it uses the highest-resolution clock available for measuring short durations. For CPU-bound operations, time.process_time() reports the CPU time (user plus system) consumed by the current process and excludes time spent sleeping. For micro-benchmarks, the timeit module avoids several common pitfalls: it runs the code many times in a loop and, by default, disables garbage collection while timing. It returns raw totals rather than statistics, so use timeit.repeat() and summarize the results yourself. When measuring very short operations, always run multiple iterations and take the minimum or median result to minimize the impact of system-level interference.
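
As a minimal sketch of the "multiple runs, take the minimum" advice, the snippet below uses timeit.repeat() with a callable. The function build_squares and the loop counts are arbitrary choices for illustration.

```python
import timeit

def build_squares():
    # Hypothetical workload used only for this example
    return [i * i for i in range(1000)]

NUMBER = 1000  # calls per timing run
REPEAT = 5     # independent timing runs

# Each entry is the total time for NUMBER calls, not a per-call time
totals = timeit.repeat(build_squares, number=NUMBER, repeat=REPEAT)
best_per_call = min(totals) / NUMBER
print(f"Best of {REPEAT}: {best_per_call * 1e6:.2f} µs per call")
```

The minimum is a reasonable summary here because slower runs usually reflect interference from other processes rather than the code itself.
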

Why do my timing measurements vary between runs?

Timing variations occur due to numerous factors including CPU scheduling, garbage collection, system load, background processes, cache effects, and operating system interrupts. Python's garbage collector can introduce unpredictable pauses, while other processes competing for CPU time affect your measurements. To minimize variation, run warm-up iterations before measuring, disable garbage collection during critical measurements, close unnecessary applications, and collect multiple measurements to identify the typical range rather than relying on a single value.
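
The warm-up and garbage-collection advice can be sketched as a small helper. The name measure_quietly and its default counts are assumptions chosen for illustration; the helper restores the collector's previous state even if the measured function raises.

```python
import gc
import time

def measure_quietly(func, warmup=3, runs=20):
    """Time func several times after warm-up, with the garbage collector paused."""
    # Warm-up calls let caches fill and lazy initialization finish
    for _ in range(warmup):
        func()

    gc_was_enabled = gc.isenabled()
    gc.disable()
    try:
        samples = []
        for _ in range(runs):
            start = time.perf_counter()
            func()
            samples.append(time.perf_counter() - start)
    finally:
        # Only re-enable the collector if it was on before we started
        if gc_was_enabled:
            gc.enable()
    return samples

samples = measure_quietly(lambda: sorted(range(10_000), reverse=True))
print(f"min {min(samples) * 1000:.3f}ms, max {max(samples) * 1000:.3f}ms")
```

Comparing the minimum and maximum gives a quick feel for how noisy the environment is before you trust any single number.
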

How many iterations should I run when benchmarking code?

The appropriate number of iterations depends on how long your code takes to execute. For operations taking milliseconds, run hundreds or thousands of iterations to get statistically meaningful results. For operations taking seconds, fewer iterations may suffice. The timeit command-line interface and Timer.autorange() choose a loop count automatically, whereas timeit.timeit() defaults to one million loops unless you pass number explicitly. As a general rule, aim for total measurement time of at least a few seconds to minimize the relative impact of measurement overhead and system variability. Always examine the distribution of results rather than just the average.
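
If you would rather not guess a loop count, Timer.autorange() is one option. It increases the number of loops until a run takes at least 0.2 seconds and returns that count with the time taken. The statement being timed below is an arbitrary example.

```python
import timeit

# The statement is compiled once and executed `loops` times per run
timer = timeit.Timer("sum(range(1000))")

# autorange() returns (number of loops, total time for that many loops)
loops, total = timer.autorange()
print(f"{loops} loops took {total:.3f}s ({total / loops * 1e6:.2f} µs per loop)")
```

You can feed the chosen loop count into timer.repeat(number=loops) to collect several runs of the same size.
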

Should I use wall-clock time or CPU time for performance measurement?

Wall-clock time (measured with time.perf_counter()) represents the actual elapsed time from a user's perspective, including time spent waiting for I/O operations, sleeping, or when the process is not scheduled. CPU time (measured with time.process_time()) measures only the time the CPU actively spends executing your code. Use wall-clock time when measuring user-facing operations or when I/O is involved. Use CPU time when comparing algorithmic efficiency or measuring purely computational operations where you want to exclude external factors.
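
The difference is easy to see by timing a call that only waits. In this sketch, measure_both is a hypothetical helper and time.sleep() stands in for an I/O wait.

```python
import time

def measure_both(func):
    """Return (wall_seconds, cpu_seconds) for a single call to func."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    func()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu

# Sleeping consumes wall-clock time but almost no CPU time
wall, cpu = measure_both(lambda: time.sleep(0.2))
print(f"wall: {wall:.3f}s, cpu: {cpu:.3f}s")
```

For a purely computational function the two numbers would be close to each other, which is why CPU time suits algorithm comparisons.
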

How can I measure the performance of asynchronous code accurately?

Measuring async code requires understanding the difference between individual operation time and concurrent execution time. Use the same timing functions (time.perf_counter()) but ensure you properly await async operations while timing. Create async context managers or decorators that work with async/await syntax. When measuring concurrent operations with asyncio.gather(), recognize that the total elapsed time will be less than the sum of individual operation times—this difference represents the benefit of concurrency. Always measure both individual task times and total concurrent execution time to understand your async code's true performance characteristics.
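
A minimal sketch of measuring both per-task and total time follows. The coroutine timed and its delays are illustrative assumptions, with asyncio.sleep() standing in for real asynchronous I/O.

```python
import asyncio
import time

async def timed(name, delay):
    """Await a simulated I/O operation and report how long it took."""
    start = time.perf_counter()
    await asyncio.sleep(delay)  # stands in for a network or disk call
    return name, time.perf_counter() - start

async def main():
    start = time.perf_counter()
    # The three tasks wait concurrently rather than one after another
    results = await asyncio.gather(
        timed("a", 0.1), timed("b", 0.2), timed("c", 0.15)
    )
    total = time.perf_counter() - start
    return results, total

results, total = asyncio.run(main())
summed = sum(elapsed for _, elapsed in results)
print(f"sum of task times: {summed:.2f}s, total elapsed: {total:.2f}s")
```

The total elapsed time tracks the slowest task, while the sum of task times shows what sequential execution would have cost.
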

What's the difference between profiling and timing?

Timing measures how long specific code blocks or functions take to execute, giving you precise measurements of particular operations. Profiling provides a comprehensive view of where time is spent across your entire application, showing which functions are called most frequently and which consume the most cumulative time. Use timing when you know what to measure and need precise results for specific code sections. Use profiling when you need to discover where your application spends time, identify unexpected bottlenecks, or understand the performance characteristics of complex systems with many interacting components.
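
To illustrate the profiling side, here is a small sketch using the standard library's cProfile and pstats modules. The functions slow_part, fast_part, and workload are made up for the example.

```python
import cProfile
import io
import pstats

def slow_part():
    return sum(i * i for i in range(200_000))

def fast_part():
    return sum(range(1000))

def workload():
    slow_part()
    fast_part()

# Record every function call made while the profiler is enabled
profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Show the five entries with the largest cumulative time
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

The report shows which functions dominate the run without you having to decide in advance where to place timers.
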