# Mastering File Handling, Parallel Processing & Python Decorators

Welcome, fellow devs! Whether you're just stepping into the world of Python or brushing up your skills, this guide is designed to give you a hands-on, practical experience with real-world Python features — from handling massive data files to writing clean, efficient, and reusable code.

Let's dive right in!

Part 1: File Handling in Python

1. Basic File Operations

Python makes file operations a breeze using the built-in open() function. Here's a simple way to open, read, and close a file:

file = open("sample.txt", "r")
content = file.read()
file.close()

But there's a better way — enter the with statement:

with open("sample.txt", "r") as file:
    content = file.read()

Why use with?

  • It ensures the file is closed automatically, even if an exception is raised (see the sketch below)
  • Prevents resource leaks (dangling file handles) and lingering file locks
  • Cleaner and more Pythonic
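
To see that guarantee in action, here's a minimal sketch (reusing sample.txt from above) showing the file gets closed even when the block raises:

f = None
try:
    with open("sample.txt", "r") as f:
        raise ValueError("something went wrong mid-read")
except ValueError:
    pass

print(f.closed)  # True: the with block closed the file despite the exception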

2. Reading and Writing Files

Reading all lines into a list (note that readlines() loads the entire file into memory; see section 3 for true line-by-line streaming):

with open("sample.txt", "r") as file:
    lines = file.readlines()

Writing to a file:

with open("output.txt", "w") as file:
    file.write("Hello, world!")

Appending to a file:

with open("output.txt", "a") as file:
    file.write("\nNew line added!")

3. Handling Large Files Efficiently

Trying to load a massive file all at once? ❌ Not ideal.

Instead, use these efficient techniques:

Reading line-by-line (streaming):

with open("large_file.txt", "r") as file:
    for line in file:
        print(line.strip())

Reading in fixed-size chunks (the := walrus operator requires Python 3.8+):

with open("large_file.txt", "r") as file:
    while chunk := file.read(1024):
        print(chunk)

This way, you only load small portions of the file into memory at a time.
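
As a quick illustration of the streaming style, here's a sketch that counts a large file's lines while holding only one line in memory at a time:

with open("large_file.txt", "r") as file:
    line_count = sum(1 for _ in file)  # consumes the file lazily, one line at a time
print(line_count)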

4. Working with CSV and Excel Files Using Pandas

If you're working with structured data, Pandas is your best friend:

import pandas as pd

df = pd.read_csv("data.csv")
print(df.head())

To write a CSV:

df.to_csv("output.csv", index=False)

To handle Excel files (reading and writing .xlsx requires an engine such as openpyxl: pip install openpyxl):

df = pd.read_excel("data.xlsx", sheet_name="Sheet1")
df.to_excel("output.xlsx", index=False, sheet_name="Results")

Handling large CSVs in chunks:

chunk_size = 10000
for chunk in pd.read_csv("large_data.csv", chunksize=chunk_size):
    print(chunk.shape)
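
Chunking pairs naturally with filtering. Here's a sketch that keeps only rows where a hypothetical "value" column exceeds 100, writing results incrementally so memory use stays flat (the column name and threshold are placeholders):

import pandas as pd

first = True
for chunk in pd.read_csv("large_data.csv", chunksize=10000):
    filtered = chunk[chunk["value"] > 100]  # "value" is a hypothetical column name
    # Write the header only for the first chunk, then append the rest
    filtered.to_csv("filtered.csv", mode="w" if first else "a", header=first, index=False)
    first = False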

Part 2: Parallel Processing in Python

Want to do more in less time? Python offers several routes to concurrency, with one caveat up front: CPython's Global Interpreter Lock (GIL) keeps threads from executing Python bytecode in parallel. That's why threads suit I/O-bound work (waiting on disks and networks), while separate processes are the right tool for CPU-bound work.

1. Multithreading (Great for I/O-bound tasks)

import threading

def print_numbers():
    for i in range(5):
        print(i)

thread1 = threading.Thread(target=print_numbers)
thread2 = threading.Thread(target=print_numbers)

thread1.start()
thread2.start()
thread1.join()
thread2.join()

Because both threads run concurrently, their printed output may interleave. Threads shine for I/O-bound tasks like these (the first is sketched below):

  • Downloading files
  • Reading/writing files
  • Making multiple API calls
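
To make the download case concrete, here's a minimal sketch in which time.sleep stands in for real network I/O (the file names are placeholders):

import threading
import time

def download(name):
    time.sleep(1)  # stand-in for a network request
    print(f"finished {name}")

threads = [threading.Thread(target=download, args=(f"file{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# All three "downloads" finish in about 1 second total, not 3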

2. Multiprocessing (Perfect for CPU-bound tasks)

from multiprocessing import Pool

def square(n):
    return n * n

if __name__ == "__main__":
    with Pool(4) as p:
        result = p.map(square, [1, 2, 3, 4])
    print(result)

Use this when you're crunching data or running heavy calculations. The if __name__ == "__main__": guard is essential here: multiprocessing starts workers by importing your module in fresh processes (always on Windows, and by default on macOS), and without the guard each worker would try to spawn workers of its own.

3. concurrent.futures - Simpler Parallelism

For I/O-bound tasks:

from concurrent.futures import ThreadPoolExecutor

def fetch_data(url):
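    # Placeholder body: swap in a real request (e.g., urllib.request.urlopen) for actual network I/O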
    return f"Fetched {url}"

urls = ["https://site1.com", "https://site2.com"]
with ThreadPoolExecutor() as executor:
    results = executor.map(fetch_data, urls)
    print(list(results))

For CPU-bound tasks:

from concurrent.futures import ProcessPoolExecutor

def cube(n):
    return n ** 3

if __name__ == "__main__":
    with ProcessPoolExecutor() as executor:
        results = executor.map(cube, [1, 2, 3, 4])
        print(list(results))

The same __main__ guard applies as with multiprocessing.Pool, since ProcessPoolExecutor also launches worker processes that re-import your module.

Part 3: Decorators — Python's Superpower

A decorator is a function that takes another function and returns a new one, letting you layer extra behavior around the original without modifying its code.

1. A Simple Decorator

def my_decorator(func):
    def wrapper():
        print("Before function call")
        func()
        print("After function call")
    return wrapper

@my_decorator
def say_hello():
    print("Hello!")

say_hello()
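
Calling say_hello() prints:

Before function call
Hello!
After function call

The @my_decorator syntax is just sugar for say_hello = my_decorator(say_hello).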

2. Decorator with Arguments

def repeat(n):
    def decorator(func):
        def wrapper(*args, **kwargs):
            for _ in range(n):
                func(*args, **kwargs)
        return wrapper
    return decorator

@repeat(3)
def greet():
    print("Hello!")

greet()  # prints "Hello!" three times

3. Using functools.wraps

import functools

def log(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        print(f"Calling {func.__name__} with {args}")
        return func(*args, **kwargs)
    return wrapper

@log
def add(a, b):
    return a + b

print(add(2, 3))

functools.wraps keeps the original function's name, docstring, and other metadata intact by copying attributes like __name__ and __doc__ onto the wrapper.
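
To see what that buys you, compare a decorator without it (no_wraps and sub are names made up for this sketch):

def no_wraps(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@no_wraps
def sub(a, b):
    """Subtract b from a."""
    return a - b

print(sub.__name__)  # "wrapper": the original identity is lost
print(add.__name__)  # "add": preserved thanks to functools.wraps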

Lambda Functions

Short, anonymous functions defined in a single expression. Great for one-liners, especially throwaway callbacks and key functions.

add = lambda x, y: x + y
print(add(5, 3))  # 8
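
Where lambdas really earn their keep is as throwaway key functions:

words = ["banana", "Apple", "cherry"]
print(sorted(words, key=lambda w: w.lower()))  # ['Apple', 'banana', 'cherry']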

List Comprehensions

Build a list from an iterable in a single expression:

squares = [x ** 2 for x in range(5)]  # [0, 1, 4, 9, 16]

Dictionary Comprehensions

The same idea, but producing key-value pairs:

squares_dict = {x: x ** 2 for x in range(5)}  # {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}

Assignment Questions (Practice Makes Perfect!)

Part 1: File Handling

  1. Write a Python program that reads a CSV file, filters rows where a specific column > 100, and writes the result to a new file.
  2. Modify the program to process large files in chunks.

Part 2: Parallel Processing

  1. Use multithreading to download multiple files simultaneously.
  2. Use multiprocessing to compute factorials of numbers from 1 to 10.

Part 3: Decorators

  1. Create a decorator that logs function execution time.
  2. Write a decorator that caches results of function calls (hint below).
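
Hint for that last one: the standard library already ships a caching decorator you can study as a reference implementation:

import functools

@functools.lru_cache(maxsize=None)
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(30))  # 832040, fast thanks to memoization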

Thanks for your time! Feel free to ask any questions!
Which concept would you like to see a deep dive on next? Let me know in the comments!