Clean Architecture in Python: Patterns, Principles, and Pythonic Design

Section 4 of 13

Object-Oriented Design in Python: Classes Done Right

By the time you've been writing Python for a few years, you've probably seen—or written—classes that do too much. The kind with thirty methods, a constructor accepting a dozen arguments, and a docstring that starts with "This class handles..." before listing six different things. We've all been there. The real question isn't whether you'll encounter these classes. It's whether you'll recognize what's wrong, name it, and know how to fix it.

This section is about building that vocabulary and those instincts. We're going to look at what makes a class genuinely well-designed—specifically cohesion and coupling at the class level—and then explore how Python gives you real, usable tools for solving these problems: dataclasses, properties, the dunder protocol, and the discipline to keep behaviour where it belongs.

One thing to establish upfront: object-oriented design in Python is not Java with different keywords. Python gives you enough rope to write Java-style classes, and people do it constantly, and it almost always backfires. The Pythonic version is usually shorter, clearer, and easier to test. We'll see why as we go.

Cohesion: The One Job Your Class Should Do

Cohesion is straightforward: it's the degree to which the elements of a class belong together. High cohesion means the class does one clearly-defined thing and everything in it serves that purpose. Low cohesion means you've got a grab-bag—a class that writes to the database, validates user input, sends emails, and generates PDF reports. (Only slightly exaggerated. I've seen worse.)

The Single Responsibility Principle, often attributed to Robert C. Martin, puts it more carefully: "a class should have only one reason to change." That framing is cleaner because it forces you to think forward in time. If your class would need to change when the email provider changes, and when the database schema changes, and when the PDF format changes, then you have a cohesion problem.

Here's a practical test: can you describe your class's purpose without using the word "and"? If you catch yourself saying "this class manages users and sends notifications and persists sessions," that's really three classes pretending to be one.

Let's look at a low-cohesion example:

class UserManager:
    def create_user(self, username, email):
        # Insert into database
        ...

    def send_welcome_email(self, user):
        # Connect to SMTP, build message, send
        ...

    def generate_user_report(self, user):
        # Build PDF with user stats
        ...

    def validate_password(self, password):
        # Check length, complexity rules
        ...

This class has four distinct reasons to change: database schema, email templates, report format, and password policy. Each belongs in its own class (or, depending on scope, its own module of functions—we covered that in the previous section). Splitting them makes each piece independently testable, independently replaceable, and independently understandable.
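As a rough sketch of what that split might look like: the class names below (PasswordPolicy, UserRepository, WelcomeMailer) and the dictionary standing in for a database are illustrative assumptions, not a prescribed design, and report generation is left out to keep the example short.

```python
class PasswordPolicy:
    """Changes only when the password rules change."""

    def __init__(self, min_length: int = 12):
        self.min_length = min_length

    def is_valid(self, password: str) -> bool:
        return len(password) >= self.min_length and any(
            c.isdigit() for c in password
        )


class UserRepository:
    """Changes only when storage changes (a dict stands in for a database)."""

    def __init__(self):
        self._users = {}

    def add(self, username: str, email: str) -> dict:
        user = {"username": username, "email": email}
        self._users[username] = user
        return user

    def get(self, username: str) -> dict:
        return self._users[username]


class WelcomeMailer:
    """Changes only when the welcome message changes."""

    def build_message(self, user: dict) -> str:
        return f"Welcome, {user['username']}!"
```

Each piece can now be described without the word "and", and each can be tested on its own.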

Coupling: What Your Class Knows About the Outside World

If cohesion is about what's inside the class, coupling is about what the class depends on outside itself. Coupling measures how tightly your class is entangled with other classes and modules. High coupling means changing one class ripples through the system. Low coupling means you can swap one class out without everything else noticing.

The clearest signal of tight coupling is imports. If your class imports five other concrete classes and calls their methods directly, you're locked into all of them. The moment any one of them changes its interface, your class breaks.

# Tight coupling: UserService knows exactly who it's talking to
from myapp.database.postgres_repository import PostgresUserRepository
from myapp.email.mailgun_client import MailgunEmailClient
from myapp.pdf.wkhtmltopdf_generator import WkhtmltopdfReportGenerator

class UserService:
    def __init__(self):
        self.repo = PostgresUserRepository()
        self.email = MailgunEmailClient()
        self.reports = WkhtmltopdfReportGenerator()

Now you swap Mailgun for SendGrid and you're editing UserService. You swap Postgres for SQLite in tests and you're editing UserService. That's coupling making your life harder.

The solution involves dependency inversion—accepting collaborators from outside rather than constructing them inside—which we'll dig into thoroughly in Section 8. For now, just recognize the pattern: a class that reaches out to construct its own dependencies is announcing its couplings loudly.
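Ahead of that fuller treatment, here is a minimal sketch of the idea. The register method, the save/send method names, and the two in-memory stand-ins are all hypothetical; the point is only that UserService no longer names a concrete database or email provider.

```python
class UserService:
    """Receives its collaborators instead of constructing them."""

    def __init__(self, repo, email_client):
        self.repo = repo
        self.email_client = email_client

    def register(self, username: str, email: str) -> None:
        self.repo.save(username, email)
        self.email_client.send(email, "Welcome!")


# Hypothetical stand-ins, the kind you might pass in from a test.
class InMemoryUserRepository:
    def __init__(self):
        self.saved = []

    def save(self, username, email):
        self.saved.append((username, email))


class RecordingEmailClient:
    def __init__(self):
        self.sent = []

    def send(self, to, body):
        self.sent.append((to, body))


repo = InMemoryUserRepository()
mailer = RecordingEmailClient()
service = UserService(repo, mailer)
service.register("ada", "ada@example.com")
```

Swapping the email provider now means passing a different object to the constructor; UserService itself stays untouched.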

graph TD
    A[Low Coupling + High Cohesion<br/>= Ideal] --> B[Easy to Test]
    A --> C[Easy to Replace]
    A --> D[Easy to Understand]
    E[High Coupling + Low Cohesion<br/>= God Object] --> F[Breaks When Anything Changes]
    E --> G[Impossible to Test in Isolation]
    E --> H[Nobody Dares Touch It]

Dataclasses as the Pythonic Value Object

Python's dataclasses module, introduced in 3.7, is one of the best standard library additions of the last decade. It solves a genuine, annoying problem: representing structured data without writing boilerplate.

Before dataclasses, if you wanted a class to hold a money amount and currency, you'd write something like this:

class Money:
    def __init__(self, amount, currency):
        self.amount = amount
        self.currency = currency

    def __repr__(self):
        return f"Money(amount={self.amount!r}, currency={self.currency!r})"

    def __eq__(self, other):
        if not isinstance(other, Money):
            return NotImplemented
        return self.amount == other.amount and self.currency == other.currency

That's a dozen lines of boilerplate for two fields, and it grows with every field you add. With dataclasses:

from dataclasses import dataclass
from decimal import Decimal

@dataclass
class Money:
    amount: Decimal
    currency: str

You get __init__, __repr__, and __eq__ for free. The type annotations do double duty: they document the fields and they let tools like mypy and Pyright catch type errors before runtime.
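A quick way to see the generated methods at work, as a small self-contained sketch (the variable names a and b are just for illustration):

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Money:
    amount: Decimal
    currency: str


a = Money(Decimal("9.99"), "USD")
b = Money(Decimal("9.99"), "USD")

print(a)        # Money(amount=Decimal('9.99'), currency='USD')
print(a == b)   # True: the generated __eq__ compares field by field
print(a is b)   # False: they are still two distinct objects
```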

The architectural payoff is real: dataclasses encourage you to think of your data structures as first-class design artifacts, not as incidental dictionaries or tuples. That mindset shift matters.

Frozen Dataclasses and Immutability as a Design Choice

Immutability doesn't get enough respect in Python circles, probably because Python doesn't enforce it the way functional languages do. But immutable objects are genuinely simpler to reason about: if you can't change an object's state, you can't accidentally share mutated state between components, and handing the same object to several threads can't produce a race on its fields.

The frozen=True parameter makes a dataclass immutable:

from dataclasses import dataclass
from decimal import Decimal

@dataclass(frozen=True)
class Money:
    amount: Decimal
    currency: str

price = Money(Decimal("9.99"), "USD")
price.amount = Decimal("0.01")  # Raises FrozenInstanceError

More interestingly, frozen dataclasses are hashable by default (as long as all their fields are hashable), which means you can use them as dictionary keys or in sets:

prices = {
    Money(Decimal("9.99"), "USD"): "Standard",
    Money(Decimal("19.99"), "USD"): "Pro",
}

This is powerful for domain modelling. Making a type frozen is an architectural statement: "instances of this type represent values, not mutable state." That decision points us directly at one of the most useful distinctions in domain-driven design.
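In practice, "changing" a frozen value means building a new one. The sketch below uses dataclasses.replace from the standard library for that; the variable names are illustrative, and this is one reasonable way to work with frozen values rather than the only one.

```python
from dataclasses import FrozenInstanceError, dataclass, replace
from decimal import Decimal


@dataclass(frozen=True)
class Money:
    amount: Decimal
    currency: str


price = Money(Decimal("9.99"), "USD")

try:
    price.amount = Decimal("0.01")
except FrozenInstanceError:
    mutated = False
else:
    mutated = True

# Derive a new value; the original is left untouched.
discounted = replace(price, amount=Decimal("7.99"))

# Equal values hash equally, so a fresh instance finds the same dict entry.
tiers = {price: "Standard"}
lookup = tiers[Money(Decimal("9.99"), "USD")]
```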

Entities vs Value Objects: Identity and Equality

This is one of the most practically useful distinctions in domain modelling, and it usually gets explained in confusing ways because people focus on the pattern's name rather than the underlying question it answers.

The question is: what makes two instances the same?

For a value object, sameness is defined entirely by the values it holds. Two Money(Decimal("10.00"), "USD") instances are equal; it doesn't matter which particular object in memory you're holding. This is how 5 == 5 works in mathematics. Value objects are immutable, interchangeable, and compared by value.

For an entity, sameness is defined by identity—usually a unique identifier. Two User objects with the same name and email are different users if they have different IDs. Entities have a lifecycle, can change their internal state over time, and are compared by their ID, not their field values.

from dataclasses import dataclass, field
from decimal import Decimal
import uuid

# Value object: equality by value
@dataclass(frozen=True)
class Money:
    amount: Decimal
    currency: str

# Entity: equality by identity
@dataclass
class Order:
    id: uuid.UUID = field(default_factory=uuid.uuid4)
    customer_email: str = ""
    total: Money = field(default_factory=lambda: Money(Decimal("0"), "USD"))

    def __eq__(self, other):
        if not isinstance(other, Order):
            return NotImplemented
        return self.id == other.id

    def __hash__(self):
        return hash(self.id)

Notice that Order has mutable state—you can update customer_email or total—but its identity is fixed by id. Two orders with the same email address are still different orders if they have different IDs.
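To make the difference concrete, here is a small sketch comparing the two kinds of equality. It passes eq=False to @dataclass to make it explicit that the entity supplies its own comparison; that is a stylistic choice for this example, and the variable names are hypothetical.

```python
from dataclasses import dataclass, field
from decimal import Decimal
import uuid


@dataclass(frozen=True)
class Money:
    amount: Decimal
    currency: str


@dataclass(eq=False)  # we provide identity-based equality ourselves
class Order:
    id: uuid.UUID = field(default_factory=uuid.uuid4)
    customer_email: str = ""

    def __eq__(self, other):
        if not isinstance(other, Order):
            return NotImplemented
        return self.id == other.id

    def __hash__(self):
        return hash(self.id)


first = Order(customer_email="ada@example.com")
second = Order(customer_email="ada@example.com")
# Same id, different field values: think "the same order, loaded later".
reloaded = Order(id=first.id, customer_email="new@example.com")

same_value = Money(Decimal("10.00"), "USD") == Money(Decimal("10.00"), "USD")
same_fields_equal = first == second      # False: different ids
same_identity_equal = first == reloaded  # True: same id
```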

Here's the practical consequence: if you're modelling something with a lifecycle (a user, an order, a bank account, a session), it's probably an entity. If you're modelling a measurement, a quantity, a colour, a coordinate, or anything whose meaning is entirely contained in its values, it's probably a value object. When in doubt: if swapping instance A for a different instance B with identical field values would make no difference from a business perspective, A is a value object.

The Anemic Domain Model: When Classes Have No Soul

Martin Fowler called the anemic domain model an anti-pattern, and he was right, though the term doesn't circulate much outside DDD circles. Here's what it looks like:

@dataclass
class Order:
    id: uuid.UUID
    items: list
    status: str
    total: Decimal

class OrderService:
    def add_item(self, order, item):
        order.items.append(item)
        order.total += item.price

    def submit(self, order):
        if order.status != "draft":
            raise ValueError("Can only submit draft orders")
        order.status = "submitted"

    def cancel(self, order):
        if order.status == "submitted":
            raise ValueError("Cannot cancel submitted orders")
        order.status = "cancelled"

The Order class is just a data container—a dictionary with type annotations. All the logic that governs orders lives somewhere else in OrderService. This looks clean on the surface, but it's actually a design failure. The rules about what orders can do—"you can only submit a draft," "you can't cancel a submitted order"—are domain logic. They belong in the domain object.

Why? Because domain logic tends to get duplicated. Another service—AdminOrderService, BatchOrderProcessor—has to repeat all those state checks. Or worse, it doesn't repeat them and accidentally allows invalid transitions that should be impossible.

The fix is to move behaviour back into the object that owns the data:

@dataclass
class Order:
    id: uuid.UUID = field(default_factory=uuid.uuid4)
    items: list = field(default_factory=list)
    status: str = "draft"
    total: Decimal = field(default=Decimal("0"))

    def add_item(self, item: "OrderItem") -> None:
        if self.status != "draft":
            raise ValueError(f"Cannot add items to a {self.status} order")
        self.items.append(item)
        self.total += item.price

    def submit(self) -> None:
        if self.status != "draft":
            raise ValueError(f"Cannot submit a {self.status} order")
        if not self.items:
            raise ValueError("Cannot submit an empty order")
        self.status = "submitted"

    def cancel(self) -> None:
        # Same rule the service enforced: a submitted order cannot be cancelled.
        if self.status != "draft":
            raise ValueError(f"Cannot cancel a {self.status} order")
        self.status = "cancelled"

Now the object enforces its own invariants. Any code holding an Order can call submit() and trust that the transition is valid. Domain logic lives in one place, not scattered across services. This is the core insight behind rich domain models, and it connects directly to the central thesis of this course: behaviour belongs with data, and Python gives you the tools to make that happen cleanly.
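Here is a sketch of what that trust looks like from the caller's side. The OrderItem type and the attempt helper are hypothetical additions for the demonstration, and Order is trimmed to the two methods being exercised.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass(frozen=True)
class OrderItem:  # hypothetical item type; only `price` matters here
    name: str
    price: Decimal


@dataclass
class Order:
    items: list = field(default_factory=list)
    status: str = "draft"
    total: Decimal = Decimal("0")

    def add_item(self, item: OrderItem) -> None:
        if self.status != "draft":
            raise ValueError(f"Cannot add items to a {self.status} order")
        self.items.append(item)
        self.total += item.price

    def submit(self) -> None:
        if self.status != "draft":
            raise ValueError(f"Cannot submit a {self.status} order")
        if not self.items:
            raise ValueError("Cannot submit an empty order")
        self.status = "submitted"


def attempt(action) -> str:
    """Run an action and report the rejection message, if any."""
    try:
        action()
    except ValueError as exc:
        return str(exc)
    return "ok"


order = Order()
empty_submit = attempt(order.submit)
order.add_item(OrderItem("Keyboard", Decimal("49.00")))
first_submit = attempt(order.submit)
late_add = attempt(lambda: order.add_item(OrderItem("Mouse", Decimal("19.00"))))
```

No caller had to remember the rules; every invalid transition was rejected by the object itself, and the rejected late addition left the total unchanged.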

One important caveat: "rich domain model" doesn't mean "put everything in the class." Infrastructure concerns—persisting to a database, sending emails—still don't belong in Order. The distinction is between domain behaviour (the rules of the business) and infrastructure behaviour (talking to the outside world). You keep them separate.

God Objects: Recognising and Breaking Them Up

A God object is the logical conclusion of low cohesion: a class that knows too much, does too much, and has so many methods that changing any of them feels dangerous without reading all the others first. They grow gradually—"I'll just add this one method here"—and before long you have a 2,000-line class that everyone fears.

Signs you're looking at a God object:

The class has methods spanning multiple conceptual domains
Its constructor takes more than four or five arguments
The class has more than a handful of direct dependencies
Understanding one method requires understanding state set by five other methods
Tests for this class take minutes to set up

The forensic approach to breaking one up is to look at which instance attributes each method touches and ask which methods share state. If method_a reads and writes self.email_client and self.template_engine, and method_b reads and writes self.db and self.query_cache, and nothing crosses that boundary, you've found two classes hiding inside one.
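For a large class, you can automate the first pass of that investigation. The sketch below uses the standard library's ast module to list which self attributes each method touches; the sample Application source and the attributes_touched helper are made up for illustration, and a real class would need more care (inherited state, attributes reached through helpers, and so on).

```python
import ast
import textwrap

SOURCE = textwrap.dedent('''
    class Application:
        def send_welcome_email(self, user):
            body = self.templates.render("welcome", user)
            self.email.send(user, body)

        def send_password_reset(self, user):
            self.email.send(user, self.templates.render("reset", user))

        def get_user(self, user_id):
            return self.cache.get(user_id) or self.db.fetch(user_id)

        def invalidate_cache(self, key):
            self.cache.delete(key)
''')


def attributes_touched(source: str) -> dict:
    """Map each method name to the set of `self.<attr>` names it uses."""
    usage = {}
    class_def = ast.parse(source).body[0]
    for node in class_def.body:
        if isinstance(node, ast.FunctionDef):
            usage[node.name] = {
                n.attr
                for n in ast.walk(node)
                if isinstance(n, ast.Attribute)
                and isinstance(n.value, ast.Name)
                and n.value.id == "self"
            }
    return usage


usage = attributes_touched(SOURCE)
for method, attrs in usage.items():
    print(f"{method}: {sorted(attrs)}")
```

Methods that only touch templates and email form one cluster; methods that only touch cache and db form another. Those clusters are the candidate classes.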

# Before: God object
class Application:
    def __init__(self, db_url, smtp_host, template_dir, cache_ttl, ...):
        self.db = Database(db_url)
        self.email = EmailClient(smtp_host)
        self.templates = TemplateEngine(template_dir)
        self.cache = Cache(cache_ttl)
        # ...

    def get_user(self, user_id): ...
    def create_user(self, ...): ...
    def send_welcome_email(self, user): ...
    def send_password_reset(self, user): ...
    def render_dashboard(self, user): ...
    def render_profile(self, user): ...
    def invalidate_cache(self, key): ...
    # ... 40 more methods

# After: extracted, focused classes
class UserRepository:
    def __init__(self, db): self.db = db
    def get(self, user_id): ...
    def create(self, ...): ...

class NotificationService:
    def __init__(self, email_client, templates):
        self.email_client = email_client
        self.templates = templates
    def send_welcome(self, user): ...
    def send_password_reset(self, user): ...

The refactoring itself is often mechanical—extract the methods that share dependencies into a new class, pass that shared state in via the constructor. What's genuinely hard is the courage to do it in an existing codebase where the God object has accumulated years of tangled state. The tests you write before touching it are your safety net; we'll cover that in Section 12.

Class, Instance, and Static Methods: Choosing the Right Tool

Python gives you three kinds of method, and the choice between them signals design intent:

Instance methods are the default. They receive self and have access to instance state. Use them for behaviour that operates on or modifies the object's state.

Class methods receive cls instead of self. They have access to the class but not to any particular instance. The most common use is as alternative constructors:

@dataclass
class Money:
    amount: Decimal
    currency: str

    @classmethod
    def zero(cls, currency: str) -> "Money":
        return cls(Decimal("0"), currency)

    @classmethod
    def from_cents(cls, cents: int, currency: str) -> "Money":
        return cls(Decimal(cents) / 100, currency)

# Clearer than Money(Decimal("0"), "USD") everywhere
starting_balance = Money.zero("USD")

Alternative constructors via @classmethod are idiomatic Python and should be used liberally. They're far cleaner than overloading __init__ with optional parameters serving multiple purposes.

Static methods receive neither self nor cls. They're functions that happen to live in a class's namespace. The honest question about any @staticmethod is: why is this in the class at all? If it doesn't use class or instance state, maybe it belongs as a module-level function. Sometimes the answer is legitimate—grouping related utility functions under a class namespace can aid discoverability—but it's worth being intentional.

class PasswordPolicy:
    MIN_LENGTH = 12

    @staticmethod
    def validate(password: str) -> bool:
        return (
            len(password) >= PasswordPolicy.MIN_LENGTH
            and any(c.isupper() for c in password)
            and any(c.isdigit() for c in password)
        )

This is fine, but notice that validate could equally be a module-level function validate_password. The class grouping is mostly for namespace organization. That's reasonable. What's not reasonable is using @staticmethod to make helper functions harder to import elsewhere.
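For comparison, the module-level version might look like this. The constant name MIN_PASSWORD_LENGTH is an assumption made for the sketch; the checks are the same three as in PasswordPolicy.validate.

```python
MIN_PASSWORD_LENGTH = 12


def validate_password(password: str) -> bool:
    """Same checks as PasswordPolicy.validate, without the class wrapper."""
    return (
        len(password) >= MIN_PASSWORD_LENGTH
        and any(c.isupper() for c in password)
        and any(c.isdigit() for c in password)
    )
```

Which form you choose is mostly a question of how callers will look for it: a module import, or a class name they already know.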

Properties and Descriptors: Encapsulation Without Boilerplate

Java-style getters and setters are one of the worst things to import into Python from other languages:

# Please don't
class BankAccount:
    def get_balance(self):
        return self._balance

    def set_balance(self, value):
        self._balance = value

Python's @property decorator gives you the same encapsulation with attribute-access syntax. More importantly: you can start with a plain attribute and later add validation logic without changing the calling code at all:

from decimal import Decimal

class BankAccount:
    def __init__(self, initial_balance: Decimal):
        self._balance = initial_balance

    @property
    def balance(self) -> Decimal:
        return self._balance

    def deposit(self, amount: Decimal) -> None:
        if amount <= 0:
            raise ValueError("Deposit amount must be positive")
        self._balance += amount

    def withdraw(self, amount: Decimal) -> None:
        if amount <= 0:
            raise ValueError("Withdrawal amount must be positive")
        if amount > self._balance:
            raise ValueError("Insufficient funds")
        self._balance -= amount

Notice that balance is read-only—no setter. This is deliberate: the only way to change the balance is through deposit and withdraw, which enforce business rules. This is encapsulation actually doing work, not just hiding a field behind a method call.
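A short demonstration of that read-only behaviour, using a trimmed copy of the class (withdraw is omitted, and the flag name assignment_blocked is just for the example):

```python
from decimal import Decimal


class BankAccount:
    def __init__(self, initial_balance: Decimal):
        self._balance = initial_balance

    @property
    def balance(self) -> Decimal:
        return self._balance

    def deposit(self, amount: Decimal) -> None:
        if amount <= 0:
            raise ValueError("Deposit amount must be positive")
        self._balance += amount


account = BankAccount(Decimal("100"))

try:
    account.balance = Decimal("1000000")  # no setter is defined
except AttributeError:
    assignment_blocked = True
else:
    assignment_blocked = False

account.deposit(Decimal("25"))  # the sanctioned route still works
```

Keep in mind that the underscore is a convention: account._balance is still reachable, so the property guards against accidents rather than determined misuse.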

When you need computed attributes or validation on assignment, properties handle both:

class Temperature:
    def __init__(self, celsius: float):
        self.celsius = celsius  # Goes through the setter

    @property
    def celsius(self) -> float:
        return self._celsius

    @celsius.setter
    def celsius(self, value: float) -> None:
        if value < -273.15:
            raise ValueError("Temperature below absolute zero")
        self._celsius = value

    @property
    def fahrenheit(self) -> float:
        return self._celsius * 9/5 + 32

Descriptors are the lower-level mechanism that @property is built on. They're worth knowing when you need reusable property logic across multiple classes. A descriptor is any class implementing __get__, __set__, or __delete__. They're a rabbit hole with real depth—if you're finding yourself writing the same validation property across multiple classes, that's when to dive into descriptors. For most application code, @property is sufficient.
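If you do reach that point, a reusable validating descriptor might look roughly like this. Positive, OrderLine, and Shipment are hypothetical names for the sketch; __set_name__, __get__, and __set__ are the standard descriptor hooks.

```python
class Positive:
    """Descriptor that rejects non-positive numbers on assignment."""

    def __set_name__(self, owner, name):
        # Called once per attribute when the owning class is created.
        self.public_name = name
        self.private_name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class itself
        return getattr(instance, self.private_name)

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError(
                f"{self.public_name} must be positive, got {value!r}"
            )
        setattr(instance, self.private_name, value)


class OrderLine:
    quantity = Positive()
    unit_price = Positive()

    def __init__(self, quantity, unit_price):
        self.quantity = quantity      # goes through Positive.__set__
        self.unit_price = unit_price


class Shipment:
    weight_kg = Positive()

    def __init__(self, weight_kg):
        self.weight_kg = weight_kg
```

The validation is written once and shared by three attributes across two classes, which is the situation where a descriptor starts to pay for its extra indirection.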

Designing for Debuggability: __repr__, __eq__, and __hash__

These three dunders are often forgotten. They shouldn't be.

__repr__ is what appears in the REPL, in error messages, and in test output. A good __repr__ shows you everything needed to understand the object's state and, ideally, produces output that could be used to recreate the object. The convention is ClassName(field=value, ...):

class Order:
    def __repr__(self):
        return (
            f"Order(id={self.id!r}, status={self.status!r}, "
            f"item_count={len(self.items)})"
        )

When a test fails and prints <__main__.Order object at 0x7f3a2c>, you learn nothing. When it prints Order(id=UUID('...'), status='submitted', item_count=3), you can debug immediately. This sounds trivial until you're staring at a failing test late at night and genuinely grateful for whoever wrote the repr.

__eq__ controls what == means. The default—identity comparison—is almost never what you want for domain objects. If you're not using @dataclass (which provides __eq__ for free based on fields), define it explicitly:

def __eq__(self, other):
    if not isinstance(other, type(self)):
        return NotImplemented
    return self.id == other.id  # For entities

Note the return NotImplemented rather than return False when types don't match. This is subtle but important: returning NotImplemented allows Python to try the comparison from the other side, which matters for symmetric operations.
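Here is a small sketch of that fallback in action. Meters and Feet are invented types for the demonstration; only Feet knows how to compare itself with Meters.

```python
class Meters:
    def __init__(self, value: float):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, Meters):
            return NotImplemented  # "I can't decide; ask the other operand"
        return self.value == other.value


class Feet:
    def __init__(self, value: float):
        self.value = value

    def __eq__(self, other):
        if isinstance(other, Meters):
            return abs(self.value * 0.3048 - other.value) < 1e-9
        if isinstance(other, Feet):
            return self.value == other.value
        return NotImplemented


# Meters.__eq__ returns NotImplemented, so Python tries Feet.__eq__ next.
reflected = Meters(3.048) == Feet(10)

# Neither side recognises the other: Python falls back to identity (False).
unrelated = Meters(1) == "1 m"
```

Had Meters.__eq__ returned False for unknown types, the first comparison would have stopped there and Feet would never have been consulted.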

__hash__ needs attention whenever you override __eq__, because Python's rule is: objects that compare equal must have the same hash. If a class defines __eq__ without defining __hash__, Python silently sets its __hash__ to None, making instances unhashable and causing a baffling TypeError when someone tries to put one in a set or use it as a dictionary key.
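You can watch this happen with a few lines. Customer is a hypothetical class for the sketch; the exact wording of the error message varies between Python versions, so the example only captures it.

```python
class Customer:
    def __init__(self, customer_id: int):
        self.customer_id = customer_id

    def __eq__(self, other):
        if not isinstance(other, Customer):
            return NotImplemented
        return self.customer_id == other.customer_id

    # No __hash__ here, so Python sets Customer.__hash__ to None.


try:
    {Customer(1)}  # building a set needs a hash
except TypeError as exc:
    error = str(exc)
else:
    error = ""
```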

For entities, hash on the stable identifier:

def __hash__(self):
    return hash(self.id)

For value objects, hash on all fields (which is exactly what @dataclass(frozen=True) does). For mutable entities, some practitioners choose to make them unhashable intentionally—leaving __hash__ = None explicitly—to prevent subtle bugs from putting mutable objects into sets and then mutating them.

# Explicit decision: this entity is mutable and should not be hashable
@dataclass
class Order:
    id: uuid.UUID = field(default_factory=uuid.uuid4)
    status: str = "draft"
    __hash__ = None  # Explicit, not accidental

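That comment matters. A non-frozen @dataclass with the default eq=True already sets __hash__ to None, so the explicit line changes nothing at runtime; it is there to communicate intent, so nobody mistakes the unhashability for an oversight.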
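The kind of bug this guards against is easiest to see when the hash depends on a field that can change. The sketch below forces that combination with unsafe_hash=True on a hypothetical Tag class; it illustrates the hazard and is not a recommended setting.

```python
from dataclasses import dataclass


@dataclass(unsafe_hash=True)  # hashable AND mutable: the risky combination
class Tag:
    label: str


tag = Tag("urgent")
tags = {tag}

tag.label = "done"  # the hash changes while the object sits in the set

still_found = tag in tags          # the set looks under the new hash
present_when_listed = tag in list(tags)  # a linear scan still finds it
```

The object is in the set, yet membership tests say otherwise. Hashing an entity on a stable id avoids this particular trap; making it unhashable avoids the whole category.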

Putting It Together: A Well-Designed Class

Let's look at what a well-designed Python class actually looks like when you apply all these principles together. We'll use a Product example from an e-commerce domain:

from dataclasses import dataclass, field
from decimal import Decimal
import uuid


@dataclass(frozen=True)
class Money:
    """A value object representing an amount in a specific currency."""
    amount: Decimal
    currency: str

    def __post_init__(self):
        if self.amount < 0:
            raise ValueError(f"Money amount cannot be negative: {self.amount}")
        if not self.currency or len(self.currency) != 3:
            raise ValueError(f"Currency must be a 3-letter ISO code: {self.currency!r}")

    @classmethod
    def zero(cls, currency: str) -> "Money":
        return cls(Decimal("0"), currency)

    def add(self, other: "Money") -> "Money":
        if self.currency != other.currency:
            raise ValueError(
                f"Cannot add {self.currency} and {other.currency}"
            )
        return Money(self.amount + other.amount, self.currency)

    def multiply(self, factor: Decimal) -> "Money":
        return Money(self.amount * factor, self.currency)


@dataclass
class Product:
    """An entity representing a product in the catalogue."""
    id: uuid.UUID = field(default_factory=uuid.uuid4)
    name: str = ""
    price: Money = field(default_factory=lambda: Money.zero("USD"))
    _stock_count: int = field(default=0, repr=False)

    def __eq__(self, other):
        if not isinstance(other, Product):
            return NotImplemented
        return self.id == other.id

    def __hash__(self):
        return hash(self.id)

    @property
    def stock_count(self) -> int:
        return self._stock_count

    @property
    def is_available(self) -> bool:
        return self._stock_count > 0

    def restock(self, quantity: int) -> None:
        if quantity <= 0:
            raise ValueError(f"Restock quantity must be positive, got {quantity}")
        self._stock_count += quantity

    def sell(self, quantity: int) -> None:
        if quantity <= 0:
            raise ValueError(f"Sell quantity must be positive, got {quantity}")
        if quantity > self._stock_count:
            raise ValueError(
                f"Cannot sell {quantity} units; only {self._stock_count} in stock"
            )
        self._stock_count -= quantity

    @classmethod
    def create(cls, name: str, price: Money) -> "Product":
        """Factory method for creating a new product with validated inputs."""
        if not name.strip():
            raise ValueError("Product name cannot be empty")
        return cls(name=name, price=price)

Money is a frozen value object: immutable, hashable, equality by value, validated at construction, with behaviour that keeps arithmetic in one place. Product is a mutable entity: equality by identity, domain behaviour living in the class, invariants enforced through methods rather than exposed via public attributes.

Neither class knows anything about databases, HTTP, or email. They're pure domain objects—the kind you can test without mocking anything, the kind that survive infrastructure changes without modification. That's the payoff for keeping behaviour close to data while keeping domain logic separate from infrastructure concerns.
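To show what "test without mocking anything" means in practice, here is a sketch with a condensed stand-in for Product (stock logic only) and two plain test functions. The test names are illustrative, and a runner such as pytest would normally discover and call them; here they are called directly so the example runs on its own.

```python
from dataclasses import dataclass


@dataclass
class Product:
    """Condensed stand-in for the Product entity (stock logic only)."""
    name: str
    _stock_count: int = 0

    @property
    def stock_count(self) -> int:
        return self._stock_count

    def restock(self, quantity: int) -> None:
        if quantity <= 0:
            raise ValueError(f"Restock quantity must be positive, got {quantity}")
        self._stock_count += quantity

    def sell(self, quantity: int) -> None:
        if quantity <= 0:
            raise ValueError(f"Sell quantity must be positive, got {quantity}")
        if quantity > self._stock_count:
            raise ValueError(
                f"Cannot sell {quantity} units; only {self._stock_count} in stock"
            )
        self._stock_count -= quantity


def test_sell_reduces_stock():
    product = Product("Keyboard")
    product.restock(5)
    product.sell(2)
    assert product.stock_count == 3


def test_cannot_oversell():
    product = Product("Keyboard")
    product.restock(1)
    try:
        product.sell(2)
    except ValueError:
        pass
    else:
        raise AssertionError("overselling should have been rejected")
    assert product.stock_count == 1  # a rejected sale leaves state untouched


test_sell_reduces_stock()
test_cannot_oversell()
```

There is no database fixture, no patched HTTP client, and no setup beyond constructing the object.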

The design patterns catalog is full of patterns you might layer on top of this foundation. But the foundation itself—cohesive, loosely coupled classes with appropriate identity semantics and behaviour that enforces their own invariants—is what makes those patterns useful rather than merely decorative.

Next, Section 5 takes all of this and examines it through the lens of SOLID principles. That gives us a more formal language for talking about why certain class designs hold up under change and others collapse.
