How to Use Goroutines and Channels in Go
So far, you've learned how Go handles the fundamentals: syntax, types, error handling, and dependency management. Go's module system probably already feels like a breath of fresh air compared to Python's virtualization overhead. But if there's one section of this course that tends to convert Python developers into genuine Go enthusiasts, this is it. Because we're about to tackle the problem where Python's weaknesses become impossible to ignore: concurrency.
Python's concurrency story is... complicated. You've probably felt this — the moment you need to do something concurrent in Python, you're suddenly making decisions about the GIL, choosing between threading and multiprocessing, maybe learning asyncio from scratch, and wondering why a supposedly simple thing requires so much ceremony. Go's answer to this problem is so elegantly different that it feels almost unfair. Let's build up from first principles.
Concurrency vs. Parallelism: The Distinction That Actually Matters
Here's the part that trips up most people: concurrency and parallelism are not the same thing, and understanding the difference shapes how you think about Go's approach.
Concurrency is about structure — managing multiple things happening at the same time. Parallelism is about execution — actually doing multiple things simultaneously on multiple cores. You can have concurrency without parallelism (interleaving tasks on a single core) or parallelism without concurrency (multiple cores independently doing unrelated work). Go's strength is that it makes concurrency so cheap and natural that parallelism almost takes care of itself.
Now, why does Python make this so painful? It comes down to three competing models that don't play nicely together.
threading lets you spawn OS threads that could, in theory, run in parallel. In practice, the Global Interpreter Lock (the GIL) means that only one thread can execute Python bytecode at a time. Threads in Python are useful for I/O-bound work (where a thread releases the GIL while waiting on the network or disk) but useless for CPU-bound work. And even then the overhead is real: each OS thread needs its own megabyte-scale stack, and every context switch goes through the kernel.
multiprocessing sidesteps the GIL by spawning separate Python processes, each with its own interpreter and GIL. Now you can actually parallelize CPU-bound work. But each process is expensive — you're duplicating the entire Python runtime, heap, and loaded modules. You probably can't spin up thousands of them. Inter-process communication is clunky. And debugging is a nightmare.
asyncio is Python's async/await model, introduced in Python 3.4 and refined since. It's actually elegant once you understand it — cooperative multitasking on a single thread, where coroutines voluntarily yield control at await points. But it requires you to mark functions as async, use await everywhere, and it can't mix easily with regular synchronous code. The "function coloring" problem is real: once you go async, everything in your call stack needs to be async-aware. A library that isn't async-native becomes a blocker.
The result: Python has three different concurrency models that don't compose well, each with different trade-offs, and choosing between them requires understanding all three. This complexity is a known pain point for Python web developers.
Go's designers looked at this and said: what if there were just one model, and it worked for everything?
Goroutines: Not Threads, Not Coroutines — Something Better
A goroutine is a function executing concurrently with other goroutines in the same address space. That's the official definition. Here's what it actually means in practice:
- Goroutines are not OS threads. They're much lighter. A new goroutine starts with a small stack (around 2KB) that grows and shrinks dynamically as needed.
- Goroutines are not Python coroutines. You don't need special syntax, you don't need await, and functions don't need to be marked as anything special.
- The Go runtime scheduler multiplexes goroutines onto OS threads, handling the switching automatically. You don't manage this. The scheduler is clever about it: it knows when a goroutine is blocked on I/O and can switch to another one without wasting CPU cycles.
The practical upside: you can launch hundreds of thousands of goroutines in a single program without breaking a sweat. OS threads typically cap out in the thousands before memory becomes a problem — each OS thread carries a fixed stack of 1-8MB. Goroutines are dramatically cheaper, with dynamic stacks starting at 2KB.
graph TD
A[Go Program] --> B[Go Runtime Scheduler]
B --> C[OS Thread 1]
B --> D[OS Thread 2]
B --> E[OS Thread N]
C --> F[Goroutine 1]
C --> G[Goroutine 2]
D --> H[Goroutine 3]
D --> I[Goroutine 4]
E --> J[Goroutine 5...100k]
The go Keyword: Launching a Goroutine in One Word
Here's the entire syntax for launching a goroutine:
go someFunction()
That's it. One keyword. Compare this to Python's asyncio where you need to define a coroutine with async def, then await it (or use asyncio.create_task() to run it concurrently, and then manage the event loop...).
Let's see a concrete example:
package main
import (
"fmt"
"time"
)
func fetchUserData(userID int) {
// Simulate a slow database call
time.Sleep(100 * time.Millisecond)
fmt.Printf("Got data for user %d\n", userID)
}
func main() {
// Launch 5 concurrent fetches
for i := 1; i <= 5; i++ {
go fetchUserData(i)
}
// Wait for goroutines to finish (we'll do this properly in a moment)
time.Sleep(500 * time.Millisecond)
fmt.Println("All done")
}
Five database calls happening concurrently, in four lines of logic. No async/await, no executor, no event loop. You call go fetchUserData(i) and the runtime takes care of the rest.
Warning: The time.Sleep at the end is a hack, not a pattern. If the goroutines take longer than expected, your program exits before they finish. We'll fix this properly with WaitGroups shortly.
The equivalent Python asyncio version, for comparison:
import asyncio
async def fetch_user_data(user_id: int) -> None:
await asyncio.sleep(0.1) # every call in the chain must be async
print(f"Got data for user {user_id}")
async def main():
tasks = [asyncio.create_task(fetch_user_data(i)) for i in range(1, 6)]
await asyncio.gather(*tasks)
asyncio.run(main())
The Python version isn't bad — asyncio.gather is clean. But notice the ceremony: async def for every function in the chain, await at every call site, asyncio.run() to bootstrap the event loop. Go just says go.
The Problem with Goroutines Alone
Goroutines as shown above are fire-and-forget. That's fine for "send a log message asynchronously" but useless for "fetch this data and give me the result." How do you get a value back from a goroutine?
In Python's asyncio, you'd await the result. In Python's threading, you'd use queue.Queue or read a shared variable (carefully). In Go, the idiomatic answer is channels.
Channels: Typed Communication Pipes
A channel is a conduit through which goroutines can send and receive values. Think of it as a thread-safe queue with a type constraint. The proverb that captures the philosophy is "Don't communicate by sharing memory; share memory by communicating." It's sometimes attributed to Tony Hoare, and the underlying idea does trace back to his CSP (Communicating Sequential Processes) work, but the phrasing itself comes from the Go team; it appears in Effective Go and in Rob Pike's talks.
In Python, you'd typically share a variable between threads and protect it with a lock. Go encourages you to pass data through channels instead — which means you never have to worry about who "owns" the data at any given moment.
Creating a channel:
ch := make(chan string) // unbuffered channel of strings
results := make(chan int, 10) // buffered channel with capacity 10
Sending to and receiving from a channel:
ch <- "hello" // send "hello" into ch (blocks if channel is full/no receiver)
msg := <-ch // receive from ch (blocks until a value is available)
Let's redo the user data example, but this time actually collect the results:
package main
import (
"fmt"
"time"
)
func fetchUserData(userID int, ch chan<- string) {
time.Sleep(100 * time.Millisecond)
ch <- fmt.Sprintf("data for user %d", userID)
}
func main() {
ch := make(chan string, 5) // buffered so goroutines don't block each other
for i := 1; i <= 5; i++ {
go fetchUserData(i, ch)
}
// Collect all 5 results
for i := 0; i < 5; i++ {
result := <-ch
fmt.Println("Received:", result)
}
}
Now we get the actual results back. The main goroutine blocks on <-ch until each result arrives, so no hacky time.Sleep needed. The results might arrive in any order (whichever goroutine finishes first sends first), but we collect all five.
Buffered vs. Unbuffered Channels
This is where a lot of beginners get confused, so let's be precise about what's actually happening.
Unbuffered channels (make(chan T)) require both a sender and receiver to be ready simultaneously. A send blocks until someone receives, and a receive blocks until someone sends. It's a synchronization point — the two goroutines "meet" to exchange data.
Buffered channels (make(chan T, n)) have an internal queue of capacity n. A send only blocks if the buffer is full; a receive only blocks if the buffer is empty. This decouples the sender and receiver, letting them move at different paces.
graph LR
subgraph Unbuffered
A1[Goroutine A] -- "blocks until B receives" --> CH1[Channel]
CH1 -- "blocks until A sends" --> B1[Goroutine B]
end
subgraph Buffered capacity=3
A2[Goroutine A] -- "send up to 3, then block" --> CH2[Channel 📦📦📦]
CH2 -- "receive whenever ready" --> B2[Goroutine B]
end
When to use which:
- Unbuffered: when you want to synchronize two goroutines — you want to ensure the receiver has processed the message before the sender continues.
- Buffered: when the sender and receiver run at different speeds and you want to absorb bursts without blocking. A worker pool that queues up jobs is a classic use case.
Tip: If you're unsure, start with unbuffered. It forces you to be explicit about synchronization. Only add buffering when you have a specific reason — usually when you've profiled and found a bottleneck, or when your architecture genuinely needs decoupled throughput.
The select Statement: Waiting on Multiple Channels
select is one of Go's most powerful concurrency primitives. It's like a switch statement for channels — it waits for whichever channel operation is ready first and executes that case.
select {
case msg := <-ch1:
fmt.Println("Received from ch1:", msg)
case msg := <-ch2:
fmt.Println("Received from ch2:", msg)
case <-time.After(1 * time.Second):
fmt.Println("Timeout!")
}
This is enormously useful in web services. The time.After pattern shown above is how you implement timeouts without a third-party library. If neither ch1 nor ch2 has data within one second, the timeout case fires.
The Python asyncio equivalent is asyncio.wait() with return_when=asyncio.FIRST_COMPLETED, which works but requires wrapping your coroutines in tasks first. select feels cleaner because it operates on the channel operations themselves, not on futures or tasks.
A select with a default case becomes non-blocking:
select {
case msg := <-ch:
fmt.Println("Got:", msg)
default:
fmt.Println("No message available, moving on")
}
This is useful for polling patterns where you want to check a channel without blocking.
WaitGroups: Coordinating Without Channels
Sometimes you don't need to pass data between goroutines — you just need to wait for a bunch of them to finish before continuing. Channels work for this (count completions), but sync.WaitGroup is the cleaner tool:
package main
import (
"fmt"
"sync"
"time"
)
func processItem(id int, wg *sync.WaitGroup) {
defer wg.Done() // signal completion when this function returns
time.Sleep(100 * time.Millisecond)
fmt.Printf("Processed item %d\n", id)
}
func main() {
var wg sync.WaitGroup
for i := 1; i <= 10; i++ {
wg.Add(1) // increment counter before launching goroutine
go processItem(i, &wg)
}
wg.Wait() // block until counter reaches zero
fmt.Println("All items processed")
}
The pattern is always: wg.Add(1) before launching, defer wg.Done() inside the goroutine, wg.Wait() to block until everything finishes. It's repetitive, but that repetition is a feature — it makes the intent crystal clear.
Warning: Always call wg.Add(1) before go yourFunction(), not inside it. There's a race condition if you call Add inside the goroutine — the main goroutine might reach Wait() before the counter is incremented.
WaitGroup is the right tool when you're fanning out work and don't need results back. Channels are right when you need results. In practice, you often use both together.
Real Concurrency Patterns for Web Servers
Let's move from toy examples to patterns you'll actually use building web services.
Fan-Out: Parallel API Calls
A common web handler pattern: you need data from three different services before you can respond. Sequential calls = slow. Concurrent calls = fast.
func handleDashboard(w http.ResponseWriter, r *http.Request) {
type result struct {
data interface{}
err error
}
userCh := make(chan result, 1)
orderCh := make(chan result, 1)
recoCh := make(chan result, 1)
go func() { data, err := fetchUser(r.Context()); userCh <- result{data, err} }()
go func() { data, err := fetchOrders(r.Context()); orderCh <- result{data, err} }()
go func() { data, err := fetchRecommendations(r.Context()); recoCh <- result{data, err} }()
user := <-userCh
orders := <-orderCh
recos := <-recoCh
// all three completed before we proceed, but they ran concurrently
// (real code should check user.err, orders.err, and recos.err here)
renderDashboard(w, user.data, orders.data, recos.data)
}
If each service call takes 100ms, sequential takes 300ms. This concurrent version takes ~100ms (the slowest one). For a web handler, that's the difference between an acceptable response time and one that frustrates users.
Worker Pool: Bounded Concurrency
Fan-out is great when you have a few tasks. But if you need to process 10,000 items, you don't want 10,000 goroutines — you want a controlled pool of workers:
package main
import (
"fmt"
"sync"
)
// processJob stands in for real work; here it just doubles the job ID
func processJob(job int) int {
return job * 2
}
func workerPool(jobs <-chan int, results chan<- int, wg *sync.WaitGroup) {
defer wg.Done()
for job := range jobs {
results <- processJob(job) // do the work, send the result
}
}
func main() {
const numWorkers = 10
jobs := make(chan int, 100)
results := make(chan int, 100)
var wg sync.WaitGroup
for w := 0; w < numWorkers; w++ {
wg.Add(1)
go workerPool(jobs, results, &wg)
}
// Send jobs
go func() {
for i := 0; i < 1000; i++ {
jobs <- i
}
close(jobs) // closing the channel signals "no more jobs"
}()
// Collect results in another goroutine
go func() {
wg.Wait()
close(results)
}()
for result := range results {
fmt.Println(result)
}
}
The workers read from jobs until it's closed (range over a channel receives until it's closed), process each job, and send results. This is one of the most common patterns in Go backends. GetStream's concurrency guide covers this pattern in detail.
Timeouts with Context
In a web server, every handler should have a timeout. A downstream service hanging shouldn't hang your entire server. Go's context package is the standard solution:
func handleRequest(w http.ResponseWriter, r *http.Request) {
// Create a context that cancels after 2 seconds
ctx, cancel := context.WithTimeout(r.Context(), 2*time.Second)
defer cancel() // always cancel to release resources
resultCh := make(chan string, 1)
go func() {
result, err := slowDatabaseQuery(ctx)
if err != nil {
resultCh <- ""
return
}
resultCh <- result
}()
select {
case result := <-resultCh:
fmt.Fprintln(w, result)
case <-ctx.Done():
http.Error(w, "Request timeout", http.StatusGatewayTimeout)
}
}
The ctx.Done() channel closes when the context is cancelled or times out. The select fires whichever happens first: we got a result, or we ran out of time. Every network call in your Go web service should accept and respect a context.Context — it's how cancellation propagates through your entire call stack without asyncio's function coloring problem.
Remember: defer cancel() is not optional. Context functions allocate resources (timers, goroutines) that are only released when you call cancel(). Forgetting it is a resource leak. go vet (its lostcancel check) will catch this, but understand why it matters.
Race Conditions: How to Cause Them and How to Catch Them
Goroutines running in parallel share memory. If two goroutines read and write the same variable without synchronization, you have a race condition — behavior is undefined and bugs are non-deterministic (the worst kind).
Here's a classic example:
// WARNING: This code has a race condition
var counter int
func increment(wg *sync.WaitGroup) {
defer wg.Done()
counter++ // read-modify-write: NOT atomic!
}
func main() {
var wg sync.WaitGroup
for i := 0; i < 1000; i++ {
wg.Add(1)
go increment(&wg)
}
wg.Wait()
fmt.Println(counter) // typically prints less than 1000, nondeterministically
}
counter++ looks atomic but isn't — it's three operations: read, increment, write. Two goroutines can read the same value, both increment it, and both write back the same result, losing one increment.
Go ships with a race detector built in. Just add -race to any build or test command:
go run -race main.go
go test -race ./...
The race detector instruments your code at runtime and reports data races with a stack trace showing exactly which goroutines conflicted. It's not a static analysis tool — it detects actual races during execution — so your test coverage matters. But it's remarkably good at catching bugs that would otherwise appear as mysterious production incidents.
==================
WARNING: DATA RACE
Write at 0x00c000014110 by goroutine 7:
main.increment(...)
/tmp/main.go:9
Read at 0x00c000014110 by goroutine 8:
main.increment(...)
/tmp/main.go:9
==================
Run -race in your CI pipeline. Always.
Mutex and Sync Primitives: When Channels Aren't Right
Channels are idiomatic for communicating between goroutines. But sometimes you genuinely just need to protect shared state — a cache, a counter, a map. For that, sync.Mutex is the right tool:
type SafeCounter struct {
mu sync.Mutex
value int
}
func (c *SafeCounter) Increment() {
c.mu.Lock()
defer c.mu.Unlock()
c.value++
}
func (c *SafeCounter) Value() int {
c.mu.Lock()
defer c.mu.Unlock()
return c.value
}
sync.RWMutex is a performance optimization for read-heavy workloads — multiple goroutines can hold a read lock simultaneously, but a write lock is exclusive:
type Cache struct {
mu sync.RWMutex
items map[string]string
}
func (c *Cache) Get(key string) (string, bool) {
c.mu.RLock() // multiple readers OK
defer c.mu.RUnlock()
v, ok := c.items[key]
return v, ok
}
func (c *Cache) Set(key, value string) {
c.mu.Lock() // exclusive write
defer c.mu.Unlock()
c.items[key] = value
}
sync.Map is also available for concurrent map access without manual locking, though it comes with trade-offs (no generic type safety, optimized for specific access patterns).
The practical heuristic: use channels when goroutines are communicating (passing data, signaling events). Use mutexes when goroutines are sharing state (a cache that many handlers read and occasionally update). Both are valid; the mistake is using one when the other fits better.
Context: Cancellation and Deadlines Everywhere
We touched on context.WithTimeout above, but the context package deserves its own moment. It's how Go propagates cancellation, deadlines, and request-scoped values through your entire call stack.
Every net/http request comes with a context: r.Context(). When the client disconnects, that context is automatically cancelled. If you pass this context through your application — to database queries, to downstream HTTP calls, to any blocking operation — everything cancels cleanly when the client goes away. No dangling goroutines finishing work for a user who's already gone.
// The context flows from the HTTP handler down through every call
func handleSearch(w http.ResponseWriter, r *http.Request) {
ctx := r.Context()
results, err := db.QueryContext(ctx, "SELECT ...") // cancels if client disconnects
if err != nil {
http.Error(w, err.Error(), 500)
return
}
enriched, err := enrichResults(ctx, results) // same context, same cancellation
// ...
}
This is Go's answer to a real problem: in Python's asyncio, cancellation is possible but requires careful handling of CancelledError at every level. Go's context-based cancellation is explicit but composable — every function that can be cancelled takes a context.Context as its first argument (by convention), and the cancellation signal flows through naturally.
graph TD
A[HTTP Request arrives] --> B[r.Context created]
B --> C{context.WithTimeout}
C --> D[Database query ctx]
C --> E[External API call ctx]
C --> F[Cache lookup ctx]
G[Client disconnects OR timeout] --> H[ctx.Done closes]
H --> D
H --> E
H --> F
Putting It Together: Why This Changes Everything for Web Servers
Let's step back and appreciate what Go's concurrency model means for web development specifically.
Every incoming HTTP request in a Go server runs in its own goroutine. The standard library's net/http does this for you automatically. Because goroutines are cheap, a Go server can handle 50,000 simultaneous connections without breaking a sweat — each one gets its own goroutine, and the runtime schedules them efficiently across your CPU cores.
Compare this to Python's options: threaded servers cap out on OS thread limits and GIL contention; async servers (like uvicorn with FastAPI) are excellent but require your entire codebase to be async-aware; multiprocessing scales across cores but at significant memory cost per process.
Go sidesteps all of this. The programming model is simple — write normal sequential code, use go for concurrency, use channels to communicate. The runtime handles the hard parts. You don't need three different concurrency models; you have one model that composes well and performs well at scale.
Go's concurrency is genuinely one of its strongest differentiators from Python — not just in raw performance, but in how much simpler the mental model is once you understand goroutines and channels. The first time you replace a complex asyncio pipeline with a handful of goroutines and a channel, and the code is both faster and easier to read, Go tends to click in a very satisfying way.
The next section puts these primitives to work in net/http, where you'll see how Go's standard library web server uses this concurrency model to handle real requests.