How the Internet Actually Works: Networks, Protocols, and the Web Under the Hood

Section 9 of 17

How HTTPS and TLS Encryption Secure Data on Public Networks

TLS and HTTPS: Keeping Secrets on a Public Network

Here's something uncomfortable: HTTP is completely transparent. Every request and response travels across the network as plain text. Every router your data passes through — your home router, your ISP's routers, the backbone routers — can read every word. Sitting in a coffee shop on public Wi-Fi? Anyone on the same network can intercept your HTTP traffic with freely available tools.

This isn't some theoretical vulnerability. In the early days of the web, attackers routinely captured login credentials and credit card numbers from unencrypted HTTP connections. Even now, an attacker with access to network traffic — through a compromised router, a rogue Wi-Fi access point, or ISP-level eavesdropping — can silently observe everything you do. Some corporate networks and government firewalls still intercept HTTP traffic for monitoring purposes, which is exactly why HTTPS became non-negotiable for anything security-sensitive.

This is the problem TLS (Transport Layer Security) solves. When you add TLS to HTTP, you get HTTPS — and suddenly your data is encrypted end-to-end between your browser and the server.

TLS secures a network connection in three fundamental ways:

Encryption: Data is encrypted in transit. Even if someone intercepts it, they see gibberish.
Authentication: The client can verify it's actually talking to the real server (not an impostor). This happens via digital certificates.
Integrity: Any attempt to tamper with the data in transit is detectable.

Think of it like sending a letter in a locked box. Encryption ensures the contents are unreadable. Authentication ensures the box actually came from the claimed sender. Integrity ensures that if someone opens and rewrites the letter, you'll know it's been tampered with.

The Brilliant Problem: Key Exchange on a Public Channel

Here's the cryptographic puzzle at the heart of TLS: how do you establish a secret encryption key with a server when you've never communicated with it before, and everything you send is visible to potential eavesdroppers?

It seems impossible. Send the key? The eavesdropper sees it. Server sends the key? Same problem. For centuries, cryptography relied on the assumption that two parties had to share a secret beforehand — otherwise how could you establish one without being overheard?

Then came public-key cryptography, one of the most beautiful ideas in computer science. It dates from the 1970s: Whitfield Diffie and Martin Hellman published the idea in 1976, and researchers at the British government's GCHQ had independently, and secretly, arrived at it a few years earlier. The core insight is surprisingly simple:

Imagine a padlock. You can give away thousands of copies of the opened padlock (the public key) — put it on a billboard, publish it in the newspaper, email it to everyone. But you keep the only key that opens it (the private key). Anyone can lock something inside a box using your padlock, but only you can unlock it.

In cryptographic terms: anything encrypted with the public key can only be decrypted with the private key. So if a server shares its public key with the world, any client can encrypt a message that only the server can read. Even though the public key is visible to eavesdroppers, they can't use it to decrypt messages — only to encrypt them. This completely breaks the ancient assumption that secrecy requires a pre-shared secret.
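The padlock idea can be sketched with textbook RSA on deliberately tiny numbers. This is an illustration only: the primes, the exponent, and the helper names `encrypt` and `decrypt` are assumptions chosen for readability, and real deployments use keys of 2048 bits or more together with padding schemes that are omitted here.

```python
# Toy RSA with tiny primes: for illustration only, trivially breakable.
p, q = 61, 53
n = p * q                    # modulus, shared by both keys (3233)
phi = (p - 1) * (q - 1)      # 3120
e = 17                       # public exponent, coprime with phi
d = pow(e, -1, phi)          # private exponent: modular inverse of e (Python 3.8+)

public_key = (e, n)          # the open padlock: safe to publish anywhere
private_key = (d, n)         # the only key that opens it: never leaves the server


def encrypt(message, key):
    """Lock a small integer message with the public key."""
    exponent, modulus = key
    return pow(message, exponent, modulus)


def decrypt(ciphertext, key):
    """Unlock a ciphertext with the private key."""
    exponent, modulus = key
    return pow(ciphertext, exponent, modulus)


message = 65                                   # must be smaller than n
ciphertext = encrypt(message, public_key)      # anyone can do this
recovered = decrypt(ciphertext, private_key)   # only the private-key holder can
print(ciphertext, recovered)
```

An eavesdropper sees `public_key` and `ciphertext`, but recovering `d` requires factoring `n`, which is easy for 3233 and believed infeasible at real key sizes.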

Why not use public-key encryption for everything? Because it's mathematically expensive. Encrypting and decrypting large amounts of data is slow. TLS uses public-key cryptography to solve the key exchange problem, then switches to faster symmetric encryption (where both sides use the same secret key) for the actual data transfer. You get the best of both worlds: the ability to establish shared secrets with strangers, plus the speed needed for practical data transfer.

Common Misconception: "Public Key Cryptography Means No Eavesdropper Can Read Anything"

This is only true for the content of messages. The eavesdropper sees the ciphertext (encrypted gibberish) but can't decrypt it. However, metadata is still visible: the domain being accessed (through DNS lookups and the server name sent in the ClientHello), the size and timing of requests, and the IP addresses involved. A sophisticated attacker can sometimes infer what you're doing from these "side channels" alone. TLS hides what you say, not the fact that you are talking: if an eavesdropper can see that you're connecting to a bank's IP address, they know you're banking, even if they can't see which account you access.

The TLS Handshake (Simplified)

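The TLS handshake happens after the TCP handshake is complete and before any HTTP data is exchanged. It's a carefully choreographed exchange in which the server proves its identity and both sides agree on encryption parameters. The walkthrough below shows the classic RSA key exchange used in TLS 1.2 and earlier, because it maps directly onto the padlock analogy. TLS 1.3 (the current standard, finalized in 2018) keeps the same overall shape but replaces the encrypted pre-master secret with an ephemeral Diffie-Hellman exchange and completes in a single round trip: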

```mermaid
sequenceDiagram
    participant C as Client (Browser)
    participant S as Server
    C->>S: ClientHello (TLS version, supported cipher suites, random nonce)
    S->>C: ServerHello (chosen cipher, random nonce, Certificate)
    S->>C: ServerHelloDone
    C->>C: Verify certificate against trusted CA list
    C->>S: Encrypted pre-master secret (using server's public key)
    Note over C,S: Both derive same session key from shared secrets
    C->>S: Finished (encrypted with session key)
    S->>C: Finished (encrypted with session key)
    Note over C,S: TLS established, HTTP can now flow encrypted
```

Step 1 — ClientHello: The browser initiates the handshake by sending:

The TLS version(s) it supports (e.g., TLS 1.3, TLS 1.2)
A list of cipher suites — combinations of encryption algorithms it can use (e.g., TLS_AES_256_GCM_SHA384 means AES-256 encryption with SHA-384 hashing)
A random number (nonce) unique to this connection

That random number matters. It ensures that even if you visit the same server a hundred times, the encryption keys are different each time. This prevents replay attacks and statistical attacks based on repeated patterns.
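As a rough sketch of what goes into a ClientHello, the snippet below gathers the same three ingredients using Python's standard library. The function name `build_client_hello` and the dictionary layout are hypothetical; the real message is a binary structure that the TLS library builds internally and does not expose this way.

```python
import secrets
import ssl


def build_client_hello():
    """Collect the fields a ClientHello advertises (illustrative structure only)."""
    context = ssl.create_default_context()
    # Protocol versions this Python/OpenSSL build can speak.
    versions = [
        name
        for name, available in (
            ("TLSv1.3", ssl.HAS_TLSv1_3),
            ("TLSv1.2", ssl.HAS_TLSv1_2),
        )
        if available
    ]
    return {
        "versions": versions,
        # Cipher suites enabled by the default client configuration.
        "cipher_suites": [cipher["name"] for cipher in context.get_ciphers()],
        # Fresh 32-byte random value, new for every connection.
        "random": secrets.token_bytes(32),
    }


first = build_client_hello()
second = build_client_hello()
print(first["versions"], len(first["cipher_suites"]))
print(first["random"] != second["random"])  # a new nonce each time
```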

Step 2 — ServerHello + Certificate: The server responds with:

Its chosen cipher suite (picked from the suites both sides support, usually following the server's own preference order)
Its own random number
The digital certificate containing the server's public key and identity

The certificate is the cryptographic proof that "this public key belongs to example.com" — but how do we know the server isn't lying? That's where Certificate Authorities come in.

Step 3 — Certificate Verification: The browser examines the server's certificate, which contains:

The server's public key (the padlock)
The server's domain name(s) it's valid for (must match where you're trying to go)
An expiration date (typically 1 year or less)
A digital signature from a trusted Certificate Authority (CA)

The browser has a built-in list of trusted Certificate Authorities (Mozilla Foundation, Google, Apple, and Microsoft each maintain these lists, updated with each OS/browser release). The browser uses the CA's public key to verify that the certificate signature is genuine. If the signature checks out, the domain matches, and the expiration date is in the future, the browser trusts that it's talking to the legitimate server.

What if the certificate is invalid or expired? Modern browsers flat-out refuse and show a scary warning page. You can force it (browsers usually provide an "Advanced" option), but this is a massive red flag. A valid HTTPS connection is strong proof (though not absolute proof) that you're talking to who you think you are.
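The same checks can be observed from Python, whose `ssl` module performs them during the handshake. This is a minimal sketch: the function name `fetch_verified_certificate` is an assumption, and the commented-out call needs network access and a hostname of your choosing.

```python
import socket
import ssl


def fetch_verified_certificate(hostname, port=443):
    """Connect, let the ssl module verify the certificate chain and hostname,
    and return the parsed certificate.

    Raises ssl.SSLCertVerificationError if the chain is untrusted,
    the certificate has expired, or the hostname does not match.
    """
    context = ssl.create_default_context()  # loads the system's trusted CA list
    with socket.create_connection((hostname, port), timeout=5) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()


# The default context already enforces the checks described above.
context = ssl.create_default_context()
print(context.verify_mode == ssl.CERT_REQUIRED)  # a signature chain is required
print(context.check_hostname)                    # the domain must match

# Example (requires network access):
# cert = fetch_verified_certificate("example.com")
# print(cert["subjectAltName"], cert["notAfter"])
```

Disabling either setting is the programmatic equivalent of clicking through the browser warning, and deserves the same suspicion.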

Step 4 — Key Exchange: The browser:

Generates a pre-master secret (a random value, typically 48 bytes)
Encrypts it with the server's public key (the padlock)
Sends the encrypted pre-master secret to the server

Only the server (with its private key) can decrypt this. Now both client and server have:

The pre-master secret
Both random numbers from steps 1-2

Using these three values (combined via a key derivation function), they generate the same symmetric session key. Here's the magic: the session key was derived locally on both sides, but only the server ever had the ability to decrypt the pre-master secret. An eavesdropper who saw the encrypted pre-master secret can't decrypt it without the server's private key.
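The derivation step can be sketched as follows. The code is modeled on the HMAC-SHA256 expansion that TLS 1.2 uses to turn the pre-master secret and the two random values into a 48-byte master secret; treat it as an illustration rather than a conformant implementation, since real TLS goes on to expand the master secret into separate encryption and MAC keys, and TLS 1.3 uses a different, HKDF-based schedule. The function names are assumptions.

```python
import hashlib
import hmac
import secrets


def p_sha256(secret, seed, length):
    """Expand `secret` and `seed` into `length` pseudorandom bytes via chained HMACs."""
    output = b""
    a = seed  # A(0)
    while len(output) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()               # A(i)
        output += hmac.new(secret, a + seed, hashlib.sha256).digest()  # next block
    return output[:length]


def derive_master_secret(pre_master, client_random, server_random):
    """Combine the three shared values into one 48-byte secret."""
    seed = b"master secret" + client_random + server_random
    return p_sha256(pre_master, seed, 48)


# Values exchanged during the handshake (randomly generated here).
client_random = secrets.token_bytes(32)
server_random = secrets.token_bytes(32)
pre_master = secrets.token_bytes(48)

# Each side runs the same computation locally on the same inputs.
client_view = derive_master_secret(pre_master, client_random, server_random)
server_view = derive_master_secret(pre_master, client_random, server_random)
print(client_view == server_view)

# A new connection has new random values, so the keys differ even if
# the pre-master secret were somehow reused.
other = derive_master_secret(pre_master, secrets.token_bytes(32), server_random)
print(other != client_view)
```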

Step 5 — Finished: Both sides confirm that:

They derived the same session key correctly
The handshake wasn't tampered with

Each side sends a "Finished" message protected with the newly derived keys. It contains a keyed hash of every handshake message that side sent and received. If each side's value matches what the other computed, the handshake succeeds. This is an integrity check: if an attacker modified any handshake message, or the two sides derived different keys, the values would not match and the connection would be aborted immediately.
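The Finished check can be approximated with an HMAC over a hash of the handshake messages. This is a simplified sketch, not the exact TLS computation: the message contents, the placeholder key, and the helper name `finished_value` are all assumptions made for the example.

```python
import hashlib
import hmac


def finished_value(session_key, handshake_messages, label):
    """Keyed hash over everything this side saw during the handshake."""
    transcript_hash = hashlib.sha256(b"".join(handshake_messages)).digest()
    return hmac.new(session_key, label + transcript_hash, hashlib.sha256).digest()


session_key = bytes(range(32))  # placeholder for the derived key

# Honest case: both sides saw identical handshake messages.
client_saw = [b"ClientHello:suites=STRONG,WEAK", b"ServerHello:suite=STRONG"]
server_saw = list(client_saw)
sent = finished_value(session_key, client_saw, b"client finished")
expected = finished_value(session_key, server_saw, b"client finished")
print(hmac.compare_digest(sent, expected))

# Attack: someone rewrote the ClientHello in transit to remove the strong
# suite, so the server's view of the handshake differs from the client's.
server_saw_tampered = [b"ClientHello:suites=WEAK", b"ServerHello:suite=WEAK"]
expected_tampered = finished_value(
    session_key, server_saw_tampered, b"client finished"
)
print(hmac.compare_digest(sent, expected_tampered))
```

The second comparison fails, which is how a downgrade attempt on the handshake itself gets caught.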

All subsequent HTTP data is encrypted with the symmetric session key, most commonly using AES (Advanced Encryption Standard). Symmetric encryption is dramatically faster than public-key encryption, which is why it handles the bulk data. The handshake itself costs two round trips in TLS 1.2 and one in TLS 1.3; after that, HTTP runs at nearly the speed of an unencrypted connection.

Certificate Authorities: The Chain of Trust

The whole TLS system depends on Certificate Authorities being trustworthy. A CA is an organization (like DigiCert, Comodo, GlobalSign, or Let's Encrypt) that:

Verifies that a party actually owns the domain they claim (via DNS records, email checks, or other verification methods)
Signs a certificate with the CA's own private key

Since browsers trust the CA's public key (which ships with the browser or OS), and the CA signed the server's certificate, the browser can transitively trust the server.

This creates a chain of trust:

Browser trusts CA → CA verifies server identity → CA signs server's certificate → Browser trusts server
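The signing and verifying steps in that chain can be sketched with a toy RSA signature. Everything here is an assumption for illustration: the key is built from two small well-known primes and is far too weak for real use, the certificate is a plain byte string rather than an X.509 structure, and the names `ca_sign` and `browser_verify` are hypothetical.

```python
import hashlib

# Toy CA key pair built from two Mersenne primes (insecure, illustration only).
P, Q = 2**61 - 1, 2**89 - 1
CA_N = P * Q
CA_E = 65537                              # public: ships with the browser
CA_D = pow(CA_E, -1, (P - 1) * (Q - 1))   # private: known only to the CA


def digest(certificate):
    """Hash the certificate contents down to a number smaller than the modulus."""
    return int.from_bytes(hashlib.sha256(certificate).digest(), "big") % CA_N


def ca_sign(certificate):
    """CA side: sign the certificate hash with the CA's private key."""
    return pow(digest(certificate), CA_D, CA_N)


def browser_verify(certificate, signature):
    """Browser side: check the signature using only the CA's public key."""
    return pow(signature, CA_E, CA_N) == digest(certificate)


certificate = b"subject=example.com; public_key=..."
signature = ca_sign(certificate)
print(browser_verify(certificate, signature))

# An impostor who swaps in different contents cannot reuse the signature.
forged = b"subject=example.com; public_key=attacker..."
print(browser_verify(forged, signature))
```

The browser never needs the CA's private key, which is why a stolen CA key (as opposed to a stolen certificate) is so damaging: it lets the thief run `ca_sign` for any domain.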

The system has failure modes. If a CA is compromised or acts maliciously, it can issue fraudulent certificates for any domain. In 2011, the Dutch CA DigiNotar was hacked, and attackers issued hundreds of fraudulent certificates, including ones for Google domains and other major sites. Browsers accepted these certificates as valid, and they were used to mount man-in-the-middle attacks on users in Iran. After this catastrophe, DigiNotar was removed from all major browsers' trust stores and filed for bankruptcy soon after.

More recently, in 2015, an intermediate CA operating under CNNIC, China's state-run internet registry, issued unauthorized certificates for several Google domains, and Google and Mozilla responded by restricting their trust in CNNIC. These incidents highlight that the public key infrastructure (PKI) is only as strong as its weakest CA, and browsers trust hundreds of root and intermediate CAs.

Let's Encrypt: Democratizing HTTPS

Let's Encrypt deserves special mention because it transformed web security. Before 2015, getting a TLS certificate was expensive (typically $10–$300/year) and required manual processes — fill out forms, wait for human verification, manage renewals yourself. This meant HTTPS was mostly used for banking and e-commerce; small sites often stuck with HTTP.

Let's Encrypt, a non-profit CA launched in 2015, changed everything:

Free certificates for any domain you can verify ownership of
Automated verification via DNS records or HTTP challenges (no human review needed)
Automatic renewal via scripts like certbot — your server can renew before expiration
Short certificate lifetimes (90 days) to limit damage if a key is compromised
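The renewal logic behind that automation is simple enough to sketch. The 30-day threshold below is an assumption (it matches the commonly documented certbot default), and `needs_renewal` is a hypothetical helper, not part of any tool.

```python
from datetime import datetime, timedelta, timezone

CERT_LIFETIME = timedelta(days=90)   # Let's Encrypt certificate lifetime
RENEW_BEFORE = timedelta(days=30)    # assumed threshold for renewing early


def needs_renewal(not_after, now):
    """Return True once the certificate is within the renewal window."""
    return not_after - now <= RENEW_BEFORE


issued = datetime(2024, 1, 1, tzinfo=timezone.utc)
not_after = issued + CERT_LIFETIME

# Ten days in, 80 days remain: nothing to do.
print(needs_renewal(not_after, issued + timedelta(days=10)))
# Sixty-one days in, 29 days remain: time to renew.
print(needs_renewal(not_after, issued + timedelta(days=61)))
```

Running a check like this on a daily schedule leaves a month of slack for retries before the old certificate actually expires.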

The impact has been enormous. HTTPS adoption skyrocketed from around 50% of web traffic in 2015 to over 85% today. If you've deployed a web app recently, there's a good chance you've used Let's Encrypt. Its success shows that removing friction can drive security adoption: once HTTPS became free and automatic, people used it.

How HTTPS Fails (and Why You Still Need It)

It's important to understand that HTTPS has limitations:

Server compromise: If the server itself is hacked, HTTPS doesn't help. The attacker can read decrypted data inside the server, or inject malicious code into responses.
Weak passwords: HTTPS encrypts the channel, not the credential. If you use a weak password and the server's password database is stolen, the attacker can crack the stored hash offline.
Certificate pinning edge cases: Pinning a site to specific keys can backfire. A misconfigured HPKP (HTTP Public Key Pinning) policy could lock users out of a site for as long as the pin lasted, and browsers have since deprecated and removed HPKP because of these risks.
Metadata leakage: The domain, IP address, and approximate size of requests and responses are still visible. You can infer what someone is doing from timing and size alone, even if you can't see the content. VPNs and Tor address this, not HTTPS.

None of these are reasons to avoid HTTPS — they're reasons to use HTTPS plus other security practices. HTTPS is the foundation; authentication, rate limiting, encryption at rest, and secure code are the rest of the house.

HTTPS in Practice for Developers

When you deploy a Django app, you typically:

Obtain a TLS certificate: Use Let's Encrypt (free) via certbot, your cloud provider's certificate service, or a CDN like Cloudflare (which also offers free SSL).
Configure your web server to terminate TLS: Set up Nginx, Apache, or your platform's load balancer with:
- The certificate file
- The private key file
- Strong cipher suites (avoid deprecated ones like RC4 or DES)
The web server handles TLS encryption and decryption, forwarding plain HTTP to your Django app.
Redirect HTTP to HTTPS: Any request to http://yoursite.com should return a 301 redirect to https://yoursite.com. This ensures users always get the secure version.
Set security headers:
```
Strict-Transport-Security: max-age=31536000; includeSubDomains
```
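This header tells browsers: "For the next year, only connect to this domain via HTTPS. Never try HTTP, even if the user types it." Once a browser has seen the header, this protects against downgrade attacks such as SSL stripping, where an attacker on the network tries to keep the victim on plain HTTP.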

The beauty of TLS termination at the web server is that your Django code doesn't need to know anything about encryption — it just speaks plain HTTP with Nginx, and Nginx handles the TLS layer with clients. Your Django app can focus on business logic while the web server handles cryptography.
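On the Django side, the redirect and header steps usually come down to a handful of settings. The fragment below is a sketch of a production `settings.py`, assuming Nginx terminates TLS and forwards the original scheme in an `X-Forwarded-Proto` header; that proxy behavior is an assumption about your setup and must actually be configured, otherwise the proxy header setting is unsafe.

```python
# settings.py (production fragment)

# Redirect any plain-HTTP request to HTTPS.
SECURE_SSL_REDIRECT = True

# Trust the proxy's header to learn whether the original request used HTTPS.
# Only safe when the proxy always sets (or strips) this header itself.
SECURE_PROXY_SSL_HEADER = ("HTTP_X_FORWARDED_PROTO", "https")

# Send Strict-Transport-Security with a one-year max-age, covering subdomains.
SECURE_HSTS_SECONDS = 31536000
SECURE_HSTS_INCLUDE_SUBDOMAINS = True

# Never send session or CSRF cookies over plain HTTP.
SESSION_COOKIE_SECURE = True
CSRF_COOKIE_SECURE = True
```

It is common to start with a small `SECURE_HSTS_SECONDS` value and raise it once everything works, because browsers remember the policy for the full duration.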

In development, you might not use HTTPS (or use self-signed certificates), but in production, it's non-negotiable. Every major cloud provider makes setting up HTTPS trivial — use it.
