How to Write Narration for Family Videos
Writing Narration That Earns Its Place
Now that you've built your radio edit — now that you've listened deeply to what your family actually says and learned to trust those voices — you're ready to ask a harder question: do you need to add more words on top of them?
There's a particular kind of family video that announces its own failure in the first thirty seconds. The footage is fine — maybe even beautiful. Grandma at the table, the kids running in the backyard, the cake with its candles. And then a voice begins. It says, "This is a video of my grandmother's eightieth birthday party. We gathered together as a family to celebrate this special occasion." The viewer watches Grandma blow out candles while a voice tells them they are watching Grandma blow out candles. Nobody needed those words. Nobody wanted them.
This is especially dangerous after you've done the radio edit work, because you've just spent an entire section learning to listen — to hear what your family actually says and let those words carry the film. Now comes the temptation to add more voices, more explanation, more of your commentary layered over theirs. This section is about resisting that temptation when it's wrong and embracing it when it's right — understanding when spoken narration genuinely serves your film, how to write it so it sounds like a human being and not a press release, and how to record it so it sounds like it belongs in the same room as your images. We'll also look at title cards, which can often do the work of narration with a tenth of the words and none of the recording anxiety.
When Narration Actually Matters
There's a profound difference between narration that says "here is my father working in his garden" (pure description, completely redundant if we're watching him garden) and narration that says "he started this garden the year he retired, and for the first few months, none of us understood why he spent so many hours out there. Now I think I do." The second version adds a layer of meaning the image cannot supply on its own.
Good narration does one of three things. The first is orientation. It anchors the viewer in time and place — not with the obvious facts necessarily, but with details that matter. "This was 1987" is fine. "This was 1987, the year everything changed" does more work because it signals that the year itself carries weight. "We were living in the apartment on Maple Street, the one she always said was too small" orients to place while embedding emotion. Orientation that serves the story is worth the words it takes.
The second is context. This is where narration reveals something the image can't — the history behind a moment, the cost of a decision, the reason someone's face looks the way it does. "He never learned to swim as a child" explains why he's anxious watching his grandkids in the pool. "She'd been saving for twenty years" transforms footage of a house purchase from a simple transaction into an achievement. Context that deepens what we're seeing earns its place.
The third is transition. When you're moving between different segments of a film — from your grandmother's childhood to her marriage, say, or from the road trip's departure to the moment three states later when everything went sideways — narration can bridge that gap economically and gracefully. A simple line like "she was twenty-three when she met him" or "we didn't know what was coming" can carry you from one chapter to the next without a clunky edit or a long stretch of awkward silence.
If your narration doesn't fall into one of these three categories — orienting, contextualizing, or transitioning — it's probably not earning its place. Cut it and see if the film breathes better. Nine times out of ten, it will.
The harder truth is that the impulse to write narration often signals a deeper problem: a scene that doesn't yet work, or an edit that hasn't found its logic. When a filmmaker reaches for the voiceover pen, it's worth pausing and asking whether more narration is actually the solution, or whether the scene needs restructuring, a different interview clip, or just the courage to let the images stand alone. The best narration in the world can't fix a fundamentally broken scene. But removing unnecessary narration can reveal that a scene was actually working fine all along.
The Cardinal Sin: Saying What the Viewer Can Already See
The shorthand for this failure is "say cow, see cow." If your narration says "my daughter loves to dance," and we're watching your daughter dance, you've wasted a sentence. The viewer already knows. You've told them something they can see, which means you've either underestimated their intelligence or you haven't thought hard enough about what your narration should actually be doing.
The rule sounds simple. In practice, it's one of the hardest habits to break, because the instinct to describe what we're showing runs deep. It feels clarifying. It feels like you're being helpful. But it's actually the opposite of helpful — it's a form of distrust, an implicit message to the viewer that the image isn't enough, that they need a caption.
What narration should do instead is work in counterpoint to the image. The image shows; the narration tells something different — something the image can't show. Together, the two pieces create something neither could create alone.
Here's the principle in practice. Imagine you're editing a sequence of your father teaching your son to fish on a lake in Montana. The images are beautiful: the early morning light, the patient casting lesson, the grandfather's weathered hands guiding the boy's small ones.
Weak narration: "My dad taught my son to fish on a trip to Montana. He was patient and kind with him."
The images already show us patient and kind. The images already show us Montana. This narration adds nothing.
Strong narration: "My father never taught me to fish. He was working two jobs through most of my childhood, and there wasn't time. Watching him with my son, I thought: this is where it gets repaired."
Now the images mean more than they did before. The narration revealed something the images couldn't show — the absence in the past that makes this present moment resonate. The viewer is watching a fishing lesson and understanding it as a form of healing. That's counterpoint.
Writing for the Ear, Not the Eye
Most people learned to write for readers, which means they learned to write for the eye. Written sentences can be long and complex because the reader can slow down, re-read, circle back. The eye is patient and non-linear. But narration goes into the ear, and the ear has no pause button. Whatever the narrator says is gone the moment it's spoken, replaced by the next sentence. The listener cannot circle back.
This changes everything about how sentences should be constructed.
The cardinal rule: be conversational. Use contractions. Keep sentences short. Avoid opening with elaborate phrases — you know the kind: "Having been raised in a small town in western Nebraska during the Depression years, my grandmother developed a resilience that would..." — because people don't speak that way, and ears immediately sense the difference between writing that was meant to be read and writing that was meant to be heard.
The test is brutally simple: read it aloud. If you stumble, rewrite it. If you have to take an unnatural breath in the middle of a sentence to get through it, the sentence is too long. If you find yourself speaking in a rhythm that doesn't sound like how you'd actually tell this story to a friend, you haven't finished writing yet.
Here are the practical principles:
Short sentences. Write short sentences. Like this. Then a slightly longer one to give the rhythm some variation, because monotony is its own problem. Then short again. The ear likes this.
Active voice. "The Germans signed the Armistice" is more vivid and direct than "The Armistice was signed by the Germans." In family documentary terms: "She raised five kids alone" hits harder than "Five children were raised by her." Active voice creates momentum — it implies agency and motion, which is exactly what screen storytelling needs.
Contractions. Write "she didn't" not "she did not." Write "we're" not "we are." These are not grammatical shortcuts — they are the sound of a human being speaking. Without them, narration sounds stiff and institutional, like a corporate training video.
No jargon, no elevated vocabulary. This is not the place to demonstrate your sophistication. The goal is clarity and emotional connection, not the appearance of effort. If a simple word works, use it. If a simpler sentence works, use that.
Pose questions. Questions pull listeners forward. "What was she thinking that morning? I've never been able to ask her." The viewer leans in. A question creates tension that demands resolution.
Here's a practical exercise worth trying. Take a paragraph of narration you've written and read it aloud into a voice memo on your phone. Don't listen to it immediately. Come back to it in an hour. When you play it back, you'll hear immediately where the writing is stiff, where sentences are too long, where you stumble because the construction is awkward. The ear hears what the eye forgives.
The Ellipsis as a Musical Rest
Broadcast writers — people who've spent careers writing narration for news, documentary, and long-form television — use a specific tool that most beginners never discover: the ellipsis as a pacing instruction.
In documentary scripts, an ellipsis (...) isn't just a grammatical device indicating omitted text. It's a musical rest. It tells the narrator to pause. Deliberately. Meaningfully. And this kind of purposeful silence can be more powerful than any sentence you could write.
Here's why this matters. When you're writing narration that will sit over images, there will be moments when the image is doing significant emotional work — a face registering grief, a child laughing at something just off-camera, an old man standing alone in a place he's described as the most important place of his life. In those moments, the narration needs to get out of the way, or it will step on the image's power.
The ellipsis is how you build that space into the script itself. It's not silence by accident — it's silence by design.
Practically, you might write a narration line like this:
"She moved into that house in 1952... and didn't leave until the ambulance came forty-seven years later."
The pause after 1952 lets the viewer absorb the number — think about that for a moment — before the narration delivers its quiet gut-punch. Without the pause, the line is still good. With the pause, it lands differently. The ellipsis is doing emotional work.
Professional broadcast writers sometimes also mark [PAUSE] in their scripts, particularly at transitions or when they want to let a piece of music or an ambient sound breathe before the narrator re-enters. If you're recording your own narration, build these pauses into your performance. If you're directing a family member to narrate, talk about pauses specifically — tell them silence is part of the performance, not a mistake to be covered.
graph LR
A[Image doing emotional work] --> B{Does narration need to speak here?}
B -->|No| C[Silence — let the image work]
B -->|Yes| D[Write a short line]
D --> E{Does it describe what we see?}
E -->|Yes| F[Cut it. It's redundant.]
E -->|No| G[Keep it. It's earning its place.]
C --> H[Use ellipsis to protect this silence in script]
First Person vs. Third Person: Which Voice Serves Your Film?
Family documentary exists on a spectrum of narration styles, and where you land on that spectrum should be a deliberate creative choice, not a default.
First person narration ("I," "we," "my family") is the most natural voice for family documentary, and usually the best choice. It positions you, or someone in your family, as a witness and participant in the story being told. It creates intimacy. It acknowledges that this is a story told from inside the experience, with all the subjectivity and emotional complexity that entails. When Ken Burns uses first-person diary entries read aloud, when a filmmaker narrates their own family's story in their own voice — this is first person at its most powerful. The viewer is not receiving information from an objective authority; they're receiving a story from someone who lived it.
The limitation of first person is that it constrains perspective. The narrator can only know what they know. This can be a feature rather than a bug — a grandmother narrating her own immigration story in first person is more moving than third person because it's her voice — but it means you need to think carefully about who your narrator is and what they can credibly know.
Third person narration ("she," "he," "the family") creates a more objective, authoritative distance. You hear this in traditional broadcast documentaries, in the PBS Frontline style — a disembodied voice that seems to know everything and stands outside the events it describes. For most family documentaries, this is actually the weaker choice, because it feels incongruous. A family film narrated in authoritative third person has an odd formal quality, like a news broadcast about your own birthday party. The distance that third person creates is the opposite of what family documentary usually needs.
That said, there are specific situations where third person works beautifully in family documentary. When you're telling a story about a family member who is no longer living — a great-grandparent you never met, or someone whose voice you want to honor without speaking for them — third person can have a kind of grave formality that feels appropriate. When the subject matter is more historical than personal, third person creates useful emotional distance.
The hybrid approach is often the most elegant: first-person narration from a clearly defined family voice, with third-person handling of characters or events at a remove. "My grandmother left Poland in 1937. She was nineteen years old and she traveled alone." The first sentence is third person — it delivers information. The second is still third person, but it carries emotional weight. A line or two later, the narration might shift: "When I asked her once whether she was frightened, she laughed in a way that told me I didn't understand the question."
The shift creates texture and positions the narrator as someone trying to understand, not someone who already knows — which is the most honest and interesting position a family filmmaker can occupy.
Using Family Members as Narrators
Here's a decision that will transform your film if you let it: stop writing narration in your own voice, and give the narration to someone else in the family.
This is not about having a better voice (though sometimes that helps). It's about access to authentic perspective that you cannot manufacture. When your eighty-year-old mother narrates a film about her own childhood, she brings a lifetime of relationship to those words. When your teenage son narrates a film about a family road trip from his perspective, the voice is irreplaceable. You could write narration that approximates what they might say. Or you could ask them what they actually want to say, give them the space to say it, and capture that.
The technique is simple but requires some facilitation. Rather than writing narration and handing it to a family member to read — which almost always sounds stilted and false — sit with them, explain what the scene needs, and ask them to talk. Record the conversation. Often the best narration lines come directly from transcribing what someone said when they weren't thinking of it as narration at all.
This connects directly back to what you learned in the interview section: the best material comes when people forget they're being recorded. The same principle applies to narration. If your subject feels like they're recording a voiceover for a film, they stiffen. If they feel like they're talking to you about something real, the language is alive.
When working with family narrators, a few practical things are worth keeping in mind:
Don't script them too tightly. Give them the idea — "I want you to talk about what it was like to leave that house after forty years" — and let them find their own words. You can direct them toward specific content, but the language should be theirs.
Do as many takes as you need. Unlike professional talent, family members may not hit their best version on the first try. Create an environment where it's normal to try things multiple times. "That was great — let's try it one more time and see if we get something even better" is less threatening than "Can you do that again?" Make them comfortable with repetition.
Capture the unplanned moments. Sometimes the best thing happens when the narrator finishes the "official" take and keeps talking. Keep recording. "That's great, just keep talking for a minute" can yield beautiful material.
Be honest about what you need. If the narration needs to cover a specific duration — say, twenty seconds to cover a montage — tell the narrator. They'll self-edit. People are remarkably good at this when given parameters.
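If it helps to turn a duration into something a narrator can aim for, the arithmetic is simple enough to sketch. The snippet below is a rough illustration, not a rule from this section: the function name `word_budget` is made up for the example, and the default pace of 150 words per minute is an assumption (a commonly quoted conversational speed). Time your own narrator reading a paragraph and substitute their real pace.

```python
def word_budget(seconds: float,
                words_per_minute: float = 150.0,
                pause_seconds: float = 0.0) -> int:
    """Estimate how many spoken words fit in a stretch of film.

    seconds          -- length of the montage or gap the narration must cover
    words_per_minute -- ASSUMED speaking pace; measure your narrator instead
    pause_seconds    -- silence you plan to leave (ellipses, [PAUSE] marks)
    """
    if seconds < 0 or pause_seconds < 0 or words_per_minute <= 0:
        raise ValueError("durations must be non-negative and pace positive")
    # Only the time actually spent speaking counts toward the word total.
    speaking_time = max(seconds - pause_seconds, 0.0)
    # Round down: running short is easier to fix than running long.
    return int(speaking_time * words_per_minute / 60.0)


if __name__ == "__main__":
    # A twenty-second montage with a two-second deliberate pause.
    print(word_budget(20, pause_seconds=2))
```

Treat the result as a ceiling, not a target. If the budget is forty-five words and one sentence does the job, the one sentence is still the better choice.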
Recording Narration at Home: The Practical Reality
You don't need a recording studio. You need a quiet room, a decent microphone, and some patience. Here's what actually matters versus what you might worry about unnecessarily.
The room is more important than the microphone. A great microphone in a room with bad acoustics will produce a recording that sounds hollow, reverberant, or "roomy": that peculiar sound of someone recording in a bathroom or a large empty space. A decent microphone in a small, soft, sound-absorbing room will produce narration that sounds professional. Closets are excellent. Bedrooms with carpets and heavy curtains are good. Libraries and living rooms with bookshelves are often better than you'd expect, because the irregular surfaces and soft materials scatter and absorb sound reflections.
The worst places to record narration: kitchens (hard surfaces, appliance hum), bathrooms (obvious reverb), rooms adjacent to the HVAC system (constant low-frequency rumble), rooms with ceiling fans running. Before you record, sit silently in the room for a minute and listen for everything you'd normally tune out: the refrigerator, the air conditioning, traffic outside. That's your competition. Find a room where none of it is present or where it's very faint.
The USB condenser microphone is sufficient. For home narration recording, a USB condenser microphone in the $80-150 range will produce results that are entirely appropriate for family documentary. You are not making a Netflix film. You are making a family film. The standard is "sounds like a real person speaking clearly," and that's achievable with modest equipment. The most important factor after room acoustics is microphone distance: the narrator should sit six to twelve inches from the microphone and speak slightly off-axis (slightly past the microphone rather than directly into it), which reduces plosive sounds from words that begin with "p" and "b."
Pop filters earn their keep. A pop filter — the circular screen that sits between the narrator's mouth and the microphone — eliminates the little explosive burst of air that turns "party" into a thump in the recording. They cost almost nothing and solve a real problem. Use one.
Record multiple takes, label them clearly, and let them sit. Record your narration in sections rather than all at once. After recording, export the files and come back to them the next day before editing. You'll hear things with fresh ears that you missed during the session: a line that felt great while recording but actually sounds rushed, a stumble on a word you didn't notice in the moment. Record several versions of any line you're uncertain about. Hard drive space is cheap.
What "good enough" actually sounds like. In professional broadcast documentary, narration is recorded in a treated studio with professional equipment. For family documentary, the standard is lower, and appropriately so. Good enough means four things: no distracting background noise, no audible room reverb, a narrator's voice that is clear and sits front and center in the mix, and no technical artifacts (clipping, dropouts, obvious editing cuts within a sentence). If you can meet those four standards, your narration recording is good enough. Don't chase perfection at the expense of authenticity.
Title Cards: The Minimalist Alternative
Some of the most powerful moments in documentary are achieved not with narration, not with interview clips, but with a simple line of text on a black screen.
Title cards — also called intertitles or text cards — are a structural device used in documentaries to organize content into segments, similar to how chapter breaks function in novels. They can orient the viewer in time and place ("August 1987. The summer everything changed."), deliver hard information that would feel awkward in narration ("Between 1942 and 1945, the family received no letters"), or provide an emotional punctuation mark at a scene's end.
Text on screen serves specific expository duties, freeing narration for voice and personality rather than pure information delivery. Dates, locations, names, statistics — these are often better served by title cards than by spoken narration, because text on screen can be read at the viewer's pace, and it visually signals that this is factual information rather than subjective narration.
For family documentary, title cards have a particular elegance because they don't require you to perform. There's no voice to record, no anxiety about how you sound. You write the words, you put them on screen, you move on. And the very starkness of text on a black screen can create emotional impact that a softer narration delivery might actually undercut.
Consider the difference:
Narration version: "My grandfather passed away on November 14th, 2009. He was eighty-three years old."
Title card version:
November 14, 2009.
The title card version is devastating in a way the narration cannot quite match, because it doesn't try to do anything. It just announces. It trusts the viewer to feel what needs to be felt.
The rules for effective title cards are few but worth observing:
- Keep them short. No more than two or three lines. If you need more than that, you need narration.
- White text on black background is the documentary default — it reads clearly and carries appropriate gravity. But some family filmmakers use warmer colors or semi-transparent overlays over footage; just make sure legibility is never compromised for style.
- Give viewers time to read. A common mistake is cutting away from a title card before the average viewer can read it twice. Read it once yourself, count an extra beat, then cut.
- Use a consistent style throughout the film. Mixing fonts, sizes, and colors between title cards creates a sense of visual randomness that undermines the film's authority.
Title cards and narration are not mutually exclusive. Many documentaries use narration for voice and personality while using title cards for hard information. This division of labor makes both elements more effective — the narrator sounds less like they're reading a Wikipedia article, and the text on screen feels purposeful rather than like a crutch.
The Minimal Narration Principle
Everything in this section has been pointing toward a single operating principle: less is almost always more.
This is counterintuitive for people new to the form. There's a feeling that narration is the responsible thing — it makes sure nothing is left unexplained, no viewer is left behind, no moment is underserved. But this is the logic of the tour guide who tells you what you're about to see, describes it while you see it, and then summarizes what you just saw. It's well-intentioned and completely exhausting.
The films that stay with you are the ones that trust you. They put something powerful on screen and let it be powerful. When they speak, it's because they have something to say that you couldn't get any other way.
Here's what minimal narration looks like in practice. Imagine you've assembled a five-minute film about your grandmother: her arrival in America, her forty years of work in a textile factory, her garden, her grandchildren, her final years. You have beautiful footage, strong interview clips, and archival photographs. The whole film runs beautifully — but there's a gap between the factory footage and the garden footage that needs bridging.
You could write three sentences explaining that she retired in 1978, that the garden became her occupation in retirement, that she grew the same vegetables her mother had grown in the old country. That's fine. It covers the information.
Or you could write one sentence: "She retired in 1978 and never stopped working."
That single sentence does the work of three. It orients (she retired in 1978), contextualizes (the garden was work, not leisure — a nuance that reframes the images), and it carries voice. The viewer's imagination fills in the rest. And because it's doing all that work in one precise line, it earns its place rather than taking up residence.
The discipline of minimal narration forces better writing. When you know you have one sentence instead of three, you think harder about what that sentence should be. You cut the redundant, the obvious, the merely informational. You look for the line that opens a door rather than leading the viewer through it.
As documentary professionals often say, scrutinize every line by asking: does the viewer need to know this? Not "would it be nice to know" or "is this interesting," but need. If the answer isn't clearly yes, cut it. Or, in the blunter version: "Don't tell me shit I already know."
A Note on Voice and What Makes It
Beyond all the craft principles — the active voice, the short sentences, the counterpoint with images — there's a quality in narration that's harder to describe but easy to recognize. Call it voice. Not the literal sound of someone's voice, but the sense of a distinct sensibility speaking: a person who sees the world a specific way, who has particular observations and a particular rhythm of thought, who has earned the right to tell this story.
Different documentary narration styles create completely different emotional registers: the warm authority of Morgan Freeman in March of the Penguins, the fierce intelligence of James Baldwin's words in I Am Not Your Negro, the plainspoken conviction of the letters and diaries read aloud in Ken Burns's films. What makes each of these voices work is not just the quality of the prose but the sense that the right person is speaking for this particular story.
For family documentary, finding your voice — or your family's voice — is not a problem to solve. It's already there. The way your father actually talks when he's not trying to be formal. The particular phrasing your grandmother uses that nobody else in the world uses. The slightly self-deprecating tone your brother brings to stories about himself. These are not challenges to overcome; they're assets you can capture.
That is the deepest truth about narration in family documentary. You're not writing for an audience of strangers, trying to establish credibility and authority in the way a broadcast documentary must. You're writing, or recording, for people who already know you, who will recognize the voice, who will feel the authenticity of someone speaking from inside the experience. That's an advantage no professional narrator can replicate.
Use it.