How to Interview on Camera for Documentary and Broadcast
There's a moment that happens on almost every documentary shoot, and every experienced filmmaker knows exactly when it's coming. The lights are set, the camera is rolling, and the subject — who was perfectly relaxed, perfectly funny, perfectly themselves during the setup — has somehow transformed into a stiff, vaguely alarmed version of themselves. They're sitting up straighter than they ever sit. They're speaking in complete sentences. They've started referring to themselves in the third person. Whatever made them interesting in the pre-interview conversation has temporarily evacuated the premises.
This is the camera effect, and it's the central challenge of documentary and broadcast interviewing. It's not a character flaw in your subject. It's a predictable, deeply human response to being recorded — to knowing that this moment is being preserved, that strangers will watch it, that it will outlast the conversation itself. Every human being on earth knows the difference between talking to a person and talking to a machine, even if they can't quite articulate it.
Your job as an interviewer is to understand why this happens and to develop a practice that works around it, through it, or — in the best cases — with it. That means understanding the specific demands that cameras and microphones place on the interview. It means learning the technical and philosophical approaches that working documentary filmmakers and broadcast journalists have developed in response. It means reckoning with the particular challenges of audio-only interviewing. But more than a collection of techniques, what follows is an argument: everything we've discussed about listening, trust, and genuine curiosity becomes more important when a camera is in the room, not less — because the technology amplifies every mistake and every success.
Most subjects have no practice navigating a camera, and the result is performance — sometimes stiff, sometimes overcooked, sometimes artificially authoritative — rather than conversation. Understanding all of this reframes your job. You're not trying to get people to "be natural." You're trying to create conditions under which naturalness becomes possible again, despite the environment you've constructed around them. That's an interior design problem as much as an interpersonal one.
The Problem of the Lens: Eye Contact and the Disembodied Camera
Here's the specific challenge of on-camera interviews that no amount of rapport-building fully solves: the camera lens is not a person. When your subject makes eye contact with you, they're having a human interaction. When they look into a lens, they're talking to a machine.
In most standard documentary setups, the camera is positioned slightly off to one side of the interviewer, who sits just out of frame. The subject is coached to look at the interviewer, not the lens. This creates the classic documentary interview look: subjects talking to someone just out of frame, slightly to camera left or right, eyes alive with interaction. It works. It has worked for decades. But it has a fundamental limitation: the subject and the audience never make eye contact. There's always a slight, subliminal sense of distance — you're watching someone talking to someone, not to you.
The implications for performance are significant. When subjects aren't looking into the lens, they're free to have a genuine conversation with the interviewer. But they're also aware, on some level, that they're being observed by a camera they're not addressing. It's like having a conversation in a room where you know someone is watching through a two-way mirror. You can get used to it. But it's there.
Some interviewers try to minimize this problem by positioning the camera as close to themselves as possible — putting the lens as near to their own eye line as the setup allows — so that looking at the interviewer and looking at the camera become nearly the same thing. It's a practical solution that many documentary filmmakers swear by, and it costs nothing. But it still doesn't solve the fundamental problem.
Errol Morris decided to solve it completely.
The Interrotron: A Philosophy Made Into Hardware
Errol Morris, the director of The Thin Blue Line, The Fog of War, and Standard Operating Procedure, is obsessed, in the way that only great artists are obsessed, with a very specific question: what does it mean to make real contact with someone through a camera?
His answer was a device he calls the Interrotron.
The mechanics are elegant in their simplicity. The Interrotron uses two teleprompter rigs — the kind politicians use to read speeches while appearing to look at the audience. Each teleprompter uses a half-silvered mirror (a beam splitter) that allows you to see an image projected onto it while also seeing through it. Morris sets up two of these facing each other. Camera A is positioned behind one beam splitter, filming the subject. Camera B films Morris himself. The system projects Morris's live image onto the beam splitter in front of the subject, and the subject's live image onto the beam splitter in front of Morris.
The result is remarkable: the subject looks at Morris's face, which appears to float exactly where the camera lens is. Looking at the interviewer is looking into the lens. The two are one and the same. When you watch the footage, the subject is making direct eye contact with you, the viewer — not the slightly-off-frame gaze of the traditional documentary setup, but genuine, unflinching contact.
graph LR
A[Errol Morris] --> B[Beam Splitter A]
B --> C[Morris's image projected to subject]
D[Subject] --> E[Beam Splitter B]
E --> F[Subject's image projected to Morris]
G[Camera A] --> H[Films subject through beam splitter]
I[Camera B] --> J[Films Morris through beam splitter]
C --> D
F --> A
Morris has talked at length about what this achieves. The technical innovation is interesting, but the philosophical implication is what matters to him: the Interrotron collapses the distance between subject and interviewer, and between subject and audience. Robert McNamara, the former Secretary of Defense who is the subject of The Fog of War, confesses to complicity in war crimes while looking directly into the camera, which is to say, directly at you. The effect is shattering in a way that's almost impossible to achieve with a conventional setup. You're not watching McNamara talk to someone else about his guilt. He's talking to you.
"I thought the Interrotron was a way of preserving something that normally gets lost," Morris has said. The thing that gets lost in conventional documentary interviewing is the first-person quality of real conversation — the experience of being addressed directly, of being seen.
This is worth sitting with even if you never build an Interrotron, which most documentary filmmakers never will. It crystallizes something true about the problem: conventional camera setup sacrifices intimacy for technical convenience. Every time you put a camera off to the side and coach your subject to look at you instead of the lens, you're making a practical compromise. It's often the right compromise. But it is a compromise.
The Interrotron's real lesson isn't about beam splitters. It's about the value of direct address — and about how much ordinary documentary convention undercuts the very thing it's trying to capture.
The First Ten Minutes on Camera
Every documentary filmmaker has a version of the same practice, and every documentary filmmaker will tell you it's not really a trick — it's just common sense applied consistently. The first ten minutes of a recorded interview are almost never usable. They're the warm-up, the calibration, the period during which your subject is consciously aware of being recorded and performing accordingly. Your job in those first ten minutes is to burn through the self-consciousness, not to extract usable material.
Here's how working documentary filmmakers approach that period:
Start rolling before you say you're starting. The moment you announce "we're recording now" is the moment the performance mode engages. Many filmmakers ask the camera operator to begin rolling before formally starting the interview, using pre-interview conversation as the actual start of the session. You might warn subjects in advance that you do this — or you might simply roll and tell them afterward that anything they said before you "officially" started is fair game. The ethics of this vary, but the psychology behind it is sound: the best material often comes before the guard is fully up.
Begin with genuinely easy questions. Not softball questions designed to flatter, but questions whose answers require no vulnerability and little thought. "Tell me how you ended up in this work." "What does your day typically look like?" "Walk me through how this place came to exist." These questions do two things: they generate footage of your subject engaged in the world (more on that later), and they let subjects hear their own voice on camera without stakes attached. By the time you get to the questions that matter, talking on camera has become slightly less alien.
Make a deliberate fuss about the technical setup, then wave it away. This sounds counterintuitive, but it works. Letting subjects watch the camera being positioned, letting them see the monitor (briefly), letting them ask the "where should I look?" question and getting a clear answer — all of this demystifies the equipment. Then, once it's set and you stop referring to it, the equipment tends to recede from subjects' consciousness more completely than if you'd tried to minimize it from the start.
Forget the interview. The single most effective thing you can do in the first ten minutes of a recorded interview is have a conversation. Not conduct an interview — have a conversation. Ask a follow-up question that isn't on your list because you're genuinely curious. Let something they said in passing become a small tangent. Laugh at something funny. The moment the subject stops monitoring themselves is usually the moment you stop seeming like a journalist and start seeming like a person, and that transition — once it happens — tends to hold.
One working documentary filmmaker described it this way: "I basically run the interview for twenty minutes before I need anything. I'm not wasting time. I'm doing the most important part of the job, which is making the camera irrelevant."
Technical Setup as Interview Design
Here is something that gets said too rarely: the technical decisions you make before an interview begins are interview decisions. Where you put the camera, how you light the room, what the subject can see and hear — all of this shapes what the interview will be.
Camera distance has psychological weight. A tight close-up forces a kind of intimacy that can be revelatory or invasive, depending on the subject. A wider shot gives subjects more physical and psychological space — they feel less pinned. Documentary interviews typically settle somewhere in the mid-close-up range: head and shoulders, enough to read facial expressions, not so tight that every micro-expression becomes a statement. But the right choice depends on the subject and the material. For testimony about difficult experiences, a slightly wider frame can help. For subjects who need to project authority, a tighter frame often serves the storytelling better.
Lighting affects mood and comfort in ways that subjects experience even when they don't consciously register them. Harsh overhead lighting creates tension. Soft, natural-feeling light tends to relax people — it feels less like a production, more like a conversation that happens to be filmed. Many documentary filmmakers prefer to work with available light, supplemented rather than replaced, for exactly this reason. The aesthetic is usually more honest too: it looks like where the person actually lives or works.
Sound is where inexperienced documentary producers most often make mistakes that undermine their interviews. Background noise that the human ear filters out becomes a constant presence on a microphone. Air conditioning, refrigerators, nearby traffic — all of these compete with your subject's voice and, more importantly, with the silences that often contain the most important moments. Getting good sound isn't just a technical matter; it's an interview matter. Subjects who have to fight ambient noise to be heard get tired and distracted, and the pauses you want to let live on tape get swallowed by hum and whoosh.
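One way to make the background-noise problem concrete before the subject sits down is to record a few seconds of "room tone" and measure its level. The sketch below is a minimal illustration, not a production tool: it assumes the recording is already available as floating-point samples between -1.0 and 1.0, and both the function names and the -50 dBFS threshold are hypothetical choices, not an industry standard.

```python
import math


def rms_dbfs(samples):
    """Return the RMS level of float samples (range -1.0..1.0) in dB
    relative to a full-scale amplitude of 1.0."""
    if not samples:
        raise ValueError("need at least one sample")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    # 10 * log10(mean square) equals 20 * log10(RMS amplitude)
    return 10 * math.log10(mean_square)


def room_is_quiet_enough(room_tone, max_floor_dbfs=-50.0):
    """Hypothetical go/no-go check. The default threshold is an
    assumption for illustration; pick one that suits your own gear."""
    return rms_dbfs(room_tone) <= max_floor_dbfs
```

A reading well above whatever threshold you settle on is a cue to hunt down the refrigerator or the air conditioning before the interview starts, rather than discovering the hum in the edit.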
The crew is part of the technical setup, and it matters more than most beginning filmmakers realize. Every person in the room is part of the audience your subject is performing for. A crew of five creates a very different psychological environment than a crew of two. The most intimate documentary interviews often happen with minimal crews — sometimes just a filmmaker and a camera operator, sometimes a filmmaker handling their own camera. The fewer people in the room, the easier it is to create a genuinely private-feeling conversation.
Managing a crew during an interview means briefing them on the stakes of silence and stillness. A crew member who shifts weight at the wrong moment, coughs during a pause, or exchanges a glance with another crew member can shatter a fragile atmosphere in seconds. The best documentary crews have internalized this. The interviewer's job, in the days before a shoot, includes communicating the emotional intelligence the moment requires — not just the technical specs.
Radio and Podcast Interviewing: A Different Animal
Everything we've discussed so far has assumed visual media — camera, picture, image. But some of the most consequential interviews of the last generation have happened in audio-only contexts, and audio creates its own distinct set of demands and possibilities.
The radio or podcast interview strips away everything except voice. No eye contact, no body language, no visual context, no nonverbal cues that can be read or misread. What remains is pure sound: the words, the rhythm, the pace, the quality of the silence. This is either a severe limitation or an extraordinary freedom, depending on how you work.
Terry Gross and Michael Barbaro have thought more publicly and more rigorously about audio interviewing craft than almost anyone else working in the medium, and their insights point in complementary directions.
Gross, who has hosted Fresh Air on NPR for more than forty years and has conducted thousands of long-form audio interviews, describes a particular challenge of radio: dead air. In television, silence can be read visually — the camera can hold on a face while a subject thinks, and viewers understand what they're seeing. In radio, silence is just absence. "In radio, dead air is really a scary thing," Gross has said. This changes the interviewer's relationship to pauses. The strategic silence that is so powerful in other contexts becomes more complicated in audio — you're balancing the value of letting a thought breathe against the technical reality that listeners may interpret a long pause as a broadcast problem.
At the same time, Gross describes how the absence of visual cues forces a deeper quality of listening. When she conducts long-distance interviews by phone — which is most of her interviews — she notes that "we both have to listen really intently because there's no body language... there are no other cues." This enforced attentiveness, she argues, can create a kind of intimacy that visual interviews sometimes lack. Without anything to look at, you're fully inside the voice.
Barbaro's practice on The Daily reflects a different audio-specific insight: the importance of narrative arc in a format where you can't use visuals to signal transitions or changes in tone. According to Barbaro, the production team thinks hard about the arc of every conversation before it begins: "Where does it start, where does it go, and where may it end." This isn't just editorial planning — it's an interview design problem. In audio, the listener has no visual anchors. The conversation itself has to generate its own momentum and coherence. If the interview wanders, the listener wanders. The structure has to be carried in the questions.
Barbaro also describes the value of the shortest possible questions — "Why?" "What were you thinking?" "What does that mean?" — as particularly important in audio. In visual media, an interviewer can use gesture, expression, or physical stillness to signal that they want more. In audio, the interviewer's presence is limited to sound, which means the question itself has to do all the work of invitation. Short questions, in audio, tend to feel more open — they give the guest the floor without cluttering it with the interviewer's framing.
Gross, meanwhile, uses a tactic that works in any medium: echoing a subject's words back to them. "I'll take a key word that somebody has said, or a phrase, and I'll say, 'You said this, what do you mean by that?'" In audio, this technique is particularly powerful because it signals genuine listening: the guest can hear that you were actually paying attention to what they said, not just waiting for your next question. And in a medium where the listener can't see the interviewer's nodding head or engaged expression, verbal acknowledgment of what the guest has said is the only way to make listening audible.
One thing audio does extraordinarily well: it gets out of the way. Without the distraction of visual composition, lighting choices, or camera angle, the listener is entirely focused on what's being said. The best podcast interviewers understand this and resist the temptation to fill the space with production flourishes. The craft is in the conversation, and the conversation is the whole show.
graph TD
A[Interview Medium] --> B[Visual/Documentary]
A --> C[Audio/Podcast]
B --> D[Eye contact and body language as tools]
B --> E[Camera position shapes meaning]
B --> F[Silence readable via facial expression]
C --> G[Voice and rhythm carry everything]
C --> H[Narrative arc must be self-sustaining]
C --> I[Silence risks dead air — must be managed]
D --> J[Interrotron solves the lens problem]
G --> K[Short questions open the floor]
H --> L[Pre-interview arc planning essential]
Remote Interviewing: What You Lose on Zoom
The COVID-19 pandemic forced a mass experiment in remote interviewing, and the findings, if you talked to documentary filmmakers and broadcast journalists during and after, were mixed. Remote interviews work. They produce usable material. They're often the only option. But they cost you something real, and being honest about what you're losing is the beginning of compensating for it.
What you lose on Zoom:
The first thing you lose is the physical co-presence that makes real rapport possible. Being in a room together is a shared experience — you're both inhabiting the same space, breathing the same air, experiencing the same ambient conditions. That shared context creates a form of connection that no video call fully replicates. The lag inherent in video calls — even minimal lag — disrupts the natural timing of conversation in ways that are subtle but cumulative. People talk over each other, hesitate, hold back because they're not sure if the other person is done. The overlapping, responsive quality of good conversation gets flattened.
The second thing you lose is the body. On Zoom, you can see faces — sometimes, connection quality willing. But you lose posture, gesture, the shift of weight that signals discomfort, the leaning forward that signals engagement. You lose the peripheral visual information that experienced interviewers unconsciously process and respond to. Terry Gross has noted the paradox here: long-distance interviews require more intense listening precisely because you've lost so many of the cues that ordinary conversation provides.
Third, you lose control of the environment. In an in-person interview, you can shape the lighting, the sound, the visual context, the feel of the room. On Zoom, your subject is sitting in their own space, with their own lighting (usually terrible), their own background (frequently cluttered), their own audio setup (often a built-in laptop microphone that muddies the voice). You're at the mercy of their environment, and their environment is usually not optimized for conversation.
What you can do about it:
Before a remote interview, send a brief technical prep note. Ask subjects to find a quiet room, close windows, silence phones, and if possible use earbuds rather than speaker audio. Many documentary filmmakers now ship interview kits to remote subjects — a simple USB microphone, a ring light — before conducting video interviews. The difference in audio and visual quality is significant, and the difference in how subjects carry themselves when they look and sound good is real.
On the call itself, slow down. The artificial environment of video calls tends to accelerate conversation in unhelpful ways — people are slightly more anxious, slightly more eager to fill space, slightly less comfortable with pauses. Deliberately pacing your questions, leaving more space than you think you need, signals to your subject that there's no rush. This is also good practice for audio-only phone interviews, where the lack of any visual cues creates even more pressure to fill silence with words.
And be explicit about what you're compensating for. One documentary filmmaker who works extensively with remote subjects now opens every video interview by saying: "I'm going to be a little more explicit than I would be in person about following up on things I find interesting, because I can't read your body language the way I could if we were in the same room. So if I pause and look at you, I'm inviting you to keep going." This kind of transparency about process, counterintuitively, tends to relax subjects — it makes them feel like you're working with them rather than extracting from them.
B-Roll and the Second Conversation
Here is something that experienced documentary filmmakers know and beginning ones learn the hard way: the most important thing a subject says is often not said during the formal sit-down interview. It's said while they're showing you something, doing something, moving through a space — while the b-roll camera is rolling and nobody has told them that this moment matters.
B-roll, in documentary context, refers to the supplementary footage that editors cut away to during an interview — shots of subjects in their environments, doing their work, existing in the world outside the interview chair. Collecting it requires time in the field with your subject, time that is ostensibly "not the interview." But in practice, it often is the interview, or the most important part of it.
Why? Because the formal interview is a declared event with declared stakes. The subject knows they're being asked to testify, to articulate, to produce memorable statements. They're in performance mode, however successfully the first ten minutes managed to reduce it. The b-roll shoot is different. The camera is present but the focus is elsewhere — on the subject's hands as they work, on the environment, on the texture of daily life. The subject's guard, which has been carefully but never completely lowered during the interview, tends to drop further. People say things in passing. They volunteer information they wouldn't have volunteered in the interview chair. They contextualize what they said in the interview with what they actually think.
One documentary filmmaker who has worked in conflict zones described the phenomenon this way: "I always get my best material in the car. We've finished the interview, we're driving somewhere to get b-roll, and they're talking to me like I'm a person now, not a documentary filmmaker. That's when they say the thing that becomes the spine of the film."
The implication for how you run a documentary interview is significant: don't pack up when the sit-down is over. Stay. Let the conversation continue. Keep at least one camera ready. Some filmmakers keep a small camera rolling throughout the b-roll collection, specifically because they've learned that the transition out of "interview mode" often produces the most authentic material.
There's also a listening implication here. The best documentary filmmakers are conducting two parallel interviews throughout a shooting day: the formal one, which produces the testimony, and the informal one, which produces the truth. Staying attuned to both requires a form of sustained, low-intensity attention — what you might call background listening — that can be exhausting but is often what separates a film that illuminates from one that merely documents.
Editing and the Interview: Thinking Backward from the Cut
Documentary and broadcast interviews are almost never aired or published in full. They're edited — sometimes heavily — and knowing this should change how you conduct them. This is a dimension of broadcast and documentary interviewing that print journalists don't typically have to contend with, and it's one that beginning filmmakers often underestimate.
What editing does to an interview:
Editing compresses time, creates apparent continuity where there were pauses and detours, and shapes the meaning of individual statements by what appears before and after them. A statement that meant one thing in the context of the conversation can mean something different in the context of the edit. This is one of the central ethical tensions in documentary filmmaking — and we'll take it up more fully in the section on interview ethics — but it also has craft implications that are worth thinking about here.
If you know the interview will be edited, you should conduct it in ways that produce cleanly separable material. This means several practical things:
Give subjects time to complete thoughts before following up. In conversation, we often start responding before the other person finishes speaking. In edited documentary, this creates a problem: the interviewer's voice overlaps with the subject's, and the overlapping portion becomes unusable. Training yourself to wait — genuinely wait — until a subject has fully completed a thought before responding produces cleaner material and, as we discussed in the section on listening, better conversation.
Re-ask questions whose answers started badly. If a subject begins a response with "Well, you know, it's kind of complicated, but I think what I'm trying to say is..." you're not going to be able to use those first fifteen seconds. Many documentary filmmakers will stop a subject mid-answer and say, "Let me ask that again — can you start your answer in a slightly different way?" This is standard practice in broadcast, entirely acceptable to most subjects, and produces much more usable material.
Think in quotable units. The interviewer who knows their footage will be edited is always half-thinking about what a self-contained, quotable unit sounds like. This isn't manipulation, and it isn't an attempt to trap subjects into producing sound bites; it's closer to the way a skilled portrait photographer thinks about light. You're working in a medium with particular formal constraints, and understanding those constraints allows you to serve both your subject and your audience more effectively.
Michael Barbaro and Terry Gross both describe this explicitly: the interview has an arc, and that arc is partly editorial. The conversation is designed — through preparation, through sequencing, through the shape of questions — to produce material that will tell a story. Barbaro describes sometimes coaching guests away from journalist-style "inverted pyramid" answers and toward chronological narrative, because chronology produces drama and drama serves the edit. Gross describes asking subjects to make answers shorter, not because brevity is a virtue in itself, but because length and drift make editing hard and meaning hard to find.
Leave breath in the room. Pauses, in documentary editing, are assets. An editor can tighten a response, but they can't insert silence that wasn't recorded. Interviewers who let subjects breathe, who don't immediately fill every pause with the next question, give their editors — and themselves — much more to work with. The moment after a subject finishes a difficult statement, before the interviewer responds, is often the most revealing footage in any documentary. It's where the weight of what was just said lands. You can't manufacture that in post. You have to let it happen in the room.
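Two of the habits above, waiting for a subject to finish and leaving pauses intact, can be checked after the fact if you have a rough log of who spoke when. The sketch below assumes such a log exists as timestamped segments (from a transcript or hand logging). The `Segment` type, the function names, and the 1.5-second pause threshold are all hypothetical, chosen only to illustrate the idea.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    speaker: str  # e.g. "interviewer" or "subject"
    start: float  # seconds from the start of the recording
    end: float


def find_crosstalk(segments):
    """Return (start, end) spans where two different speakers overlap."""
    ordered = sorted(segments, key=lambda s: s.start)
    overlaps = []
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b.start >= a.end:
                break  # sorted by start, so nothing later overlaps a
            if a.speaker != b.speaker:
                overlaps.append((b.start, min(a.end, b.end)))
    return overlaps


def find_pauses(segments, min_length=1.5):
    """Return (start, end) gaps of at least min_length seconds
    during which nobody is speaking."""
    ordered = sorted(segments, key=lambda s: s.start)
    pauses = []
    latest_end = ordered[0].end if ordered else 0.0
    for seg in ordered[1:]:
        if seg.start - latest_end >= min_length:
            pauses.append((latest_end, seg.start))
        latest_end = max(latest_end, seg.end)
    return pauses


log = [
    Segment("interviewer", 0.0, 4.0),
    Segment("subject", 4.5, 20.0),
    Segment("interviewer", 19.2, 22.0),  # came in before the subject finished
    Segment("subject", 25.0, 40.0),      # three seconds of breath before answering
]
crosstalk = find_crosstalk(log)
pauses = find_pauses(log)
```

In this made-up log, `crosstalk` flags the stretch from 19.2 to 20.0 seconds, where the interviewer stepped on the end of an answer, and `pauses` records the silence from 22.0 to 25.0 seconds that an editor might want to keep.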
The Unified Discipline
What emerges from all of this — the Interrotron's philosophy of direct address, the first-ten-minutes warm-up, the technical setup as interview design, Terry Gross's enforced attentiveness in audio, the second conversation during b-roll, the editorial awareness that shapes how you ask questions — is not a collection of independent techniques but a unified discipline.
The camera changes the interview. The microphone changes the interview. The crew changes the interview. The knowledge that footage will be edited changes the interview. But these changes are not primarily technical challenges. They're extensions of the fundamental challenge of all interviewing: how do you create conditions under which a person can tell you something true?
The camera is, in the end, just a particularly aggressive form of the witness problem. Every interview has a witness problem — the subject is performing for you even without a camera, because performance is what human beings do when observed. The camera intensifies and concentrates that problem. The techniques documentary filmmakers and broadcast journalists have developed in response — the Interrotron, the warm-up, the b-roll conversation, the strategic audio silence — are all, at bottom, strategies for restoring human contact in an environment that has been technically optimized for recording rather than conversation.
The best documentary interviews feel, when you watch them, like you've stumbled on a private conversation. That feeling is not an accident. It's the result of very deliberate work, done by someone who understood that the goal was never to capture a subject. It was to reach one.