Archive Diving: A Practical Guide to Researching History Using Primary Sources

Section 9 of 14

How to Evaluate and Analyze Primary Sources

Evaluating Primary Sources: The Art of Source Criticism

You've found it: a census record with your great-grandmother's name, a newspaper article about a trial that shaped your town, a court document that might rewrite a piece of history you thought you knew. The instinct at this moment is to believe it. It has a date. It has a signature. It's official. It must be true.

But you've already seen the problem with this instinct — throughout the last section, we watched how newspapers could challenge, expose, or distort reality in ways that official records couldn't capture. The same principle applies to every other document type you'll encounter in archives. A census record isn't a neutral snapshot; a court transcript isn't a perfect recording of what was said; a government file isn't a complete account of what actually happened. Each document is a product of specific choices, constraints, and interests. Understanding what you're actually looking at when you read a historical document — who made it, why, for whom, under what constraints, with what possible errors or blind spots — is what separates rigorous research from wishful reading.

This section formalizes the critical thinking you've already been practicing. We'll move from the intuitive skepticism you've applied to newspapers to a systematic framework you can apply to any primary source: census records, military files, court papers, immigration documents, and everything else you'll find in archives.

External Criticism: Is This Document What It Appears to Be?

External criticism asks the physical and biographical questions about a document before you worry about its content. Think of it as triage: you need to know what you're holding before you can use it.

Provenance: Where Has This Document Been?

Provenance is the chain of custody — the documented history of where a record has been since it was created. A letter held continuously in a family's private papers since 1863 carries a very different evidentiary weight than a letter that surfaced at a flea market in 1975 with no history attached.

In archival practice, provenance matters for two concrete reasons. First, breaks in the chain of custody create opportunities for fraud, alteration, or misidentification. Second, where a record was kept often tells you as much as what it says. A personnel file found in a company's own archive means something different than the same file found in a regulatory agency's records — the latter probably ended up there because of an investigation, which changes everything about its context.

When you access a document, always ask: How did this archive acquire it? Are there accession records, deed of gift agreements, or transfer documents? Archivists maintain these, and you can usually ask.

Physical Characteristics: Does This Document Look Right?

You don't need to be a forensic document examiner to spot problems. Professional historians and librarians have caught significant frauds by noticing things like:

A document uses a typeface that didn't exist at its claimed date
The paper is wrong — too white, too smooth, wrong watermark for the period
The ink is the wrong color or shows the wrong degradation pattern
A signature is in ballpoint pen on a document claimed to be from 1820
The vocabulary or spelling conventions don't match the period

The most useful question you can ask is: Does everything about this document fit what I know about documents from this time, place, and institutional context? A "1910 naturalization record" that looks crisper than the surrounding documents in the same file should at least make you pause.

For digital materials, physical examination becomes trickier, but metadata can fill in some of the gaps. A photograph's EXIF data, a document's creation timestamp, or inconsistencies in digitization quality can all signal that something is off.

Dating

Sometimes documents lack explicit dates, or their dates are questionable. Dating a document externally means using everything other than the document's own claims to establish when it was made. What events does it reference? What terminology does it use? Does it mention institutions, laws, or technologies that narrow down the time frame?

A letter that mentions "the railroad" can't predate the railroad's arrival in that place. One that mentions "the railroad strike" might narrow it to specific years. Historians name the two bounds of this window terminus post quem (the document can't be earlier than the latest thing it references) and terminus ante quem (the document can't be later than the earliest evidence of its existence).
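If it helps to see the two bounds as arithmetic, here is a minimal sketch. The Clue class, the date_window function, and every year in the sample data are hypothetical and exist only for illustration. The earliest possible date is the latest thing the document references, and the latest possible date is the earliest evidence that the document already existed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Clue:
    description: str
    year: int
    # "referenced": something the document mentions
    # "attested": evidence that the document already existed
    kind: str


def date_window(clues: list[Clue]) -> tuple[Optional[int], Optional[int]]:
    """Return (earliest possible year, latest possible year) for an undated document."""
    referenced = [c.year for c in clues if c.kind == "referenced"]
    attested = [c.year for c in clues if c.kind == "attested"]
    # Terminus post quem: the latest thing the document references.
    earliest = max(referenced) if referenced else None
    # Terminus ante quem: the earliest evidence of the document's existence.
    latest = min(attested) if attested else None
    if earliest is not None and latest is not None and earliest > latest:
        raise ValueError("Clues conflict: re-check the dates or the document itself")
    return earliest, latest


# Hypothetical clues for an undated letter (all years are made up).
clues = [
    Clue("mentions a railroad line opened in the county", 1869, "referenced"),
    Clue("mentions a railroad strike", 1877, "referenced"),
    Clue("quoted in a dated diary entry", 1879, "attested"),
    Clue("listed in an estate inventory", 1884, "attested"),
]
window = date_window(clues)
print(window)  # (1877, 1879)
```

A conflict between the two bounds is itself a finding: either one of your clue dates is wrong, or the document is not what it claims to be.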

Tip: When you're trying to date an undated newspaper clipping in a family scrapbook, look for the advertisement on the reverse side. Store sales, prices, and product names can often narrow the date range significantly — and the Library of Congress's Chronicling America database can help you confirm by searching known issues.

Forgeries and the Question of Authenticity

Outright forgeries in everyday archival research are rare — you're unlikely to encounter one in a county courthouse. But they're not unheard of, and the kind of forgery most likely to affect ordinary researchers isn't a document invented from whole cloth. It's:

Alteration: A genuine document with a name, date, or number changed
Misidentification: A genuine document incorrectly described or attributed
Anachronistic transcription: A genuinely old copy of a document, presented as if it were the original

The last one is particularly important for genealogists and local historians. Before photocopiers existed, archives routinely made handwritten or typed copies of important documents. A copy made in 1920 of an 1860 letter is a primary source about that letter, but it's not the original — and the copyist may have introduced errors, "corrected" spellings, or filled in illegible sections from guesswork.

Always try to establish whether you're looking at an original, a contemporary copy, a later copy, or a transcription. It matters more than you'd think.

Internal Criticism: What Does This Document Actually Mean?

Once you're satisfied that a document is what it appears to be, the real work begins. Internal criticism interrogates the content: Is it accurate? What does it mean? What did it leave out, and why?

The historical method framework defines the core question of internal criticism as: "What is the evidential value of its contents?" That sounds straightforward. It's not.

The Six Inquiries: When, Where, Who, From What, In What Form, With What Value

Garraghan's classic framework asks six questions of any source. Each one is harder than it looks.

When? When was this produced? For most archival documents, you'll know a date — but "date" is slipperier than it seems. The date a document was created, the date of the events it describes, and the date a copy was made can all differ. A soldier's pension file might be created in 1890 based on interviews about events in 1862, filtered through 28 years of memory and institutional expectations about what "evidence" looks like. That matters.

Where? Not just the physical location, but the institutional context. A death recorded in a hospital is different from one recorded in a church register, and both are different from one recorded in a probate court. Each institution had its own reason for caring about the event, its own procedures, its own record-keepers with different levels of training.

Who? Who actually created this document? The person whose name is at the top isn't always the author. A census record was created by an enumerator visiting your ancestor's home — not by your ancestor. An obituary was written by a newspaper editor, often based on information supplied by the family. A court transcript was created by a court reporter following conventions about what gets recorded verbatim and what gets summarized.

From what pre-existing material? This is the question historians call analysis. Did the document's creator have firsthand knowledge, or were they working from earlier records, hearsay, or their own assumptions? A county history published in 1880 about events in 1820 was probably compiled from interviews with elderly residents, earlier local newspapers, and the author's judgment about which stories were worth telling. Each of those layers introduces potential distortion.

In what original form? Is this the document as it was originally created, or has it been edited, abridged, or summarized? Many document types that look comprehensive are actually abstracts. A passport application summary is not the same as the original application. A "register" of births is often a later compilation from individual records, not the records themselves.

What is the evidential value? This is the synthesis question. Given everything you've learned about who made this document, when, where, from what sources, in what form — what can you actually use it as evidence for? Not what it says, but what it proves. These are often quite different things.

How Institutional Creation Shapes What Gets Recorded

This is one of the most important ideas in all of archival research, and it's one that most guides don't emphasize enough: records don't exist because events happened. They exist because an institution had a reason to care about those events.

Think about what that means in practice. The federal census recorded households in which a free person was present and willing to answer the door. Before 1870, it almost never recorded enslaved people by name: the 1850 and 1860 slave schedules listed them only by age, sex, and color under their enslaver's name, and earlier censuses reduced them to tallies under the enslaver's household entry. This wasn't accidental; it reflected the political and social decisions that shaped the census itself. You can find detailed analysis of how these design decisions affected what census records reveal in the Census Bureau's own historical documentation.

Consider what bureaucracies consistently don't record:

Events that were so routine they seemed unworthy of documentation
Events that would have been legally or politically embarrassing to record
People who actively avoided documentation (for very good reasons)
Events in communities the institution didn't reach, trust, or care about

A county probate court created records when estates went through official channels. When a family divided property informally among themselves, with no dispute and no court involvement, there's no record. The court's silence doesn't mean property wasn't transferred — it means the institution wasn't involved.

This is why understanding who created a record and why isn't a formality. It's the key to knowing what questions you can and can't answer with that record.

Remember: The absence of a record is not evidence that an event didn't happen. It's evidence that the institution responsible for recording it either didn't know about it, didn't care, or had reasons not to document it. These are very different things.

Reading Against the Grain

One of the more powerful techniques in a historian's toolkit is reading a document against the grain — finding information the document wasn't designed to reveal, or drawing inferences from what it chose not to say.

A pension application submitted by a Civil War veteran in 1890 was designed to establish that the applicant was disabled and had served honorably. That's what the official form asked for. But buried in the supporting affidavits — sworn statements from fellow soldiers and neighbors — you might find offhand references to prewar occupations, family relationships, the community's opinion of the applicant, or the circumstances of specific battles. None of that was the point of the document, but it's all there.

Similarly, a newspaper account of a labor strike in 1905 was designed to inform readers (and to serve the interests of the paper's owner and advertisers). As FSU's guide to newspaper analysis notes, newspapers "record historical events, but they do so in a way that reflects the concerns, opinions, and debates of their communities." A strike reported in a business-friendly daily will look different from the same strike reported in a labor paper. Both are telling you something true. But reading them against their own framing — asking whose voices are missing, what the paper assumes its readers already know, what it treats as obvious — can reveal as much as what's explicitly stated.

Reading against the grain requires holding two ideas simultaneously: taking a document seriously as evidence while remaining skeptical of its framing. It's not about assuming documents lie. It's about recognizing that all documents were made by humans with perspectives, constraints, and purposes — and that those purposes don't always align with yours as a researcher.

Corroboration: Why One Source Is Never Enough

The historical method framework is clear on this point: "If two independently created sources agree on a matter, the reliability of each is measurably enhanced." The inverse is equally true: a claim supported by only one source, however official-looking, should be held more lightly than a claim supported by three independent sources that reached similar conclusions by different paths.

Corroboration isn't just about confirming facts. It's about triangulating toward a more complete picture of what happened, how, and to whom.

Consider a practical example from genealogy research. You find a death certificate that lists your ancestor's cause of death as "pneumonia" in 1918. That's useful. But 1918 is the year of the influenza pandemic — and pneumonia was frequently recorded as the cause of death in influenza patients, because the flu created conditions for fatal secondary pneumonia. If you corroborate with:

Local newspaper death notices from the same week
Church burial records showing multiple deaths in the same household
County health department reports (if they survive) from that period

...you can build a much richer picture of whether your ancestor died of the flu pandemic, and what that meant for the community.

The rule of thumb: one source is a starting point, two sources in agreement are suggestive, three independent sources are getting toward solid ground. And "independent" matters — two records that both trace to the same informant (say, two documents where the family supplied the information) don't count as independent corroboration.
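The independence test can be made concrete with a small sketch. It assumes that every record agrees on the claim and that you have already worked out who supplied the information for each one. The field names, the informant labels, and the helper functions are all hypothetical. The idea is to count informants, not documents.

```python
from collections import defaultdict

# Hypothetical records that all agree on one claim. The "informant" values
# are judgments a researcher would have to make from the records themselves.
sources = [
    {"record": "death certificate", "informant": "family"},
    {"record": "newspaper death notice", "informant": "family"},
    {"record": "church burial register", "informant": "minister"},
    {"record": "county health report", "informant": "health officer"},
]


def independent_groups(sources):
    """Group records by the informant each one ultimately traces to."""
    groups = defaultdict(list)
    for source in sources:
        groups[source["informant"]].append(source["record"])
    return dict(groups)


def strength(n_independent):
    """Map a count of independent informants to the rule of thumb above."""
    if n_independent <= 0:
        return "no evidence"
    if n_independent == 1:
        return "starting point"
    if n_independent == 2:
        return "suggestive"
    return "getting toward solid ground"


groups = independent_groups(sources)
print(len(sources), "records,", len(groups), "independent informants")
print(strength(len(groups)))  # getting toward solid ground
```

Four records collapse to three independent lines of evidence here, because the death certificate and the newspaper notice both rest on what the family reported.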

Absence of Evidence: What Silence Tells You

This is one of the trickiest problems in archival research. You search for a record and don't find it. What does that mean?

It could mean:

The event didn't happen
The event happened but was never recorded
The event was recorded, but those records were later destroyed
The records survive but haven't been indexed, digitized, or described in a way that your search could find
The records exist but in a location you haven't thought to look

Before concluding that the absence of a record means the absence of an event, you need to do some infrastructure work. What records should exist if this event happened? What institution would have created them? Do those records survive in general — or is this a period or place where record survival is known to be poor? Could there be a naming variation, a filing system quirk, or a geographic boundary issue that put the record in an unexpected location?

Absence of evidence becomes most meaningful when you can establish that:

Records of this type were systematically created during this period
Those records are substantially intact for this region and time
You've searched them thoroughly using all likely name variants and locations

Only then does "I couldn't find a record" carry genuine evidential weight — and even then, it means "not found" rather than "doesn't exist."
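The three conditions above work as a gate, and a short sketch shows the logic. The NegativeSearch class, its field names, and the wording of the messages are illustrative assumptions, not a standard tool. A failed search only counts as evidence once all three conditions hold, and until then the useful output is a list of what to check next.

```python
from dataclasses import dataclass


@dataclass
class NegativeSearch:
    """What you know about a search that turned up nothing (illustrative fields)."""
    systematically_created: bool  # records of this type were routinely made then
    substantially_intact: bool    # they survive for this region and period
    searched_thoroughly: bool     # all likely name variants and locations tried


def interpret(search: NegativeSearch) -> str:
    """Say how much weight a failed search can bear, or what to check next."""
    todo = []
    if not search.systematically_created:
        todo.append("confirm such records were routinely created")
    if not search.substantially_intact:
        todo.append("check record survival for this place and period")
    if not search.searched_thoroughly:
        todo.append("search remaining name variants and locations")
    if todo:
        return "inconclusive; next: " + "; ".join(todo)
    return "not found: real negative evidence, but not proof of absence"


first_pass = NegativeSearch(True, True, False)
print(interpret(first_pass))
# inconclusive; next: search remaining name variants and locations
```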

Warning: "No record = didn't happen" is one of the most common interpretive errors in amateur research. It's also one of the most consequential, because it can lead to abandoning a promising research direction entirely. Before concluding an event left no paper trail, ask whether you've found the right paper trail to search.

Handling Contradictory Sources

Sources disagree. They disagree constantly. This is not a problem to be solved — it's information to be interpreted.

The historiographic tradition offers several principles for navigating contradictions:

Don't default to majority rule. If four sources say one thing and one source says another, the majority version doesn't automatically win. You need to examine whether those four sources might all ultimately trace to the same original report.

Prefer the source with the most authority for the specific claim. An eyewitness account of what someone said in a meeting is more authoritative than a secondhand report of that meeting, but the secondhand report might be more authoritative about the general context of why the meeting happened at all.

Consider what each source had to lose or gain. A defendant's own testimony in a trial and a prosecutor's summary of that testimony will both reflect their respective positions. Neither is simply wrong. They're both telling you something.

Look for what all versions agree on. When sources contradict each other, there's often a layer of shared facts beneath the disagreement. Those points of agreement are often more reliable than either version's interpretation of what those facts mean.

In practice, contradictory sources are often a sign that you're getting close to something interesting. Disagreements tend to cluster around contested events, disputed relationships, or situations where multiple parties had different interests. That friction is historiographically valuable: it's where the human drama is.

Common Interpretive Traps

Even experienced researchers fall into these. Knowing they exist is half the battle.

Presentism

Presentism is the error of interpreting past behavior, decisions, or values through contemporary moral and social frameworks. It's not wrong to have contemporary values — but it can make you systematically misread documents.

An 18th-century probate record that values household goods and enslaved people in the same inventory isn't telling you something unusual about that specific family. It's telling you something about the legal and economic structures of that society. If you respond with visceral shock rather than careful analysis, you might miss what the document is actually telling you about property, inheritance, and the specific circumstances of this case.

This doesn't mean detachment from the horror of what you're reading. It means holding your analysis separate from your moral response, so both can function.

False Precision

Historical documents often contain numbers, dates, and statistics that look precise but aren't. Ages in census records are famously approximate: many respondents genuinely didn't know their exact age, and enumerators often estimated. Heights on Civil War enlistment papers were often recorded to the nearest inch by men with no measuring equipment. Early censuses substantially undercounted the population.

Resist the urge to treat numbers in documents as more exact than they are just because they're written down. "Born circa 1847" is often more honest than "born 14 March 1847" when the evidence doesn't really support that level of specificity.
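One way to keep yourself honest is to turn a reported age into a range of birth years instead of a single year. The sketch below is illustrative only. The function name is made up, and the default tolerance of two years is an assumption you should adjust to the record type and informant. Even a perfectly accurate age implies two possible birth years, because the person may or may not have had a birthday yet when the census was taken.

```python
def birth_year_range(census_year: int, reported_age: int, tolerance: int = 2) -> tuple[int, int]:
    """Estimate plausible birth years from an age reported in a census.

    An exact age still spans two birth years (birthday before or after the
    census date). The tolerance widens the range further to allow for
    rounding, guessing, and second-hand reporting; 2 is an arbitrary default.
    """
    latest = census_year - reported_age  # birthday already passed that year
    earliest = latest - 1                # birthday still to come that year
    return earliest - tolerance, latest + tolerance


# A person recorded as age 45 in the 1880 census.
print(birth_year_range(1880, 45, tolerance=0))  # (1834, 1835)
print(birth_year_range(1880, 45))               # (1832, 1837)
```

Recording the range in your notes, together with the source and informant behind it, makes it obvious later that "born 1835" was an inference and not something a document actually said.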

Confirmation Bias

You're looking for evidence that your great-grandfather was a doctor. You find a document that says he had a "practice" in town. You interpret this as medical practice.

Maybe it was. Maybe he had a legal practice, an accounting practice, or used the word in some other sense entirely.

Confirmation bias in archival research is insidious because archives reward persistence — and persistence can tip into looking for evidence that confirms what you already think, rather than following the evidence wherever it goes. The discipline is to ask: What would this document look like if my theory were wrong? Would I still be able to find it?

Worked Examples: Source Criticism in Practice

Theory is easiest to understand in motion. Here's how source criticism applies to three document types we've covered in this course.

Example 1: A Census Record

You're looking at the 1880 U.S. Census for a family in rural Georgia. The enumerator has recorded:

Head of household: James T. Hendricks, age 45, farmer
Wife: Martha Hendricks, age 40
Son: William Hendricks, age 16

External criticism: This is a federal government document created through a systematic enumeration process. The original schedule in NARA is almost certainly genuine. The digitized image you're seeing is a photographic reproduction — check whether you're looking at the original schedule or an indexed database entry, since transcription introduces errors.

Internal criticism: Who actually created this record? An enumerator hired locally, who knocked on doors and wrote down the answers he was given aloud. He might have spelled the family name wrong. He might have guessed ages. He recorded what the person at the door told him, and that person may or may not have been James Hendricks himself. Martha's age may have been supplied by James. William's age may have been estimated.

The census records James as a "farmer" — but this category encompassed everything from prosperous landowners to sharecroppers to subsistence farmers. Without cross-referencing with agricultural schedule data or county tax records, "farmer" tells you relatively little about James's economic position.

What it does tell you with some reliability: this family unit existed in this place at this time, with approximately these relationships and ages. Everything else — the older children who may have moved out, the extended family members who may have lived nearby, the actual ages, the economic circumstances — requires corroboration from other sources.

Example 2: A Newspaper Article

You find an 1887 article in a small-town Ohio newspaper headlined "DARING ROBBERY AT HENDERSON'S STORE." The article describes a nighttime break-in, names a suspect who was arrested, and says the suspect confessed.

External criticism: Confirm that the newspaper itself existed as an institution. (A search of Chronicling America can help.) Is this issue consistent with the paper's known publication history? Are the typography and print quality consistent with other issues from that year?

Internal criticism: As FSU's guide to historical newspapers notes, newspapers are businesses shaped by their community and readership. An 1887 small-town paper was probably owned by someone with local business and political interests. Who was Henderson? Was the publisher connected to him?

More critically: the article says the suspect "confessed." Who told the newspaper this? Almost certainly the arresting officer or county sheriff, not the suspect himself. A reported confession in a newspaper is not the same as a documented confession in a court record. Check court records from the same county and period — did this case go to trial? What was the outcome? If the suspect was acquitted, that changes the meaning of the newspaper account considerably.

The article is excellent evidence that something happened at Henderson's store in 1887 and that there was a public suspect. It's weaker evidence for exactly what happened, and quite poor evidence for guilt.

Example 3: A Court Document

You're reading a plaintiff's complaint filed in federal district court in 1923, in which a Black sharecropper named Robert Williams alleges that a white landowner violated a contract.

External criticism: Federal court records are among the most rigorously maintained document types in American history. NARA holds these records systematically. Verify the case through the docket — if the case number appears in the court's docket book with matching parties and date, the document is almost certainly genuine.

Internal criticism: This document was created by Robert Williams's attorney, for the purpose of stating his client's best case as persuasively as possible. It is not a neutral description of events. It is advocacy.

Reading against the grain: the complaint will tell you what Williams and his attorney believed they could prove — which is itself valuable information. The specific allegations, the evidence they reference, the legal theory they chose to pursue all reveal the constraints and possibilities of a Black plaintiff bringing a contract claim in 1920s federal court.

What don't you have here? The defendant's answer. The testimony of witnesses. The judge's rulings. The outcome. Each of those would significantly change your understanding of the event. A complaint without the full case file is like reading the opening argument in a trial and assuming that's the whole story.

Putting It Together: The Researcher's Discipline

Source criticism isn't a checklist you run through once and set aside. It's a habit of mind — a permanent skepticism about your own interpretations, held alongside genuine enthusiasm for what documents can reveal.

The course thesis is worth restating here, because this section is where it becomes most concrete: primary sources don't speak for themselves. A census entry doesn't tell you its own limitations. A newspaper article doesn't announce its owner's political connections. A court complaint doesn't volunteer that you're only seeing one side of the dispute.

You have to bring those questions to the document. And once you have the habit of asking them — automatically, without thinking about it as a method — you'll find that documents become richer, not more suspicious. You start noticing things you would have walked past. The enumerator who recorded every family in a neighborhood but somehow skipped the Black households entirely. The newspaper that covered a union meeting with careful neutrality until the same paper's owner was mentioned — and then suddenly adopted a different tone. The pension applicant who described his service in curiously vague terms that nonetheless matched exactly what he would have needed to claim in order to qualify.

That's not cynicism. That's reading. And it's how archival research becomes genuinely illuminating rather than just a scavenger hunt for names and dates.
