Research methodology
Before recruiting real people, we generate synthetic participants drawn from probability distributions. Each participant gets a consistent persona, realistic speech patterns, and responses that reflect their role, industry, and company. The result is a complete transcript dataset we can use to test and refine the entire research pipeline.
Real recruitment takes weeks; scheduling and conducting 80 interviews takes weeks more. By the time the first real transcripts arrive, any problems in the interview guide, the coding pipeline, or the analysis framework are expensive to fix.
Simulation inverts that order. We run the full research workflow on synthetic data first, identify and fix every structural problem, and arrive at real recruitment with a proven setup. Simulation also provides a clean dataset for internal methodology demonstrations and client-facing process documentation.
Run the full coding and analysis workflow before a single real interview is scheduled. Catch problems in the guide, codebook, or report structure while changes are cheap.
Simulate 20 participants first. If certain questions produce thin, uniform answers, the guide needs work. Better to learn that from synthetic data than from 80 real interviews.
Real recruitment rarely hits exact targets. Simulation lets us verify that our probability distributions produce the intended sample composition before committing to a recruiting spec.
Simulated transcripts are a clean, concrete demonstration of how the approach works. They power methodology documentation, client presentations, and internal training.
Every participant is defined by four independently sampled dimensions. The first three establish who the participant is. The fourth controls how they talk.
The first three dimensions are the structural characteristics of the participant: for B2B studies, typically seniority, industry, and company size. Each variable is drawn from a defined probability distribution that reflects the target population.
Industry shapes vocabulary, tooling, regulatory context, and the specific pain points a participant will describe. Weight industries to reflect the target market, not equal splits.
Verboseness controls how much each participant talks and determines the total interview length. It is sampled independently of all other dimensions — a VP can be terse, a manager can be expansive. Response depth is calibrated against a speaking pace of 150 words per minute.
- Very Verbose: Tells stories, shares multiple examples, goes on tangents, elaborates without prompting.
- Somewhat Verbose: Reasonable depth; answers with context but does not over-share.
- Not Verbose: Short, direct answers. Few stories unless prompted.
Before generating any transcripts, generate the complete list of all N participants with their assigned values. This locks in the sample distribution so you can verify it matches targets before running the expensive transcript generation step.
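The roster-generation step can be sketched in plain Python. The distributions below are placeholders, not values from any real study spec, and `generate_roster` and `realized_distribution` are illustrative names:

```python
import random
from collections import Counter

# Placeholder distributions for illustration -- replace with the
# study's participant spec.
DIMENSIONS = {
    "seniority":    {"Manager": 0.5, "Director": 0.3, "VP": 0.2},
    "industry":     {"Tech/Software": 0.4, "Healthcare": 0.3, "Finance": 0.3},
    "company_size": {"100-500": 0.5, "500-5000": 0.3, "5000+": 0.2},
    "verboseness":  {"Not Verbose": 0.5, "Somewhat Verbose": 0.3, "Very Verbose": 0.2},
}

def generate_roster(n: int, seed: int = 0) -> list[dict]:
    """Draw each dimension independently for all n participants."""
    rng = random.Random(seed)  # seeded so the roster is reproducible
    roster = []
    for pid in range(1, n + 1):
        participant = {"participant_id": pid}
        for dim, dist in DIMENSIONS.items():
            values, weights = zip(*dist.items())
            participant[dim] = rng.choices(values, weights=weights, k=1)[0]
        roster.append(participant)
    return roster

def realized_distribution(roster: list[dict], dim: str) -> Counter:
    """Count how the sampled roster actually landed, for the pre-flight check."""
    return Counter(p[dim] for p in roster)
```

Comparing `realized_distribution(roster, dim)` against the target weights for each dimension is the checkpoint that locks in the sample before transcript generation.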
Simulation follows a structured sequence. Each step has a clear input and output, and steps 2 and 4 include explicit checkpoints before proceeding.
Create a participant spec document for the study. This is the single source of truth for the simulation. It specifies the total number of participants, the four dimensions, and the probability distribution for each dimension.
Ask Claude to randomly assign each participant their dimension values by drawing from the defined distributions. Claude outputs a complete table of all N participants. Before proceeding, verify that the realized distribution matches the target distributions.
Feed Claude the participant roster, the interview guide, and the simulation rules. Claude generates transcripts in batches of 25 participants. Each batch is saved as a separate JSON file to keep file sizes manageable for downstream processing.
For each participant, Claude must answer every question in the interview guide and calibrate response depth to hit the verboseness target.
Batches are saved to research/Interview Projects/{Study Name}/ as transcripts-001-025.json, transcripts-026-050.json, transcripts-051-075.json, and transcripts-076-100.json (for a 100-participant study). After each batch, review a sample of transcripts manually. Check three things: every question in the guide has a response, estimated duration falls within the acceptable range for the participant's verboseness group, and responses stay consistent with the participant's assigned persona.
If any transcript fails validation, ask Claude to redo that participant before saving the batch.
These rules apply to every participant in every study. They are included verbatim in the simulation prompt so Claude applies them consistently.
Every question in the interview guide must be asked to every participant. No questions may be skipped. Participants may say they do not know the answer, but they cannot skip it.
Calibrate response depth so estimated interview duration (word count / 150 words per minute) falls within the acceptable range for the participant's verboseness group. Redo if outside range.
Simulate at Maze-output level — cleaned but conversational. Omit um, uh, hmm, and heavy false starts (Maze strips these). Include hedge markers like "I think," "I mean," "I guess," "kind of," "you know" (2–4% of words). Target 15–22 words per sentence. Allow 0–1 subtle self-repairs per participant.
Each participant's answers must reflect their assigned seniority, industry, and company size. A VP at a 5,000-person tech company has different vocabulary, concerns, and frustrations than a manager at a 200-person healthcare company.
Studies are general market research unless specified otherwise. Do not skew awareness, sentiment, or behavior toward any specific brand, product, or vendor.
When participants mention company HQ location, weight toward high-density business states (California, Texas, New York, Florida, Illinois, Massachusetts) while including some spread across others.
Any spend figures, budget ranges, or willingness-to-pay amounts should scale realistically with company size. A 150-person company and a 5,000-person company operate on different budget scales entirely.
Larger companies know enterprise tools. Smaller companies know SMB tools. Mid-market companies know a mix. VPs and Directors have broader awareness than Managers. Awareness should follow these patterns, not be uniformly distributed.
After completing each participant, verify that every question has a response before moving to the next. If any question is missing, add it before proceeding.
Estimate interview duration after finishing each participant. If outside the acceptable range for their verboseness group, redo the participant before proceeding. Do not batch up out-of-range transcripts and fix them later.
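A minimal validator for the two per-participant checks above (question completeness and duration calibration) might look like the sketch below. The duration bands per verboseness group are assumptions for illustration; the real ranges live in each study's participant spec.

```python
# Assumed duration bands (minutes) per verboseness group -- illustrative
# only; substitute the ranges defined in the study's participant spec.
DURATION_RANGE = {
    "Not Verbose": (10, 20),
    "Somewhat Verbose": (20, 35),
    "Very Verbose": (35, 60),
}
WORDS_PER_MINUTE = 150  # speaking pace used throughout the methodology

def validate_participant(participant: dict, question_ids: list[str]) -> list[str]:
    """Return a list of validation problems (empty list == passes)."""
    problems = []
    # Check 1: every question in the guide has a response.
    answered = {turn["question_id"] for turn in participant["transcript"]}
    missing = [q for q in question_ids if q not in answered]
    if missing:
        problems.append(f"missing questions: {missing}")
    # Check 2: estimated duration (word count / 150 wpm) within range.
    total_words = sum(len(t["response"].split()) for t in participant["transcript"])
    minutes = total_words / WORDS_PER_MINUTE
    lo, hi = DURATION_RANGE[participant["verboseness"]]
    if not (lo <= minutes <= hi):
        problems.append(f"duration {minutes:.1f} min outside {lo}-{hi}")
    return problems
```

Running this after each participant, rather than batching fixes, matches the redo-before-proceeding rule.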
Our interviews are recorded on Maze and AI-transcribed by Maze's platform. Maze's transcription pipeline does two things that shape the output significantly: it strips vocal disfluencies (um, uh, hmm) almost entirely, and it merges spoken clauses — so sentences in the transcript are longer than they were in real speech.
This means the right simulation target is Maze-output-level speech, not raw unedited speech. We simulate directly at that level rather than generating raw speech and running a separate cleaning pass.
Raw speech: "Um, so we were on Greenhouse — actually no, we were still on Lever at that point — and, uh, the biggest pain was just, like, getting the hiring managers to actually fill out their scorecards."

Maze output: "So we were on Lever at that point and the biggest pain was getting the hiring managers to fill out their scorecards."
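The style targets (2–4% hedge words, 15–22 words per sentence) can be spot-checked with a rough script like the following. The hedge list and both metrics are approximations for review purposes, not part of the documented pipeline:

```python
import re

# Hedge markers named in the style rule; the list is illustrative, not exhaustive.
HEDGES = ["i think", "i mean", "i guess", "kind of", "you know"]

def style_metrics(text: str) -> dict:
    """Rough hedge-rate and sentence-length metrics for one response."""
    words = text.lower().split()
    n_words = len(words)
    joined = " ".join(words)
    # Each hedge phrase contributes its word length, since the 2-4% target
    # is expressed as a share of total words.
    hedge_words = sum(joined.count(h) * len(h.split()) for h in HEDGES)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    avg_len = n_words / len(sentences) if sentences else 0.0
    return {
        "hedge_pct": 100.0 * hedge_words / n_words if n_words else 0.0,
        "avg_sentence_words": avg_len,
    }
```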
All transcripts are saved as JSON. Each file contains a batch of participants. The structure is consistent across studies, which means the coding pipeline can ingest simulated and real transcripts through the same process.
{
"study": "Study Name",
"participants": [
{
"participant_id": 1,
"seniority": "Director",
"industry": "Tech/Software",
"company_size": "100-500",
"verboseness": "Not Verbose",
"transcript": [
{
"question_id": "Q1.1",
"question": "Full question text...",
"response": "Participant's simulated response..."
},
{
"question_id": "Q1.2",
"question": "Next question...",
"response": "Next response..."
}
]
},
{
"participant_id": 2,
...
}
]
}

25 participants per file. Larger batches approach context window limits and make it harder to spot errors in any individual participant.
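The batch filename convention follows mechanically from N and the batch size. This helper is a sketch, not part of the documented tooling:

```python
def batch_filenames(n_participants: int, batch_size: int = 25) -> list[str]:
    """Filenames following the transcripts-001-025.json convention."""
    names = []
    for start in range(1, n_participants + 1, batch_size):
        end = min(start + batch_size - 1, n_participants)  # final batch may be short
        names.append(f"transcripts-{start:03d}-{end:03d}.json")
    return names
```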
Participant-level fields (seniority, industry, company_size, verboseness) are stored alongside the transcript so coding agents can access them without joining to a separate roster file.
Every question carries both an ID and the full question text. IDs enable exact joins to the codebook; full text is available for coding agents that need it without a lookup.
The same structure is used for real participant transcripts. This means the coding pipeline, analysis scripts, and report templates work identically on both simulated and real data.
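Because simulated and real transcripts share one JSON structure, a loader for the downstream coding steps can be as simple as the sketch below (the function name is illustrative):

```python
import json
from pathlib import Path

def load_study_transcripts(study_dir: str) -> list[dict]:
    """Read every transcript batch file in a study folder into one list.

    Works identically for simulated and real data, since both use the
    same JSON structure.
    """
    participants = []
    # Sorted glob preserves batch order: transcripts-001-025.json first.
    for path in sorted(Path(study_dir).glob("transcripts-*.json")):
        batch = json.loads(path.read_text())
        participants.extend(batch["participants"])
    return participants
```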
Synthetic participants are useful only if they produce data that exercises the research pipeline the same way real participants would. These are the quality signals we look for.
If multiple simulated participants give thin or nearly identical answers to a question, the question is probably too narrow, too leading, or positioned too late in the guide. This is exactly what simulation is for. Fix the guide and regenerate before real recruitment begins.
Similarly, if the codebook discovery pipeline produces very few distinct themes from a simulated dataset, the interview questions may not be generating enough variation to support meaningful segmentation.
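One way to surface thin or uniform questions is a per-question response-length report over the whole simulated sample. This is a rough sketch, not the documented codebook pipeline:

```python
from statistics import mean, pstdev

def question_depth_report(participants: list[dict]) -> dict[str, dict]:
    """Mean and spread of response length per question.

    A low mean flags thin answers; a low standard deviation flags
    uniform answers -- both signs the question needs rework.
    """
    by_question: dict[str, list[int]] = {}
    for p in participants:
        for turn in p["transcript"]:
            by_question.setdefault(turn["question_id"], []).append(
                len(turn["response"].split())
            )
    return {
        qid: {"mean_words": mean(lengths), "stdev_words": pstdev(lengths)}
        for qid, lengths in by_question.items()
    }
```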
| File type | Location | Notes |
|---|---|---|
| Participant spec / sampling framework | research/00 How to simulate participants/examples/ | Reusable methodology reference, not study-specific data |
| Participant roster | Appended to the participant spec document | The realized sample — stays with the spec |
| Transcript JSON files | research/Interview Projects/{Study Name}/ | One folder per study; batched by 25 |
| Coded data and analysis scripts | research/Interview Projects/{Study Name}/ | Output of the coding pipeline |
| Interview guide (study-specific) | research/Interview Projects/{Study Name}/ | Or in research/0 How to prepare interview guides/ if reusable |
Starting a new simulation requires two inputs: an interview guide and a participant spec. Everything else follows from those two documents. Here is exactly what to hand Claude to kick off a new study.
The guide should be a numbered list of questions with question IDs (Q1, Q2.1, Q3, etc.). If you have a guide already, that is your input. If not, create one first using the interview guide preparation workflow.
Save the guide to research/Interview Projects/{Study Name}/ before you begin.
Hand Claude the following prompt, filling in the bracketed values for your study. Claude will build the participant spec, generate the roster, and begin producing transcripts in batches of 25.
    I want to simulate [N] interview participants for a study called "[Study Name]".

    PARTICIPANT POPULATION
    [Describe who these people are — role, function, company type]

    DIMENSIONS AND DISTRIBUTIONS
    Seniority:
      [Level]: [%]
      [Level]: [%]
    Industry:
      [Industry]: [%]
      [Industry]: [%]
    Company size:
      [Range]: [%]
      [Range]: [%]
    Verboseness: 50% Not Verbose / 30% Somewhat Verbose / 20% Very Verbose [Adjust if needed]

    INTERVIEW GUIDE
    [Paste your full interview guide here, with question IDs]

    INSTRUCTIONS
    1. First, build the participant spec document and save it to research/00 How to simulate participants/examples/[study-name]-participants.md
    2. Generate the participant roster (all [N] participants with assigned dimension values). Show me the roster table and the realized distribution before proceeding to transcripts.
    3. Once I approve the roster, generate transcripts in batches of 25, saving each batch to research/Interview Projects/[Study Name]/transcripts-001-025.json (and so on for each batch).
    4. Apply the Maze-transcript speech style and all simulation rules from research/00 How to simulate participants/README.md
Claude will pause after generating the roster and show you the realized distribution. This is your checkpoint to verify the sample looks right before committing to the full transcript generation. Check that each dimension's realized distribution is close to its target and that the roster contains all N participants.
Once you approve, Claude generates transcripts batch by batch. Each batch of 25 takes roughly 10–20 minutes and one API call.
Once all transcript batches are saved, run the coding pipeline (research/2 How to code transcripts/) pointing at your new study folder. The pipeline reads the same JSON format that simulation produces, so there is no conversion step. Simulated and real transcripts go through exactly the same downstream process.