Data Analysis Plan

A Data Analysis Plan (DAP) is a lightweight internal document used to track how we structure and analyze data during a project. It serves as a central reference for the team and evolves over time as the analysis progresses.

This is not a formal preregistration or replication package. Instead, it helps coordinate analysis decisions, catch issues early, and make handoffs or writeups easier.

Context

The term DAP can mean different things depending on where it appears:

  • In a project proposal, it is a detailed plan of methods and expected outcomes.
  • For a published paper, it is the full set of code and materials needed for reproducibility.
  • In our group, a DAP is an evolving planning document focused on:
    • What data we have,
    • What we want to report or visualize,
    • What we plan to analyze and how.

When to Update

Update the DAP when:

  • New data is added
  • Variables or codes are revised
  • Thematic or analytic structures change
  • Preparing for handoff, manuscript writing, or review

Structure

A DAP in our group should cover the following:


1. Data Sets

For all data:

  • Type and amount (e.g., 50 survey responses, 12 interviews)
  • Source and format (e.g., survey CSV export, interview transcripts)
  • Data cleaning or preprocessing steps (sketched in code at the end of this subsection)
  • Inclusion/exclusion criteria
  • Versioning notes (e.g., v3 excludes test participants)

For qualitative data:

  • Transcription details (e.g., anonymized, timestamped, AI model used)
  • Storage location and format (e.g., Markdown, docx, coded in Atlas.ti)
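
The cleaning and versioning notes usually map directly onto a few lines of code. A minimal sketch, assuming a pandas workflow with hypothetical file and column names (raw_survey.csv, with boolean finished and is_test columns):

```python
import pandas as pd

# Load the raw export (hypothetical file name).
raw = pd.read_csv("raw_survey.csv")

# Each inclusion/exclusion step corresponds to a versioning note.
v2 = raw.loc[raw["finished"]]    # v2: keep only completed sessions
v3 = v2.loc[~v2["is_test"]]      # v3: exclude test participants

# Record the counts that feed the DAP's data set tables.
print(f"raw: {len(raw)}, dropout: {len(raw) - len(v2)}, "
      f"test removed: {len(v2) - len(v3)}, valid: {len(v3)}")

v3.to_csv("survey_v3_valid.csv", index=False)
```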

2. Descriptive Overview / Reporting

Quantitative:

  • Variables used for descriptive statistics
  • Grouping/binning strategies
  • Planned summary tables or figures (sketched below)

Qualitative:

  • Participant or context metadata (if relevant)
  • Summary of common patterns or anticipated codes (optional)
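
On the quantitative side, the grouping/binning strategy and the planned summary tables can be drafted directly as code. A minimal sketch, assuming hypothetical age and sus_score columns in the cleaned export from above:

```python
import pandas as pd

df = pd.read_csv("survey_v3_valid.csv")  # hypothetical cleaned export

# Binning strategy: age into reporting bins (right=False means 18-24 is [18, 25)).
df["age_group"] = pd.cut(
    df["age"],
    bins=[18, 25, 35, 50, 120],
    labels=["18-24", "25-34", "35-49", "50+"],
    right=False,
)

# Planned summary table: descriptive statistics per group.
print(df.groupby("age_group", observed=True)["sus_score"].agg(["count", "mean", "std"]))
```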

3. Analysis

Quantitative:

  • Variables used in models or statistical tests
  • Derived or recoded variables
  • Planned statistical methods (e.g., regression, clustering)

Qualitative:

  • Coding approach (inductive, deductive, hybrid)
  • Number of coders and division of labor
  • Conflict resolution strategy (e.g., consensus meetings)
  • Reference to codebook or thematic documents (include link)
  • Reference related work or papers where thematic framing is reused

Example: Quantitative

(A fake study that ChatGPT and I made up for this example.)

Study title: LoginUX: Evaluating Usability Tradeoffs in Multi-Factor Authentication

1. Data Sets

We collected simulated login task data from two participant pools.

Data set LoginUX.mturk

Data Set        Initial (mturkRaw)   Drop-Out / Removed (mturkDropout / mturkInvalid)   Valid (mturkValid)
LoginUX.mturk   183                  10 / 1                                             172

Data set LoginUX.internal

  • Subset LoginUX.internal.stem
  • Subset LoginUX.internal.nonstem

Data Set                   Initial (raw)   Drop-Out / Removed (dropout / invalid)   Valid
LoginUX.internal.stem      38              3 / 2                                    33
LoginUX.internal.nonstem   10              1 / 0                                    9
LoginUX.internal           48              4 / 2                                    42

Combined valid: totalValid = mturkValid + internalValid = 172 + 42 = 214

2. Descriptive Overview / Reporting

  • Demographics:
    • Age groups: 18–24, 25–34, 35–49, 50+
    • Gender: F / M / Other
    • Technical background: STEM / Non-STEM
  • Task structure:
    • Participants completed three login tasks (SMS, App, WebAuthn)
    • 10-item SUS after each task (scoring sketched below)
    • Open-ended feedback at session end (stored separately)
  • Planned outputs:
    • Task completion time by method (boxplots)
    • SUS score distributions by method and background
    • Error types per method
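
Since SUS scoring is easy to get wrong, it is worth pinning down in the DAP: with the standard 1–5 Likert responses, odd items contribute (response − 1), even items contribute (5 − response), and the sum is scaled by 2.5 to give a 0–100 score. A minimal sketch:

```python
def sus_score(responses):
    """Standard SUS scoring for ten 1-5 Likert responses, yielding 0-100."""
    assert len(responses) == 10
    total = sum(
        (r - 1) if i % 2 == 1 else (5 - r)   # odd items: r-1, even items: 5-r
        for i, r in enumerate(responses, start=1)
    )
    return total * 2.5

print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))  # 75.0
```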

3. Analysis

  • Key variables:
    • Dependent: sus_score
    • Independent: method, tech_background, task_order
    • Derived:
      • task_success (boolean: true if task completed)
      • interaction_time (log-transformed total time)
  • Planned models (sketched in code below):
    • Mixed-effects: sus_score ~ method + tech_background + (1 | participant_id)
    • Logistic regression: task_success ~ method + tech_background
    • Exploratory: ANOVA on interaction_time
    • Holm-Bonferroni correction for post-hoc comparisons
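
A minimal sketch of these models in statsmodels, assuming a long-format table with one row per participant × method and the column names above (hypothetical file loginux_long.csv):

```python
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.multitest import multipletests

df = pd.read_csv("loginux_long.csv")  # hypothetical long-format export

# Mixed-effects model: random intercept per participant.
mixed = smf.mixedlm(
    "sus_score ~ method + tech_background",
    data=df,
    groups=df["participant_id"],
).fit()
print(mixed.summary())

# Logistic regression on task success (outcome must be 0/1).
df["task_success"] = df["task_success"].astype(int)
logit = smf.logit("task_success ~ method + tech_background", data=df).fit()
print(logit.summary())

# Holm-Bonferroni correction over post-hoc p-values (placeholders here).
reject, p_adj, _, _ = multipletests([0.010, 0.038, 0.025], alpha=0.05, method="holm")
print(reject, p_adj)
```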

Example: Qualitative

(A fake study that ChatGPT and I made up for this example.)

Study title: DevPass: Developer Attitudes Toward Password Managers in Practice

This study combines two qualitative data sources to examine how developers perceive, adopt, and discuss password manager use: (1) semi-structured interviews, and (2) analysis of public GitHub issue threads in developer-facing security tools.


1. Data Sets

Interviews:

  • 18 conducted; 15 included in analysis (2 audio issues, 1 withdrawn)
  • Length: 35–60 minutes
  • Format: Verbatim, manually transcribed, anonymized
  • Storage: Markdown files in data/interviews/, tracked in interview_index.md

GitHub Issues:

  • 102 issues sampled from 6 open-source repositories tagged with authentication- or password-related labels (e.g., “auth”, “security”, “password”)
  • Repositories selected based on usage of password managers or integration with external credential stores
  • Thread data exported using the GitHub API, manually cleaned, and stored in data/issues/ as Markdown (export sketched below)
  • Inclusion criteria: technical relevance, non-duplicate, at least one substantive comment
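
The issue export can be reproduced through the GitHub REST API's issues endpoint. A minimal sketch using requests, with a hypothetical repository and label (the actual sample covered 6 repositories and was cleaned manually afterwards):

```python
import requests

# Hypothetical repository; add an API token header for higher rate limits.
url = "https://api.github.com/repos/example-org/example-tool/issues"
params = {"labels": "auth", "state": "all", "per_page": 100}

issues = requests.get(url, params=params, timeout=30).json()

for issue in issues:
    # The issues endpoint also returns pull requests; skip them.
    if "pull_request" in issue:
        continue
    # Inclusion criterion: at least one substantive comment.
    if issue.get("comments", 0) < 1:
        continue
    print(issue["number"], issue["title"])
```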

2. Descriptive Overview / Reporting

Participants (interviews):

  • Metadata collected: role, experience level, company size, password manager use
  • Grouped summaries by adoption status and technical background

Artifacts (issues):

  • Metadata recorded: repo name, issue date, number of comments, tags
  • Manual labeling of issue type (bug, feature, integration, user confusion, etc.)

Planned descriptive outputs:

  • Quote-driven overview of perceived barriers to adoption
  • Common failure points and integration problems from GitHub threads
  • Summary counts: tools mentioned, integration types, and usability complaints (counting sketched below)
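
The summary counts are straightforward once the manual labels exist. A minimal sketch with hypothetical labeled data:

```python
from collections import Counter

# Hypothetical outputs of the manual labeling pass.
issue_types = ["integration", "bug", "integration", "user confusion", "bug"]
tools_mentioned = [["1Password", "Bitwarden"], ["KeePass"], [], ["Bitwarden"], []]

print(Counter(issue_types))
print(Counter(t for tools in tools_mentioned for t in tools))
```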

3. Analysis

Coding strategy:

  • Hybrid inductive/deductive approach
  • Initial codebook derived from prior usable security studies (e.g., adoption frameworks, tool trust factors)
  • Codebook refined iteratively and versioned in codebook_v3.md

Process:

  • Interviews and issues coded using the same code set with source-specific annotations
  • Two coders (C1, C2), overlapping on 25% of the data (agreement check sketched below)
  • Disagreements discussed in calibration meetings; consensus coding applied before theme development
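
The plan above resolves disagreements by consensus rather than by an agreement statistic; if a number is wanted for the 25% overlap, Cohen's kappa is one common choice (an assumption here, not something this DAP specifies). A minimal sketch, assuming the two coders' labels for the overlapping segments are aligned in two lists:

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical aligned code assignments for the overlapping 25%.
c1 = ["barrier", "trust", "integration", "trust", "barrier"]
c2 = ["barrier", "trust", "integration", "barrier", "barrier"]

print(f"Cohen's kappa on the overlap: {cohen_kappa_score(c1, c2):.2f}")
```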

Theme construction:

  • Themes developed jointly across both sources
  • Special attention to cross-source convergence (e.g., when interview complaints match public support patterns)
  • Candidate themes documented in themes_v2.md with representative quotes from both data types

Related literature:

  • Supporting citations and reused framing concepts from prior work documented in related_work.md
  • Includes quote excerpts from existing studies for thematic comparison