
APOE Data Upload Process (NCRAD)

Introduction

This document outlines the step-by-step process NCRAD uses to:

  • Upload an APOE data CSV to Flywheel
  • Trigger the automated pipeline that splits files by center
  • Perform manual validation (“QC”) using staging outputs (no automated testing is performed by NCRAD or NACC)
  • Release the data to Alzheimer’s Disease Research Centers (ADRCs) by uploading a non-staging (production) file
  • Understand what ADRCs see in the ADRC Portal, including the download experience

Prerequisites

  1. Flywheel account access

    • NCRAD uploaders must have access to Flywheel.
    • If access is needed, contact NACC Help (nacchelp@uw.edu) to request Flywheel access.
  2. Correct Flywheel project access

    • You must know which Ingest project to upload APOE data to (e.g., ingest-apoe).
    • You must know which corresponding Staging project to use to review outputs (e.g., staging-apoe) when running in staging.
  3. Correct file naming

    • The center-splitting “gear” is triggered by specific filenames (regex rules).
    • Including “staging” in the filename routes outputs to staging for QC purposes (e.g., ncrad-apoe-staging-YYYY-MM-DD.csv).
    • Omitting “staging” from the filename routes outputs to the center distribution projects for release (e.g., ncrad-apoe-YYYY-MM-DD.csv).

Note: The exact required filenames may be finalized/maintained by NACC. If filenames change, the gear rules can be updated, but NCRAD must follow the current naming rules.
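The routing rule above can be checked locally before uploading. The following is a minimal sketch: the regular expressions are assumptions modeled on the example filenames in this document, not the actual gear rules maintained by NACC, and classify_filename is a hypothetical helper name.

```python
import re

# Assumed patterns, modeled on the example filenames in this document.
# The real gear rules are maintained by NACC and may differ.
STAGING_PATTERN = re.compile(r"ncrad-apoe-staging-\d{4}-\d{2}-\d{2}\.csv$")
RELEASE_PATTERN = re.compile(r"ncrad-apoe-\d{4}-\d{2}-\d{2}\.csv$")


def classify_filename(filename: str) -> str:
    """Return 'staging', 'release', or 'no match' for a candidate upload name."""
    if STAGING_PATTERN.search(filename):
        return "staging"
    if RELEASE_PATTERN.search(filename):
        return "release"
    return "no match"
```

A result of "no match" suggests the upload would probably not trigger either gear rule, so the current naming rules should be confirmed with NACC before uploading.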

Key concepts (quick reference)

  • Ingest project: The Flywheel project where NCRAD uploads a single CSV with data for all centers. The file must contain at least the ADCID and PTID column headers.
  • Staging project: Internal validation area; split files appear here by ADCID so NCRAD can verify before release.
  • Distribution projects: Center-specific projects; outputs written here are visible to ADRCs (NCRAD may not be able to view all of them).
  • ADCID: Center identifier used for splitting (one output file per center).
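Because the ingest file must carry ADCID and PTID headers, a quick local check before uploading can catch a malformed file early. This is a sketch under assumptions: missing_headers is a hypothetical helper, and exact, case-sensitive header matching is assumed rather than confirmed.

```python
import csv
import io

# Headers this document says the ingest file needs at minimum.
REQUIRED_HEADERS = {"ADCID", "PTID"}


def missing_headers(csv_text: str) -> set:
    """Return the required column names that are absent from the header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    present = {name.strip() for name in header}
    return REQUIRED_HEADERS - present
```

An empty result means both required headers are present. Any other result lists the headers to add before uploading.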

(1) Upload staging file → (2) Validate staging splits → (3) Upload release file (no “staging” in filename) → (4) ADRCs download via ADRC Portal

A. Upload to Staging (Validation Run)

Step A1 — Log in to Flywheel

  1. Log in to Flywheel.
  2. In the left sidebar, click Projects. Confirm that you can see the NCRAD projects you have permission to access.

Screenshot: ncrad-apoe-01-flywheel-home-project-list.png

Step A2 — Open the correct Ingest project

  1. From the project list, open the Ingest project used for APOE data (e.g., ingest-apoe).

Screenshot: ncrad-apoe-02-ingest-project-selected.png

  2. In the project, select the Information tab (this is where uploads and attachments are managed).

Screenshot: ncrad-apoe-03-ingest-project-information-tab.png

Step A3 — Upload the staging CSV

  1. In the Information tab, under Attachments, click Upload. The “Drop files or click here to upload” dialog will appear.

Screenshot: ncrad-apoe-04-information-tab-upload-button.png

  2. Select the staging version of the APOE data file.
    • Ensure the filename includes “staging” so the pipeline writes split outputs to the staging project (e.g., ncrad-apoe-staging-2026-01-16.csv).

Screenshot: ncrad-apoe-05-upload-dialog-select-staging-file.png

What happens next

  • The upload completes quickly. If the upload would overwrite an existing file, Flywheel may ask you to confirm first.

  • The upload triggers the CSV center splitter pipeline (a Flywheel gear) that splits the CSV into center-level files.

  • Center-level files are written to the Staging project (e.g. staging-apoe) for validation.
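To make the splitting step concrete, the sketch below groups rows by ADCID and produces one CSV per center. It is an illustration only, not the gear's implementation. split_by_adcid is a hypothetical name, and the underscore separator in the output filenames is inferred from the example filename in section C.

```python
import csv
import io
from collections import defaultdict


def split_by_adcid(csv_text: str, source_name: str) -> dict:
    """Group rows by ADCID and return {output filename: CSV text}."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = reader.fieldnames or []
    rows_by_center = defaultdict(list)
    for row in reader:
        rows_by_center[row["ADCID"]].append(row)

    outputs = {}
    for adcid, rows in rows_by_center.items():
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
        writer.writeheader()
        writer.writerows(rows)
        # Assumed naming: ADCID, underscore, original filename.
        outputs[f"{adcid}_{source_name}"] = buffer.getvalue()
    return outputs
```

Running this on a copy of the source CSV gives a rough preview of how many split files to expect in the Staging project.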

Step A4 — (Optional) Monitor the pipeline run in the Jobs log

  1. If you have access, open the project's Jobs Log.

Screenshot: ncrad-apoe-06-jobs-log-center-splitter-running.png

  2. Confirm a job starts for the CSV center splitter gear.

Screenshot: ncrad-apoe-07-jobs-log-center-splitter-complete.png

  3. Wait for completion. This step typically takes around 10 minutes.

Note: There is no automatic “pipeline complete” notification. You may need to check the job status manually or confirm completion by locating the outputs in the Staging project.

Step A5 — Open the Staging project and verify outputs

  1. Navigate to the corresponding Staging project for APOE data (e.g., staging-apoe).

Screenshot: ncrad-apoe-08-staging-project-selected.png

  2. Select the Information tab.

Screenshot: ncrad-apoe-9-staging-project-information-tab-split-files.png

  3. Locate the split files. Each split file is named with the ADCID prepended to the original filename.

Screenshot: ncrad-apoe-10-split-output-csvs.png

  4. Open one or more split files to validate contents.

Screenshot: ncrad-apoe-11-open-split-csv-validate.png

What to check (manual validation / “QC”)

  • Each split file contains rows for only one ADCID (one center).
  • Records for the center look complete and reasonable.
  • If there is a “center of concern,” spot-check that center’s file in particular.
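The first check above (one ADCID per split file) can also be run on a downloaded split file. This is a minimal sketch that supplements the manual review rather than replacing it. adcids_in_split and split_is_clean are hypothetical helper names.

```python
import csv
import io


def adcids_in_split(csv_text: str) -> set:
    """Return the distinct ADCID values found in a split file."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["ADCID"] for row in reader}


def split_is_clean(csv_text: str, expected_adcid: str) -> bool:
    """True only if the file has rows and every row belongs to expected_adcid."""
    return adcids_in_split(csv_text) == {expected_adcid}
```

A file with no data rows is reported as not clean, so an unexpectedly empty split also gets flagged for review.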

B. Revise and Re-run Staging (If Issues Are Found)

Step B1 — Revise the source CSV

  1. If you find an issue in the staging outputs, correct the source CSV on the NCRAD side.

Step B2 — Re-upload the corrected staging file

  1. Return to the Ingest project (see Step A2).
  2. Upload the corrected file using a filename that still matches the staging naming rule, i.e., one that includes “staging” (see Step A3).

Screenshot: ncrad-apoe-12-reupload-corrected-staging-file.png

Notes

  • Many teams re-upload using the same filename (overwrite pattern) or a dated/round-labeled filename.
  • The key requirement is that the filename matches the gear’s regex rule for staging.

C. Release to ADRCs (Production/Distribution Run)

Once you have reviewed the staging center splits and confirmed they look correct, you can release the file to ADRCs by uploading a version of the file without “staging” in the filename. This triggers the production gear rule and writes the split outputs to center distribution projects.

Step C1 — Prepare the release filename (remove “staging”)

  1. Confirm staging outputs are correct (see Step A5).

  2. Create a release version of the file by removing “staging” from the filename.

    • Files with “staging” in the name are routed to the staging project for validation.

    • Files without “staging” in the name are routed to ADRCs.

Screenshot: ncrad-apoe-13-release-file-name-without-staging.png

Note: Gear rules are triggered by specific filenames (regex). If filenames change, the gear rules can be updated, but NCRAD must follow the currently required naming convention.
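Deriving the release filename from the validated staging filename, rather than retyping it, reduces the chance of a typo that fails to match the gear rule. This is a sketch assuming the “-staging” segment appears exactly as in the example filenames. release_filename is a hypothetical helper.

```python
def release_filename(staging_name: str) -> str:
    """Remove the '-staging' segment from a staging filename."""
    if "-staging" not in staging_name:
        raise ValueError(f"not a staging filename: {staging_name!r}")
    # Remove only the first occurrence so the rest of the name is untouched.
    return staging_name.replace("-staging", "", 1)
```

For example, release_filename("ncrad-apoe-staging-2026-01-16.csv") returns "ncrad-apoe-2026-01-16.csv".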

Step C2 — Upload the release file to the Ingest project

  1. In Flywheel, open the correct Ingest project for this data type (e.g., ingest-apoe).
  2. Go to the Information tab.
  3. Upload the release CSV, whose filename does not include “staging” (for example, DEMO-ncrad-apoe-2026-01-16.csv).

Screenshot: ncrad-apoe-14-upload-release-file.png

What happens next

  • The pipeline runs the same splitting process, but instead of writing to staging, it writes outputs to each center’s distribution project.
  • Once the center-specific files land in a center’s distribution project, their arrival triggers the Identifier Lookup pipeline, which looks up the NACCID for each ADCID/PTID combination.
  • Subjects that are successfully linked to a NACCID are written to a new file in the center’s distribution project with an “_identifiers” suffix appended (e.g. 0_ncrad-apoe-YYYY-MM-DD_identifiers.csv).
  • Centers can then compare this with the original split file to determine which of their samples do not have a corresponding NACCID.
  • NCRAD may have limited visibility into distribution projects, because access to each one is typically restricted to that center.
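The comparison between a split file and its “_identifiers” file can be sketched as a set difference on ADCID/PTID pairs. This assumes the identifiers file keeps the ADCID and PTID columns, which this document does not confirm. unlinked_participants is a hypothetical helper.

```python
import csv
import io


def _keys(csv_text: str) -> set:
    """Collect the (ADCID, PTID) pairs present in a CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {(row["ADCID"], row["PTID"]) for row in reader}


def unlinked_participants(split_csv: str, identifiers_csv: str) -> list:
    """Return sorted (ADCID, PTID) pairs in the split file that are absent
    from the identifiers file, i.e. rows without a matched NACCID."""
    return sorted(_keys(split_csv) - _keys(identifiers_csv))
```

An empty list means every ADCID/PTID pair in the split file also appears in the identifiers file.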

D. What ADRCs See (ADRC Portal download view)

ADRCs typically access NCRAD-delivered files via the ADRC Portal, which presents downloads in a more user-friendly interface than Flywheel. For details, refer to the NCRAD APOE Data Download (for ADRCs) documentation.

E. QC checklist (Staging validation)

Use this checklist before releasing:

  • Expected split files exist (one per ADCID present in the upload)
  • No cross-center leakage (each split contains only that ADCID)
  • Spot-check priority centers (if any are known to be sensitive)
  • Ready to release (no “bad samples” present)
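The first checklist item can be approximated by comparing the ADCIDs in the source CSV with the filenames seen in the Staging project. This is a minimal sketch: missing_split_files is a hypothetical helper, the list of staging filenames is assumed to be copied by hand from the Information tab, and the ADCID-underscore-filename naming is inferred from the example in section C.

```python
import csv
import io


def missing_split_files(source_csv_text: str, source_name: str,
                        staging_filenames: list) -> list:
    """Return expected split filenames that are not present in staging."""
    reader = csv.DictReader(io.StringIO(source_csv_text))
    # One expected output per distinct ADCID in the uploaded file.
    expected = {f"{row['ADCID']}_{source_name}" for row in reader}
    return sorted(expected - set(staging_filenames))
```

An empty list indicates that every ADCID in the upload has a matching split file under the assumed naming.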

Troubleshooting / Support

Contact NACC operations/engineering support if:

  • You cannot access Flywheel projects
  • Upload does not trigger the pipeline (likely filename mismatch with regex rules)
  • Outputs do not appear after sufficient time
  • Center splits appear incorrect