NACC's Error Checking Process
NACC Data Validation Pipeline
Submitting a form data CSV file to a center's Flywheel ingest project triggers the NACC data validation pipeline in Flywheel.
For centers that use REDCap direct entry, records marked as "Ready for Data Platform Upload"
are transferred to the center's Flywheel ingest project each night by NACC, and the same validation pipeline is triggered.
The stages of the validation pipeline are:
- CSV format check: Check whether the submitted CSV has the correct headers and data types. The CSV should include only NACC-accepted variables for the module and must exactly match the NACC published Data Element Dictionary for the module. The entire file will be rejected if there are any extra fields.
- NACCID lookup: For each record in the CSV file, look up the NACCID using the adcid and ptid for that record. A record will not be processed further if there is no matching NACCID.
- Generate JSON file: For each record in the CSV file, apply any necessary data transformations for the module and generate a JSON file. These JSON files are stored in the ingest project as acquisition data. A Flywheel hierarchy (Subject/Session/Acquisition) will be created for the file if it doesn't already exist.
- Extract metadata: Extract information from the JSON files and store it as Flywheel metadata, which is used for search, dataviews, reports, etc.
- NACC QC checks: Run NACC data quality checks according to the NACC published error checks.
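
To make the first three stages more concrete, here is a minimal Python sketch of a header check, a NACCID lookup, and JSON generation. It is an illustration only, not NACC's implementation: the field list `ALLOWED_FIELDS`, the lookup table `NACCID_TABLE`, the example NACCID value, and the function names are all hypothetical stand-ins. The real field list comes from the Data Element Dictionary for the module, and the real lookup is performed by NACC. The sketch also assumes that a missing column is treated like an extra one (based on the "exactly match" requirement) and it does not check data types.

```python
import csv
import io
import json

# Hypothetical stand-ins: the real field list comes from the NACC Data Element
# Dictionary for the module, and the real NACCID lookup is done by NACC.
ALLOWED_FIELDS = {"adcid", "ptid", "visitnum", "visitdate"}
NACCID_TABLE = {("0", "P001"): "NACC000001"}


def check_headers(fieldnames):
    """Return (extra, missing) column names relative to ALLOWED_FIELDS."""
    found = set(fieldnames or [])
    return sorted(found - ALLOWED_FIELDS), sorted(ALLOWED_FIELDS - found)


def process_csv(text):
    """Run the header check, NACCID lookup, and JSON generation on CSV text.

    Returns (json_documents, errors). A header problem rejects the whole file;
    a failed NACCID lookup only skips the affected record.
    """
    reader = csv.DictReader(io.StringIO(text))
    extra, missing = check_headers(reader.fieldnames)
    if extra or missing:
        # Assumption: missing columns are rejected the same way as extra ones.
        return [], [f"file rejected: extra={extra} missing={missing}"]

    documents, errors = [], []
    # Data rows start on line 2 because line 1 holds the header.
    for line_number, row in enumerate(reader, start=2):
        naccid = NACCID_TABLE.get((row["adcid"], row["ptid"]))
        if naccid is None:
            errors.append(
                f"line {line_number}: no NACCID for "
                f"adcid={row['adcid']} ptid={row['ptid']}"
            )
            continue  # the record is not processed further
        # One JSON document per record; real module transformations would go here.
        documents.append(json.dumps({**row, "naccid": naccid}, sort_keys=True))
    return documents, errors
```

Calling `process_csv` on a file with an unexpected column returns no documents and a single file-level error, while a file with valid headers returns one JSON document per record whose adcid and ptid pair has a matching NACCID, plus one error per record that does not.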