Git-DRS
Note
The tools listed here are under development and may be subject to change.
Overview
Use case: As an analyst, in order to share data with collaborators, I need a way to create a project, upload files, and associate those files with metadata. The system should support adding files incrementally.
The following guide details the steps a data contributor must take to submit a project to the CALYPR data commons.
Core Concepts
In a Gen3 data commons, a semantic distinction is made between two types of data: "data files" and "metadata".
- Data File: Information like tabulated data values in a spreadsheet or a fastq/bam file containing DNA sequences. The contents are not exposed to the API as queryable properties.
- Metadata: Variables that help to organize or convey additional information about corresponding data files so they can be queried.
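The distinction above can be illustrated with a minimal sketch. This is not the Gen3 or CALYPR data model: the `DataFile` and `Metadata` classes, the `find_files` helper, and the property names (`assay`, `tissue`) are all hypothetical and exist only to show that queries run against metadata while file contents stay opaque.

```python
from dataclasses import dataclass, field


@dataclass
class DataFile:
    """An opaque file: it is stored, but its contents are not queryable."""
    path: str
    size_bytes: int


@dataclass
class Metadata:
    """Queryable properties describing one data file (hypothetical schema)."""
    file_path: str
    properties: dict = field(default_factory=dict)


def find_files(records, **criteria):
    """Return the paths of files whose metadata matches every criterion."""
    return [
        record.file_path
        for record in records
        if all(record.properties.get(key) == value for key, value in criteria.items())
    ]


# Illustrative files and metadata; names and values are made up.
files = [
    DataFile("reads/sample1.bam", 1_000),
    DataFile("reads/sample2.bam", 2_000),
]
records = [
    Metadata("reads/sample1.bam", {"assay": "WGS", "tissue": "liver"}),
    Metadata("reads/sample2.bam", {"assay": "RNA-seq", "tissue": "liver"}),
]

# The query only ever inspects metadata, never the bytes of the files.
liver_wgs = find_files(records, assay="WGS", tissue="liver")
```

In this sketch a collaborator could locate `reads/sample1.bam` by its metadata alone, without opening or downloading the file.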
1. Setup
CALYPR project management is handled using standard Git workflows. You will need the Git Large File Storage (LFS) plugin to track genomic data files and the Git-DRS plugin to interface with CALYPR's storage and indexing systems.
Visit the Quick Start Guide for detailed, OS-specific installation instructions for these tools.
| Tool | Purpose |
|---|---|
| git-drs | Manages large file tracking, storage, and DRS indexing. |
| forge | Handles metadata validation, transformation (ETL), and publishing. |
| data-client | Administrative tool for managing collaborators and access requests. |
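Before starting, it can help to confirm that the required tools are on your `PATH`. The following is a minimal sketch, not part of the CALYPR tooling. It assumes the executables are named `git`, `git-lfs`, and `git-drs` (the last name is taken from the table above and may differ in your installation); the `missing_tools` helper is hypothetical.

```python
import shutil

# Executable names are assumptions; adjust them to match your installation.
REQUIRED_TOOLS = ("git", "git-lfs", "git-drs")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH.

    `which` is injectable so the check can be exercised without touching
    the real environment.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required tools were found.")
```

If any tool is reported missing, the Quick Start Guide covers installation for each operating system.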
Git DRS Workflows
For complete Git DRS documentation including project initialization, file management, and upload workflows, see the Git DRS Quick Start.