Skip to content

Git-DRS

Note

The tools listed here are under development and may be subject to change.

Overview

Use case: As an analyst, in order to share data with collaborators, I need a way to create a project, upload files and associate those files with metadata. The system should be capable of adding files in an incremental manner.

The following guide details the steps a data contributor must take to submit a project to the CALYPR data commons.

Core Concepts

In a Gen3 data commons, a semantic distinction is made between two types of data: "data files" and "metadata". more

  • Data File: Information like tabulated data values in a spreadsheet or a fastq/bam file containing DNA sequences. The contents are not exposed to the API as queryable properties.
  • Metadata: Variables that help to organize or convey additional information about corresponding data files so they can be queried.

1. Setup

CALYPR project management is handled using standard Git workflows. you will need the Large File Storage (LFS) plugin to track genomic data files and the Git-DRS plugin to interface with CALYPR's storage and indexing systems.

Visit the Quick Start Guide for detailed, OS-specific installation instructions for these tools.

Tool Purpose
git-drs Manages large file tracking, storage, and DRS indexing.
forge Handles metadata validation, transformation (ETL), and publishing.
data-client Administrative tool for managing collaborators and access requests.
{: .caption }

Git DRS Workflows

For complete Git DRS documentation including project initialization, file management, and upload workflows, see the Git DRS Quick Start.