.. _clitutorial: #### CLI #### :: $ pysradb usage: pysradb [-h] [--version] [--citation] {metadb,metadata,download,search,gse-to-gsm,gse-to-srp,gsm-to-gse,gsm-to-srp,gsm-to-srr,gsm-to-srs,gsm-to-srx,srp-to-gse,srp-to-srr,srp-to-srs,srp-to-srx,srr-to-gsm,srr-to-srp,srr-to-srs,srr-to-srx,srs-to-gsm,srs-to-srx,srx-to-srp,srx-to-srr,srx-to-srs} ... pysradb: Query NGS metadata and data from NCBI Sequence Read Archive. Citation: 10.12688/f1000research.18676.1 optional arguments: -h, --help show this help message and exit --version show program's version number and exit --citation how to cite subcommands: {metadb,metadata,download,search,gse-to-gsm,gse-to-srp,gsm-to-gse,gsm-to-srp,gsm-to-srr,gsm-to-srs,gsm-to-srx,srp-to-gse,srp-to-srr,srp-to-srs,srp-to-srx,srr-to-gsm,srr-to-srp,srr-to-srs,srr-to-srx,srs-to-gsm,srs-to-srx,srx-to-srp,srx-to-srr,srx-to-srs} metadata Fetch metadata for SRA project (SRPnnnn) download Download SRA project (SRPnnnn) search Search SRA/ENA for matching text gse-to-gsm Get GSM for a GSE gse-to-srp Get SRP for a GSE gsm-to-gse Get GSE for a GSM gsm-to-srp Get SRP for a GSM gsm-to-srr Get SRR for a GSM gsm-to-srs Get SRS for a GSM gsm-to-srx Get SRX for a GSM srp-to-gse Get GSE for a SRP srp-to-srr Get SRR for a SRP srp-to-srs Get SRS for a SRP srp-to-srx Get SRX for a SRP srr-to-gsm Get GSM for a SRR srr-to-srp Get SRP for a SRR srr-to-srs Get SRS for a SRR srr-to-srx Get SRX for a SRR srs-to-gsm Get GSM for a SRS srs-to-srx Get SRX for a SRS srx-to-srp Get SRP for a SRX srx-to-srr Get SRR for a SRX srx-to-srs Get SRS for a SRX ======================================== Getting metadata for a SRA project (SRP) ======================================== The most basic information associated with any SRA project is its list of experiments and run accessions. :: $ pysradb metadata SRP098789 study_accession experiment_accession sample_accession run_accession SRP098789 SRX2536403 SRS1956353 SRR5227288 SRP098789 SRX2536404 SRS1956354 SRR5227289 SRP098789 SRX2536405 SRS1956355 SRR5227290 SRP098789 SRX2536406 SRS1956356 SRR5227291 SRP098789 SRX2536407 SRS1956357 SRR5227292 SRP098789 SRX2536408 SRS1956358 SRR5227293 SRP098789 SRX2536409 SRS1956359 SRR5227294 Listing SRX and SRRs for a SRP is often not useful. We might want to take a quick look at the metadata associated with the samples: :: $ pysradb metadata SRP098789 study_accession experiment_accession sample_accession run_accession sample_attribute SRP098789 SRX2536403 SRS1956353 SRR5227288 source_name: Huh7_1.5 µM PF-067446846_10 min_ribo-seq || cell line: Huh7 || treatment time: 10 min || library type: ribo-seq SRP098789 SRX2536404 SRS1956354 SRR5227289 source_name: Huh7_1.5 µM PF-067446846_10 min_ribo-seq || cell line: Huh7 || treatment time: 10 min || library type: ribo-seq SRP098789 SRX2536405 SRS1956355 SRR5227290 source_name: Huh7_1.5 µM PF-067446846_10 min_ribo-seq || cell line: Huh7 || treatment time: 10 min || library type: ribo-seq SRP098789 SRX2536406 SRS1956356 SRR5227291 source_name: Huh7_0.3 µM PF-067446846_10 min_ribo-seq || cell line: Huh7 || treatment time: 10 min || library type: ribo-seq SRP098789 SRX2536407 SRS1956357 SRR5227292 source_name: Huh7_0.3 µM PF-067446846_10 min_ribo-seq || cell line: Huh7 || treatment time: 10 min || library type: ribo-seq SRP098789 SRX2536408 SRS1956358 SRR5227293 source_name: Huh7_0.3 µM PF-067446846_10 min_ribo-seq || cell line: Huh7 || treatment time: 10 min || library type: ribo-seq The example here came from a Ribosome profiling study and consists of a collection of both Ribo-seq and RNA-seq samples. We can filter out only the RNA-seq samples: :: $ pysradb metadata SRP098789 --detailed | grep 'study|RNA-Seq' SRP098789 SRX2536422 SRR5227307 RNA-Seq SINGLE - SRP098789 SRX2536424 SRR5227309 RNA-Seq SINGLE - SRP098789 SRX2536426 SRR5227311 RNA-Seq SINGLE - SRP098789 SRX2536428 SRR5227313 RNA-Seq SINGLE - A more complicated example will consist of multiple assays. For example `SRP000941`: :: $ pysradb metadata SRP000941 --detailed | tr -s ' ' | cut -f5 -d ' ' | sort | uniq -c 999 Bisulfite-Seq 768 ChIP-Seq 1 library_strategy 121 OTHER 353 RNA-Seq 28 WGS ==================================================== Experiment accessions for a project (SRP => SRX) ==================================================== A frequently encountered task involves getting all the experiments (SRX) for a particular study accession (SRP). Consider project `SRP048759`: :: $ pysradb srp-to-srx SRP048759 ================================================ Sample accessions for a project (SRP => SRS) ================================================ Each experiment involves one or multiple biological samples (SRS), that are put through different experiments (SRX). :: $ pysradb srp-to-srs --detailed SRP048759 study_accession sample_accession SRP048759 SRS718878 SRP048759 SRS718879 SRP048759 SRS718880 SRP048759 SRS718881 SRP048759 SRS718882 SRP048759 SRS718883 SRP048759 SRS718884 SRP048759 SRS718885 SRP048759 SRS718886 This is very limited information. It can again be detailed out using the `--detailed` flag: :: $ pysradb srp-to-srs --detailed SRP048759 study_accession sample_accession experiment_accession run_accession study_alias sample_alias experiment_alias run_alias SRP048759 SRS718878 SRX729552 SRR1608490 GSE62190 GSM1521543 GSM1521543 GSM1521543_r1 SRP048759 SRS718878 SRX729552 SRR1608491 GSE62190 GSM1521543 GSM1521543 GSM1521543_r2 SRP048759 SRS718878 SRX729552 SRR1608492 GSE62190 GSM1521543 GSM1521543 GSM1521543_r3 SRP048759 SRS718878 SRX729552 SRR1608493 GSE62190 GSM1521543 GSM1521543 GSM1521543_r4 SRP048759 SRS718879 SRX729553 SRR1608494 GSE62190 GSM1521544 GSM1521544 GSM1521544_r1 SRP048759 SRS718879 SRX729553 SRR1608495 GSE62190 GSM1521544 GSM1521544 GSM1521544_r2 =============================================== Run accessions for experiments (SRX => SRR) =============================================== Another frequently encountered task involves fetching the run accessions (SRR) for a particular experiment (SRX). Consider experiments `SRX217956` and `SRX2536403`. We want to be able to resolve the run accessions for these experiments: :: $ pysradb srx-to-srr SRX217956 SRX2536403 --detailed experiment_accession run_accession study_accession sample_attribute SRX217956 SRR649752 SRP017942 source_name: 3T3 cells || treatment: control || cell line: 3T3 cells || assay type: Riboseq SRX2536403 SRR5227288 SRP098789 source_name: Huh7_1.5 µM PF-067446846_10 min_ribo-seq || cell line: Huh7 || treatment time: 10 min || library type: ribo-seq =============================================== Experiment accessions for runs (SRR => SRX) =============================================== For fetching experiment accessions (SRX) for one or multiple run accessions (SRR): :: $ pysradb srr-to-srx SRR5227288 SRR649752 --detailed run_accession study_accession experiment_accession sample_attribute SRR649752 SRP017942 SRX217956 source_name: 3T3 cells || treatment: control || cell line: 3T3 cells || assay type: Riboseq SRR5227288 SRP098789 SRX2536403 source_name: Huh7_1.5 µM PF-067446846_10 min_ribo-seq || cell line: Huh7 || treatment time: 10 min || library type: ribo-seq =========================== Downaloading entire project =========================== :: $ pysradb metadata --detailed SRP098789 | pysradb download =========================================== GEO accessions for studies (SRP => GSE) =========================================== :: $ pysradb srp-to-gse SRP090415 study_accession study_alias SRP090415 GSE87328 But not all SRPs will have an associated GEO id (GSE): :: $ pysradb srp-to-gse SRP029589 study_accession study_alias SRP029589 PRJNA218051 =============================================== SRA accessions for GEO studies (GSE => SRP) =============================================== :: $ pysradb gse-to-srp GSE87328i study_alias study_accession GSE87328 SRP090415 Please see `quickstart `_ for all possible operations available through ``pysradb``.