Submission Portal
Submit to the world's largest public repository of biological and scientific information
Type a few words about the sequence data you are submitting and select an option to learn more. You can also browse submission information below. What do you want to submit?
SRA accepts unassembled reads from high throughput sequencing platforms. Submitted data files should generally be minimally processed and include per-base quality scores.
Submit unassembled reads of SARS-CoV-2 with BioProject, BioSample, metadata and NGS files.
GEO accepts raw data, processed data and metadata for gene expression and epigenomics datasets generated by high-throughput sequencing and microarray technologies.
Genome
Submit assembled complete or incomplete/draft prokaryotic and eukaryotic genomes. Not for viral, phage, or single locus sequences (for example: 16S rRNA). Submit those to regular GenBank.
Computationally assembled transcribed RNA sequences representing a transcriptome derived from sequence reads submitted to Sequence Read Archive (SRA).
GenBank
Submit ribosomal RNA (rRNA), rRNA-ITS, SARS-CoV-2, Influenza, Norovirus, Dengue, metazoan COX1 or eukaryotic nuclear mRNA
GenBank
Submit assembled reads of SARS-CoV-2 with FASTA files and source metadata. Annotation for SARS-CoV-2 is not required.
GenBank
Eukaryotic nuclear mRNAs (limit 499 sequences)
BankIt
Submit genomic DNA, organelle, ncRNA, plasmids, other viruses, phages, mRNA and synthetic constructs.
dbGaP
Microarray data from clinical studies that require controlled access.
ClinVar
Information on human sequence variation and relationship to human health.
NIHMS
Submit the electronic version of a peer-reviewed manuscript for inclusion in PubMed Central.
ClinicalTrials.gov
Explore clinical studies conducted around the world.
TLS
Large-scale sequencing projects for individual loci are accepted by GenBank and the Sequence Read Archive (SRA).
Supplementary Files
Submit BioNano maps, Beta-lactamase gene, and PacBio methylation data.
Third Party Annotation (TPA)
NCBI accepts data capturing experimental or inferential results that support annotation derived from GenBank primary data.
dbSNP
Small human genomic variation: single nucleotide, insertions, deletions, and microsatellites.
GTR
Genetic tests for inherited & somatic genetic variations, including arrays and multiplex panels.
BioProject and BioSample
Automatically create a BioProject and BioSample during sequence data submission.
GenBank is the world's largest nucleotide archive containing sequences from all branches of life. The archive is a foundation for medical and biological discovery.
Submit assembled SARS-CoV-2, Influenza, Norovirus, Dengue virus, rRNA, rRNA-ITS, metazoan COX1, and eukaryotic nuclear mRNA sequences.
Submit genomic DNA, organelle, ncRNA, plasmids, other viruses, phages, other mRNA, synthetic constructs.
Submit assembled prokaryotic and eukaryotic genomes.
SRA is the largest publicly-available repository of high throughput sequencing data. The archive accepts data from all branches of life as well as metagenomic and environmental surveys.
Submit unassembled, high throughput sequencing reads
SARS-CoV-2 submission instructions
Submit computationally assembled, transcribed RNA sequences after submitting unassembled reads to SRA.
Submit RNA-seq, ChIP-seq, and other types of gene expression and epigenomics datasets.
Choose a tool above if submitting sequence data.
An accession number in bioinformatics is a unique identifier given to a DNA or protein sequence record so that different versions of that record, and of the associated sequence, can be tracked over time within a single data repository. Because of their relative stability, accession numbers can be used as foreign keys for referring to a sequence object, but not necessarily to a unique sequence. All sequence information repositories implement the concept of "accession number" but might do so with subtle variation.
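As a rough illustration of version tracking, the Python sketch below assumes the common "accession.version" notation, in which a dot and an integer follow the stable accession (for example `NM_000546.6`). The function names `parse_accession` and `latest_versions` are hypothetical, the identifiers in the usage note are made up, and other repositories may format identifiers differently.

```python
import re
from typing import Dict, Iterable, NamedTuple, Optional


class Accession(NamedTuple):
    base: str               # stable part, usable as a foreign key to the record
    version: Optional[int]  # sequence version, or None when no ".N" suffix is given


# Assumed format: letters, digits and underscores, optionally followed by ".<integer>".
_ACCESSION_RE = re.compile(r"^([A-Za-z0-9_]+)(?:\.(\d+))?$")


def parse_accession(text: str) -> Accession:
    """Split an identifier such as 'NM_000546.6' into its base and version."""
    match = _ACCESSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not an accession-like identifier: {text!r}")
    base, version = match.groups()
    return Accession(base.upper(), int(version) if version is not None else None)


def latest_versions(identifiers: Iterable[str]) -> Dict[str, int]:
    """Map each base accession to the highest version seen in the input."""
    latest: Dict[str, int] = {}
    for raw in identifiers:
        acc = parse_accession(raw)
        if acc.version is None:
            continue  # an unversioned identifier names the record, not one sequence
        if acc.version > latest.get(acc.base, 0):
            latest[acc.base] = acc.version
    return latest
```

Calling `latest_versions(["AB000001.1", "AB000001.3", "AB000001.2"])` groups all three identifiers under the single base `AB000001`, which is why the base alone identifies a record rather than one specific sequence.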
Please read the NLM GenBank and SRA Data Processing document, which describes how sequence data are processed and made available to the public, sets out the responsibilities of the data submitter and of NCBI, and defines data status. You may write to info@ncbi.nlm.nih.gov if you have questions about your submitted data or about the document.
