Domain-Discipline Repositories Useful to AGU Journals

The data that supports the research reported in your paper must be deposited in a trusted repository. When identifying the most appropriate repository for your data, first refer to AGU’s journal-specific data guidance. We recommend a repository that specializes in data for your scientific discipline, as this maximizes the probability that the deposited data will be findable, accessible, interoperable, and reusable (FAIR). Otherwise, look to your institutional repository, your computing center, or a general repository (e.g., Zenodo, Dryad, Figshare, Science Data Bank).

The following is a list of useful repositories by journal:

GeoHealth

PANGAEA is operated as an Open Access library aimed at archiving, publishing and distributing georeferenced data from the earth and life sciences. The system guarantees long-term availability of its content through a commitment of the operating institutions. Each dataset can be identified, shared, published and cited by using a Digital Object Identifier (DOI). PANGAEA also allows data to be published as supplements to science articles or as citable data collections in combination with data journals like ESSD, Geoscience Data Journal, Scientific Data, or others. Additional information about PANGAEA can be found at their about, help, and submit pages and their contact page is available for further questions you may have.

The Gulf of Mexico Research Initiative Information and Data Cooperative (GRIIDC) data management system provides researchers with a variety of tools to help manage data throughout the lifecycle of a project. For example, the GRIIDC Dataset Information Form (DIF) is a resource designed to assist researchers with data management planning. While the system assists researchers with multiple phases of data management, the main functions of the system are storing and sharing data. Researchers from diverse fields of study, including biology, chemistry, physical oceanography, sociology, political science and public health, are able to store their data in the GRIIDC system. Through the GRIIDC search page, researchers, policy makers, and the general public are able to search for and download this data. This shared data can be used to address innovative scientific research questions, assess policies and programs, and in educational initiatives. By providing a forum for both storing and sharing data the GRIIDC system increases the impact of scientific research in the Gulf of Mexico and beyond for the benefit of society. Learn more about the GRIIDC and access their FAQs, video tutorials, training & user guides, and additional support material via their Help section.

The GRIIDC along with a number of other DataONE member repositories, e.g. Environmental Data Initiative, are available for the community to consider. DataONE unites a network of data repositories operated by research centers, universities, non-profit organizations, citizen science initiatives, government and non-government organizations, and the like. Member institutions share data and infrastructure with DataONE and in return, they facilitate user access to data and interoperability between members. Review the DataONE member list to see if there is a relevant repository for your earth and life science research. Researchers wishing to share data via DataONE are encouraged to do so through member repositories within their scientific domain, organizational affiliation, or geographic region. You can contact DataONE if you have further questions and/or review their about page, training resources, and additional material under their Learning section.

Through OpenAIRE, find the appropriate earth and life sciences repository to deposit your research products of any type (publication, data, software, other) or to include in your data management plan. Search and browse for OpenAIRE compliant repositories registered in OpenDOAR and re3data. Find the repository to deposit your research or use the Zenodo repository. Access OpenAIRE’s support page to ask a question and/or for FAQs and training resources.

Re3data is a global registry of research data repositories that covers research data repositories from different academic disciplines including earth and life sciences. It includes repositories that enable permanent storage of and access to data sets to researchers, funding bodies, publishers, and scholarly institutions. The use of re3data is also recommended in the European Commission’s “Guidelines on Open Access to Scientific Publications and Research Data in Horizon 2020”. Re3data’s faceted search interface is available to discover repositories catering to the earth and life sciences communities. Each repository listed has a short description. Icons show if the repository has general information, is open access, has licenses, has persistent identifiers, has certificates and standards, and has policies. Learn more about re3data via their about page and visit their FAQs and contact page for additional help.

The NIH has created a list of supported domain-specific data repositories that make data accessible for reuse and are open for both submitting and accessing data. Submission is typically limited to data of a certain type or related to a certain discipline. The table provides links to information about submitting data to and accessing data from the listed repositories. Repositories in this list have current NIH funding, sustained support, open data submission and access, and an open time frame for data deposit, based on information provided by each repository about funding and data availability. This non-exhaustive list is also available in a downloadable Excel version. The NIH also provides a contact link for questions, comments, and feedback.

JAMES (Journal of Advances in Modeling Earth Systems)

Zenodo - Zenodo builds and operates a simple and innovative service that enables researchers, scientists, EU projects and institutions to share and showcase multidisciplinary research results (data and publications) that are not part of the existing institutional or subject-based repositories of the research communities. Zenodo enables researchers, scientists, EU projects and institutions to: easily share the long tail of small research results in a wide variety of formats, including text, spreadsheets, audio, video, and images, across all fields of science; display their research results and get credited by making the research results citable and integrating them into existing reporting lines to funding agencies like the European Commission; and easily access and reuse shared research results. Text description from re3data. Note: JAMES authors have increasingly used the GitHub-Zenodo integration to preserve, describe, and cite their software.

NCAR UCAR Digital Asset Services Hub Repository - The DASH Repository provides persistent data archiving and distribution for small-scale data collections from UCAR/NCAR researchers and projects. This data repository specifically focuses on providing long-term preservation and stewardship of NCAR’s small-scale data collections. Complementing other NCAR-managed data repositories, the DASH Repository helps NCAR researchers to enable long term access, interoperability, and reuse of NCAR datasets. Text description from re3data.

World Data Center for Climate (WDCC) - The German Climate Computing Center (DKRZ: Deutsches Klimarechenzentrum GmbH) provides a Long Term Archiving Service for large research data sets which are relevant for climate or Earth system research. This service includes archiving and retrieval capability of data for time periods of 10 years or longer. The long-term archive (LTA) of DKRZ is certified according to the criteria of the Core Trust Seal (CTS) and is, as World Data Centre for Climate (WDCC), accredited as regular member of the World Data System.

ESS-DIVE - The U.S. Department of Energy’s (DOE) Environmental Systems Science Data Infrastructure for a Virtual Ecosystem (ESS-DIVE) data archive serves Earth and environmental science data. ESS-DIVE is funded by the Data Management program within the Climate and Environmental Science Division under the DOE’s Office of Biological and Environmental Research program (BER), and is maintained by the Lawrence Berkeley National Laboratory. ESS-DIVE archives and publicly shares data obtained from observational, experimental, and modeling research that is funded by the DOE’s Office of Science under its Subsurface Biogeochemical Research (SBR) and Terrestrial Ecosystem Science (TES) programs within the Environmental Systems Science (ESS) activity. ESS-DIVE was launched in July 2017 and is designed to provide long-term stewardship and use of the data from these programs. Text description from re3data.

figshare - figshare allows researchers to publish all of their research outputs in an easily citable, sharable and discoverable manner. All file formats can be published, including videos and datasets. figshare offers an optional peer review process and uses Creative Commons licensing. Text description from re3data.

PANGAEA - PANGAEA is operated as an Open Access library aimed at archiving, publishing and distributing georeferenced data from the earth and life sciences. The system guarantees long-term availability of its content through a commitment of the operating institutions. Each dataset can be identified, shared, published and cited by using a Digital Object Identifier (DOI). PANGAEA also allows data to be published as supplements to science articles or as citable data collections in combination with data journals like ESSD, Geoscience Data Journal, Scientific Data, or others. Additional information about PANGAEA can be found at their about, help, and submit pages and their contact page is available for further questions you may have.

Incorporated Research Institutions for Seismology (IRIS) - IRIS offers free and open access to a comprehensive data store of raw geophysical time-series data collected from a large variety of sensors, courtesy of a vast array of US and International scientific networks, including seismometers (permanent and temporary), tilt and strain meters, infrasound, temperature, atmospheric pressure and gravimeters, to support basic research aimed at imaging the Earth’s interior. IRIS provides management of, and access to, observed and derived data for the global earth science community. This includes ground motion, atmospheric, infrasonic, hydrological, and hydroacoustic data. Text description from re3data.

EarthChem Library - The EarthChem Library is a data repository that archives, publishes and makes accessible data and other digital content from geoscience research (analytical data, data syntheses, models, technical reports, etc.). Text description from re3data.

JGR-Atmospheres

Atmospheric Science Data Center (ASDC) at NASA Langley Research Center - The Atmospheric Science Data Center (ASDC) at NASA Langley Research Center is responsible for processing, archiving, and distribution of NASA Earth science data in the areas of radiation budget, clouds, aerosols, and tropospheric chemistry. The ASDC specializes in atmospheric data important to understanding the causes and processes of global climate change and the consequences of human activities on the climate. Text from re3data.org.

Goddard Earth Sciences Data and Information Services Center (GES DISC) - The GES DISC is one of twelve NASA Science Mission Directorate (SMD) Data Centers that provide Earth science data, information, and services to research scientists, applications scientists, applications users, and students. The GES DISC is the home (archive) of NASA Precipitation and Hydrology, as well as Atmospheric Composition and Dynamics remote sensing data and information. The DISC also houses the Modern Era Retrospective-Analysis for Research and Applications (MERRA) data assimilation datasets (generated by GSFC’s Global Modeling and Assimilation Office), and the North American Land Data Assimilation System (NLDAS) and Global Land Data Assimilation System (GLDAS) data products (both generated by GSFC’s Hydrological Sciences Branch). Text from re3data.org.

National Centers for Environmental Information (NCEI) - National Centers for Environmental Information (NCEI, formerly National Climatic Data Center, the National Geophysical Data Center, and the National Oceanographic Data Center) is the United States facility established to acquire, process, store, and disseminate environmental data from the United States and other countries. NCEI operates as a component of the National Environmental Satellite, Data, and Information Service (NESDIS) of the National Oceanic and Atmospheric Administration (NOAA) of the U.S. Department of Commerce. NCEI’s U.S. data holdings include unclassified data collected by Federal agencies including the Department of Defense (primarily the U.S. Navy); State and local government agencies; universities and research institutions; and private industry. NCEI does not conduct any data collection programs of its own; it serves solely as a repository, dissemination, and analysis facility for data collected by others.

A very large portion of the data held by NCEI is of foreign origin. We acquire foreign data through direct bilateral exchanges with other countries and organizations, and through the facilities of World Data System for Oceanography, Silver Spring. WDS (Oceanography) is operated by NCEI under the auspices of the U.S. National Academy of Sciences. It is one of the discipline subcenters within the World Data Center System that fosters international exchange of scientific data under guidelines issued by the International Council of Scientific Unions (ICSU).

Each year NCEI responds to thousands of requests from users in the United States and around the world. NCEI data support research and development in ocean resource development, marine environmental assessment, national defense, theoretical oceanography, ocean engineering, etc. As a service organization, NCEI welcomes inquiries from all potential users of marine data and data products.

Data submission guide: https://www.nodc.noaa.gov/submit/submit-guide.html

National Geoscience Data Centre - The National Geoscience Data Centre is a data-rich organization with over 400 datasets in its care, including environmental monitoring data, digital databases, physical collections (borehole core, rocks, minerals and fossils), records and archives. Text from re3data.org.

PANGAEA - PANGAEA is operated as an Open Access library aimed at archiving, publishing and distributing georeferenced data from the earth and life sciences. The system guarantees long-term availability of its content through a commitment of the operating institutions. Each dataset can be identified, shared, published and cited by using a Digital Object Identifier (DOI). PANGAEA also allows data to be published as supplements to science articles or as citable data collections in combination with data journals like ESSD, Geoscience Data Journal, Scientific Data, or others. Additional information about PANGAEA can be found at their about, help, and submit pages and their contact page is available for further questions you may have.

Research Data Archive (RDA) at NCAR - The Research Data Archive (RDA) at NCAR contains a large and diverse collection of meteorological and oceanographic observations, operational and reanalysis model outputs, and remote sensing datasets to support atmospheric and geosciences research, along with ancillary datasets, such as topography/bathymetry, vegetation, and land use. Text from re3data.org.

The World Data Center for Remote Sensing of the Atmosphere - The World Data Center for Remote Sensing of the Atmosphere, WDC-RSAT, offers scientists and the general public free access (in the sense of a “one-stop shop”) to a continuously growing collection of atmosphere-related satellite-based data sets (ranging from raw to value added data), information products and services. Focus is on atmospheric trace gases, aerosols, dynamics, radiation, and cloud physical parameters. Complementary information and data on surface parameters (e.g. vegetation index, surface temperatures) is also provided. This is achieved either by giving access to data stored at the data center or by acting as a portal containing links to other providers. Text from re3data.org.

JGR-Biogeosciences

PANGAEA - PANGAEA accepts any data from earth, environmental and life sciences. When you start the data submission process, you will be redirected to the PANGAEA issue tracker that will assist you in providing metadata and uploading data files. Any communication with our editors will go through this issue tracker. For more details about the submission workflow see our tutorial. Please note: All data and metadata are quality checked, harmonized, and processed for machine readability, which allows efficient and reliable re-usage of your data. Depending on the extent and complexity of your data submission, the editorial process and minting of DOI names might therefore take up to 8 weeks. [Earth and Environmental data]

Environmental Data Initiative (EDI) - EDI is an NSF-funded repository accepting environmental research data and relevant processing code. Trained curators assist researchers from field stations, individual laboratories, and research projects of all sizes and actively promote and enable curation and re-use of environmental data through outreach and training. EDI is committed to making data Findable, Accessible, Interoperable, and Reusable through rich science metadata and the assignment of DOIs. All data and metadata are quality checked and machine readable, ensuring reliable reuse of data. Data submission is provided. [Earth and Environmental data]

AmeriFlux - The AmeriFlux Network ensures the availability of the continuous, long term ecosystem measurements necessary to build effective models and multisite syntheses, while maximizing insight through robust, site-specific, independent research programs. With these consistent, high-quality environmental measurements, AmeriFlux helps ensure that critical decisions are supported by the most complete understanding and data. Independent scientists measure the flows of carbon between land and atmosphere, using a technique called eddy covariance, and then contribute their data to the AmeriFlux Network. The AmeriFlux Network works with scientists to standardize, quality check, and process data into common forms that the scientific community can use to examine crucial linkages between ecosystem processes and climate responses. Description of AmeriFlux data and How to upload data. [Ecosystem CO2, Water, Energy Fluxes]

ICOS - The Integrated Carbon Observation System (ICOS) is a distributed pan European research infrastructure producing high-quality data on greenhouse gas concentrations in the atmosphere, as well as on carbon fluxes between the atmosphere, the land surface and the oceans. [Ecosystem CO2, Water, Energy Fluxes]

Bolin Centre Database - The Bolin Centre Database is a storage and management facility for data collected and collated at the Bolin Centre for Climate Research. Most of the data are available with open access and can be used under the terms given in the data description. Our goal is to host all datasets produced within the Bolin Centre, to visualize the data and make the data publicly available. For inquiries contact: bolindata@su.se. [Institutional/Project Repositories]

MEMENTO - The MEMENTO database includes methane (CH4) and nitrous oxide (N2O) data from the global ocean (both open and coastal). The MEMENTO database is administered by the Kiel Data Management Team at GEOMAR Helmholtz Centre for Ocean Research and supported by the German BMBF project SOPRAN (Surface Ocean Processes in the Anthropocene). The database is accessible through the MEMENTO webpage. A login is required to access the data. Learn more here. [Methane (CH4) and Nitrous Oxide (N2O)]

ORNL DAAC - The Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC) for Biogeochemical Dynamics is a NASA Earth Observing System Data and Information System (EOSDIS) data center managed by the Earth Science Data and Information System (ESDIS) Project. The ORNL DAAC is operated by Oak Ridge National Laboratory (ORNL) in Oak Ridge, Tennessee, and is a member of the Remote Sensing and Environmental Informatics Group of the Environmental Sciences Division (ESD). The ORNL DAAC is a CoreTrustSeal Certified Repository. [NASA – Funded Research (selection of Distributed Active Archive Centers)]

National Centers for Environmental Information (NCEI) - Formerly National Climatic Data Center, the National Geophysical Data Center, and the National Oceanographic Data Center. NOAA’s NCEI hosts and provides access to one of the most significant archives on earth, with comprehensive oceanic, atmospheric, and geophysical data. From the depths of the ocean to the surface of the sun and from million-year-old ice core records to near-real-time satellite images, NCEI is the Nation’s leading authority for environmental information. Contact information. [NOAA - Funded Research]

National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) - The SRA is NIH’s primary archive of high-throughput sequencing data and is part of the International Nucleotide Sequence Database Collaboration (INSDC), which includes the NCBI Sequence Read Archive (SRA), the European Bioinformatics Institute (EBI), and the DNA Database of Japan (DDBJ). Data submitted to any of the three organizations are shared among them. SRA accepts data from all kinds of sequencing projects including clinically important studies that involve human subjects or their metagenomes, which may contain human sequences. These data often have controlled access via dbGaP (the database of Genotypes and Phenotypes). Details for submitting to SRA. [OMICs]

Metagenomics Rapid Annotation using Subsystem Technology (MG-RAST) - The metagenomics RAST server is a public resource for the automatic phylogenetic and functional analysis of metagenomes. Information about the services. [OMICs]

ESS-DIVE - The U.S. Department of Energy’s (DOE) Environmental Systems Science Data Infrastructure for a Virtual Ecosystem (ESS-DIVE) is a data repository for Earth and environmental science data. ESS-DIVE stores and publicly distributes data from observational, experimental, and modeling research funded by the DOE’s Office of Science under its Subsurface Biogeochemical Research (SBR) and Terrestrial Ecosystem Science (TES) programs within the Environmental Systems Science (ESS) activity. Information for Data Upload. [U.S. Department of Energy – Funded research]

ScienceBase - The primary data archive for recent and ongoing USGS investigations, ScienceBase provides access to aggregated information derived from many data and information domains, including feeds from existing data systems, metadata catalogs, and scientists contributing new and original content. ScienceBase is a USGS Trusted Digital Repository. See here for information on preparing your data for public release on ScienceBase. [U.S. Geological Survey (USGS) – Funded Research]

National Water Information System - Water information is fundamental to national and local economic well-being, protection of life and property, and effective management of the Nation’s water resources. The USGS works with partners to monitor, assess, conduct targeted research, and deliver information on a wide range of water resources and conditions including streamflow, groundwater, water quality, and water use and availability. [U.S. Geological Survey (USGS) – Funded Research]

CUAHSI - Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI) is a 501(c)(3) research organization representing more than 130 U.S. universities and international water science-related organizations. CUAHSI receives support from the National Science Foundation (NSF) to develop infrastructure and services for the advancement of water science in the United States. Information about submitting data. [Water / Hydrologic data]

The Biological and Chemical Oceanography Data Management Office (BCO-DMO) - BCO-DMO is a publicly accessible earth science data repository created to curate, publicly serve (publish), and archive digital data and information from biological, chemical and biogeochemical research conducted globally in coastal, marine, Great Lakes, and laboratory environments.

  • The BCO-DMO repository provides data management services at no additional cost to investigators funded through the following programs: NSF OCE Division Biological and Chemical Sections; NSF Division of Polar Programs Antarctic Organisms & Ecosystems; Gordon and Betty Moore Foundation Marine Microbiology Initiative (GBMF MMI). In certain circumstances, the office can provide data management services for a fee to investigators funded outside of these programs. Please contact the office at info@bco-dmo.org to learn more.
  • BCO-DMO works closely with individual investigators and data originators throughout the data life cycle, from data management planning support, quality control and metadata assembly, to DOI creation and archive with appropriate national facilities. The office ensures all contributed project data and metadata are in compliance with current funder policies (i.e., NSF OCE Sample and Data Policy, NSF 17-037) and offers investigators the option to embargo data (in accordance with funder policies) until publication. Dataset DOIs are obtained and available once investigators review final curated data packages for publishing. Dataset DOIs, once generated, may be used for scholarly publication and/or funder reporting. BCO-DMO accepts scholarly publication DOIs and can link these to their respective datasets.
  • Data accepted by BCO-DMO include all project output (observational data, derived and statistical products, analysis code, software and models, and supporting documentation such as reports and calibration information). The office accepts a wide variety of data types and formats, and works to publish a non-proprietary, research-ready, data package available to new research. To contribute data to BCO-DMO, please see the “How to Get Started Contributing >Data” page, located under the Resources tab of the BCO-DMO website. [Water / Hydrologic data]