
Where the data comes from

The NIAID Data Ecosystem Discovery Portal harvests metadata from many data sources to make it easier to find allergic, infectious and immune-mediated disease (IID) datasets.

What is metadata?

Metadata is data about data: who collected the data, how it was collected, what the data contains, and so on. For something like a genome sequence, the data would be the actual sequence of nucleotides, while the metadata would include the author of the data, the date the data was modified, the measurementTechniques used to collect the data, the healthCondition that is the focus of the dataset (like COVID-19, asthma, or autoimmune diseases), and more. Using the Discovery Portal, users can search these metadata to find datasets of interest.
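
To make the distinction concrete, here is a minimal sketch of what a metadata record for a genome sequence might look like, along with a simple keyword search over it. The field names author, measurementTechniques, and healthCondition come from the description above; the name and dateModified keys, all of the values, and the matches helper are hypothetical and only for illustration. This is not how the Discovery Portal implements search.

```python
# Hypothetical metadata record. The sequence itself (the data) is not here;
# only the descriptive fields (the metadata) are. All values are made up.
record = {
    "name": "Example SARS-CoV-2 genome sequence",
    "author": "Example Lab",
    "dateModified": "2023-01-15",
    "measurementTechniques": ["whole genome sequencing"],
    "healthCondition": ["COVID-19"],
}


def matches(record, term):
    """Return True if the search term appears in any metadata value."""
    term = term.lower()
    for value in record.values():
        # Treat single values and lists of values the same way.
        values = value if isinstance(value, list) else [value]
        if any(term in str(v).lower() for v in values):
            return True
    return False
```

Calling matches(record, "covid-19") checks every field, so a dataset can be found by its health condition, author, or technique without opening the underlying data.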


Where does the data come from?

The Discovery Portal collects metadata from data sources that are related to allergic, infectious and immune-mediated disease (IID). Some sources included in the Discovery Portal are also general biological data sources or generalist repositories. The breakdown of the number of resources by data source is found on the Discovery Portal homepage. The Sources page provides more details about each source and the last time its metadata was harvested. The list of sources used by the Discovery Portal is also constantly expanding.

List of data sources that are found on the NIAID Data Ecosystem

How are the repositories / sources chosen?

Data sources were initially selected based on interviews with researchers about the data sources they use. Because the Discovery Portal is continuously growing, the team is always looking to add other repositories to the search platform.

📘 Want to suggest a new source?

Suggesting new data sources / features


How often is metadata harvested?

Currently, metadata is harvested every quarter. The Sources page lists the last time the metadata was collected from the source.


Is the metadata manipulated?

Some of the metadata provided by the sources is cleaned so it is more standardized and easily searchable. Mostly, this involves standardizing the names of metadata fields (variables) so they are consistent between sources (read more about data schemas). These changes are tracked on the schema tab of the Sources page:

Table with metadata properties from source and its associated property in the NIAID Data Ecosystem.
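
The kind of field-name standardization described above can be sketched as a lookup from each source's property names to a shared set of names. This is an illustrative sketch only: the source names, the source-side property names, the FIELD_MAP table, and the standardize function are all hypothetical, and the real mappings are the ones listed on the schema tab of the Sources page.

```python
# Hypothetical mapping from each source's own property names to the
# standardized names. The actual mappings are on the Sources page.
FIELD_MAP = {
    "source_a": {"title": "name", "creator": "author", "disease": "healthCondition"},
    "source_b": {"dataset_name": "name", "submitter": "author", "condition": "healthCondition"},
}


def standardize(record, source):
    """Rename a record's keys to the standardized names for its source.

    Keys without a known mapping are kept unchanged.
    """
    mapping = FIELD_MAP[source]
    return {mapping.get(key, key): value for key, value in record.items()}
```

After this step, records from both hypothetical sources expose the same healthCondition key, so a single search over that field covers every source.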

📘 Accessing the standardized metadata

The standardized metadata can be freely accessed using the Discovery Portal's open metadata API.
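
As a rough sketch of what querying an open metadata API could look like, the snippet below builds a search URL from a keyword. The base URL is a placeholder, and the q and size parameter names are assumptions rather than details confirmed by this page; consult the API documentation for the real endpoint and parameters before use.

```python
from urllib.parse import urlencode


def build_query_url(base_url, term, size=10):
    """Build a search URL for a keyword query.

    base_url is a placeholder supplied by the caller; the "q" and "size"
    parameter names are assumptions, not confirmed by this page.
    """
    params = {"q": term, "size": size}
    return f"{base_url}?{urlencode(params)}"


# Placeholder address, not the real API endpoint.
url = build_query_url("https://api.example.org/query", "asthma")
```

The resulting URL could then be fetched with any HTTP client. Keeping URL construction separate from the request makes it easy to swap in the real endpoint once it is known.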

