Data indexed by GloBI is generously provided by researchers and collections openly sharing their datasets. When using this data, please make sure to attribute the original data contributors, including citing the specific datasets in derivative works. Each record indexed by GloBI contains a reference and dataset citation. Also, please consider contributing to improve access to existing species interaction data.
Species interaction datasets indexed by GloBI can be accessed in various ways. For most users, this website and its pages are a good place to browse the data. Other projects, such as Encyclopedia of Life and Ecosystem Explorer, present GloBI data in a human-readable format.
Exploratory, interactive queries can be executed through the GloBI Search/Browse pages or by using the REST-y GloBI Web API. For those who use R, rglobi is available to explore interaction data.
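As a rough sketch of what an exploratory Web API query might look like, the Python snippet below only builds a query URL with the standard library. The base URL `https://api.globalbioticinteractions.org/interaction` and the parameter names `sourceTaxon` and `interactionType` are assumptions that are not stated on this page, so please check the Web API documentation before relying on them. The helper `build_interaction_url` is hypothetical.

```python
from urllib.parse import quote, urlencode

# Assumed base URL; verify against the GloBI Web API documentation.
ASSUMED_API_BASE = "https://api.globalbioticinteractions.org/interaction"


def build_interaction_url(source_taxon, interaction_type=None, base=ASSUMED_API_BASE):
    """Build a query URL for interactions of a source taxon (hypothetical helper)."""
    # Parameter names are assumptions, not confirmed by this page.
    params = {"sourceTaxon": source_taxon}
    if interaction_type is not None:
        params["interactionType"] = interaction_type
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return base + "?" + urlencode(params, quote_via=quote)


if __name__ == "__main__":
    print(build_interaction_url("Enhydra lutris", "eats"))
```

The resulting URL can be opened in a browser or passed to any HTTP client. Keeping the URL construction separate from the request makes it easy to inspect a query before sending it.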
For research or other data-intensive projects, please use GloBI’s stable, versioned, integrated data published via doi:10.5281/zenodo.3950589 or, perhaps even better, consider using the original underlying datasets. Please see the process page to better understand how GloBI integrates data so that you can make an informed decision on what data to use for your studies. These data products also include a neo4j archive as well as an RDF/N-Quads archive if you’d like to load GloBI data into your own graph/triple store.
If you feel adventurous and would like to have the most recent data, you can use the provided unstable snapshots.
Data Schema and Definitions
You can find information related to the schema and definitions of field names for both the API and downloadable products. The definitions are in various formats, including Java source code, JSON, and CSV. The documentation may be incomplete or outdated, so questions and comments are welcome; please open an issue.
Interaction Data Indexes
The table below is also available as a tab-separated values file via data.tsv.
data | description |
---|---|
citations.tsv.gz stable / snapshot | contains data citations in a gzipped tab-separated values format. Note that each row in the interactions.tsv/csv files also contains citations. |
refuted-verbatim-interactions.tsv.gz stable / snapshot | contains refuted species interactions tabulated as pair-wise interactions in a gzipped tab-separated values format. Included taxonomic names are not interpreted, but included as documented in their sources. For column definitions, see data dictionary. |
interactions.nq.gz stable / snapshot | contains species interactions expressed in the resource description framework in a gzipped RDF/N-Quads format. |
taxonMap.tsv.gz stable / snapshot | describes how names in existing datasets were mapped into existing naming schemes in a gzipped tab-separated values format. |
taxonCache.tsv.gz stable / snapshot | contains hierarchies and identifiers associated with names from naming schemes in a gzipped tab-separated values format. |
dwca.zip stable / snapshot | contains species interactions data as a zip file resembling a Darwin Core Archive using a custom, occurrence level, association extension. |
dwca-by-study.zip stable / snapshot | contains species interactions data as a zip file resembling a Darwin Core Archive aggregated by study using a custom, occurrence level, association extension. |
neo4j-graphdb.zip stable / snapshot | contains a neo4j v3.5.32 database with a graph representation of the species interaction data. Want to run your own? Neo4j Server Community Edition is available for download for Mac, Windows, and Ubuntu/Debian. Then, copy the graph.db folder into data/databases/. |
datasets.tsv.gz stable / snapshot | contains namespaces of indexed datasets in tab-separated format. |
datasets.csv.gz stable / snapshot | contains namespaces of indexed datasets in comma-separated format. |
citations.csv.gz stable / snapshot | contains data citations in a gzipped comma-separated values format. Note that each row in the interactions.tsv/csv files also contains citations. |
interactions.tsv.gz stable / snapshot | contains species interactions tabulated as pair-wise interactions in a gzipped tab-separated values format. Included taxonomic names are interpreted using taxonomic alignment workflows and may be different than those provided by the original sources. For column definitions, see data dictionary. |
interactions.csv.gz stable / snapshot | contains species interactions tabulated as pair-wise interactions in a gzipped comma-separated values format. Included taxonomic names are interpreted using taxonomic alignment workflows and may be different than those provided by the original sources. For column definitions, see data dictionary. |
verbatim-interactions.tsv.gz stable / snapshot | contains species interactions tabulated as pair-wise interactions in a gzipped tab-separated values format. Included taxonomic names are not interpreted, but included as documented in their sources. For column definitions, see data dictionary. |
verbatim-interactions.csv.gz stable / snapshot | contains species interactions tabulated as pair-wise interactions in a gzipped comma-separated values format. Included taxonomic names are not interpreted, but included as documented in their sources. For column definitions, see data dictionary. |
refuted-interactions.tsv.gz stable / snapshot | contains refuted species interactions tabulated as pair-wise interactions in a gzipped tab-separated values format. Included taxonomic names are interpreted using taxonomic alignment workflows and may be different than those provided by the original sources. For column definitions, see data dictionary. |
refuted-interactions.csv.gz stable / snapshot | contains refuted species interactions tabulated as pair-wise interactions in a gzipped comma-separated values format. Included taxonomic names are interpreted using taxonomic alignment workflows and may be different than those provided by the original sources. For column definitions, see data dictionary. |
refuted-verbatim-interactions.csv.gz stable / snapshot | contains refuted species interactions tabulated as pair-wise interactions in a gzipped comma-separated values format. Included taxonomic names are not interpreted, but included as documented in their sources. For column definitions, see data dictionary. |
sqlite, the most used database engine in the world | create a sqlite3 database using: `cat interactions.csv.gz \| gunzip \| sqlite3 -csv globi.db '.import /dev/stdin interactions'`. If you’d like to reduce your database size, you can drop columns before importing them using power tools like cut or mlr (Miller). See also importing csv files. |
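
The sqlite3 shell command in the last table row can be approximated in Python with only the standard library. This is a minimal sketch, assuming the gzipped CSV starts with a header row and that every row has as many fields as the header. The helper `load_interactions` is hypothetical, and the column names in the usage note (`sourceTaxonName`, `interactionTypeName`, `targetTaxonName`) are assumed for illustration; please check the data dictionary for the actual names. The `keep` argument mimics dropping columns before import to reduce database size.

```python
import csv
import gzip
import sqlite3


def load_interactions(gz_path, db_path, keep=None, table="interactions"):
    """Load a gzipped CSV with a header row into a SQLite table (hypothetical helper).

    keep: optional list of column names to import; all columns when None.
    Returns the number of rows inserted.
    """
    with gzip.open(gz_path, "rt", encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader)
        columns = list(keep) if keep is not None else header
        indexes = [header.index(name) for name in columns]

        # Quote identifiers so unusual column names do not break the SQL.
        quoted = ", ".join('"' + name.replace('"', '""') + '"' for name in columns)
        placeholders = ", ".join("?" for _ in columns)

        connection = sqlite3.connect(db_path)
        try:
            connection.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({quoted})')
            before = connection.total_changes
            # Stream rows so the whole file never has to fit in memory.
            rows = ([row[i] for i in indexes] for row in reader)
            connection.executemany(
                f'INSERT INTO "{table}" ({quoted}) VALUES ({placeholders})', rows
            )
            connection.commit()
            return connection.total_changes - before
        finally:
            connection.close()
```

For example, `load_interactions("interactions.csv.gz", "globi.db", keep=["sourceTaxonName", "interactionTypeName", "targetTaxonName"])` would import only three columns, provided those names exist in the file's header.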
If you’d like to better understand how the above integrated data products came about, please visit the Data Integration Process page. Also, see the Accessing Species Interaction Data wiki page for additional information about data access methods.
In case the provided methods to access species interactions data do not quite suit your needs, please open an issue or contact the author(s) of doi:10.1016/j.ecoinf.2014.08.005.