rOpenSci | Data Publication

Document and Release Your Data
dataspice
CRAN

Create Lightweight Schema.org Descriptions of Data

Bryce Mecum
Description

The goal of dataspice is to make it easier for researchers to create basic, lightweight, and concise metadata files for their datasets. These basic files can then be used to make useful information available during analysis, create a helpful dataset “README” webpage, and produce more complex metadata formats to aid dataset discovery. Metadata fields are based on the Schema.org and Ecological Metadata Language standards.
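As a rough illustration of what a lightweight Schema.org description looks like, the sketch below builds a minimal JSON-LD `Dataset` record in Python. It does not use dataspice or its API. The helper name `make_dataset_record` and all field values are hypothetical; only the property names (`name`, `description`, `creator`, `keywords`, `license`) come from the Schema.org Dataset vocabulary.

```python
import json


def make_dataset_record(name, description, creators, keywords, license_url):
    """Build a minimal Schema.org Dataset description as a JSON-LD dict.

    Hypothetical helper for illustration only; not part of dataspice.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "creator": [{"@type": "Person", "name": c} for c in creators],
        "keywords": list(keywords),
        "license": license_url,
    }


# Placeholder values; this is not a real dataset.
record = make_dataset_record(
    name="Example stream temperature survey",
    description="Hourly water temperature readings from a hypothetical field site.",
    creators=["A. Researcher"],
    keywords=["temperature", "streams"],
    license_url="https://creativecommons.org/publicdomain/zero/1.0/",
)

# Serialize to JSON-LD text, e.g. for embedding in a dataset README page.
dataset_jsonld = json.dumps(record, indent=2)
print(dataset_jsonld)
```

A record like this is small enough to write by hand, which is the sense in which such metadata is "lightweight"; richer formats such as EML carry considerably more structure.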

EML
CRAN

Read and Write Ecological Metadata Language Files

Carl Boettiger
Description

Work with Ecological Metadata Language (EML) files. EML is a widely used metadata standard in the ecological and environmental sciences, described in Jones et al. (2006), doi:10.1146/annurev.ecolsys.37.091305.110031.

piggyback
CRAN Peer-reviewed

Managing Larger Data on a GitHub Repository

Carl Boettiger
Description

Because larger (> 50 MB) data files cannot easily be committed to git, a different approach is required to manage data associated with an analysis in a GitHub repository. This package provides a simple work-around by allowing larger (up to 2 GB) data files to piggyback on a repository as assets attached to individual GitHub releases. These files are not handled by git in any way, but instead are uploaded, downloaded, or edited directly by calls through the GitHub API. These data files can be versioned manually by creating different releases. This approach works equally well with public or private repositories. Data can be uploaded and downloaded programmatically from scripts. No authentication is required to download data from public repositories.
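To make the release-asset idea concrete, the following Python sketch builds the URLs under which a file attached to a GitHub release is typically reachable. It is not piggyback's API. The function names and the repository, tag, and file names are hypothetical, and the URL patterns are assumed to follow GitHub's usual release download and REST API conventions.

```python
from urllib.parse import quote


def release_asset_url(owner, repo, tag, filename):
    """Return the direct download URL for a file attached to a GitHub release.

    Assumes GitHub's usual pattern:
    https://github.com/<owner>/<repo>/releases/download/<tag>/<filename>
    """
    parts = [quote(p, safe="") for p in (owner, repo, tag, filename)]
    return "https://github.com/{}/{}/releases/download/{}/{}".format(*parts)


def release_api_url(owner, repo, tag):
    """Return the REST API URL that describes the release with the given tag."""
    parts = [quote(p, safe="") for p in (owner, repo, tag)]
    return "https://api.github.com/repos/{}/{}/releases/tags/{}".format(*parts)


# Hypothetical repository and file names, for illustration only.
v1 = release_asset_url("example-user", "example-repo", "v0.0.1", "observations.csv.gz")
v2 = release_asset_url("example-user", "example-repo", "v0.0.2", "observations.csv.gz")
api = release_api_url("example-user", "example-repo", "v0.0.1")

print(v1)
print(v2)
print(api)
```

The same file name under two release tags yields two distinct URLs, which is how manual versioning through releases works: each release keeps its own copy of the asset, independent of the git history.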

Scientific use cases
  1. Boettiger, C. (2018). Managing Larger Data on a GitHub Repository. Journal of Open Source Software, 3(29), 971. https://doi.org/10.21105/joss.00971
codemetar
CRAN Peer-reviewed

Generate CodeMeta Metadata for R Packages

Carl Boettiger
Description

The CodeMeta Project defines a JSON-LD format for describing software metadata, as detailed at https://codemeta.github.io. This package provides utilities to generate, parse, and modify codemeta.json files automatically for R packages, as well as tools and examples for working with codemeta.json JSON-LD more generally.
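The generate, parse, and modify cycle can be sketched in a few lines of Python. This is not codemetar's API. The helpers `new_codemeta` and `bump_version` and the package details are hypothetical, and the context URL is assumed to be the CodeMeta 2.0 JSON-LD context.

```python
import json

# Assumed JSON-LD context for CodeMeta 2.0.
CODEMETA_CONTEXT = "https://doi.org/10.5063/schema/codemeta-2.0"


def new_codemeta(name, description, version, language="R"):
    """Generate a minimal codemeta.json document as a dict (hypothetical helper)."""
    return {
        "@context": CODEMETA_CONTEXT,
        "@type": "SoftwareSourceCode",
        "name": name,
        "description": description,
        "version": version,
        "programmingLanguage": language,
    }


def bump_version(codemeta_json, new_version):
    """Parse codemeta.json text, change its version, and return the new text."""
    doc = json.loads(codemeta_json)
    doc["version"] = new_version
    return json.dumps(doc, indent=2)


# Placeholder package details, for illustration only.
original = json.dumps(
    new_codemeta("examplepkg", "A hypothetical R package.", "0.1.0"), indent=2
)
updated = bump_version(original, "0.2.0")
print(updated)
```

Because codemeta.json is ordinary JSON with a JSON-LD context, any JSON tooling can read and edit it; the value of a dedicated package lies in filling the fields automatically from existing package metadata.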

RNeXML
CRAN

Semantically Rich I/O for the NeXML Format

Carl Boettiger
Description

Provides access to phyloinformatic data in NeXML format. The package is intended to add new functionality to R, such as the ability to manipulate NeXML objects in more varied and refined ways, and compatibility with ape objects.

Scientific use cases
  1. Stöver, B. C., Wiechers, S., & Müller, K. F. (2019). JPhyloIO: a Java library for event-based reading and writing of different phylogenetic file formats through a common interface. BMC Bioinformatics, 20(1). https://doi.org/10.1186/s12859-019-2982-3
rfigshare
CRAN

An R Interface to figshare

Carl Boettiger
Description

An R interface to figshare.

Scientific use cases
  1. White, L., & Santy, S. (2018). DataDepsGenerators.jl: making reusing data easy by automatically generating DataDeps.jl registration code. Journal of Open Source Software, 3(31), 921. https://doi.org/10.21105/joss.00921
datasauce

Create and manipulate Schema.org Dataset metadata

Carl Boettiger