pyDataverse

Release v0.3.1.


pyDataverse is a Python module for Dataverse you can use for:

  • accessing the Dataverse APIs
  • manipulating and using the Dataverse (meta)data - Dataverses, Datasets, Datafiles

Whether you want to import large amounts of data into Dataverse, test your Dataverse instance after deployment, or just make basic API calls, pyDataverse helps you work with Dataverse.

pyDataverse is fully Open Source and can be used by everybody.

Install

To install pyDataverse, simply run this command in your terminal of choice:

pip install pyDataverse

Find more options at Installation.
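
To confirm that the installation worked, a small check like the following can help. This is only a sketch: installed_version is a hypothetical helper written for this example (it is not part of pyDataverse), and it relies on importlib.metadata from the standard library, which is only available on Python 3.8 and later.

```python
import sys
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it cannot be found.

    Hypothetical helper for illustration; not part of pyDataverse.
    """
    try:
        # importlib.metadata is part of the standard library from Python 3.8 on.
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:
        # Older interpreter: this sketch cannot tell, use `pip show` instead.
        return None
    try:
        return version(package)
    except PackageNotFoundError:
        return None


print("Python", ".".join(str(part) for part in sys.version_info[:3]))
print("pyDataverse", installed_version("pyDataverse") or "not found")
```

If the second line prints "not found", either the package is not installed in the active environment or the interpreter is older than 3.8; running pip show pyDataverse in the terminal gives the same information.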

Requirements

pyDataverse officially supports Python 3.6–3.8.

Python packages required:

External packages required:

Quickstart

Warning

Do not execute the example code on a Dataverse production instance unless you are completely sure about what it does!

Import Dataset metadata JSON

To import the metadata of a Dataset from Dataverse’s own JSON format, use ds.from_json(). The imported metadata can then be retrieved as a dictionary with get().

For this example, we use the dataset.json from tests/data/user-guide/ (GitHub repo) and place it in the root directory.

>>> from pyDataverse.models import Dataset
>>> from pyDataverse.utils import read_file
>>> ds = Dataset()
>>> ds_filename = "dataset.json"
>>> ds.from_json(read_file(ds_filename))
>>> ds.get()
{'citation_displayName': 'Citation Metadata', 'title': 'Youth in Austria 2005', 'author': [{'authorName': 'LastAuthor1, FirstAuthor1', 'authorAffiliation': 'AuthorAffiliation1'}], 'datasetContact': [{'datasetContactEmail': 'ContactEmail1@mailinator.com', 'datasetContactName': 'LastContact1, FirstContact1'}], 'dsDescription': [{'dsDescriptionValue': 'DescriptionText'}], 'subject': ['Medicine, Health and Life Sciences']}
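
To make the relationship between Dataverse’s nested JSON and the flat dictionary above more tangible, here is a rough, standard-library-only sketch of such a flattening step. It is not pyDataverse’s actual implementation: flatten_field and flatten_citation are hypothetical names, the sample document is a hand-written, shortened example of the Dataverse dataset JSON layout, and only the citation block is handled.

```python
import json

# Hand-written, shortened sample in Dataverse's dataset JSON layout (assumption:
# real files contain more fields and possibly more metadata blocks).
RAW = """
{
  "datasetVersion": {
    "metadataBlocks": {
      "citation": {
        "displayName": "Citation Metadata",
        "fields": [
          {"typeName": "title", "multiple": false, "typeClass": "primitive",
           "value": "Youth in Austria 2005"},
          {"typeName": "subject", "multiple": true,
           "typeClass": "controlledVocabulary",
           "value": ["Medicine, Health and Life Sciences"]},
          {"typeName": "author", "multiple": true, "typeClass": "compound",
           "value": [
             {"authorName": {"typeName": "authorName", "multiple": false,
                             "typeClass": "primitive",
                             "value": "LastAuthor1, FirstAuthor1"}}
           ]}
        ]
      }
    }
  }
}
"""


def flatten_field(field):
    """Reduce one metadata field to its plain value(s) (hypothetical helper)."""
    value = field["value"]
    if field["typeClass"] != "compound":
        # Primitive and controlled-vocabulary fields already hold plain values.
        return value
    # Compound fields hold dicts of subfields; keep only each subfield's value.
    entries = value if field["multiple"] else [value]
    flat = [{name: sub["value"] for name, sub in entry.items()} for entry in entries]
    return flat if field["multiple"] else flat[0]


def flatten_citation(raw):
    """Flatten the citation block of a dataset JSON string into one dict."""
    block = json.loads(raw)["datasetVersion"]["metadataBlocks"]["citation"]
    result = {"citation_displayName": block["displayName"]}
    for field in block["fields"]:
        result[field["typeName"]] = flatten_field(field)
    return result


flat = flatten_citation(RAW)
print(flat["title"])
print(flat["author"])
```

The point of the sketch is only the shape of the data: each entry in "fields" carries its own typeName and value, and compound fields such as author nest one more level of the same structure.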

Create Dataset by API

To access Dataverse’s Native API, you first have to instantiate NativeApi. Then create the Dataset through the API with create_dataset().

Like all API functions, this returns a requests.Response object. The DOI of the new Dataset is found in the data field of the JSON response.

Replace the following variables with the values for your own instance before you run the code:

  • BASE_URL: Base URL of your Dataverse instance, without trailing slash (e.g. https://data.aussda.at)
  • API_TOKEN: API token of a Dataverse user with sufficient rights to create a Dataset
  • DV_PARENT_ALIAS: Alias of the Dataverse the Dataset should be attached to

>>> from pyDataverse.api import NativeApi
>>> api = NativeApi(BASE_URL, API_TOKEN)
>>> resp = api.create_dataset(DV_PARENT_ALIAS, ds.json())
Dataset with pid 'doi:10.5072/FK2/UTGITX' created.
>>> resp.json()
{'status': 'OK', 'data': {'id': 251, 'persistentId': 'doi:10.5072/FK2/UTGITX'}}
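
When scripting this step, it can be useful to clean up the base URL and pull the persistent identifier out of the decoded response defensively. The following is a minimal sketch under stated assumptions: normalize_base_url and extract_pid are hypothetical helpers (not part of pyDataverse), and the error payload shape with a "status" of "ERROR" and a "message" key is an assumption about how a failed request is reported.

```python
from typing import Any, Dict, Optional


def normalize_base_url(url: str) -> str:
    """Strip whitespace and trailing slashes, as BASE_URL must not end in '/'."""
    return url.strip().rstrip("/")


def extract_pid(payload: Dict[str, Any]) -> Optional[str]:
    """Return the persistentId from a decoded API response, or None.

    `payload` is the dict obtained from resp.json(). Hypothetical helper.
    """
    if payload.get("status") != "OK":
        return None
    return payload.get("data", {}).get("persistentId")


# Response body taken from the example above.
ok_payload = {"status": "OK", "data": {"id": 251, "persistentId": "doi:10.5072/FK2/UTGITX"}}
# Assumed shape of an error response, for illustration only.
error_payload = {"status": "ERROR", "message": "Bad api key"}

print(normalize_base_url("https://data.aussda.at/"))
print(extract_pid(ok_payload))
print(extract_pid(error_payload))
```

In a real script, extract_pid(resp.json()) would be called on the response returned by create_dataset(); returning None instead of raising keeps the decision about error handling with the caller.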

For more tutorials, check out User Guide - Basic Usage and User Guide - Advanced Usage.

Features

  • Comprehensive API wrapper for all Dataverse APIs and most of their endpoints
  • Data models for each of Dataverse’s data types: Dataverse, Dataset and Datafile
  • Data conversion to and from Dataverse’s own JSON format for API uploads
  • Easy mass imports and exports through CSV templates
  • Utils with helper functions
  • Documented examples and functionalities
  • Custom exceptions
  • Tested (Travis CI) and documented (Read the Docs)
  • Open Source (MIT)

Reference / API

If you are looking for information on a specific class, function, or method, this part of the documentation is for you.

Community Guide

This part of the documentation, which is mostly prose, details the pyDataverse ecosystem and community.

Thanks!

To everyone who has contributed to pyDataverse, whether with an idea, an issue, a pull request, by developing tools it relies on, by sharing it with others, or by any other means: Thank you for your support!

Open Source projects thrive on the cooperation of the many, and pyDataverse is no exception, so saying thank you is the least that can be done.

Special thanks to Lars Kaczmirek, Veronika Heider, Christian Bischof, Iris Butzlaff and everyone else from AUSSDA, Slava Tykhonov and Marion Wittenberg from DANS, and all the people who do an amazing job developing Dataverse at IQSS, especially Phil Durbin for his support from the first minute.

pyDataverse is funded by AUSSDA - The Austrian Social Science Data Archive and through the EU Horizon2020 programme SSHOC - Social Sciences & Humanities Open Cloud (T5.2).

License

Copyright Stefan Kasberger and others, 2019-2021.

Distributed under the terms of the MIT license, pyDataverse is free and open source software.

Full License Text: LICENSE.txt