Use-Cases

For a basic introduction to pyDataverse, see User Guide - Basic Usage; for advanced topics, see User Guide - Advanced Usage.

Data Migration

Importing large amounts of data from sources outside Dataverse can be done with the help of the CSV templates. Add your data to the CSV files, import them into pyDataverse, and then upload the data and metadata via the API.

  • CSV 2 Dataverse (Tutorial)
  • Dataverse 2 Dataverse (mapping from Dataverse 2 pyDataverse missing)
  • DSpace 2 Dataverse (mapping from DSpace 2 pyDataverse missing)
  • NESSTAR 2 Dataverse (mapping from NESSTAR 2 pyDataverse missing)

Contributions of new mappings are very welcome.
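
As a rough illustration of the first migration step (turning CSV rows into records that can later be uploaded), the sketch below reads CSV text into one dictionary per row using only the Python standard library. The column names (`title`, `author`) and the helper `rows_from_csv` are made up for this example; they are not the columns of the pyDataverse CSV templates, nor its import function.

```python
import csv
import io


def rows_from_csv(text):
    """Parse CSV text into a list of dicts, one per data row."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        cleaned = {}
        for key, value in row.items():
            # Keep only non-empty string cells, so missing metadata is
            # absent from the record instead of being an empty string.
            if isinstance(value, str) and value.strip():
                cleaned[key] = value.strip()
        records.append(cleaned)
    return records


# Hypothetical two-column file; real templates have many more columns.
sample = "title,author\nYouth Survey 2018,Jane Doe\nHousehold Panel,\n"
datasets = rows_from_csv(sample)
```

Each resulting dictionary could then be mapped onto a metadata model before upload; that mapping is exactly the part that differs per source system.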

Testing

Create test data for integrity tests (DevOps)

Get full lists of all Dataverses, Datasets and Datafiles of an instance, or of a subset of it. The results are stored in JSON files, which can then be used for data integrity tests and completeness checks. This is typically done after an upgrade or a Dataverse migration. The data integrates easily into aussda_tests and into any CI build tool.
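
One possible way to use such JSON listings for a completeness check is to compare a snapshot taken before an upgrade with one taken afterwards. The sketch below assumes a hypothetical tree layout in which every node has a `type`, an `id` and an optional `children` list; the actual structure of the stored JSON files may differ, and the helper names are illustrative only.

```python
import json


def collect_ids(node, found=None):
    """Walk a nested tree and collect the (type, id) pair of every node."""
    if found is None:
        found = set()
    found.add((node["type"], node["id"]))
    for child in node.get("children", []):
        collect_ids(child, found)
    return found


def missing_after(before_json, after_json):
    """Return the sorted (type, id) pairs present before but not after."""
    before = collect_ids(json.loads(before_json))
    after = collect_ids(json.loads(after_json))
    return sorted(before - after)


# Hypothetical snapshots: the datafile "f1" disappeared after the upgrade.
before = json.dumps({
    "type": "dataverse", "id": "root", "children": [
        {"type": "dataset", "id": "ds1", "children": [
            {"type": "datafile", "id": "f1"},
        ]},
    ],
})
after = json.dumps({
    "type": "dataverse", "id": "root", "children": [
        {"type": "dataset", "id": "ds1", "children": []},
    ],
})
lost = missing_after(before, after)
```

An empty result means nothing from the earlier snapshot went missing, which makes the check easy to use as a pass/fail step in a CI job.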

Mass removal of data in Dataverse (DevOps)

After testing, you often have to clean up a Dataverse collection together with the Datasets and Datafiles inside it. Removing them all at once can be tricky; pyDataverse aims to make this possible with only a few commands.

This functionality is not yet fully implemented in pyDataverse, but you can find it in aussda_tests.
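
Part of what makes mass removal tricky is ordering: assuming a collection has to be empty before it can be removed, the contents must be deleted before their container. The sketch below only computes such a bottom-up order from a hypothetical tree of nodes with `type`, `id` and `children` keys; it does not call any API and is not the aussda_tests implementation.

```python
def deletion_order(node):
    """Return (type, id) pairs so that children come before their parent."""
    order = []
    for child in node.get("children", []):
        order.extend(deletion_order(child))
    # The node itself is appended last, after everything inside it.
    order.append((node["type"], node["id"]))
    return order


# Hypothetical test collection with one nested sub-collection.
tree = {
    "type": "dataverse", "id": "root", "children": [
        {"type": "dataset", "id": "ds1", "children": [
            {"type": "datafile", "id": "f1"},
        ]},
        {"type": "dataverse", "id": "sub", "children": [
            {"type": "dataset", "id": "ds2"},
        ]},
    ],
}
steps = deletion_order(tree)
```

Each entry in `steps` would then be handed to the matching delete call for its type, in list order.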

Data Science Pipeline

Use data and/or metadata from a Dataverse instance by retrieving it through the API. Or, if you have created data yourself, add it to your Dataset automatically. pyDataverse connects your data science pipeline with your Dataverse instance.
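
To give an idea of what the retrieval step looks like at the HTTP level, here is a minimal, dependency-free sketch that requests Dataset metadata from the Dataverse native API by persistent identifier. It deliberately uses the standard library instead of pyDataverse; the function names are illustrative, and the base URL and DOI are placeholders, not real data.

```python
import json
import urllib.parse
import urllib.request


def dataset_metadata_url(base_url, pid):
    """Build the native API URL for a Dataset's metadata by persistent ID."""
    query = urllib.parse.urlencode({"persistentId": pid})
    return f"{base_url.rstrip('/')}/api/datasets/:persistentId/?{query}"


def fetch_dataset_metadata(base_url, pid, api_token=None):
    """Fetch and decode the metadata JSON (performs a network request)."""
    request = urllib.request.Request(dataset_metadata_url(base_url, pid))
    if api_token:
        # Token header used by the Dataverse API for authenticated calls.
        request.add_header("X-Dataverse-key", api_token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Placeholder instance and DOI; only the URL is built here, nothing is fetched.
url = dataset_metadata_url("https://demo.dataverse.org/", "doi:10.5072/FK2/EXAMPLE")
```

In a pipeline, the decoded JSON would typically be passed on to the analysis step, while results flow back through the corresponding upload calls.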

Web-Applications / Microservices

Because pyDataverse is a direct and easy way to access Dataverse's APIs and to manipulate its data models, it integrates well into all kinds of web applications and microservices, for example to visualize data, run analyses, or enrich it with other data sources.