Use-Cases

For a basic introduction to pyDataverse, see User Guide - Basic Usage; for advanced topics, see User Guide - Advanced Usage.

Data Migration

Large amounts of data from sources outside Dataverse can be imported with the help of the CSV templates. Add your data to the CSV files, import them into pyDataverse, and then upload the data and metadata via the API.

CSV 2 Dataverse (Tutorial)
Dataverse 2 Dataverse (mapping from Dataverse 2 pyDataverse missing)
DSpace 2 Dataverse (mapping from DSpace 2 pyDataverse missing)
NESSTAR 2 Dataverse (mapping from NESSTAR 2 pyDataverse missing)

Contributions of new mappings are very welcome.
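The CSV workflow described above can be sketched roughly as follows. This is a minimal illustration, not the tutorial's code: the standard-library CSV reader stands in for pyDataverse's own CSV helper (`read_csv_as_dicts` in `pyDataverse.utils`), the column names in the sample are hypothetical, and `upload_datasets` assumes the `NativeApi.create_dataset()` and `Dataset.set()` / `Dataset.json()` calls behave as in pyDataverse 0.3.

```python
import csv
import io


def rows_from_csv(csv_text):
    """Parse CSV text into one dict per row (one row per Dataset)."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def upload_datasets(rows, base_url, api_token, dataverse_alias):
    """Create one Dataset per row inside the given Dataverse collection.

    Assumption: every row carries all metadata the Dataset schema requires
    and its column names match the Dataset model's attribute names.
    """
    # Imported lazily so the parsing step works without pyDataverse installed.
    from pyDataverse.api import NativeApi
    from pyDataverse.models import Dataset

    api = NativeApi(base_url, api_token)
    responses = []
    for row in rows:
        ds = Dataset()
        ds.set(row)
        responses.append(api.create_dataset(dataverse_alias, ds.json()))
    return responses


# Hypothetical sample; the real CSV templates define many more columns.
sample = "title,subject\nMy first dataset,Social Sciences\n"
rows = rows_from_csv(sample)
```

Keeping parsing and upload separate lets you inspect or validate the rows before anything is sent to the server.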

Testing

Create test data for integrity tests (DevOps)

Get full lists of all Dataverses, Datasets and Datafiles of an instance, or of a subset of it. The results are stored in JSON files, which can then be used to run data integrity tests and check data completeness. This is typically done after an upgrade or a Dataverse migration. The data integrates easily into aussda_tests and into any CI build tool.

Collect a data tree with all Dataverses, Datasets and Datafiles (get_children())
Extract Dataverses, Datasets and Datafiles from the tree (dataverse_tree_walker())
Save extracted data (save_tree_data())
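The three steps above, followed by a simple completeness check, might look like the sketch below. The argument names passed to `get_children()` and the default output filenames of `save_tree_data()` are assumptions based on pyDataverse 0.3; `missing_entries()` and `load_entries()` are hypothetical helpers written for this example and are not part of pyDataverse.

```python
import json


def collect_tree(base_url, api_token, parent=":root"):
    """Collect, flatten and save the data tree below `parent`."""
    # Imported lazily so the comparison helpers work without pyDataverse.
    from pyDataverse.api import NativeApi
    from pyDataverse.utils import dataverse_tree_walker, save_tree_data

    api = NativeApi(base_url, api_token)
    tree = api.get_children(
        parent, children_types=["dataverses", "datasets", "datafiles"]
    )
    dataverses, datasets, datafiles = dataverse_tree_walker(tree)
    # Assumption: default filenames are used when none are passed.
    save_tree_data(dataverses, datasets, datafiles)
    return dataverses, datasets, datafiles


def load_entries(filename):
    """Load a list of entries from one of the saved JSON files."""
    with open(filename, encoding="utf-8") as fh:
        return json.load(fh)


def missing_entries(before, after, key):
    """Return the entries of `before` whose `key` value is absent in `after`."""
    seen = {item[key] for item in after}
    return [item for item in before if item[key] not in seen]


# Hypothetical snapshots taken before and after an upgrade.
before = [{"pid": "doi:10.5072/FK2/AAA"}, {"pid": "doi:10.5072/FK2/BBB"}]
after = [{"pid": "doi:10.5072/FK2/AAA"}]
lost = missing_entries(before, after, "pid")
```

In a CI job, a non-empty `lost` list would be the signal to fail the build.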

Mass removal of data in Dataverse (DevOps)

After testing, you often have to clean up a Dataverse collection, including the Datasets and Datafiles within it. Removing them all at once can be tricky, but pyDataverse lets you do it with only a few commands:

Collect a data tree with all Dataverses and Datasets (get_children())
Extract Dataverses and Datasets from the tree (dataverse_tree_walker())
Save extracted data (save_tree_data())
Iterate over all Datasets to delete/destroy them (delete_dataset(), destroy_dataset())
Iterate over all Dataverses to delete them (delete_dataverse())

This functionality is not yet fully implemented in pyDataverse, but you can find it in aussda_tests.
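One way to approach the removal steps is sketched below; it is not the aussda_tests implementation. The node layout (`type`, `pid`, `dataverse_alias`, `children`) is an assumption about the tree returned by `get_children()`, and `removal_order()` is a hypothetical helper. It lists nested Dataverses before their parents, since a Dataverse generally has to be empty before it can be deleted.

```python
def removal_order(tree):
    """Return (dataset_pids, dataverse_aliases) in a safe deletion order.

    Dataverses are listed children-first, so each one is already empty
    by the time it is deleted.
    """
    pids, aliases = [], []

    def visit(node):
        for child in node.get("children", []):
            visit(child)
        if node["type"] == "dataset":
            pids.append(node["pid"])
        elif node["type"] == "dataverse":
            aliases.append(node["dataverse_alias"])

    for node in tree:
        visit(node)
    return pids, aliases


def remove_all(tree, base_url, api_token):
    """Destroy all Datasets, then delete all Dataverses in the tree."""
    # Imported lazily so removal_order() works without pyDataverse installed.
    from pyDataverse.api import NativeApi

    api = NativeApi(base_url, api_token)
    pids, aliases = removal_order(tree)
    for pid in pids:
        # destroy_dataset() also removes published Datasets;
        # delete_dataset() is the alternative for unpublished ones.
        api.destroy_dataset(pid)
    for alias in aliases:
        api.delete_dataverse(alias)


# Hypothetical tree with one nested Dataverse.
sample_tree = [
    {"type": "dataverse", "dataverse_alias": "parent", "children": [
        {"type": "dataset", "pid": "doi:10.5072/FK2/AAA", "children": []},
        {"type": "dataverse", "dataverse_alias": "child", "children": [
            {"type": "dataset", "pid": "doi:10.5072/FK2/BBB", "children": []},
        ]},
    ]},
]
pids, aliases = removal_order(sample_tree)
```

Computing the order first makes it possible to review exactly what will be removed before any destructive call is made.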

Data Science Pipeline

Use the API to retrieve data and/or metadata from a Dataverse instance, or automatically add data you have created to your Dataset. pyDataverse connects your Data Science pipeline with your Dataverse instance.
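As a rough illustration of the retrieval side, the sketch below lists the Datafiles of a Dataset and downloads them. It assumes the dataset JSON exposes its files under `data.latestVersion.files`, and that `NativeApi.get_dataset()` and `DataAccessApi.get_datafile()` return response objects with `.json()` and `.content`; `list_datafiles()` is a hypothetical helper, and the sample response is made up.

```python
import os


def list_datafiles(dataset_json):
    """Return (file_id, filename) pairs from a dataset's JSON metadata."""
    files = dataset_json["data"]["latestVersion"]["files"]
    return [(f["dataFile"]["id"], f["dataFile"]["filename"]) for f in files]


def download_datafiles(base_url, pid, target_dir="."):
    """Download every Datafile of the Dataset identified by `pid`."""
    # Imported lazily so list_datafiles() works without pyDataverse installed.
    from pyDataverse.api import DataAccessApi, NativeApi

    api = NativeApi(base_url)
    data_api = DataAccessApi(base_url)
    dataset = api.get_dataset(pid)
    for file_id, filename in list_datafiles(dataset.json()):
        response = data_api.get_datafile(file_id)
        with open(os.path.join(target_dir, filename), "wb") as fh:
            fh.write(response.content)


# Made-up response fragment, reduced to the fields used above.
sample = {
    "data": {
        "latestVersion": {
            "files": [{"dataFile": {"id": 42, "filename": "survey.tab"}}]
        }
    }
}
files = list_datafiles(sample)
```

From here, the downloaded files can be handed to whatever analysis tooling the pipeline already uses.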

Web-Applications / Microservices

Because pyDataverse offers a direct and easy way to access Dataverse's APIs and to manipulate its data models, it integrates well into all kinds of web applications and microservices, for example to visualize data, run analyses, or enrich the data with other sources.
