Developer Interface

This part of the documentation covers all the interfaces of pyDataverse. Where pyDataverse depends on external libraries, we document the most important ones here and provide links to the canonical documentation.

Api Interface

Dataverse API connector.

class Api(base_url, api_token=None, api_version='v1')[source]

API class.

Parameters:
  • base_url (string) – Base URL of the Dataverse instance, without a trailing /, e.g. http://demo.dataverse.org
  • api_token (string) – Authentication token for the api.
  • api_version (string) – Dataverse api version. Defaults to v1.
conn_started

datetime – Time when Api() was instantiated and the connection was established.

native_api_base_url

string – Url of Dataverse’s native Api.

base_url
api_token
api_version
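
As a minimal sketch of how these attributes relate, the helper below derives a native API base URL from base_url and api_version. The `<base_url>/api/<api_version>` layout and the function name build_native_api_base_url are assumptions made for illustration; they are not taken from the pyDataverse source.

```python
def build_native_api_base_url(base_url, api_version="v1"):
    """Join base URL and API version into a native API base URL.

    Assumption: the native API lives under <base_url>/api/<api_version>.
    """
    # base_url is documented as having no trailing slash; strip one
    # defensively so the result never contains '//' after the host.
    return "{0}/api/{1}".format(base_url.rstrip("/"), api_version)


print(build_native_api_base_url("http://demo.dataverse.org"))
```
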
create_dataset(dataverse, metadata, auth=True)[source]

Add dataset to a dataverse.

Dataverse Documentation

HTTP Request:

POST http://$SERVER/api/dataverses/$dataverse/datasets --upload-file FILENAME

Add new dataset with curl:

curl -H "X-Dataverse-key: $API_TOKEN" -X POST $SERVER_URL/api/dataverses/$DV_ALIAS/datasets --upload-file tests/data/dataset_min.json

Import dataset with existing persistent identifier with curl:

curl -H "X-Dataverse-key: $API_TOKEN" -X POST $SERVER_URL/api/dataverses/$DV_ALIAS/datasets/:import?pid=$PERSISTENT_IDENTIFIER&release=yes --upload-file tests/data/dataset_min.json

To create a dataset, you must create a JSON file containing all the metadata you want, such as the example file dataset-finch1.json. Then, you must decide which dataverse to create the dataset in and target that dataverse with either the “alias” of the dataverse (e.g. “root”) or the database id of the dataverse (e.g. “1”). The initial version state will be set to DRAFT:

Status Code:
201: dataset created
Parameters:
  • dataverse (string) – “alias” of the dataverse (e.g. root) or the database id of the dataverse (e.g. 1)
  • metadata (string) –

    Metadata of the Dataset as a json-formatted string (e. g. dataset-finch1.json)

Returns:

Response object of requests library.

Return type:

requests.Response

create_dataverse(identifier, metadata, auth=True, parent=':root')[source]

Create a dataverse.

Generates a new dataverse under identifier. Expects a JSON content describing the dataverse.

HTTP Request:

POST http://$SERVER/api/dataverses/$id

Download the dataverse.json example file and modify to create dataverses to suit your needs. The fields name, alias, and dataverseContacts are required.

Status Codes:
200: dataverse created
201: dataverse created
Parameters:
  • identifier (string) – Can either be a dataverse id (long) or a dataverse alias (more robust). If identifier is omitted, a root dataverse is created.
  • metadata (string) – Metadata of the Dataverse as a json-formatted string.
  • auth (bool) – True if api authorization is necessary. Defaults to True.
  • parent (string) – Parent dataverse, if existing, to which the Dataverse gets attached. Defaults to :root.
Returns:

Response object of requests library.

Return type:

requests.Response
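
The required fields named above (name, alias, dataverseContacts) can be checked before the metadata string is sent. The following is a sketch only; the helper name missing_required_fields is hypothetical and is not part of pyDataverse.

```python
import json

# Required fields as stated above for the dataverse metadata.
REQUIRED_FIELDS = ("name", "alias", "dataverseContacts")


def missing_required_fields(metadata_json):
    """Return the required fields that are absent or empty."""
    metadata = json.loads(metadata_json)
    return [field for field in REQUIRED_FIELDS if not metadata.get(field)]


metadata_json = json.dumps({
    "name": "Test pyDataverse",
    "alias": "test-pyDataverse",
    "dataverseContacts": [{"contactEmail": "test@example.com"}],
})
print(missing_required_fields(metadata_json))
```

An empty list means the string carries all three required fields and could be passed on as the metadata argument.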

delete_dataset(identifier, is_pid=True, auth=True)[source]

Delete a dataset.

Delete the dataset whose id is passed.

HTTP Request:

DELETE http://$SERVER/api/datasets/$id
Status Code:
200: dataset deleted
Parameters:
  • identifier (string) – Identifier of the dataset. Can be a Dataverse identifier or a persistent identifier (e.g. doi:10.11587/8H3N93).
  • is_pid (bool) – True, if identifier is a persistent identifier.
Returns:

Response object of requests library.

Return type:

requests.Response

delete_dataverse(identifier, auth=True)[source]

Delete dataverse by alias or id.

HTTP Request:

DELETE http://$SERVER/api/dataverses/$id
Status Code:
200: Dataverse deleted
Parameters:identifier (string) – Can either be a dataverse id (long) or a dataverse alias (more robust).
Returns:Response object of requests library.
Return type:requests.Response
delete_request(query_str, auth=False, params=None)[source]

Make a DELETE request.

Parameters:
  • query_str (string) – Query string for the request. Will be concatenated to native_api_base_url.
  • auth (bool) – Should an api token be sent in the request. Defaults to False.
  • params (dict) – Dictionary of parameters to be passed with the request. Defaults to None.
Returns:

Response object of requests library.

Return type:

requests.Response

edit_dataset_metadata(identifier, metadata, is_pid=True, is_replace=False, auth=True)[source]

Edit metadata of a given dataset.

Official documentation.

HTTP Request:

PUT http://$SERVER/api/datasets/editMetadata/$id --upload-file FILENAME

Add data to dataset fields that are blank or accept multiple values with the following

CURL Request:

curl -H "X-Dataverse-key: $API_TOKEN" -X PUT $SERVER_URL/api/datasets/:persistentId/editMetadata/?persistentId=$pid --upload-file dataset-add-metadata.json

For these edits your JSON file need only include those dataset fields which you would like to edit. A sample JSON file may be downloaded here: dataset-edit-metadata-sample.json

Parameters:
  • identifier (string) – Identifier of the dataset. Can be a Dataverse identifier or a persistent identifier (e.g. doi:10.11587/8H3N93).
  • metadata (string) – Metadata of the Dataset as a json-formatted string.
  • is_pid (bool) – True to use persistent identifier. False, if not.
  • is_replace (bool) – True to replace already existing metadata. False, if not.
  • auth (bool) – True, if an api token should be sent. Defaults to True.
Returns:

Response object of requests library.

Return type:

requests.Response

Examples

Get the dataset metadata and write it back (a status code of 200 means the metadata was updated):

>>> data = api.get_dataset_metadata(doi, auth=True)
>>> resp = api.edit_dataset_metadata(doi, data, is_replace=True, auth=True)
>>> resp.status_code
200
get_datafile(identifier, is_pid=True)[source]

Download a datafile via the Dataverse Data Access API.

Get by file id (HTTP Request).

GET /api/access/datafile/$id

Get by persistent identifier (HTTP Request).

GET http://$SERVER/api/access/datafile/:persistentId/?persistentId=doi:10.5072/FK2/J8SJZB
Parameters:
  • identifier (string) – Identifier of the datafile. Can be the datafile id or the persistent identifier of the datafile (e.g. doi).
  • is_pid (bool) – True to use persistent identifier. False, if not.
Returns:

Response object of requests library.

Return type:

requests.Response

get_datafile_bundle(identifier)[source]

Download a datafile in all its formats.

HTTP Request:

GET /api/access/datafile/bundle/$id

Data Access API calls can now be made using persistent identifiers (in addition to database ids). This is done by passing the constant :persistentId where the numeric id of the file is expected, and then passing the actual persistent id as a query parameter with the name persistentId.

This is a convenience packaging method available for tabular data files. It returns a zipped bundle that contains the data in the following formats:
  • Tab-delimited;
  • “Saved Original”, the proprietary (SPSS, Stata, R, etc.) file from which the tabular data was ingested;
  • Generated R Data frame (unless the “original” above was in R);
  • Data (Variable) metadata record, in DDI XML;
  • File citation, in Endnote and RIS formats.

Parameters:identifier (string) – Identifier of the datafile.
Returns:Response object of requests library.
Return type:requests.Response
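
Since the bundle is a zip archive, its members can be inspected with the standard zipfile module. The sketch below builds a stand-in archive in memory; the member names are hypothetical, and with a real response the bytes would come from resp.content instead.

```python
import io
import zipfile


def list_bundle_members(bundle_bytes):
    """Return the file names inside a zipped bundle given as bytes."""
    with zipfile.ZipFile(io.BytesIO(bundle_bytes)) as bundle:
        return bundle.namelist()


# Build a stand-in bundle in memory; the member names are hypothetical.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("data.tab", "id\tvalue\n1\t42\n")
    archive.writestr("data-ddi.xml", "<codeBook/>")
sample_bundle = buffer.getvalue()

print(list_bundle_members(sample_bundle))
```
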
get_datafiles(pid, version='1')[source]

List metadata of all datafiles of a dataset.

Documentation

HTTP Request:

GET http://$SERVER/api/datasets/$id/versions/$versionId/files
Parameters:
  • pid (string) – Persistent identifier of the dataset. e.g. doi:10.11587/8H3N93.
  • version (string) – Version of dataset. Defaults to 1.
Returns:

Response object of requests library.

Return type:

requests.Response

get_dataset(identifier, auth=True, is_pid=True)[source]

Get metadata of a Dataset.

With Dataverse identifier:

GET http://$SERVER/api/datasets/$identifier

With persistent identifier:

GET http://$SERVER/api/datasets/:persistentId/?persistentId=$pid
Parameters:
  • identifier (string) – Identifier of the dataset. Can be a Dataverse identifier or a persistent identifier (e.g. doi:10.11587/8H3N93).
  • is_pid (bool) – True, if identifier is a persistent identifier.
Returns:

Response object of requests library.

Return type:

requests.Response
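
The two request forms above differ only in where the identifier goes. A sketch of that choice, assuming a hypothetical helper dataset_query that returns a path relative to the API base plus optional query parameters:

```python
def dataset_query(identifier, is_pid=True):
    """Return (query_str, params) for a dataset lookup."""
    if is_pid:
        # Persistent identifiers go into the query string, with the
        # constant ':persistentId' taking the place of the id in the path.
        return "/datasets/:persistentId/", {"persistentId": identifier}
    # Dataverse identifiers are placed directly in the path.
    return "/datasets/{0}".format(identifier), None


print(dataset_query("doi:10.11587/8H3N93"))
print(dataset_query("42", is_pid=False))
```
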

get_dataset_export(pid, export_format)[source]

Get metadata of dataset exported in different formats.

Export the metadata of the current published version of a dataset in various formats by its persistent identifier.

GET http://$SERVER/api/datasets/export?exporter=$exportformat&persistentId=$pid
Parameters:
  • pid (string) – Persistent identifier of the dataset. (e.g. doi:10.11587/8H3N93).
  • export_format (string) – Export format as a string. Formats: ddi, oai_ddi, dcterms, oai_dc, schema.org, dataverse_json.
Returns:

Response object of requests library.

Return type:

requests.Response
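
A sketch of how the export query could be assembled and the format checked against the list above. The helper name export_query is hypothetical; URL encoding is done with the standard library.

```python
from urllib.parse import urlencode

# Export formats as listed in the parameter description above.
EXPORT_FORMATS = (
    "ddi", "oai_ddi", "dcterms", "oai_dc", "schema.org", "dataverse_json",
)


def export_query(pid, export_format):
    """Build the export query string, rejecting unknown formats."""
    if export_format not in EXPORT_FORMATS:
        raise ValueError("unknown export format: {0}".format(export_format))
    # urlencode percent-encodes the ':' and '/' inside the identifier.
    query = urlencode({"exporter": export_format, "persistentId": pid})
    return "/datasets/export?" + query


print(export_query("doi:10.11587/8H3N93", "ddi"))
```
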

get_dataverse(identifier, auth=False)[source]

Get dataverse metadata by alias or id.

View metadata about a dataverse.

GET http://$SERVER/api/dataverses/$id
Parameters:identifier (string) – Can either be a dataverse id (long), a dataverse alias (more robust), or the special value :root.
Returns:Response object of requests library.
Return type:requests.Response
get_info_apiTermsOfUse()[source]

Get API Terms of Use url.

The response contains the text value inserted as API Terms of use which uses the database setting :ApiTermsOfUse.

HTTP Request:

GET http://$SERVER/api/info/apiTermsOfUse
Returns:Response object of requests library.
Return type:requests.Response
get_info_server()[source]

Get dataverse server name.

This is useful when a Dataverse system is composed of multiple Java EE servers behind a load balancer.

HTTP Request:

GET http://$SERVER/api/info/server
Returns:Response object of requests library.
Return type:requests.Response
get_info_version()[source]

Get the Dataverse version and build number.

The response contains the version and build numbers. Requires no api token.

HTTP Request:

GET http://$SERVER/api/info/version
Returns:Response object of requests library.
Return type:requests.Response
get_metadatablock(identifier)[source]

Get info about single metadata block.

Returns data about the block whose identifier is passed. identifier can either be the block’s id, or its name.

HTTP Request:

GET http://$SERVER/api/metadatablocks/$identifier
Parameters:identifier (string) – Can be the block’s id, or its name.
Returns:Response object of requests library.
Return type:requests.Response
get_metadatablocks()[source]

Get info about all metadata blocks.

Lists brief info about all metadata blocks registered in the system.

HTTP Request:

GET http://$SERVER/api/metadatablocks
Returns:Response object of requests library.
Return type:requests.Response
get_request(query_str, params=None, auth=False)[source]

Make a GET request.

Parameters:
  • query_str (string) – Query string for the request. Will be concatenated to native_api_base_url.
  • params (dict) – Dictionary of parameters to be passed with the request. Defaults to None.
  • auth (bool) – Should an api token be sent in the request. Defaults to False.
Returns:

Response object of requests library.

Return type:

requests.Response
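
The sketch below shows one way the pieces of such a request fit together: query_str is appended to native_api_base_url, and the token is attached only when auth is True. Sending the token in the X-Dataverse-key header follows the curl examples in this document; whether pyDataverse sends it the same way is an assumption, and describe_request is a hypothetical helper that performs no network call.

```python
def describe_request(native_api_base_url, query_str, api_token=None,
                     params=None, auth=False):
    """Describe a request as a dict without sending it."""
    if auth and not api_token:
        raise ValueError("auth=True requires an api_token")
    # Assumption: the token travels in the X-Dataverse-key header,
    # as in the curl examples.
    headers = {"X-Dataverse-key": api_token} if auth else {}
    return {
        "url": native_api_base_url + query_str,
        "headers": headers,
        "params": params or {},
    }


print(describe_request("http://demo.dataverse.org/api/v1", "/info/version"))
```
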

post_request(query_str, metadata=None, auth=False, params=None)[source]

Make a POST request.

Parameters:
  • query_str (string) – Query string for the request. Will be concatenated to native_api_base_url.
  • metadata (string) – Metadata as a json-formatted string. Defaults to None.
  • auth (bool) – Should an api token be sent in the request. Defaults to False.
  • params (dict) – Dictionary of parameters to be passed with the request. Defaults to None.
Returns:

Response object of requests library.

Return type:

requests.Response

publish_dataset(pid, type='minor', auth=True)[source]

Publish dataset.

Publishes the dataset whose id is passed. If this is the first version of the dataset, its version number will be set to 1.0. Otherwise, the new dataset version number is determined by the most recent version number and the type parameter. Passing type=minor increases the minor version number (2.3 is updated to 2.4). Passing type=major increases the major version number (2.3 is updated to 3.0). Superusers can pass type=updatecurrent to update metadata without changing the version number.

HTTP Request:

POST http://$SERVER/api/datasets/$id/actions/:publish?type=$type

When there are no default workflows, a successful publication process will result in 200 OK response. When there are workflows, it is impossible for Dataverse to know how long they are going to take and whether they will succeed or not (recall that some stages might require human intervention). Thus, a 202 ACCEPTED is returned immediately. To know whether the publication process succeeded or not, the client code has to check the status of the dataset periodically, or perform some push request in the post-publish workflow.

Status Code:
200: dataset published
Parameters:
  • pid (string) – Persistent identifier of the dataset (e.g. doi:10.11587/8H3N93).
  • type (string) – Passing minor increases the minor version number (2.3 is updated to 2.4). Passing major increases the major version number (2.3 is updated to 3.0). Superusers can pass updatecurrent to update metadata without changing the version number.
  • auth (bool) – True if api authorization is necessary. Defaults to True.
Returns:

Response object of requests library.

Return type:

requests.Response
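
The version numbering rules described above can be sketched as a small function. This only illustrates the documented behaviour; the actual numbering is done by the Dataverse server, and the name next_version and the tuple representation are assumptions.

```python
def next_version(current, release_type="minor"):
    """Return the version after publishing.

    current is a (major, minor) tuple, or None for a dataset that has
    not been published yet.
    """
    if current is None:
        # First published version is always 1.0.
        return (1, 0)
    major, minor = current
    if release_type == "minor":
        return (major, minor + 1)
    if release_type == "major":
        return (major + 1, 0)
    if release_type == "updatecurrent":
        # Metadata update without a version change (superusers only).
        return (major, minor)
    raise ValueError("unknown type: {0}".format(release_type))


print(next_version((2, 3), "minor"))
print(next_version((2, 3), "major"))
```
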

publish_dataverse(identifier, auth=True)[source]

Publish a dataverse.

Publish the Dataverse pointed to by identifier, which can either be the dataverse alias or its numerical id.

HTTP Request:

POST http://$SERVER/api/dataverses/$identifier/actions/:publish
Status Code:
200: Dataverse published
Parameters:
  • identifier (string) – Can either be a dataverse id (long) or a dataverse alias (more robust).
  • auth (bool) – True if api authorization is necessary. Defaults to True.
Returns:

Response object of requests library.

Return type:

requests.Response

put_request(query_str, metadata=None, auth=False, params=None)[source]

Make a PUT request.

Parameters:
  • query_str (string) – Query string for the request. Will be concatenated to native_api_base_url.
  • metadata (string) – Metadata as a json-formatted string. Defaults to None.
  • auth (bool) – Should an api token be sent in the request. Defaults to False.
  • params (dict) – Dictionary of parameters to be passed with the request. Defaults to None.
Returns:

Response object of requests library.

Return type:

requests.Response

upload_file(identifier, filename, is_pid=True)[source]

Add file to a dataset.

Add a file to an existing Dataset. Description and tags are optional:

HTTP Request:

POST http://$SERVER/api/datasets/$id/add

The upload endpoint checks the content of the file, compares it with existing files and reports whether it is already in the database (most likely via hashing).

Parameters:
  • identifier (string) – Identifier of the dataset.
  • filename (string) – Full filename with path.
  • is_pid (bool) – True to use persistent identifier. False, if not.
Returns:

The JSON string returned by the curl request, converted to a dict().

Return type:

dict
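
To illustrate the kind of content comparison mentioned above, the sketch below detects a duplicate by comparing digests. How the server actually compares files is not documented here; SHA-256 and the helper names are chosen purely for illustration.

```python
import hashlib


def file_digest(data):
    """Return a hex digest of the file content (bytes)."""
    return hashlib.sha256(data).hexdigest()


def is_duplicate(data, known_digests):
    """True if content with the same digest has been seen before."""
    return file_digest(data) in known_digests


# Digests of files that are assumed to be stored already.
known = {file_digest(b"first file\n")}

print(is_duplicate(b"first file\n", known))
print(is_duplicate(b"second file\n", known))
```
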

Models Interface

Dataverse data-types data model.

class Datafile(filename=None, pid=None)[source]

Base class for the Datafile model.

Parameters:
  • filename (string) – Filename with full path.
  • pid (string) – Persistent identifier (e.g. doi:10.11587/EVMUHP). Defaults to None.
description

string – Description of datafile

restrict

bool – Unknown

__attr_required_metadata

list – List with required metadata.

__attr_valid_metadata

list – List with valid metadata for Dataverse api upload.

__attr_valid_class

list – List of all attributes.

pid
filename
dict(format='dv_up')[source]

Create dict in different data formats.

Parameters:format (string) – Data format for dict creation. Available formats are: dv_up with all metadata for Dataverse api upload, and all with all attributes set.
Returns:Data as dict.
Return type:dict

Examples

Get dict of Datafile metadata:

>>> from pyDataverse.models import Datafile
>>> df = Datafile()
>>> data = {
>>>     'pid': 'doi:10.11587/EVMUHP',
>>>     'description': 'Test file',
>>>     'filename': 'tests/data/datafile.txt'
>>> }
>>> df.set(data)
>>> data = df.dict()
>>> data['description']
'Test file'
filename = None

Metadata

is_valid()[source]

Check if set attributes are valid for Dataverse api metadata creation.

Returns:True, if creation of metadata json is possible. False, if not.
Return type:bool

Examples

Check if metadata is valid for Dataverse api upload:

>>> from pyDataverse.models import Datafile
>>> df = Datafile()
>>> data = {
>>>     'pid': 'doi:10.11587/EVMUHP',
>>>     'description': 'Test file',
>>>     'filename': 'tests/data/datafile.txt'
>>> }
>>> df.set(data)
>>> df.is_valid()
True
>>> df.filename = None
>>> df.is_valid()
False
json(format='dv_up')[source]

Create json from attributes.

Parameters:format (string) – Data format of input. Available formats are: dv_up for Dataverse Api upload compatible format and all with all attributes named in __attr_valid_class.
Returns:json-formatted string of Dataverse metadata for api upload.
Return type:string

Examples

Get dict of Dataverse metadata:

>>> from pyDataverse.models import Datafile
>>> df = Datafile()
>>> data = {
>>>     'pid': 'doi:10.11587/EVMUHP',
>>>     'description': 'Test file',
>>>     'filename': 'tests/data/datafile.txt'
>>> }
>>> df.set(data)
>>> df.dict()
{'description': 'Test file',
 'directoryLabel': None,
 'restrict': None}
set(data)[source]

Set class attributes with a flat dict.

Parameters:data (dict) – Flat dict with data. Keys must be named the same as the class attributes the data should be mapped to.

Examples

Set Datafile attributes via flat dict:

>>> from pyDataverse.models import Datafile
>>> df = Datafile()
>>> data = {
>>>     'pid': 'doi:10.11587/EVMUHP',
>>>     'description': 'Test file',
>>>     'filename': 'tests/data/datafile.txt'
>>> }
>>> df.set(data)
>>> df.pid
'doi:10.11587/EVMUHP'
class Dataset[source]

Base class for the Dataset data model.

accessToSources = None

Metadata – geospatial

datafiles = None

Metadata – dataset

dict(format='dv_up')[source]

Create dicts in different data formats.

Parameters:format (string) – Data format for dict creation. Available formats are: dv_up with all metadata for Dataverse api upload, and all with all attributes set.
Returns:Data as dict.
Return type:dict

Examples

Get dict of Dataverse metadata:

>>> from pyDataverse.models import Dataset
>>> ds = Dataset()
>>> data = {
>>>     'title': 'pyDataverse study 2019',
>>>     'dsDescription': 'New study about pyDataverse usage in 2019'
>>> }
>>> ds.set(data)
>>> data = ds.dict()
>>> data['title']
'pyDataverse study 2019'
export_metadata(filename, format='dv_up')[source]

Export Dataset metadata to Dataverse api upload json.

Parameters:
  • filename (string) – Filename with full path.
  • format (string) – Data format for export. Available format is: dv_up with all metadata for Dataverse api upload.

Examples

Export metadata to json file:

>>> from pyDataverse.models import Dataset
>>> ds = Dataset()
>>> data = {
>>>     'title': 'pyDataverse study 2019',
>>>     'dsDescription': 'New study about pyDataverse usage in 2019',
>>>     'author': [{'authorName': 'LastAuthor1, FirstAuthor1'}],
>>>     'datasetContact': [{'datasetContactName': 'LastContact1, FirstContact1'}],
>>>     'subject': ['Engineering'],
>>> }
>>> ds.set(data)
>>> ds.export_metadata('tests/data/export_dataset.json')
geographicBoundingBox = None

Metadata – socialscience

import_metadata(filename, format='dv_up')[source]

Import Dataset metadata from file.

Parameters:
  • filename (string) – Filename with full path.
  • format (string) – Data format of input. Available formats are: dv_up for Dataverse api upload compatible format.

Examples

Import metadata from a json file:

>>> from pyDataverse.models import Dataset
>>> ds = Dataset()
>>> ds.import_metadata('tests/data/dataset_full.json')
>>> ds.title
'Replication Data for: Title'
is_valid()[source]

Check if attributes available are valid for Dataverse api metadata creation.

The attributes required are listed in __attr_required_metadata.

Returns:True, if creation of metadata json is possible. False, if not.
Return type:bool

Examples

Check if metadata is valid for Dataverse api upload:

>>> from pyDataverse.models import Dataset
>>> ds = Dataset()
>>> data = {
>>>     'title': 'pyDataverse study 2019',
>>>     'dsDescription': 'New study about pyDataverse usage in 2019'
>>> }
>>> ds.set(data)
>>> ds.is_valid()
False
>>> ds.author = [{'authorName': 'LastAuthor1, FirstAuthor1'}]
>>> ds.datasetContact = [{'datasetContactName': 'LastContact1, FirstContact1'}]
>>> ds.subject = ['Engineering']
>>> ds.is_valid()
True
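
The check shown in this example can be pictured as a test for required attributes. The stand-in class below is a sketch, not the pyDataverse implementation; the list of required attributes is inferred from the example above and may differ from the real __attr_required_metadata.

```python
# Assumption: inferred from the example above, not read from pyDataverse.
REQUIRED = ("title", "author", "datasetContact", "dsDescription", "subject")


class DatasetSketch:
    """Minimal stand-in with set() and is_valid()."""

    def __init__(self):
        for name in REQUIRED:
            setattr(self, name, None)

    def set(self, data):
        # Map each key of the flat dict onto the attribute of the same name.
        for key, value in data.items():
            setattr(self, key, value)

    def is_valid(self):
        # Valid once every required attribute has been given a value.
        return all(getattr(self, name) is not None for name in REQUIRED)


ds = DatasetSketch()
ds.set({"title": "pyDataverse study 2019",
        "dsDescription": "New study about pyDataverse usage in 2019"})
print(ds.is_valid())
```
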
json(format='dv_up')[source]

Create Dataset json from attributes.

Parameters:format (string) – Data format of input. Available formats are: dv_up for Dataverse Api upload compatible format and all with all attributes named in __attr_valid_class.
Returns:json-formatted string of Dataverse metadata for api upload.
Return type:string

Examples

Get json of Dataverse api upload:

>>> from pyDataverse.models import Dataset
>>> ds = Dataset()
>>> data = {
>>>     'title': 'pyDataverse study 2019',
>>>     'dsDescription': 'New study about pyDataverse usage in 2019',
>>>     'author': [{'authorName': 'LastAuthor1, FirstAuthor1'}],
>>>     'datasetContact': [{'datasetContactName': 'LastContact1, FirstContact1'}],
>>>     'subject': ['Engineering'],
>>> }
>>> ds.set(data)
>>> data = ds.json()
otherDataAppraisal = None

Metadata – journal

set(data)[source]

Set class attributes with a flat dict as input.

Parameters:data (dict) – Flat dict with data. Keys must be named the same as the class attributes the data should be mapped to.

Examples

Set Dataset attributes via flat dict:

>>> from pyDataverse.models import Dataset
>>> ds = Dataset()
>>> data = {
>>>     'title': 'pyDataverse study 2019',
>>>     'dsDescription': 'New study about pyDataverse usage in 2019'
>>> }
>>> ds.set(data)
>>> ds.title
'pyDataverse study 2019'
termsOfAccess = None

Metadata – citation

class Dataverse[source]

Base class for Dataverse data model.

dict(format='dv_up')[source]

Create dicts in different data formats.

dv_up: Checks if data is valid for the different dict formats.

Parameters:format (string) – Data format for dict creation. Available formats are: dv_up with all metadata for Dataverse api upload, and all with all attributes set.
Returns:Data as dict.
Return type:dict

Examples

Get dict of Dataverse metadata:

>>> from pyDataverse.models import Dataverse
>>> dv = Dataverse()
>>> data = {
>>>     'dataverseContacts': [{'contactEmail': 'test@example.com'}],
>>>     'name': 'Test pyDataverse',
>>>     'alias': 'test-pyDataverse'
>>> }
>>> dv.set(data)
>>> data = dv.dict()
>>> data['name']
'Test pyDataverse'
export_metadata(filename, format='dv_up')[source]

Export Dataverse metadata to Dataverse api upload json.

Parameters:
  • filename (string) – Filename with full path.
  • format (string) – Data format for export. Available format is: dv_up with all metadata for Dataverse api upload.

Examples

Export Dataverse metadata:

>>> from pyDataverse.models import Dataverse
>>> dv = Dataverse()
>>> data = {
>>>     'dataverseContacts': [{'contactEmail': 'test@example.com'}],
>>>     'name': 'Test pyDataverse',
>>>     'alias': 'test-pyDataverse'
>>> }
>>> dv.set(data)
>>> dv.export_metadata('tests/data/dataverse_export.json')
import_metadata(filename, format='dv_up')[source]

Import Dataverse metadata from file.

This simply parses in data with valid attribute naming as keys. The data need not be complete; attributes required for the metadata json export may be missing.

Parameters:
  • filename (string) – Filename with full path.
  • format (string) – Data format of input. Available formats are: dv_up for Dataverse Api upload compatible format.

Examples

Import metadata coming from json file:

>>> from pyDataverse.models import Dataverse
>>> dv = Dataverse()
>>> dv.import_metadata('tests/data/dataverse_min.json')
>>> dv.name
'Test pyDataverse'
is_valid()[source]

Check if set attributes are valid for Dataverse api metadata creation.

The attributes required are listed in __attr_required_metadata.

Returns:True, if creation of metadata json is possible. False, if not.
Return type:bool

Examples

Check if metadata is valid for Dataverse api upload:

>>> from pyDataverse.models import Dataverse
>>> dv = Dataverse()
>>> data = {
>>>     'dataverseContacts': [{'contactEmail': 'test@example.com'}],
>>>     'name': 'Test pyDataverse',
>>>     'alias': 'test-pyDataverse'
>>> }
>>> dv.set(data)
>>> dv.is_valid()
True
>>> dv.name = None
>>> dv.is_valid()
False
json(format='dv_up')[source]

Create json from attributes.

Parameters:format (string) – Data format of input. Available formats are: dv_up for Dataverse Api upload compatible format and all with all attributes named in __attr_valid_class.
Returns:json-formatted string of Dataverse metadata for api upload.
Return type:string

Examples

Get json of Dataverse metadata:

>>> from pyDataverse.models import Dataverse
>>> dv = Dataverse()
>>> data = {
>>>     'dataverseContacts': [{'contactEmail': 'test@example.com'}],
>>>     'name': 'Test pyDataverse',
>>>     'alias': 'test-pyDataverse'
>>> }
>>> dv.set(data)
>>> data = dv.json()
>>> data
'{\n  "name": "Test pyDataverse",\n  "dataverseContacts": [\n    {\n      "contactEmail": "test@example.com"\n    }\n  ],\n  "alias": "test-pyDataverse"\n}'
pid = None

Metadata

set(data)[source]

Set class attributes with a flat dict.

Parameters:data (dict) – Flat dict with data. Keys must be named the same as the class attributes the data should be mapped to.

Examples

Set Dataverse attributes via flat dict:

>>> from pyDataverse.models import Dataverse
>>> dv = Dataverse()
>>> data = {
>>>     'dataverseContacts': [{'contactEmail': 'test@example.com'}],
>>>     'name': 'Test pyDataverse',
>>>     'alias': 'test-pyDataverse'
>>> }
>>> dv.set(data)
>>> dv.name
'Test pyDataverse'

Utils Interface

Dataverse utility functions.

dict_to_json(data)[source]

Convert dict() to JSON-formatted string.

See more about the json module at https://docs.python.org/3.5/library/json.html

Parameters:data (dict) – Data as Python Dictionary.
Returns:Data as a json-formatted string.
Return type:string
json_to_dict(data)[source]

Convert JSON to a dict().

See more about the json module at https://docs.python.org/3.5/library/json.html

Parameters:data (string) – Data as a json-formatted string.
Returns:Data as Python Dictionary.
Return type:dict
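
Both conversions can be pictured as thin wrappers around the standard json module. The sketch below is an assumption about their general shape, not the pyDataverse source; the two-space indent mirrors the json output shown in the Dataverse.json() example, and the function names are hypothetical.

```python
import json


def dict_to_json_sketch(data):
    """Serialize a dict to a json-formatted string."""
    return json.dumps(data, ensure_ascii=True, indent=2)


def json_to_dict_sketch(data):
    """Parse a json-formatted string into a dict."""
    return json.loads(data)


original = {"name": "Test pyDataverse", "alias": "test-pyDataverse"}
text = dict_to_json_sketch(original)
print(text)
print(json_to_dict_sketch(text) == original)
```
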
read_csv_to_dict(filename)[source]

Read in csv file and convert it into a list of dicts.

This offers an easy import functionality of csv files with dataset metadata.

Assumptions:
  • The header row contains the column names, named after Dataverse’s standard naming convention for dataset attributes.
  • Each row contains one dataset.

After the import, each dict in the returned list can be used directly to set Dataset() attributes via Dataset.set(data).

Parameters:filename (string) – Filename with full path.
Returns:List with one dict per row (=dataset). The keys of the dicts are named after the column names, which must be named after the Dataverse dataset metadata naming convention.
Return type:list
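
A sketch of the same idea using csv.DictReader from the standard library: the header row supplies the keys, and each further row becomes one dict. The comma delimiter and the helper name rows_to_dicts are assumptions; an in-memory file stands in for a file on disk.

```python
import csv
import io


def rows_to_dicts(csv_file):
    """Return one dict per data row, keyed by the header row."""
    return [dict(row) for row in csv.DictReader(csv_file)]


# In-memory stand-in for a csv file with two dataset attributes.
sample = io.StringIO(
    "title,dsDescription\n"
    "pyDataverse study 2019,New study about pyDataverse usage in 2019\n"
)
datasets = rows_to_dicts(sample)
print(datasets[0]["title"])
```

Each dict in the list has the flat shape that Dataset.set(data) expects.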
read_file(filename, mode='r')[source]

Read in a file.

Parameters:
Returns:

Returns data as string.

Return type:

string

read_file_csv(filename)[source]

Read in CSV file.

See more at csv.reader().

Parameters:filename (string) – Full filename with path of file.
Returns:Reader object, which can be iterated over.
Return type:reader
read_file_json(filename)[source]

Read in a json file.

See more about the json module at https://docs.python.org/3.5/library/json.html

Parameters:filename (string) – Filename with full path.
Returns:Data as Python Dictionary.
Return type:dict
write_file(filename, data, mode='w')[source]

Write data in a file.

Parameters:
write_file_json(filename, data, mode='w')[source]

Write data to a json file.

Parameters:

Exceptions

Find out more at https://github.com/AUSSDA/pyDataverse.

exception ApiAuthorizationError[source]

Raised if a user provides invalid credentials.

exception ApiResponseError[source]

Raised when the requests response fails.

exception ApiUrlError[source]

Raised when the request url is not valid.

exception DatafileNotFoundError[source]

Raised when a Datafile cannot be found.

exception DatasetNotFoundError[source]

Raised when a Dataset cannot be found.

exception DataverseApiError[source]

Base exception class for Dataverse-related api error.

exception DataverseError[source]

Base exception class for Dataverse-related error.

exception DataverseNotEmptyError[source]

Raised when a Dataverse has accessioned Datasets.

exception DataverseNotFoundError[source]

Raised when a Dataverse cannot be found.

exception OperationFailedError[source]

Raised when an operation fails for an unknown reason.

Install

Install from the local git repository, with all its dependencies:

git clone git@github.com:AUSSDA/pyDataverse.git
cd pyDataverse
virtualenv venv
source venv/bin/activate
pip install -r tools/tests-requirements.txt
pip install -r tools/lint-requirements.txt
pip install -r tools/docs-requirements.txt
pip install -r tools/packaging-requirements.txt
pip install -e .

Testing

Before you can execute tests, you need a Dataverse account with an api token on a working Dataverse instance. We recommend using demo.dataverse.org. You can also use your own instance or any other, but beware: running the tests against a production instance can cause problems.

Before you can run the tests, you have to set the ENV variables for the Dataverse Api connection. This can be done via creation of a pytest.ini file:

[pytest]
env =
    API_TOKEN=**SECRET**
    DATAVERSE_VERSION=4.14
    BASE_URL=https://demo.dataverse.org/

or define them manually in the terminal:

export API_TOKEN=**SECRET**
export DATAVERSE_VERSION=4.14
export BASE_URL=https://demo.dataverse.org/
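
Inside Python, these variables can be read from the environment. The sketch below is not part of the test suite; load_test_config is a hypothetical helper, and stripping the trailing slash reflects the Api() requirement that base_url has none.

```python
import os


def load_test_config(environ=os.environ):
    """Collect the connection settings from environment variables."""
    names = ("API_TOKEN", "DATAVERSE_VERSION", "BASE_URL")
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {
        "api_token": environ["API_TOKEN"],
        "dataverse_version": environ["DATAVERSE_VERSION"],
        # Api() expects the base URL without a trailing slash.
        "base_url": environ["BASE_URL"].rstrip("/"),
    }


example_env = {
    "API_TOKEN": "**SECRET**",
    "DATAVERSE_VERSION": "4.14",
    "BASE_URL": "https://demo.dataverse.org/",
}
print(load_test_config(example_env)["base_url"])
```
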

To run through all tests (e. g. different python versions, packaging, docs, flake8, etc.), simply call tox from the root directory:

tox

When you only want to run one test, e.g. the py36 test:

tox -e py36

To find out more about which tests are available, have a look inside the tox.ini file.

Create Coverage Reports

Run tests with coverage to create html and xml reports as output. Again, call it via tox. This writes the reports to docs/coverage_html/.

tox -e coverage

Run Coveralls

To use Coveralls on local development:

tox -e coveralls

Documentation

Create Sphinx Docs

Use Sphinx to create class and function documentation out of the doc-strings. You can call it via tox. This writes the generated docs to docs/build.

tox -e docs