exporters

exporter_base

class hidebound.exporters.exporter_base.ExporterBase(metadata_types=['asset', 'file', 'asset-chunk', 'file-chunk'], dask={})[source]

Bases: object

Abstract base class for hidebound exporters.

__init__(metadata_types=['asset', 'file', 'asset-chunk', 'file-chunk'], dask={})[source]

Constructs an ExporterBase instance.

Parameters:
  • metadata_types (list[str], optional) – List of metadata types for export. Default: [asset, file, asset-chunk, file-chunk].

  • dask (dict, optional) – Dask configuration. Default: {}.

_enforce_directory_structure(staging_dir)[source]
Ensures the following directories exist under the given hidebound directory:
  • content

  • metadata

  • metadata/asset

  • metadata/file

  • metadata/asset-chunk

  • metadata/file-chunk

Parameters:

staging_dir (Path or str) – Hidebound directory.

Raises:

FileNotFoundError – If any of the directories is not found.

Return type:

None
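
The check described above can be sketched with pathlib alone. This is a minimal illustration of the documented behavior, not hidebound's implementation; the helper name check_directory_structure and the error message are assumptions.

```python
from pathlib import Path
from typing import Union

# Subdirectories named in the list above.
REQUIRED_DIRS = [
    'content',
    'metadata',
    'metadata/asset',
    'metadata/file',
    'metadata/asset-chunk',
    'metadata/file-chunk',
]


def check_directory_structure(staging_dir: Union[str, Path]) -> None:
    """Raise FileNotFoundError if a required subdirectory is missing.

    Hypothetical helper, not part of hidebound.
    """
    root = Path(staging_dir)
    for rel in REQUIRED_DIRS:
        path = root / rel
        if not path.is_dir():
            # The message wording is an assumption.
            raise FileNotFoundError(f'{path} directory does not exist.')
```

The function returns None when every directory is present, matching the documented return type.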

_export_asset(metadata)[source]

Exports metadata from single JSON file in hidebound/metadata/asset.

Parameters:

metadata (dict) – Asset metadata.

Raises:

NotImplementedError – If method is not implemented in subclass.

Return type:

None

_export_asset_chunk(metadata)[source]

Exports list of asset metadata to a single file in hidebound/metadata/asset-chunk.

Parameters:

metadata (list[dict]) – Asset metadata.

Raises:

NotImplementedError – If method is not implemented in subclass.

Return type:

None

_export_content(metadata)[source]

Exports the file from hidebound/content named in metadata. Metadata should have filepath and filepath_relative keys.

Parameters:

metadata (dict) – File metadata.

Raises:

NotImplementedError – If method is not implemented in subclass.

Return type:

None

_export_file(metadata)[source]

Exports metadata from single JSON file in hidebound/metadata/file.

Parameters:

metadata (dict) – File metadata.

Raises:

NotImplementedError – If method is not implemented in subclass.

Return type:

None

_export_file_chunk(metadata)[source]

Exports list of file metadata to a single file in hidebound/metadata/file-chunk.

Parameters:

metadata (list[dict]) – File metadata.

Raises:

NotImplementedError – If method is not implemented in subclass.

Return type:

None

export(staging_dir, logger=None)[source]

Exports data within given hidebound directory.

Parameters:
  • staging_dir (Path or str) – Hidebound directory.

  • logger (object, optional) – Progress logger. Default: None.

Return type:

None
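
The underscore-prefixed methods above raise NotImplementedError until a subclass overrides them. The sketch below illustrates that hook pattern with a stand-in base class so that it runs without hidebound installed. StandInExporterBase and RecordingExporter are hypothetical names, and how export walks the staging directory is not shown because it is not described here.

```python
from typing import Dict, List, Tuple


class StandInExporterBase:
    """Stand-in with two of the documented hooks; NOT hidebound's ExporterBase."""

    def _export_content(self, metadata: Dict) -> None:
        raise NotImplementedError(
            '_export_content method must be implemented in subclass.'
        )

    def _export_asset_chunk(self, metadata: List[Dict]) -> None:
        raise NotImplementedError(
            '_export_asset_chunk method must be implemented in subclass.'
        )


class RecordingExporter(StandInExporterBase):
    """Hypothetical subclass that records what it is asked to export."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, str]] = []

    def _export_content(self, metadata: Dict) -> None:
        # File metadata is documented to carry filepath and filepath_relative.
        self.calls.append(('content', metadata['filepath_relative']))


exporter = RecordingExporter()
exporter._export_content(
    {'filepath': '/tmp/hidebound/content/a.txt', 'filepath_relative': 'a.txt'}
)
```

Because RecordingExporter does not override _export_asset_chunk, calling it still raises NotImplementedError, which is the behavior documented for the base class.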

class hidebound.exporters.exporter_base.ExporterConfigBase(raw_data=None, trusted_data=None, deserialize_mapping=None, init=True, partial=True, strict=True, validate=False, app_data=None, lazy=False, **kwargs)[source]

Bases: Model

A base class for validating configurations supplied to exporters.

metadata_types

List of metadata types for export. Default: [asset, file, asset-chunk, file-chunk].

Type:

list, optional

dask

Dask configuration. Default: {}.

Type:

dict, optional

_schema = <schematics.deprecated.patch_schema.<locals>.Schema object>
dask: ModelType = <ModelType(DaskConnectionConfig) instance on ExporterConfigBase as 'dask'>
metadata_types = <ListType(StringType) instance on ExporterConfigBase as 'metadata_types'>

girder_exporter

class hidebound.exporters.girder_exporter.GirderConfig(raw_data=None, trusted_data=None, deserialize_mapping=None, init=True, partial=True, strict=True, validate=False, app_data=None, lazy=False, **kwargs)[source]

Bases: Model

A class for validating configurations supplied to GirderExporter.

name

Name of exporter. Must be ‘girder’.

Type:

str

api_key

Girder API key.

Type:

str

root_id

ID of folder or collection under which all data will be exported.

Type:

str

root_type

Root entity type. Default: collection. Options: folder, collection.

Type:

str, optional

host

Docker host URL address. Default: http://0.0.0.0.

Type:

str, optional

port

Docker host port. Default: 8180.

Type:

int, optional

metadata_types

List of metadata types for export. Default: [asset, file].

Type:

list, optional

_schema = <schematics.deprecated.patch_schema.<locals>.Schema object>
api_key: StringType = <StringType() instance on GirderConfig as 'api_key'>
host: URLType = <URLType() instance on GirderConfig as 'host'>
metadata_types = <ListType(StringType) instance on GirderConfig as 'metadata_types'>
name: StringType = <StringType() instance on GirderConfig as 'name'>
port: IntType = <IntType() instance on GirderConfig as 'port'>
root_id: StringType = <StringType() instance on GirderConfig as 'root_id'>
root_type: StringType = <StringType() instance on GirderConfig as 'root_type'>
class hidebound.exporters.girder_exporter.GirderExporter(api_key, root_id, root_type='collection', host='http://0.0.0.0', port=8180, client=None, metadata_types=['asset', 'file'], **kwargs)[source]

Bases: ExporterBase

Exporter for the Girder asset framework.

__init__(api_key, root_id, root_type='collection', host='http://0.0.0.0', port=8180, client=None, metadata_types=['asset', 'file'], **kwargs)[source]

Constructs a GirderExporter instance and creates a Girder client.

Parameters:
  • api_key (str) – Girder API key.

  • root_id (str) – ID of folder or collection under which all data will be exported.

  • root_type (str, optional) – Root entity type. Default: collection. Options: folder, collection.

  • host (str, optional) – Docker host URL address. Default: http://0.0.0.0.

  • port (int, optional) – Docker host port. Default: 8180.

  • client (object, optional) – Client instance, for testing. Default: None.

  • metadata_types (list[str], optional) – Metadata types to export. Default: [asset, file].

Raises:

DataError – If config is invalid.

_export_asset(metadata)[source]

Export asset metadata to Girder. Metadata must contain these fields:

  • asset_type

  • asset_path_relative

Parameters:

metadata (dict) – Asset metadata.

Return type:

None

_export_asset_chunk(metadata)[source]

Exports content from asset log in hidebound/metadata/asset-chunk.

Parameters:

metadata (list[dict]) – Asset metadata chunk.

Return type:

None

_export_content(metadata)[source]

Export file content and metadata to Girder. Metadata must contain these fields:

  • filepath_relative

  • filename

  • filepath

Parameters:

metadata (dict) – File metadata.

Returns:

Response.

Return type:

object

_export_dirs(dirpath, metadata={}, exists_ok=False)[source]

Recursively export all the directories found in given path.

Parameters:
  • dirpath (Path or str) – Directory path to be exported.

  • metadata (dict, optional) – Metadata to be appended to final directory. Default: {}.

Returns:

Response (contains _id key).

Return type:

dict

_export_file(metadata)[source]

Exports content from file metadata in hidebound/metadata/file.

Parameters:

metadata (dict) – File metadata.

Return type:

None

_export_file_chunk(metadata)[source]

Exports content from file log in hidebound/metadata/file-chunk.

Parameters:

metadata (list[dict]) – File metadata chunk.

Return type:

None

static from_config(config, client=None)[source]

Construct a GirderExporter from a given config.

Parameters:
  • config (dict) – Config dictionary.

  • client (object, optional) – Client instance, for testing. Default: None.

Raises:

DataError – If config is invalid.

Returns:

GirderExporter instance.

Return type:

GirderExporter
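
A config dictionary for from_config might look like the sketch below. The keys and defaults come from the GirderConfig attributes documented above; the api_key and root_id values are placeholders. The check_girder_config function is a hypothetical pre-check written for illustration and is not the schematics validation that hidebound performs.

```python
from typing import Dict, List

# Keys and defaults taken from the GirderConfig attributes documented above.
# The api_key and root_id values are placeholders.
config = {
    'name': 'girder',
    'api_key': '<your-girder-api-key>',
    'root_id': '<folder-or-collection-id>',
    'root_type': 'collection',
    'host': 'http://0.0.0.0',
    'port': 8180,
    'metadata_types': ['asset', 'file'],
}


def check_girder_config(config: Dict) -> List[str]:
    """Return a list of problems found in config (empty if none).

    Hypothetical helper; it only mirrors constraints stated in this document.
    """
    errors = []
    if config.get('name') != 'girder':
        errors.append("name must be 'girder'")
    if config.get('root_type', 'collection') not in ('folder', 'collection'):
        errors.append('root_type must be folder or collection')
    for key in ('api_key', 'root_id'):
        if not isinstance(config.get(key), str):
            errors.append(f'{key} must be a string')
    return errors


problems = check_girder_config(config)
```

With hidebound installed, the same dictionary would be passed as GirderExporter.from_config(config), which raises DataError if the config is invalid.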

disk_exporter

class hidebound.exporters.disk_exporter.DiskConfig(raw_data=None, trusted_data=None, deserialize_mapping=None, init=True, partial=True, strict=True, validate=False, app_data=None, lazy=False, **kwargs)[source]

Bases: ExporterConfigBase

A class for validating configurations supplied to DiskExporter.

name

Name of exporter. Must be ‘disk’.

Type:

str

target_directory

Target directory.

Type:

str

_schema = <schematics.deprecated.patch_schema.<locals>.Schema object>
dask: ModelType = <ModelType(DaskConnectionConfig) instance on DiskConfig as 'dask'>
metadata_types = <ListType(StringType) instance on DiskConfig as 'metadata_types'>
name: StringType = <StringType() instance on DiskConfig as 'name'>
target_directory: StringType = <StringType() instance on DiskConfig as 'target_directory'>
class hidebound.exporters.disk_exporter.DiskExporter(target_directory, metadata_types=['asset', 'file', 'asset-chunk', 'file-chunk'], **kwargs)[source]

Bases: ExporterBase

__init__(target_directory, metadata_types=['asset', 'file', 'asset-chunk', 'file-chunk'], **kwargs)[source]

Constructs a DiskExporter instance. Creates target directory if it does not exist.

Parameters:
  • target_directory (str) – Target directory.

  • metadata_types (list, optional) – List of metadata types for export. Default: [asset, file, asset-chunk, file-chunk].

Raises:

DataError – If config is invalid.

_export_asset(metadata)[source]

Exports metadata from single JSON file in hidebound/metadata/asset.

Parameters:

metadata (dict) – Asset metadata.

Return type:

None

_export_asset_chunk(metadata)[source]

Exports content from single asset chunk in hidebound/metadata/asset-chunk.

Parameters:

metadata (list[dict]) – Asset metadata.

Return type:

None

_export_content(metadata)[source]

Exports content from filepath in given metadata.

Parameters:

metadata (dict) – File metadata.

Return type:

None
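
As a rough illustration of exporting content to disk, the sketch below copies the file named by filepath into a target directory, preserving filepath_relative. It is not DiskExporter's code: the helper name copy_content and the content subdirectory under the target are assumptions, and the helper returns the written path only to make the result easy to inspect.

```python
import shutil
from pathlib import Path
from typing import Dict, Union


def copy_content(metadata: Dict, target_directory: Union[str, Path]) -> Path:
    """Copy metadata['filepath'] to <target_directory>/content/<filepath_relative>.

    Hypothetical helper; the 'content' subdirectory layout is an assumption.
    """
    source = Path(metadata['filepath'])
    target = Path(target_directory) / 'content' / metadata['filepath_relative']
    # Create intermediate directories so nested relative paths work.
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, target)
    return target
```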

_export_file(metadata)[source]

Exports metadata from single JSON file in hidebound/metadata/file.

Parameters:

metadata (dict) – File metadata.

Return type:

None

_export_file_chunk(metadata)[source]

Exports content from single file chunk in hidebound/metadata/file-chunk.

Parameters:

metadata (list[dict]) – File metadata.

Return type:

None
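
The chunk methods take a list of metadata dictionaries rather than a single one. A minimal sketch of writing such a list to one file follows. The JSON format, the file name, and the metadata/file-chunk layout under the target directory are assumptions made for illustration, and write_chunk is a hypothetical helper.

```python
import json
from pathlib import Path
from typing import Dict, List, Union


def write_chunk(
    metadata: List[Dict],
    target_directory: Union[str, Path],
    name: str = 'file-chunk.json',  # hypothetical file name
) -> Path:
    """Serialize a list of file metadata dicts into a single JSON file."""
    target = Path(target_directory) / 'metadata' / 'file-chunk' / name
    target.parent.mkdir(parents=True, exist_ok=True)
    with open(target, 'w') as f:
        json.dump(metadata, f)
    return target
```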

static from_config(config)[source]

Construct a DiskExporter from a given config.

Parameters:

config (dict) – Config dictionary.

Raises:

DataError – If config is invalid.

Returns:

DiskExporter instance.

Return type:

DiskExporter

s3_exporter

class hidebound.exporters.s3_exporter.S3Config(raw_data=None, trusted_data=None, deserialize_mapping=None, init=True, partial=True, strict=True, validate=False, app_data=None, lazy=False, **kwargs)[source]

Bases: ExporterConfigBase

A class for validating configurations supplied to S3Exporter.

name

Name of exporter. Must be ‘s3’.

Type:

str

access_key

AWS access key.

Type:

str

secret_key

AWS secret key.

Type:

str

bucket

AWS bucket name.

Type:

str

region

AWS region name. Default: us-east-1.

Type:

str

_schema = <schematics.deprecated.patch_schema.<locals>.Schema object>
access_key: StringType = <StringType() instance on S3Config as 'access_key'>
bucket: StringType = <StringType() instance on S3Config as 'bucket'>
dask: ModelType = <ModelType(DaskConnectionConfig) instance on S3Config as 'dask'>
metadata_types = <ListType(StringType) instance on S3Config as 'metadata_types'>
name: StringType = <StringType() instance on S3Config as 'name'>
region: StringType = <StringType() instance on S3Config as 'region'>
secret_key: StringType = <StringType() instance on S3Config as 'secret_key'>
class hidebound.exporters.s3_exporter.S3Exporter(access_key, secret_key, bucket, region, metadata_types=['asset', 'file', 'asset-chunk', 'file-chunk'], **kwargs)[source]

Bases: ExporterBase

__init__(access_key, secret_key, bucket, region, metadata_types=['asset', 'file', 'asset-chunk', 'file-chunk'], **kwargs)[source]

Constructs an S3Exporter instance and creates a bucket with the given name if it does not exist.

Parameters:
  • access_key (str) – AWS access key.

  • secret_key (str) – AWS secret key.

  • bucket (str) – AWS bucket name.

  • region (str) – AWS region.

  • metadata_types (list, optional) – List of metadata types for export. Default: [asset, file, asset-chunk, file-chunk].

Raises:

DataError – If config is invalid.

_export_asset(metadata)[source]

Exports metadata from single JSON file in hidebound/metadata/asset.

Parameters:

metadata (dict) – Asset metadata.

Return type:

None

_export_asset_chunk(metadata)[source]

Exports list of asset metadata to a single file in hidebound/metadata/asset-chunk.

Parameters:

metadata (list[dict]) – Asset metadata.

Return type:

None

_export_content(metadata)[source]

Exports content from filepath in given metadata.

Parameters:

metadata (dict) – File metadata.

Return type:

None

_export_file(metadata)[source]

Exports metadata from single JSON file in hidebound/metadata/file.

Parameters:

metadata (dict) – File metadata.

Return type:

None

_export_file_chunk(metadata)[source]

Exports list of file metadata to a single file in hidebound/metadata/file-chunk.

Parameters:

metadata (list[dict]) – File metadata.

Return type:

None

static from_config(config)[source]

Construct an S3Exporter from a given config.

Parameters:

config (dict) – Config dictionary.

Raises:

DataError – If config is invalid.

Returns:

S3Exporter instance.

Return type:

S3Exporter
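
The S3Config attributes above give defaults for region and metadata_types, so a config dictionary only needs the remaining keys. The sketch below shows one way to make those defaults explicit before use. The with_documented_defaults helper is hypothetical, and the credential and bucket values are placeholders.

```python
from typing import Dict


def with_documented_defaults(config: Dict) -> Dict:
    """Return a copy of config with the documented S3Config defaults filled in.

    Hypothetical helper, not part of hidebound.
    """
    merged = {
        'region': 'us-east-1',
        'metadata_types': ['asset', 'file', 'asset-chunk', 'file-chunk'],
    }
    # Values supplied by the caller take precedence over defaults.
    merged.update(config)
    return merged


config = with_documented_defaults({
    'name': 's3',
    'access_key': '<aws-access-key>',
    'secret_key': '<aws-secret-key>',
    'bucket': '<bucket-name>',
})
```

With hidebound installed, the resulting dictionary would be passed as S3Exporter.from_config(config).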