ncsw_data.source.compound

The ncsw_data.source.compound package initialization module.

Classes

CompoundDataSource

The chemical compound data source class.

Package Contents

class ncsw_data.source.compound.CompoundDataSource(logger: logging.Logger | None = None)

Bases: ncsw_data.source.base.base.DataSourceBase

The chemical compound data source class.

The __init__ method of the class.

Parameters:

logger – The logger to use. If None, no logging is performed.

supported_data_sources

get_names_of_supported_data_sources() -> List[str]

Get the names of the supported data sources.

Returns:

The names of the supported data sources.

get_supported_versions(name: str) -> Dict[str, str]

Get the supported versions of a data source.

Parameters:

name – The name of the data source.

Returns:

The supported versions of the data source.
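The two query methods above can be combined to list every supported data source together with its versions. The following is a minimal sketch, not code from the package: `list_supported_versions` and `_StubDataSource` are hypothetical names, the stub stands in for `CompoundDataSource` so that the snippet runs without `ncsw_data` installed, and the source name and version entries it returns are placeholders. This reference does not state what the keys and values of the returned dictionary mean, so the sketch passes them through unchanged.

```python
from typing import Dict, List


class _StubDataSource:
    """Hypothetical stand-in exposing the two query methods documented above."""

    # Placeholder data, not the real list of supported data sources.
    _versions = {"example_source": {"v1": "Example release 1"}}

    def get_names_of_supported_data_sources(self) -> List[str]:
        return list(self._versions)

    def get_supported_versions(self, name: str) -> Dict[str, str]:
        return dict(self._versions[name])


def list_supported_versions(data_source) -> Dict[str, Dict[str, str]]:
    """Map each supported data source name to its supported versions."""
    return {
        name: data_source.get_supported_versions(name=name)
        for name in data_source.get_names_of_supported_data_sources()
    }


# With the real package, an instance of CompoundDataSource would be passed instead.
catalog = list_supported_versions(_StubDataSource())
for source_name, versions in catalog.items():
    print(source_name, sorted(versions))
```

Because the helper relies only on the two documented method names, it should work with any object that follows the same interface.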

download(name: str, version: str, output_directory_path: str | os.PathLike[str], **kwargs) -> None

Download the data from a data source.

Parameters:
  • name – The name of the data source.

  • version – The version of the data source.

  • output_directory_path – The path to the output directory where the data should be downloaded.

extract(name: str, version: str, input_directory_path: str | os.PathLike[str], output_directory_path: str | os.PathLike[str], **kwargs) -> None

Extract the data from a data source.

Parameters:
  • name – The name of the data source.

  • version – The version of the data source.

  • input_directory_path – The path to the input directory containing the downloaded data.

  • output_directory_path – The path to the output directory where the data should be extracted.

format(name: str, version: str, input_directory_path: str | os.PathLike[str], output_directory_path: str | os.PathLike[str], **kwargs) -> None

Format the data from a data source.

Parameters:
  • name – The name of the data source.

  • version – The version of the data source.

  • input_directory_path – The path to the input directory containing the extracted data.

  • output_directory_path – The path to the output directory where the formatted data should be written.
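
The parameter descriptions suggest that download, extract, and format are meant to run in that order, with each step reading from the directory the previous step wrote to. A minimal sketch of such a pipeline follows. It rests on several assumptions: `run_pipeline`, `_RecordingDataSource`, and the `downloaded`, `extracted`, and `formatted` subdirectory names are hypothetical. The stub only records calls so that the snippet runs without `ncsw_data` installed. Creating the directories up front is a precaution, since this reference does not say whether the methods create them.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple


class _RecordingDataSource:
    """Hypothetical stand-in that records calls instead of fetching data."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, str]] = []

    def download(self, name, version, output_directory_path, **kwargs) -> None:
        self.calls.append(("download", Path(output_directory_path).name))

    def extract(self, name, version, input_directory_path,
                output_directory_path, **kwargs) -> None:
        self.calls.append(("extract", Path(output_directory_path).name))

    def format(self, name, version, input_directory_path,
               output_directory_path, **kwargs) -> None:
        self.calls.append(("format", Path(output_directory_path).name))


def run_pipeline(data_source, name: str, version: str, root: Path) -> Path:
    """Run download, extract, and format in sequence under one root directory."""
    downloaded = root / "downloaded"
    extracted = root / "extracted"
    formatted = root / "formatted"
    # Assumption: the directories may need to exist before the calls are made.
    for directory in (downloaded, extracted, formatted):
        directory.mkdir(parents=True, exist_ok=True)

    data_source.download(name=name, version=version,
                         output_directory_path=downloaded)
    # The output of each step becomes the input of the next one.
    data_source.extract(name=name, version=version,
                        input_directory_path=downloaded,
                        output_directory_path=extracted)
    data_source.format(name=name, version=version,
                       input_directory_path=extracted,
                       output_directory_path=formatted)
    return formatted


source = _RecordingDataSource()
with tempfile.TemporaryDirectory() as tmp:
    # "example_source" and "v1" are placeholder values, not real identifiers.
    result = run_pipeline(source, "example_source", "v1", Path(tmp))
    print(source.calls)
```

Keeping the three stages in separate directories makes it possible to rerun a later stage without repeating the download.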