ncsw_data.source.reaction.uspto

The ncsw_data.source.reaction.uspto package initialization module.

Submodules

Classes

USPTOReactionDataset

The United States Patent and Trademark Office (USPTO) chemical reaction dataset class.

Package Contents

class ncsw_data.source.reaction.uspto.USPTOReactionDataset(logger: logging.Logger | None = None)

Bases: ncsw_data.source.base.base.DataSourceBase

The United States Patent and Trademark Office (USPTO) chemical reaction dataset class.

Initializes an instance of the class.

Parameters:

logger – The logger to use. If None, no logging is performed.
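A minimal sketch of preparing a standard-library logger to pass as the `logger` argument. The helper name `build_logger`, the logger name, and the message format are illustrative assumptions and are not part of ncsw_data.

```python
import logging
import sys


def build_logger(name: str = "uspto_dataset", level: int = logging.INFO) -> logging.Logger:
    """Return a logger that writes to standard output (hypothetical helper)."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    # Attach the handler only once, so repeated calls do not duplicate output.
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger


logger = build_logger()

# With ncsw_data installed, the logger would be passed to the constructor:
#   from ncsw_data.source.reaction.uspto import USPTOReactionDataset
#   dataset = USPTOReactionDataset(logger=logger)
# Omitting the argument leaves logger=None, the documented default.
```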

static get_supported_versions() → Dict[str, str]

Get the supported versions of the dataset.

Returns:

The supported versions of the dataset.
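Because the method returns a `Dict[str, str]`, a caller can check a requested version before starting a download. The sketch below assumes that the dictionary keys are the version identifiers accepted by `download`, `extract`, and `format`, which this page does not state explicitly. The function `resolve_version` and the example dictionary are hypothetical stand-ins.

```python
from typing import Dict


def resolve_version(supported: Dict[str, str], requested: str) -> str:
    """Return `requested` if it is a key of `supported`, otherwise raise ValueError."""
    if requested in supported:
        return requested
    available = ", ".join(sorted(supported))
    raise ValueError(f"Unsupported version '{requested}'. Available: {available}")


# Hypothetical stand-in for USPTOReactionDataset.get_supported_versions();
# the real keys and values are not listed on this page.
example_versions = {
    "example_version_a": "Placeholder description A",
    "example_version_b": "Placeholder description B",
}

version = resolve_version(example_versions, "example_version_a")
```

With ncsw_data installed, `example_versions` would be replaced by the return value of `USPTOReactionDataset.get_supported_versions()`.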

download(version: str, output_directory_path: str | os.PathLike[str], **kwargs) → None

Download the data from the dataset.

Parameters:
  • version – The version of the dataset.

  • output_directory_path – The path to the output directory where the data should be downloaded.

extract(version: str, input_directory_path: str | os.PathLike[str], output_directory_path: str | os.PathLike[str], **kwargs) → None

Extract the data from the dataset.

Parameters:
  • version – The version of the dataset.

  • input_directory_path – The path to the input directory that contains the downloaded data.

  • output_directory_path – The path to the output directory where the data should be extracted.

format(version: str, input_directory_path: str | os.PathLike[str], output_directory_path: str | os.PathLike[str], **kwargs) → None

Format the data from the dataset.

Parameters:
  • version – The version of the dataset.

  • input_directory_path – The path to the input directory that contains the extracted data.

  • output_directory_path – The path to the output directory where the formatted data should be written.
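The parameter descriptions suggest that the three methods are meant to be chained: the output directory of `download` becomes the input directory of `extract`, and the output directory of `extract` becomes the input directory of `format`. The sketch below illustrates that chaining against any object exposing the documented signatures. The names `ReactionDataSource` and `run_pipeline`, the sub-directory names, and the choice to create the directories up front are assumptions made for illustration, not behavior documented for ncsw_data.

```python
import os
from pathlib import Path
from typing import Protocol, Union

PathArg = Union[str, os.PathLike]


class ReactionDataSource(Protocol):
    """Structural type mirroring the three documented method signatures."""

    def download(self, version: str, output_directory_path: PathArg, **kwargs) -> None: ...

    def extract(self, version: str, input_directory_path: PathArg,
                output_directory_path: PathArg, **kwargs) -> None: ...

    def format(self, version: str, input_directory_path: PathArg,
               output_directory_path: PathArg, **kwargs) -> None: ...


def run_pipeline(dataset: ReactionDataSource, version: str, root: PathArg) -> Path:
    """Run download, extract, and format in order; return the formatted-data directory."""
    root = Path(root)
    downloaded = root / "downloaded"
    extracted = root / "extracted"
    formatted = root / "formatted"

    # Assumption: the directories are created here in case the methods expect them to exist.
    for directory in (downloaded, extracted, formatted):
        directory.mkdir(parents=True, exist_ok=True)

    dataset.download(version=version, output_directory_path=downloaded)
    dataset.extract(version=version, input_directory_path=downloaded,
                    output_directory_path=extracted)
    dataset.format(version=version, input_directory_path=extracted,
                   output_directory_path=formatted)
    return formatted
```

With ncsw_data installed, a call would look like `run_pipeline(USPTOReactionDataset(), version, "data/uspto")`, where `version` is one of the supported versions and the root path is arbitrary.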