qsprpred.data.tables package

Submodules

qsprpred.data.tables.descriptor module

class qsprpred.data.tables.descriptor.DescriptorTable(calculator: DescriptorSet, name: str, df: DataFrame | None = None, store_dir: str = '.', overwrite: bool = False, index_cols: list[str] | None = None, n_jobs: int = 1, chunk_size: int | None = None, autoindex_name: str | None = None, random_state: int | None = None, store_format: str = 'pkl', parallel_generator: ParallelGenerator | None = None)[source]

Bases: PandasDataTable

Pandas table that holds descriptor data for modelling and other analyses.

Variables:

calculator (DescriptorSet) – DescriptorSet used for descriptor calculation.

Initialize a DescriptorTable object.

Parameters:
  • calculator (DescriptorSet) – DescriptorSet used for descriptor calculation.

  • name (str) – Name of the new descriptor table.

  • df (pd.DataFrame) – data frame containing the descriptors. If you provide a dataframe for a dataset that already exists on disk, the dataframe from disk will override the supplied data frame. Set ‘overwrite’ to True to override the data frame on disk.

  • store_dir (str) – Directory to store the dataset files. Defaults to the current directory. If it already contains files with the same name, the existing data will be loaded.

  • overwrite (bool) – Overwrite existing dataset.

  • index_cols (list) – list of columns to use as index. If None, the index will be a custom generated ID.

  • n_jobs (int) – Number of jobs to use for parallel processing. If <= 0, all available cores will be used.

  • chunk_size (int) – Size of chunks to use per job in parallel processing.

  • autoindex_name (str) – Column name to use for automatically generated IDs.

  • random_state (int) – Random state to use for shuffling and other random ops.

  • store_format (str) – Format to use for storing the data (‘pkl’ or ‘csv’).

  • parallel_generator (ParallelGenerator) – Generator to use for parallel processing. If None, a new generator will be created.
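
The precedence between a supplied data frame and data already stored under the same name (see df, store_dir and overwrite above) can be sketched as follows. This is only an illustration of the documented rule, not the library's code: resolve_table_data is a hypothetical helper, and a pickled dict stands in for the stored data frame.

```python
import os
import pickle
import tempfile


def resolve_table_data(store_path, supplied=None, overwrite=False):
    """Hypothetical helper mirroring the documented precedence rules."""
    if os.path.exists(store_path) and not overwrite:
        # Existing data on disk wins over whatever was supplied.
        with open(store_path, "rb") as handle:
            return pickle.load(handle)
    # Nothing stored yet, or overwrite=True: the supplied data is used and stored.
    with open(store_path, "wb") as handle:
        pickle.dump(supplied, handle)
    return supplied


with tempfile.TemporaryDirectory() as store_dir:
    path = os.path.join(store_dir, "example.pkl")
    first = resolve_table_data(path, {"desc_a": [1.0, 2.0]})
    # Same name, new data, overwrite=False: the stored data is returned.
    second = resolve_table_data(path, {"desc_a": [9.0, 9.0]})
    # overwrite=True replaces what is on disk.
    third = resolve_table_data(path, {"desc_a": [9.0, 9.0]}, overwrite=True)
```

In this sketch, passing new data without overwrite=True silently returns the stored data, which is the behaviour to keep in mind when reusing a table name.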

addEntries(ids: list[str], props: dict[str, list], raise_on_existing: bool = True)

Add entries to the data set.

Parameters:
  • ids (list[str]) – IDs of entries to add.

  • props (dict[str, list]) – Dictionary of properties to add.

  • raise_on_existing (bool) – If True, raise an error if any of the new entries are duplicates.

addProperty(name: str, data: list, ids: list[str] | None = None, ignore_missing: bool = False)

Add a property to the data frame.

Parameters:
  • name (str) – Name of the property.

  • data (list) – list of property values.

  • ids (list[str] | None) – IDs of entries to add the property values for.

  • ignore_missing (bool) – If True, missing IDs are ignored.

apply(func: Callable[[dict[str, list[Any]] | DataFrame, ...], Any], func_args: tuple[Any, ...] | None = None, func_kwargs: dict[str, Any] | None = None, on_props: tuple[str, ...] | None = None, as_df: bool = False, chunk_size: int | None = None, n_jobs: int | None = None) Generator

Apply a function to the data frame. The properties of the data set are passed as the first positional argument to the function. This will be a dictionary of the form {'prop1': [...], 'prop2': [...], ...}. If as_df is True, the properties will be passed as a data frame instead.

Any additional arguments specified in func_args and func_kwargs will be passed to the function after the properties as positional and keyword arguments, respectively.

If on_props is specified, only the properties in this list will be passed to the function. If on_props is None, all properties will be passed to the function.

Parameters:
  • func (Callable) – Function to apply to the data frame.

  • func_args (list) – Positional arguments to pass to the function.

  • func_kwargs (dict) – Keyword arguments to pass to the function.

  • on_props (list[str]) – list of properties to send to the function as arguments

  • as_df (bool) – If True, the function is applied to chunks represented as data frames.

  • chunk_size (int) – Size of chunks to use per job in parallel processing. If None, the chunk size will be set to self.chunkSize. The chunk size will always be set to the number of rows in the data frame if n_jobs or self.nJobs is 1.

  • n_jobs (int) – Number of jobs to use for parallel processing. If None, self.nJobs is used.

Returns:

Generator that yields the results of the function applied to each chunk of the data frame as determined by chunk_size and n_jobs. Each item in the generator will be the result of the function applied to one chunk of the data set.

Return type:

Generator
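
The calling convention described above (properties first, then func_args and func_kwargs, restricted by on_props, one result per chunk) can be sketched in plain Python. This is a serial stand-in under stated assumptions, not the library's parallel implementation: apply_in_chunks and scaled_sum are hypothetical names, and a dict of lists stands in for the data frame.

```python
from typing import Any, Callable, Generator, Optional


def apply_in_chunks(
    props: dict[str, list],
    func: Callable[..., Any],
    func_args: Optional[tuple] = None,
    func_kwargs: Optional[dict] = None,
    on_props: Optional[tuple] = None,
    chunk_size: Optional[int] = None,
) -> Generator[Any, None, None]:
    """Hypothetical stand-in: yield func(chunk, *func_args, **func_kwargs) per chunk."""
    names = list(on_props) if on_props is not None else list(props)
    n_rows = len(next(iter(props.values()), []))
    # Without a chunk size, the whole table is processed as one chunk.
    size = chunk_size or max(n_rows, 1)
    for start in range(0, n_rows, size):
        chunk = {name: props[name][start:start + size] for name in names}
        yield func(chunk, *(func_args or ()), **(func_kwargs or {}))


def scaled_sum(chunk, prop, factor=1.0):
    # The properties dictionary arrives as the first positional argument.
    return sum(chunk[prop]) * factor


table = {"MW": [100.0, 200.0, 300.0], "logP": [1.0, 2.0, 3.0]}
results = list(
    apply_in_chunks(
        table,
        scaled_sum,
        func_args=("MW",),
        func_kwargs={"factor": 2.0},
        on_props=("MW",),
        chunk_size=2,
    )
)
```

Because the result is a generator, nothing is computed until it is consumed, here by list().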

property baseDir: str

The base directory of the data set folder.

property chunkSize: int

Size of chunks to use per job in parallel processing.

clear(files_only: bool = True)

Remove all files associated with this data set from disk.

dropEmptyProperties(names: list[str])

Drop rows with empty target property value from the data set.

Parameters:

names (list[str]) – list of property names to check for empty values.

dropEntries(ids: Iterable[str], ignore_missing: bool = False)

Drop entries from the data set by their IDs.

Parameters:
  • ids (Iterable[str]) – IDs of entries to drop.

  • ignore_missing (bool) – If True, missing IDs are ignored.

fillMissing(fill_value: float, names: list[str] | None = None)[source]

Fill missing values in the descriptor table.

Parameters:
  • fill_value (float) – Value to fill missing values with.

  • names (list) – List of descriptor names to fill. If None, all descriptors are filled.
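
The effect of names on which descriptors are filled can be sketched as below. fill_missing is a hypothetical helper operating on a dict of lists; treating both None and NaN as missing is an assumption of this sketch.

```python
import math


def fill_missing(columns, fill_value, names=None):
    """Hypothetical helper: replace None/NaN in the named columns."""
    targets = list(columns) if names is None else names
    filled = dict(columns)
    for name in targets:
        filled[name] = [
            fill_value
            if value is None or (isinstance(value, float) and math.isnan(value))
            else value
            for value in columns[name]
        ]
    return filled


descriptors = {"desc_a": [0.5, None, 1.5], "desc_b": [float("nan"), 2.0, None]}
# Only desc_a is filled; desc_b keeps its missing values.
only_a = fill_missing(descriptors, 0.0, names=["desc_a"])
# names=None fills every descriptor.
everything = fill_missing(descriptors, 0.0)
```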

classmethod fromFile(filename: str) Any

Initialize a new instance from a JSON file.

Parameters:

filename (str) – path to the JSON file

Returns:

new instance of the class

Return type:

instance (object)

classmethod fromJSON(json: str) Any

Reconstruct object from a JSON string.

Parameters:

json (str) – JSON string of the object

Returns:

reconstructed object

Return type:

obj (object)

generateIndex(name: str | None = None, prefix: str | None = None)

Generate a custom index for the data frame automatically.

Parameters:
  • name (str | None) – name of the resulting index column.

  • prefix (str | None) – prefix to use for the index column values.

getDF()

Get the data frame this instance manages.

Returns:

The data frame this instance manages.

Return type:

pd.DataFrame

getDescriptorNames(active_only: bool = True) list[str][source]

Get the names of the descriptors represented by this table. By default, only active descriptors are returned. Use active_only=False to get all descriptors saved in the table.

Parameters:

active_only (bool) – Whether to return only descriptors that are active in the current descriptor set. Defaults to True.

Returns:

list of descriptor names

Return type:

(list)

getDescriptors(active_only: bool = True) DataFrame[source]

Get the descriptors stored in this table.

Parameters:

active_only (bool) – Whether to return only active descriptors.

Returns:

The descriptors.

Return type:

pd.DataFrame

getProperties() list[str]

Get names of all properties/variables saved in the data frame (all columns).

Returns:

list of property names.

Return type:

(list[str])

getProperty(name: str, ids: tuple[str] | None = None, ignore_missing: bool = False) Series

Get property values from the data set.

Parameters:
  • name (str) – Name of the property to get.

  • ids (tuple[str] | None) – IDs of entries to get properties for.

  • ignore_missing (bool) – If True, missing IDs are ignored.

Returns:

Values of the property as a pandas Series.

Return type:

(pd.Series)

getSubset(properties: list[str], ids: list[str] | None = None, name: str | None = None, path: str | None = None, ignore_missing: bool = False) DescriptorTable[source]

Get a subset of the descriptor table.

Parameters:
  • properties (list) – List of properties to include in the subset.

  • ids (list, optional) – List of IDs to include in the subset.

  • name (str, optional) – Name of the new descriptor table.

  • path (str, optional) – Path to store the new descriptor table.

  • ignore_missing (bool, optional) – Whether to ignore missing IDs.

Returns:

The subset of the descriptor table.

Return type:

DescriptorTable

hasProperty(name: str) bool

Check whether a property is present in the data frame.

Parameters:

name (str) – Name of the property.

Returns:

Whether the property is present.

Return type:

bool

property idProp: str

Column name to use for automatically generated IDs.

iterChunks(size: int | None = None, on_props: tuple[str] | None = None, as_dict: bool = False) Generator[DataFrame | dict, None, None]

Batch a data frame into chunks of the given size.

Parameters:
  • size (int) – Size of chunks to use per job in parallel processing. If None, self.chunkSize is used.

  • on_props (list[str]) – list of properties to include, if None, all properties are included.

  • as_dict (bool) – If True, the generator yields dictionaries instead of data frames.

Returns:

Generator that yields batches of the data frame as smaller data frames.

Return type:

Generator[pd.DataFrame, None, None]

keepDescriptors(descriptors: list[str]) list[str][source]

Mark only the given descriptors as active in this set.

Parameters:

descriptors (list) – list of descriptor names to keep

Returns:

list of descriptor names that were kept

Return type:

list[str]

Raises:

ValueError – If any of the descriptors are not present in the table.
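
The relationship between keepDescriptors, restoreDescriptors and the active_only flag of getDescriptorNames can be modelled with a small class. ActiveDescriptors is a hypothetical illustration of the documented behaviour, not the library's implementation; preserving the stored column order in keep is an assumption.

```python
class ActiveDescriptors:
    """Hypothetical model of active vs. stored descriptors."""

    def __init__(self, names):
        self._all = list(names)     # everything stored in the table
        self._active = list(names)  # what is currently exposed

    def keep(self, descriptors):
        missing = [name for name in descriptors if name not in self._all]
        if missing:
            raise ValueError(f"Descriptors not in table: {missing}")
        # Only mark as active; the stored data is untouched.
        self._active = [name for name in self._all if name in descriptors]
        return list(self._active)

    def restore(self):
        self._active = list(self._all)
        return list(self._active)

    def names(self, active_only=True):
        return list(self._active if active_only else self._all)


table = ActiveDescriptors(["desc_a", "desc_b", "desc_c"])
kept = table.keep(["desc_c", "desc_a"])
active_names = table.names()
all_names = table.names(active_only=False)
```

Nothing is deleted by keep in this model, so restore can always bring every descriptor back.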

property metaFile

The path to the meta file of this data set.

property nJobs

Number of jobs to use for parallel processing.

property name: str

Name of the data set.

property randomState: int

Random state to use for all random operations for reproducibility.

reload()

Reload the data table from disk.

removeProperty(name)

Remove a property from the data frame.

Parameters:

name (str) – Name of the property to delete.

restoreDescriptors() list[str][source]

Restore all descriptors to active in this set.

Returns:

list of all active descriptor names

Return type:

list[str]

save() str

Save the data frame to disk and all associated files.

Returns:

Path to the saved data frame.

Return type:

(str)

searchOnProperty(prop_name: str, values: list[str], exact: bool = False) PandasDataTable

Search the molecules within this MoleculeDataSet on a property value and return the appropriate subset.

Parameters:
  • prop_name (str) – Name of the column to search on.

  • values (list[str]) – Values to search for.

  • exact (bool) – Whether to search for exact matches or not.

Returns:

A data set with the molecules that match the search.

Return type:

(PandasDataTable)

setIndex(cols: list[str])

Create an index column from several columns of the data set. This also resets the idProp attribute to be the names of the index columns joined by a ‘~’ character. The values of the columns are joined in the same way to create the index. Make sure that the combined values of the columns are unique and can be joined into a string.

Parameters:

cols (list[str]) – list of columns to use as index.
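
How an index built from several columns looks can be sketched as follows. joined_index is a hypothetical helper working on a list of row dicts; the explicit uniqueness check is this sketch's own choice and is not confirmed library behaviour.

```python
def joined_index(rows, cols, sep="~"):
    """Hypothetical helper: build the index name and values from several columns."""
    index_name = sep.join(cols)
    values = [sep.join(str(row[col]) for col in cols) for row in rows]
    if len(set(values)) != len(values):
        # The combined values must identify each row uniquely.
        raise ValueError("Joined index values are not unique.")
    return index_name, values


rows = [
    {"source": "chembl", "id": 1},
    {"source": "chembl", "id": 2},
    {"source": "pubchem", "id": 1},
]
index_name, index_values = joined_index(rows, ["source", "id"])
```

Neither column is unique on its own here, but the joined values are, which is the situation the note about uniqueness refers to.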

shuffle(random_state: int | None = None)

Shuffle the internal data frame.

Parameters:

random_state (int | None) – Random state to use for shuffling. If None, the random state of the data set is used.

property storeDir

The data set folder containing the data set files after saving.

property storePath

The path to the main data set file.

property storePrefix

The prefix of the data set files.

toFile(filename: str) str

Save the metafile and all associated files to a custom location.

Parameters:

filename (str) – absolute path to the saved metafile.

Returns:

Path to the saved data frame.

Return type:

(str)

toJSON() str

Serialize object to a JSON string. This JSON string should contain all data necessary to reconstruct the object.

Returns:

JSON string of the object

Return type:

json (str)
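
The toJSON/fromJSON contract (the string must carry everything needed to rebuild the object) can be illustrated with a minimal class. Settings is a hypothetical example, not part of QSPRpred, and says nothing about how the library actually encodes its objects.

```python
import json


class Settings:
    """Hypothetical serializable object following the toJSON/fromJSON contract."""

    def __init__(self, name, random_state=None):
        self.name = name
        self.random_state = random_state

    def toJSON(self):
        # Everything needed to rebuild the object goes into the string.
        return json.dumps({"name": self.name, "random_state": self.random_state})

    @classmethod
    def fromJSON(cls, json_str):
        return cls(**json.loads(json_str))


original = Settings("example", random_state=42)
restored = Settings.fromJSON(original.toJSON())
```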

transformProperties(names: list[str], transformer: Callable)

Transform property values using a transformer function.

Parameters:
  • names (list[str]) – list of column names to transform.

  • transformer (Callable) – Function that transforms the data in target columns to a new representation.
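
A transformer is simply a callable that maps the values of a column to new values. The sketch below applies a log10 transform to one column; transform_properties and log10_all are hypothetical names, and a dict of lists stands in for the data frame.

```python
import math


def transform_properties(columns, names, transformer):
    """Hypothetical helper: apply transformer to each named column."""
    transformed = dict(columns)
    for name in names:
        transformed[name] = list(transformer(columns[name]))
    return transformed


def log10_all(values):
    return [math.log10(value) for value in values]


data = {"IC50_nM": [10.0, 1000.0], "MW": [180.2, 300.4]}
# Only the named column is transformed; MW is left as it is.
result = transform_properties(data, ["IC50_nM"], log10_all)
```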

qsprpred.data.tables.mol module

class qsprpred.data.tables.mol.MoleculeTable(storage: ChemStore | None = None, name: str | None = None, path: str = '.', random_state: int | None = None, store_format: str = 'pkl')[source]

Bases: MoleculeDataSet, Parallelizable

Class that holds and prepares molecule data for modelling and other analyses organized as a collection of PandasDataTable objects.

Variables:
  • descriptors (list[DescriptorTable]) – List of descriptor tables attached to this data set.

  • randomState (int) – Random state to use for shuffling and other random ops.

  • storeFormat (str) – Format to use for storing the data set.

  • rootDir (str) – Path to the directory where the data set is stored.

  • storage (ChemStore) – The storage object that holds the molecule data.

  • path (str) – Path to the directory where the data set will be stored.

  • name (str) – Name of the data set.

Initialize a MoleculeTable object.

This object wraps a pandas dataframe and provides short-hand methods to prepare molecule data for modelling and analysis.

Parameters:
  • storage (ChemStore) – The storage object that holds the molecule data.

  • name (str) – Name of the data set.

  • path (str) – Path to the directory where the data set will be stored.

  • random_state (int) – Random state to use for shuffling and other random ops.

  • store_format (str) – Format to use for storing the data set.

addClusters(clusters: list[MoleculeClusters], recalculate: bool = False)[source]

Add clusters to the data frame.

A new column is created that contains the identifier of the corresponding cluster calculator.

Parameters:
  • clusters (list) – list of MoleculeClusters calculators.

  • recalculate (bool) – Whether to recalculate clusters even if they are already present in the data frame.

addDescriptors(descriptors: list[DescriptorSet], recalculate: bool = False, *args, **kwargs)[source]

Add descriptors to the data frame with the given descriptor calculators.

Parameters:
  • descriptors (list[DescriptorSet]) – List of DescriptorSet objects to use for descriptor calculation.

  • recalculate (bool) – Whether to recalculate descriptors even if they are already present in the data frame. If False, existing descriptors are kept and no calculation takes place.

  • *args – Additional positional arguments to pass to each descriptor set.

  • **kwargs – Additional keyword arguments to pass to each descriptor set.

addEntries(ids: list[str], props: dict[str, list], raise_on_existing: bool = True)[source]

Add entries to the data set.

Parameters:
  • ids (list[str]) – IDs of the entries to add.

  • props (dict[str, list]) – Properties to add.

  • raise_on_existing (bool) – Whether to raise an error if the entries already exist.

Raises:

NotImplementedError – Adding entries is not yet available for the data set.

addProperty(name: str, data: Sized, ids: list[str] | None = None)[source]

Add a property to the data frame.

Parameters:
  • name (str) – Name of the property.

  • data (Sized) – Property values.

  • ids (list[str], optional) – IDs of the molecules to add the property for.

Returns:

Whether the property was added successfully.

Return type:

(bool)

addScaffolds(scaffolds: list[Scaffold], add_rdkit_scaffold: bool = False, recalculate: bool = False)[source]

Add scaffolds to the data frame.

A new column is created that contains the SMILES of the corresponding scaffold. If add_rdkit_scaffold is set to True, a new column is created that contains the RDKit scaffold of the corresponding molecule.

Parameters:
  • scaffolds (list) – list of Scaffold calculators.

  • add_rdkit_scaffold (bool) – Whether to add the RDKit scaffold of the molecule as a new column.

  • recalculate (bool) – Whether to recalculate scaffolds even if they are already present in the data frame.

apply(func: callable, func_args: list | None = None, func_kwargs: dict | None = None, on_props: tuple[str, ...] | None = None, chunk_type: Literal['mol', 'smiles', 'rdkit', 'df'] = 'mol') Generator[Iterable[Any], None, None][source]

Apply a function to the data set.

Parameters:
  • func (callable) – Function to apply.

  • func_args (list, optional) – Positional arguments to pass to the function.

  • func_kwargs (dict, optional) – Keyword arguments to pass to the function.

  • on_props (tuple[str, ...], optional) – Properties to apply the function on.

  • chunk_type (Literal["mol", "smiles", "rdkit", "df"], optional) – Type of chunks to use for processing.

Returns:

Generator of the results.

Return type:

(Generator[Iterable[Any], None, None])

applyIdentifier(identifier: ChemIdentifier)[source]

Apply an identifier to the data set.

Parameters:

identifier (ChemIdentifier) – Identifier to apply.

applyStandardizer(standardizer: ChemStandardizer)[source]

Apply a standardizer to the data set.

Parameters:

standardizer (ChemStandardizer) – Standardizer to apply.

attachDescriptors(calculator: DescriptorSet, descriptors: DataFrame, index_cols: list)[source]

Attach descriptors to the data frame.

Parameters:
  • calculator (DescriptorSet) – DescriptorSet object used for descriptor calculation.

  • descriptors (pd.DataFrame) – DataFrame containing the descriptors to attach.

  • index_cols (list) – List of column names to use as index.

property chunkSize: int

Get the size of chunks to use per job in parallel processing.

clear()[source]

Clear the data set from memory and disk.

createScaffoldGroups(mols_per_group: int = 10)[source]

Create scaffold groups.

A scaffold group is a list of molecules that share the same scaffold. New columns are created that contain the scaffold group ID and the scaffold group size.

Parameters:

mols_per_group (int) – Number of molecules per scaffold group.

property descriptorSets: list[DescriptorSet]

Get the descriptor calculators for this table.

property descsPath

dropDescriptorSets(descriptors: list[DescriptorSet | str], full_removal: bool = False)[source]

Drop descriptors from the given sets from the data frame.

Parameters:
  • descriptors (list[DescriptorSet | str]) – List of DescriptorSet objects or their names. Name of a descriptor set corresponds to the result returned by its __str__ method.

  • full_removal (bool) – Whether to remove the descriptor data (will perform full removal). By default, a soft removal is performed by just rendering the descriptors inactive. A full removal will remove the descriptorSet from the dataset, including the saved files. It is not possible to restore a descriptorSet after a full removal.

Raises:

AssertionError – If the data set does not contain any descriptors.
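
The difference between a soft and a full removal can be modelled as below. DescriptorSetRegistry is a hypothetical illustration: a soft removal only flips an active flag, while a full removal deletes the entry so that a later restore fails. The set names are placeholders.

```python
class DescriptorSetRegistry:
    """Hypothetical model of soft vs. full removal of descriptor sets."""

    def __init__(self, set_names):
        self._stored = {name: True for name in set_names}  # name -> active flag

    def drop(self, names, full_removal=False):
        for name in names:
            if full_removal:
                # Data is deleted; nothing is left to restore.
                self._stored.pop(name, None)
            elif name in self._stored:
                # Soft removal: keep the data, just deactivate it.
                self._stored[name] = False

    def restore(self, names):
        missing = [name for name in names if name not in self._stored]
        if missing:
            raise ValueError(f"Descriptor sets not present: {missing}")
        for name in names:
            self._stored[name] = True

    def active(self):
        return [name for name, is_active in self._stored.items() if is_active]


registry = DescriptorSetRegistry(["SetA", "SetB"])
registry.drop(["SetA"])
after_soft = registry.active()
registry.restore(["SetA"])
after_restore = registry.active()
registry.drop(["SetB"], full_removal=True)
```

After the last call, restoring "SetB" raises ValueError in this model, mirroring the note that a full removal cannot be undone.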

dropDescriptors(descriptors: list[str])[source]

Drop descriptors by name. Performs a simple feature selection by removing the given descriptor names from the data set.

Parameters:

descriptors (list[str]) – List of descriptor names to drop.

dropEmptyEntries(names: list[str])[source]

Drop rows with missing values in the properties.

Parameters:

names (list[str]) – list of property names to check for missing values.

dropEntries(ids: Iterable[str])[source]

Drop entries from the data set.

Parameters:

ids (Iterable[str]) – IDs of the entries to drop.

classmethod fromDF(name: str, df: DataFrame, path: str = '.', smiles_col: str = 'SMILES', **kwargs) MoleculeTable[source]

Create a MoleculeTable instance from a pandas DataFrame.

Parameters:
  • name (str) – Name of the data set.

  • df (pd.DataFrame) – DataFrame containing the molecule data.

  • path (str) – Path to the directory where the data set will be stored.

  • smiles_col (str) – Name of the column in the data frame containing the SMILES sequences.

  • **kwargs – Additional keyword arguments to pass to the MoleculeTable constructor.

Returns:

The created data set.

Return type:

(MoleculeTable)

classmethod fromFile(filename: str) Any

Initialize a new instance from a JSON file.

Parameters:

filename (str) – path to the JSON file

Returns:

new instance of the class

Return type:

instance (object)

classmethod fromJSON(json: str) Any

Reconstruct object from a JSON string.

Parameters:

json (str) – JSON string of the object

Returns:

reconstructed object

Return type:

obj (object)

classmethod fromSDF(name: str, filename: str, path: str, smiles_prop: str, *args, **kwargs)[source]

Create a MoleculeTable instance from an SDF file.

Parameters:
  • name (str) – Name of the data set.

  • filename (str) – Path to the SDF file.

  • path (str) – Path to the directory where the data set will be stored.

  • smiles_prop (str) – Name of the property in the SDF file containing the SMILES sequence.

  • *args – Additional arguments to pass to the MoleculeTable constructor.

  • **kwargs – Additional keyword arguments to pass to the MoleculeTable constructor.

classmethod fromSMILES(name: str, smiles: list, path: str, *args, **kwargs)[source]

Create a MoleculeTable instance from a list of SMILES sequences.

Parameters:
  • name (str) – Name of the data set.

  • smiles (list) – list of SMILES sequences.

  • path (str) – Path to the directory where the data set will be stored.

  • *args – Additional arguments to pass to the MoleculeTable constructor.

  • **kwargs – Additional keyword arguments to pass to the MoleculeTable constructor.

Returns:

The created data set.

Return type:

(MoleculeTable)

classmethod fromTableFile(name: str, filename: str, path: str, *args, sep='\t', **kwargs)[source]

Create a MoleculeTable instance from a file containing a table of molecules (e.g. a CSV or TSV file).

Parameters:
  • name (str) – Name of the data set.

  • filename (str) – Path to the file containing the table.

  • path (str) – Path to the directory where the data set will be stored.

  • sep (str) – Separator used in the file for different columns.

  • *args – Additional arguments to pass to the MoleculeTable constructor.

  • **kwargs – Additional keyword arguments to pass to the MoleculeTable constructor.

Returns:

The created data set.

Return type:

(MoleculeTable)

generateDescriptorDataSetName(ds_set: str | DescriptorSet, name: str | None = None) str[source]

Generate a descriptor set name from a descriptor set.

Parameters:
  • ds_set (str | DescriptorSet) – Name of the descriptor set.

  • name (str) – Name of the data set.

Returns:

Name of the descriptor set.

Return type:

(str)

getClusterNames(clusters: list[MoleculeClusters] | None = None) list[str][source]

Get the names of the clusters in the data frame.

Parameters:

clusters (list) – List of cluster calculators of clusters to include

Returns:

List of cluster names.

Return type:

(list[str])

getClusters(clusters: list[MoleculeClusters] | None = None)[source]

Get the subset of the data frame that contains only clusters.

Parameters:

clusters (list) – List of cluster calculators of clusters to include.

Returns:

Data frame containing only clusters.

Return type:

pd.DataFrame

getDF() DataFrame[source]

Get the data frame of the data set.

getDescriptorNames() list[str][source]

Get the names of the descriptors present for molecules in this data set.

Returns:

list of descriptor names.

Return type:

(list[str])

getDescriptors(active_only: bool = True) DataFrame[source]

Get the calculated descriptors as a pandas data frame.

Returns:

Data frame containing only descriptors.

Return type:

pd.DataFrame

getProperties() list[str][source]

Get the names of the properties in the data frame.

getProperty(name: str, ids: tuple[str] | None = None) Iterable[Any][source]

Get the property with the given name.

Parameters:
  • name (str) – Name of the property.

  • ids (tuple[str], optional) – IDs of the molecules to get the property for.

Returns:

Property values.

Return type:

(Iterable[Any])

getScaffoldGroups(scaffold_name: str, mol_per_group: int = 10) Series[source]

Get the scaffold groups for a given combination of scaffold and number of molecules per scaffold group.

Parameters:
  • scaffold_name (str) – Name of the scaffold.

  • mol_per_group (int) – Number of molecules per scaffold group.

Returns:

Series containing the scaffold groups.

Return type:

(pd.Series)

getScaffoldNames(scaffolds: list[Scaffold] | None = None, include_mols: bool = False) list[str][source]

Get the names of the scaffolds in the data frame.

Parameters:
  • scaffolds (list) – List of scaffold calculators of scaffolds to include.

  • include_mols (bool) – Whether to include the RDKit scaffold columns as well.

Returns:

List of scaffold names.

Return type:

(list[str])

getScaffolds(scaffolds: list[Scaffold] | None = None, include_mols: bool = False) DataFrame[source]

Get the subset of the data frame that contains only scaffolds.

Parameters:
  • scaffolds (list) – List of scaffold calculators of scaffolds to include.

  • include_mols (bool) – Whether to include the RDKit scaffold columns as well.

Returns:

Data frame containing only scaffolds.

Return type:

pd.DataFrame

getSubset(subset: Iterable[str], ids: Iterable[str] | None = None, name: str | None = None, path: str = '.', **kwargs) MoleculeTable[source]

Get a subset of the data frame.

Parameters:
  • subset (Iterable[str]) – List of properties to include in the subset.

  • ids (Iterable[str], optional) – IDs of the molecules to include in the subset.

  • name (str, optional) – Name of the new data set.

  • path (str) – Path to the directory where the data set will be stored.

  • **kwargs – Additional keyword arguments to pass to the MoleculeTable constructor.

Returns:

The created data set.

Return type:

(MoleculeTable)

getSummary() DataFrame[source]

Get a summary of the data set.

Returns:

Summary of the data set.

Return type:

(pd.DataFrame)

Raises:

NotImplementedError – Summary not yet available for MoleculeTable.

property hasClusters: bool

Check whether the data frame contains clusters.

Returns:

Whether the data frame contains clusters.

Return type:

bool

hasDescriptors(descriptors: list[DescriptorSet | str] | None = None) bool | list[bool][source]

Check whether the data frame contains given descriptors.

Parameters:

descriptors (list[DescriptorSet | str] | None) – List of descriptor objects or prefixes of descriptors to check for. If None, all descriptors are checked and a single boolean is returned that indicates whether any descriptors are found.

Returns:

Whether the data frame contains the given descriptors.

Return type:

(bool | list[bool])

hasProperty(name: str) bool[source]

Check whether a property is present in the data frame.

Parameters:

name (str) – Name of the property.

property hasScaffoldGroups: bool

Check whether the data frame contains scaffold groups.

Returns:

Whether the data frame contains scaffold groups.

Return type:

(bool)

property hasScaffolds: bool

Check whether the data frame contains scaffolds.

Returns:

Whether the data frame contains scaffolds.

Return type:

bool

property idProp: str

Get the name of the property that contains the molecule IDs.

property identifier: ChemIdentifier

Get the identifier to use for the data set.

iterChunks(size: int | None = None, on_props: list | None = None, chunk_type: Literal['mol', 'smiles', 'rdkit', 'df'] = 'mol') Generator[list[StoredMol], None, None][source]

Iterate over chunks of the data set.

Parameters:
  • size (int, optional) – Size of the chunks.

  • on_props (list, optional) – Properties to iterate over.

  • chunk_type (Literal["mol", "smiles", "rdkit", "df"], optional) – Type of chunks to use for processing.

Returns:

Generator of the chunks.

Return type:

(Generator[list[StoredMol], None, None])

property metaFile: str

Get the path to the meta file of the data set.

property nJobs: int

Get the number of jobs to use for parallel processing.

property name: str

Get the name of the data set.

processMols(processor: MolProcessor, proc_args: tuple[Any, ...] | None = None, proc_kwargs: dict[str, Any] | None = None, mol_type: Literal['smiles', 'mol', 'rdkit'] = 'mol', add_props: Iterable[str] | None = None) Generator[Any, None, None][source]

Process molecules in the data set.

Parameters:
  • processor (MolProcessor) – Processor to use for molecule processing.

  • proc_args (tuple, optional) – Positional arguments to pass to the processor.

  • proc_kwargs (dict, optional) – Keyword arguments to pass to the processor.

  • mol_type (Literal["smiles", "mol", "rdkit"], optional) – Type of molecules to process.

  • add_props (Iterable[str], optional) – Additional properties to add to the data frame.

Returns:

Generator of the results.

Return type:

(Generator[Any, None, None])

property randomState: int

Get the random state to use for shuffling and other random ops.

reload()[source]

Reload the data set from disk.

removeProperty(name: str) bool[source]

Remove a property from the data frame.

Parameters:

name (str) – Name of the property.

Returns:

Whether the property was removed successfully.

Return type:

(bool)

restoreDescriptorSets(descriptors: list[DescriptorSet | str])[source]

Restore descriptors that were previously removed.

Parameters:

descriptors (list[DescriptorSet | str]) – List of DescriptorSet objects or their names. Name of a descriptor set corresponds to the result returned by its __str__ method.

Raises:

ValueError – If any of the descriptors are not present in the data set.

sample(n: int, name: str | None = None, random_state: int | None = None) MoleculeTable[source]

Sample n molecules from the table.

Parameters:
  • n (int) – Number of molecules to sample.

  • name (str) – Name of the new table. Defaults to the name of the old table, plus the _sampled suffix.

  • random_state (int) – Random state to use for shuffling and other random ops.

Returns:

A new table containing the sampled molecules.

Return type:

(MoleculeTable)
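
The role of random_state in sampling can be shown with the standard library. sample_ids is a hypothetical helper that draws from a list of IDs; it only demonstrates that a fixed seed makes the draw reproducible and does not reflect how the library samples internally.

```python
import random


def sample_ids(ids, n, random_state=None):
    """Hypothetical helper: draw n IDs reproducibly for a given random_state."""
    rng = random.Random(random_state)
    return rng.sample(list(ids), n)


ids = [f"mol_{i}" for i in range(10)]
# The same seed gives the same selection on every call.
first = sample_ids(ids, 3, random_state=42)
second = sample_ids(ids, 3, random_state=42)
```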

save()[source]

Save the whole storage to disk.

searchOnProperty(prop_name: str, values: list[float | int | str], exact=False, name: str | None = None, path: str = '.') MoleculeTable[source]

Search the data set based on a property.

Parameters:
  • prop_name (str) – Name of the property to search on.

  • values (list[float | int | str]) – Values to search for.

  • exact (bool) – Whether to perform an exact search.

  • name (str) – Name of the new table.

  • path (str) – Path to the directory where the new table will be stored.

Returns:

Data set containing the search results.

Return type:

(MoleculeTable)

searchWithSMARTS(patterns: list[str], operator: Literal['or', 'and'] = 'or', use_chirality: bool = False, name: str | None = None, path: str = '.') MoleculeTable[source]

Search the data set with SMARTS patterns.

Parameters:
  • patterns (list[str]) – List of SMARTS patterns to search for.

  • operator (Literal["or", "and"]) – Operator to use for combining the patterns.

  • use_chirality (bool) – Whether to use chirality in the search.

  • name (str) – Name of the new table.

  • path (str) – Path to the directory where the new table will be stored.

Returns:

Data set containing the search results.

Return type:

(MoleculeTable)
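
The operator argument decides whether a molecule must match any pattern ('or') or every pattern ('and'). The sketch below shows only that combination logic: combine_matches is a hypothetical helper, and naive substring checks stand in for real SMARTS substructure matching, which requires a cheminformatics toolkit.

```python
def combine_matches(smiles_list, matchers, operator="or"):
    """Hypothetical helper: keep entries matched by any ('or') or all ('and') matchers."""
    if operator not in ("or", "and"):
        raise ValueError("operator must be 'or' or 'and'")
    combine = any if operator == "or" else all
    return [s for s in smiles_list if combine(match(s) for match in matchers)]


# Naive substring checks stand in for real SMARTS substructure matching.
def has_nitrogen(smiles):
    return "N" in smiles


def has_oxygen(smiles):
    return "O" in smiles


molecules = ["CCO", "CCN", "NCCO", "CCC"]
either = combine_matches(molecules, [has_nitrogen, has_oxygen], operator="or")
both = combine_matches(molecules, [has_nitrogen, has_oxygen], operator="and")
```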

property smiles: Generator[str, None, None]

Generator of SMILES strings of all molecules in the data set.

property smilesProp: str

Get the name of the property that contains the SMILES strings.

property standardizer: ChemStandardizer

Get the standardizer to use for the data set.

toFile(filename: str)[source]

Save the data set to a file.

Parameters:

filename (str) – Path to the file to save the data set to.

toJSON() str

Serialize object to a JSON string. This JSON string should contain all data necessary to reconstruct the object.

Returns:

JSON string of the object

Return type:

json (str)

transformProperties(names: list[str], transformer: Callable[[Iterable[Any]], Iterable[Any]])[source]

Transform the properties of the data frame.

Parameters:
  • names (list[str]) – List of property names to transform.

  • transformer (Callable) – Function to use for transformation.
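
A transformer only has to accept an iterable of values and return an iterable of transformed values. The sketch below shows one such function; transform_columns is a hypothetical stand-in for the table, and it assumes the transformer is called once per named property, which the reference above does not spell out.

```python
import math
from typing import Any, Iterable


def to_log10(values: Iterable[Any]) -> list[float]:
    """Transformer matching Callable[[Iterable[Any]], Iterable[Any]]."""
    return [math.log10(v) for v in values]


def transform_columns(columns: dict, names: list[str], transformer) -> dict:
    """Stand-in for the table: apply the transformer to the named columns only."""
    return {
        name: list(transformer(vals)) if name in names else vals
        for name, vals in columns.items()
    }


# Hypothetical property names and values.
columns = {"Ki_nM": [1.0, 10.0, 1000.0], "ID": ["a", "b", "c"]}
transformed = transform_columns(columns, ["Ki_nM"], to_log10)
```

Columns that are not listed in names pass through unchanged.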

qsprpred.data.tables.pnds module

class qsprpred.data.tables.pnds.PandasDataTable(name: str, df: DataFrame | None = None, store_dir: str = '.', overwrite: bool = False, index_cols: list[str] | None = None, n_jobs: int = 1, chunk_size: int | None = None, autoindex_name: str | None = None, random_state: int | None = None, store_format: str = 'pkl', parallel_generator: ParallelGenerator | None = None)[source]

Bases: PropertyStorage, Randomized

A pandas.DataFrame wrapper class for integration with QSPRpred API.

Variables:
  • name (str) – Name of the data set. You can use this name to load the dataset from disk anytime and create a new instance.

  • df (pd.DataFrame) – Pandas dataframe containing the data. You can modify this one directly, but note that removing rows, adding rows, or changing the index or other automatic properties of the data frame might break the data set. In that case, it is recommended to recreate the data set from scratch.

  • indexCols (List) – List of columns to use as index. If None, the index will be a custom generated ID. Note that if you specify multiple columns their values will be joined with a ‘~’ character rather than using the default pandas multi-index.

  • nJobs (int) – Number of jobs to use for parallel processing. If set to None or 0, all available cores will be used.

  • chunkSize (int) – Size of chunks to use per job in parallel processing. This is automatically set to the number of rows in the data frame divided by nJobs. However, you can also set it manually if you want to use a different chunk size. Set to None to again use the default value determined by nJobs.

  • randomState (int) – Random state to use for all random operations.

  • storeFormat (str) – Format to use for storing the data frame. Currently only ‘pkl’ and ‘csv’ are supported. Defaults to ‘pkl’ because it is faster. However, ‘csv’ is more portable and can be opened in other programs.

  • parallelGenerator (Callable) – A ParallelGenerator to use for parallel processing of chunks of data. Defaults to qsprpred.utils.parallel.MultiprocessingPoolGenerator. You can replace this with your own parallel generator function if you want to use a different parallelization strategy (e.g. utilize remote servers instead of local processes).

Initialize a PandasDataTable object.

Parameters:
  • name (str) – Name of the data set. You can use this name to load the dataset from disk anytime and create a new instance.

  • df (pd.DataFrame) – Pandas dataframe containing the data. If you provide a dataframe for a dataset that already exists on disk, the dataframe from disk will override the supplied data frame. Set ‘overwrite’ to True to override the data frame on disk.

  • store_dir (str) – Directory to store the dataset files. Defaults to the current directory. If it already contains files with the same name, the existing data will be loaded.

  • overwrite (bool) – Overwrite existing dataset.

  • index_cols (List) – List of columns to use as index. If None, the index will be a custom generated ID.

  • n_jobs (int) – Number of jobs to use for parallel processing. If <= 0, all available cores will be used.

  • chunk_size (int) – Size of chunks to use per job in parallel processing. If None, the chunk size will be set to the number of rows in the data frame divided by nJobs.

  • autoindex_name (str) – Column name to use for automatically generated IDs.

  • random_state (int) – Random state to use for all random operations for reproducibility. If not specified, the state is generated randomly. The state is stored when the data set is saved, so if you want to change it later, set it through the randomState property.

  • store_format (str) – Format to use for storing the data frame. Currently only ‘pkl’ and ‘csv’ are supported.

  • parallel_generator (ParallelGenerator | None) – A ParallelGenerator to use for parallel processing of chunks of data. Defaults to qsprpred.utils.parallel.MultiprocessingPoolGenerator. You can replace this with your own parallel generator function if you want to use a different parallelization strategy (e.g. utilize remote servers instead of local processes).

addEntries(ids: list[str], props: dict[str, list], raise_on_existing: bool = True)[source]

Add entries to the data set.

Parameters:
  • ids (list[str]) – IDs of entries to add.

  • props (dict[str, list]) – Dictionary of properties to add.

  • raise_on_existing (bool) – If True, raise an error if any of the new entries are duplicates.

addProperty(name: str, data: list, ids: list[str] | None = None, ignore_missing: bool = False)[source]

Add a property to the data frame.

Parameters:
  • name (str) – Name of the property.

  • data (list) – list of property values.

  • ids (list[str] | None) – IDs of entries to add the property for.

  • ignore_missing (bool) – If True, missing IDs are ignored.

apply(func: Callable[[dict[str, list[Any]] | DataFrame, ...], Any], func_args: tuple[Any, ...] | None = None, func_kwargs: dict[str, Any] | None = None, on_props: tuple[str, ...] | None = None, as_df: bool = False, chunk_size: int | None = None, n_jobs: int | None = None) Generator[source]

Apply a function to the data frame. The properties of the data set are passed as the first positional argument to the function. This will be a dictionary of the form {'prop1': [...], 'prop2': [...], ...}. If as_df is True, the properties will be passed as a data frame instead.

Any additional arguments specified in func_args and func_kwargs will be passed to the function after the properties as positional and keyword arguments, respectively.

If on_props is specified, only the properties in this list will be passed to the function. If on_props is None, all properties will be passed to the function.

Parameters:
  • func (Callable) – Function to apply to the data frame.

  • func_args (list) – Positional arguments to pass to the function.

  • func_kwargs (dict) – Keyword arguments to pass to the function.

  • on_props (list[str]) – list of properties to send to the function as arguments

  • as_df (bool) – If True, the function is applied to chunks represented as data frames.

  • chunk_size (int) – Size of chunks to use per job in parallel processing. If None, the chunk size will be set to self.chunkSize. The chunk size will always be set to the number of rows in the data frame if n_jobs or self.nJobs is 1.

  • n_jobs (int) – Number of jobs to use for parallel processing. If None, self.nJobs is used.

Returns:

Generator that yields the results of the function applied to each chunk of the data frame as determined by chunk_size and n_jobs. Each item in the generator will be the result of the function applied to one chunk of the data set.

Return type:

Generator
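
The calling convention described above (properties as the first positional argument, then func_args and func_kwargs, one result per chunk) can be imitated in a few lines. This is a serial sketch under stated assumptions, not the library's implementation: apply_sketch and scaled_sum are hypothetical names, the table is replaced by a plain dictionary of lists, and no parallel processing takes place.

```python
from typing import Any, Callable, Iterator, Optional


def apply_sketch(
    props: dict[str, list],
    func: Callable[..., Any],
    func_args: Optional[tuple] = None,
    func_kwargs: Optional[dict] = None,
    on_props: Optional[tuple] = None,
    chunk_size: Optional[int] = None,
) -> Iterator[Any]:
    """Yield func(chunk, *func_args, **func_kwargs) for each chunk of rows."""
    selected = {k: v for k, v in props.items() if on_props is None or k in on_props}
    n_rows = len(next(iter(selected.values()), []))
    size = chunk_size or max(n_rows, 1)  # a single chunk when no size is given
    for start in range(0, n_rows, size):
        chunk = {k: v[start:start + size] for k, v in selected.items()}
        yield func(chunk, *(func_args or ()), **(func_kwargs or {}))


def scaled_sum(chunk: dict, prop: str, scale: float = 1.0) -> float:
    """Example function: the first argument is the dictionary of properties."""
    return scale * sum(chunk[prop])


# Hypothetical property names and values.
data = {"pchembl": [5.0, 6.0, 7.0, 8.0], "ID": ["a", "b", "c", "d"]}
results = list(
    apply_sketch(
        data,
        scaled_sum,
        func_args=("pchembl",),
        func_kwargs={"scale": 2.0},
        on_props=("pchembl",),
        chunk_size=3,
    )
)
```

Four rows with a chunk size of 3 give two chunks, so the generator yields two results.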

property baseDir: str

The base directory of the data set folder.

property chunkSize: int

Size of chunks to use per job in parallel processing.

clear(files_only: bool = True)[source]

Remove all files associated with this data set from disk.

dropEmptyProperties(names: list[str])[source]

Drop rows with empty target property value from the data set.

Parameters:

names (list[str]) – list of property names to check for empty values.

dropEntries(ids: Iterable[str], ignore_missing: bool = False)[source]

Drop entries from the data set by their IDs.

Parameters:
  • ids (Iterable[str]) – IDs of entries to drop.

  • ignore_missing (bool) – If True, missing IDs are ignored.

classmethod fromFile(filename: str) Any

Initialize a new instance from a JSON file.

Parameters:

filename (str) – path to the JSON file

Returns:

new instance of the class

Return type:

instance (object)

classmethod fromJSON(json: str) Any

Reconstruct object from a JSON string.

Parameters:

json (str) – JSON string of the object

Returns:

reconstructed object

Return type:

obj (object)
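
The relationship between toJSON, fromJSON, toFile and fromFile can be illustrated with a small stand-alone class. TinyRecord is hypothetical and stores only two fields; it mirrors the method names documented here but says nothing about how QSPRpred lays out its own JSON.

```python
import json
import os
import tempfile


class TinyRecord:
    """Hypothetical class following the toJSON/fromJSON/toFile/fromFile pattern."""

    def __init__(self, name: str, random_state: int):
        self.name = name
        self.random_state = random_state

    def toJSON(self) -> str:
        # Everything needed to rebuild the object goes into the string.
        return json.dumps({"name": self.name, "random_state": self.random_state})

    @classmethod
    def fromJSON(cls, json_str: str) -> "TinyRecord":
        return cls(**json.loads(json_str))

    def toFile(self, filename: str) -> str:
        with open(filename, "w", encoding="utf-8") as handle:
            handle.write(self.toJSON())
        return filename

    @classmethod
    def fromFile(cls, filename: str) -> "TinyRecord":
        with open(filename, encoding="utf-8") as handle:
            return cls.fromJSON(handle.read())


with tempfile.TemporaryDirectory() as tmp_dir:
    path = TinyRecord("demo", 42).toFile(os.path.join(tmp_dir, "demo_meta.json"))
    restored = TinyRecord.fromFile(path)
```

The round trip works because toJSON writes every attribute that the constructor needs.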

generateIndex(name: str | None = None, prefix: str | None = None)[source]

Generate a custom index for the data frame automatically.

Parameters:
  • name (str | None) – name of the resulting index column.

  • prefix (str | None) – prefix to use for the index column values.

getDF()[source]

Get the data frame this instance manages.

Returns:

The data frame this instance manages.

Return type:

pd.DataFrame

getProperties() list[str][source]

Get names of all properties/variables saved in the data frame (all columns).

Returns:

list of property names.

Return type:

(list[str])

getProperty(name: str, ids: tuple[str] | None = None, ignore_missing: bool = False) Series[source]

Get property values from the data set.

Parameters:
  • name (str) – Name of the property to get.

  • ids (tuple[str] | None) – IDs of entries to get properties for.

  • ignore_missing (bool) – If True, missing IDs are ignored.

Returns:

List of values for the property.

Return type:

(pd.Series)

getSubset(properties: list[str], ids: list[str] | None = None, name: str | None = None, path: str | None = None, ignore_missing: bool = False) PandasDataTable[source]

Get a subset of the data set by providing a prefix for the column names or a column name directly.

Parameters:
  • properties (list[str]) – list of property names to get.

  • ids (list[str] | None) – IDs of entries to get subset of properties for.

  • name (str) – Name of the new data set.

  • path (str) – Path to save the new data set.

  • ignore_missing (bool) – If True, missing IDs are ignored.

Returns:

A new data set containing the subset of the properties

Return type:

(PandasDataTable)

hasProperty(name: str) bool[source]

Check whether a property is present in the data frame.

Parameters:

name (str) – Name of the property.

Returns:

Whether the property is present.

Return type:

bool

property idProp: str

Column name to use for automatically generated IDs.

iterChunks(size: int | None = None, on_props: tuple[str] | None = None, as_dict: bool = False) Generator[DataFrame | dict, None, None][source]

Batch a data frame into chunks of the given size.

Parameters:
  • size (int) – Size of chunks to use per job in parallel processing. If None, self.chunkSize is used.

  • on_props (list[str]) – list of properties to include, if None, all properties are included.

  • as_dict (bool) – If True, the generator yields dictionaries instead of data frames.

Returns:

Generator that yields batches of the data frame as smaller data frames.

Return type:

Generator[pd.DataFrame, None, None]

property metaFile

The path to the meta file of this data set.

property nJobs

Number of jobs to use for parallel processing.

property name: str

Name of the data set.

property randomState: int

Random state to use for all random operations for reproducibility.

reload()[source]

Reload the data table from disk.

removeProperty(name)[source]

Remove a property from the data frame.

Parameters:

name (str) – Name of the property to delete.

save() str[source]

Save the data frame to disk and all associated files.

Returns:

Path to the saved data frame.

Return type:

(str)

searchOnProperty(prop_name: str, values: list[str], exact: bool = False) PandasDataTable[source]

Search the molecules within this MoleculeDataSet on a property value and return the appropriate subset.

Parameters:
  • prop_name (str) – Name of the column to search on.

  • values (list[str]) – Values to search for.

  • exact (bool) – Whether to search for exact matches or not.

Returns:

A data set with the molecules that match the search.

Return type:

(PandasDataTable)

setIndex(cols: list[str])[source]

Create an index column from several columns of the data set. This also resets the idProp attribute to be the name of the index columns joined by a ‘~’ character. The values of the columns are also joined in the same way to create the index. Thus, make sure the values of the columns are unique together and can be joined to a string.

Parameters:

cols (list[str]) – list of columns to use as index.
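
The ‘~’ joining described for setIndex can be sketched on plain dictionaries. join_index is a hypothetical helper, the column names and values are made up, and the explicit uniqueness check is an addition of this sketch that reflects the advice above; whether the library itself raises on duplicates is not stated here.

```python
def join_index(rows: list[dict], cols: list[str], sep: str = "~") -> tuple[str, list[str]]:
    """Return the joined index name and one joined ID per row."""
    index_name = sep.join(cols)
    ids = [sep.join(str(row[col]) for col in cols) for row in rows]
    if len(set(ids)) != len(ids):
        raise ValueError("combined column values are not unique")
    return index_name, ids


# Hypothetical rows; each column alone has duplicates, the pair does not.
rows = [
    {"accession": "P29274", "compound": "CHEMBL1"},
    {"accession": "P29274", "compound": "CHEMBL2"},
    {"accession": "P30542", "compound": "CHEMBL1"},
]
index_name, ids = join_index(rows, ["accession", "compound"])
```

Neither column is unique on its own in this example, but the joined values are, which is the situation the method is meant for.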

shuffle(random_state: int | None = None)[source]

Shuffle the internal data frame.

Parameters:

random_state (int | None) – Random state to use for shuffling. If None, the random state of the data set is used.

property storeDir

The data set folder containing the data set files after saving.

property storePath

The path to the main data set file.

property storePrefix

The prefix of the data set files.

toFile(filename: str) str[source]

Save the metafile and all associated files to a custom location.

Parameters:

filename (str) – absolute path to the saved metafile.

Returns:

Path to the saved data frame.

Return type:

(str)

toJSON() str

Serialize object to a JSON string. This JSON string should contain all data necessary to reconstruct the object.

Returns:

JSON string of the object

Return type:

json (str)

transformProperties(names: list[str], transformer: Callable)[source]

Transform property values using a transformer function.

Parameters:
  • names (list[str]) – list of column names to transform.

  • transformer (Callable) – Function that transforms the data in target columns to a new representation.

qsprpred.data.tables.qspr module

class qsprpred.data.tables.qspr.QSPRTable(storage: ChemStore | None = None, name: str | None = None, target_props: list[TargetSpec | dict] | None = None, path: str = '.', random_state: int | None = None, store_format: str = 'pkl', drop_empty_target_props: bool = True)[source]

Bases: QSPRDataSet, MoleculeTable

Implementation of QSPRDataSet using a collection of PandasDataTable objects.

Variables:

targetProperties (str) – property to be predicted with QSPRmodel

Construct QSPRdata, also apply transformations of output property if specified.

Parameters:
  • storage (ChemStore | None) – storage object to use for saving the data. Defaults to None.

  • name (str) – data name, used in saving the data

  • target_props (list[TargetSpec | dict] | None) – target properties, names should correspond with target column names in df. If None, target specifications will be inferred if this data set has been saved previously. Defaults to None.

  • path (str, optional) – path to the directory where the data set will be saved. Defaults to “.”.

  • random_state (int, optional) – random state for splitting the data.

  • store_format (str, optional) – format to use for storing the data (‘pkl’ or ‘csv’).

  • drop_empty_target_props (bool, optional) – whether to ignore entries with empty target properties. Defaults to True.

Raises:

ValueError – Raised if threshold given with non-classification task.

addClusters(clusters: list[MoleculeClusters], recalculate: bool = False)

Add clusters to the data frame.

A new column is created that contains the identifier of the corresponding cluster calculator.

Parameters:
  • clusters (list) – list of MoleculeClusters calculators.

  • recalculate (bool) – Whether to recalculate clusters even if they are already present in the data frame.

addDescriptors(descriptors: list[DescriptorSet], recalculate: bool = False, *args, **kwargs)

Add descriptors to the data frame with the given descriptor calculators.

Parameters:
  • descriptors (list[DescriptorSet]) – List of DescriptorSet objects to use for descriptor calculation.

  • recalculate (bool) – Whether to recalculate descriptors even if they are already present in the data frame. If False, existing descriptors are kept and no calculation takes place.

  • *args – Additional positional arguments to pass to each descriptor set.

  • **kwargs – Additional keyword arguments to pass to each descriptor set.

addEntries(ids: list[str], props: dict[str, list], raise_on_existing: bool = True)

Add entries to the data set.

Parameters:
  • ids (list[str]) – IDs of the entries to add.

  • props (dict[str, list]) – Properties to add.

  • raise_on_existing (bool) – Whether to raise an error if the entries already exist.

Raises:

NotImplementedError – Adding entries is not yet available for the data set.

addProperty(name: str, data: Sized, ids: list[str] | None = None)

Add a property to the data frame.

Parameters:
  • name (str) – Name of the property.

  • data (Sized) – Property values.

  • ids (list[str], optional) – IDs of the molecules to add the property for.

Returns:

Whether the property was added successfully.

Return type:

(bool)

addScaffolds(scaffolds: list[Scaffold], add_rdkit_scaffold: bool = False, recalculate: bool = False)

Add scaffolds to the data frame.

A new column is created that contains the SMILES of the corresponding scaffold. If add_rdkit_scaffold is set to True, a new column is created that contains the RDKit scaffold of the corresponding molecule.

Parameters:
  • scaffolds (list) – list of Scaffold calculators.

  • add_rdkit_scaffold (bool) – Whether to add the RDKit scaffold of the molecule as a new column.

  • recalculate (bool) – Whether to recalculate scaffolds even if they are already present in the data frame.

addSplit(split: DataSplit, name: str)[source]

Add a split to the dataset.

Performs the split and stores the split object and the indices of the split. If the split has a random state, it will be set to the random state of the dataset if it is not set.

Parameters:
  • split (DataSplit) – split to add

  • name (str) – name of the split

addTargetProperty(target_spec: TargetSpec | dict, drop_empty: bool = True)[source]

Add a target property to the dataset.

Parameters:
  • target_spec (TargetSpec | dict) – target property specification to add or dictionary to initialize a TargetSpec

  • drop_empty (bool) – whether to drop rows with empty target property values. Defaults to True.

apply(func: callable, func_args: list | None = None, func_kwargs: dict | None = None, on_props: tuple[str, ...] | None = None, chunk_type: Literal['mol', 'smiles', 'rdkit', 'df'] = 'mol') Generator[Iterable[Any], None, None]

Apply a function to the data set.

Parameters:
  • func (callable) – Function to apply.

  • func_args (list, optional) – Positional arguments to pass to the function.

  • func_kwargs (dict, optional) – Keyword arguments to pass to the function.

  • on_props (tuple[str, ...], optional) – Properties to apply the function on.

  • chunk_type (Literal["mol", "smiles", "rdkit", "df"], optional) – Type of chunks to use for processing.

Returns:

Generator of the results.

Return type:

(Generator[Iterable[Any], None, None])

applyIdentifier(identifier: ChemIdentifier)

Apply an identifier to the data set.

Parameters:

identifier (ChemIdentifier) – Identifier to apply.

applyStandardizer(standardizer: ChemStandardizer)

Apply a standardizer to the data set.

Parameters:

standardizer (ChemStandardizer) – Standardizer to apply.

attachDescriptors(calculator: DescriptorSet, descriptors: DataFrame, index_cols: list)

Attach descriptors to the data frame.

Parameters:
  • calculator (DescriptorSet) – DescriptorSet object used for descriptor calculation.

  • descriptors (pd.DataFrame) – DataFrame containing the descriptors to attach.

  • index_cols (list) – List of column names to use as index.

checkClassification(target_property: str) bool[source]

Checks the validity of the target property for classification tasks.

Parameters:

target_property (str) – Name of the target property to use for classification

Returns:

True if the target property is correctly set up for classification, False otherwise.

Return type:

bool

property chunkSize: int

Get the size of chunks to use per job in parallel processing.

clear()

Clear the data set from memory and disk.

createScaffoldGroups(mols_per_group: int = 10)

Create scaffold groups.

A scaffold group is a list of molecules that share the same scaffold. New columns are created that contain the scaffold group ID and the scaffold group size.

Parameters:

mols_per_group (int) – Number of molecules per scaffold group.

property descriptorSets: list[DescriptorSet]

Get the descriptor calculators for this table.

property descsPath

dropDescriptorSets(descriptors: list[DescriptorSet | str], full_removal: bool = False)

Drop descriptors from the given sets from the data frame.

Parameters:
  • descriptors (list[DescriptorSet | str]) – List of DescriptorSet objects or their names. Name of a descriptor set corresponds to the result returned by its __str__ method.

  • full_removal (bool) – Whether to remove the descriptor data (will perform full removal). By default, a soft removal is performed by just rendering the descriptors inactive. A full removal will remove the descriptorSet from the dataset, including the saved files. It is not possible to restore a descriptorSet after a full removal.

Raises:

AssertionError – If the data set does not contain any descriptors.

dropDescriptors(descriptors: list[str])

Drop descriptors by name. Performs a simple feature selection by removing the given descriptor names from the data set.

Parameters:

descriptors (list[str]) – List of descriptor names to drop.

dropEmptyEntries(names: list[str])

Drop rows with missing values in the properties.

Parameters:

names (list[str]) – list of property names

dropEntries(ids: Iterable[str])

Drop entries from the data set.

Parameters:

ids (Iterable[str]) – IDs of the entries to drop.

filter(table_filters: list[Callable])[source]

Filter the data set using the given filters.

Parameters:

table_filters (list[DataFilter]) – list of filters to apply

classmethod fromDF(name: str, df: DataFrame, target_props: list[TargetSpec | dict], path: str = '.', smiles_col: str = 'SMILES', drop_empty_target_props: bool = True, **kwargs) QSPRTable[source]

Create QSPRTable from a pandas DataFrame.

Parameters:
  • name (str) – name of the data set

  • df (pd.DataFrame) – data frame containing the data

  • target_props (list[TargetSpec | dict]) – target properties to use

  • path (str) – path to the directory where the data set will be saved

  • smiles_col (str) – name of the column containing SMILES

  • drop_empty_target_props (bool, optional) – whether to drop rows with empty target property values. Defaults to True.

  • **kwargs – additional keyword arguments for MoleculeTable constructor

Returns:

created data set

Return type:

QSPRTable

classmethod fromFile(filename: str) Any

Initialize a new instance from a JSON file.

Parameters:

filename (str) – path to the JSON file

Returns:

new instance of the class

Return type:

instance (object)

classmethod fromJSON(json: str) Any

Reconstruct object from a JSON string.

Parameters:

json (str) – JSON string of the object

Returns:

reconstructed object

Return type:

obj (object)

classmethod fromMolTable(mol_table: MoleculeTable, target_props: list[TargetSpec | dict], *args, path: str = '.', name: str | None = None, **kwargs) QSPRTable[source]

Create QSPRTable from a MoleculeTable.

Parameters:
  • mol_table (MoleculeTable) – MoleculeTable to use as the data source

  • target_props (list) – list of target properties to use

  • *args – additional positional arguments to pass to the constructor of QSPRTable

  • path (str) – path to the directory where the data set will be saved

  • name (str) – name of the data set

  • **kwargs – additional keyword arguments to pass to the constructor of QSPRTable

Returns:

created data set

Return type:

QSPRTable

classmethod fromSDF(name: str, filename: str, smiles_prop: str, *args, **kwargs)[source]

Create QSPRTable from SDF file.

It is currently not implemented for QSPRTable, but you can convert from ‘MoleculeTable’ with the ‘fromMolTable’ method.

Parameters:
  • name (str) – name of the data set

  • filename (str) – path to the SDF file

  • smiles_prop (str) – name of the property in the SDF file containing SMILES

  • *args – additional arguments for QSPRTable constructor

  • **kwargs – additional keyword arguments for QSPRTable constructor

classmethod fromSMILES(name: str, smiles: list, path: str, *args, **kwargs)

Create a MoleculeTable instance from a list of SMILES sequences.

Parameters:
  • name (str) – Name of the data set.

  • smiles (list) – list of SMILES sequences.

  • path (str) – Path to the directory where the data set will be stored.

  • *args – Additional arguments to pass to the MoleculeTable constructor.

  • **kwargs – Additional keyword arguments to pass to the MoleculeTable constructor.

Returns:

The created data set.

Return type:

(MoleculeTable)

classmethod fromTableFile(name: str, filename: str, path: str, *args, sep: str = '\t', target_props: list[TargetSpec | dict] | None = None, **kwargs)[source]

Create QSPRTable from table file (e.g. CSV or TSV).

Parameters:
  • name (str) – name of the data set

  • filename (str) – path to the table file

  • path (str) – path to the directory where the data set will be saved

  • *args – additional arguments for MolTable constructor

  • sep (str, optional) – separator in the table file. Defaults to “\t”.

  • target_props (list[TargetSpec | dict], optional) – target properties to use. Defaults to None.

  • **kwargs – additional keyword arguments for MolTable constructor

Returns:

QSPRTable object

Return type:

QSPRTable

generateDescriptorDataSetName(ds_set: str | DescriptorSet, name: str | None = None) str

Generate a descriptor set name from a descriptor set.

Parameters:
  • ds_set (str | DescriptorSet) – Name of the descriptor set.

  • name (str) – Name of the data set.

Returns:

Name of the descriptor set.

Return type:

(str)

getClusterNames(clusters: list[MoleculeClusters] | None = None) list[str]

Get the names of the clusters in the data frame.

Parameters:

clusters (list) – List of cluster calculators of clusters to include

Returns:

List of cluster names.

Return type:

(list[str])

getClusters(clusters: list[MoleculeClusters] | None = None)

Get the subset of the data frame that contains only clusters.

Parameters:

clusters (list) – List of cluster calculators of clusters to include.

Returns:

Data frame containing only clusters.

Return type:

pd.DataFrame

getDF() DataFrame

Get the data frame of the data set.

getDescriptorNames() list[str]

Get the names of the descriptors present for molecules in this data set.

Returns:

list of descriptor names.

Return type:

(list[str])

getDescriptors(active_only: bool = True) DataFrame

Get the calculated descriptors as a pandas data frame.

Returns:

Data frame containing only descriptors.

Return type:

pd.DataFrame

getProperties() list[str]

Get the names of the properties in the data frame.

getProperty(name: str, ids: tuple[str] | None = None) Iterable[Any]

Get the property with the given name.

Parameters:
  • name (str) – Name of the property.

  • ids (tuple[str], optional) – IDs of the molecules to get the property for.

Returns:

Property values.

Return type:

(Iterable[Any])

getScaffoldGroups(scaffold_name: str, mol_per_group: int = 10) Series

Get the scaffold groups for a given combination of scaffold and number of molecules per scaffold group.

Parameters:
  • scaffold_name (str) – Name of the scaffold.

  • mol_per_group (int) – Number of molecules per scaffold group.

Returns:

Series containing the scaffold groups.

Return type:

(pd.Series)

getScaffoldNames(scaffolds: list[Scaffold] | None = None, include_mols: bool = False) list[str]

Get the names of the scaffolds in the data frame.

Parameters:
  • scaffolds (list) – List of scaffold calculators of scaffolds to include.

  • include_mols (bool) – Whether to include the RDKit scaffold columns as well.

Returns:

List of scaffold names.

Return type:

(list[str])

getScaffolds(scaffolds: list[Scaffold] | None = None, include_mols: bool = False) DataFrame

Get the subset of the data frame that contains only scaffolds.

Parameters:
  • scaffolds (list) – List of scaffold calculators of scaffolds to include.

  • include_mols (bool) – Whether to include the RDKit scaffold columns as well.

Returns:

Data frame containing only scaffolds.

Return type:

pd.DataFrame

getSplit(name: str, as_type: str = 'split') DataSplit | list[tuple[Index, Index]][source]

Get the split with the given name.

Parameters:
  • name (str) – name of the split

  • as_type (str) – Determines the type of output. Can be one of:

      • “split”: Returns a DataSplit object.

      • “ids”: Returns train and test indices.

Returns:

The split if as_type is “split”, or the train and test indices if as_type is “ids”.

Return type:

DataSplit | list[tuple[pd.Index, pd.Index]]

getSubset(subset: list[str], ids: list[str] | None = None, name: str | None = None, path: str = '.', **kwargs) QSPRTable[source]

Get a subset of the data set.

Parameters:
  • subset (list[str]) – list of columns to include in the subset

  • ids (list[str], optional) – list of IDs to include in the subset. Defaults to None.

  • name (str, optional) – name of the subset. Defaults to None.

  • path (str, optional) – path to the directory where the subset will be saved. Defaults to “.”.

  • **kwargs – additional keyword arguments for the constructor of QSPRTable.

Returns:

subset of the data set

Return type:

QSPRTable

getSummary() DataFrame

Get a summary of the data set.

Returns:

Summary of the data set.

Return type:

(pd.DataFrame)

Raises:

NotImplementedError – Summary not yet available for MoleculeTable.

getTarget(name: str | TargetSpec) Series[source]

Get the target property values for the given target property.

Parameters:

name (str | TargetSpec) – name or specification of the target property

Returns:

target property values

Return type:

(pd.Series)

getTargetPropertiesNames() list[str]

Get the names of the target properties.

Returns:

list of target property names

Return type:

(list[str])

getTargetSpec(name: str) TargetSpec[source]

Get the target specification of a single target property by its name.

Parameters:

name (str) – name of the target property

Returns:

target specification with the given name

Return type:

TargetSpec

Raises:

ValueError – if the target property with the given name is not found

getTargetSpecs(names: list | None) list[TargetSpec][source]

Get the target specifications with the given names.

Parameters:

names (list[str]) – name of the target properties

Returns:

list of target specifications

Return type:

(list[TargetSpec])

getTargets() DataFrame[source]

Get the target property values.

Returns:

target property values

Return type:

(pd.DataFrame)

property hasClusters: bool

Check whether the data frame contains clusters.

Returns:

Whether the data frame contains clusters.

Return type:

bool

hasDescriptors(descriptors: list[DescriptorSet | str] | None = None) bool | list[bool]

Check whether the data frame contains given descriptors.

Parameters:

descriptors (list[DescriptorSet | str] | None) – List of descriptor objects or prefixes of descriptors to check for. If None, all descriptors are checked and a single boolean is returned that indicates whether any descriptors are found.

Returns:

Whether the data frame contains the given descriptors.

Return type:

(bool | list[bool])

hasProperty(name: str) bool

Check whether a property is present in the data frame.

Parameters:

name (str) – Name of the property.

property hasScaffoldGroups: bool

Check whether the data frame contains scaffold groups.

Returns:

Whether the data frame contains scaffold groups.

Return type:

(bool)

property hasScaffolds: bool

Check whether the data frame contains scaffolds.

Returns:

Whether the data frame contains scaffolds.

Return type:

bool

property idProp: str

Get the name of the property that contains the molecule IDs.

property identifier: ChemIdentifier

Get the identifier to use for the data set.

property isMultiTask: bool

Check if the dataset contains multiple target properties.

Returns:

True if the dataset contains multiple target properties

Return type:

(bool)

iterChunks(size: int | None = None, on_props: list | None = None, chunk_type: Literal['mol', 'smiles', 'rdkit', 'df'] = 'mol') Generator[list[StoredMol], None, None]

Iterate over chunks of the data set.

Parameters:
  • size (int, optional) – Size of the chunks.

  • on_props (list, optional) – Properties to iterate over.

  • chunk_type (Literal["mol", "smiles", "rdkit", "df"], optional) – Type of chunks to use for processing.

Returns:

Generator of the chunks.

Return type:

(Generator[list[StoredMol], None, None])
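The chunking behaviour described above can be pictured with a small standard-library sketch. It is an illustration only, not QSPRpred's implementation: `iter_chunks` is a hypothetical helper, plain SMILES strings stand in for `StoredMol` objects, and how the library chooses a default when `size` is None is not covered.

```python
from __future__ import annotations

from itertools import islice
from typing import Iterable, Iterator


def iter_chunks(items: Iterable[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive lists of at most `size` items (hypothetical helper)."""
    if size <= 0:
        raise ValueError("size must be a positive integer")
    iterator = iter(items)
    # islice pulls up to `size` items per pass; an empty list means exhaustion.
    while chunk := list(islice(iterator, size)):
        yield chunk


# SMILES strings stand in for the molecules stored in a table.
smiles = ["CCO", "c1ccccc1", "CC(=O)O", "CN", "O"]
chunks = list(iter_chunks(smiles, size=2))
```

The last chunk may be shorter than `size`, which is the usual trade-off when the number of molecules is not a multiple of the chunk size.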

iterSplit(name: str, as_type: str = 'ids') Generator[tuple[Index, Index], None, None] | Generator[tuple[ndarray, ndarray, ndarray, ndarray], None, None] | Generator[tuple[DataFrame, DataFrame, DataFrame, DataFrame], None, None] | Generator[tuple[QSPRTable, QSPRTable], None, None][source]

Get the split with the given name.

Parameters:
  • name (str) – Name of the split.

  • as_type (str) – Determines the type of output. Can be one of:

      • “ids”: yields train and test indices.

      • “numpy”: yields train and test numpy arrays.

      • “pandas”: yields train and test pandas DataFrames.

      • “QSPRTable”: yields train and test QSPRTable objects.

Yields:
  • tuple[pd.Index, pd.Index] – train and test indices, if as_type is “ids”.

  • tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray] – train descriptors, train targets, test descriptors and test targets, if as_type is “numpy”.

  • tuple[pd.DataFrame, pd.DataFrame, pd.DataFrame, pd.DataFrame] – train descriptors, train targets, test descriptors and test targets, if as_type is “pandas”.

  • tuple[QSPRTable, QSPRTable] – train and test QSPRTable objects, if as_type is “QSPRTable”.

makeClassification(target_property: str, th: list[float] | None = None)[source]

Switch to classification task using the given threshold values.

Parameters:
  • target_property (str) – Name of target property to use for classification

  • th (list[float], optional) – list of threshold values. If not provided, it is assumed that the target property is already discretized and can be used for classification.
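To make the role of threshold values concrete, here is a minimal sketch of multi-class discretization that treats `th` as sorted bin edges. It is not taken from QSPRpred: `discretize` is a hypothetical helper, and the choice of half-open bins `[th[i], th[i+1])` is an assumption, since the boundary handling is not documented here.

```python
from __future__ import annotations

from bisect import bisect_right


def discretize(values: list[float], th: list[float]) -> list[int]:
    """Map each value to the index of the bin [th[i], th[i+1]) it falls into.

    Hypothetical helper: which edge of a bin is inclusive is an assumption.
    """
    if len(th) < 2 or sorted(th) != list(th):
        raise ValueError("th must hold at least two sorted bin edges")
    labels = []
    for value in values:
        if not th[0] <= value < th[-1]:
            raise ValueError(f"{value} lies outside the threshold range")
        # bisect_right counts the edges <= value; subtract 1 for a 0-based bin.
        labels.append(bisect_right(th, value) - 1)
    return labels


# Bin edges as used by the multi-class test fixture documented on this page.
classes = discretize([0, 1, 1.5, 2, 50], th=[-1, 1, 2, 100])
```

With four edges there are three classes, labelled 0 to 2 in this sketch.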

makeRegression(target_property: str)[source]

Switch to regression task using the given target property.

Parameters:

target_property (str) – name of the target property to use for regression

property metaFile: str

Get the path to the meta file of the data set.

property nJobs: int

Get the number of jobs to use for parallel processing.

property nTargetProperties: int

Get the number of target properties in the dataset.

property name: str

Get the name of the data set.

processMols(processor: MolProcessor, proc_args: tuple[Any, ...] | None = None, proc_kwargs: dict[str, Any] | None = None, mol_type: Literal['smiles', 'mol', 'rdkit'] = 'mol', add_props: Iterable[str] | None = None) Generator[Any, None, None]

Process molecules in the data set.

Parameters:
  • processor (MolProcessor) – Processor to use for molecule processing.

  • proc_args (tuple, optional) – Positional arguments to pass to the processor.

  • proc_kwargs (dict, optional) – Keyword arguments to pass to the processor.

  • mol_type (Literal["smiles", "mol", "rdkit"], optional) – Type of molecules to process.

  • add_props (Iterable[str], optional) – Additional properties to add to the data frame.

Returns:

Generator of the results.

Return type:

(Generator[Any, None, None])

property randomState: int

Get the random state to use for shuffling and other random ops.

reload()

Reload the data set from disk.

removeProperty(name: str) bool

Remove a property from the data frame.

Parameters:

name (str) – Name of the property.

Returns:

Whether the property was removed successfully.

Return type:

(bool)

restoreDescriptorSets(descriptors: list[DescriptorSet | str])

Restore descriptors that were previously removed.

Parameters:

descriptors (list[DescriptorSet | str]) – List of DescriptorSet objects or their names. Name of a descriptor set corresponds to the result returned by its __str__ method.

Raises:

ValueError – If any of the descriptors are not present in the data set.

restoreTargetProperty(prop: TargetSpec | str)[source]

Reset target property to its original value.

Parameters:

prop (TargetSpec | str) – target property to reset

sample(n: int, name: str | None = None, random_state: int | None = None) MoleculeTable

Sample n molecules from the table.

Parameters:
  • n (int) – Number of molecules to sample.

  • name (str) – Name of the new table. Defaults to the name of the old table, plus the _sampled suffix.

  • random_state (int) – Random state to use for shuffling and other random ops.

Returns:

A new table containing the sampled molecules.

Return type:

(MoleculeTable)

save()

Save the whole storage to disk.

searchOnProperty(prop_name: str, values: list[float | int | str], exact=False, name: str | None = None, path: str = '.') MoleculeTable

Search the data set based on a property.

Parameters:
  • prop_name (str) – Name of the property to search on.

  • values (list[float | int | str]) – Values to search for.

  • exact (bool) – Whether to perform an exact search.

  • name (str) – Name of the new table.

  • path (str) – Path to the directory where the new table will be stored.

Returns:

Data set containing the search results.

Return type:

(MoleculeTable)

searchWithSMARTS(patterns: list[str], operator: Literal['or', 'and'] = 'or', use_chirality: bool = False, name: str | None = None, path: str = '.') MoleculeTable

Search the data set with SMARTS patterns.

Parameters:
  • patterns (list[str]) – List of SMARTS patterns to search for.

  • operator (Literal["or", "and"]) – Operator to use for combining the patterns.

  • use_chirality (bool) – Whether to use chirality in the search.

  • name (str) – Name of the new table.

  • path (str) – Path to the directory where the new table will be stored.

Returns:

Data set containing the search results.

Return type:

(MoleculeTable)

setTargetProperties(target_props: list[TargetSpec | dict], drop_empty: bool = True)[source]

Set list of target properties for the dataset.

Parameters:
  • target_props (list[TargetSpec | dict]) – list of target properties specifications or dictionaries to initialize the TargetSpec objects from.

  • drop_empty (bool, optional) – whether to drop rows with empty target property values. Defaults to True.

property smiles: Generator[str, None, None]

Generator of SMILES strings of all molecules in the data set.

property smilesProp: str

Get the name of the property that contains the SMILES strings.

split(split: DataSplit) Generator[tuple[Index, Index], None, None][source]

Create folds from Descriptors and Targets. Can be used either for cross-validation, bootstrapping or train-test split.

Parameters:

split (DataSplit) – Split to apply to the data

Yields:

pd.Index, pd.Index – indices of the train and test set
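The shape of the yielded output, one pair of train and test indices per fold, can be pictured with a standard-library sketch. `k_fold_ids` is a hypothetical helper, and the interleaved fold assignment is an assumption chosen for brevity; it does not describe how any DataSplit in QSPRpred assigns molecules to folds.

```python
from __future__ import annotations

from typing import Iterator, Sequence


def k_fold_ids(
    ids: Sequence[str], n_folds: int
) -> Iterator[tuple[list[str], list[str]]]:
    """Yield (train_ids, test_ids) pairs, one per fold (hypothetical helper)."""
    if not 2 <= n_folds <= len(ids):
        raise ValueError("n_folds must be between 2 and the number of ids")
    for fold in range(n_folds):
        # Every n_folds-th id, starting at offset `fold`, forms the test set.
        test = list(ids[fold::n_folds])
        held_out = set(test)
        train = [i for i in ids if i not in held_out]
        yield train, test


ids = ["mol_0", "mol_1", "mol_2", "mol_3", "mol_4", "mol_5"]
folds = list(k_fold_ids(ids, n_folds=3))
```

A single train-test split is the special case of a generator that yields exactly one such pair.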

property standardizer: ChemStandardizer

Get the standardizer to use for the data set.

property targetProperties: list[TargetSpec]

Returns the specifications of target properties of the dataset.

toFile(filename: str)

Save the data set to a file.

Parameters:

filename (str) – Path to the file to save the data set to.

toJSON() str

Serialize object to a JSON string. This JSON string should contain all data necessary to reconstruct the object.

Returns:

JSON string of the object

Return type:

(str)

transformProperties(names: list[str], transformer: Callable[[Iterable[Any]], Iterable[Any]])

Transform the properties of the data frame.

Parameters:
  • names (list[str]) – List of property names to transform.

  • transformer (Callable) – Function to use for transformation.
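A transformer with the documented shape takes an iterable of values and returns an iterable of transformed values. The sketch below shows one such callable applied to named columns. A dict of lists stands in for the data frame, `transform_columns` is a hypothetical stand-in for the method, and whether the library transforms in place or column by column is an assumption.

```python
from __future__ import annotations

import math
from typing import Any, Callable, Iterable


def log10_transform(values: Iterable[Any]) -> list[float]:
    """Transformer with the documented shape: iterable in, iterable out."""
    return [math.log10(v) for v in values]


def transform_columns(
    table: dict[str, list[Any]],
    names: list[str],
    transformer: Callable[[Iterable[Any]], Iterable[Any]],
) -> dict[str, list[Any]]:
    """Apply `transformer` to each named column (hypothetical stand-in)."""
    result = dict(table)  # shallow copy; untouched columns are shared
    for name in names:
        result[name] = list(transformer(table[name]))
    return result


table = {"CL": [1.0, 10.0, 100.0], "HBD": [0, 1, 2]}
transformed = transform_columns(table, ["CL"], log10_transform)
```

Only the columns listed in `names` change; the others are passed through untouched.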

unsetTargetProperty(name: str | TargetSpec)[source]

Unset a target property. It will not remove it from the data set, but will make it unavailable for training.

Parameters:

name (str | TargetSpec) – name or specification of the target property to drop

qsprpred.data.tables.tests module

class qsprpred.data.tables.tests.TestApply(methodName='runTest')[source]

Bases: DataSetsPathMixIn, QSPRTestCase

Tests the apply method of the data set.

Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.

classmethod addClassCleanup(function, /, *args, **kwargs)

Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).

addCleanup(function, /, *args, **kwargs)

Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.

Cleanup items are called even if setUp fails (unlike tearDown).

addTypeEqualityFunc(typeobj, function)

Add a type specific assertEqual style function to compare a type.

This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.

Parameters:
  • typeobj – The data type to call this function on when both values are of the same type in assertEqual().

  • function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.
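As an illustration of registering a type-specific equality function, the sketch below defines a small `Point` class and a custom asserter for it. All names are hypothetical and chosen for the example.

```python
import unittest


class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)


class PointTest(unittest.TestCase):
    def setUp(self):
        # assertEqual dispatches here when both arguments are exactly Point.
        self.addTypeEqualityFunc(Point, self.assertPointEqual)

    def assertPointEqual(self, first, second, msg=None):
        if first != second:
            standard = f"Point({first.x}, {first.y}) != Point({second.x}, {second.y})"
            raise self.failureException(msg or standard)

    def test_message(self):
        with self.assertRaises(AssertionError) as cm:
            self.assertEqual(Point(1, 2), Point(1, 3))
        self.assertIn("Point(1, 2) != Point(1, 3)", str(cm.exception))


result = unittest.TestResult()
unittest.defaultTestLoader.loadTestsFromTestCase(PointTest).run(result)
```

Without the registered function, the failure message would fall back to the default object representations, which are far less readable.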

assertAlmostEqual(first, second, places=None, msg=None, delta=None)

Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.

Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).

If the two objects compare equal then they will automatically compare almost equal.
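The difference between `places` and `delta` is easiest to see side by side. The following self-contained test is a small illustration with made-up values.

```python
import unittest


class AlmostEqualTest(unittest.TestCase):
    def test_places(self):
        # round(abs(0.1 + 0.2 - 0.3), 7) == 0, so the default of 7 places passes.
        self.assertAlmostEqual(0.1 + 0.2, 0.3)
        # A difference of about 0.01 survives rounding to 2 places, so this fails.
        with self.assertRaises(AssertionError):
            self.assertAlmostEqual(1.00, 1.01, places=2)

    def test_delta(self):
        # delta is an absolute tolerance used instead of places.
        self.assertAlmostEqual(100.0, 100.4, delta=0.5)
        # Supplying both places and delta is rejected with a TypeError.
        with self.assertRaises(TypeError):
            self.assertAlmostEqual(1.0, 1.0001, places=3, delta=0.1)


result = unittest.TestResult()
unittest.defaultTestLoader.loadTestsFromTestCase(AlmostEqualTest).run(result)
```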

assertCountEqual(first, second, msg=None)

Asserts that two iterables have the same elements, the same number of times, without regard to order.

Equivalent to the following, but also works with sequences of unhashable objects:

    self.assertEqual(Counter(list(first)), Counter(list(second)))

Example:
  • [0, 1, 1] and [1, 0, 1] compare equal.

  • [0, 0, 1] and [0, 1] compare unequal.
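The two examples above, plus the case of unhashable elements, can be written as a runnable test. The class name is illustrative.

```python
import unittest
from collections import Counter


class CountEqualTest(unittest.TestCase):
    def test_order_is_ignored(self):
        self.assertCountEqual([0, 1, 1], [1, 0, 1])
        # Same comparison expressed with Counter for hashable elements.
        self.assertEqual(Counter([0, 1, 1]), Counter([1, 0, 1]))

    def test_multiplicity_matters(self):
        with self.assertRaises(AssertionError):
            self.assertCountEqual([0, 0, 1], [0, 1])

    def test_unhashable_elements(self):
        # Lists cannot be Counter keys, yet the assertion still compares them.
        self.assertCountEqual([[1], [2]], [[2], [1]])


result = unittest.TestResult()
unittest.defaultTestLoader.loadTestsFromTestCase(CountEqualTest).run(result)
```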

assertDictEqual(d1, d2, msg=None)

assertEndsWith(s, suffix, msg=None)

assertEqual(first, second, msg=None)

Fail if the two objects are unequal as determined by the ‘==’ operator.

assertFalse(expr, msg=None)

Check that the expression is false.

assertGreater(a, b, msg=None)

Just like self.assertTrue(a > b), but with a nicer default message.

assertGreaterEqual(a, b, msg=None)

Just like self.assertTrue(a >= b), but with a nicer default message.

assertHasAttr(obj, name, msg=None)

assertIn(member, container, msg=None)

Just like self.assertTrue(a in b), but with a nicer default message.

assertIs(expr1, expr2, msg=None)

Just like self.assertTrue(a is b), but with a nicer default message.

assertIsInstance(obj, cls, msg=None)

Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.

assertIsNone(obj, msg=None)

Same as self.assertTrue(obj is None), with a nicer default message.

assertIsNot(expr1, expr2, msg=None)

Just like self.assertTrue(a is not b), but with a nicer default message.

assertIsNotNone(obj, msg=None)

Included for symmetry with assertIsNone.

assertIsSubclass(cls, superclass, msg=None)

assertLess(a, b, msg=None)

Just like self.assertTrue(a < b), but with a nicer default message.

assertLessEqual(a, b, msg=None)

Just like self.assertTrue(a <= b), but with a nicer default message.

assertListEqual(list1, list2, msg=None)

A list-specific equality assertion.

Parameters:
  • list1 – The first list to compare.

  • list2 – The second list to compare.

  • msg – Optional message to use on failure instead of a list of differences.

assertLogs(logger=None, level=None)

Fail unless a log message of the given level or higher is emitted on logger or its children. If omitted, level defaults to INFO and logger defaults to the root logger.

This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.

Example:

with self.assertLogs('foo', level='INFO') as cm:
    logging.getLogger('foo').info('first message')
    logging.getLogger('foo.bar').error('second message')
self.assertEqual(cm.output, ['INFO:foo:first message',
                             'ERROR:foo.bar:second message'])

assertMultiLineEqual(first, second, msg=None)

Assert that two multi-line strings are equal.

assertNoLogs(logger=None, level=None)

Fail unless no log messages of the given level or higher are emitted on logger or its children.

This method must be used as a context manager.

assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)

Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.

Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).

Objects that are equal automatically fail.

assertNotEndsWith(s, suffix, msg=None)

assertNotEqual(first, second, msg=None)

Fail if the two objects are equal as determined by the ‘!=’ operator.

assertNotHasAttr(obj, name, msg=None)

assertNotIn(member, container, msg=None)

Just like self.assertTrue(a not in b), but with a nicer default message.

assertNotIsInstance(obj, cls, msg=None)

Included for symmetry with assertIsInstance.

assertNotIsSubclass(cls, superclass, msg=None)

assertNotRegex(text, unexpected_regex, msg=None)

Fail the test if the text matches the regular expression.

assertNotStartsWith(s, prefix, msg=None)

assertRaises(expected_exception, *args, **kwargs)

Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.

If called with the callable and arguments omitted, will return a context object used like this:

with self.assertRaises(SomeException):
    do_something()

An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.

The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:

with self.assertRaises(SomeException) as cm:
    do_something()
the_exception = cm.exception
self.assertEqual(the_exception.error_code, 3)

assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)

Asserts that the message in a raised exception matches a regex.

Parameters:
  • expected_exception – Exception class expected to be raised.

  • expected_regex – Regex (re.Pattern object or string) expected to be found in error message.

  • args – Function to be called and extra positional args.

  • kwargs – Extra kwargs.

  • msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.
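Both calling styles, context manager and callable with arguments, are shown in the sketch below. `parse_threshold` is a hypothetical function under test, invented for this example.

```python
import unittest


def parse_threshold(text):
    """Hypothetical function under test: convert text to a float threshold."""
    try:
        return float(text)
    except ValueError:
        raise ValueError(f"invalid threshold: {text!r}") from None


class RaisesRegexTest(unittest.TestCase):
    def test_context_manager(self):
        with self.assertRaisesRegex(ValueError, r"invalid threshold: 'abc'") as cm:
            parse_threshold("abc")
        # The caught exception stays available for further checks.
        self.assertIsInstance(cm.exception, ValueError)

    def test_callable_form(self):
        # The function and its positional arguments follow the regex.
        self.assertRaisesRegex(ValueError, "invalid threshold", parse_threshold, "x")


result = unittest.TestResult()
unittest.defaultTestLoader.loadTestsFromTestCase(RaisesRegexTest).run(result)
```

The regex is searched for within the string form of the exception, so it does not need to match the whole message.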

assertRegex(text, expected_regex, msg=None)

Fail the test unless the text matches the regular expression.

assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)

An equality assertion for ordered sequences (like lists and tuples).

For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.

Parameters:
  • seq1 – The first sequence to compare.

  • seq2 – The second sequence to compare.

  • seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.

  • msg – Optional message to use on failure instead of a list of differences.

assertSetEqual(set1, set2, msg=None)

A set-specific equality assertion.

Parameters:
  • set1 – The first set to compare.

  • set2 – The second set to compare.

  • msg – Optional message to use on failure instead of a list of differences.

assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).

assertStartsWith(s, prefix, msg=None)

assertTrue(expr, msg=None)

Check that the expression is true.

assertTupleEqual(tuple1, tuple2, msg=None)

A tuple-specific equality assertion.

Parameters:
  • tuple1 – The first tuple to compare.

  • tuple2 – The second tuple to compare.

  • msg – Optional message to use on failure instead of a list of differences.

assertWarns(expected_warning, *args, **kwargs)

Fail unless a warning of class expected_warning is triggered by the callable when invoked with the specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.

If called with the callable and arguments omitted, will return a context object used like this:

with self.assertWarns(SomeWarning):
    do_something()

An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.

The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:

with self.assertWarns(SomeWarning) as cm:
    do_something()
the_warning = cm.warning
self.assertEqual(the_warning.some_attribute, 147)

assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)

Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.

Parameters:
  • expected_warning – Warning class expected to be triggered.

  • expected_regex – Regex (re.Pattern object or string) expected to be found in error message.

  • args – Function to be called and extra positional args.

  • kwargs – Extra kwargs.

  • msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.

clearGenerated()

Remove the directories that are used for testing.

countTestCases()

createLargeMultitaskDataSet(name='QSPRDataset_multi_test', target_props=[{'name': 'HBD', 'task': TargetTasks.MULTICLASS, 'th': [-1, 1, 2, 100]}, {'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42)

Create a large dataset for testing purposes.

Parameters:
  • name (str) – name of the dataset

  • target_props (List of dicts or TargetProperty) – list of target properties

  • random_state (int) – random state to use for splitting and shuffling

Returns:

a QSPRDataSet object

Return type:

QSPRDataSet

createLargeTestDataSet(name='QSPRDataset_test_large', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42, n_jobs=1, chunk_size=None, drop_empty_target_props=True)

Create a large dataset for testing purposes.

Parameters:
  • name (str) – name of the dataset

  • target_props (List of dicts or TargetProperty) – list of target properties

  • random_state (int) – random state to use for splitting and shuffling

Returns:

a QSPRDataSet object

Return type:

QSPRDataSet

createSmallTestDataSet(name='QSPRDataset_test_small', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42, drop_empty_target_props=True)

Create a small dataset for testing purposes.

Parameters:
  • name (str) – name of the dataset

  • target_props (List of dicts or TargetProperty) – list of target properties

  • random_state (int) – random state to use for splitting and shuffling

Returns:

a QSPRDataSet object

Return type:

QSPRDataSet

createTestDataSetFromFrame(df, name='QSPRDataset_test', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=None, n_jobs=1, chunk_size=None, drop_empty_target_props=True)

Create a dataset for testing purposes from the given data frame.

Parameters:
  • df (pd.DataFrame) – data frame containing the dataset

  • name (str) – name of the dataset

  • target_props (List of dicts or TargetProperty) – list of target properties

  • random_state (int) – random state to use for splitting and shuffling

  • prep (dict) – dictionary containing preparation settings

  • n_jobs (int) – number of jobs to use for parallel processing

  • chunk_size (int) – size of chunks to use per job in parallel processing

Returns:

a QSPRDataSet object

Return type:

QSPRDataSet

debug()

Run the test without collecting errors in a TestResult

defaultTestResult()

classmethod doClassCleanups()

Execute all class cleanup functions. Normally called for you after tearDownClass.

doCleanups()

Execute all cleanup functions. Normally called for you after tearDown.

classmethod enterClassContext(cm)

Same as enterContext, but class-wide.

enterContext(cm)

Enters the supplied context manager.

If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.

fail(msg=None)

Fail immediately, with the given message.

failureException

alias of AssertionError

classmethod getAllDescriptorSets()

Return a list of (ideally) all available descriptor sets. For now they need to be added manually to the list below.

TODO: would be nice to create the list automatically by implementing a descriptor set registry that would hold all installed descriptor sets.

Returns:

list of DescriptorCalculator objects

Return type:

list

getBigDF()

Get a large data frame for testing purposes.

Returns:

a pandas.DataFrame containing the dataset

Return type:

pd.DataFrame

classmethod getDataPrepGrid()

Return a list of many possible combinations of descriptor calculators, splits, feature standardizers, feature filters and data filters. Again, this is not exhaustive, but should cover a lot of cases.

Returns:

a generator that yields tuples of all possible combinations as stated above, each tuple is defined as: (descriptor_calculator, split, feature_standardizer, feature_filters, data_filters)

Return type:

grid

classmethod getDefaultCalculatorCombo()

Makes a list of default descriptor calculators that can be used in tests.

It creates one calculator with only Morgan fingerprints, one with only RDKit descriptors, and one with both, to test behaviour with multiple descriptor sets. Override this method if you want to test with other descriptor sets and calculator combinations.

Returns:

list of created DescriptorCalculator objects

Return type:

list

static getDefaultPrep(add_imputer=None)

Return a dictionary with default preparation settings.

classmethod getPrepCombos()

Return a list of all possible preparation combinations as generated by getDataPrepGrid as well as their names. The generated list can be used to parameterize tests with the given named combinations.

Returns:

list of `list`s of all possible combinations of preparation

Return type:

list

getSmallDF()

Get a small data frame for testing purposes.

Returns:

a pandas.DataFrame containing the dataset

Return type:

pd.DataFrame

getStorage(df, name, n_jobs=1, chunk_size=None)

id()

longMessage = True

maxDiff = 640

static regularFunc(props, *args, **kwargs)[source]

run(result=None)

setUp()[source]

Hook method for setting up the test fixture before exercising it.

classmethod setUpClass()

Hook method for setting up class fixture before running tests in the class.

setUpPaths()

Create the directories that are used for testing.

shortDescription()

Returns a one-line description of the test, or None if no description has been provided.

The default implementation of this method returns the first line of the specified test method’s docstring.

skipTest(reason)

Skip this test.

subTest(msg=<object object>, **params)

Return a context manager that will run the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.

tearDown()

Remove all files and directories that are used for testing.

classmethod tearDownClass()

Hook method for deconstructing the class fixture after running all tests in the class.

testRegular = None

testRegular_0(**kw)

testRegular_1(**kw)

testRegular_2(**kw)

testRegular_3(**kw)

class qsprpred.data.tables.tests.TestDataSetPreProcessing(methodName='runTest')[source]

Bases: DataSetsPathMixIn, DataPrepCheckMixIn, QSPRTestCase

Test as many combinations of data sets and their preparation settings as possible. These tests can potentially run for a long time, so use the skip decorator if you want to skip them all to speed things up during development.

Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.

classmethod addClassCleanup(function, /, *args, **kwargs)

Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).

addCleanup(function, /, *args, **kwargs)

Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.

Cleanup items are called even if setUp fails (unlike tearDown).

addTypeEqualityFunc(typeobj, function)

Add a type specific assertEqual style function to compare a type.

This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.

Parameters:
  • typeobj – The data type to call this function on when both values are of the same type in assertEqual().

  • function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.

assertAlmostEqual(first, second, places=None, msg=None, delta=None)

Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.

Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).

If the two objects compare equal then they will automatically compare almost equal.

assertCountEqual(first, second, msg=None)

Asserts that two iterables have the same elements, the same number of times, without regard to order.

Equivalent to the following, but also works with sequences of unhashable objects:

    self.assertEqual(Counter(list(first)), Counter(list(second)))

Example:
  • [0, 1, 1] and [1, 0, 1] compare equal.

  • [0, 0, 1] and [0, 1] compare unequal.

assertDictEqual(d1, d2, msg=None)

assertEndsWith(s, suffix, msg=None)

assertEqual(first, second, msg=None)

Fail if the two objects are unequal as determined by the ‘==’ operator.

assertFalse(expr, msg=None)

Check that the expression is false.

assertGreater(a, b, msg=None)

Just like self.assertTrue(a > b), but with a nicer default message.

assertGreaterEqual(a, b, msg=None)

Just like self.assertTrue(a >= b), but with a nicer default message.

assertHasAttr(obj, name, msg=None)

assertIn(member, container, msg=None)

Just like self.assertTrue(a in b), but with a nicer default message.

assertIs(expr1, expr2, msg=None)

Just like self.assertTrue(a is b), but with a nicer default message.

assertIsInstance(obj, cls, msg=None)

Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.

assertIsNone(obj, msg=None)

Same as self.assertTrue(obj is None), with a nicer default message.

assertIsNot(expr1, expr2, msg=None)

Just like self.assertTrue(a is not b), but with a nicer default message.

assertIsNotNone(obj, msg=None)

Included for symmetry with assertIsNone.

assertIsSubclass(cls, superclass, msg=None)

assertLess(a, b, msg=None)

Just like self.assertTrue(a < b), but with a nicer default message.

assertLessEqual(a, b, msg=None)

Just like self.assertTrue(a <= b), but with a nicer default message.

assertListEqual(list1, list2, msg=None)

A list-specific equality assertion.

Parameters:
  • list1 – The first list to compare.

  • list2 – The second list to compare.

  • msg – Optional message to use on failure instead of a list of differences.

assertLogs(logger=None, level=None)

Fail unless a log message of the given level or higher is emitted on logger or its children. If omitted, level defaults to INFO and logger defaults to the root logger.

This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.

Example:

with self.assertLogs('foo', level='INFO') as cm:
    logging.getLogger('foo').info('first message')
    logging.getLogger('foo.bar').error('second message')
self.assertEqual(cm.output, ['INFO:foo:first message',
                             'ERROR:foo.bar:second message'])

assertMultiLineEqual(first, second, msg=None)

Assert that two multi-line strings are equal.

assertNoLogs(logger=None, level=None)

Fail unless no log messages of the given level or higher are emitted on logger or its children.

This method must be used as a context manager.

assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)

Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.

Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).

Objects that are equal automatically fail.

assertNotEndsWith(s, suffix, msg=None)

assertNotEqual(first, second, msg=None)

Fail if the two objects are equal as determined by the ‘!=’ operator.

assertNotHasAttr(obj, name, msg=None)

assertNotIn(member, container, msg=None)

Just like self.assertTrue(a not in b), but with a nicer default message.

assertNotIsInstance(obj, cls, msg=None)

Included for symmetry with assertIsInstance.

assertNotIsSubclass(cls, superclass, msg=None)

assertNotRegex(text, unexpected_regex, msg=None)

Fail the test if the text matches the regular expression.

assertNotStartsWith(s, prefix, msg=None)

assertRaises(expected_exception, *args, **kwargs)

Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.

If called with the callable and arguments omitted, will return a context object used like this:

with self.assertRaises(SomeException):
    do_something()

An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.

The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:

with self.assertRaises(SomeException) as cm:
    do_something()
the_exception = cm.exception
self.assertEqual(the_exception.error_code, 3)

assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)

Asserts that the message in a raised exception matches a regex.

Parameters:
  • expected_exception – Exception class expected to be raised.

  • expected_regex – Regex (re.Pattern object or string) expected to be found in error message.

  • args – Function to be called and extra positional args.

  • kwargs – Extra kwargs.

  • msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.

assertRegex(text, expected_regex, msg=None)

Fail the test unless the text matches the regular expression.

assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)

An equality assertion for ordered sequences (like lists and tuples).

For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.

Parameters:
  • seq1 – The first sequence to compare.

  • seq2 – The second sequence to compare.

  • seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.

  • msg – Optional message to use on failure instead of a list of differences.
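
As a small illustration of the seq_type parameter, the sketch below compares a list with a tuple holding the same items. It assumes the standard CPython unittest behaviour, where element-wise equal sequences of different types pass unless seq_type is given.

```python
import unittest


class SequenceEqualTest(unittest.TestCase):
    def test_mixed_types_pass_without_seq_type(self):
        # With seq_type=None, only the elements and lengths are compared.
        self.assertSequenceEqual([1, 2, 3], (1, 2, 3))

    def test_seq_type_is_enforced(self):
        # With seq_type=list, a tuple is rejected even though the items match.
        with self.assertRaises(self.failureException):
            self.assertSequenceEqual([1, 2, 3], (1, 2, 3), seq_type=list)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(SequenceEqualTest)
result = unittest.TestResult()
suite.run(result)
```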

assertSetEqual(set1, set2, msg=None)

A set-specific equality assertion.

Parameters:
  • set1 – The first set to compare.

  • set2 – The second set to compare.

  • msg – Optional message to use on failure instead of a list of differences.

assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).
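
The duck typing mentioned above means that a set and a frozenset can be compared directly. The following sketch also shows how the default failure message lists the differences; the values used are arbitrary examples.

```python
import unittest


class SetEqualTest(unittest.TestCase):
    def test_set_and_frozenset(self):
        # Both types provide difference(), so they compare as equal sets.
        self.assertSetEqual({"CL", "HBD"}, frozenset(["HBD", "CL"]))

    def test_failure_lists_differences(self):
        with self.assertRaises(AssertionError) as cm:
            self.assertSetEqual({1, 2}, {2, 3})
        message = str(cm.exception)
        # The default message reports the items unique to each side.
        self.assertIn("in the first set but not the second", message)
        self.assertIn("in the second set but not the first", message)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(SetEqualTest)
result = unittest.TestResult()
suite.run(result)
```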

assertStartsWith(s, prefix, msg=None)
assertTrue(expr, msg=None)

Check that the expression is true.

assertTupleEqual(tuple1, tuple2, msg=None)

A tuple-specific equality assertion.

Parameters:
  • tuple1 – The first tuple to compare.

  • tuple2 – The second tuple to compare.

  • msg – Optional message to use on failure instead of a list of differences.

assertWarns(expected_warning, *args, **kwargs)

Fail unless a warning of class expected_warning is triggered by the callable when invoked with specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.

If called with the callable and arguments omitted, will return a context object used like this:

with self.assertWarns(SomeWarning):
    do_something()

An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.

The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:

with self.assertWarns(SomeWarning) as cm:
    do_something()
the_warning = cm.warning
self.assertEqual(the_warning.some_attribute, 147)
assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)

Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.

Parameters:
  • expected_warning – Warning class expected to be triggered.

  • expected_regex – Regex (re.Pattern object or string) expected to be found in warning message.

  • args – Function to be called and extra positional args.

  • kwargs – Extra kwargs.

  • msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.
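
A minimal sketch of the context-manager form, using only the standard library. The old_api function is a hypothetical example that emits a DeprecationWarning.

```python
import unittest
import warnings


def old_api():
    """Hypothetical function that warns before returning a value."""
    warnings.warn("old_api() is deprecated, use new_api()", DeprecationWarning)
    return 42


class WarnsRegexTest(unittest.TestCase):
    def test_message_matches(self):
        # Only warnings of the given class whose message matches count.
        with self.assertWarnsRegex(DeprecationWarning, r"deprecated") as cm:
            value = old_api()
        self.assertEqual(value, 42)
        # The first matching warning is kept on the context object.
        self.assertIsInstance(cm.warning, DeprecationWarning)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(WarnsRegexTest)
result = unittest.TestResult()
suite.run(result)
```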

checkDescriptors(dataset: QSPRDataSet, target_props: list[dict | TargetSpec])

Check if information about descriptors is consistent in the data set. Checks if calculators are consistent with the descriptors contained in the data set. This is also tested before and after serialization.

Parameters:
  • dataset (QSPRDataSet) – The data set to check.

  • target_props (List of dicts or TargetProperty) – list of target properties

Raises:

AssertionError – If the consistency check fails.

checkFeatures(X_train, y_train, X_test=None, y_test=None)

Check if feature matrices are the correct type and shape and if the indices are consistent between features and targets. Also check that there is no overlap between the train and test indices if both are provided.

checkPrep(dataset: QSPRDataSet, pipeline: DatasetPipeline, split: DataSplit | None = None)

Check if the data preparation is consistent before and after reloading.

checkSplit(dataset: QSPRDataSet, name: str)

Check if the split has the data it should have after splitting.

clearGenerated()

Remove the directories that are used for testing.

countTestCases()
createLargeMultitaskDataSet(name='QSPRDataset_multi_test', target_props=[{'name': 'HBD', 'task': TargetTasks.MULTICLASS, 'th': [-1, 1, 2, 100]}, {'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42)

Create a large dataset for testing purposes.

Parameters:
  • name (str) – name of the dataset

  • target_props (List of dicts or TargetProperty) – list of target properties

  • random_state (int) – random state to use for splitting and shuffling

Returns:

a QSPRDataSet object

Return type:

QSPRDataSet

createLargeTestDataSet(name='QSPRDataset_test_large', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42, n_jobs=1, chunk_size=None, drop_empty_target_props=True)

Create a large dataset for testing purposes.

Parameters:
  • name (str) – name of the dataset

  • target_props (List of dicts or TargetProperty) – list of target properties

  • random_state (int) – random state to use for splitting and shuffling

Returns:

a QSPRDataSet object

Return type:

QSPRDataSet

createSmallTestDataSet(name='QSPRDataset_test_small', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42, drop_empty_target_props=True)

Create a small dataset for testing purposes.

Parameters:
  • name (str) – name of the dataset

  • target_props (List of dicts or TargetProperty) – list of target properties

  • random_state (int) – random state to use for splitting and shuffling

Returns:

a QSPRDataSet object

Return type:

QSPRDataSet

createTestDataSetFromFrame(df, name='QSPRDataset_test', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=None, n_jobs=1, chunk_size=None, drop_empty_target_props=True)

Create a dataset for testing purposes from the given data frame.

Parameters:
  • df (pd.DataFrame) – data frame containing the dataset

  • name (str) – name of the dataset

  • target_props (List of dicts or TargetProperty) – list of target properties

  • random_state (int) – random state to use for splitting and shuffling

  • prep (dict) – dictionary containing preparation settings

  • n_jobs (int) – number of jobs to use for parallel processing

  • chunk_size (int) – size of chunks to use per job in parallel processing

Returns:

a QSPRDataSet object

Return type:

QSPRDataSet

debug()

Run the test without collecting errors in a TestResult.

defaultTestResult()
classmethod doClassCleanups()

Execute all class cleanup functions. Normally called for you after tearDownClass.

doCleanups()

Execute all cleanup functions. Normally called for you after tearDown.

classmethod enterClassContext(cm)

Same as enterContext, but class-wide.

enterContext(cm)

Enters the supplied context manager.

If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.

fail(msg=None)

Fail immediately, with the given message.

failureException

alias of AssertionError

classmethod getAllDescriptorSets()

Return a list of (ideally) all available descriptor sets. For now they need to be added manually to the list below.

TODO: would be nice to create the list automatically by implementing a descriptor set registry that would hold all installed descriptor sets.

Returns:

list of DescriptorCalculator objects

Return type:

list

getBigDF()

Get a large data frame for testing purposes.

Returns:

a pandas.DataFrame containing the dataset

Return type:

pd.DataFrame

classmethod getDataPrepGrid()

Return a list of many possible combinations of descriptor calculators, splits, feature standardizers, feature filters and data filters. Again, this is not exhaustive, but should cover a lot of cases.

Returns:

a generator that yields tuples of all possible combinations as stated above, each tuple is defined as: (descriptor_calculator, split, feature_standardizer, feature_filters, data_filters)

Return type:

grid

classmethod getDefaultCalculatorCombo()

Makes a list of default descriptor calculators that can be used in tests.

It creates one calculator with only Morgan fingerprints, one with only RDKit descriptors, and one that combines both to test behaviour with multiple descriptor sets. Override this method if you want to test with other descriptor sets and calculator combinations.

Returns:

list of created DescriptorCalculator objects

Return type:

list

static getDefaultPrep(add_imputer=None)

Return a dictionary with default preparation settings.

classmethod getPrepCombos()

Return a list of all possible preparation combinations as generated by getDataPrepGrid as well as their names. The generated list can be used to parameterize tests with the given named combinations.

Returns:

list of lists of all possible preparation combinations

Return type:

list
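
How such named combinations could be built is sketched below. This is an illustration only, not the library's implementation: the classes are empty hypothetical stand-ins, and the naming rule (class names joined by underscores, with "None" for an omitted step) is inferred from the generated test names in this reference.

```python
import itertools


# Empty stand-ins for real preparation components (hypothetical).
class MorganFP:
    pass


class RandomSplit:
    pass


class StandardScaler:
    pass


def component_name(component):
    """Return the class name of a component, or 'None' if it is omitted."""
    return "None" if component is None else type(component).__name__


def named_combos(*option_lists):
    """Build [name, *components] for every combination of the options."""
    combos = []
    for combo in itertools.product(*option_lists):
        name = "_".join(component_name(c) for c in combo)
        combos.append([name, *combo])
    return combos


combos = named_combos(
    [MorganFP()],
    [None, RandomSplit()],
    [None, StandardScaler()],
)
names = [combo[0] for combo in combos]
```

Each inner list starts with its name, which is the shape a parameterization helper typically needs to label the generated tests.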

getSmallDF()

Get a small data frame for testing purposes.

Returns:

a pandas.DataFrame containing the dataset

Return type:

pd.DataFrame

getStorage(df, name, n_jobs=1, chunk_size=None)
id()
longMessage = True
maxDiff = 640
run(result=None)
setUp()[source]

Hook method for setting up the test fixture before exercising it.

classmethod setUpClass()

Hook method for setting up class fixture before running tests in the class.

setUpPaths()

Create the directories that are used for testing.

shortDescription()

Returns a one-line description of the test, or None if no description has been provided.

The default implementation of this method returns the first line of the specified test method’s docstring.

skipTest(reason)

Skip this test.

subTest(msg=<object object>, **params)

Return a context manager that will run the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.
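
The resume-after-failure behaviour can be seen in a short standard-library sketch. The test case is defined inside a function so that its deliberate failures stay contained in the returned TestResult.

```python
import unittest


def run_subtest_demo():
    """Run a test whose subtests partly fail and return the TestResult."""

    class EvenNumberTest(unittest.TestCase):
        def test_all_even(self):
            for n in (2, 3, 4, 5):
                # A failure here is recorded for this value of n only;
                # the loop then continues with the next value.
                with self.subTest(n=n):
                    self.assertEqual(n % 2, 0)

    result = unittest.TestResult()
    unittest.defaultTestLoader.loadTestsFromTestCase(EvenNumberTest).run(result)
    return result


result = run_subtest_demo()
failed_subtests = len(result.failures)
```

One test method runs, and each failing value of n is reported as its own failure entry.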

tearDown()

Remove all files and directories that are used for testing.

classmethod tearDownClass()

Hook method for deconstructing the class fixture after running all tests in the class.

testPrepCombos = None
testPrepCombos_00_MorganFP_None_None_None_None_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_None_None_None_None_None’, name=’MorganFP_None_None_None_None_None’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7f951dfab650>,), split=None, feature_standardizer=None, feature_filter=None, data_filter=None, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
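
One way to apply the skip decorator to a whole group of generated tests is to decorate a subclass, as in the sketch below. The class names are hypothetical stand-ins and the test bodies are placeholders; only unittest.skip itself is a real API here.

```python
import unittest


class PrepComboChecks(unittest.TestCase):
    """Hypothetical stand-in for a test case with many generated tests."""

    def test_combo_00(self):
        self.assertTrue(True)

    def test_combo_01(self):
        self.assertTrue(True)


@unittest.skip("preparation combinations are too slow for this run")
class SkippedPrepComboChecks(PrepComboChecks):
    """Inherits every generated test but reports all of them as skipped."""


suite = unittest.defaultTestLoader.loadTestsFromTestCase(SkippedPrepComboChecks)
result = unittest.TestResult()
suite.run(result)
skipped_count = len(result.skipped)
```

Decorating the subclass leaves the base class untouched, so the same tests can still run elsewhere.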

testPrepCombos_01_MorganFP_None_None_None_None_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_None_None_None_None_OutlierFilter’, name=’MorganFP_None_None_None_None_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7f951dfaba50>,), split=None, feature_standardizer=None, feature_filter=None, data_filter=None, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951eb5e350>].

testPrepCombos_02_MorganFP_None_None_None_RepeatsFilter_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_None_None_None_RepeatsFilter_None’, name=’MorganFP_None_None_None_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7f951e81a120>,), split=None, feature_standardizer=None, feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951eb5dd10>, applicability_domain=None].

testPrepCombos_03_MorganFP_None_None_None_RepeatsFilter_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_None_None_None_RepeatsFilter_OutlierFilter’, name=’MorganFP_None_None_None_RepeatsFilter_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7f951e81a210>,), split=None, feature_standardizer=None, feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951eb5e210>, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951eb5e5d0>].

testPrepCombos_04_MorganFP_None_None_HighCorrelationFilter_None_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_None_None_HighCorrelationFilter_None_None’, name=’MorganFP_None_None_HighCorrelationFilter_None_None’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7f951dabaa50>,), split=None, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951eb5e710>, data_filter=None, applicability_domain=None].

testPrepCombos_05_MorganFP_None_None_HighCorrelationFilter_None_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_None_None_HighCorrelationFilter_None_OutlierFilter’, name=’MorganFP_None_None_HighCorrelationFilter_None_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7f951dabab30>,), split=None, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951eb5e850>, data_filter=None, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951dfa0d60>].

testPrepCombos_06_MorganFP_None_None_HighCorrelationFilter_RepeatsFilter_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_None_None_HighCorrelationFilter_RepeatsFilter_None’, name=’MorganFP_None_None_HighCorrelationFilter_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7f951db228f0>,), split=None, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951dfa0e90>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951dfa0fc0>, applicability_domain=None].

testPrepCombos_07_MorganFP_None_None_HighCorrelationFilter_RepeatsFilter_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_None_None_HighCorrelat…ter_RepeatsFilter_OutlierFilter’, name=’MorganFP_None_None_HighCorrelat…ter_RepeatsFilter_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7f951eb81790>,), split=None, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951dfa10f0>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951dfa1220>, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951dfa1350>].

testPrepCombos_08_MorganFP_None_StandardScaler_None_None_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_None_StandardScaler_None_None_None’, name=’MorganFP_None_StandardScaler_None_None_None’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7f951e3107d0>,), split=None, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=None, applicability_domain=None].

testPrepCombos_09_MorganFP_None_StandardScaler_None_None_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_None_StandardScaler_None_None_OutlierFilter’, name=’MorganFP_None_StandardScaler_None_None_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7f951dff7750>,), split=None, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=None, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951f64f770>].

testPrepCombos_10_MorganFP_None_StandardScaler_None_RepeatsFilter_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_None_StandardScaler_None_RepeatsFilter_None’, name=’MorganFP_None_StandardScaler_None_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7f951dff7800>,), split=None, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951daec710>, applicability_domain=None].

testPrepCombos_11_MorganFP_None_StandardScaler_None_RepeatsFilter_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_None_StandardScaler_None_RepeatsFilter_OutlierFilter’, name=’MorganFP_None_StandardScaler_None_RepeatsFilter_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7f951dcb6530>,), split=None, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951dfc57b0>, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951dfc56a0>].

testPrepCombos_12_MorganFP_None_StandardScaler_HighCorrelationFilter_None_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_None_StandardScaler_HighCorrelationFilter_None_None’, name=’MorganFP_None_StandardScaler_HighCorrelationFilter_None_None’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7f951dce6c10>,), split=None, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951daec830>, data_filter=None, applicability_domain=None].

testPrepCombos_13_MorganFP_None_StandardScaler_HighCorrelationFilter_None_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_None_StandardScaler_Hi…lationFilter_None_OutlierFilter’, name=’MorganFP_None_StandardScaler_Hi…lationFilter_None_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7f951da16f00>,), split=None, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951dfc5480>, data_filter=None, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951dfc5370>].

testPrepCombos_14_MorganFP_None_StandardScaler_HighCorrelationFilter_RepeatsFilter_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_None_StandardScaler_Hi…lationFilter_RepeatsFilter_None’, name=’MorganFP_None_StandardScaler_Hi…lationFilter_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7f951dd5c250>,), split=None, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951dfc5260>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951dfc5150>, applicability_domain=None].

testPrepCombos_15_MorganFP_None_StandardScaler_HighCorrelationFilter_RepeatsFilter_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_None_StandardScaler_Hi…ter_RepeatsFilter_OutlierFilter’, name=’MorganFP_None_StandardScaler_Hi…ter_RepeatsFilter_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7f951da2b750>,), split=None, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951dae4550>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951dae4650>, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951dae4750>].

testPrepCombos_16_MorganFP_RandomSplit_None_None_None_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RandomSplit_None_None_None_None’, name=’MorganFP_RandomSplit_None_None_None_None’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7f951ddc4b50>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951eb5e990>, feature_standardizer=None, feature_filter=None, data_filter=None, applicability_domain=None].

testPrepCombos_17_MorganFP_RandomSplit_None_None_None_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RandomSplit_None_None_None_OutlierFilter’, name=’MorganFP_RandomSplit_None_None_None_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7f951dd5f850>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951eb5ead0>, feature_standardizer=None, feature_filter=None, data_filter=None, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951dae4850>].

testPrepCombos_18_MorganFP_RandomSplit_None_None_RepeatsFilter_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RandomSplit_None_None_RepeatsFilter_None’, name=’MorganFP_RandomSplit_None_None_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7f951da09e50>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951dfa16e0>, feature_standardizer=None, feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951dae4950>, applicability_domain=None].

testPrepCombos_19_MorganFP_RandomSplit_None_None_RepeatsFilter_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RandomSplit_None_None_RepeatsFilter_OutlierFilter’, name=’MorganFP_RandomSplit_None_None_RepeatsFilter_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7f951dbe4f50>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951dfa1810>, feature_standardizer=None, feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951e81a3f0>, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951e81a5d0>].

testPrepCombos_20_MorganFP_RandomSplit_None_HighCorrelationFilter_None_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RandomSplit_None_HighCorrelationFilter_None_None’, name=’MorganFP_RandomSplit_None_HighCorrelationFilter_None_None’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7f951dbe5050>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951daec950>, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951dae4a50>, data_filter=None, applicability_domain=None].

testPrepCombos_21_MorganFP_RandomSplit_None_HighCorrelationFilter_None_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RandomSplit_None_HighC…lationFilter_None_OutlierFilter’, name=’MorganFP_RandomSplit_None_HighC…lationFilter_None_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7f951dbe4b50>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951dfc5040>, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951e81a6c0>, data_filter=None, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951e81a7b0>].

testPrepCombos_22_MorganFP_RandomSplit_None_HighCorrelationFilter_RepeatsFilter_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RandomSplit_None_HighC…lationFilter_RepeatsFilter_None’, name=’MorganFP_RandomSplit_None_HighC…lationFilter_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7f951dbe4c50>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951dfc4f30>, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951e81a8a0>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951e81a990>, applicability_domain=None].

testPrepCombos_23_MorganFP_RandomSplit_None_HighCorrelationFilter_RepeatsFilter_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RandomSplit_None_HighC…ter_RepeatsFilter_OutlierFilter’, name=’MorganFP_RandomSplit_None_HighC…ter_RepeatsFilter_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7f951dbe42d0>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951dae4b50>, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951dabadd0>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951dabaeb0>, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951dabaf90>].

testPrepCombos_24_MorganFP_RandomSplit_StandardScaler_None_None_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RandomSplit_StandardScaler_None_None_None’, name=’MorganFP_RandomSplit_StandardScaler_None_None_None’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7f951dbe4cd0>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951dae4c50>, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=None, applicability_domain=None].

testPrepCombos_25_MorganFP_RandomSplit_StandardScaler_None_None_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RandomSplit_StandardScaler_None_None_OutlierFilter’, name=’MorganFP_RandomSplit_StandardScaler_None_None_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7f951dbe49d0>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951e81ab70>, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=None, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951dabb150>].

testPrepCombos_26_MorganFP_RandomSplit_StandardScaler_None_RepeatsFilter_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RandomSplit_StandardScaler_None_RepeatsFilter_None’, name=’MorganFP_RandomSplit_StandardScaler_None_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7f951dbe45d0>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951e81ac60>, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951dabb310>, applicability_domain=None].

testPrepCombos_27_MorganFP_RandomSplit_StandardScaler_None_RepeatsFilter_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RandomSplit_StandardSc…one_RepeatsFilter_OutlierFilter’, name=’MorganFP_RandomSplit_StandardSc…one_RepeatsFilter_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7f951dbe4950>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951dabb3f0>, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951db23380>, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951db23860>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_28_MorganFP_RandomSplit_StandardScaler_HighCorrelationFilter_None_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RandomSplit_StandardSc…HighCorrelationFilter_None_None’, name=’MorganFP_RandomSplit_StandardSc…HighCorrelationFilter_None_None’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7f951dbe40d0>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951dabb4d0>, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951dabb5b0>, data_filter=None, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_29_MorganFP_RandomSplit_StandardScaler_HighCorrelationFilter_None_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RandomSplit_StandardSc…lationFilter_None_OutlierFilter’, name=’MorganFP_RandomSplit_StandardSc…lationFilter_None_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7f951dbe46d0>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951dbae000>, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951dbac390>, data_filter=None, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951e3119d0>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_30_MorganFP_RandomSplit_StandardScaler_HighCorrelationFilter_RepeatsFilter_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RandomSplit_StandardSc…lationFilter_RepeatsFilter_None’, name=’MorganFP_RandomSplit_StandardSc…lationFilter_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7f951dbe4350>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951e311e50>, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951e3137d0>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951e312c90>, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_31_MorganFP_RandomSplit_StandardScaler_HighCorrelationFilter_RepeatsFilter_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RandomSplit_StandardSc…ter_RepeatsFilter_OutlierFilter’, name=’MorganFP_RandomSplit_StandardSc…ter_RepeatsFilter_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7f951dbcdb50>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951e312390>, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951e313a10>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951e311550>, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951da6c1d0>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_32_RDKitDescs_None_None_None_None_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_None_None_None_None_None’, name=’RDKitDescs_None_None_None_None_None’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7f951eb5ec10>,), split=None, feature_standardizer=None, feature_filter=None, data_filter=None, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
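The test names listed in this section follow a regular pattern: a running index followed by the option names joined with underscores. The sketch below reproduces that pattern with itertools.product. The option lists and their order are inferred from the names visible here, not taken from DataSetsPathMixIn.getPrepCombos(), so treat them as assumptions. combo_names and generated_test_name are hypothetical helpers.

```python
from itertools import product

# Option names as they appear in the generated test names in this section.
# The ordering is inferred from those names (assumption), not from the
# source of getPrepCombos().
OPTIONS = [
    ["MorganFP", "RDKitDescs", "MorganFP_RDKitDescs"],  # feature_calculators
    ["None", "RandomSplit"],                            # split
    ["None", "StandardScaler"],                         # feature_standardizer
    ["None", "HighCorrelationFilter"],                  # feature_filter
    ["None", "RepeatsFilter"],                          # data_filter
    ["None", "OutlierFilter"],                          # applicability_domain
]


def combo_names():
    """Return hypothetical combination names in Cartesian-product order."""
    return ["_".join(combo) for combo in product(*OPTIONS)]


def generated_test_name(index):
    """Build a test name of the form testPrepCombos_<index>_<combo name>."""
    return f"testPrepCombos_{index}_{combo_names()[index]}"
```

Under these assumptions, generated_test_name(32) gives testPrepCombos_32_RDKitDescs_None_None_None_None_None, which matches the entry with that index in this listing.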

testPrepCombos_33_RDKitDescs_None_None_None_None_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_None_None_None_None_OutlierFilter’, name=’RDKitDescs_None_None_None_None_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7f951eb5ee90>,), split=None, feature_standardizer=None, feature_filter=None, data_filter=None, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951dff7ac0>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_34_RDKitDescs_None_None_None_RepeatsFilter_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_None_None_None_RepeatsFilter_None’, name=’RDKitDescs_None_None_None_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7f951dfa1940>,), split=None, feature_standardizer=None, feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951dff7b70>, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_35_RDKitDescs_None_None_None_RepeatsFilter_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_None_None_None_RepeatsFilter_OutlierFilter’, name=’RDKitDescs_None_None_None_RepeatsFilter_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7f951dfa1a70>,), split=None, feature_standardizer=None, feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951dff7c20>, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951dff7cd0>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_36_RDKitDescs_None_None_HighCorrelationFilter_None_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_None_None_HighCorrelationFilter_None_None’, name=’RDKitDescs_None_None_HighCorrelationFilter_None_None’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7f951f64f890>,), split=None, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951dff7d80>, data_filter=None, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_37_RDKitDescs_None_None_HighCorrelationFilter_None_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_None_None_HighCorrelationFilter_None_OutlierFilter’, name=’RDKitDescs_None_None_HighCorrelationFilter_None_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7f951dfc4e20>,), split=None, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951dff7e30>, data_filter=None, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951dcb7ed0>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_38_RDKitDescs_None_None_HighCorrelationFilter_RepeatsFilter_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_None_None_HighCorrelationFilter_RepeatsFilter_None’, name=’RDKitDescs_None_None_HighCorrelationFilter_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7f951dfc4d10>,), split=None, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951df26170>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951df26df0>, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_39_RDKitDescs_None_None_HighCorrelationFilter_RepeatsFilter_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_None_None_HighCorrel…ter_RepeatsFilter_OutlierFilter’, name=’RDKitDescs_None_None_HighCorrel…ter_RepeatsFilter_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7f951dae4d50>,), split=None, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951da749b0>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951da74690>, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951dce4550>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_40_RDKitDescs_None_StandardScaler_None_None_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_None_StandardScaler_None_None_None’, name=’RDKitDescs_None_StandardScaler_None_None_None’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7f951dae4e50>,), split=None, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=None, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_41_RDKitDescs_None_StandardScaler_None_None_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_None_StandardScaler_None_None_OutlierFilter’, name=’RDKitDescs_None_StandardScaler_None_None_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7f951e81ad50>,), split=None, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=None, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951da17c80>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_42_RDKitDescs_None_StandardScaler_None_RepeatsFilter_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_None_StandardScaler_None_RepeatsFilter_None’, name=’RDKitDescs_None_StandardScaler_None_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7f951e81ae40>,), split=None, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951da17da0>, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_43_RDKitDescs_None_StandardScaler_None_RepeatsFilter_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_None_StandardScaler_…one_RepeatsFilter_OutlierFilter’, name=’RDKitDescs_None_StandardScaler_…one_RepeatsFilter_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7f951dabb690>,), split=None, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951dbe4850>, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951dba9250>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_44_RDKitDescs_None_StandardScaler_HighCorrelationFilter_None_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_None_StandardScaler_HighCorrelationFilter_None_None’, name=’RDKitDescs_None_StandardScaler_HighCorrelationFilter_None_None’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7f951dabb770>,), split=None, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951da17e30>, data_filter=None, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_45_RDKitDescs_None_StandardScaler_HighCorrelationFilter_None_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_None_StandardScaler_…lationFilter_None_OutlierFilter’, name=’RDKitDescs_None_StandardScaler_…lationFilter_None_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7f951db22c30>,), split=None, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951dbaba50>, data_filter=None, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951dba91d0>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_46_RDKitDescs_None_StandardScaler_HighCorrelationFilter_RepeatsFilter_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_None_StandardScaler_…lationFilter_RepeatsFilter_None’, name=’RDKitDescs_None_StandardScaler_…lationFilter_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7f951da6c290>,), split=None, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951dbab650>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951dba9450>, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_47_RDKitDescs_None_StandardScaler_HighCorrelationFilter_RepeatsFilter_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_None_StandardScaler_…ter_RepeatsFilter_OutlierFilter’, name=’RDKitDescs_None_StandardScaler_…ter_RepeatsFilter_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7f951da6c350>,), split=None, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951da4b850>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951da4b8c0>, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951da4b930>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_48_RDKitDescs_RandomSplit_None_None_None_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_RandomSplit_None_None_None_None’, name=’RDKitDescs_RandomSplit_None_None_None_None’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7f951dff7ee0>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951daa4050>, feature_standardizer=None, feature_filter=None, data_filter=None, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_49_RDKitDescs_RandomSplit_None_None_None_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_RandomSplit_None_None_None_OutlierFilter’, name=’RDKitDescs_RandomSplit_None_None_None_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7f951daa4100>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951daa41b0>, feature_standardizer=None, feature_filter=None, data_filter=None, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951da4b9a0>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_50_RDKitDescs_RandomSplit_None_None_RepeatsFilter_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_RandomSplit_None_None_RepeatsFilter_None’, name=’RDKitDescs_RandomSplit_None_None_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7f951da74410>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951da74370>, feature_standardizer=None, feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951da4ba10>, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_51_RDKitDescs_RandomSplit_None_None_RepeatsFilter_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_RandomSplit_None_None_RepeatsFilter_OutlierFilter’, name=’RDKitDescs_RandomSplit_None_None_RepeatsFilter_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7f951da742d0>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951da74230>, feature_standardizer=None, feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951da4baf0>, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951dae9130>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_52_RDKitDescs_RandomSplit_None_HighCorrelationFilter_None_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_RandomSplit_None_HighCorrelationFilter_None_None’, name=’RDKitDescs_RandomSplit_None_HighCorrelationFilter_None_None’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7f951da17ec0>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951da17f50>, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951da4bbd0>, data_filter=None, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_53_RDKitDescs_RandomSplit_None_HighCorrelationFilter_None_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_RandomSplit_None_Hig…lationFilter_None_OutlierFilter’, name=’RDKitDescs_RandomSplit_None_Hig…lationFilter_None_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7f951db82f50>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951dbab2d0>, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951dae91f0>, data_filter=None, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951dae9250>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_54_RDKitDescs_RandomSplit_None_HighCorrelationFilter_RepeatsFilter_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_RandomSplit_None_Hig…lationFilter_RepeatsFilter_None’, name=’RDKitDescs_RandomSplit_None_Hig…lationFilter_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7f951dbab1d0>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951dbaa3d0>, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951dae92b0>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951da4bd90>, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_55_RDKitDescs_RandomSplit_None_HighCorrelationFilter_RepeatsFilter_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_RandomSplit_None_Hig…ter_RepeatsFilter_OutlierFilter’, name=’RDKitDescs_RandomSplit_None_Hig…ter_RepeatsFilter_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7f951da4bf50>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951daa0050>, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951dae9310>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951daa00c0>, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951dae9190>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_56_RDKitDescs_RandomSplit_StandardScaler_None_None_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_RandomSplit_StandardScaler_None_None_None’, name=’RDKitDescs_RandomSplit_StandardScaler_None_None_None’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7f951daa01a0>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951daa0130>, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=None, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_57_RDKitDescs_RandomSplit_StandardScaler_None_None_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_RandomSplit_StandardScaler_None_None_OutlierFilter’, name=’RDKitDescs_RandomSplit_StandardScaler_None_None_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7f951daa0280>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951dae9430>, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=None, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951dae9370>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_58_RDKitDescs_RandomSplit_StandardScaler_None_RepeatsFilter_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_RandomSplit_StandardScaler_None_RepeatsFilter_None’, name=’RDKitDescs_RandomSplit_StandardScaler_None_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7f951daa02f0>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951dae94f0>, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951daa0440>, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_59_RDKitDescs_RandomSplit_StandardScaler_None_RepeatsFilter_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_RandomSplit_Standard…one_RepeatsFilter_OutlierFilter’, name=’RDKitDescs_RandomSplit_Standard…one_RepeatsFilter_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7f951daa0520>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951dae9550>, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951daa0590>, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951dae9490>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_60_RDKitDescs_RandomSplit_StandardScaler_HighCorrelationFilter_None_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_RandomSplit_Standard…HighCorrelationFilter_None_None’, name=’RDKitDescs_RandomSplit_Standard…HighCorrelationFilter_None_None’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7f951daa0600>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951dae9610>, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951dae9670>, data_filter=None, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_61_RDKitDescs_RandomSplit_StandardScaler_HighCorrelationFilter_None_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_RandomSplit_Standard…lationFilter_None_OutlierFilter’, name=’RDKitDescs_RandomSplit_Standard…lationFilter_None_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7f951daa07c0>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951dae95b0>, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951dae96d0>, data_filter=None, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951dae9730>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_62_RDKitDescs_RandomSplit_StandardScaler_HighCorrelationFilter_RepeatsFilter_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_RandomSplit_Standard…lationFilter_RepeatsFilter_None’, name=’RDKitDescs_RandomSplit_Standard…lationFilter_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7f951daa08a0>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951dae97f0>, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951dae9850>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951daa0830>, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_63_RDKitDescs_RandomSplit_StandardScaler_HighCorrelationFilter_RepeatsFilter_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_RandomSplit_Standard…ter_RepeatsFilter_OutlierFilter’, name=’RDKitDescs_RandomSplit_Standard…ter_RepeatsFilter_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7f951daa0980>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951dae9790>, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951dae98b0>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951daa09f0>, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951dae9910>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_64_MorganFP_RDKitDescs_None_None_None_None_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_None_None_None_None_None’, name=’MorganFP_RDKitDescs_None_None_None_None_None’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7f951daa0ad0>), split=None, feature_standardizer=None, feature_filter=None, data_filter=None, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_65_MorganFP_RDKitDescs_None_None_None_None_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_None_None_None_None_OutlierFilter’, name=’MorganFP_RDKitDescs_None_None_None_None_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7f951daa0a60>), split=None, feature_standardizer=None, feature_filter=None, data_filter=None, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951dae9a30>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_66_MorganFP_RDKitDescs_None_None_None_RepeatsFilter_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_None_None_None_RepeatsFilter_None’, name=’MorganFP_RDKitDescs_None_None_None_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7f951daa0c20>), split=None, feature_standardizer=None, feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951daa0bb0>, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_67_MorganFP_RDKitDescs_None_None_None_RepeatsFilter_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_None_None_None_RepeatsFilter_OutlierFilter’, name=’MorganFP_RDKitDescs_None_None_None_RepeatsFilter_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7f951daa0d00>), split=None, feature_standardizer=None, feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951daa0d70>, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951dae9970>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_68_MorganFP_RDKitDescs_None_None_HighCorrelationFilter_None_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_None_None_HighCorrelationFilter_None_None’, name=’MorganFP_RDKitDescs_None_None_HighCorrelationFilter_None_None’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7f951daa0de0>), split=None, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951dae9bb0>, data_filter=None, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_69_MorganFP_RDKitDescs_None_None_HighCorrelationFilter_None_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_None_None_H…lationFilter_None_OutlierFilter’, name=’MorganFP_RDKitDescs_None_None_H…lationFilter_None_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7f951daa0fa0>), split=None, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951dae9c10>, data_filter=None, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951dae9b50>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_70_MorganFP_RDKitDescs_None_None_HighCorrelationFilter_RepeatsFilter_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_None_None_H…lationFilter_RepeatsFilter_None’, name=’MorganFP_RDKitDescs_None_None_H…lationFilter_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7f951daa1010>), split=None, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951dae9cd0>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951daa1160>, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_71_MorganFP_RDKitDescs_None_None_HighCorrelationFilter_RepeatsFilter_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_None_None_H…ter_RepeatsFilter_OutlierFilter’, name=’MorganFP_RDKitDescs_None_None_H…ter_RepeatsFilter_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7f951daa1240>), split=None, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951dae9d30>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951daa12b0>, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951dae9c70>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_72_MorganFP_RDKitDescs_None_StandardScaler_None_None_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_None_StandardScaler_None_None_None’, name=’MorganFP_RDKitDescs_None_StandardScaler_None_None_None’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7f951daa1390>), split=None, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=None, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_73_MorganFP_RDKitDescs_None_StandardScaler_None_None_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_None_Standa…dScaler_None_None_OutlierFilter’, name=’MorganFP_RDKitDescs_None_Standa…dScaler_None_None_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7f951daa1320>), split=None, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=None, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951dae9e50>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_74_MorganFP_RDKitDescs_None_StandardScaler_None_RepeatsFilter_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_None_Standa…dScaler_None_RepeatsFilter_None’, name=’MorganFP_RDKitDescs_None_Standa…dScaler_None_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7f951daa1470>), split=None, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951daa15c0>, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_75_MorganFP_RDKitDescs_None_StandardScaler_None_RepeatsFilter_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_None_Standa…one_RepeatsFilter_OutlierFilter’, name=’MorganFP_RDKitDescs_None_Standa…one_RepeatsFilter_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7f951daa16a0>), split=None, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951daa1710>, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951dae9eb0>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_76_MorganFP_RDKitDescs_None_StandardScaler_HighCorrelationFilter_None_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_None_Standa…HighCorrelationFilter_None_None’, name=’MorganFP_RDKitDescs_None_Standa…HighCorrelationFilter_None_None’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7f951daa17f0>), split=None, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951dae9d90>, data_filter=None, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_77_MorganFP_RDKitDescs_None_StandardScaler_HighCorrelationFilter_None_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_None_Standa…lationFilter_None_OutlierFilter’, name=’MorganFP_RDKitDescs_None_Standa…lationFilter_None_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7f951daa18d0>), split=None, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951dae9f70>, data_filter=None, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951dae9f10>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_78_MorganFP_RDKitDescs_None_StandardScaler_HighCorrelationFilter_RepeatsFilter_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_None_Standa…lationFilter_RepeatsFilter_None’, name=’MorganFP_RDKitDescs_None_Standa…lationFilter_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7f951daa19b0>), split=None, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951daea030>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951daa1940>, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_79_MorganFP_RDKitDescs_None_StandardScaler_HighCorrelationFilter_RepeatsFilter_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_None_Standa…ter_RepeatsFilter_OutlierFilter’, name=’MorganFP_RDKitDescs_None_Standa…ter_RepeatsFilter_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7f951daa1a90>), split=None, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951daea090>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951daa1b00>, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951dae9fd0>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_80_MorganFP_RDKitDescs_RandomSplit_None_None_None_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_RandomSplit_None_None_None_None’, name=’MorganFP_RDKitDescs_RandomSplit_None_None_None_None’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7f951daa1be0>), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951daea150>, feature_standardizer=None, feature_filter=None, data_filter=None, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_81_MorganFP_RDKitDescs_RandomSplit_None_None_None_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_RandomSplit_None_None_None_OutlierFilter’, name=’MorganFP_RDKitDescs_RandomSplit_None_None_None_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7f951daa1b70>), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951daea0f0>, feature_standardizer=None, feature_filter=None, data_filter=None, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951daea210>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_82_MorganFP_RDKitDescs_RandomSplit_None_None_RepeatsFilter_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_RandomSplit_None_None_RepeatsFilter_None’, name=’MorganFP_RDKitDescs_RandomSplit_None_None_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7f951daa1cc0>), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951daea2d0>, feature_standardizer=None, feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951daa1e10>, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_83_MorganFP_RDKitDescs_RandomSplit_None_None_RepeatsFilter_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_RandomSplit…one_RepeatsFilter_OutlierFilter’, name=’MorganFP_RDKitDescs_RandomSplit…one_RepeatsFilter_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7f951daa1ef0>), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951daea330>, feature_standardizer=None, feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951daa1f60>, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951daea270>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_84_MorganFP_RDKitDescs_RandomSplit_None_HighCorrelationFilter_None_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_RandomSplit…HighCorrelationFilter_None_None’, name=’MorganFP_RDKitDescs_RandomSplit…HighCorrelationFilter_None_None’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7f951daa1fd0>), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951daea3f0>, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951daea450>, data_filter=None, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_85_MorganFP_RDKitDescs_RandomSplit_None_HighCorrelationFilter_None_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_RandomSplit…lationFilter_None_OutlierFilter’, name=’MorganFP_RDKitDescs_RandomSplit…lationFilter_None_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7f951daa2190>), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951daea390>, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951daea4b0>, data_filter=None, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951daea510>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_86_MorganFP_RDKitDescs_RandomSplit_None_HighCorrelationFilter_RepeatsFilter_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_RandomSplit…lationFilter_RepeatsFilter_None’, name=’MorganFP_RDKitDescs_RandomSplit…lationFilter_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7f951daa2270>), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951daea5d0>, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951daea630>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951daa2200>, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_87_MorganFP_RDKitDescs_RandomSplit_None_HighCorrelationFilter_RepeatsFilter_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_RandomSplit…ter_RepeatsFilter_OutlierFilter’, name=’MorganFP_RDKitDescs_RandomSplit…ter_RepeatsFilter_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7f951daa2350>), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951daea570>, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951daea690>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951daa23c0>, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951daea6f0>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_88_MorganFP_RDKitDescs_RandomSplit_StandardScaler_None_None_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_RandomSplit_StandardScaler_None_None_None’, name=’MorganFP_RDKitDescs_RandomSplit_StandardScaler_None_None_None’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7f951daa24a0>), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951daea7b0>, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=None, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_89_MorganFP_RDKitDescs_RandomSplit_StandardScaler_None_None_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_RandomSplit…dScaler_None_None_OutlierFilter’, name=’MorganFP_RDKitDescs_RandomSplit…dScaler_None_None_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7f951daa2580>), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951daea810>, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=None, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951daea750>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_90_MorganFP_RDKitDescs_RandomSplit_StandardScaler_None_RepeatsFilter_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_RandomSplit…dScaler_None_RepeatsFilter_None’, name=’MorganFP_RDKitDescs_RandomSplit…dScaler_None_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7f951daa25f0>), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951daea8d0>, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951daa2740>, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_91_MorganFP_RDKitDescs_RandomSplit_StandardScaler_None_RepeatsFilter_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_RandomSplit…one_RepeatsFilter_OutlierFilter’, name=’MorganFP_RDKitDescs_RandomSplit…one_RepeatsFilter_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7f951daa2820>), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951daea930>, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951daa2890>, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951daea870>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_92_MorganFP_RDKitDescs_RandomSplit_StandardScaler_HighCorrelationFilter_None_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_RandomSplit…HighCorrelationFilter_None_None’, name=’MorganFP_RDKitDescs_RandomSplit…HighCorrelationFilter_None_None’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7f951daa2970>), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951daea9f0>, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951daeaa50>, data_filter=None, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_93_MorganFP_RDKitDescs_RandomSplit_StandardScaler_HighCorrelationFilter_None_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_RandomSplit…lationFilter_None_OutlierFilter’, name=’MorganFP_RDKitDescs_RandomSplit…lationFilter_None_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7f951daa2900>), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951daea990>, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951daeaab0>, data_filter=None, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951daeab10>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_94_MorganFP_RDKitDescs_RandomSplit_StandardScaler_HighCorrelationFilter_RepeatsFilter_None(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_RandomSplit…lationFilter_RepeatsFilter_None’, name=’MorganFP_RDKitDescs_RandomSplit…lationFilter_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7f951daa2ac0>), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951daeabd0>, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951daeac30>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951daa2a50>, applicability_domain=None].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().

testPrepCombos_95_MorganFP_RDKitDescs_RandomSplit_StandardScaler_HighCorrelationFilter_RepeatsFilter_OutlierFilter(**kw)

Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_RandomSplit…ter_RepeatsFilter_OutlierFilter’, name=’MorganFP_RDKitDescs_RandomSplit…ter_RepeatsFilter_OutlierFilter’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7f951daa2ba0>), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7f951daeab70>, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7f951daeac90>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7f951daa2c10>, applicability_domain=<qsprpred.data.processing.data_f…Filter object at 0x7f951daeacf0>].

This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
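
The note above mentions a skip decorator without showing one. The following is a minimal sketch of how a generated test could be skipped with the standard library's unittest.skip; the class PrepCombosSketch and its methods are hypothetical stand-ins, not part of QSPRpred, and the way QSPRpred itself applies the decorator is not shown in this reference.

```python
import io
import unittest


class PrepCombosSketch(unittest.TestCase):
    """Hypothetical stand-in for a test case with many generated tests."""

    @unittest.skip("parameterized preparation tests are slow")
    def testPrepCombos(self):
        # Never reached while the skip decorator is applied.
        self.fail("this body is not executed")

    def testCheap(self):
        self.assertTrue(True)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(PrepCombosSketch)
# Send the runner's report to a buffer so only our own summary is printed.
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print("skipped:", len(result.skipped), "successful:", result.wasSuccessful())
```

Skipped tests are reported separately and do not count as failures, so the run is still considered successful.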

class qsprpred.data.tables.tests.TestMolTable(methodName='runTest')[source]

Bases: DataSetsPathMixIn, QSPRTestCase

Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.

classmethod addClassCleanup(function, /, *args, **kwargs)

Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).

addCleanup(function, /, *args, **kwargs)

Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.

Cleanup items are called even if setUp fails (unlike tearDown).

addTypeEqualityFunc(typeobj, function)

Add a type specific assertEqual style function to compare a type.

This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.

Parameters:
  • typeobj – The data type to call this function on when both values are of the same type in assertEqual().

  • function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.

assertAlmostEqual(first, second, places=None, msg=None, delta=None)

Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and compared to zero, or, when delta is given, if the difference between the two objects is more than the given delta.

Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).

If the two objects compare equal then they will automatically compare almost equal.
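
The difference between places and delta, and between decimal places and significant digits, can be seen in a short sketch. It calls the assertion on a bare unittest.TestCase instance purely for demonstration; the variable names are illustrative.

```python
import unittest

case = unittest.TestCase()

# Default places=7: the difference of about 1e-8 rounds to zero, so this passes.
case.assertAlmostEqual(1.00000001, 1.0)

# delta compares the absolute difference directly: |100.0 - 100.4| <= 0.5.
case.assertAlmostEqual(100.0, 100.4, delta=0.5)

# Decimal places are not significant digits: these values agree in their
# first seven significant digits but differ in the first decimal place.
try:
    case.assertAlmostEqual(1234567.1, 1234567.4)
    places_failed = False
except AssertionError:
    places_failed = True

# places and delta are mutually exclusive for unequal values.
try:
    case.assertAlmostEqual(1.0, 1.05, places=2, delta=0.1)
    both_rejected = False
except TypeError:
    both_rejected = True

print(places_failed, both_rejected)  # True True
```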

assertCountEqual(first, second, msg=None)

Asserts that two iterables have the same elements, the same number of times, without regard to order.

self.assertEqual(Counter(list(first)), Counter(list(second)))

Example:
  • [0, 1, 1] and [1, 0, 1] compare equal.

  • [0, 0, 1] and [0, 1] compare unequal.

assertDictEqual(d1, d2, msg=None)

assertEndsWith(s, suffix, msg=None)

assertEqual(first, second, msg=None)

Fail if the two objects are unequal as determined by the ‘==’ operator.

assertFalse(expr, msg=None)

Check that the expression is false.

assertGreater(a, b, msg=None)

Just like self.assertTrue(a > b), but with a nicer default message.

assertGreaterEqual(a, b, msg=None)

Just like self.assertTrue(a >= b), but with a nicer default message.

assertHasAttr(obj, name, msg=None)

assertIn(member, container, msg=None)

Just like self.assertTrue(a in b), but with a nicer default message.

assertIs(expr1, expr2, msg=None)

Just like self.assertTrue(a is b), but with a nicer default message.

assertIsInstance(obj, cls, msg=None)

Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.

assertIsNone(obj, msg=None)

Same as self.assertTrue(obj is None), with a nicer default message.

assertIsNot(expr1, expr2, msg=None)

Just like self.assertTrue(a is not b), but with a nicer default message.

assertIsNotNone(obj, msg=None)

Included for symmetry with assertIsNone.

assertIsSubclass(cls, superclass, msg=None)

assertLess(a, b, msg=None)

Just like self.assertTrue(a < b), but with a nicer default message.

assertLessEqual(a, b, msg=None)

Just like self.assertTrue(a <= b), but with a nicer default message.

assertListEqual(list1, list2, msg=None)

A list-specific equality assertion.

Parameters:
  • list1 – The first list to compare.

  • list2 – The second list to compare.

  • msg – Optional message to use on failure instead of a list of differences.

assertLogs(logger=None, level=None)

Fail unless a log message at the given level or higher is emitted on the given logger or its children. If omitted, level defaults to INFO and logger defaults to the root logger.

This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.

Example:

with self.assertLogs('foo', level='INFO') as cm:
    logging.getLogger('foo').info('first message')
    logging.getLogger('foo.bar').error('second message')
self.assertEqual(cm.output, ['INFO:foo:first message',
                             'ERROR:foo.bar:second message'])

assertMultiLineEqual(first, second, msg=None)

Assert that two multi-line strings are equal.

assertNoLogs(logger=None, level=None)

Fail unless no log messages at the given level or higher are emitted on the given logger or its children.

This method must be used as a context manager.

assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)

Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and compared to zero, or, when delta is given, if the difference between the two objects is no more than the given delta.

Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).

Objects that are equal automatically fail.

assertNotEndsWith(s, suffix, msg=None)

assertNotEqual(first, second, msg=None)

Fail if the two objects are equal as determined by the ‘!=’ operator.

assertNotHasAttr(obj, name, msg=None)

assertNotIn(member, container, msg=None)

Just like self.assertTrue(a not in b), but with a nicer default message.

assertNotIsInstance(obj, cls, msg=None)

Included for symmetry with assertIsInstance.

assertNotIsSubclass(cls, superclass, msg=None)

assertNotRegex(text, unexpected_regex, msg=None)

Fail the test if the text matches the regular expression.

assertNotStartsWith(s, prefix, msg=None)

assertRaises(expected_exception, *args, **kwargs)

Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.

If called with the callable and arguments omitted, will return a context object used like this:

with self.assertRaises(SomeException):
    do_something()

An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.

The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:

with self.assertRaises(SomeException) as cm:
    do_something()
the_exception = cm.exception
self.assertEqual(the_exception.error_code, 3)

assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)

Asserts that the message in a raised exception matches a regex.

Parameters:
  • expected_exception – Exception class expected to be raised.

  • expected_regex – Regex (re.Pattern object or string) expected to be found in error message.

  • args – Function to be called and extra positional args.

  • kwargs – Extra kwargs.

  • msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.
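The two calling conventions described above can be sketched as follows. This is a minimal illustration using only the standard library; `parse_positive` and `ParseTest` are hypothetical names introduced for the example and are not part of qsprpred.

```python
import io
import unittest


def parse_positive(text):
    """Hypothetical helper: parse a string into a positive integer."""
    value = int(text)
    if value <= 0:
        raise ValueError(f"expected a positive number, got {value}")
    return value


class ParseTest(unittest.TestCase):
    def test_callable_form(self):
        # Callable form: the function and its positional args follow the regex.
        self.assertRaisesRegex(ValueError, r"positive number", parse_positive, "-3")

    def test_context_manager_form(self):
        # Context-manager form: the only form that accepts msg=.
        with self.assertRaisesRegex(ValueError, r"got -\d+", msg="nothing raised") as cm:
            parse_positive("-3")
        # The caught exception stays available for further checks.
        self.assertIn("-3", str(cm.exception))


# Run the tests quietly and keep the result object for inspection.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ParseTest)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print(result.wasSuccessful())
```

The regex is applied with a search rather than a full match, so a pattern only needs to appear somewhere in the error message.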

assertRegex(text, expected_regex, msg=None)

Fail the test unless the text matches the regular expression.

assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)

An equality assertion for ordered sequences (like lists and tuples).

For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.

Parameters:
  • seq1 – The first sequence to compare.

  • seq2 – The second sequence to compare.

  • seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.

  • msg – Optional message to use on failure instead of a list of differences.

assertSetEqual(set1, set2, msg=None)

A set-specific equality assertion.

Parameters:
  • set1 – The first set to compare.

  • set2 – The second set to compare.

  • msg – Optional message to use on failure instead of a list of differences.

assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).

assertStartsWith(s, prefix, msg=None)
assertTrue(expr, msg=None)

Check that the expression is true.

assertTupleEqual(tuple1, tuple2, msg=None)

A tuple-specific equality assertion.

Parameters:
  • tuple1 – The first tuple to compare.

  • tuple2 – The second tuple to compare.

  • msg – Optional message to use on failure instead of a list of differences.

assertWarns(expected_warning, *args, **kwargs)

Fail unless a warning of class expected_warning is triggered by the callable when invoked with specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.

If called with the callable and arguments omitted, will return a context object used like this:

with self.assertWarns(SomeWarning):
    do_something()

An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.

The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:

with self.assertWarns(SomeWarning) as cm:
    do_something()
the_warning = cm.warning
self.assertEqual(the_warning.some_attribute, 147)
assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)

Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.

Parameters:
  • expected_warning – Warning class expected to be triggered.

  • expected_regex – Regex (re.Pattern object or string) expected to be found in error message.

  • args – Function to be called and extra positional args.

  • kwargs – Extra kwargs.

  • msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.
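As a rough sketch of the context-manager form described above, the following example checks both the warning class and its message. `old_api` and `WarnTest` are hypothetical names used only for illustration.

```python
import io
import unittest
import warnings


def old_api():
    """Hypothetical function that emits a deprecation warning."""
    warnings.warn("old_api() is deprecated; use new_api()", DeprecationWarning)
    return 1


class WarnTest(unittest.TestCase):
    def test_message_matches(self):
        with self.assertWarnsRegex(DeprecationWarning, r"use new_api") as cm:
            old_api()
        # cm.warning holds the first matching warning instance.
        self.assertIsInstance(cm.warning, DeprecationWarning)
        # cm.lineno records the line from which the warning was triggered.
        self.assertIsInstance(cm.lineno, int)

    def test_wrong_message_fails(self):
        # The class matches but the message does not, so the assertion fails.
        with self.assertRaises(AssertionError):
            with self.assertWarnsRegex(DeprecationWarning, r"no such text"):
                old_api()


suite = unittest.defaultTestLoader.loadTestsFromTestCase(WarnTest)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print(result.wasSuccessful())
```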

clearGenerated()

Remove the directories that are used for testing.

countTestCases()
createLargeMultitaskDataSet(name='QSPRDataset_multi_test', target_props=[{'name': 'HBD', 'task': TargetTasks.MULTICLASS, 'th': [-1, 1, 2, 100]}, {'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42)

Create a large dataset for testing purposes.

Parameters:
  • name (str) – name of the dataset

  • target_props (List of dicts or TargetProperty) – list of target properties

  • random_state (int) – random state to use for splitting and shuffling

Returns:

a QSPRDataSet object

Return type:

QSPRDataSet

createLargeTestDataSet(name='QSPRDataset_test_large', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42, n_jobs=1, chunk_size=None, drop_empty_target_props=True)

Create a large dataset for testing purposes.

Parameters:
  • name (str) – name of the dataset

  • target_props (List of dicts or TargetProperty) – list of target properties

  • random_state (int) – random state to use for splitting and shuffling

Returns:

a QSPRDataSet object

Return type:

QSPRDataSet

createSmallTestDataSet(name='QSPRDataset_test_small', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42, drop_empty_target_props=True)

Create a small dataset for testing purposes.

Parameters:
  • name (str) – name of the dataset

  • target_props (List of dicts or TargetProperty) – list of target properties

  • random_state (int) – random state to use for splitting and shuffling

Returns:

a QSPRDataSet object

Return type:

QSPRDataSet

createTestDataSetFromFrame(df, name='QSPRDataset_test', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=None, n_jobs=1, chunk_size=None, drop_empty_target_props=True)

Create a dataset for testing purposes from the given data frame.

Parameters:
  • df (pd.DataFrame) – data frame containing the dataset

  • name (str) – name of the dataset

  • target_props (List of dicts or TargetProperty) – list of target properties

  • random_state (int) – random state to use for splitting and shuffling

  • prep (dict) – dictionary containing preparation settings

  • n_jobs (int) – number of jobs to use for parallel processing

  • chunk_size (int) – size of chunks to use per job in parallel processing

Returns:

a QSPRDataSet object

Return type:

QSPRDataSet

debug()

Run the test without collecting errors in a TestResult.

defaultTestResult()
classmethod doClassCleanups()

Execute all class cleanup functions. Normally called for you after tearDownClass.

doCleanups()

Execute all cleanup functions. Normally called for you after tearDown.

classmethod enterClassContext(cm)

Same as enterContext, but class-wide.

enterContext(cm)

Enters the supplied context manager.

If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.

fail(msg=None)

Fail immediately, with the given message.

failureException

alias of AssertionError

classmethod getAllDescriptorSets()

Return a list of (ideally) all available descriptor sets. For now they need to be added manually to the list below.

TODO: would be nice to create the list automatically by implementing a descriptor set registry that would hold all installed descriptor sets.

Returns:

list of DescriptorCalculator objects

Return type:

list

getBigDF()

Get a large data frame for testing purposes.

Returns:

a pandas.DataFrame containing the dataset

Return type:

pd.DataFrame

classmethod getDataPrepGrid()

Return a list of many possible combinations of descriptor calculators, splits, feature standardizers, feature filters and data filters. Again, this is not exhaustive, but should cover a lot of cases.

Returns:

a generator that yields tuples of all possible combinations as stated above, each tuple is defined as: (descriptor_calculator, split, feature_standardizer, feature_filters, data_filters)

Return type:

grid

classmethod getDefaultCalculatorCombo()

Makes a list of default descriptor calculators that can be used in tests.

It creates one calculator with only Morgan fingerprints, one with only RDKit descriptors, and one that combines both to test behaviour with multiple descriptor sets. Override this method if you want to test with other descriptor sets and calculator combinations.

Returns:

list of created DescriptorCalculator objects

Return type:

list

static getDefaultPrep(add_imputer=None)

Return a dictionary with default preparation settings.

static getDescriptorSets()[source]
classmethod getPrepCombos()

Return a list of all possible preparation combinations as generated by getDataPrepGrid as well as their names. The generated list can be used to parameterize tests with the given named combinations.

Returns:

list of `list`s of all possible combinations of preparation

Return type:

list

getSmallDF()

Get a small data frame for testing purposes.

Returns:

a pandas.DataFrame containing the dataset

Return type:

pd.DataFrame

getStorage()[source]
getTable()[source]
id()
longMessage = True
maxDiff = 640
run(result=None)
setUp()[source]

Hook method for setting up the test fixture before exercising it.

classmethod setUpClass()

Hook method for setting up class fixture before running tests in the class.

setUpPaths()

Create the directories that are used for testing.

shortDescription()

Returns a one-line description of the test, or None if no description has been provided.

The default implementation of this method returns the first line of the specified test method’s docstring.

skipTest(reason)

Skip this test.

subTest(msg=<object object>, **params)

Return a context manager that will run the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.

tearDown()

Remove all files and directories that are used for testing.

classmethod tearDownClass()

Hook method for deconstructing the class fixture after running all tests in the class.

testDescriptors()[source]
testSubsetting()[source]
testTableCreation()[source]

Test the creation of a table from a data set.

testTableSerialization()[source]
class qsprpred.data.tables.tests.TestQSPRTable(methodName='runTest')[source]

Bases: DataSetsPathMixIn, QSPRTestCase

Simple tests for dataset creation and serialization under different conditions and error states.

Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.

classmethod addClassCleanup(function, /, *args, **kwargs)

Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).

addCleanup(function, /, *args, **kwargs)

Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.

Cleanup items are called even if setUp fails (unlike tearDown).

addTypeEqualityFunc(typeobj, function)

Add a type specific assertEqual style function to compare a type.

This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.

Parameters:
  • typeobj – The data type to call this function on when both values are of the same type in assertEqual().

  • function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.
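One way to use this hook is sketched below, assuming a hypothetical `Point` class and a custom `assertPointEqual` method. The registered function is only consulted when both arguments to assertEqual() have exactly the registered type.

```python
import io
import unittest


class Point:
    """Hypothetical value type used only for this illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)


class PointTest(unittest.TestCase):
    def setUp(self):
        # Route assertEqual(Point, Point) through the custom comparison.
        self.addTypeEqualityFunc(Point, self.assertPointEqual)

    def assertPointEqual(self, first, second, msg=None):
        if first != second:
            standard = f"Point({first.x}, {first.y}) != Point({second.x}, {second.y})"
            raise self.failureException(msg or standard)

    def test_equal_points(self):
        self.assertEqual(Point(1, 2), Point(1, 2))

    def test_mismatch_message(self):
        with self.assertRaises(AssertionError) as cm:
            self.assertEqual(Point(1, 2), Point(1, 3))
        # The failure text comes from assertPointEqual, not the default repr.
        self.assertIn("Point(1, 2) != Point(1, 3)", str(cm.exception))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(PointTest)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print(result.wasSuccessful())
```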

assertAlmostEqual(first, second, places=None, msg=None, delta=None)

Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.

Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).

If the two objects compare equal then they will automatically compare almost equal.
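The difference between decimal places and significant digits, and the alternative delta check, can be illustrated with a short sketch. The values and the `AlmostEqualTest` class are chosen only for the example.

```python
import io
import unittest


class AlmostEqualTest(unittest.TestCase):
    def test_places(self):
        # The difference rounds to zero at the default 7 decimal places.
        self.assertAlmostEqual(1.00000001, 1.0)
        # A difference of about 0.001 rounds to zero at 2 places, not at 3.
        self.assertAlmostEqual(1.001, 1.0, places=2)
        with self.assertRaises(AssertionError):
            self.assertAlmostEqual(1.001, 1.0, places=3)

    def test_places_are_not_significant_digits(self):
        # The values agree to 7 significant digits but differ by 0.1,
        # which does not round to zero at 7 decimal places.
        with self.assertRaises(AssertionError):
            self.assertAlmostEqual(1234567.0, 1234567.1)

    def test_delta(self):
        # delta compares the absolute difference directly.
        self.assertAlmostEqual(100.0, 100.4, delta=0.5)
        # places and delta are mutually exclusive.
        with self.assertRaises(TypeError):
            self.assertAlmostEqual(1.0, 1.1, places=2, delta=0.5)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(AlmostEqualTest)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print(result.wasSuccessful())
```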

assertCountEqual(first, second, msg=None)

Asserts that two iterables have the same elements, the same number of times, without regard to order.

self.assertEqual(Counter(list(first)),
                 Counter(list(second)))

Example:
  • [0, 1, 1] and [1, 0, 1] compare equal.

  • [0, 0, 1] and [0, 1] compare unequal.
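The two cases listed above can be reproduced directly. The sketch below also includes a comparison of unhashable elements, which the standard library implementation handles through a slower fallback; `CountEqualTest` is a hypothetical class name.

```python
import io
import unittest


class CountEqualTest(unittest.TestCase):
    def test_same_elements_any_order(self):
        self.assertCountEqual([0, 1, 1], [1, 0, 1])

    def test_different_counts(self):
        # 0 appears twice on the left and once on the right.
        with self.assertRaises(AssertionError):
            self.assertCountEqual([0, 0, 1], [0, 1])

    def test_unhashable_elements(self):
        # Lists cannot be counted with Counter, but the assertion still works.
        self.assertCountEqual([[1], [2]], [[2], [1]])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(CountEqualTest)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print(result.wasSuccessful())
```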

assertDictEqual(d1, d2, msg=None)
assertEndsWith(s, suffix, msg=None)
assertEqual(first, second, msg=None)

Fail if the two objects are unequal as determined by the ‘==’ operator.

assertFalse(expr, msg=None)

Check that the expression is false.

assertGreater(a, b, msg=None)

Just like self.assertTrue(a > b), but with a nicer default message.

assertGreaterEqual(a, b, msg=None)

Just like self.assertTrue(a >= b), but with a nicer default message.

assertHasAttr(obj, name, msg=None)
assertIn(member, container, msg=None)

Just like self.assertTrue(a in b), but with a nicer default message.

assertIs(expr1, expr2, msg=None)

Just like self.assertTrue(a is b), but with a nicer default message.

assertIsInstance(obj, cls, msg=None)

Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.

assertIsNone(obj, msg=None)

Same as self.assertTrue(obj is None), with a nicer default message.

assertIsNot(expr1, expr2, msg=None)

Just like self.assertTrue(a is not b), but with a nicer default message.

assertIsNotNone(obj, msg=None)

Included for symmetry with assertIsNone.

assertIsSubclass(cls, superclass, msg=None)
assertLess(a, b, msg=None)

Just like self.assertTrue(a < b), but with a nicer default message.

assertLessEqual(a, b, msg=None)

Just like self.assertTrue(a <= b), but with a nicer default message.

assertListEqual(list1, list2, msg=None)

A list-specific equality assertion.

Parameters:
  • list1 – The first list to compare.

  • list2 – The second list to compare.

  • msg – Optional message to use on failure instead of a list of differences.

assertLogs(logger=None, level=None)

Fail unless a log message of level level or higher is emitted on logger or its children. If omitted, level defaults to INFO and logger defaults to the root logger.

This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.

Example:

with self.assertLogs('foo', level='INFO') as cm:
    logging.getLogger('foo').info('first message')
    logging.getLogger('foo.bar').error('second message')
self.assertEqual(cm.output, ['INFO:foo:first message',
                             'ERROR:foo.bar:second message'])
assertMultiLineEqual(first, second, msg=None)

Assert that two multi-line strings are equal.

assertNoLogs(logger=None, level=None)

Fail unless no log messages of level level or higher are emitted on logger or its children.

This method must be used as a context manager.

assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)

Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.

Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).

Objects that are equal automatically fail.

assertNotEndsWith(s, suffix, msg=None)
assertNotEqual(first, second, msg=None)

Fail if the two objects are equal as determined by the ‘!=’ operator.

assertNotHasAttr(obj, name, msg=None)
assertNotIn(member, container, msg=None)

Just like self.assertTrue(a not in b), but with a nicer default message.

assertNotIsInstance(obj, cls, msg=None)

Included for symmetry with assertIsInstance.

assertNotIsSubclass(cls, superclass, msg=None)
assertNotRegex(text, unexpected_regex, msg=None)

Fail the test if the text matches the regular expression.

assertNotStartsWith(s, prefix, msg=None)
assertRaises(expected_exception, *args, **kwargs)

Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.

If called with the callable and arguments omitted, will return a context object used like this:

with self.assertRaises(SomeException):
    do_something()

An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.

The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:

with self.assertRaises(SomeException) as cm:
    do_something()
the_exception = cm.exception
self.assertEqual(the_exception.error_code, 3)
assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)

Asserts that the message in a raised exception matches a regex.

Parameters:
  • expected_exception – Exception class expected to be raised.

  • expected_regex – Regex (re.Pattern object or string) expected to be found in error message.

  • args – Function to be called and extra positional args.

  • kwargs – Extra kwargs.

  • msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.

assertRegex(text, expected_regex, msg=None)

Fail the test unless the text matches the regular expression.

assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)

An equality assertion for ordered sequences (like lists and tuples).

For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.

Parameters:
  • seq1 – The first sequence to compare.

  • seq2 – The second sequence to compare.

  • seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.

  • msg – Optional message to use on failure instead of a list of differences.
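The effect of seq_type can be sketched as follows. With the default seq_type=None, a list and a tuple holding equal items compare equal; passing seq_type enforces the type of both arguments. `SequenceTest` is a hypothetical class name.

```python
import io
import unittest


class SequenceTest(unittest.TestCase):
    def test_mixed_types_allowed_by_default(self):
        # No type is enforced, so only length and items are compared.
        self.assertSequenceEqual([1, 2, 3], (1, 2, 3))

    def test_seq_type_enforced(self):
        # The second argument is a tuple, not a list, so this fails.
        with self.assertRaises(AssertionError):
            self.assertSequenceEqual([1, 2, 3], (1, 2, 3), seq_type=list)

    def test_order_matters(self):
        with self.assertRaises(AssertionError):
            self.assertSequenceEqual([1, 2, 3], [3, 2, 1])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(SequenceTest)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print(result.wasSuccessful())
```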

assertSetEqual(set1, set2, msg=None)

A set-specific equality assertion.

Parameters:
  • set1 – The first set to compare.

  • set2 – The second set to compare.

  • msg – Optional message to use on failure instead of a list of differences.

assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).

assertStartsWith(s, prefix, msg=None)
assertTrue(expr, msg=None)

Check that the expression is true.

assertTupleEqual(tuple1, tuple2, msg=None)

A tuple-specific equality assertion.

Parameters:
  • tuple1 – The first tuple to compare.

  • tuple2 – The second tuple to compare.

  • msg – Optional message to use on failure instead of a list of differences.

assertWarns(expected_warning, *args, **kwargs)

Fail unless a warning of class expected_warning is triggered by the callable when invoked with specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.

If called with the callable and arguments omitted, will return a context object used like this:

with self.assertWarns(SomeWarning):
    do_something()

An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.

The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:

with self.assertWarns(SomeWarning) as cm:
    do_something()
the_warning = cm.warning
self.assertEqual(the_warning.some_attribute, 147)
assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)

Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.

Parameters:
  • expected_warning – Warning class expected to be triggered.

  • expected_regex – Regex (re.Pattern object or string) expected to be found in error message.

  • args – Function to be called and extra positional args.

  • kwargs – Extra kwargs.

  • msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.

checkBadInit(ds)[source]
checkClassification(ds, target_names, ths)[source]
checkConsistency(ds: QSPRDataSet)[source]
checkConsistencyMulticlass(ds)[source]
checkConsistencySingleclass(ds)[source]
checkRegression(ds, target_names)[source]
clearGenerated()

Remove the directories that are used for testing.

countTestCases()
createLargeMultitaskDataSet(name='QSPRDataset_multi_test', target_props=[{'name': 'HBD', 'task': TargetTasks.MULTICLASS, 'th': [-1, 1, 2, 100]}, {'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42)

Create a large dataset for testing purposes.

Parameters:
  • name (str) – name of the dataset

  • target_props (List of dicts or TargetProperty) – list of target properties

  • random_state (int) – random state to use for splitting and shuffling

Returns:

a QSPRDataSet object

Return type:

QSPRDataSet

createLargeTestDataSet(name='QSPRDataset_test_large', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42, n_jobs=1, chunk_size=None, drop_empty_target_props=True)

Create a large dataset for testing purposes.

Parameters:
  • name (str) – name of the dataset

  • target_props (List of dicts or TargetProperty) – list of target properties

  • random_state (int) – random state to use for splitting and shuffling

Returns:

a QSPRDataSet object

Return type:

QSPRDataSet

createSmallTestDataSet(name='QSPRDataset_test_small', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42, drop_empty_target_props=True)

Create a small dataset for testing purposes.

Parameters:
  • name (str) – name of the dataset

  • target_props (List of dicts or TargetProperty) – list of target properties

  • random_state (int) – random state to use for splitting and shuffling

Returns:

a QSPRDataSet object

Return type:

QSPRDataSet

createTestDataSetFromFrame(df, name='QSPRDataset_test', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=None, n_jobs=1, chunk_size=None, drop_empty_target_props=True)

Create a dataset for testing purposes from the given data frame.

Parameters:
  • df (pd.DataFrame) – data frame containing the dataset

  • name (str) – name of the dataset

  • target_props (List of dicts or TargetProperty) – list of target properties

  • random_state (int) – random state to use for splitting and shuffling

  • prep (dict) – dictionary containing preparation settings

  • n_jobs (int) – number of jobs to use for parallel processing

  • chunk_size (int) – size of chunks to use per job in parallel processing

Returns:

a QSPRDataSet object

Return type:

QSPRDataSet

debug()

Run the test without collecting errors in a TestResult.

defaultTestResult()
classmethod doClassCleanups()

Execute all class cleanup functions. Normally called for you after tearDownClass.

doCleanups()

Execute all cleanup functions. Normally called for you after tearDown.

classmethod enterClassContext(cm)

Same as enterContext, but class-wide.

enterContext(cm)

Enters the supplied context manager.

If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.

fail(msg=None)

Fail immediately, with the given message.

failureException

alias of AssertionError

classmethod getAllDescriptorSets()

Return a list of (ideally) all available descriptor sets. For now they need to be added manually to the list below.

TODO: would be nice to create the list automatically by implementing a descriptor set registry that would hold all installed descriptor sets.

Returns:

list of DescriptorCalculator objects

Return type:

list

getBigDF()

Get a large data frame for testing purposes.

Returns:

a pandas.DataFrame containing the dataset

Return type:

pd.DataFrame

classmethod getDataPrepGrid()

Return a list of many possible combinations of descriptor calculators, splits, feature standardizers, feature filters and data filters. Again, this is not exhaustive, but should cover a lot of cases.

Returns:

a generator that yields tuples of all possible combinations as stated above, each tuple is defined as: (descriptor_calculator, split, feature_standardizer, feature_filters, data_filters)

Return type:

grid

classmethod getDefaultCalculatorCombo()

Makes a list of default descriptor calculators that can be used in tests.

It creates one calculator with only Morgan fingerprints, one with only RDKit descriptors, and one that combines both to test behaviour with multiple descriptor sets. Override this method if you want to test with other descriptor sets and calculator combinations.

Returns:

list of created DescriptorCalculator objects

Return type:

list

static getDefaultPrep(add_imputer=None)

Return a dictionary with default preparation settings.

classmethod getPrepCombos()

Return a list of all possible preparation combinations as generated by getDataPrepGrid as well as their names. The generated list can be used to parameterize tests with the given named combinations.

Returns:

list of `list`s of all possible combinations of preparation

Return type:

list

getSmallDF()

Get a small data frame for testing purposes.

Returns:

a pandas.DataFrame containing the dataset

Return type:

pd.DataFrame

getStorage(df, name, n_jobs=1, chunk_size=None)
id()
longMessage = True
maxDiff = 640
run(result=None)
setUp()[source]

Hook method for setting up the test fixture before exercising it.

classmethod setUpClass()

Hook method for setting up class fixture before running tests in the class.

setUpPaths()

Create the directories that are used for testing.

shortDescription()

Returns a one-line description of the test, or None if no description has been provided.

The default implementation of this method returns the first line of the specified test method’s docstring.

skipTest(reason)

Skip this test.

subTest(msg=<object object>, **params)

Return a context manager that will run the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.

tearDown()

Remove all files and directories that are used for testing.

classmethod tearDownClass()

Hook method for deconstructing the class fixture after running all tests in the class.

testDefaults()[source]

Test basic dataset creation and serialization with mostly default options.

testFilter()[source]

Test removing entries from the dataset using a DataFilter.

testMultitask()[source]

Test multi-task dataset creation and functionality.

testRandomStateFolds()[source]
testRandomStateSplit()[source]
testTargetProperty()[source]

Test target property creation and serialization in the context of a dataset.

class qsprpred.data.tables.tests.TestSearchFeatures(methodName='runTest')[source]

Bases: DataSetsPathMixIn, QSPRTestCase

Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.

classmethod addClassCleanup(function, /, *args, **kwargs)

Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).

addCleanup(function, /, *args, **kwargs)

Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.

Cleanup items are called even if setUp fails (unlike tearDown).

addTypeEqualityFunc(typeobj, function)

Add a type specific assertEqual style function to compare a type.

This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.

Parameters:
  • typeobj – The data type to call this function on when both values are of the same type in assertEqual().

  • function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.

assertAlmostEqual(first, second, places=None, msg=None, delta=None)

Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.

Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).

If the two objects compare equal then they will automatically compare almost equal.

assertCountEqual(first, second, msg=None)

Asserts that two iterables have the same elements, the same number of times, without regard to order.

self.assertEqual(Counter(list(first)),
                 Counter(list(second)))

Example:
  • [0, 1, 1] and [1, 0, 1] compare equal.

  • [0, 0, 1] and [0, 1] compare unequal.

assertDictEqual(d1, d2, msg=None)
assertEndsWith(s, suffix, msg=None)
assertEqual(first, second, msg=None)

Fail if the two objects are unequal as determined by the ‘==’ operator.

assertFalse(expr, msg=None)

Check that the expression is false.

assertGreater(a, b, msg=None)

Just like self.assertTrue(a > b), but with a nicer default message.

assertGreaterEqual(a, b, msg=None)

Just like self.assertTrue(a >= b), but with a nicer default message.

assertHasAttr(obj, name, msg=None)
assertIn(member, container, msg=None)

Just like self.assertTrue(a in b), but with a nicer default message.

assertIs(expr1, expr2, msg=None)

Just like self.assertTrue(a is b), but with a nicer default message.

assertIsInstance(obj, cls, msg=None)

Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.

assertIsNone(obj, msg=None)

Same as self.assertTrue(obj is None), with a nicer default message.

assertIsNot(expr1, expr2, msg=None)

Just like self.assertTrue(a is not b), but with a nicer default message.

assertIsNotNone(obj, msg=None)

Included for symmetry with assertIsNone.

assertIsSubclass(cls, superclass, msg=None)
assertLess(a, b, msg=None)

Just like self.assertTrue(a < b), but with a nicer default message.

assertLessEqual(a, b, msg=None)

Just like self.assertTrue(a <= b), but with a nicer default message.

assertListEqual(list1, list2, msg=None)

A list-specific equality assertion.

Parameters:
  • list1 – The first list to compare.

  • list2 – The second list to compare.

  • msg – Optional message to use on failure instead of a list of differences.

assertLogs(logger=None, level=None)

Fail unless a log message of level level or higher is emitted on logger or its children. If omitted, level defaults to INFO and logger defaults to the root logger.

This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.

Example:

with self.assertLogs('foo', level='INFO') as cm:
    logging.getLogger('foo').info('first message')
    logging.getLogger('foo.bar').error('second message')
self.assertEqual(cm.output, ['INFO:foo:first message',
                             'ERROR:foo.bar:second message'])
assertMultiLineEqual(first, second, msg=None)

Assert that two multi-line strings are equal.

assertNoLogs(logger=None, level=None)

Fail unless no log messages of level level or higher are emitted on logger or its children.

This method must be used as a context manager.

assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)

Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.

Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).

Objects that are equal automatically fail.
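The difference between the places and delta modes can be seen in a minimal sketch like the following. The class and test names are made up for illustration and are not part of QSPRpred; only standard unittest calls are used.

```python
import unittest


class NotAlmostEqualDemo(unittest.TestCase):
    """Hypothetical test case illustrating assertNotAlmostEqual."""

    def test_places(self):
        # Default places=7: round(0.001, 7) != 0, so the values count as different.
        self.assertNotAlmostEqual(1.0, 1.001)
        # With places=2 the rounded difference is 0, so the assertion fails.
        with self.assertRaises(AssertionError):
            self.assertNotAlmostEqual(1.0, 1.001, places=2)

    def test_delta(self):
        # The difference (0.5) is larger than delta, so the assertion passes.
        self.assertNotAlmostEqual(10.0, 10.5, delta=0.1)
        # The difference (about 0.05) is within delta, so the assertion fails.
        with self.assertRaises(AssertionError):
            self.assertNotAlmostEqual(10.0, 10.05, delta=0.1)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(NotAlmostEqualDemo)
result = unittest.TestResult()
suite.run(result)
print("all passed:", result.wasSuccessful())
```

Only one of places and delta may be given in a single call; passing both raises a TypeError.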

assertNotEndsWith(s, suffix, msg=None)
assertNotEqual(first, second, msg=None)

Fail if the two objects are equal as determined by the ‘!=’ operator.

assertNotHasAttr(obj, name, msg=None)
assertNotIn(member, container, msg=None)

Just like self.assertTrue(a not in b), but with a nicer default message.

assertNotIsInstance(obj, cls, msg=None)

Included for symmetry with assertIsInstance.

assertNotIsSubclass(cls, superclass, msg=None)
assertNotRegex(text, unexpected_regex, msg=None)

Fail the test if the text matches the regular expression.

assertNotStartsWith(s, prefix, msg=None)
assertRaises(expected_exception, *args, **kwargs)

Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.

If called with the callable and arguments omitted, will return a context object used like this:

with self.assertRaises(SomeException):
    do_something()

An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.

The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:

with self.assertRaises(SomeException) as cm:
    do_something()
the_exception = cm.exception
self.assertEqual(the_exception.error_code, 3)
assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)

Asserts that the message in a raised exception matches a regex.

Parameters:
  • expected_exception – Exception class expected to be raised.

  • expected_regex – Regex (re.Pattern object or string) expected to be found in error message.

  • args – Function to be called and extra positional args.

  • kwargs – Extra kwargs.

  • msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.
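The two calling conventions described by the parameters above can be sketched as follows. The helper parse_positive and the test class are hypothetical and exist only for this illustration.

```python
import re
import unittest


def parse_positive(text):
    """Hypothetical helper: parse a positive integer or raise ValueError."""
    value = int(text)
    if value <= 0:
        raise ValueError(f"expected a positive integer, got {value}")
    return value


class RaisesRegexDemo(unittest.TestCase):
    def test_callable_form(self):
        # After the regex come the callable and its positional arguments.
        self.assertRaisesRegex(
            ValueError, r"positive integer", parse_positive, "-3"
        )

    def test_context_manager_form(self):
        # msg= is only accepted in the context-manager form.
        pattern = re.compile(r"got -?\d+")
        with self.assertRaisesRegex(ValueError, pattern, msg="bad input") as cm:
            parse_positive("0")
        # The caught exception stays available for further checks.
        self.assertIn("got 0", str(cm.exception))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(RaisesRegexDemo)
result = unittest.TestResult()
suite.run(result)
print("all passed:", result.wasSuccessful())
```

The regex is applied with a search, so it only needs to match somewhere in the exception message rather than the whole string.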

assertRegex(text, expected_regex, msg=None)

Fail the test unless the text matches the regular expression.

assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)

An equality assertion for ordered sequences (like lists and tuples).

For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.

Parameters:
  • seq1 – The first sequence to compare.

  • seq2 – The second sequence to compare.

  • seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.

  • msg – Optional message to use on failure instead of a list of differences.

assertSetEqual(set1, set2, msg=None)

A set-specific equality assertion.

Parameters:
  • set1 – The first set to compare.

  • set2 – The second set to compare.

  • msg – Optional message to use on failure instead of a list of differences.

assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).
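As a small illustration of the duck typing mentioned above, the sketch below compares a set with a frozenset and inspects the failure message for two sets that differ. The class name is hypothetical.

```python
import unittest


class SetEqualDemo(unittest.TestCase):
    def test_mixed_set_types(self):
        # set and frozenset both provide difference(), so they can be compared.
        self.assertSetEqual({1, 2, 3}, frozenset([3, 2, 1]))

    def test_failure_lists_differences(self):
        with self.assertRaises(AssertionError) as cm:
            self.assertSetEqual({1, 2}, {2, 3})
        message = str(cm.exception)
        # The default message reports the items unique to each side.
        self.assertIn("1", message)
        self.assertIn("3", message)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(SetEqualDemo)
result = unittest.TestResult()
suite.run(result)
print("all passed:", result.wasSuccessful())
```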

assertStartsWith(s, prefix, msg=None)
assertTrue(expr, msg=None)

Check that the expression is true.

assertTupleEqual(tuple1, tuple2, msg=None)

A tuple-specific equality assertion.

Parameters:
  • tuple1 – The first tuple to compare.

  • tuple2 – The second tuple to compare.

  • msg – Optional message to use on failure instead of a list of differences.

assertWarns(expected_warning, *args, **kwargs)

Fail unless a warning of class expected_warning is triggered by the callable when invoked with specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.

If called with the callable and arguments omitted, will return a context object used like this:

with self.assertWarns(SomeWarning):
    do_something()

An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.

The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:

with self.assertWarns(SomeWarning) as cm:
    do_something()
the_warning = cm.warning
self.assertEqual(the_warning.some_attribute, 147)
assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)

Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.

Parameters:
  • expected_warning – Warning class expected to be triggered.

  • expected_regex – Regex (re.Pattern object or string) expected to be found in the warning message.

  • args – Function to be called and extra positional args.

  • kwargs – Extra kwargs.

  • msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.
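A minimal sketch of the context-manager form follows. The function old_api and the test class are hypothetical; the warning text is made up for the example.

```python
import unittest
import warnings


def old_api():
    """Hypothetical function that emits a DeprecationWarning."""
    warnings.warn("old_api() is deprecated; use new_api()", DeprecationWarning)
    return 42


class WarnsRegexDemo(unittest.TestCase):
    def test_matching_warning(self):
        with self.assertWarnsRegex(DeprecationWarning, r"deprecated") as cm:
            value = old_api()
        self.assertEqual(value, 42)
        # The first matching warning is kept on the context object.
        self.assertIn("new_api", str(cm.warning))

    def test_non_matching_message_fails(self):
        # Right warning class, but the message does not match the regex.
        with self.assertRaises(AssertionError):
            with self.assertWarnsRegex(DeprecationWarning, r"removed in 2\.0"):
                old_api()


suite = unittest.defaultTestLoader.loadTestsFromTestCase(WarnsRegexDemo)
result = unittest.TestResult()
suite.run(result)
print("all passed:", result.wasSuccessful())
```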

clearGenerated()

Remove the directories that are used for testing.

countTestCases()
createLargeMultitaskDataSet(name='QSPRDataset_multi_test', target_props=[{'name': 'HBD', 'task': TargetTasks.MULTICLASS, 'th': [-1, 1, 2, 100]}, {'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42)

Create a large dataset for testing purposes.

Parameters:
  • name (str) – name of the dataset

  • target_props (List of dicts or TargetProperty) – list of target properties

  • random_state (int) – random state to use for splitting and shuffling

Returns:

a QSPRDataSet object

Return type:

QSPRDataSet

createLargeTestDataSet(name='QSPRDataset_test_large', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42, n_jobs=1, chunk_size=None, drop_empty_target_props=True)

Create a large dataset for testing purposes.

Parameters:
  • name (str) – name of the dataset

  • target_props (List of dicts or TargetProperty) – list of target properties

  • random_state (int) – random state to use for splitting and shuffling

Returns:

a QSPRDataSet object

Return type:

QSPRDataSet

createSmallTestDataSet(name='QSPRDataset_test_small', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42, drop_empty_target_props=True)

Create a small dataset for testing purposes.

Parameters:
  • name (str) – name of the dataset

  • target_props (List of dicts or TargetProperty) – list of target properties

  • random_state (int) – random state to use for splitting and shuffling

Returns:

a QSPRDataSet object

Return type:

QSPRDataSet

createTestDataSetFromFrame(df, name='QSPRDataset_test', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=None, n_jobs=1, chunk_size=None, drop_empty_target_props=True)

Create a dataset for testing purposes from the given data frame.

Parameters:
  • df (pd.DataFrame) – data frame containing the dataset

  • name (str) – name of the dataset

  • target_props (List of dicts or TargetProperty) – list of target properties

  • random_state (int) – random state to use for splitting and shuffling

  • prep (dict) – dictionary containing preparation settings

  • n_jobs (int) – number of jobs to use for parallel processing

  • chunk_size (int) – size of chunks to use per job in parallel processing

Returns:

a QSPRDataSet object

Return type:

QSPRDataSet

debug()

Run the test without collecting errors in a TestResult.

defaultTestResult()
classmethod doClassCleanups()

Execute all class cleanup functions. Normally called for you after tearDownClass.

doCleanups()

Execute all cleanup functions. Normally called for you after tearDown.

classmethod enterClassContext(cm)

Same as enterContext, but class-wide.

enterContext(cm)

Enters the supplied context manager.

If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.

fail(msg=None)

Fail immediately, with the given message.

failureException

alias of AssertionError

classmethod getAllDescriptorSets()

Return a list of (ideally) all available descriptor sets. For now they need to be added manually to the list below.

TODO: would be nice to create the list automatically by implementing a descriptor set registry that would hold all installed descriptor sets.

Returns:

list of DescriptorCalculator objects

Return type:

list

getBigDF()

Get a large data frame for testing purposes.

Returns:

a pandas.DataFrame containing the dataset

Return type:

pd.DataFrame

classmethod getDataPrepGrid()

Return many possible combinations of descriptor calculators, splits, feature standardizers, feature filters and data filters. The grid is not exhaustive, but it should cover a lot of cases.

Returns:

a generator that yields tuples of all possible combinations as stated above, each tuple is defined as: (descriptor_calculator, split, feature_standardizer, feature_filters, data_filters)

Return type:

grid

classmethod getDefaultCalculatorCombo()

Makes a list of default descriptor calculators that can be used in tests.

It creates one calculator with only Morgan fingerprints, one with only RDKit descriptors, and one that combines both to test behaviour with multiple descriptor sets. Override this method if you want to test with other descriptor sets and calculator combinations.

Returns:

list of created DescriptorCalculator objects

Return type:

list

static getDefaultPrep(add_imputer=None)

Return a dictionary with default preparation settings.

classmethod getPrepCombos()

Return a list of all possible preparation combinations as generated by getDataPrepGrid as well as their names. The generated list can be used to parameterize tests with the given named combinations.

Returns:

list of lists, one for each named preparation combination

Return type:

list

getSmallDF()

Get a small data frame for testing purposes.

Returns:

a pandas.DataFrame containing the dataset

Return type:

pd.DataFrame

getStorage(df, name, n_jobs=1, chunk_size=None)
id()
longMessage = True
maxDiff = 640
run(result=None)
setUp()[source]

Hook method for setting up the test fixture before exercising it.

classmethod setUpClass()

Hook method for setting up class fixture before running tests in the class.

setUpPaths()

Create the directories that are used for testing.

shortDescription()

Returns a one-line description of the test, or None if no description has been provided.

The default implementation of this method returns the first line of the specified test method’s docstring.

skipTest(reason)

Skip this test.

subTest(msg=<object object>, **params)

Return a context manager that will run the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.

tearDown()

Remove all files and directories that are used for testing.

classmethod tearDownClass()

Hook method for deconstructing the class fixture after running all tests in the class.

testPropSearch()[source]
testSMARTS()[source]
validateSearch(dataset: QSPRDataSet, result: QSPRDataSet, name: str)[source]

Validate the results of a search.

class qsprpred.data.tables.tests.TestTargetImputation(methodName='runTest')[source]

Bases: PathMixIn, QSPRTestCase

Small tests to only check if the target imputation works on its own.

Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.

classmethod addClassCleanup(function, /, *args, **kwargs)

Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).

addCleanup(function, /, *args, **kwargs)

Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.

Cleanup items are called even if setUp fails (unlike tearDown).

addTypeEqualityFunc(typeobj, function)

Add a type specific assertEqual style function to compare a type.

This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.

Parameters:
  • typeobj – The data type to call this function on when both values are of the same type in assertEqual().

  • function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.
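A minimal sketch of registering such a function is shown below. The Point dataclass, the assertPointEqual method, and the message format are assumptions made for this example.

```python
import unittest
from dataclasses import dataclass


@dataclass
class Point:
    """Hypothetical value type used only for this illustration."""
    x: float
    y: float


class TypeEqualityDemo(unittest.TestCase):
    def setUp(self):
        # Used by assertEqual whenever both arguments are exactly of type Point.
        self.addTypeEqualityFunc(Point, self.assertPointEqual)

    def assertPointEqual(self, first, second, msg=None):
        if first != second:
            standard = (
                f"Points differ: dx={second.x - first.x}, dy={second.y - first.y}"
            )
            self.fail(msg or standard)

    def test_custom_message(self):
        with self.assertRaises(AssertionError) as cm:
            self.assertEqual(Point(0, 0), Point(1, 2))
        self.assertIn("Points differ: dx=1, dy=2", str(cm.exception))

    def test_equal_points_pass(self):
        self.assertEqual(Point(1, 2), Point(1, 2))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TypeEqualityDemo)
result = unittest.TestResult()
suite.run(result)
print("all passed:", result.wasSuccessful())
```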

assertAlmostEqual(first, second, places=None, msg=None, delta=None)

Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.

Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).

If the two objects compare equal then they will automatically compare almost equal.

assertCountEqual(first, second, msg=None)

Asserts that two iterables have the same elements, the same number of times, without regard to order.

self.assertEqual(Counter(list(first)), Counter(list(second)))

Example:
  • [0, 1, 1] and [1, 0, 1] compare equal.

  • [0, 0, 1] and [0, 1] compare unequal.
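The two examples above can be run directly, as in this sketch. The class name is hypothetical; the last test additionally shows that unhashable elements such as lists are accepted.

```python
import unittest


class CountEqualDemo(unittest.TestCase):
    def test_same_elements_any_order(self):
        self.assertCountEqual([0, 1, 1], [1, 0, 1])

    def test_different_counts_fail(self):
        with self.assertRaises(AssertionError):
            self.assertCountEqual([0, 0, 1], [0, 1])

    def test_unhashable_elements(self):
        # Lists cannot go into a Counter, but the assertion still handles them.
        self.assertCountEqual([[1], [2]], [[2], [1]])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(CountEqualDemo)
result = unittest.TestResult()
suite.run(result)
print("all passed:", result.wasSuccessful())
```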

assertDictEqual(d1, d2, msg=None)
assertEndsWith(s, suffix, msg=None)
assertEqual(first, second, msg=None)

Fail if the two objects are unequal as determined by the ‘==’ operator.

assertFalse(expr, msg=None)

Check that the expression is false.

assertGreater(a, b, msg=None)

Just like self.assertTrue(a > b), but with a nicer default message.

assertGreaterEqual(a, b, msg=None)

Just like self.assertTrue(a >= b), but with a nicer default message.

assertHasAttr(obj, name, msg=None)
assertIn(member, container, msg=None)

Just like self.assertTrue(a in b), but with a nicer default message.

assertIs(expr1, expr2, msg=None)

Just like self.assertTrue(a is b), but with a nicer default message.

assertIsInstance(obj, cls, msg=None)

Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.

assertIsNone(obj, msg=None)

Same as self.assertTrue(obj is None), with a nicer default message.

assertIsNot(expr1, expr2, msg=None)

Just like self.assertTrue(a is not b), but with a nicer default message.

assertIsNotNone(obj, msg=None)

Included for symmetry with assertIsNone.

assertIsSubclass(cls, superclass, msg=None)
assertLess(a, b, msg=None)

Just like self.assertTrue(a < b), but with a nicer default message.

assertLessEqual(a, b, msg=None)

Just like self.assertTrue(a <= b), but with a nicer default message.

assertListEqual(list1, list2, msg=None)

A list-specific equality assertion.

Parameters:
  • list1 – The first list to compare.

  • list2 – The second list to compare.

  • msg – Optional message to use on failure instead of a list of differences.

assertLogs(logger=None, level=None)

Fail unless a log message of level level or higher is emitted on logger or its children. If omitted, level defaults to INFO and logger defaults to the root logger.

This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.

Example:

with self.assertLogs('foo', level='INFO') as cm:
    logging.getLogger('foo').info('first message')
    logging.getLogger('foo.bar').error('second message')
self.assertEqual(cm.output, ['INFO:foo:first message',
                             'ERROR:foo.bar:second message'])
assertMultiLineEqual(first, second, msg=None)

Assert that two multi-line strings are equal.

assertNoLogs(logger=None, level=None)

Fail unless no log messages of level level or higher are emitted on logger or its children.

This method must be used as a context manager.

assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)

Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.

Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).

Objects that are equal automatically fail.

assertNotEndsWith(s, suffix, msg=None)
assertNotEqual(first, second, msg=None)

Fail if the two objects are equal as determined by the ‘!=’ operator.

assertNotHasAttr(obj, name, msg=None)
assertNotIn(member, container, msg=None)

Just like self.assertTrue(a not in b), but with a nicer default message.

assertNotIsInstance(obj, cls, msg=None)

Included for symmetry with assertIsInstance.

assertNotIsSubclass(cls, superclass, msg=None)
assertNotRegex(text, unexpected_regex, msg=None)

Fail the test if the text matches the regular expression.

assertNotStartsWith(s, prefix, msg=None)
assertRaises(expected_exception, *args, **kwargs)

Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.

If called with the callable and arguments omitted, will return a context object used like this:

with self.assertRaises(SomeException):
    do_something()

An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.

The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:

with self.assertRaises(SomeException) as cm:
    do_something()
the_exception = cm.exception
self.assertEqual(the_exception.error_code, 3)
assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)

Asserts that the message in a raised exception matches a regex.

Parameters:
  • expected_exception – Exception class expected to be raised.

  • expected_regex – Regex (re.Pattern object or string) expected to be found in error message.

  • args – Function to be called and extra positional args.

  • kwargs – Extra kwargs.

  • msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.

assertRegex(text, expected_regex, msg=None)

Fail the test unless the text matches the regular expression.

assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)

An equality assertion for ordered sequences (like lists and tuples).

For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.

Parameters:
  • seq1 – The first sequence to compare.

  • seq2 – The second sequence to compare.

  • seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.

  • msg – Optional message to use on failure instead of a list of differences.
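The effect of seq_type can be sketched as follows: without it, a list and a tuple with the same items compare equal; with it, the types are enforced. The class name is hypothetical.

```python
import unittest


class SequenceEqualDemo(unittest.TestCase):
    def test_list_vs_tuple(self):
        # No seq_type given: only the items and their order matter.
        self.assertSequenceEqual([1, 2, 3], (1, 2, 3))

    def test_seq_type_enforced(self):
        # seq_type=list rejects the tuple even though the items match.
        with self.assertRaises(AssertionError):
            self.assertSequenceEqual([1, 2, 3], (1, 2, 3), seq_type=list)

    def test_order_matters(self):
        with self.assertRaises(AssertionError):
            self.assertSequenceEqual([1, 2, 3], [3, 2, 1])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(SequenceEqualDemo)
result = unittest.TestResult()
suite.run(result)
print("all passed:", result.wasSuccessful())
```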

assertSetEqual(set1, set2, msg=None)

A set-specific equality assertion.

Parameters:
  • set1 – The first set to compare.

  • set2 – The second set to compare.

  • msg – Optional message to use on failure instead of a list of differences.

assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).

assertStartsWith(s, prefix, msg=None)
assertTrue(expr, msg=None)

Check that the expression is true.

assertTupleEqual(tuple1, tuple2, msg=None)

A tuple-specific equality assertion.

Parameters:
  • tuple1 – The first tuple to compare.

  • tuple2 – The second tuple to compare.

  • msg – Optional message to use on failure instead of a list of differences.

assertWarns(expected_warning, *args, **kwargs)

Fail unless a warning of class expected_warning is triggered by the callable when invoked with specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.

If called with the callable and arguments omitted, will return a context object used like this:

with self.assertWarns(SomeWarning):
    do_something()

An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.

The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:

with self.assertWarns(SomeWarning) as cm:
    do_something()
the_warning = cm.warning
self.assertEqual(the_warning.some_attribute, 147)
assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)

Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.

Parameters:
  • expected_warning – Warning class expected to be triggered.

  • expected_regex – Regex (re.Pattern object or string) expected to be found in the warning message.

  • args – Function to be called and extra positional args.

  • kwargs – Extra kwargs.

  • msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.

clearGenerated()

Remove the directories that are used for testing.

countTestCases()
debug()

Run the test without collecting errors in a TestResult.

defaultTestResult()
classmethod doClassCleanups()

Execute all class cleanup functions. Normally called for you after tearDownClass.

doCleanups()

Execute all cleanup functions. Normally called for you after tearDown.

classmethod enterClassContext(cm)

Same as enterContext, but class-wide.

enterContext(cm)

Enters the supplied context manager.

If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.

fail(msg=None)

Fail immediately, with the given message.

failureException

alias of AssertionError

id()
longMessage = True
maxDiff = 640
run(result=None)
setUp()[source]

Set up the test Dataframe.

classmethod setUpClass()

Hook method for setting up class fixture before running tests in the class.

setUpPaths()

Create the directories that are used for testing.

shortDescription()

Returns a one-line description of the test, or None if no description has been provided.

The default implementation of this method returns the first line of the specified test method’s docstring.

skipTest(reason)

Skip this test.

subTest(msg=<object object>, **params)

Return a context manager that will run the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.

tearDown()

Remove all files and directories that are used for testing.

classmethod tearDownClass()

Hook method for deconstructing the class fixture after running all tests in the class.

class qsprpred.data.tables.tests.TestTargetSpec(methodName='runTest')[source]

Bases: QSPRTestCase

Test the TargetSpec class.

Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.

classmethod addClassCleanup(function, /, *args, **kwargs)

Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).

addCleanup(function, /, *args, **kwargs)

Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.

Cleanup items are called even if setUp fails (unlike tearDown).

addTypeEqualityFunc(typeobj, function)

Add a type specific assertEqual style function to compare a type.

This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.

Parameters:
  • typeobj – The data type to call this function on when both values are of the same type in assertEqual().

  • function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.

assertAlmostEqual(first, second, places=None, msg=None, delta=None)

Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.

Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).

If the two objects compare equal then they will automatically compare almost equal.
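The note about decimal places versus significant digits can be illustrated with a short sketch. The class name and the sample values are chosen for the example only.

```python
import unittest


class AlmostEqualDemo(unittest.TestCase):
    def test_floating_point_sum(self):
        # Exact comparison fails because of binary rounding error.
        self.assertNotEqual(0.1 + 0.2, 0.3)
        # The difference rounds to 0 at 7 decimal places.
        self.assertAlmostEqual(0.1 + 0.2, 0.3)

    def test_places_are_not_significant_digits(self):
        # Both values agree to 7 significant digits (123456.7) ...
        a, b = 123456.71, 123456.74
        # ... but differ by about 0.03, which is not 0 at 7 decimal places.
        with self.assertRaises(AssertionError):
            self.assertAlmostEqual(a, b)
        # Loosening places, or using an absolute delta, accepts them.
        self.assertAlmostEqual(a, b, places=1)
        self.assertAlmostEqual(a, b, delta=0.05)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(AlmostEqualDemo)
result = unittest.TestResult()
suite.run(result)
print("all passed:", result.wasSuccessful())
```

For values of large magnitude, delta is often the clearer choice because it states the tolerated absolute difference directly.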

assertCountEqual(first, second, msg=None)

Asserts that two iterables have the same elements, the same number of times, without regard to order.

self.assertEqual(Counter(list(first)), Counter(list(second)))

Example:
  • [0, 1, 1] and [1, 0, 1] compare equal.

  • [0, 0, 1] and [0, 1] compare unequal.

assertDictEqual(d1, d2, msg=None)
assertEndsWith(s, suffix, msg=None)
assertEqual(first, second, msg=None)

Fail if the two objects are unequal as determined by the ‘==’ operator.

assertFalse(expr, msg=None)

Check that the expression is false.

assertGreater(a, b, msg=None)

Just like self.assertTrue(a > b), but with a nicer default message.

assertGreaterEqual(a, b, msg=None)

Just like self.assertTrue(a >= b), but with a nicer default message.

assertHasAttr(obj, name, msg=None)
assertIn(member, container, msg=None)

Just like self.assertTrue(a in b), but with a nicer default message.

assertIs(expr1, expr2, msg=None)

Just like self.assertTrue(a is b), but with a nicer default message.

assertIsInstance(obj, cls, msg=None)

Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.

assertIsNone(obj, msg=None)

Same as self.assertTrue(obj is None), with a nicer default message.

assertIsNot(expr1, expr2, msg=None)

Just like self.assertTrue(a is not b), but with a nicer default message.

assertIsNotNone(obj, msg=None)

Included for symmetry with assertIsNone.

assertIsSubclass(cls, superclass, msg=None)
assertLess(a, b, msg=None)

Just like self.assertTrue(a < b), but with a nicer default message.

assertLessEqual(a, b, msg=None)

Just like self.assertTrue(a <= b), but with a nicer default message.

assertListEqual(list1, list2, msg=None)

A list-specific equality assertion.

Parameters:
  • list1 – The first list to compare.

  • list2 – The second list to compare.

  • msg – Optional message to use on failure instead of a list of differences.

assertLogs(logger=None, level=None)

Fail unless a log message of level level or higher is emitted on logger or its children. If omitted, level defaults to INFO and logger defaults to the root logger.

This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.

Example:

with self.assertLogs('foo', level='INFO') as cm:
    logging.getLogger('foo').info('first message')
    logging.getLogger('foo.bar').error('second message')
self.assertEqual(cm.output, ['INFO:foo:first message',
                             'ERROR:foo.bar:second message'])
assertMultiLineEqual(first, second, msg=None)

Assert that two multi-line strings are equal.

assertNoLogs(logger=None, level=None)

Fail unless no log messages of level level or higher are emitted on logger or its children.

This method must be used as a context manager.

assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)

Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.

Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).

Objects that are equal automatically fail.

assertNotEndsWith(s, suffix, msg=None)
assertNotEqual(first, second, msg=None)

Fail if the two objects are equal as determined by the ‘!=’ operator.

assertNotHasAttr(obj, name, msg=None)
assertNotIn(member, container, msg=None)

Just like self.assertTrue(a not in b), but with a nicer default message.

assertNotIsInstance(obj, cls, msg=None)

Included for symmetry with assertIsInstance.

assertNotIsSubclass(cls, superclass, msg=None)
assertNotRegex(text, unexpected_regex, msg=None)

Fail the test if the text matches the regular expression.

assertNotStartsWith(s, prefix, msg=None)
assertRaises(expected_exception, *args, **kwargs)

Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.

If called with the callable and arguments omitted, will return a context object used like this:

with self.assertRaises(SomeException):
    do_something()

An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.

The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:

with self.assertRaises(SomeException) as cm:
    do_something()
the_exception = cm.exception
self.assertEqual(the_exception.error_code, 3)
assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)

Asserts that the message in a raised exception matches a regex.

Parameters:
  • expected_exception – Exception class expected to be raised.

  • expected_regex – Regex (re.Pattern object or string) expected to be found in error message.

  • args – Function to be called and extra positional args.

  • kwargs – Extra kwargs.

  • msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.

assertRegex(text, expected_regex, msg=None)

Fail the test unless the text matches the regular expression.

assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)

An equality assertion for ordered sequences (like lists and tuples).

For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.

Parameters:
  • seq1 – The first sequence to compare.

  • seq2 – The second sequence to compare.

  • seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.

  • msg – Optional message to use on failure instead of a list of differences.

assertSetEqual(set1, set2, msg=None)

A set-specific equality assertion.

Parameters:
  • set1 – The first set to compare.

  • set2 – The second set to compare.

  • msg – Optional message to use on failure instead of a list of differences.

assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).

assertStartsWith(s, prefix, msg=None)
assertTrue(expr, msg=None)

Check that the expression is true.

assertTupleEqual(tuple1, tuple2, msg=None)

A tuple-specific equality assertion.

Parameters:
  • tuple1 – The first tuple to compare.

  • tuple2 – The second tuple to compare.

  • msg – Optional message to use on failure instead of a list of differences.

assertWarns(expected_warning, *args, **kwargs)

Fail unless a warning of class expected_warning is triggered by the callable when invoked with specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.

If called with the callable and arguments omitted, will return a context object used like this:

with self.assertWarns(SomeWarning):
    do_something()

An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.

The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:

with self.assertWarns(SomeWarning) as cm:
    do_something()
the_warning = cm.warning
self.assertEqual(the_warning.some_attribute, 147)
assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)

Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.

Parameters:
  • expected_warning – Warning class expected to be triggered.

  • expected_regex – Regex (re.Pattern object or string) expected to be found in the warning message.

  • args – Function to be called and extra positional args.

  • kwargs – Extra kwargs.

  • msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.
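
The sketch below illustrates the matching rule described above: a warning counts only if both its class and its message match. The function load_legacy and the file name are hypothetical and exist only for this example.

```python
import unittest
import warnings


def load_legacy(path):
    """Hypothetical loader that emits a deprecation warning."""
    warnings.warn(f"format of {path} is deprecated", DeprecationWarning)
    return path


class WarningTest(unittest.TestCase):
    def test_class_and_message_match(self):
        with self.assertWarnsRegex(DeprecationWarning, r"is deprecated") as cm:
            load_legacy("old.csv")
        # The first matching warning is kept on the context object.
        self.assertIn("old.csv", str(cm.warning))

    def test_message_mismatch_fails(self):
        # Right class, wrong message: the assertion fails.
        with self.assertRaises(AssertionError):
            with self.assertWarnsRegex(DeprecationWarning, r"was removed"):
                load_legacy("old.csv")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(WarningTest)
result = unittest.TestResult()
suite.run(result)
```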

checkTargetSpec(target_spec, name, task, th, n_classes=None)[source]
countTestCases()
debug()

Run the test without collecting errors in a TestResult.

defaultTestResult()
classmethod doClassCleanups()

Execute all class cleanup functions. Normally called for you after tearDownClass.

doCleanups()

Execute all cleanup functions. Normally called for you after tearDown.

classmethod enterClassContext(cm)

Same as enterContext, but class-wide.

enterContext(cm)

Enters the supplied context manager.

If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.

fail(msg=None)

Fail immediately, with the given message.

failureException

alias of AssertionError

id()
longMessage = True
maxDiff = 640
run(result=None)
setUp()

Hook method for setting up the test fixture before exercising it.

classmethod setUpClass()

Hook method for setting up class fixture before running tests in the class.

shortDescription()

Returns a one-line description of the test, or None if no description has been provided.

The default implementation of this method returns the first line of the specified test method’s docstring.
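
A short sketch of the default behaviour, using a made-up test class. Only the first docstring line is returned, and a method without a docstring yields None.

```python
import unittest


class DescribedTest(unittest.TestCase):
    def test_with_docstring(self):
        """Check the first line only.

        Later lines are not part of the short description.
        """

    def test_without_docstring(self):
        pass


# Instantiate the test cases directly by method name.
with_doc = DescribedTest("test_with_docstring").shortDescription()
without_doc = DescribedTest("test_without_docstring").shortDescription()
```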

skipTest(reason)

Skip this test.
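
As a sketch, skipTest can be called from inside a test body once a precondition turns out to be unmet. The flag optional_backend_available is a stand-in invented for this example.

```python
import unittest


class OptionalFeatureTest(unittest.TestCase):
    def test_optional_backend(self):
        optional_backend_available = False  # hypothetical precondition
        if not optional_backend_available:
            # Stops the test here and records it as skipped.
            self.skipTest("optional backend not installed")
        self.fail("not reached when the test is skipped")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(OptionalFeatureTest)
result = unittest.TestResult()
suite.run(result)

# Each entry in result.skipped is a (test, reason) pair.
skip_reasons = [reason for _test, reason in result.skipped]
```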

subTest(msg=<object object>, **params)

Return a context manager that will run the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.

tearDown()

Hook method for deconstructing the test fixture after testing it.

classmethod tearDownClass()

Hook method for deconstructing the class fixture after running all tests in the class.

testInit()[source]

Check the TargetSpec class on target spec creation.

testSerialization = None
testSerialization_0(**kw)
testSerialization_1(**kw)

Module contents