qsprpred.data.tables package
Submodules
qsprpred.data.tables.base module
- class qsprpred.data.tables.base.DataSetDependant(dataset: MoleculeDataTable | None = None)[source]
Bases:
object
Classes that need a data set to operate have to implement this.
- getDataSet()[source]
Get the data set attached to this object.
- Raises:
ValueError – If no data set is attached to this object.
- setDataSet(dataset: MoleculeDataTable)[source]
- class qsprpred.data.tables.base.DataTable[source]
Bases:
StoredTable
- abstract apply(func: callable, on_props: list[str] | None = None, func_args: list | None = None, func_kwargs: dict | None = None)[source]
Apply a function on all or selected properties. The properties are supplied as the first positional argument to the function.
- abstract clearFiles()
Delete the files associated with the table.
- abstract filter(table_filters: list[Callable])[source]
Filter the dataset.
- Parameters:
table_filters (List[Callable]) – The filters to apply.
- abstract static fromFile(filename: str) StoredTable
Load a StoredTable object from a file.
- Parameters:
filename (str) – The name of the file to load the object from.
- Returns:
The StoredTable object itself.
- abstract getSubset(prefix: str)[source]
Get a subset of the dataset.
- Parameters:
prefix (str) – The prefix of the subset.
- abstract reload()
Reload the table from a file.
- abstract removeProperty(name: str)[source]
Remove a property from the dataset.
- Parameters:
name (str) – The name of the property.
- abstract save()
Save the table to a file.
- class qsprpred.data.tables.base.MoleculeDataTable[source]
Bases:
DataTable
- abstract addDescriptors(descriptors: DescriptorSet, *args, **kwargs)[source]
Add descriptors to the dataset.
- Parameters:
descriptors (list[DescriptorSet]) – The descriptors to add.
args – Additional positional arguments to be passed to each descriptor set.
kwargs – Additional keyword arguments to be passed to each descriptor set.
- abstract apply(func: callable, on_props: list[str] | None = None, func_args: list | None = None, func_kwargs: dict | None = None)
Apply a function on all or selected properties. The properties are supplied as the first positional argument to the function.
- abstract clearFiles()
Delete the files associated with the table.
- abstract filter(table_filters: list[Callable])
Filter the dataset.
- Parameters:
table_filters (List[Callable]) – The filters to apply.
- abstract static fromFile(filename: str) StoredTable
Load a StoredTable object from a file.
- Parameters:
filename (str) – The name of the file to load the object from.
- Returns:
The StoredTable object itself.
- abstract getDescriptorNames() list[str] [source]
Get the names of the descriptors that are currently in the dataset.
- Returns:
a list of descriptor names
- abstract getDescriptors() DataFrame [source]
Get the table of descriptors that are currently in the dataset.
- Returns:
a pd.DataFrame with the descriptors
- abstract getProperties()
Get the property names contained in the dataset.
- abstract getSubset(prefix: str)
Get a subset of the dataset.
- Parameters:
prefix (str) – The prefix of the subset.
- abstract reload()
Reload the table from a file.
- abstract removeProperty(name: str)
Remove a property from the dataset.
- Parameters:
name (str) – The name of the property.
- abstract save()
Save the table to a file.
- class qsprpred.data.tables.base.StoredTable[source]
Bases:
ABC
Abstract base class for tables that are stored in a file.
- abstract static fromFile(filename: str) StoredTable [source]
Load a StoredTable object from a file.
- Parameters:
filename (str) – The name of the file to load the object from.
- Returns:
The StoredTable object itself.
qsprpred.data.tables.mol module
- class qsprpred.data.tables.mol.DescriptorTable(calculator: DescriptorSet, name: str, df: DataFrame | None = None, store_dir: str = '.', overwrite: bool = False, key_cols: list | None = None, n_jobs: int = 1, chunk_size: int = 1000, autoindex_name: str = 'QSPRID', random_state: int | None = None, store_format: str = 'pkl')[source]
Bases:
PandasDataTable
Pandas table that holds descriptor data for modelling and other analyses.
- Variables:
calculator (DescriptorSet) – DescriptorSet used for descriptor calculation.
Initialize a DescriptorTable object.
- Parameters:
calculator (DescriptorSet) – DescriptorSet used for descriptor calculation.
name (str) – Name of the new descriptor table.
df (pd.DataFrame) – Data frame containing the descriptors. If you provide a dataframe for a dataset that already exists on disk, the dataframe from disk will override the supplied data frame. Set ‘overwrite’ to True to override the data frame on disk.
store_dir (str) – Directory to store the dataset files. Defaults to the current directory. If it already contains files with the same name, the existing data will be loaded.
overwrite (bool) – Overwrite existing dataset.
key_cols (list) – list of columns to use as index. If None, the index will be a custom generated ID.
n_jobs (int) – Number of jobs to use for parallel processing. If <= 0, all available cores will be used.
chunk_size (int) – Size of chunks to use per job in parallel processing.
autoindex_name (str) – Column name to use for automatically generated IDs.
random_state (int) – Random state to use for shuffling and other random ops.
store_format (str) – Format to use for storing the data (‘pkl’ or ‘csv’).
- apply(func: Callable[[dict[str, list[Any]] | DataFrame, ...], Any], func_args: tuple[Any] | None = None, func_kwargs: dict[str, Any] | None = None, on_props: list[str] | None = None, as_df: bool = False, chunk_size: int | None = None, n_jobs: int | None = None) Generator
Apply a function to the data frame. The properties of the data set are passed as the first positional argument to the function. This will be a dictionary of the form {'prop1': [...], 'prop2': [...], ...}. If as_df is True, the properties will be passed as a data frame instead.
Any additional arguments specified in func_args and func_kwargs will be passed to the function after the properties as positional and keyword arguments, respectively.
If on_props is specified, only the properties in this list will be passed to the function. If on_props is None, all properties will be passed to the function.
- Parameters:
func (Callable) – Function to apply to the data frame.
func_args (list) – Positional arguments to pass to the function.
func_kwargs (dict) – Keyword arguments to pass to the function.
on_props (list[str]) – List of properties to send to the function as arguments.
as_df (bool) – If True, the function is applied to chunks represented as data frames.
chunk_size (int) – Size of chunks to use per job in parallel processing. If None, the chunk size will be set to self.chunkSize. The chunk size will always be set to the number of rows in the data frame if n_jobs or self.nJobs is 1.
n_jobs (int) – Number of jobs to use for parallel processing. If None, self.nJobs is used.
- Returns:
Generator that yields the results of the function applied to each chunk of the data frame, as determined by chunk_size and n_jobs. Each item in the generator is the result of the function applied to one chunk of the data set.
- Return type:
Generator
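The chunked apply pattern described above can be sketched with plain pandas. This is a conceptual stand-in, not the qsprpred implementation; the function apply_chunked and its defaults are hypothetical:

```python
import pandas as pd

# Conceptual sketch of the chunked apply contract: properties are passed
# either as a dict of lists or, with as_df=True, as a data frame chunk.
def apply_chunked(df, func, on_props=None, as_df=False, chunk_size=2):
    props = list(df.columns) if on_props is None else on_props
    for start in range(0, len(df), chunk_size):
        chunk = df.iloc[start:start + chunk_size][props]
        # pass a dict of lists by default, a data frame if as_df=True
        yield func(chunk if as_df else {p: chunk[p].tolist() for p in props})

df = pd.DataFrame({"prop1": [1, 2, 3, 4], "prop2": [10, 20, 30, 40]})
# Sum 'prop1' within each chunk of two rows.
results = list(apply_chunked(df, lambda d: sum(d["prop1"]), on_props=["prop1"]))
# results == [3, 7]
```

Each yielded item corresponds to one chunk of the data, mirroring how the generator returned by apply produces one result per chunk.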
- clearFiles()
Remove all files associated with this data set from disk.
- dropEmptyProperties(names: list[str])
Drop rows with empty target property value from the data set.
- filter(table_filters: list[Callable])
Filter the data frame using a list of filters.
Each filter is a function that takes the data frame and returns a new data frame with the filtered rows. The new data frame is then used as the input for the next filter. The final data frame is saved as the new data frame of the MoleculeTable.
- classmethod fromFile(filename: str) PandasDataTable
Load a StoredTable object from a file.
- Parameters:
filename (str) – The name of the file to load the object from.
- Returns:
The StoredTable object itself.
- generateIndex(name: str | None = None, prefix: str | None = None)
Generate a custom index for the data frame automatically.
- getDF()
Get the data frame this instance manages.
- Returns:
The data frame this instance manages.
- Return type:
pd.DataFrame
- getDescriptorNames(active_only=True)[source]
Get the names of the descriptors represented by this table. By default, only active descriptors are returned. Use active_only=False to get all descriptors saved in the table.
- getProperties() list[str]
Get names of all properties/variables saved in the data frame (all columns).
- Returns:
list of property names.
- Return type:
list[str]
- getProperty(name: str) Series
Get property values from the data set.
- Parameters:
name (str) – Name of the property to get.
- Returns:
List of values for the property.
- Return type:
pd.Series
- getSubset(prefix: str)
Get a subset of the data set by providing a prefix for the column names or a column name directly.
- Parameters:
prefix (str) – Prefix of the column names to select.
- hasProperty(name)
Check whether a property is present in the data frame.
- imputeProperties(names: list[str], imputer: Callable)
Impute missing property values.
- Parameters:
names (list) – List of property names to impute.
imputer (Callable) – imputer object implementing the fit_transform method from the scikit-learn API.
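Any object exposing a scikit-learn style fit_transform method qualifies as an imputer here. A minimal sketch with a hypothetical mean imputer (illustrative only, not part of qsprpred):

```python
import pandas as pd

# Hypothetical imputer: fills missing values with the column mean.
# It satisfies the fit_transform contract expected by imputeProperties.
class MeanImputer:
    def fit_transform(self, X: pd.DataFrame) -> pd.DataFrame:
        return X.fillna(X.mean())

df = pd.DataFrame({"pchembl": [6.0, None, 8.0]})
imputed = MeanImputer().fit_transform(df[["pchembl"]])
# imputed["pchembl"] == [6.0, 7.0, 8.0]
```

In practice, scikit-learn's own imputers (e.g. SimpleImputer) follow the same contract.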
- iterChunks(include_props: list[str] | None = None, as_dict: bool = False, chunk_size: int | None = None) Generator[DataFrame | dict, None, None]
Batch a data frame into chunks of the given size.
- Parameters:
include_props (list[str]) – Properties to include in the chunks. If None, all properties are included.
as_dict (bool) – If True, the chunks are yielded as dictionaries instead of data frames.
chunk_size (int) – Size of the chunks. If None, self.chunkSize is used.
- Returns:
Generator that yields batches of the data frame as smaller data frames.
- Return type:
Generator[pd.DataFrame, None, None]
- keepDescriptors(descriptors: list[str]) list[str] [source]
Mark only the given descriptors as active in this set.
- Parameters:
descriptors (list) – list of descriptor names to keep
- Returns:
list of descriptor names that were kept
- Return type:
list[str]
- Raises:
ValueError – If any of the descriptors are not present in the table.
- property metaFile
The path to the meta file of this data set.
- property nJobs
- reload()
Reload the data table from disk.
- removeProperty(name)
Remove a property from the data frame.
- Parameters:
name (str) – Name of the property to delete.
- save()
Save the data frame to disk and all associated files.
- Returns:
Path to the saved data frame.
- Return type:
str
- setIndex(cols: list[str])
Create an index column from several columns of the data set. This also resets the idProp attribute to the name of the index columns joined by a ‘~’ character. The values of the columns are joined in the same way to create the index. Thus, make sure the values of the columns are unique together and can be joined into a string.
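The ‘~’ joining described above can be sketched with plain pandas (a conceptual stand-in, not the qsprpred implementation; set_joined_index is a hypothetical helper):

```python
import pandas as pd

# Sketch: join several columns with '~' to form a single index column,
# mirroring how setIndex derives idProp and the index values.
def set_joined_index(df: pd.DataFrame, cols: list[str]) -> pd.DataFrame:
    id_prop = "~".join(cols)  # e.g. "batch~mol_id"
    df = df.copy()
    df[id_prop] = df[cols].astype(str).agg("~".join, axis=1)
    return df.set_index(id_prop)

df = pd.DataFrame({"batch": ["A", "A"], "mol_id": [1, 2]})
indexed = set_joined_index(df, ["batch", "mol_id"])
# indexed.index.name == "batch~mol_id"; index values: ["A~1", "A~2"]
```

Note how non-unique column combinations would produce duplicate index values, which is why the columns must be unique together.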
- setRandomState(random_state: int)
Set the random state for this instance.
- Parameters:
random_state (int) – Random state to use for shuffling and other random operations.
- property storeDir
The data set folder containing the data set files after saving.
- property storePath
The path to the main data set file.
- property storePrefix
The prefix of the data set files.
- toFile(filename: str)
Save the metafile and all associated files to a custom location.
- Parameters:
filename (str) – absolute path to the saved metafile.
- class qsprpred.data.tables.mol.MoleculeTable(name: str, df: DataFrame | None = None, smiles_col: str = 'SMILES', add_rdkit: bool = False, store_dir: str = '.', overwrite: bool = False, n_jobs: int | None = 1, chunk_size: int | None = None, drop_invalids: bool = True, index_cols: list[str] | None = None, autoindex_name: str = 'QSPRID', random_state: int | None = None, store_format: str = 'pkl')[source]
Bases:
PandasDataTable, SearchableMolTable, Summarizable
Class that holds and prepares molecule data for modelling and other analyses.
- Variables:
smilesCol (str) – Name of the column containing the SMILES sequences of molecules.
includesRdkit (bool) – Whether the data frame contains RDKit molecules as one of the properties.
descriptors (list[DescriptorTable]) – List of DescriptorTable objects containing the descriptors calculated for this table.
Initialize a MoleculeTable object.
This object wraps a pandas dataframe and provides short-hand methods to prepare molecule data for modelling and analysis.
- Parameters:
name (str) – Name of the dataset. You can use this name to load the dataset from disk anytime and create a new instance.
df (pd.DataFrame) – Pandas dataframe containing the data. If you provide a dataframe for a dataset that already exists on disk, the dataframe from disk will override the supplied data frame. Set ‘overwrite’ to True to override the data frame on disk.
smiles_col (str) – Name of the column containing the SMILES sequences of molecules.
add_rdkit (bool) – Add RDKit molecule instances to the dataframe. WARNING: This can take a lot of memory.
store_dir (str) – Directory to store the dataset files. Defaults to the current directory. If it already contains files with the same name, the existing data will be loaded.
overwrite (bool) – Overwrite existing dataset.
n_jobs (int) – Number of jobs to use for parallel processing. If <= 0, all available cores will be used.
chunk_size (int) – Size of chunks to use per job in parallel processing.
drop_invalids (bool) – Drop invalid molecules from the data frame.
index_cols (list[str]) – list of columns to use as index. If None, the index will be a custom generated ID.
autoindex_name (str) – Column name to use for automatically generated IDs.
random_state (int) – Random state to use for shuffling and other random ops.
store_format (str) – Format to use for storing the data (‘pkl’ or ‘csv’).
- addClusters(clusters: list['MoleculeClusters'], recalculate: bool = False)[source]
Add clusters to the data frame.
A new column is created that contains the identifier of the corresponding cluster calculator.
- Parameters:
clusters (list) – List of MoleculeClusters calculators.
recalculate (bool) – Whether to recalculate clusters even if they are already present in the data frame.
- addDescriptors(descriptors: list[qsprpred.data.descriptors.sets.DescriptorSet], recalculate: bool = False, fail_on_invalid: bool = True, *args, **kwargs)[source]
Add descriptors to the data frame with the given descriptor calculators.
- Parameters:
descriptors (list[DescriptorSet]) – List of DescriptorSet objects to use for descriptor calculation.
recalculate (bool) – Whether to recalculate descriptors even if they are already present in the data frame. If False, existing descriptors are kept and no calculation takes place.
fail_on_invalid (bool) – Whether to throw an exception if any molecule is invalid.
*args – Additional positional arguments to pass to each descriptor set.
**kwargs – Additional keyword arguments to pass to each descriptor set.
- addScaffolds(scaffolds: list[qsprpred.data.chem.scaffolds.Scaffold], add_rdkit_scaffold: bool = False, recalculate: bool = False)[source]
Add scaffolds to the data frame.
A new column is created that contains the SMILES of the corresponding scaffold. If add_rdkit_scaffold is set to True, a new column is created that contains the RDKit scaffold of the corresponding molecule.
- apply(func: Callable[[dict[str, list[Any]] | DataFrame, ...], Any], func_args: tuple[Any] | None = None, func_kwargs: dict[str, Any] | None = None, on_props: list[str] | None = None, as_df: bool = False, chunk_size: int | None = None, n_jobs: int | None = None) Generator
Apply a function to the data frame. The properties of the data set are passed as the first positional argument to the function. This will be a dictionary of the form {'prop1': [...], 'prop2': [...], ...}. If as_df is True, the properties will be passed as a data frame instead.
Any additional arguments specified in func_args and func_kwargs will be passed to the function after the properties as positional and keyword arguments, respectively.
If on_props is specified, only the properties in this list will be passed to the function. If on_props is None, all properties will be passed to the function.
- Parameters:
func (Callable) – Function to apply to the data frame.
func_args (list) – Positional arguments to pass to the function.
func_kwargs (dict) – Keyword arguments to pass to the function.
on_props (list[str]) – List of properties to send to the function as arguments.
as_df (bool) – If True, the function is applied to chunks represented as data frames.
chunk_size (int) – Size of chunks to use per job in parallel processing. If None, the chunk size will be set to self.chunkSize. The chunk size will always be set to the number of rows in the data frame if n_jobs or self.nJobs is 1.
n_jobs (int) – Number of jobs to use for parallel processing. If None, self.nJobs is used.
- Returns:
Generator that yields the results of the function applied to each chunk of the data frame, as determined by chunk_size and n_jobs. Each item in the generator is the result of the function applied to one chunk of the data set.
- Return type:
Generator
- attachDescriptors(calculator: DescriptorSet, descriptors: DataFrame, index_cols: list)[source]
Attach descriptors to the data frame.
- Parameters:
calculator (DescriptorsCalculator) – DescriptorsCalculator object to use for descriptor calculation.
descriptors (pd.DataFrame) – DataFrame containing the descriptors to attach.
index_cols (list) – List of column names to use as index.
- checkMols(throw: bool = True)[source]
Returns a boolean array indicating whether each molecule is valid. If throw is True, an exception is thrown if any molecule is invalid.
- Parameters:
throw (bool) – Whether to throw an exception if any molecule is invalid.
- Returns:
Boolean series indicating whether each molecule is valid.
- Return type:
mask (pd.Series)
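The mask-or-raise contract of checkMols can be sketched as follows. This is an illustrative stand-in, not the qsprpred code, and is_valid_smiles is a deliberately naive placeholder (a real implementation would parse with RDKit):

```python
import pandas as pd

def is_valid_smiles(smiles: str) -> bool:
    # Placeholder validity check for illustration only.
    return isinstance(smiles, str) and len(smiles) > 0

# Sketch of the checkMols contract: return a boolean mask over molecules,
# or raise if throw=True and any molecule is invalid.
def check_mols(df: pd.DataFrame, smiles_col: str = "SMILES",
               throw: bool = True) -> pd.Series:
    mask = df[smiles_col].apply(is_valid_smiles)
    if throw and not mask.all():
        raise ValueError(f"Invalid molecules found: {list(df.index[~mask])}")
    return mask

df = pd.DataFrame({"SMILES": ["CCO", "", "c1ccccc1"]}, index=["a", "b", "c"])
mask = check_mols(df, throw=False)
# mask is a boolean pd.Series: a=True, b=False, c=True
```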
- clearFiles()
Remove all files associated with this data set from disk.
- createScaffoldGroups(mols_per_group: int = 10)[source]
Create scaffold groups.
A scaffold group is a list of molecules that share the same scaffold. New columns are created that contain the scaffold group ID and the scaffold group size.
- Parameters:
mols_per_group (int) – number of molecules per scaffold group.
- property descriptorSets
Get the descriptor calculators for this table.
- dropDescriptorSets(descriptors: list[qsprpred.data.descriptors.sets.DescriptorSet | str], full_removal: bool = False)[source]
Drop descriptors from the given sets from the data frame.
- Parameters:
descriptors (list[DescriptorSet | str]) – List of DescriptorSet objects or their names. The name of a descriptor set corresponds to the result returned by its __str__ method.
full_removal (bool) – Whether to remove the descriptor data (full removal). By default, a soft removal is performed by just rendering the descriptors inactive. A full removal removes the descriptor set from the dataset, including the saved files. It is not possible to restore a descriptor set after a full removal.
- dropDescriptors(descriptors: list[str])[source]
Drop descriptors by name. Performs a simple feature selection by removing the given descriptor names from the data set.
- dropEmptyProperties(names: list[str])
Drop rows with empty target property value from the data set.
- dropInvalids()[source]
Drops invalid molecules from the data set.
- Returns:
Boolean mask of invalid molecules in the original data set.
- Return type:
mask (pd.Series)
- filter(table_filters: list[Callable])
Filter the data frame using a list of filters.
Each filter is a function that takes the data frame and returns a a new data frame with the filtered rows. The new data frame is then used as the input for the next filter. The final data frame is saved as the new data frame of the
MoleculeTable
.
- classmethod fromFile(filename: str) PandasDataTable
Load a StoredTable object from a file.
- Parameters:
filename (str) – The name of the file to load the object from.
- Returns:
The StoredTable object itself.
- static fromSDF(name, filename, smiles_prop, *args, **kwargs)[source]
Create a MoleculeTable instance from an SDF file.
- Parameters:
name (str) – Name of the data set.
filename (str) – Path to the SDF file.
smiles_prop (str) – Name of the property in the SDF file containing the SMILES sequence.
*args – Additional arguments to pass to the MoleculeTable constructor.
**kwargs – Additional keyword arguments to pass to the MoleculeTable constructor.
- static fromSMILES(name: str, smiles: list, *args, **kwargs)[source]
Create a MoleculeTable instance from a list of SMILES sequences.
- Parameters:
name (str) – Name of the data set.
smiles (list) – list of SMILES sequences.
*args – Additional arguments to pass to the MoleculeTable constructor.
**kwargs – Additional keyword arguments to pass to the MoleculeTable constructor.
- static fromTableFile(name: str, filename: str, sep='\t', *args, **kwargs)[source]
Create a MoleculeTable instance from a file containing a table of molecules (i.e. a CSV file).
- Parameters:
name (str) – Name of the data set.
filename (str) – Path to the file containing the table.
sep (str) – Separator used in the file for different columns.
*args – Additional arguments to pass to the MoleculeTable constructor.
**kwargs – Additional keyword arguments to pass to the MoleculeTable constructor.
- generateDescriptorDataSetName(ds_set: str | DescriptorSet)[source]
Generate a descriptor set name from a descriptor set.
- generateIndex(name: str | None = None, prefix: str | None = None)
Generate a custom index for the data frame automatically.
- getClusterNames(clusters: list['MoleculeClusters'] | None = None)[source]
Get the names of the clusters in the data frame.
- Returns:
List of cluster names.
- Return type:
list
- getClusters(clusters: list['MoleculeClusters'] | None = None)[source]
Get the subset of the data frame that contains only clusters.
- Returns:
Data frame containing only clusters.
- Return type:
pd.DataFrame
- getDF()
Get the data frame this instance manages.
- Returns:
The data frame this instance manages.
- Return type:
pd.DataFrame
- getDescriptorNames()[source]
Get the names of the descriptors present for molecules in this data set.
- Returns:
list of descriptor names.
- Return type:
list
- getDescriptors(active_only=False)[source]
Get the calculated descriptors as a pandas data frame.
- Returns:
Data frame containing only descriptors.
- Return type:
pd.DataFrame
- getProperties() list[str]
Get names of all properties/variables saved in the data frame (all columns).
- Returns:
list of property names.
- Return type:
list[str]
- getProperty(name: str) Series
Get property values from the data set.
- Parameters:
name (str) – Name of the property to get.
- Returns:
List of values for the property.
- Return type:
pd.Series
- getScaffoldGroups(scaffold_name: str, mol_per_group: int = 10)[source]
Get the scaffold groups for a given combination of scaffold and number of molecules per scaffold group.
- getScaffoldNames(scaffolds: list[qsprpred.data.chem.scaffolds.Scaffold] | None = None, include_mols: bool = False)[source]
Get the names of the scaffolds in the data frame.
- getScaffolds(scaffolds: list[qsprpred.data.chem.scaffolds.Scaffold] | None = None, include_mols: bool = False)[source]
Get the subset of the data frame that contains only scaffolds.
- Parameters:
include_mols (bool) – Whether to include the RDKit scaffold columns as well.
- Returns:
Data frame containing only scaffolds.
- Return type:
pd.DataFrame
- getSubset(prefix: str)
Get a subset of the data set by providing a prefix for the column names or a column name directly.
- Parameters:
prefix (str) – Prefix of the column names to select.
- getSummary()[source]
Make a summary with some statistics about the molecules in this table. The summary contains the number of molecules per target and the number of unique molecules per target.
Requires this data set to be imported from Papyrus for now.
- Returns:
A dataframe with the summary statistics.
- Return type:
(pd.DataFrame)
- property hasClusters
Check whether the data frame contains clusters.
- Returns:
Whether the data frame contains clusters.
- Return type:
bool
- hasDescriptors(descriptors: list[qsprpred.data.descriptors.sets.DescriptorSet | str] | None = None) bool | list[bool] [source]
Check whether the data frame contains given descriptors.
- Parameters:
descriptors (list) – List of DescriptorSet objects or prefixes of descriptors to check for. If None, all descriptors are checked for and a single boolean is returned if any descriptors are found.
- Returns:
List of booleans indicating whether each descriptor is present or not.
- Return type:
bool | list[bool]
- hasProperty(name)
Check whether a property is present in the data frame.
- property hasScaffoldGroups
Check whether the data frame contains scaffold groups.
- Returns:
Whether the data frame contains scaffold groups.
- Return type:
bool
- property hasScaffolds
Check whether the data frame contains scaffolds.
- Returns:
Whether the data frame contains scaffolds.
- Return type:
bool
- imputeProperties(names: list[str], imputer: Callable)
Impute missing property values.
- Parameters:
names (list) – List of property names to impute.
imputer (Callable) – imputer object implementing the fit_transform method from the scikit-learn API.
- iterChunks(include_props: list[str] | None = None, as_dict: bool = False, chunk_size: int | None = None) Generator[DataFrame | dict, None, None]
Batch a data frame into chunks of the given size.
- Parameters:
include_props (list[str]) – Properties to include in the chunks. If None, all properties are included.
as_dict (bool) – If True, the chunks are yielded as dictionaries instead of data frames.
chunk_size (int) – Size of the chunks. If None, self.chunkSize is used.
- Returns:
Generator that yields batches of the data frame as smaller data frames.
- Return type:
Generator[pd.DataFrame, None, None]
- property metaFile
The path to the meta file of this data set.
- property nJobs
- processMols(processor: MolProcessor, proc_args: tuple[Any] | None = None, proc_kwargs: dict[str, Any] | None = None, add_props: list[str] | None = None, as_rdkit: bool = False, chunk_size: int | None = None, n_jobs: int | None = None) Generator [source]
Apply a function to the molecules in the data frame. The SMILES or an RDKit molecule will be supplied as the first positional argument to the function. Additional properties to provide from the data set can be specified with ‘add_props’, which will be a dictionary supplied as an additional positional argument to the function.
IMPORTANT: For successful parallel processing, the processor must be picklable. Also note that the returned generator will produce results as soon as they are ready, which means that the chunks of data will not be in the same order as the original data frame. However, you can pass the value of idProp in add_props to identify the processed molecules. See CheckSmilesValid for an example.
- Parameters:
processor (MolProcessor) – MolProcessor object to use for processing.
proc_args (tuple, optional) – Any additional positional arguments to pass to the processor.
proc_kwargs (dict, optional) – Any additional keyword arguments to pass to the processor.
add_props (list, optional) – List of data set properties to send to the processor. If None, all properties will be sent.
as_rdkit (bool, optional) – Whether to convert the molecules to RDKit molecules before applying the processor.
chunk_size (int, optional) – Size of chunks to use per job in parallel. If not specified, self.chunkSize is used.
n_jobs (int, optional) – Number of jobs to use for parallel processing. If not specified, self.nJobs is used.
- Returns:
A generator that yields the results of the supplied processor on the chunked molecules from the data set.
- Return type:
Generator
- reload()
Reload the data table from disk.
- removeProperty(name)
Remove a property from the data frame.
- Parameters:
name (str) – Name of the property to delete.
- restoreDescriptorSets(descriptors: list[qsprpred.data.descriptors.sets.DescriptorSet | str])[source]
Restore descriptors that were previously removed.
- Parameters:
descriptors (list[DescriptorSet | str]) – List of DescriptorSet objects or their names. The name of a descriptor set corresponds to the result returned by its __str__ method.
- classmethod runMolProcess(props: dict[str, list] | DataFrame, func: MolProcessor, add_rdkit: bool, smiles_col: str, *args, **kwargs)[source]
A helper method to run a MolProcessor on a list of molecules via apply. It converts the SMILES to RDKit molecules if required and then applies the function to the MolProcessor object.
- Parameters:
props (dict) – Dictionary of properties that will be passed in addition to the molecule structure.
func (MolProcessor) – MolProcessor object to use for processing.
add_rdkit (bool) – Whether to convert the SMILES to RDKit molecules before applying the function.
smiles_col (str) – Name of the property containing the SMILES sequences.
*args – Additional positional arguments to pass to the function.
**kwargs – Additional keyword arguments to pass to the function.
- sample(n: int, name: str | None = None, random_state: int | None = None) MoleculeTable [source]
Sample n molecules from the table.
- Parameters:
n (int) – Number of molecules to sample.
name (str) – Name of the new table.
random_state (int) – Random state to use for sampling.
- Returns:
A dataframe with the sampled molecules.
- Return type:
MoleculeTable
- save()
Save the data frame to disk and all associated files.
- Returns:
Path to the saved data frame.
- Return type:
str
- searchOnProperty(prop_name: str, values: list[str], name: str | None = None, exact=False) MoleculeTable [source]
Search in this table using a property name and a list of values. It is assumed that the property is searchable with string matching. Either an exact match or a partial match can be used. If ‘exact’ is False, the search is performed with partial matching, i.e. all molecules that contain any of the given values in the property will be returned. If ‘exact’ is True, only molecules that have the exact property value for any of the given values will be returned.
- Parameters:
prop_name (str) – Name of the property to search on.
values (list[str]) – List of values to search for. If any of the values is found in the property, the molecule will be considered a match.
name (str | None, optional) – Name of the new table. Defaults to the name of the old table, plus the _searched suffix.
exact (bool, optional) – Whether to use exact matching, i.e. whether to search for exact strings or just substrings. Defaults to False.
- Returns:
A new table with the molecules from the old table with the given property values.
- Return type:
MoleculeTable
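The exact vs. partial matching behavior described above can be sketched with plain pandas (a conceptual stand-in, not the qsprpred implementation; search_on_property is a hypothetical helper):

```python
import pandas as pd

# Sketch: exact matching keeps rows whose value is in the query list;
# partial matching keeps rows containing any query value as a substring.
def search_on_property(df, prop_name, values, exact=False):
    col = df[prop_name].astype(str)
    if exact:
        mask = col.isin(values)
    else:
        mask = col.apply(lambda v: any(s in v for s in values))
    return df[mask]

df = pd.DataFrame({"Target": ["CDK2", "CDK2/CyclinE", "EGFR"]})
partial = search_on_property(df, "Target", ["CDK2"])            # first two rows
exact = search_on_property(df, "Target", ["CDK2"], exact=True)  # only "CDK2"
```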
- searchWithIndex(index: Index, name: str | None = None) MoleculeTable [source]
Search in this table using a pandas index. The return value is a new table with the molecules from the old table with the given indices.
- Parameters:
index (pd.Index) – Indices to search for in this table.
name (str) – Name of the new table. Defaults to the name of the old table, plus the _searched suffix.
- Returns:
A new table with the molecules from the old table with the given indices.
- Return type:
MoleculeTable
- searchWithSMARTS(patterns: list[str], operator: ~typing.Literal['or', 'and'] = 'or', use_chirality: bool = False, name: str | None = None, match_function: ~typing.Callable = <function match_mol_to_smarts>) MoleculeTable [source]
Search the molecules in the table with a SMARTS pattern.
- Parameters:
patterns – List of SMARTS patterns to search with.
operator (object) – Whether to use an “or” or “and” operator on patterns. Defaults to “or”.
use_chirality – Whether to use chirality in the search.
name – Name of the new table. Defaults to the name of the old table, plus the smarts_searched suffix.
match_function – Function to use for matching the molecules to the SMARTS patterns. Defaults to match_mol_to_smarts.
- Returns:
A dataframe with the molecules that match the pattern.
- Return type:
(MolTable)
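How the ‘or’/‘and’ operator combines per-pattern results can be sketched in pure Python. The substring check below is a deliberately naive stand-in for a real SMARTS matcher such as match_mol_to_smarts:

```python
# Sketch: combine per-pattern match results with any() ("or") or all() ("and").
def match_any_or_all(smiles: str, patterns: list[str], operator: str = "or") -> bool:
    hits = (pattern in smiles for pattern in patterns)  # naive stand-in match
    if operator == "or":
        return any(hits)
    if operator == "and":
        return all(hits)
    raise ValueError(f"Unknown operator: {operator}")

mols = ["CCO", "c1ccccc1O", "c1ccccc1"]
or_hits = [m for m in mols if match_any_or_all(m, ["c1ccccc1", "O"], "or")]
and_hits = [m for m in mols if match_any_or_all(m, ["c1ccccc1", "O"], "and")]
# or_hits == ["CCO", "c1ccccc1O", "c1ccccc1"]; and_hits == ["c1ccccc1O"]
```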
- setIndex(cols: list[str])
Create an index column from several columns of the data set. This also resets the idProp attribute to the name of the index columns joined by a ‘~’ character. The values of the columns are joined in the same way to create the index. Thus, make sure the values of the columns are unique together and can be joined into a string.
- setRandomState(random_state: int)
Set the random state for this instance.
- Parameters:
random_state (int) – Random state to use for shuffling and other random operations.
- property smiles: Generator[str, None, None]
Get the SMILES strings of the molecules in the data frame.
- Returns:
Generator of SMILES strings.
- Return type:
Generator[str, None, None]
- standardizeSmiles(smiles_standardizer, drop_invalid=True)[source]
Apply the smiles_standardizer to the compounds in parallel.
- Parameters:
smiles_standardizer – either None to skip the standardization, 'chembl', 'old', or a partial function that reads and standardizes smiles.
drop_invalid (bool) – whether to drop invalid SMILES from the data set. Defaults to True. If False, invalid SMILES will be retained in their original form. If self.invalidsRemoved is True, there will be no effect even if drop_invalid is True. Set self.invalidsRemoved to False on this instance to force the removal of invalid SMILES.
- Raises:
ValueError – when smiles_standardizer is not a callable or one of the predefined strings.
- property storeDir
The data set folder containing the data set files after saving.
- property storePath
The path to the main data set file.
- property storePrefix
The prefix of the data set files.
- toFile(filename: str)[source]
Save the metafile and all associated files to a custom location.
- Parameters:
filename (str) – absolute path to the saved metafile.
qsprpred.data.tables.pandas module
- class qsprpred.data.tables.pandas.PandasDataTable(name: str, df: DataFrame | None = None, store_dir: str = '.', overwrite: bool = False, index_cols: list[str] | None = None, n_jobs: int = 1, chunk_size: int | None = None, autoindex_name: str = 'QSPRID', random_state: int | None = None, store_format: str = 'pkl', parallel_generator: ParallelGenerator | None = None)[source]
Bases:
DataTable
, JSONSerializable
A Pandas DataFrame wrapper class to enable data processing functions on QSPRpred data.
- Variables:
name (str) – Name of the data set. You can use this name to load the dataset from disk anytime and create a new instance.
df (pd.DataFrame) – Pandas dataframe containing the data. You can modify this one directly, but note that removing rows, adding rows, or changing the index or other automatic properties of the data frame might break the data set. In that case, it is recommended to recreate the data set from scratch.
indexCols (List) – List of columns to use as index. If None, the index will be a custom generated ID. Note that if you specify multiple columns their values will be joined with a '~' character rather than using the default pandas multi-index.
nJobs (int) – Number of jobs to use for parallel processing. If set to None or 0, all available cores will be used.
chunkSize (int) – Size of chunks to use per job in parallel processing. This is automatically set to the number of rows in the data frame divided by nJobs. However, you can also set it manually if you want to use a different chunk size. Set to None to again use the default value determined by nJobs.
randomState (int) – Random state to use for all random operations.
idProp (str) – Column name to use for automatically generated IDs. Defaults to 'QSPRID'. If indexCols is set, this will be the names of the columns joined by '~'.
storeFormat (str) – Format to use for storing the data frame. Currently only 'pkl' and 'csv' are supported. Defaults to 'pkl' because it is faster. However, 'csv' is more portable and can be opened in other programs.
parallelGenerator (Callable) – A ParallelGenerator to use for parallel processing of chunks of data. Defaults to qsprpred.utils.parallel.MultiprocessingPoolGenerator. You can replace this with your own parallel generator function if you want to use a different parallelization strategy (e.g. utilize remote servers instead of local processes).
Initialize a PandasDataTable object.
- Parameters:
name (str) – Name of the data set. You can use this name to load the dataset from disk anytime and create a new instance.
df (pd.DataFrame) – Pandas dataframe containing the data. If you provide a dataframe for a dataset that already exists on disk, the dataframe from disk will override the supplied data frame. Set 'overwrite' to True to override the data frame on disk.
store_dir (str) – Directory to store the dataset files. Defaults to the current directory. If it already contains files with the same name, the existing data will be loaded.
overwrite (bool) – Overwrite existing dataset.
index_cols (List) – List of columns to use as index. If None, the index will be a custom generated ID.
n_jobs (int) – Number of jobs to use for parallel processing. If <= 0, all available cores will be used.
chunk_size (int) – Size of chunks to use per job in parallel processing. If None, the chunk size will be set to the number of rows in the data frame divided by n_jobs.
autoindex_name (str) – Column name to use for automatically generated IDs.
random_state (int) – Random state to use for all random operations for reproducibility. If not specified, the state is generated randomly. The state is saved upon save, so if you want to change the state later, call the setRandomState method after loading.
store_format (str) – Format to use for storing the data frame. Currently only 'pkl' and 'csv' are supported.
parallel_generator (ParallelGenerator | None) – A ParallelGenerator to use for parallel processing of chunks of data. Defaults to qsprpred.utils.parallel.MultiprocessingPoolGenerator. You can replace this with your own parallel generator function if you want to use a different parallelization strategy (e.g. utilize remote servers instead of local processes).
- apply(func: Callable[[dict[str, list[Any]] | DataFrame, ...], Any], func_args: tuple[Any] | None = None, func_kwargs: dict[str, Any] | None = None, on_props: list[str] | None = None, as_df: bool = False, chunk_size: int | None = None, n_jobs: int | None = None) Generator [source]
Apply a function to the data frame. The properties of the data set are passed as the first positional argument to the function. This will be a dictionary of the form {'prop1': [...], 'prop2': [...], ...}. If as_df is True, the properties will be passed as a data frame instead.
Any additional arguments specified in func_args and func_kwargs will be passed to the function after the properties as positional and keyword arguments, respectively.
If on_props is specified, only the properties in this list will be passed to the function. If on_props is None, all properties will be passed to the function.
- Parameters:
func (Callable) – Function to apply to the data frame.
func_args (list) – Positional arguments to pass to the function.
func_kwargs (dict) – Keyword arguments to pass to the function.
on_props (list[str]) – List of properties to send to the function as arguments.
as_df (bool) – If True, the function is applied to chunks represented as data frames.
chunk_size (int) – Size of chunks to use per job in parallel processing. If None, the chunk size will be set to self.chunkSize. The chunk size will always be set to the number of rows in the data frame if n_jobs or self.nJobs is 1.
n_jobs (int) – Number of jobs to use for parallel processing. If None, self.nJobs is used.
- Returns:
Generator that yields the results of the function applied to each chunk of the data frame as determined by chunk_size and n_jobs. Each item in the generator will be the result of the function applied to one chunk of the data set.
- Return type:
Generator
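With as_df=False, each chunk handed to the function is a plain dict of property lists. A minimal stand-in chunk and consumer (the column names here are hypothetical, chosen only for illustration):

```python
# A toy "function on properties": compute the string length of each SMILES.
def smiles_lengths(props: dict) -> list[int]:
    return [len(s) for s in props["SMILES"]]

# Shape of one chunk as described above: {'prop1': [...], 'prop2': [...], ...}
chunk = {"SMILES": ["CCO", "c1ccccc1"], "pchembl_value": [5.2, 6.1]}
result = smiles_lengths(chunk)  # [3, 8]
```

A function like this could then be passed as `func` with `on_props=["SMILES"]`, and the generator returned by apply would yield one such result per chunk.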
- dropEmptyProperties(names: list[str])[source]
Drop rows with empty target property value from the data set.
- filter(table_filters: list[Callable])[source]
Filter the data frame using a list of filters.
Each filter is a function that takes the data frame and returns a new data frame with the filtered rows. The new data frame is then used as the input for the next filter. The final data frame is saved as the new data frame of the
MoleculeTable
.
- classmethod fromFile(filename: str) PandasDataTable [source]
Load a
StoredTable
object from a file.- Parameters:
filename (str) – The name of the file to load the object from.
- Returns:
The
StoredTable
object itself.
- generateIndex(name: str | None = None, prefix: str | None = None)[source]
Generate a custom index for the data frame automatically.
- getDF()[source]
Get the data frame this instance manages.
- Returns:
The data frame this instance manages.
- Return type:
pd.DataFrame
- getProperties() list[str] [source]
Get names of all properties/variables saved in the data frame (all columns).
- Returns:
list of property names.
- Return type:
list[str]
- getProperty(name: str) Series [source]
Get property values from the data set.
- Parameters:
name (str) – Name of the property to get.
- Returns:
List of values for the property.
- Return type:
pd.Series
- getSubset(prefix: str)[source]
Get a subset of the data set by providing a prefix for the column names or a column name directly.
- Parameters:
prefix (str) – Prefix of the column names to select.
- imputeProperties(names: list[str], imputer: Callable)[source]
Impute missing property values.
- Parameters:
names (list) – List of property names to impute.
imputer (Callable) –
- imputer object implementing the
fit_transform
method from scikit-learn API.
- imputer object implementing the
- iterChunks(include_props: list[str] | None = None, as_dict: bool = False, chunk_size: int | None = None) Generator[DataFrame | dict, None, None] [source]
Batch a data frame into chunks of the given size.
- Parameters:
- Returns:
Generator that yields batches of the data frame as smaller data frames.
- Return type:
Generator[pd.DataFrame, None, None]
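The batching behaviour described above can be sketched with plain pandas (an illustrative stand-in, not the qsprpred implementation):

```python
import pandas as pd

# Yield successive row slices of at most `chunk_size` rows.
def iter_chunks(df: pd.DataFrame, chunk_size: int):
    for start in range(0, len(df), chunk_size):
        yield df.iloc[start:start + chunk_size]

df = pd.DataFrame({"QSPRID": [f"mol_{i}" for i in range(5)]})
sizes = [len(chunk) for chunk in iter_chunks(df, 2)]  # [2, 2, 1]
```

The last chunk may be smaller than `chunk_size` when the row count is not an exact multiple, as the sizes above show.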
- property metaFile
The path to the meta file of this data set.
- property nJobs
- removeProperty(name)[source]
Remove a property from the data frame.
- Parameters:
name (str) – Name of the property to delete.
- save()[source]
Save the data frame to disk and all associated files.
- Returns:
Path to the saved data frame.
- Return type:
str
- setIndex(cols: list[str])[source]
Create an index column from several columns of the data set. This also resets the
idProp
attribute to be the name of the index columns joined by a ‘~’ character. The values of the columns are also joined in the same way to create the index. Thus, make sure the values of the columns are unique together and can be joined to a string.
- setRandomState(random_state: int)[source]
Set the random state for this instance.
- Parameters:
random_state (int) – Random state to use for shuffling and other random operations.
- property storeDir
The data set folder containing the data set files after saving.
- property storePath
The path to the main data set file.
- property storePrefix
The prefix of the data set files.
- toFile(filename: str)[source]
Save the metafile and all associated files to a custom location.
- Parameters:
filename (str) – absolute path to the saved metafile.
qsprpred.data.tables.qspr module
- class qsprpred.data.tables.qspr.QSPRDataset(name: str, target_props: list[qsprpred.tasks.TargetProperty | dict] | None = None, df: DataFrame | None = None, smiles_col: str = 'SMILES', add_rdkit: bool = False, store_dir: str = '.', overwrite: bool = False, n_jobs: int | None = 1, chunk_size: int | None = None, drop_invalids: bool = True, drop_empty: bool = True, index_cols: list[str] | None = None, autoindex_name: str = 'QSPRID', random_state: int | None = None, store_format: str = 'pkl')[source]
Bases:
MoleculeTable
Prepare dataset for QSPR model training.
It splits the data into training and test sets and can create cross-validation folds. Optionally, low-quality data is filtered out. For classification, the dataset samples are labelled as active/inactive.
- Variables:
targetProperties (str) – property to be predicted with the QSPRModel
df (pd.dataframe) – dataset
X (np.ndarray/pd.DataFrame) – m x n feature matrix for cross validation, where m is the number of samples and n is the number of features.
y (np.ndarray/pd.DataFrame) – m-d label array for cross validation, where m is the number of samples and equals the number of rows of X.
X_ind (np.ndarray/pd.DataFrame) – m x n feature matrix for the independent set, where m is the number of samples and n is the number of features.
y_ind (np.ndarray/pd.DataFrame) – m x l label array for the independent set, where m is the number of samples and equals the number of rows of X_ind, and l is the number of target properties.
X_ind_outliers (np.ndarray/pd.DataFrame) – m x n feature matrix for outliers in the independent set, where m is the number of samples and n is the number of features.
y_ind_outliers (np.ndarray/pd.DataFrame) – m x l label array for outliers in the independent set, where m is the number of samples and equals the number of rows of X_ind_outliers, and l is the number of target properties.
featureStandardizer (SKLearnStandardizer) – feature standardizer
applicabilityDomain (ApplicabilityDomain) – applicability domain
- Construct a QSPRDataset, also applying transformations of the output property if specified.
- Parameters:
name (str) – data name, used in saving the data
target_props (list[TargetProperty | dict] | None) – target properties; names should correspond with target column names in df. If None, target properties will be inferred if this data set has been saved previously. Defaults to None.
df (pd.DataFrame, optional) – input dataframe containing smiles and target property. Defaults to None.
smiles_col (str, optional) – name of column in df containing SMILES. Defaults to “SMILES”.
add_rdkit (bool, optional) – if true, column with rdkit molecules will be added to df. Defaults to False.
store_dir (str, optional) – directory for saving the output data. Defaults to ‘.’.
overwrite (bool, optional) – whether already saved data at the output directory should be overwritten. Defaults to False.
n_jobs (int, optional) – number of parallel jobs. If <= 0, all available cores will be used. Defaults to 1.
chunk_size (int, optional) – chunk size for parallel processing. Defaults to None.
drop_invalids (bool, optional) – if true, invalid SMILES will be dropped. Defaults to True.
drop_empty (bool, optional) – if true, rows with empty target property will be removed.
index_cols (list[str], optional) – columns to be used as index in the dataframe. Defaults to None, in which case a custom ID will be generated.
autoindex_name (str) – Column name to use for automatically generated IDs.
random_state (int, optional) – random state for splitting the data.
store_format (str, optional) – format to use for storing the data (‘pkl’ or ‘csv’).
- Raises:
ValueError – Raised if a threshold is given with a non-classification task.
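Based on the signature above, a construction call might look as follows. The dict form of a target property ({"name": ..., "task": ...}) and the column names are assumptions for illustration, not verified against a particular qsprpred release:

```python
import pandas as pd

# Hypothetical input frame with a SMILES column and one target property.
df = pd.DataFrame({"SMILES": ["CCO", "c1ccccc1"],
                   "pchembl_value": [5.2, 6.1]})
target_props = [{"name": "pchembl_value", "task": "REGRESSION"}]

# The actual construction (commented out; requires qsprpred installed):
# from qsprpred.data.tables.qspr import QSPRDataset
# dataset = QSPRDataset(name="demo", df=df, target_props=target_props,
#                       smiles_col="SMILES", store_dir=".", random_state=42)
```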
- addClusters(clusters: list['MoleculeClusters'], recalculate: bool = False)
Add clusters to the data frame.
A new column is created that contains the identifier of the corresponding cluster calculator.
- Parameters:
clusters (list) – list of
MoleculeClusters
calculators.recalculate (bool) – Whether to recalculate clusters even if they are already present in the data frame.
- addDescriptors(descriptors: list[qsprpred.data.descriptors.sets.DescriptorSet], recalculate: bool = False, featurize: bool = True, *args, **kwargs)[source]
Add descriptors to the data set.
If descriptors are already present, they will be recalculated if recalculate is True. Featurization will be performed after adding descriptors if featurize is True. Featurization converts current data matrices to pure numeric matrices of selected descriptors (features).
- Parameters:
descriptors (list[DescriptorSet]) – list of descriptor sets to add
recalculate (bool, optional) – whether to recalculate descriptors if they are already present. Defaults to False.
featurize (bool, optional) – whether to featurize the data set splits after adding descriptors. Defaults to True.
.*args – additional positional arguments to pass to each descriptor set
**kwargs – additional keyword arguments to pass to each descriptor set
- addFeatures(feature_calculators: list[qsprpred.data.descriptors.sets.DescriptorSet], recalculate: bool = False)[source]
Add features to the data set.
- Parameters:
feature_calculators (list[DescriptorSet]) – list of feature calculators to add. Defaults to None.
recalculate (bool) – if True, recalculate features even if they are already present in the data set. Defaults to False.
- addScaffolds(scaffolds: list[qsprpred.data.chem.scaffolds.Scaffold], add_rdkit_scaffold: bool = False, recalculate: bool = False)
Add scaffolds to the data frame.
A new column is created that contains the SMILES of the corresponding scaffold. If add_rdkit_scaffold is set to True, a new column is created that contains the RDKit scaffold of the corresponding molecule.
- apply(func: Callable[[dict[str, list[Any]] | DataFrame, ...], Any], func_args: tuple[Any] | None = None, func_kwargs: dict[str, Any] | None = None, on_props: list[str] | None = None, as_df: bool = False, chunk_size: int | None = None, n_jobs: int | None = None) Generator
Apply a function to the data frame. The properties of the data set are passed as the first positional argument to the function. This will be a dictionary of the form {'prop1': [...], 'prop2': [...], ...}. If as_df is True, the properties will be passed as a data frame instead.
Any additional arguments specified in func_args and func_kwargs will be passed to the function after the properties as positional and keyword arguments, respectively.
If on_props is specified, only the properties in this list will be passed to the function. If on_props is None, all properties will be passed to the function.
- Parameters:
func (Callable) – Function to apply to the data frame.
func_args (list) – Positional arguments to pass to the function.
func_kwargs (dict) – Keyword arguments to pass to the function.
on_props (list[str]) – List of properties to send to the function as arguments.
as_df (bool) – If True, the function is applied to chunks represented as data frames.
chunk_size (int) – Size of chunks to use per job in parallel processing. If None, the chunk size will be set to self.chunkSize. The chunk size will always be set to the number of rows in the data frame if n_jobs or self.nJobs is 1.
n_jobs (int) – Number of jobs to use for parallel processing. If None, self.nJobs is used.
- Returns:
Generator that yields the results of the function applied to each chunk of the data frame as determined by chunk_size and n_jobs. Each item in the generator will be the result of the function applied to one chunk of the data set.
- Return type:
Generator
- attachDescriptors(calculator: DescriptorSet, descriptors: DataFrame, index_cols: list)
Attach descriptors to the data frame.
- Parameters:
calculator (DescriptorsCalculator) – DescriptorsCalculator object to use for descriptor calculation.
descriptors (pd.DataFrame) – DataFrame containing the descriptors to attach.
index_cols (list) – List of column names to use as index.
- checkMols(throw: bool = True)
Returns a boolean array indicating whether each molecule is valid or not. If throw is True, an exception is thrown if any molecule is invalid.
- Parameters:
throw (bool) – Whether to throw an exception if any molecule is invalid.
- Returns:
Boolean series indicating whether each molecule is valid.
- Return type:
mask (pd.Series)
- clearFiles()
Remove all files associated with this data set from disk.
- createScaffoldGroups(mols_per_group: int = 10)
Create scaffold groups.
A scaffold group is a list of molecules that share the same scaffold. New columns are created that contain the scaffold group ID and the scaffold group size.
- Parameters:
mols_per_group (int) – number of molecules per scaffold group.
- property descriptorSets
Get the descriptor calculators for this table.
- dropDescriptorSets(descriptors: list[qsprpred.data.descriptors.sets.DescriptorSet | str], full_removal: bool = False)
Drop descriptors from the given sets from the data frame.
- Parameters:
descriptors (list[DescriptorSet | str]) – List of DescriptorSet objects or their names. The name of a descriptor set corresponds to the result returned by its __str__ method.
full_removal (bool) – Whether to remove the descriptor data (will perform full removal). By default, a soft removal is performed by just rendering the descriptors inactive. A full removal will remove the descriptor set from the dataset, including the saved files. It is not possible to restore a descriptor set after a full removal.
- dropDescriptors(descriptors: list[str])[source]
Drop descriptors by name. Performs a simple feature selection by removing the given descriptor names from the data set.
- dropEmptyProperties(names: list[str])[source]
Drop rows with empty target property value from the data set.
- dropEmptySmiles()
Drop rows with empty SMILES from the data set.
- dropInvalids()[source]
Drops invalid molecules from the data set.
- Returns:
Boolean mask of invalid molecules in the original data set.
- Return type:
mask (pd.Series)
- featurizeSplits(shuffle: bool = True, random_state: int | None = None)[source]
If the data set has descriptors, load them into the train and test splits.
If no descriptors are available, remove all features from the splits. They will become zero length along the feature axis (columns), but will retain their original length along the sample axis (rows). This is useful for the case where the data set has no descriptors, but the user wants to retain train and test splits.
- Parameters:
shuffle (bool) – whether to shuffle the training and test sets.
random_state (int) – random state for shuffling.
- fillMissing(fill_value: float, columns: list[str] | None = None)[source]
Fill missing values in the data set with a given value.
- filter(table_filters: list[Callable])[source]
Filter the data set using the given filters.
- Parameters:
table_filters (list[Callable]) – list of filters to apply
- filterFeatures(feature_filters: list[Callable])[source]
Filter features in the data set.
- Parameters:
feature_filters (list[Callable]) – list of feature filter functions that take X feature matrix and y target vector as arguments
- classmethod fromFile(filename: str) PandasDataTable
Load a
StoredTable
object from a file.- Parameters:
filename (str) – The name of the file to load the object from.
- Returns:
The
StoredTable
object itself.
- static fromMolTable(mol_table: MoleculeTable, target_props: list[qsprpred.tasks.TargetProperty | dict], name=None, **kwargs) QSPRDataset [source]
Create QSPRDataset from a MoleculeTable.
- Parameters:
mol_table (MoleculeTable) – MoleculeTable to use as the data source
target_props (list) – list of target properties to use
name (str, optional) – name of the data set. Defaults to None.
kwargs – additional keyword arguments to pass to the constructor
- Returns:
created data set
- Return type:
QSPRDataset
- static fromSDF(name: str, filename: str, smiles_prop: str, *args, **kwargs)[source]
Create QSPRDataset from SDF file.
It is currently not implemented for QSPRDataset, but you can convert from ‘MoleculeTable’ with the ‘fromMolTable’ method.
- static fromSMILES(name: str, smiles: list, *args, **kwargs)
Create a
MoleculeTable
instance from a list of SMILES sequences.- Parameters:
name (str) – Name of the data set.
smiles (list) – list of SMILES sequences.
*args – Additional arguments to pass to the
MoleculeTable
constructor.**kwargs – Additional keyword arguments to pass to the
MoleculeTable
constructor.
- static fromTableFile(name: str, filename: str, sep: str = '\t', *args, **kwargs)[source]
Create QSPRDataset from table file (i.e. CSV or TSV).
- Parameters:
- Returns:
QSPRDataset
object
- Return type:
QSPRDataset
- generateDescriptorDataSetName(ds_set: str | DescriptorSet)
Generate a descriptor set name from a descriptor set.
- generateIndex(name: str | None = None, prefix: str | None = None)
Generate a custom index for the data frame automatically.
- getClusterNames(clusters: list['MoleculeClusters'] | None = None)
Get the names of the clusters in the data frame.
- Returns:
List of cluster names.
- Return type:
- getClusters(clusters: list['MoleculeClusters'] | None = None)
Get the subset of the data frame that contains only clusters.
- Returns:
Data frame containing only clusters.
- Return type:
pd.DataFrame
- getDF()
Get the data frame this instance manages.
- Returns:
The data frame this instance manages.
- Return type:
pd.DataFrame
- getDescriptorNames()
Get the names of the descriptors present for molecules in this data set.
- Returns:
list of descriptor names.
- Return type:
- getDescriptors(active_only=False)
Get the calculated descriptors as a pandas data frame.
- Returns:
Data frame containing only descriptors.
- Return type:
pd.DataFrame
- getFeatures(inplace: bool = False, concat: bool = False, raw: bool = False, ordered: bool = False, refit_standardizer: bool = True)[source]
Get the current feature sets (training and test) from the dataset.
This method also applies any feature standardizers that have been set on the dataset during preparation. Outliers are dropped from the test set if they are present, unless concat is True.
- Parameters:
inplace (bool) – If True, the created feature matrices will be saved to the dataset object itself as 'X' and 'X_ind' attributes. Note that this will overwrite any existing feature matrices and if the data preparation workflow changes, these are not kept up to date. Therefore, it is recommended to generate new feature sets after any data set changes.
concat (bool) – If True, the training and test feature matrices will be concatenated into a single matrix. This is useful for training models that do not require separate training and test sets (i.e. the final optimized models).
raw (bool) – If True, the raw feature matrices will be returned without any standardization applied.
ordered (bool) – If True, the returned feature matrices will be ordered according to the original order of the data set. This is only relevant if concat is True.
refit_standardizer (bool) – If True, the feature standardizer will be refit on the training set upon this call. If False, the previously fitted standardizer will be used. Defaults to True. Use False if this dataset is used for prediction only and the standardizer has been initialized already.
- getProperties() list[str]
Get names of all properties/variables saved in the data frame (all columns).
- Returns:
list of property names.
- Return type:
- getProperty(name: str) Series
Get property values from the data set.
- Parameters:
name (str) – Name of the property to get.
- Returns:
List of values for the property.
- Return type:
pd.Series
- getScaffoldGroups(scaffold_name: str, mol_per_group: int = 10)
Get the scaffold groups for a given combination of scaffold and number of molecules per scaffold group.
- getScaffoldNames(scaffolds: list[qsprpred.data.chem.scaffolds.Scaffold] | None = None, include_mols: bool = False)
Get the names of the scaffolds in the data frame.
- getScaffolds(scaffolds: list[qsprpred.data.chem.scaffolds.Scaffold] | None = None, include_mols: bool = False)
Get the subset of the data frame that contains only scaffolds.
- Parameters:
include_mols (bool) – Whether to include the RDKit scaffold columns as well.
- Returns:
Data frame containing only scaffolds.
- Return type:
pd.DataFrame
- getSubset(prefix: str)
Get a subset of the data set by providing a prefix for the column names or a column name directly.
- Parameters:
prefix (str) – Prefix of the column names to select.
- getSummary()
Make a summary with some statistics about the molecules in this table. The summary contains the number of molecules per target and the number of unique molecules per target.
Requires this data set to be imported from Papyrus for now.
- Returns:
A dataframe with the summary statistics.
- Return type:
(pd.DataFrame)
- getTargetProperties(names: list) list[qsprpred.tasks.TargetProperty] [source]
Get the target properties with the given names.
- Parameters:
- Returns:
list of target properties
- Return type:
- getTargetPropertiesValues(concat: bool = False, ordered: bool = False)[source]
Get the response values (training and test) for the set target property.
- Parameters:
- Returns:
tuple of (train_responses, test_responses) or pandas.DataFrame of all target property values
- property hasClusters
Check whether the data frame contains clusters.
- Returns:
Whether the data frame contains clusters.
- Return type:
- hasDescriptors(descriptors: list[qsprpred.data.descriptors.sets.DescriptorSet | str] | None = None) bool | list[bool]
Check whether the data frame contains given descriptors.
- Parameters:
descriptors (list) – list of DescriptorSet objects or prefixes of descriptors to check for. If None, all descriptors are checked for and a single boolean is returned if any descriptors are found.
- Returns:
list of booleans indicating whether each descriptor is present or not.
- Return type:
- property hasFeatures
Check whether the currently selected set of features is not empty.
- hasProperty(name)
Check whether a property is present in the data frame.
- property hasScaffoldGroups
Check whether the data frame contains scaffold groups.
- Returns:
Whether the data frame contains scaffold groups.
- Return type:
- property hasScaffolds
Check whether the data frame contains scaffolds.
- Returns:
Whether the data frame contains scaffolds.
- Return type:
- imputeProperties(names: list[str], imputer: Callable)[source]
Impute missing property values.
- Parameters:
names (list) – List of property names to impute.
imputer (Callable) –
- imputer object implementing the
fit_transform
method from scikit-learn API.
- imputer object implementing the
- property isMultiTask
Check if the dataset contains multiple target properties.
- iterChunks(include_props: list[str] | None = None, as_dict: bool = False, chunk_size: int | None = None) Generator[DataFrame | dict, None, None]
Batch a data frame into chunks of the given size.
- Parameters:
- Returns:
Generator that yields batches of the data frame as smaller data frames.
- Return type:
Generator[pd.DataFrame, None, None]
- iterFolds(split: DataSplit, concat: bool = False) Generator[tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame | pandas.core.series.Series, pandas.core.frame.DataFrame | pandas.core.series.Series, list[int], list[int]], None, None] [source]
Iterate over the folds of the dataset.
- loadDescriptorsToSplits(shuffle: bool = True, random_state: int | None = None)[source]
Load all available descriptors into the train and test splits.
If no descriptors are available, an exception will be raised.
- Parameters:
- Raises:
ValueError – if no descriptors are available
- makeClassification(target_property: str, th: list[float] | None = None)[source]
Switch to classification task using the given threshold values.
- makeRegression(target_property: str)[source]
Switch to regression task using the given target property.
- Parameters:
target_property (str) – name of the target property to use for regression
- property metaFile
The path to the meta file of this data set.
- property nJobs
- property nTargetProperties
Get the number of target properties in the dataset.
- prepareDataset(smiles_standardizer: str | ~typing.Callable | None = 'chembl', data_filters: list | None = (<qsprpred.data.processing.data_filters.RepeatsFilter object>, ), split=None, feature_calculators: list[qsprpred.data.descriptors.sets.DescriptorSet] | None = None, feature_filters: list | None = None, feature_standardizer: ~qsprpred.data.processing.feature_standardizers.SKLearnStandardizer | None = None, feature_fill_value: float = nan, applicability_domain: ~qsprpred.data.processing.applicability_domain.ApplicabilityDomain | ~mlchemad.base.ApplicabilityDomain | None = None, drop_outliers: bool = False, recalculate_features: bool = False, shuffle: bool = True, random_state: int | None = None)[source]
Prepare the dataset for use in QSPR model.
- Parameters:
smiles_standardizer (str | Callable) – either 'chembl', 'old', or a partial function that reads and standardizes smiles. If None, no standardization will be performed. Defaults to 'chembl'.
data_filters (list of datafilter obj) – filters number of rows from dataset
split (datasplitter obj) – splits the dataset into train and test set
feature_calculators (list[DescriptorSet]) – descriptor sets to add to the data set
feature_filters (list of feature filter objs) – filters features
feature_standardizer (SKLearnStandardizer or sklearn.base.BaseEstimator) – standardizes and/or scales features
feature_fill_value (float) – value to fill missing values with. Defaults to numpy.nan.
applicability_domain (applicabilityDomain obj) – attaches an applicability domain calculator to the dataset and fits it on the training set
drop_outliers (bool) – whether to drop samples that are outside the applicability domain from the test set, if one is attached.
recalculate_features (bool) – recalculate features even if they are already present in the file
shuffle (bool) – whether to shuffle the created training and test sets
random_state (int) – random state for shuffling
- processMols(processor: MolProcessor, proc_args: tuple[Any] | None = None, proc_kwargs: dict[str, Any] | None = None, add_props: list[str] | None = None, as_rdkit: bool = False, chunk_size: int | None = None, n_jobs: int | None = None) Generator
Apply a function to the molecules in the data frame. The SMILES or an RDKit molecule will be supplied as the first positional argument to the function. Additional properties to provide from the data set can be specified with ‘add_props’, which will be a dictionary supplied as an additional positional argument to the function.
IMPORTANT: For successful parallel processing, the processor must be picklable. Also note that the returned generator will produce results as soon as they are ready, which means that the chunks of data will not be in the same order as the original data frame. However, you can pass the value of idProp in add_props to identify the processed molecules. See CheckSmilesValid for an example.
- Parameters:
processor (MolProcessor) – MolProcessor object to use for processing.
proc_args (list, optional) – Any additional positional arguments to pass to the processor.
proc_kwargs (dict, optional) – Any additional keyword arguments to pass to the processor.
add_props (list, optional) – List of data set properties to send to the processor. If None, all properties will be sent.
as_rdkit (bool, optional) – Whether to convert the molecules to RDKit molecules before applying the processor.
chunk_size (int, optional) – Size of chunks to use per job in parallel. If not specified, self.chunkSize is used.
n_jobs (int, optional) – Number of jobs to use for parallel processing. If not specified, self.nJobs is used.
- Returns:
A generator that yields the results of the supplied processor on the chunked molecules from the data set.
- Return type:
Generator
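Because results stream back as chunks finish, they may arrive out of order; carrying an identifier through, as the docstring suggests with idProp, lets you reassemble them. A generic sketch of that pattern (not the qsprpred parallel backend; the data here is hypothetical):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def process_chunk(chunk):
    # toy "processor": return (id, result) pairs for each molecule-like item
    return [(mol_id, smiles.upper()) for mol_id, smiles in chunk]

data = [("m0", "cco"), ("m1", "ccn"), ("m2", "ccc"), ("m3", "cco")]
chunks = [data[i:i + 2] for i in range(0, len(data), 2)]

results = {}
with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(process_chunk, c) for c in chunks]
    for fut in as_completed(futures):  # completion order is not guaranteed
        for mol_id, value in fut.result():
            results[mol_id] = value  # the ID recovers the original mapping
```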
- reload()
Reload the data table from disk.
- removeProperty(name)
Remove a property from the data frame.
- Parameters:
name (str) – Name of the property to delete.
- reset()[source]
Reset the data set. Splits will be removed and all descriptors will be moved to the training data. Molecule standardization and molecule filtering are not affected.
- resetTargetProperty(prop: TargetProperty | str)[source]
Reset target property to its original value.
- Parameters:
prop (TargetProperty | str) – target property to reset
- restoreDescriptorSets(descriptors: list[qsprpred.data.descriptors.sets.DescriptorSet | str])[source]
Restore descriptors that were previously removed.
- Parameters:
descriptors (list[DescriptorSet | str]) – List of DescriptorSet objects or their names. Name of a descriptor set corresponds to the result returned by its __str__ method.
- restoreTrainingData()[source]
Restore training data from the data frame.
If the data frame contains a column ‘Split_IsTrain’, the data will be split into training and independent sets. Otherwise, the independent set will be empty. If descriptors are available, the resulting training matrices will be featurized.
- classmethod runMolProcess(props: dict[str, list] | DataFrame, func: MolProcessor, add_rdkit: bool, smiles_col: str, *args, **kwargs)
A helper method to run a MolProcessor on a list of molecules via apply. It converts the SMILES to RDKit molecules if required and then applies the function to the MolProcessor object.
- Parameters:
props (dict) – Dictionary of properties that will be passed in addition to the molecule structure.
func (MolProcessor) – MolProcessor object to use for processing.
add_rdkit (bool) – Whether to convert the SMILES to RDKit molecules before applying the function.
smiles_col (str) – Name of the property containing the SMILES sequences.
*args – Additional positional arguments to pass to the function.
**kwargs – Additional keyword arguments to pass to the function.
- sample(n: int, name: str | None = None, random_state: int | None = None) MoleculeTable
Sample n molecules from the table.
- Parameters:
- Returns:
A dataframe with the sampled molecules.
- Return type:
- save(save_split: bool = True)[source]
Save the data set to file and serialize metadata.
- Parameters:
save_split (bool) – whether to save split data to the managed data frame.
- searchOnProperty(prop_name: str, values: list[str], name: str | None = None, exact=False) MoleculeTable
Search in this table using a property name and a list of values. It is assumed that the property is searchable with string matching. Either an exact match or a partial match can be used. If 'exact' is False, the search will be performed with partial matching, i.e. all molecules that contain any of the given values in the property will be returned. If 'exact' is True, only molecules that have the exact property value for any of the given values will be returned.
- Parameters:
prop_name (str) – Name of the property to search on.
values (list[str]) – List of values to search for. If any of the values is found in the property, the molecule will be considered a match.
name (str | None, optional) – Name of the new table. Defaults to the name of the old table, plus the _searched suffix.
exact (bool, optional) – Whether to use exact matching, i.e. whether to search for exact strings or just substrings. Defaults to False.
- Returns:
A new table with the molecules from the old table with the given property values.
- Return type:
- searchWithIndex(index: Index, name: str | None = None) MoleculeTable [source]
Search in this table using a pandas index. The return value is a new table with the molecules from the old table with the given indices.
- Parameters:
index (pd.Index) – Indices to search for in this table.
name (str) – Name of the new table. Defaults to the name of the old table, plus the _searched suffix.
- Returns:
A new table with the molecules from the old table with the given indices.
- Return type:
- searchWithSMARTS(patterns: list[str], operator: ~typing.Literal['or', 'and'] = 'or', use_chirality: bool = False, name: str | None = None, match_function: ~typing.Callable = <function match_mol_to_smarts>) MoleculeTable
Search the molecules in the table with a SMARTS pattern.
- Parameters:
patterns – List of SMARTS patterns to search with.
operator (object) – Whether to use an “or” or “and” operator on patterns. Defaults to “or”.
use_chirality – Whether to use chirality in the search.
name – Name of the new table. Defaults to the name of the old table, plus the smarts_searched suffix.
match_function – Function to use for matching the molecules to the SMARTS patterns. Defaults to match_mol_to_smarts.
- Returns:
A dataframe with the molecules that match the pattern.
- Return type:
(MolTable)
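The "or"/"and" operator above combines the per-pattern match results into one flag per molecule. A sketch of that reduction over precomputed boolean matches (the real method matches RDKit molecules against the SMARTS patterns first):

```python
def combine_matches(matches_per_pattern: list[list[bool]],
                    operator: str) -> list[bool]:
    """Reduce a patterns x molecules boolean table to one flag per molecule:
    'or' keeps molecules matching any pattern, 'and' only those matching all."""
    reducer = any if operator == "or" else all
    return [reducer(col) for col in zip(*matches_per_pattern)]

# two patterns, three molecules
table = [[True, False, False],
         [True, True, False]]
or_hits = combine_matches(table, "or")
and_hits = combine_matches(table, "and")
```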
- setApplicabilityDomain(applicability_domain: ApplicabilityDomain | ApplicabilityDomain)[source]
Set the applicability domain calculator.
- Parameters:
applicability_domain (ApplicabilityDomain | MLChemADApplicabilityDomain) – applicability domain calculator instance
- setFeatureStandardizer(feature_standardizer)[source]
Set feature standardizer.
- Parameters:
feature_standardizer (SKLearnStandardizer | BaseEstimator) – feature standardizer
- setIndex(cols: list[str])
Create an index column from several columns of the data set. This also resets the idProp attribute to be the name of the index columns joined by a '~' character. The values of the columns are also joined in the same way to create the index. Thus, make sure the values of the columns are unique together and can be joined to a string.
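The joining rule described above can be sketched in plain Python (an illustration only; the column names here are hypothetical and this is not the actual implementation):

```python
def make_index(rows: list[dict], cols: list[str]) -> tuple[str, list[str]]:
    """Join the chosen column names into the new idProp name, and join each
    row's values the same way to form its index entry."""
    id_prop = "~".join(cols)
    index = ["~".join(str(row[c]) for c in cols) for row in rows]
    return id_prop, index

rows = [{"source": "chembl", "cid": 1}, {"source": "pubchem", "cid": 1}]
id_prop, index = make_index(rows, ["source", "cid"])
```

Note that neither row alone has a unique "cid", but the joined pairs are unique together, which is exactly the requirement stated above.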
- setRandomState(random_state: int)
Set the random state for this instance.
- Parameters:
random_state (int) – Random state to use for shuffling and other random operations.
- setTargetProperties(target_props: list[qsprpred.tasks.TargetProperty | dict], drop_empty: bool = True)[source]
Set list of target properties and apply transformations if specified.
- Parameters:
target_props (list[TargetProperty]) – list of target properties
drop_empty (bool, optional) – whether to drop rows with empty target property values. Defaults to True.
- setTargetProperty(prop: TargetProperty | dict, drop_empty: bool = True)[source]
Add a target property to the dataset.
- Parameters:
prop (TargetProperty) – name of the target property to add
drop_empty (bool) – whether to drop rows with empty target property values. Defaults to True.
- property smiles: Generator[str, None, None]
Get the SMILES strings of the molecules in the data frame.
- Returns:
Generator of SMILES strings.
- Return type:
Generator[str, None, None]
- split(split: DataSplit, featurize: bool = False)[source]
Split dataset into train and test set.
You can either split the data frame itself or you can set featurize to True if you want to use feature matrices instead of the raw data frame.
- standardizeSmiles(smiles_standardizer, drop_invalid=True)
Apply smiles_standardizer to the compounds in parallel.
- Parameters:
smiles_standardizer – either None to skip the standardization, chembl, old, or a partial function that reads and standardizes smiles.
drop_invalid (bool) – whether to drop invalid SMILES from the data set. Defaults to True. If False, invalid SMILES will be retained in their original form. If self.invalidsRemoved is True, there will be no effect even if drop_invalid is True. Set self.invalidsRemoved to False on this instance to force the removal of invalid SMILES.
- Raises:
ValueError – when smiles_standardizer is not a callable or one of the predefined strings.
- property storeDir
The data set folder containing the data set files after saving.
- property storePath
The path to the main data set file.
- property storePrefix
The prefix of the data set files.
- property targetPropertyNames
Get the names of the target properties.
- toFile(filename: str)
Save the metafile and all associated files to a custom location.
- Parameters:
filename (str) – absolute path to the saved metafile.
- toJSON() str
Serialize object to a JSON string. This JSON string should contain all data necessary to reconstruct the object.
- Returns:
JSON string of the object
- Return type:
json (str)
- transformProperties(targets: list[str], transformer: Callable)[source]
Transform the target properties using the given transformer.
- Parameters:
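transformProperties applies the given transformer callable to the listed target columns. A minimal stand-in on plain lists, assuming a log10 transform (the column names here are hypothetical):

```python
import math

def transform_properties(data: dict[str, list[float]], targets: list[str],
                         transformer) -> None:
    """Apply the transformer to each listed target column in place;
    other columns are left untouched."""
    for prop in targets:
        data[prop] = [transformer(v) for v in data[prop]]

data = {"CL": [10.0, 100.0], "MW": [180.2, 250.3]}
transform_properties(data, ["CL"], math.log10)
```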
- unsetTargetProperty(name: str | TargetProperty)[source]
Unset the target property. It will not remove it from the data set, but will make it unavailable for training.
- Parameters:
name (str | TargetProperty) – name of the target property to drop or the property itself
qsprpred.data.tables.searchable module
- class qsprpred.data.tables.searchable.SearchableMolTable[source]
Bases:
MoleculeDataTable
- abstract addDescriptors(descriptors: DescriptorSet, *args, **kwargs)
Add descriptors to the dataset.
- Parameters:
descriptors (list[DescriptorSet]) – The descriptors to add.
args – Additional positional arguments to be passed to each descriptor set.
kwargs – Additional keyword arguments to be passed to each descriptor set.
- abstract apply(func: callable, on_props: list[str] | None = None, func_args: list | None = None, func_kwargs: dict | None = None)
Apply a function on all or selected properties. The properties are supplied as the first positional argument to the function.
- abstract clearFiles()
Delete the files associated with the table.
- abstract filter(table_filters: list[Callable])
Filter the dataset.
- Parameters:
table_filters (List[Callable]) – The filters to apply.
- abstract static fromFile(filename: str) StoredTable
Load a StoredTable object from a file.
- Parameters:
filename (str) – The name of the file to load the object from.
- Returns:
The StoredTable object itself.
- abstract getDescriptorNames() list[str]
Get the names of the descriptors that are currently in the dataset.
- Returns:
a list of descriptor names
- abstract getDescriptors() DataFrame
Get the table of descriptors that are currently in the dataset.
- Returns:
a pd.DataFrame with the descriptors
- abstract getProperties()
Get the property names contained in the dataset.
- abstract getSubset(prefix: str)
Get a subset of the dataset.
- Parameters:
prefix (str) – The prefix of the subset.
- abstract hasDescriptors()
Indicates if the dataset has descriptors.
- abstract reload()
Reload the table from a file.
- abstract removeProperty(name: str)
Remove a property from the dataset.
- Parameters:
name (str) – The name of the property.
- abstract save()
Save the table to a file.
- abstract searchOnProperty(prop_name: str, values: list[str], name: str | None = None, exact=False) MoleculeDataTable [source]
Search the molecules within this MoleculeDataSet on a property value.
- Parameters:
prop_name – Name of the column to search on.
values – Values to search for.
name – Name of the new table.
exact – Whether to search for exact matches or not.
- Returns:
A data set with the molecules that match the search.
- Return type:
- abstract searchWithSMARTS(patterns: list[str], operator: Literal['or', 'and'] = 'or', use_chirality: bool = False, name: str | None = None) MoleculeDataTable [source]
Search the molecules within this MoleculeDataSet with SMARTS patterns.
- Parameters:
patterns – List of SMARTS patterns to search with.
operator (object) – Whether to use an “or” or “and” operator on patterns. Defaults to “or”.
use_chirality – Whether to use chirality in the search.
name – Name of the new table.
- Returns:
A dataframe with the molecules that match the pattern.
- Return type:
qsprpred.data.tables.tests module
- class qsprpred.data.tables.tests.TestApply(methodName='runTest')[source]
Bases:
DataSetsPathMixIn, QSPRTestCase
Tests the apply method of the data set.
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
- classmethod addClassCleanup(function, /, *args, **kwargs)
Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).
- addCleanup(function, /, *args, **kwargs)
Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.
Cleanup items are called even if setUp fails (unlike tearDown).
- addTypeEqualityFunc(typeobj, function)
Add a type specific assertEqual style function to compare a type.
This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.
- Parameters:
typeobj – The data type to call this function on when both values are of the same type in assertEqual().
function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.
- assertAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
If the two objects compare equal then they will automatically compare almost equal.
- assertCountEqual(first, second, msg=None)
Asserts that two iterables have the same elements, the same number of times, without regard to order.
- self.assertEqual(Counter(list(first)),
Counter(list(second)))
- Example:
[0, 1, 1] and [1, 0, 1] compare equal.
[0, 0, 1] and [0, 1] compare unequal.
- assertDictEqual(d1, d2, msg=None)
- assertEqual(first, second, msg=None)
Fail if the two objects are unequal as determined by the ‘==’ operator.
- assertFalse(expr, msg=None)
Check that the expression is false.
- assertGreater(a, b, msg=None)
Just like self.assertTrue(a > b), but with a nicer default message.
- assertGreaterEqual(a, b, msg=None)
Just like self.assertTrue(a >= b), but with a nicer default message.
- assertIn(member, container, msg=None)
Just like self.assertTrue(a in b), but with a nicer default message.
- assertIs(expr1, expr2, msg=None)
Just like self.assertTrue(a is b), but with a nicer default message.
- assertIsInstance(obj, cls, msg=None)
Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.
- assertIsNone(obj, msg=None)
Same as self.assertTrue(obj is None), with a nicer default message.
- assertIsNot(expr1, expr2, msg=None)
Just like self.assertTrue(a is not b), but with a nicer default message.
- assertIsNotNone(obj, msg=None)
Included for symmetry with assertIsNone.
- assertLess(a, b, msg=None)
Just like self.assertTrue(a < b), but with a nicer default message.
- assertLessEqual(a, b, msg=None)
Just like self.assertTrue(a <= b), but with a nicer default message.
- assertListEqual(list1, list2, msg=None)
A list-specific equality assertion.
- Parameters:
list1 – The first list to compare.
list2 – The second list to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertLogs(logger=None, level=None)
Fail unless a log message of level level or higher is emitted on logger_name or its children. If omitted, level defaults to INFO and logger defaults to the root logger.
This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.
Example:
with self.assertLogs('foo', level='INFO') as cm:
    logging.getLogger('foo').info('first message')
    logging.getLogger('foo.bar').error('second message')
self.assertEqual(cm.output, ['INFO:foo:first message', 'ERROR:foo.bar:second message'])
- assertMultiLineEqual(first, second, msg=None)
Assert that two multi-line strings are equal.
- assertNoLogs(logger=None, level=None)
Fail unless no log messages of level level or higher are emitted on logger_name or its children.
This method must be used as a context manager.
- assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
Objects that are equal automatically fail.
- assertNotEqual(first, second, msg=None)
Fail if the two objects are equal as determined by the ‘!=’ operator.
- assertNotIn(member, container, msg=None)
Just like self.assertTrue(a not in b), but with a nicer default message.
- assertNotIsInstance(obj, cls, msg=None)
Included for symmetry with assertIsInstance.
- assertNotRegex(text, unexpected_regex, msg=None)
Fail the test if the text matches the regular expression.
- assertRaises(expected_exception, *args, **kwargs)
Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.
If called with the callable and arguments omitted, will return a context object used like this:
with self.assertRaises(SomeException):
    do_something()
An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.
The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:
with self.assertRaises(SomeException) as cm:
    do_something()
the_exception = cm.exception
self.assertEqual(the_exception.error_code, 3)
- assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)
Asserts that the message in a raised exception matches a regex.
- Parameters:
expected_exception – Exception class expected to be raised.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.
- assertRegex(text, expected_regex, msg=None)
Fail the test unless the text matches the regular expression.
- assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)
An equality assertion for ordered sequences (like lists and tuples).
For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.
- Parameters:
seq1 – The first sequence to compare.
seq2 – The second sequence to compare.
seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.
msg – Optional message to use on failure instead of a list of differences.
- assertSetEqual(set1, set2, msg=None)
A set-specific equality assertion.
- Parameters:
set1 – The first set to compare.
set2 – The second set to compare.
msg – Optional message to use on failure instead of a list of differences.
assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).
- assertTrue(expr, msg=None)
Check that the expression is true.
- assertTupleEqual(tuple1, tuple2, msg=None)
A tuple-specific equality assertion.
- Parameters:
tuple1 – The first tuple to compare.
tuple2 – The second tuple to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertWarns(expected_warning, *args, **kwargs)
Fail unless a warning of class warnClass is triggered by the callable when invoked with specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.
If called with the callable and arguments omitted, will return a context object used like this:
with self.assertWarns(SomeWarning):
    do_something()
An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.
The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:
with self.assertWarns(SomeWarning) as cm:
    do_something()
the_warning = cm.warning
self.assertEqual(the_warning.some_attribute, 147)
- assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)
Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.
- Parameters:
expected_warning – Warning class expected to be triggered.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.
- clearGenerated()
Remove the directories that are used for testing.
- countTestCases()
- createLargeMultitaskDataSet(name='QSPRDataset_multi_test', target_props=[{'name': 'HBD', 'task': <TargetTasks.MULTICLASS: 'MULTICLASS'>, 'th': [-1, 1, 2, 100]}, {'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42)
Create a large dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
preparation_settings (dict) – dictionary containing preparation settings
random_state (int) – random state to use for splitting and shuffling
- Returns:
a QSPRDataset object
- Return type:
- createLargeTestDataSet(name='QSPRDataset_test_large', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42, n_jobs=1, chunk_size=None)
Create a large dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
preparation_settings (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
- createSmallTestDataSet(name='QSPRDataset_test_small', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42)
Create a small dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
preparation_settings (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
- createTestDataSetFromFrame(df, name='QSPRDataset_test', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], random_state=None, prep=None, n_jobs=1, chunk_size=None)
Create a dataset for testing purposes from the given data frame.
- Parameters:
df (pd.DataFrame) – data frame containing the dataset
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
prep (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
- debug()
Run the test without collecting errors in a TestResult
- defaultTestResult()
- classmethod doClassCleanups()
Execute all class cleanup functions. Normally called for you after tearDownClass.
- doCleanups()
Execute all cleanup functions. Normally called for you after tearDown.
- classmethod enterClassContext(cm)
Same as enterContext, but class-wide.
- enterContext(cm)
Enters the supplied context manager.
If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.
- fail(msg=None)
Fail immediately, with the given message.
- failureException
alias of
AssertionError
- classmethod getAllDescriptors()
Return a list of (ideally) all available descriptor sets. For now they need to be added manually to the list below.
TODO: would be nice to create the list automatically by implementing a descriptor set registry that would hold all installed descriptor sets.
- getBigDF()
Get a large data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- classmethod getDataPrepGrid()
Return a list of many possible combinations of descriptor calculators, splits, feature standardizers, feature filters and data filters. Again, this is not exhaustive, but should cover a lot of cases.
- Returns:
a generator that yields tuples of all possible combinations as stated above, each tuple is defined as: (descriptor_calculator, split, feature_standardizer, feature_filters, data_filters)
- Return type:
grid
- classmethod getDefaultCalculatorCombo()
Makes a list of default descriptor calculators that can be used in tests. It creates a calculator with only morgan fingerprints and rdkit descriptors, but also one with them both to test behaviour with multiple descriptor sets. Override this method if you want to test with other descriptor sets and calculator combinations.
- static getDefaultPrep()
Return a dictionary with default preparation settings.
- classmethod getPrepCombos()
Return a list of all possible preparation combinations as generated by getDataPrepGrid as well as their names. The generated list can be used to parameterize tests with the given named combinations.
- getSmallDF()
Get a small data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- id()
- longMessage = True
- maxDiff = 640
- run(result=None)
- classmethod setUpClass()
Hook method for setting up class fixture before running tests in the class.
- setUpPaths()
Create the directories that are used for testing.
- shortDescription()
Returns a one-line description of the test, or None if no description has been provided.
The default implementation of this method returns the first line of the specified test method’s docstring.
- skipTest(reason)
Skip this test.
- subTest(msg=<object object>, **params)
Return a context manager that will return the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.
- tearDown()
Remove all files and directories that are used for testing.
- classmethod tearDownClass()
Hook method for deconstructing the class fixture after running all tests in the class.
- testRegular = None
- testRegular_0(**kw)
- testRegular_1(**kw)
- testRegular_2(**kw)
- testRegular_3(**kw)
- validate_split(dataset)
Check if the split has the data it should have after splitting.
- class qsprpred.data.tables.tests.TestDataSetCreationAndSerialization(methodName='runTest')[source]
Bases:
DataSetsPathMixIn, QSPRTestCase
Simple tests for dataset creation and serialization under different conditions and error states.
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
- classmethod addClassCleanup(function, /, *args, **kwargs)
Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).
- addCleanup(function, /, *args, **kwargs)
Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.
Cleanup items are called even if setUp fails (unlike tearDown).
- addTypeEqualityFunc(typeobj, function)
Add a type specific assertEqual style function to compare a type.
This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.
- Parameters:
typeobj – The data type to call this function on when both values are of the same type in assertEqual().
function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.
- assertAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
If the two objects compare equal then they will automatically compare almost equal.
- assertCountEqual(first, second, msg=None)
Asserts that two iterables have the same elements, the same number of times, without regard to order.
- Equivalent to: self.assertEqual(Counter(list(first)), Counter(list(second)))
- Example:
[0, 1, 1] and [1, 0, 1] compare equal.
[0, 0, 1] and [0, 1] compare unequal.
- assertDictEqual(d1, d2, msg=None)
- assertEqual(first, second, msg=None)
Fail if the two objects are unequal as determined by the ‘==’ operator.
- assertFalse(expr, msg=None)
Check that the expression is false.
- assertGreater(a, b, msg=None)
Just like self.assertTrue(a > b), but with a nicer default message.
- assertGreaterEqual(a, b, msg=None)
Just like self.assertTrue(a >= b), but with a nicer default message.
- assertIn(member, container, msg=None)
Just like self.assertTrue(a in b), but with a nicer default message.
- assertIs(expr1, expr2, msg=None)
Just like self.assertTrue(a is b), but with a nicer default message.
- assertIsInstance(obj, cls, msg=None)
Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.
- assertIsNone(obj, msg=None)
Same as self.assertTrue(obj is None), with a nicer default message.
- assertIsNot(expr1, expr2, msg=None)
Just like self.assertTrue(a is not b), but with a nicer default message.
- assertIsNotNone(obj, msg=None)
Included for symmetry with assertIsNone.
- assertLess(a, b, msg=None)
Just like self.assertTrue(a < b), but with a nicer default message.
- assertLessEqual(a, b, msg=None)
Just like self.assertTrue(a <= b), but with a nicer default message.
- assertListEqual(list1, list2, msg=None)
A list-specific equality assertion.
- Parameters:
list1 – The first list to compare.
list2 – The second list to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertLogs(logger=None, level=None)
Fail unless a log message of level level or higher is emitted on logger_name or its children. If omitted, level defaults to INFO and logger defaults to the root logger.
This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.
Example:
with self.assertLogs('foo', level='INFO') as cm:
    logging.getLogger('foo').info('first message')
    logging.getLogger('foo.bar').error('second message')
self.assertEqual(cm.output, ['INFO:foo:first message',
                             'ERROR:foo.bar:second message'])
- assertMultiLineEqual(first, second, msg=None)
Assert that two multi-line strings are equal.
- assertNoLogs(logger=None, level=None)
Fail unless no log messages of level level or higher are emitted on logger_name or its children.
This method must be used as a context manager.
- assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
Objects that are equal automatically fail.
- assertNotEqual(first, second, msg=None)
Fail if the two objects are equal as determined by the ‘!=’ operator.
- assertNotIn(member, container, msg=None)
Just like self.assertTrue(a not in b), but with a nicer default message.
- assertNotIsInstance(obj, cls, msg=None)
Included for symmetry with assertIsInstance.
- assertNotRegex(text, unexpected_regex, msg=None)
Fail the test if the text matches the regular expression.
- assertRaises(expected_exception, *args, **kwargs)
Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.
If called with the callable and arguments omitted, will return a context object used like this:
with self.assertRaises(SomeException):
    do_something()
An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.
The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:
with self.assertRaises(SomeException) as cm:
    do_something()
the_exception = cm.exception
self.assertEqual(the_exception.error_code, 3)
- assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)
Asserts that the message in a raised exception matches a regex.
- Parameters:
expected_exception – Exception class expected to be raised.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.
- assertRegex(text, expected_regex, msg=None)
Fail the test unless the text matches the regular expression.
- assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)
An equality assertion for ordered sequences (like lists and tuples).
For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.
- Parameters:
seq1 – The first sequence to compare.
seq2 – The second sequence to compare.
seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.
msg – Optional message to use on failure instead of a list of differences.
- assertSetEqual(set1, set2, msg=None)
A set-specific equality assertion.
- Parameters:
set1 – The first set to compare.
set2 – The second set to compare.
msg – Optional message to use on failure instead of a list of differences.
assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).
- assertTrue(expr, msg=None)
Check that the expression is true.
- assertTupleEqual(tuple1, tuple2, msg=None)
A tuple-specific equality assertion.
- Parameters:
tuple1 – The first tuple to compare.
tuple2 – The second tuple to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertWarns(expected_warning, *args, **kwargs)
Fail unless a warning of class warnClass is triggered by the callable when invoked with specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.
If called with the callable and arguments omitted, will return a context object used like this:
with self.assertWarns(SomeWarning):
    do_something()
An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.
The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:
with self.assertWarns(SomeWarning) as cm:
    do_something()
the_warning = cm.warning
self.assertEqual(the_warning.some_attribute, 147)
- assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)
Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.
- Parameters:
expected_warning – Warning class expected to be triggered.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.
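A minimal usage sketch (the warning message and regex are illustrative):

```python
import unittest
import warnings

class WarnDemo(unittest.TestCase):
    def test_deprecation_message(self):
        # Only warnings whose message also matches the regex count as a match.
        with self.assertWarnsRegex(DeprecationWarning, r"use .* instead"):
            warnings.warn("use new_api() instead", DeprecationWarning)

result = unittest.TestResult()
unittest.TestLoader().loadTestsFromTestCase(WarnDemo).run(result)
```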
- checkConsistency(ds: QSPRDataset)[source]
- clearGenerated()
Remove the directories that are used for testing.
- countTestCases()
- createLargeMultitaskDataSet(name='QSPRDataset_multi_test', target_props=[{'name': 'HBD', 'task': <TargetTasks.MULTICLASS: 'MULTICLASS'>, 'th': [-1, 1, 2, 100]}, {'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42)
Create a large dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
preparation_settings (dict) – dictionary containing preparation settings
random_state (int) – random state to use for splitting and shuffling
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createLargeTestDataSet(name='QSPRDataset_test_large', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42, n_jobs=1, chunk_size=None)
Create a large dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
preparation_settings (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createSmallTestDataSet(name='QSPRDataset_test_small', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42)
Create a small dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
preparation_settings (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createTestDataSetFromFrame(df, name='QSPRDataset_test', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], random_state=None, prep=None, n_jobs=1, chunk_size=None)
Create a dataset for testing purposes from the given data frame.
- Parameters:
df (pd.DataFrame) – data frame containing the dataset
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
prep (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- debug()
Run the test without collecting errors in a TestResult
- defaultTestResult()
- classmethod doClassCleanups()
Execute all class cleanup functions. Normally called for you after tearDownClass.
- doCleanups()
Execute all cleanup functions. Normally called for you after tearDown.
- classmethod enterClassContext(cm)
Same as enterContext, but class-wide.
- enterContext(cm)
Enters the supplied context manager.
If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.
- fail(msg=None)
Fail immediately, with the given message.
- failureException
alias of AssertionError
- classmethod getAllDescriptors()
Return a list of (ideally) all available descriptor sets. For now they need to be added manually to the list below.
TODO: would be nice to create the list automatically by implementing a descriptor set registry that would hold all installed descriptor sets.
- getBigDF()
Get a large data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- classmethod getDataPrepGrid()
Return a list of many possible combinations of descriptor calculators, splits, feature standardizers, feature filters and data filters. Again, this is not exhaustive, but should cover a lot of cases.
- Returns:
a generator that yields tuples of all possible combinations as stated above, each tuple is defined as: (descriptor_calculator, split, feature_standardizer, feature_filters, data_filters)
- Return type:
grid
- classmethod getDefaultCalculatorCombo()
Makes a list of default descriptor calculators that can be used in tests. It creates a calculator with only morgan fingerprints and rdkit descriptors, but also one with them both to test behaviour with multiple descriptor sets. Override this method if you want to test with other descriptor sets and calculator combinations.
- static getDefaultPrep()
Return a dictionary with default preparation settings.
- classmethod getPrepCombos()
Return a list of all possible preparation combinations as generated by getDataPrepGrid, as well as their names. The generated list can be used to parameterize tests with the given named combinations.
- getSmallDF()
Get a small data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- id()
- longMessage = True
- maxDiff = 640
- run(result=None)
- classmethod setUpClass()
Hook method for setting up class fixture before running tests in the class.
- setUpPaths()
Create the directories that are used for testing.
- shortDescription()
Returns a one-line description of the test, or None if no description has been provided.
The default implementation of this method returns the first line of the specified test method’s docstring.
- skipTest(reason)
Skip this test.
- subTest(msg=<object object>, **params)
Return a context manager that will return the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.
- tearDown()
Remove all files and directories that are used for testing.
- classmethod tearDownClass()
Hook method for deconstructing the class fixture after running all tests in the class.
- testInvalidsDetection = None
- testInvalidsDetection_0(**kw)
- testInvalidsDetection_1(**kw)
- testTargetProperty()[source]
Test target property creation and serialization in the context of a dataset.
- validate_split(dataset)
Check if the split has the data it should have after splitting.
- class qsprpred.data.tables.tests.TestDataSetPreparation(methodName='runTest')[source]
Bases: DataSetsPathMixIn, DataPrepCheckMixIn, QSPRTestCase
Test as many possible combinations of data sets and their preparation settings. These can potentially run for a long time, so use the skip decorator if you want to skip all these tests to speed things up during development.
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
- classmethod addClassCleanup(function, /, *args, **kwargs)
Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).
- addCleanup(function, /, *args, **kwargs)
Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.
Cleanup items are called even if setUp fails (unlike tearDown).
- addTypeEqualityFunc(typeobj, function)
Add a type specific assertEqual style function to compare a type.
This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.
- Parameters:
typeobj – The data type to call this function on when both values are of the same type in assertEqual().
function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.
- assertAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
If the two objects compare equal then they will automatically compare almost equal.
- assertCountEqual(first, second, msg=None)
Asserts that two iterables have the same elements, the same number of times, without regard to order.
- Equivalent to: self.assertEqual(Counter(list(first)), Counter(list(second)))
- Example:
[0, 1, 1] and [1, 0, 1] compare equal.
[0, 0, 1] and [0, 1] compare unequal.
- assertDictEqual(d1, d2, msg=None)
- assertEqual(first, second, msg=None)
Fail if the two objects are unequal as determined by the ‘==’ operator.
- assertFalse(expr, msg=None)
Check that the expression is false.
- assertGreater(a, b, msg=None)
Just like self.assertTrue(a > b), but with a nicer default message.
- assertGreaterEqual(a, b, msg=None)
Just like self.assertTrue(a >= b), but with a nicer default message.
- assertIn(member, container, msg=None)
Just like self.assertTrue(a in b), but with a nicer default message.
- assertIs(expr1, expr2, msg=None)
Just like self.assertTrue(a is b), but with a nicer default message.
- assertIsInstance(obj, cls, msg=None)
Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.
- assertIsNone(obj, msg=None)
Same as self.assertTrue(obj is None), with a nicer default message.
- assertIsNot(expr1, expr2, msg=None)
Just like self.assertTrue(a is not b), but with a nicer default message.
- assertIsNotNone(obj, msg=None)
Included for symmetry with assertIsNone.
- assertLess(a, b, msg=None)
Just like self.assertTrue(a < b), but with a nicer default message.
- assertLessEqual(a, b, msg=None)
Just like self.assertTrue(a <= b), but with a nicer default message.
- assertListEqual(list1, list2, msg=None)
A list-specific equality assertion.
- Parameters:
list1 – The first list to compare.
list2 – The second list to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertLogs(logger=None, level=None)
Fail unless a log message of level level or higher is emitted on logger_name or its children. If omitted, level defaults to INFO and logger defaults to the root logger.
This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.
Example:
with self.assertLogs('foo', level='INFO') as cm:
    logging.getLogger('foo').info('first message')
    logging.getLogger('foo.bar').error('second message')
self.assertEqual(cm.output, ['INFO:foo:first message',
                             'ERROR:foo.bar:second message'])
- assertMultiLineEqual(first, second, msg=None)
Assert that two multi-line strings are equal.
- assertNoLogs(logger=None, level=None)
Fail unless no log messages of level level or higher are emitted on logger_name or its children.
This method must be used as a context manager.
- assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
Objects that are equal automatically fail.
- assertNotEqual(first, second, msg=None)
Fail if the two objects are equal as determined by the ‘!=’ operator.
- assertNotIn(member, container, msg=None)
Just like self.assertTrue(a not in b), but with a nicer default message.
- assertNotIsInstance(obj, cls, msg=None)
Included for symmetry with assertIsInstance.
- assertNotRegex(text, unexpected_regex, msg=None)
Fail the test if the text matches the regular expression.
- assertRaises(expected_exception, *args, **kwargs)
Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.
If called with the callable and arguments omitted, will return a context object used like this:
with self.assertRaises(SomeException):
    do_something()
An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.
The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:
with self.assertRaises(SomeException) as cm:
    do_something()
the_exception = cm.exception
self.assertEqual(the_exception.error_code, 3)
- assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)
Asserts that the message in a raised exception matches a regex.
- Parameters:
expected_exception – Exception class expected to be raised.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.
- assertRegex(text, expected_regex, msg=None)
Fail the test unless the text matches the regular expression.
- assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)
An equality assertion for ordered sequences (like lists and tuples).
For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.
- Parameters:
seq1 – The first sequence to compare.
seq2 – The second sequence to compare.
seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.
msg – Optional message to use on failure instead of a list of differences.
- assertSetEqual(set1, set2, msg=None)
A set-specific equality assertion.
- Parameters:
set1 – The first set to compare.
set2 – The second set to compare.
msg – Optional message to use on failure instead of a list of differences.
assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).
- assertTrue(expr, msg=None)
Check that the expression is true.
- assertTupleEqual(tuple1, tuple2, msg=None)
A tuple-specific equality assertion.
- Parameters:
tuple1 – The first tuple to compare.
tuple2 – The second tuple to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertWarns(expected_warning, *args, **kwargs)
Fail unless a warning of class warnClass is triggered by the callable when invoked with specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.
If called with the callable and arguments omitted, will return a context object used like this:
with self.assertWarns(SomeWarning):
    do_something()
An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.
The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:
with self.assertWarns(SomeWarning) as cm:
    do_something()
the_warning = cm.warning
self.assertEqual(the_warning.some_attribute, 147)
- assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)
Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.
- Parameters:
expected_warning – Warning class expected to be triggered.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.
- checkDescriptors(dataset: QSPRDataset, target_props: list[dict | qsprpred.tasks.TargetProperty])
Check if information about descriptors is consistent in the data set. Checks if calculators are consistent with the descriptors contained in the data set. This is tested also before and after serialization.
- Parameters:
dataset (QSPRDataset) – The data set to check.
target_props (List of dicts or TargetProperty) – list of target properties
- Raises:
AssertionError – If the consistency check fails.
- checkFeatures(ds: QSPRDataset, expected_length: int)
Check if the feature names and the feature matrix of a data set is consistent with expected number of variables.
- Parameters:
ds (QSPRDataset) – The data set to check.
expected_length (int) – The expected number of features.
- Raises:
AssertionError – If the feature names or the feature matrix is not consistent
- checkPrep(dataset, feature_calculators, split, feature_standardizer, feature_filter, data_filter, applicability_domain, expected_target_props)
Check the consistency of the dataset after preparation.
- clearGenerated()
Remove the directories that are used for testing.
- countTestCases()
- createLargeMultitaskDataSet(name='QSPRDataset_multi_test', target_props=[{'name': 'HBD', 'task': <TargetTasks.MULTICLASS: 'MULTICLASS'>, 'th': [-1, 1, 2, 100]}, {'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42)
Create a large dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
preparation_settings (dict) – dictionary containing preparation settings
random_state (int) – random state to use for splitting and shuffling
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createLargeTestDataSet(name='QSPRDataset_test_large', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42, n_jobs=1, chunk_size=None)
Create a large dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
preparation_settings (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createSmallTestDataSet(name='QSPRDataset_test_small', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42)
Create a small dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
preparation_settings (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createTestDataSetFromFrame(df, name='QSPRDataset_test', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], random_state=None, prep=None, n_jobs=1, chunk_size=None)
Create a dataset for testing purposes from the given data frame.
- Parameters:
df (pd.DataFrame) – data frame containing the dataset
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
prep (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- debug()
Run the test without collecting errors in a TestResult
- defaultTestResult()
- classmethod doClassCleanups()
Execute all class cleanup functions. Normally called for you after tearDownClass.
- doCleanups()
Execute all cleanup functions. Normally called for you after tearDown.
- classmethod enterClassContext(cm)
Same as enterContext, but class-wide.
- enterContext(cm)
Enters the supplied context manager.
If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.
- fail(msg=None)
Fail immediately, with the given message.
- failureException
alias of AssertionError
- classmethod getAllDescriptors()
Return a list of (ideally) all available descriptor sets. For now they need to be added manually to the list below.
TODO: would be nice to create the list automatically by implementing a descriptor set registry that would hold all installed descriptor sets.
- getBigDF()
Get a large data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- classmethod getDataPrepGrid()
Return a list of many possible combinations of descriptor calculators, splits, feature standardizers, feature filters and data filters. Again, this is not exhaustive, but should cover a lot of cases.
- Returns:
a generator that yields tuples of all possible combinations as stated above, each tuple is defined as: (descriptor_calculator, split, feature_standardizer, feature_filters, data_filters)
- Return type:
grid
- classmethod getDefaultCalculatorCombo()
Makes a list of default descriptor calculators that can be used in tests. It creates a calculator with only morgan fingerprints and rdkit descriptors, but also one with them both to test behaviour with multiple descriptor sets. Override this method if you want to test with other descriptor sets and calculator combinations.
- static getDefaultPrep()
Return a dictionary with default preparation settings.
- classmethod getPrepCombos()
Return a list of all possible preparation combinations as generated by getDataPrepGrid, as well as their names. The generated list can be used to parameterize tests with the given named combinations.
- getSmallDF()
Get a small data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- id()
- longMessage = True
- maxDiff = 640
- run(result=None)
- classmethod setUpClass()
Hook method for setting up class fixture before running tests in the class.
- setUpPaths()
Create the directories that are used for testing.
- shortDescription()
Returns a one-line description of the test, or None if no description has been provided.
The default implementation of this method returns the first line of the specified test method’s docstring.
- skipTest(reason)
Skip this test.
- subTest(msg=<object object>, **params)
Return a context manager that will return the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.
- tearDown()
Remove all files and directories that are used for testing.
- classmethod tearDownClass()
Hook method for deconstructing the class fixture after running all tests in the class.
- testPrepCombos = None
- testPrepCombos_00_MorganFP_None_None_None_None_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_None_None_None_None_None’, name=’MorganFP_None_None_None_None_None’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7efff7ec5f70>,), split=None, feature_standardizer=None, feature_filter=None, data_filter=None, applicability_domain=None].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_01_MorganFP_None_None_None_None_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_None_None_None_None_TopKatApplicabilityDomain’, name=’MorganFP_None_None_None_None_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7efff7ec5c40>,), split=None, feature_standardizer=None, feature_filter=None, data_filter=None, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7ec5eb0>].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_02_MorganFP_None_None_None_RepeatsFilter_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_None_None_None_RepeatsFilter_None’, name=’MorganFP_None_None_None_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7efff7ec5f40>,), split=None, feature_standardizer=None, feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7ec5b80>, applicability_domain=None].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_03_MorganFP_None_None_None_RepeatsFilter_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_None_None_None_Repeats…ilter_TopKatApplicabilityDomain’, name=’MorganFP_None_None_None_Repeats…ilter_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7efff7ec6b10>,), split=None, feature_standardizer=None, feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7ec5df0>, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7ec6210>].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_04_MorganFP_None_None_HighCorrelationFilter_None_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_None_None_HighCorrelationFilter_None_None’, name=’MorganFP_None_None_HighCorrelationFilter_None_None’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7efff7ec5ee0>,), split=None, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7ec6060>, data_filter=None, applicability_domain=None].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_05_MorganFP_None_None_HighCorrelationFilter_None_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_None_None_HighCorrelat…_None_TopKatApplicabilityDomain’, name=’MorganFP_None_None_HighCorrelat…_None_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7efff9335070>,), split=None, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7ec5fd0>, data_filter=None, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7ec5250>].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_06_MorganFP_None_None_HighCorrelationFilter_RepeatsFilter_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_None_None_HighCorrelationFilter_RepeatsFilter_None’, name=’MorganFP_None_None_HighCorrelationFilter_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7efff7ec4920>,), split=None, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7ec48c0>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7ec4560>, applicability_domain=None].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_07_MorganFP_None_None_HighCorrelationFilter_RepeatsFilter_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_None_None_HighCorrelat…ilter_TopKatApplicabilityDomain’, name=’MorganFP_None_None_HighCorrelat…ilter_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7efff7ec4800>,), split=None, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7ec47d0>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7ec4770>, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7ec4710>].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_08_MorganFP_None_StandardScaler_None_None_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_None_StandardScaler_None_None_None’, name=’MorganFP_None_StandardScaler_None_None_None’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7efff7ec4680>,), split=None, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=None, applicability_domain=None].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_09_MorganFP_None_StandardScaler_None_None_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_None_StandardScaler_No…_None_TopKatApplicabilityDomain’, name=’MorganFP_None_StandardScaler_No…_None_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7efff7ec45f0>,), split=None, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=None, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff83d5490>].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_10_MorganFP_None_StandardScaler_None_RepeatsFilter_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_None_StandardScaler_None_RepeatsFilter_None’, name=’MorganFP_None_StandardScaler_None_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7efff7e7bc20>,), split=None, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7e7be90>, applicability_domain=None].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_11_MorganFP_None_StandardScaler_None_RepeatsFilter_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_None_StandardScaler_No…ilter_TopKatApplicabilityDomain’, name=’MorganFP_None_StandardScaler_No…ilter_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7efff7e7a540>,), split=None, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7e7b740>, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7e7bb60>].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_12_MorganFP_None_StandardScaler_HighCorrelationFilter_None_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_None_StandardScaler_HighCorrelationFilter_None_None’, name=’MorganFP_None_StandardScaler_HighCorrelationFilter_None_None’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7efff7e7b7a0>,), split=None, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7e7a840>, data_filter=None, applicability_domain=None].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_13_MorganFP_None_StandardScaler_HighCorrelationFilter_None_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_None_StandardScaler_Hi…_None_TopKatApplicabilityDomain’, name=’MorganFP_None_StandardScaler_Hi…_None_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7efff7e7a3f0>,), split=None, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7e7bb30>, data_filter=None, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7e7ae10>].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_14_MorganFP_None_StandardScaler_HighCorrelationFilter_RepeatsFilter_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_None_StandardScaler_Hi…lationFilter_RepeatsFilter_None’, name=’MorganFP_None_StandardScaler_Hi…lationFilter_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7efff7e7bad0>,), split=None, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7e7ba40>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7e7b9e0>, applicability_domain=None].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_15_MorganFP_None_StandardScaler_HighCorrelationFilter_RepeatsFilter_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_None_StandardScaler_Hi…ilter_TopKatApplicabilityDomain’, name=’MorganFP_None_StandardScaler_Hi…ilter_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7efff7e7b950>,), split=None, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7e7b8c0>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7e7b800>, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7e7aab0>].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_16_MorganFP_RandomSplit_None_None_None_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RandomSplit_None_None_None_None’, name=’MorganFP_RandomSplit_None_None_None_None’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7efff7e795e0>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7e7a120>, feature_standardizer=None, feature_filter=None, data_filter=None, applicability_domain=None].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_17_MorganFP_RandomSplit_None_None_None_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RandomSplit_None_None_None_TopKatApplicabilityDomain’, name=’MorganFP_RandomSplit_None_None_None_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7efff7e7a210>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7e796a0>, feature_standardizer=None, feature_filter=None, data_filter=None, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7e79d30>].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_18_MorganFP_RandomSplit_None_None_RepeatsFilter_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RandomSplit_None_None_RepeatsFilter_None’, name=’MorganFP_RandomSplit_None_None_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7efff7e79430>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7e793d0>, feature_standardizer=None, feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7e79880>, applicability_domain=None].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_19_MorganFP_RandomSplit_None_None_RepeatsFilter_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RandomSplit_None_None_…ilter_TopKatApplicabilityDomain’, name=’MorganFP_RandomSplit_None_None_…ilter_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7efff7e7a750>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7e7a810>, feature_standardizer=None, feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7e7b020>, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7e7b0e0>].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_20_MorganFP_RandomSplit_None_HighCorrelationFilter_None_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RandomSplit_None_HighCorrelationFilter_None_None’, name=’MorganFP_RandomSplit_None_HighCorrelationFilter_None_None’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7efff7e7b170>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7e7b1d0>, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7e7a990>, data_filter=None, applicability_domain=None].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_21_MorganFP_RandomSplit_None_HighCorrelationFilter_None_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RandomSplit_None_HighC…_None_TopKatApplicabilityDomain’, name=’MorganFP_RandomSplit_None_HighC…_None_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7efff7dde3f0>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7dddfd0>, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7ddf080>, data_filter=None, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7dde870>].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_22_MorganFP_RandomSplit_None_HighCorrelationFilter_RepeatsFilter_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RandomSplit_None_HighC…lationFilter_RepeatsFilter_None’, name=’MorganFP_RandomSplit_None_HighC…lationFilter_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7efff7ddea20>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7ddffb0>, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7dde4e0>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7ddff80>, applicability_domain=None].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_23_MorganFP_RandomSplit_None_HighCorrelationFilter_RepeatsFilter_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RandomSplit_None_HighC…ilter_TopKatApplicabilityDomain’, name=’MorganFP_RandomSplit_None_HighC…ilter_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7efff7ddd6d0>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7ddd610>, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7ddd640>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7ddcf20>, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7ddcef0>].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_24_MorganFP_RandomSplit_StandardScaler_None_None_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RandomSplit_StandardScaler_None_None_None’, name=’MorganFP_RandomSplit_StandardScaler_None_None_None’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7efff7ddd5b0>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7ddd580>, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=None, applicability_domain=None].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_25_MorganFP_RandomSplit_StandardScaler_None_None_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RandomSplit_StandardSc…_None_TopKatApplicabilityDomain’, name=’MorganFP_RandomSplit_StandardSc…_None_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7efff7e32990>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7e33ef0>, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=None, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7e30470>].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_26_MorganFP_RandomSplit_StandardScaler_None_RepeatsFilter_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RandomSplit_StandardScaler_None_RepeatsFilter_None’, name=’MorganFP_RandomSplit_StandardScaler_None_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7efff7e30350>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7e30380>, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7e30050>, applicability_domain=None].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_27_MorganFP_RandomSplit_StandardScaler_None_RepeatsFilter_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RandomSplit_StandardSc…ilter_TopKatApplicabilityDomain’, name=’MorganFP_RandomSplit_StandardSc…ilter_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7efff7e30bf0>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7e30c80>, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7e305f0>, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7e31160>].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_28_MorganFP_RandomSplit_StandardScaler_HighCorrelationFilter_None_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RandomSplit_StandardSc…HighCorrelationFilter_None_None’, name=’MorganFP_RandomSplit_StandardSc…HighCorrelationFilter_None_None’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7efff7e30c20>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d083e0>, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d08320>, data_filter=None, applicability_domain=None].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_29_MorganFP_RandomSplit_StandardScaler_HighCorrelationFilter_None_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RandomSplit_StandardSc…_None_TopKatApplicabilityDomain’, name=’MorganFP_RandomSplit_StandardSc…_None_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7efff7d08380>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d084d0>, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d08560>, data_filter=None, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d085c0>].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_30_MorganFP_RandomSplit_StandardScaler_HighCorrelationFilter_RepeatsFilter_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RandomSplit_StandardSc…lationFilter_RepeatsFilter_None’, name=’MorganFP_RandomSplit_StandardSc…lationFilter_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7efff7d08650>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d08680>, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d08740>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d087a0>, applicability_domain=None].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_31_MorganFP_RandomSplit_StandardScaler_HighCorrelationFilter_RepeatsFilter_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RandomSplit_StandardSc…ilter_TopKatApplicabilityDomain’, name=’MorganFP_RandomSplit_StandardSc…ilter_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.fing…anFP object at 0x7efff7d08830>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d08860>, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d08920>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d08980>, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d089e0>].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_32_RDKitDescs_None_None_None_None_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_None_None_None_None_None’, name=’RDKitDescs_None_None_None_None_None’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7efff7d08a70>,), split=None, feature_standardizer=None, feature_filter=None, data_filter=None, applicability_domain=None].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_33_RDKitDescs_None_None_None_None_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_None_None_None_None_TopKatApplicabilityDomain’, name=’RDKitDescs_None_None_None_None_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7efff7d08a10>,), split=None, feature_standardizer=None, feature_filter=None, data_filter=None, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d08aa0>].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_34_RDKitDescs_None_None_None_RepeatsFilter_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_None_None_None_RepeatsFilter_None’, name=’RDKitDescs_None_None_None_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7efff7d08b30>,), split=None, feature_standardizer=None, feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d08b60>, applicability_domain=None].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_35_RDKitDescs_None_None_None_RepeatsFilter_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_None_None_None_Repea…ilter_TopKatApplicabilityDomain’, name=’RDKitDescs_None_None_None_Repea…ilter_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7efff7d08bf0>,), split=None, feature_standardizer=None, feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d08c20>, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d08c80>].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_36_RDKitDescs_None_None_HighCorrelationFilter_None_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_None_None_HighCorrelationFilter_None_None’, name=’RDKitDescs_None_None_HighCorrelationFilter_None_None’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7efff7d08d10>,), split=None, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d08d40>, data_filter=None, applicability_domain=None].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_37_RDKitDescs_None_None_HighCorrelationFilter_None_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_None_None_HighCorrel…_None_TopKatApplicabilityDomain’, name=’RDKitDescs_None_None_HighCorrel…_None_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7efff7d08dd0>,), split=None, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d08e00>, data_filter=None, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d08e60>].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_38_RDKitDescs_None_None_HighCorrelationFilter_RepeatsFilter_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_None_None_HighCorrelationFilter_RepeatsFilter_None’, name=’RDKitDescs_None_None_HighCorrelationFilter_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7efff7d08ef0>,), split=None, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d08f20>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d08f80>, applicability_domain=None].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_39_RDKitDescs_None_None_HighCorrelationFilter_RepeatsFilter_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_None_None_HighCorrel…ilter_TopKatApplicabilityDomain’, name=’RDKitDescs_None_None_HighCorrel…ilter_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7efff7d09010>,), split=None, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d09040>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d090a0>, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d09100>].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_40_RDKitDescs_None_StandardScaler_None_None_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_None_StandardScaler_None_None_None’, name=’RDKitDescs_None_StandardScaler_None_None_None’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7efff7d09190>,), split=None, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=None, applicability_domain=None].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_41_RDKitDescs_None_StandardScaler_None_None_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_None_StandardScaler_…_None_TopKatApplicabilityDomain’, name=’RDKitDescs_None_StandardScaler_…_None_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7efff7d09250>,), split=None, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=None, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d092e0>].
- testPrepCombos_42_RDKitDescs_None_StandardScaler_None_RepeatsFilter_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_None_StandardScaler_None_RepeatsFilter_None’, name=’RDKitDescs_None_StandardScaler_None_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7efff7d09370>,), split=None, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d09400>, applicability_domain=None].
- testPrepCombos_43_RDKitDescs_None_StandardScaler_None_RepeatsFilter_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_None_StandardScaler_…ilter_TopKatApplicabilityDomain’, name=’RDKitDescs_None_StandardScaler_…ilter_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7efff7d09490>,), split=None, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d09520>, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d09580>].
- testPrepCombos_44_RDKitDescs_None_StandardScaler_HighCorrelationFilter_None_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_None_StandardScaler_HighCorrelationFilter_None_None’, name=’RDKitDescs_None_StandardScaler_HighCorrelationFilter_None_None’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7efff7d09610>,), split=None, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d096a0>, data_filter=None, applicability_domain=None].
- testPrepCombos_45_RDKitDescs_None_StandardScaler_HighCorrelationFilter_None_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_None_StandardScaler_…_None_TopKatApplicabilityDomain’, name=’RDKitDescs_None_StandardScaler_…_None_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7efff7d09730>,), split=None, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d097f0>, data_filter=None, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d09850>].
- testPrepCombos_46_RDKitDescs_None_StandardScaler_HighCorrelationFilter_RepeatsFilter_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_None_StandardScaler_…lationFilter_RepeatsFilter_None’, name=’RDKitDescs_None_StandardScaler_…lationFilter_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7efff7d098e0>,), split=None, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d099a0>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d09a00>, applicability_domain=None].
- testPrepCombos_47_RDKitDescs_None_StandardScaler_HighCorrelationFilter_RepeatsFilter_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_None_StandardScaler_…ilter_TopKatApplicabilityDomain’, name=’RDKitDescs_None_StandardScaler_…ilter_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7efff7d09a90>,), split=None, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d09b50>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d09bb0>, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d09c10>].
- testPrepCombos_48_RDKitDescs_RandomSplit_None_None_None_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_RandomSplit_None_None_None_None’, name=’RDKitDescs_RandomSplit_None_None_None_None’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7efff7d09ca0>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d09cd0>, feature_standardizer=None, feature_filter=None, data_filter=None, applicability_domain=None].
- testPrepCombos_49_RDKitDescs_RandomSplit_None_None_None_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_RandomSplit_None_Non…_None_TopKatApplicabilityDomain’, name=’RDKitDescs_RandomSplit_None_Non…_None_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7efff7d09d60>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d09d90>, feature_standardizer=None, feature_filter=None, data_filter=None, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d09df0>].
- testPrepCombos_50_RDKitDescs_RandomSplit_None_None_RepeatsFilter_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_RandomSplit_None_None_RepeatsFilter_None’, name=’RDKitDescs_RandomSplit_None_None_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7efff7d09e80>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d09eb0>, feature_standardizer=None, feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d09f10>, applicability_domain=None].
- testPrepCombos_51_RDKitDescs_RandomSplit_None_None_RepeatsFilter_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_RandomSplit_None_Non…ilter_TopKatApplicabilityDomain’, name=’RDKitDescs_RandomSplit_None_Non…ilter_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7efff7d09fa0>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d09fd0>, feature_standardizer=None, feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d0a030>, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d0a090>].
- testPrepCombos_52_RDKitDescs_RandomSplit_None_HighCorrelationFilter_None_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_RandomSplit_None_HighCorrelationFilter_None_None’, name=’RDKitDescs_RandomSplit_None_HighCorrelationFilter_None_None’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7efff7d0a120>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d0a150>, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d0a1b0>, data_filter=None, applicability_domain=None].
- testPrepCombos_53_RDKitDescs_RandomSplit_None_HighCorrelationFilter_None_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_RandomSplit_None_Hig…_None_TopKatApplicabilityDomain’, name=’RDKitDescs_RandomSplit_None_Hig…_None_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7efff7d0a240>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d0a270>, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d0a2d0>, data_filter=None, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d0a330>].
- testPrepCombos_54_RDKitDescs_RandomSplit_None_HighCorrelationFilter_RepeatsFilter_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_RandomSplit_None_Hig…lationFilter_RepeatsFilter_None’, name=’RDKitDescs_RandomSplit_None_Hig…lationFilter_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7efff7d0a3c0>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d0a3f0>, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d0a450>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d0a4b0>, applicability_domain=None].
- testPrepCombos_55_RDKitDescs_RandomSplit_None_HighCorrelationFilter_RepeatsFilter_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_RandomSplit_None_Hig…ilter_TopKatApplicabilityDomain’, name=’RDKitDescs_RandomSplit_None_Hig…ilter_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7efff7d0a540>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d0a570>, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d0a5d0>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d0a630>, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d0a690>].
- testPrepCombos_56_RDKitDescs_RandomSplit_StandardScaler_None_None_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_RandomSplit_StandardScaler_None_None_None’, name=’RDKitDescs_RandomSplit_StandardScaler_None_None_None’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7efff7d0a720>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d0a750>, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=None, applicability_domain=None].
- testPrepCombos_57_RDKitDescs_RandomSplit_StandardScaler_None_None_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_RandomSplit_Standard…_None_TopKatApplicabilityDomain’, name=’RDKitDescs_RandomSplit_Standard…_None_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7efff7d0a870>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d0a8a0>, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=None, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d0a990>].
- testPrepCombos_58_RDKitDescs_RandomSplit_StandardScaler_None_RepeatsFilter_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_RandomSplit_StandardScaler_None_RepeatsFilter_None’, name=’RDKitDescs_RandomSplit_StandardScaler_None_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7efff7d0aa20>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d0aa50>, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d0ab40>, applicability_domain=None].
- testPrepCombos_59_RDKitDescs_RandomSplit_StandardScaler_None_RepeatsFilter_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_RandomSplit_Standard…ilter_TopKatApplicabilityDomain’, name=’RDKitDescs_RandomSplit_Standard…ilter_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7efff7d0abd0>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d0ac00>, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d0acf0>, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d0ad50>].
- testPrepCombos_60_RDKitDescs_RandomSplit_StandardScaler_HighCorrelationFilter_None_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_RandomSplit_Standard…HighCorrelationFilter_None_None’, name=’RDKitDescs_RandomSplit_Standard…HighCorrelationFilter_None_None’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7efff7d0ade0>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d0ae10>, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d0af00>, data_filter=None, applicability_domain=None].
- testPrepCombos_61_RDKitDescs_RandomSplit_StandardScaler_HighCorrelationFilter_None_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_RandomSplit_Standard…_None_TopKatApplicabilityDomain’, name=’RDKitDescs_RandomSplit_Standard…_None_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7efff7d0af90>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d0afc0>, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d0b0b0>, data_filter=None, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d0b110>].
- testPrepCombos_62_RDKitDescs_RandomSplit_StandardScaler_HighCorrelationFilter_RepeatsFilter_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_RandomSplit_Standard…lationFilter_RepeatsFilter_None’, name=’RDKitDescs_RandomSplit_Standard…lationFilter_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7efff7d0b1a0>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d0b1d0>, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d0b2c0>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d0b320>, applicability_domain=None].
- testPrepCombos_63_RDKitDescs_RandomSplit_StandardScaler_HighCorrelationFilter_RepeatsFilter_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’RDKitDescs_RandomSplit_Standard…ilter_TopKatApplicabilityDomain’, name=’RDKitDescs_RandomSplit_Standard…ilter_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.sets…escs object at 0x7efff7d0b3b0>,), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d0b3e0>, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d0b4d0>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d0b530>, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d0b590>].
- testPrepCombos_64_MorganFP_RDKitDescs_None_None_None_None_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_None_None_None_None_None’, name=’MorganFP_RDKitDescs_None_None_None_None_None’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7efff7d0b680>), split=None, feature_standardizer=None, feature_filter=None, data_filter=None, applicability_domain=None].
- testPrepCombos_65_MorganFP_RDKitDescs_None_None_None_None_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_None_None_N…_None_TopKatApplicabilityDomain’, name=’MorganFP_RDKitDescs_None_None_N…_None_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7efff7d0b740>), split=None, feature_standardizer=None, feature_filter=None, data_filter=None, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d0b770>].
- testPrepCombos_66_MorganFP_RDKitDescs_None_None_None_RepeatsFilter_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_None_None_None_RepeatsFilter_None’, name=’MorganFP_RDKitDescs_None_None_None_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7efff7d0b860>), split=None, feature_standardizer=None, feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d0b890>, applicability_domain=None].
- testPrepCombos_67_MorganFP_RDKitDescs_None_None_None_RepeatsFilter_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_None_None_N…ilter_TopKatApplicabilityDomain’, name=’MorganFP_RDKitDescs_None_None_N…ilter_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7efff7d0b980>), split=None, feature_standardizer=None, feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d0b9b0>, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d0ba10>].
- testPrepCombos_68_MorganFP_RDKitDescs_None_None_HighCorrelationFilter_None_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_None_None_HighCorrelationFilter_None_None’, name=’MorganFP_RDKitDescs_None_None_HighCorrelationFilter_None_None’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7efff7d0bb00>), split=None, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d0bb30>, data_filter=None, applicability_domain=None].
- testPrepCombos_69_MorganFP_RDKitDescs_None_None_HighCorrelationFilter_None_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_None_None_H…_None_TopKatApplicabilityDomain’, name=’MorganFP_RDKitDescs_None_None_H…_None_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7efff7d0bc50>), split=None, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d0bc80>, data_filter=None, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d0bce0>].
- testPrepCombos_70_MorganFP_RDKitDescs_None_None_HighCorrelationFilter_RepeatsFilter_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_None_None_H…lationFilter_RepeatsFilter_None’, name=’MorganFP_RDKitDescs_None_None_H…lationFilter_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7efff7d0bdd0>), split=None, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d0be00>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d0be60>, applicability_domain=None].
- testPrepCombos_71_MorganFP_RDKitDescs_None_None_HighCorrelationFilter_RepeatsFilter_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_None_None_H…ilter_TopKatApplicabilityDomain’, name=’MorganFP_RDKitDescs_None_None_H…ilter_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7efff7d0bf50>), split=None, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d0bf80>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d0bfe0>, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d64080>].
- testPrepCombos_72_MorganFP_RDKitDescs_None_StandardScaler_None_None_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_None_StandardScaler_None_None_None’, name=’MorganFP_RDKitDescs_None_StandardScaler_None_None_None’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7efff7d64170>), split=None, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=None, applicability_domain=None].
- testPrepCombos_73_MorganFP_RDKitDescs_None_StandardScaler_None_None_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_None_Standa…_None_TopKatApplicabilityDomain’, name=’MorganFP_RDKitDescs_None_Standa…_None_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7efff7d642c0>), split=None, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=None, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d64380>].
- testPrepCombos_74_MorganFP_RDKitDescs_None_StandardScaler_None_RepeatsFilter_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_None_Standa…dScaler_None_RepeatsFilter_None’, name=’MorganFP_RDKitDescs_None_Standa…dScaler_None_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7efff7d64470>), split=None, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d64530>, applicability_domain=None].
- testPrepCombos_75_MorganFP_RDKitDescs_None_StandardScaler_None_RepeatsFilter_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_None_Standa…ilter_TopKatApplicabilityDomain’, name=’MorganFP_RDKitDescs_None_Standa…ilter_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7efff7d64620>), split=None, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d646e0>, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d64740>].
- testPrepCombos_76_MorganFP_RDKitDescs_None_StandardScaler_HighCorrelationFilter_None_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_None_Standa…HighCorrelationFilter_None_None’, name=’MorganFP_RDKitDescs_None_Standa…HighCorrelationFilter_None_None’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7efff7d64830>), split=None, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d648f0>, data_filter=None, applicability_domain=None].
- testPrepCombos_77_MorganFP_RDKitDescs_None_StandardScaler_HighCorrelationFilter_None_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_None_Standa…_None_TopKatApplicabilityDomain’, name=’MorganFP_RDKitDescs_None_Standa…_None_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7efff7d649e0>), split=None, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d64aa0>, data_filter=None, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d64b00>].
- testPrepCombos_78_MorganFP_RDKitDescs_None_StandardScaler_HighCorrelationFilter_RepeatsFilter_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_None_Standa…lationFilter_RepeatsFilter_None’, name=’MorganFP_RDKitDescs_None_Standa…lationFilter_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7efff7d64bf0>), split=None, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d64ce0>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d64d40>, applicability_domain=None].
- testPrepCombos_79_MorganFP_RDKitDescs_None_StandardScaler_HighCorrelationFilter_RepeatsFilter_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_None_Standa…ilter_TopKatApplicabilityDomain’, name=’MorganFP_RDKitDescs_None_Standa…ilter_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7efff7d64e30>), split=None, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d64ef0>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d64f50>, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d64fb0>].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_80_MorganFP_RDKitDescs_RandomSplit_None_None_None_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_RandomSplit_None_None_None_None’, name=’MorganFP_RDKitDescs_RandomSplit_None_None_None_None’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7efff7d650a0>), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d650d0>, feature_standardizer=None, feature_filter=None, data_filter=None, applicability_domain=None].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_81_MorganFP_RDKitDescs_RandomSplit_None_None_None_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_RandomSplit…_None_TopKatApplicabilityDomain’, name=’MorganFP_RDKitDescs_RandomSplit…_None_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7efff7d651c0>), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d651f0>, feature_standardizer=None, feature_filter=None, data_filter=None, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d65250>].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_82_MorganFP_RDKitDescs_RandomSplit_None_None_RepeatsFilter_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_RandomSplit_None_None_RepeatsFilter_None’, name=’MorganFP_RDKitDescs_RandomSplit_None_None_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7efff7d65340>), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d65370>, feature_standardizer=None, feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d653d0>, applicability_domain=None].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_83_MorganFP_RDKitDescs_RandomSplit_None_None_RepeatsFilter_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_RandomSplit…ilter_TopKatApplicabilityDomain’, name=’MorganFP_RDKitDescs_RandomSplit…ilter_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7efff7d654c0>), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d654f0>, feature_standardizer=None, feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d65550>, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d655b0>].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_84_MorganFP_RDKitDescs_RandomSplit_None_HighCorrelationFilter_None_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_RandomSplit…HighCorrelationFilter_None_None’, name=’MorganFP_RDKitDescs_RandomSplit…HighCorrelationFilter_None_None’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7efff7d656a0>), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d656d0>, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d65730>, data_filter=None, applicability_domain=None].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_85_MorganFP_RDKitDescs_RandomSplit_None_HighCorrelationFilter_None_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_RandomSplit…_None_TopKatApplicabilityDomain’, name=’MorganFP_RDKitDescs_RandomSplit…_None_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7efff7d65820>), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d65850>, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d658b0>, data_filter=None, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d65910>].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_86_MorganFP_RDKitDescs_RandomSplit_None_HighCorrelationFilter_RepeatsFilter_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_RandomSplit…lationFilter_RepeatsFilter_None’, name=’MorganFP_RDKitDescs_RandomSplit…lationFilter_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7efff7d65a00>), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d65a30>, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d65a90>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d65af0>, applicability_domain=None].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_87_MorganFP_RDKitDescs_RandomSplit_None_HighCorrelationFilter_RepeatsFilter_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_RandomSplit…ilter_TopKatApplicabilityDomain’, name=’MorganFP_RDKitDescs_RandomSplit…ilter_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7efff7d65be0>), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d65c10>, feature_standardizer=None, feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d65c70>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d65cd0>, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d65d30>].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_88_MorganFP_RDKitDescs_RandomSplit_StandardScaler_None_None_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_RandomSplit_StandardScaler_None_None_None’, name=’MorganFP_RDKitDescs_RandomSplit_StandardScaler_None_None_None’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7efff7d65e20>), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d65e50>, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=None, applicability_domain=None].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_89_MorganFP_RDKitDescs_RandomSplit_StandardScaler_None_None_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_RandomSplit…_None_TopKatApplicabilityDomain’, name=’MorganFP_RDKitDescs_RandomSplit…_None_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7efff7d65fd0>), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d66000>, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=None, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d660f0>].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_90_MorganFP_RDKitDescs_RandomSplit_StandardScaler_None_RepeatsFilter_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_RandomSplit…dScaler_None_RepeatsFilter_None’, name=’MorganFP_RDKitDescs_RandomSplit…dScaler_None_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7efff7d661e0>), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d66210>, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d66300>, applicability_domain=None].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_91_MorganFP_RDKitDescs_RandomSplit_StandardScaler_None_RepeatsFilter_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_RandomSplit…ilter_TopKatApplicabilityDomain’, name=’MorganFP_RDKitDescs_RandomSplit…ilter_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7efff7d663f0>), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d66420>, feature_standardizer=StandardScaler(), feature_filter=None, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d66510>, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d66570>].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_92_MorganFP_RDKitDescs_RandomSplit_StandardScaler_HighCorrelationFilter_None_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_RandomSplit…HighCorrelationFilter_None_None’, name=’MorganFP_RDKitDescs_RandomSplit…HighCorrelationFilter_None_None’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7efff7d66660>), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d66690>, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d66780>, data_filter=None, applicability_domain=None].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_93_MorganFP_RDKitDescs_RandomSplit_StandardScaler_HighCorrelationFilter_None_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_RandomSplit…_None_TopKatApplicabilityDomain’, name=’MorganFP_RDKitDescs_RandomSplit…_None_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7efff7d66870>), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d668a0>, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d66990>, data_filter=None, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d669f0>].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_94_MorganFP_RDKitDescs_RandomSplit_StandardScaler_HighCorrelationFilter_RepeatsFilter_None(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_RandomSplit…lationFilter_RepeatsFilter_None’, name=’MorganFP_RDKitDescs_RandomSplit…lationFilter_RepeatsFilter_None’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7efff7d66ae0>), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d66b10>, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d66c00>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d66c60>, applicability_domain=None].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- testPrepCombos_95_MorganFP_RDKitDescs_RandomSplit_StandardScaler_HighCorrelationFilter_RepeatsFilter_TopKatApplicabilityDomain(**kw)
Tests one combination of a data set and its preparation settings [with _=’MorganFP_RDKitDescs_RandomSplit…ilter_TopKatApplicabilityDomain’, name=’MorganFP_RDKitDescs_RandomSplit…ilter_TopKatApplicabilityDomain’, feature_calculators=(<qsprpred.data.descriptors.fing…Descs object at 0x7efff7d66d50>), split=<qsprpred.data.sampling.splits.R…mSplit object at 0x7efff7d66d80>, feature_standardizer=StandardScaler(), feature_filter=<qsprpred.data.processing.featur…Filter object at 0x7efff7d66e70>, data_filter=<qsprpred.data.processing.data_f…Filter object at 0x7efff7d66ed0>, applicability_domain=<mlchemad.applicability_domains….Domain object at 0x7efff7d66f30>].
This generates a large number of parameterized tests. Use the skip decorator if you want to skip all these tests. Note that the combinations are not exhaustive, but defined by DataSetsPathMixIn.getPrepCombos().
- validate_split(dataset)
Check if the split has the data it should have after splitting.
- class qsprpred.data.tables.tests.TestSearchFeatures(methodName='runTest')[source]
Bases:
DataSetsPathMixIn, QSPRTestCase
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
- classmethod addClassCleanup(function, /, *args, **kwargs)
Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).
- addCleanup(function, /, *args, **kwargs)
Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.
Cleanup items are called even if setUp fails (unlike tearDown).
- addTypeEqualityFunc(typeobj, function)
Add a type specific assertEqual style function to compare a type.
This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.
- Parameters:
typeobj – The data type to call this function on when both values are of the same type in assertEqual().
function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.
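A minimal sketch of registering a type-specific comparator (the Point class and assertPointEqual helper are invented for illustration): assertEqual dispatches to the registered function whenever both operands are of the registered type.

```python
import unittest

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

class PointTest(unittest.TestCase):
    def setUp(self):
        # assertEqual will call this comparator whenever both
        # operands are Point instances.
        self.addTypeEqualityFunc(Point, self.assertPointEqual)

    def assertPointEqual(self, a, b, msg=None):
        if (a.x, a.y) != (b.x, b.y):
            raise self.failureException(
                msg or "Points differ: (%s, %s) != (%s, %s)"
                % (a.x, a.y, b.x, b.y))

    def test_equal_points(self):
        # Passes via the registered comparator, even though Point
        # defines no __eq__ of its own.
        self.assertEqual(Point(1, 2), Point(1, 2))
```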
- assertAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
If the two objects compare equal then they will automatically compare almost equal.
- assertCountEqual(first, second, msg=None)
Asserts that two iterables have the same elements, the same number of times, without regard to order.
self.assertEqual(Counter(list(first)), Counter(list(second)))
- Example:
[0, 1, 1] and [1, 0, 1] compare equal.
[0, 0, 1] and [0, 1] compare unequal.
- assertDictEqual(d1, d2, msg=None)
- assertEqual(first, second, msg=None)
Fail if the two objects are unequal as determined by the ‘==’ operator.
- assertFalse(expr, msg=None)
Check that the expression is false.
- assertGreater(a, b, msg=None)
Just like self.assertTrue(a > b), but with a nicer default message.
- assertGreaterEqual(a, b, msg=None)
Just like self.assertTrue(a >= b), but with a nicer default message.
- assertIn(member, container, msg=None)
Just like self.assertTrue(a in b), but with a nicer default message.
- assertIs(expr1, expr2, msg=None)
Just like self.assertTrue(a is b), but with a nicer default message.
- assertIsInstance(obj, cls, msg=None)
Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.
- assertIsNone(obj, msg=None)
Same as self.assertTrue(obj is None), with a nicer default message.
- assertIsNot(expr1, expr2, msg=None)
Just like self.assertTrue(a is not b), but with a nicer default message.
- assertIsNotNone(obj, msg=None)
Included for symmetry with assertIsNone.
- assertLess(a, b, msg=None)
Just like self.assertTrue(a < b), but with a nicer default message.
- assertLessEqual(a, b, msg=None)
Just like self.assertTrue(a <= b), but with a nicer default message.
- assertListEqual(list1, list2, msg=None)
A list-specific equality assertion.
- Parameters:
list1 – The first list to compare.
list2 – The second list to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertLogs(logger=None, level=None)
Fail unless a log message of level level or higher is emitted on logger_name or its children. If omitted, level defaults to INFO and logger defaults to the root logger.
This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.
Example:
with self.assertLogs('foo', level='INFO') as cm:
    logging.getLogger('foo').info('first message')
    logging.getLogger('foo.bar').error('second message')
self.assertEqual(cm.output, ['INFO:foo:first message',
                             'ERROR:foo.bar:second message'])
- assertMultiLineEqual(first, second, msg=None)
Assert that two multi-line strings are equal.
- assertNoLogs(logger=None, level=None)
Fail unless no log messages of level level or higher are emitted on logger_name or its children.
This method must be used as a context manager.
- assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
Objects that are equal automatically fail.
- assertNotEqual(first, second, msg=None)
Fail if the two objects are equal as determined by the ‘!=’ operator.
- assertNotIn(member, container, msg=None)
Just like self.assertTrue(a not in b), but with a nicer default message.
- assertNotIsInstance(obj, cls, msg=None)
Included for symmetry with assertIsInstance.
- assertNotRegex(text, unexpected_regex, msg=None)
Fail the test if the text matches the regular expression.
- assertRaises(expected_exception, *args, **kwargs)
Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.
If called with the callable and arguments omitted, will return a context object used like this:
with self.assertRaises(SomeException):
    do_something()
An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.
The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:
with self.assertRaises(SomeException) as cm:
    do_something()
the_exception = cm.exception
self.assertEqual(the_exception.error_code, 3)
- assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)
Asserts that the message in a raised exception matches a regex.
- Parameters:
expected_exception – Exception class expected to be raised.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.
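A minimal, self-contained sketch of assertRaisesRegex as a context manager, using a standard-library call that is known to raise ValueError:

```python
import unittest

class RaisesRegexDemo(unittest.TestCase):
    def test_type_and_message(self):
        # Passes only if a ValueError is raised AND its message
        # matches the given regex (via re.search).
        with self.assertRaisesRegex(ValueError, r"invalid literal"):
            int("not a number")
```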
- assertRegex(text, expected_regex, msg=None)
Fail the test unless the text matches the regular expression.
- assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)
An equality assertion for ordered sequences (like lists and tuples).
For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.
- Parameters:
seq1 – The first sequence to compare.
seq2 – The second sequence to compare.
seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.
msg – Optional message to use on failure instead of a list of differences.
- assertSetEqual(set1, set2, msg=None)
A set-specific equality assertion.
- Parameters:
set1 – The first set to compare.
set2 – The second set to compare.
msg – Optional message to use on failure instead of a list of differences.
assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).
- assertTrue(expr, msg=None)
Check that the expression is true.
- assertTupleEqual(tuple1, tuple2, msg=None)
A tuple-specific equality assertion.
- Parameters:
tuple1 – The first tuple to compare.
tuple2 – The second tuple to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertWarns(expected_warning, *args, **kwargs)
Fail unless a warning of class warnClass is triggered by the callable when invoked with specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.
If called with the callable and arguments omitted, will return a context object used like this:
with self.assertWarns(SomeWarning):
    do_something()
An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.
The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:
with self.assertWarns(SomeWarning) as cm:
    do_something()
the_warning = cm.warning
self.assertEqual(the_warning.some_attribute, 147)
- assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)
Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.
- Parameters:
expected_warning – Warning class expected to be triggered.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.
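A minimal sketch of assertWarnsRegex (the legacy_api function here is a hypothetical deprecated function, invented only for this illustration):

```python
import unittest
import warnings

def legacy_api():
    # Hypothetical deprecated function used only for this example.
    warnings.warn("legacy_api is deprecated; use new_api instead",
                  DeprecationWarning)

class WarnsRegexDemo(unittest.TestCase):
    def test_deprecation_message(self):
        # Only warnings whose message also matches the regex are
        # considered successful matches.
        with self.assertWarnsRegex(DeprecationWarning, r"use new_api"):
            legacy_api()
```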
- clearGenerated()
Remove the directories that are used for testing.
- countTestCases()
- createLargeMultitaskDataSet(name='QSPRDataset_multi_test', target_props=[{'name': 'HBD', 'task': <TargetTasks.MULTICLASS: 'MULTICLASS'>, 'th': [-1, 1, 2, 100]}, {'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42)
Create a large dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
preparation_settings (dict) – dictionary containing preparation settings
random_state (int) – random state to use for splitting and shuffling
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createLargeTestDataSet(name='QSPRDataset_test_large', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42, n_jobs=1, chunk_size=None)
Create a large dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
preparation_settings (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createSmallTestDataSet(name='QSPRDataset_test_small', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42)
Create a small dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
preparation_settings (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createTestDataSetFromFrame(df, name='QSPRDataset_test', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], random_state=None, prep=None, n_jobs=1, chunk_size=None)
Create a dataset for testing purposes from the given data frame.
- Parameters:
df (pd.DataFrame) – data frame containing the dataset
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
prep (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- debug()
Run the test without collecting errors in a TestResult
- defaultTestResult()
- classmethod doClassCleanups()
Execute all class cleanup functions. Normally called for you after tearDownClass.
- doCleanups()
Execute all cleanup functions. Normally called for you after tearDown.
- classmethod enterClassContext(cm)
Same as enterContext, but class-wide.
- enterContext(cm)
Enters the supplied context manager.
If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.
- fail(msg=None)
Fail immediately, with the given message.
- failureException
alias of AssertionError
- classmethod getAllDescriptors()
Return a list of (ideally) all available descriptor sets. For now they need to be added manually to the list below.
TODO: would be nice to create the list automatically by implementing a descriptor set registry that would hold all installed descriptor sets.
- getBigDF()
Get a large data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- classmethod getDataPrepGrid()
Return a list of many possible combinations of descriptor calculators, splits, feature standardizers, feature filters and data filters. Again, this is not exhaustive, but should cover a lot of cases.
- Returns:
a generator that yields tuples of all possible combinations as stated above, each tuple is defined as: (descriptor_calculator, split, feature_standardizer, feature_filters, data_filters)
- Return type:
grid
- classmethod getDefaultCalculatorCombo()
Makes a list of default descriptor calculators that can be used in tests. It creates a calculator with only morgan fingerprints and rdkit descriptors, but also one with them both to test behaviour with multiple descriptor sets. Override this method if you want to test with other descriptor sets and calculator combinations.
- static getDefaultPrep()
Return a dictionary with default preparation settings.
- classmethod getPrepCombos()
Return a list of all possible preparation combinations as generated by getDataPrepGrid as well as their names. The generated list can be used to parameterize tests with the given named combinations.
- getSmallDF()
Get a small data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- id()
- longMessage = True
- maxDiff = 640
- run(result=None)
- classmethod setUpClass()
Hook method for setting up class fixture before running tests in the class.
- setUpPaths()
Create the directories that are used for testing.
- shortDescription()
Returns a one-line description of the test, or None if no description has been provided.
The default implementation of this method returns the first line of the specified test method’s docstring.
- skipTest(reason)
Skip this test.
- subTest(msg=<object object>, **params)
Return a context manager that will return the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.
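A minimal sketch of subTest in a loop, the usual pattern for checking many parameter values inside one test method:

```python
import unittest

class SubTestDemo(unittest.TestCase):
    def test_parity(self):
        # A failing subtest is reported with its parameters (here n=...)
        # and the loop continues instead of aborting at the first failure.
        for n in (0, 2, 4, 6):
            with self.subTest(n=n):
                self.assertEqual(n % 2, 0)
```

This is closely related to how the parameterized testPrepCombos tests above work: one named case per combination, each reported independently.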
- tearDown()
Remove all files and directories that are used for testing.
- classmethod tearDownClass()
Hook method for deconstructing the class fixture after running all tests in the class.
- validateSearch(dataset: QSPRDataset, result: QSPRDataset, name: str)[source]
Validate the results of a search.
- validate_split(dataset)
Check if the split has the data it should have after splitting.
- class qsprpred.data.tables.tests.TestTargetImputation(methodName='runTest')[source]
Bases:
PathMixIn, QSPRTestCase
Small tests to only check if the target imputation works on its own.
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
- classmethod addClassCleanup(function, /, *args, **kwargs)
Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).
- addCleanup(function, /, *args, **kwargs)
Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.
Cleanup items are called even if setUp fails (unlike tearDown).
- addTypeEqualityFunc(typeobj, function)
Add a type specific assertEqual style function to compare a type.
This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.
- Parameters:
typeobj – The data type to call this function on when both values are of the same type in assertEqual().
function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.
- assertAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
If the two objects compare equal then they will automatically compare almost equal.
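A brief sketch of the `places` and `delta` comparison modes (values chosen purely for illustration):

```python
import unittest


class AlmostEqualDemo(unittest.TestCase):
    def test_places(self):
        # round((0.1 + 0.2) - 0.3, 7) == 0, so this passes despite float error.
        self.assertAlmostEqual(0.1 + 0.2, 0.3)            # default: 7 decimal places
        self.assertAlmostEqual(3.14159, 3.1416, places=3)

    def test_delta(self):
        # delta compares |first - second| directly; it cannot be combined with places.
        self.assertAlmostEqual(100.0, 100.4, delta=0.5)
```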
- assertCountEqual(first, second, msg=None)
Asserts that two iterables have the same elements, the same number of times, without regard to order.
Equivalent to: self.assertEqual(Counter(list(first)), Counter(list(second)))
- Example:
[0, 1, 1] and [1, 0, 1] compare equal.
[0, 0, 1] and [0, 1] compare unequal.
- assertDictEqual(d1, d2, msg=None)
- assertEqual(first, second, msg=None)
Fail if the two objects are unequal as determined by the ‘==’ operator.
- assertFalse(expr, msg=None)
Check that the expression is false.
- assertGreater(a, b, msg=None)
Just like self.assertTrue(a > b), but with a nicer default message.
- assertGreaterEqual(a, b, msg=None)
Just like self.assertTrue(a >= b), but with a nicer default message.
- assertIn(member, container, msg=None)
Just like self.assertTrue(a in b), but with a nicer default message.
- assertIs(expr1, expr2, msg=None)
Just like self.assertTrue(a is b), but with a nicer default message.
- assertIsInstance(obj, cls, msg=None)
Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.
- assertIsNone(obj, msg=None)
Same as self.assertTrue(obj is None), with a nicer default message.
- assertIsNot(expr1, expr2, msg=None)
Just like self.assertTrue(a is not b), but with a nicer default message.
- assertIsNotNone(obj, msg=None)
Included for symmetry with assertIsNone.
- assertLess(a, b, msg=None)
Just like self.assertTrue(a < b), but with a nicer default message.
- assertLessEqual(a, b, msg=None)
Just like self.assertTrue(a <= b), but with a nicer default message.
- assertListEqual(list1, list2, msg=None)
A list-specific equality assertion.
- Parameters:
list1 – The first list to compare.
list2 – The second list to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertLogs(logger=None, level=None)
Fail unless a log message of level level or higher is emitted on logger_name or its children. If omitted, level defaults to INFO and logger defaults to the root logger.
This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.
Example:
with self.assertLogs('foo', level='INFO') as cm:
    logging.getLogger('foo').info('first message')
    logging.getLogger('foo.bar').error('second message')
self.assertEqual(cm.output, ['INFO:foo:first message',
                             'ERROR:foo.bar:second message'])
- assertMultiLineEqual(first, second, msg=None)
Assert that two multi-line strings are equal.
- assertNoLogs(logger=None, level=None)
Fail unless no log messages of level level or higher are emitted on logger_name or its children.
This method must be used as a context manager.
- assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
Objects that are equal automatically fail.
- assertNotEqual(first, second, msg=None)
Fail if the two objects are equal as determined by the ‘!=’ operator.
- assertNotIn(member, container, msg=None)
Just like self.assertTrue(a not in b), but with a nicer default message.
- assertNotIsInstance(obj, cls, msg=None)
Included for symmetry with assertIsInstance.
- assertNotRegex(text, unexpected_regex, msg=None)
Fail the test if the text matches the regular expression.
- assertRaises(expected_exception, *args, **kwargs)
Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.
If called with the callable and arguments omitted, will return a context object used like this:
with self.assertRaises(SomeException):
    do_something()
An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.
The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:
with self.assertRaises(SomeException) as cm:
    do_something()
the_exception = cm.exception
self.assertEqual(the_exception.error_code, 3)
- assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)
Asserts that the message in a raised exception matches a regex.
- Parameters:
expected_exception – Exception class expected to be raised.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.
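A small sketch of both calling conventions (the tested calls are standard-library examples chosen for illustration):

```python
import unittest


class RaisesRegexDemo(unittest.TestCase):
    def test_inline_form(self):
        # Callable form: the function and its arguments follow the regex.
        self.assertRaisesRegex(ValueError, r"invalid literal", int, "not a number")

    def test_context_manager_form(self):
        # The regex is matched against str(exception) with re.search.
        with self.assertRaisesRegex(KeyError, r"missing"):
            {}["missing"]
```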
- assertRegex(text, expected_regex, msg=None)
Fail the test unless the text matches the regular expression.
- assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)
An equality assertion for ordered sequences (like lists and tuples).
For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.
- Parameters:
seq1 – The first sequence to compare.
seq2 – The second sequence to compare.
seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.
msg – Optional message to use on failure instead of a list of differences.
- assertSetEqual(set1, set2, msg=None)
A set-specific equality assertion.
- Parameters:
set1 – The first set to compare.
set2 – The second set to compare.
msg – Optional message to use on failure instead of a list of differences.
assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).
- assertTrue(expr, msg=None)
Check that the expression is true.
- assertTupleEqual(tuple1, tuple2, msg=None)
A tuple-specific equality assertion.
- Parameters:
tuple1 – The first tuple to compare.
tuple2 – The second tuple to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertWarns(expected_warning, *args, **kwargs)
Fail unless a warning of class warnClass is triggered by the callable when invoked with specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.
If called with the callable and arguments omitted, will return a context object used like this:
with self.assertWarns(SomeWarning):
    do_something()
An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.
The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:
with self.assertWarns(SomeWarning) as cm:
    do_something()
the_warning = cm.warning
self.assertEqual(the_warning.some_attribute, 147)
- assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)
Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.
- Parameters:
expected_warning – Warning class expected to be triggered.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.
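A short sketch of filtering warnings by message (the `legacy_api` function is made up for illustration):

```python
import unittest
import warnings


def legacy_api():
    """Hypothetical deprecated function that emits a warning."""
    warnings.warn("legacy_api is deprecated, use new_api", DeprecationWarning)


class WarnsRegexDemo(unittest.TestCase):
    def test_deprecation_message(self):
        # Passes only if a DeprecationWarning whose message matches the
        # regex is triggered inside the with-block.
        with self.assertWarnsRegex(DeprecationWarning, r"use new_api"):
            legacy_api()
```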
- clearGenerated()
Remove the directories that are used for testing.
- countTestCases()
- debug()
Run the test without collecting errors in a TestResult
- defaultTestResult()
- classmethod doClassCleanups()
Execute all class cleanup functions. Normally called for you after tearDownClass.
- doCleanups()
Execute all cleanup functions. Normally called for you after tearDown.
- classmethod enterClassContext(cm)
Same as enterContext, but class-wide.
- enterContext(cm)
Enters the supplied context manager.
If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.
- fail(msg=None)
Fail immediately, with the given message.
- failureException
alias of AssertionError
- id()
- longMessage = True
- maxDiff = 640
- run(result=None)
- classmethod setUpClass()
Hook method for setting up class fixture before running tests in the class.
- setUpPaths()
Create the directories that are used for testing.
- shortDescription()
Returns a one-line description of the test, or None if no description has been provided.
The default implementation of this method returns the first line of the specified test method’s docstring.
- skipTest(reason)
Skip this test.
- subTest(msg=<object object>, **params)
Return a context manager that will return the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.
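A minimal sketch of parameterizing one test with subTest (the loop values are arbitrary illustrations):

```python
import unittest


class SubTestDemo(unittest.TestCase):
    def test_even_numbers(self):
        # A failure in any subtest is reported with its n=... parameters,
        # and the loop continues instead of aborting the whole test.
        for n in (0, 2, 4, 6):
            with self.subTest(n=n):
                self.assertEqual(n % 2, 0)
```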
- tearDown()
Remove all files and directories that are used for testing.
- classmethod tearDownClass()
Hook method for deconstructing the class fixture after running all tests in the class.
- class qsprpred.data.tables.tests.TestTargetProperty(methodName='runTest')[source]
Bases: QSPRTestCase
Test the TargetProperty class.
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
- classmethod addClassCleanup(function, /, *args, **kwargs)
Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).
- addCleanup(function, /, *args, **kwargs)
Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.
Cleanup items are called even if setUp fails (unlike tearDown).
- addTypeEqualityFunc(typeobj, function)
Add a type specific assertEqual style function to compare a type.
This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.
- Parameters:
typeobj – The data type to call this function on when both values are of the same type in assertEqual().
function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.
- assertAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
If the two objects compare equal then they will automatically compare almost equal.
- assertCountEqual(first, second, msg=None)
Asserts that two iterables have the same elements, the same number of times, without regard to order.
Equivalent to: self.assertEqual(Counter(list(first)), Counter(list(second)))
- Example:
[0, 1, 1] and [1, 0, 1] compare equal.
[0, 0, 1] and [0, 1] compare unequal.
- assertDictEqual(d1, d2, msg=None)
- assertEqual(first, second, msg=None)
Fail if the two objects are unequal as determined by the ‘==’ operator.
- assertFalse(expr, msg=None)
Check that the expression is false.
- assertGreater(a, b, msg=None)
Just like self.assertTrue(a > b), but with a nicer default message.
- assertGreaterEqual(a, b, msg=None)
Just like self.assertTrue(a >= b), but with a nicer default message.
- assertIn(member, container, msg=None)
Just like self.assertTrue(a in b), but with a nicer default message.
- assertIs(expr1, expr2, msg=None)
Just like self.assertTrue(a is b), but with a nicer default message.
- assertIsInstance(obj, cls, msg=None)
Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.
- assertIsNone(obj, msg=None)
Same as self.assertTrue(obj is None), with a nicer default message.
- assertIsNot(expr1, expr2, msg=None)
Just like self.assertTrue(a is not b), but with a nicer default message.
- assertIsNotNone(obj, msg=None)
Included for symmetry with assertIsNone.
- assertLess(a, b, msg=None)
Just like self.assertTrue(a < b), but with a nicer default message.
- assertLessEqual(a, b, msg=None)
Just like self.assertTrue(a <= b), but with a nicer default message.
- assertListEqual(list1, list2, msg=None)
A list-specific equality assertion.
- Parameters:
list1 – The first list to compare.
list2 – The second list to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertLogs(logger=None, level=None)
Fail unless a log message of level level or higher is emitted on logger_name or its children. If omitted, level defaults to INFO and logger defaults to the root logger.
This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.
Example:
with self.assertLogs('foo', level='INFO') as cm:
    logging.getLogger('foo').info('first message')
    logging.getLogger('foo.bar').error('second message')
self.assertEqual(cm.output, ['INFO:foo:first message',
                             'ERROR:foo.bar:second message'])
- assertMultiLineEqual(first, second, msg=None)
Assert that two multi-line strings are equal.
- assertNoLogs(logger=None, level=None)
Fail unless no log messages of level level or higher are emitted on logger_name or its children.
This method must be used as a context manager.
- assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
Objects that are equal automatically fail.
- assertNotEqual(first, second, msg=None)
Fail if the two objects are equal as determined by the ‘!=’ operator.
- assertNotIn(member, container, msg=None)
Just like self.assertTrue(a not in b), but with a nicer default message.
- assertNotIsInstance(obj, cls, msg=None)
Included for symmetry with assertIsInstance.
- assertNotRegex(text, unexpected_regex, msg=None)
Fail the test if the text matches the regular expression.
- assertRaises(expected_exception, *args, **kwargs)
Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.
If called with the callable and arguments omitted, will return a context object used like this:
with self.assertRaises(SomeException):
    do_something()
An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.
The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:
with self.assertRaises(SomeException) as cm:
    do_something()
the_exception = cm.exception
self.assertEqual(the_exception.error_code, 3)
- assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)
Asserts that the message in a raised exception matches a regex.
- Parameters:
expected_exception – Exception class expected to be raised.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.
- assertRegex(text, expected_regex, msg=None)
Fail the test unless the text matches the regular expression.
- assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)
An equality assertion for ordered sequences (like lists and tuples).
For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.
- Parameters:
seq1 – The first sequence to compare.
seq2 – The second sequence to compare.
seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.
msg – Optional message to use on failure instead of a list of differences.
- assertSetEqual(set1, set2, msg=None)
A set-specific equality assertion.
- Parameters:
set1 – The first set to compare.
set2 – The second set to compare.
msg – Optional message to use on failure instead of a list of differences.
assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).
- assertTrue(expr, msg=None)
Check that the expression is true.
- assertTupleEqual(tuple1, tuple2, msg=None)
A tuple-specific equality assertion.
- Parameters:
tuple1 – The first tuple to compare.
tuple2 – The second tuple to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertWarns(expected_warning, *args, **kwargs)
Fail unless a warning of class warnClass is triggered by the callable when invoked with specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.
If called with the callable and arguments omitted, will return a context object used like this:
with self.assertWarns(SomeWarning):
    do_something()
An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.
The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:
with self.assertWarns(SomeWarning) as cm:
    do_something()
the_warning = cm.warning
self.assertEqual(the_warning.some_attribute, 147)
- assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)
Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.
- Parameters:
expected_warning – Warning class expected to be triggered.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.
- countTestCases()
- debug()
Run the test without collecting errors in a TestResult
- defaultTestResult()
- classmethod doClassCleanups()
Execute all class cleanup functions. Normally called for you after tearDownClass.
- doCleanups()
Execute all cleanup functions. Normally called for you after tearDown.
- classmethod enterClassContext(cm)
Same as enterContext, but class-wide.
- enterContext(cm)
Enters the supplied context manager.
If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.
- fail(msg=None)
Fail immediately, with the given message.
- failureException
alias of AssertionError
- id()
- longMessage = True
- maxDiff = 640
- run(result=None)
- setUp()
Hook method for setting up the test fixture before exercising it.
- classmethod setUpClass()
Hook method for setting up class fixture before running tests in the class.
- shortDescription()
Returns a one-line description of the test, or None if no description has been provided.
The default implementation of this method returns the first line of the specified test method’s docstring.
- skipTest(reason)
Skip this test.
- subTest(msg=<object object>, **params)
Return a context manager that will return the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.
- tearDown()
Hook method for deconstructing the test fixture after testing it.
- classmethod tearDownClass()
Hook method for deconstructing the class fixture after running all tests in the class.
- testSerialization = None
- testSerialization_0(**kw)
- testSerialization_1(**kw)
- class qsprpred.data.tables.tests.TestTargetTransformation(methodName='runTest')[source]
Bases: DataSetsPathMixIn, QSPRTestCase
Tests the transformation of target properties.
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
- classmethod addClassCleanup(function, /, *args, **kwargs)
Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).
- addCleanup(function, /, *args, **kwargs)
Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.
Cleanup items are called even if setUp fails (unlike tearDown).
- addTypeEqualityFunc(typeobj, function)
Add a type specific assertEqual style function to compare a type.
This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.
- Parameters:
typeobj – The data type to call this function on when both values are of the same type in assertEqual().
function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.
- assertAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
If the two objects compare equal then they will automatically compare almost equal.
- assertCountEqual(first, second, msg=None)
Asserts that two iterables have the same elements, the same number of times, without regard to order.
Equivalent to: self.assertEqual(Counter(list(first)), Counter(list(second)))
- Example:
[0, 1, 1] and [1, 0, 1] compare equal.
[0, 0, 1] and [0, 1] compare unequal.
- assertDictEqual(d1, d2, msg=None)
- assertEqual(first, second, msg=None)
Fail if the two objects are unequal as determined by the ‘==’ operator.
- assertFalse(expr, msg=None)
Check that the expression is false.
- assertGreater(a, b, msg=None)
Just like self.assertTrue(a > b), but with a nicer default message.
- assertGreaterEqual(a, b, msg=None)
Just like self.assertTrue(a >= b), but with a nicer default message.
- assertIn(member, container, msg=None)
Just like self.assertTrue(a in b), but with a nicer default message.
- assertIs(expr1, expr2, msg=None)
Just like self.assertTrue(a is b), but with a nicer default message.
- assertIsInstance(obj, cls, msg=None)
Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.
- assertIsNone(obj, msg=None)
Same as self.assertTrue(obj is None), with a nicer default message.
- assertIsNot(expr1, expr2, msg=None)
Just like self.assertTrue(a is not b), but with a nicer default message.
- assertIsNotNone(obj, msg=None)
Included for symmetry with assertIsNone.
- assertLess(a, b, msg=None)
Just like self.assertTrue(a < b), but with a nicer default message.
- assertLessEqual(a, b, msg=None)
Just like self.assertTrue(a <= b), but with a nicer default message.
- assertListEqual(list1, list2, msg=None)
A list-specific equality assertion.
- Parameters:
list1 – The first list to compare.
list2 – The second list to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertLogs(logger=None, level=None)
Fail unless a log message of level level or higher is emitted on logger_name or its children. If omitted, level defaults to INFO and logger defaults to the root logger.
This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.
Example:
with self.assertLogs('foo', level='INFO') as cm:
    logging.getLogger('foo').info('first message')
    logging.getLogger('foo.bar').error('second message')
self.assertEqual(cm.output, ['INFO:foo:first message',
                             'ERROR:foo.bar:second message'])
- assertMultiLineEqual(first, second, msg=None)
Assert that two multi-line strings are equal.
- assertNoLogs(logger=None, level=None)
Fail unless no log messages of level level or higher are emitted on logger_name or its children.
This method must be used as a context manager.
- assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
Objects that are equal automatically fail.
- assertNotEqual(first, second, msg=None)
Fail if the two objects are equal as determined by the ‘!=’ operator.
- assertNotIn(member, container, msg=None)
Just like self.assertTrue(a not in b), but with a nicer default message.
- assertNotIsInstance(obj, cls, msg=None)
Included for symmetry with assertIsInstance.
- assertNotRegex(text, unexpected_regex, msg=None)
Fail the test if the text matches the regular expression.
- assertRaises(expected_exception, *args, **kwargs)
Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.
If called with the callable and arguments omitted, will return a context object used like this:
with self.assertRaises(SomeException):
    do_something()
An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.
The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:
with self.assertRaises(SomeException) as cm:
    do_something()
the_exception = cm.exception
self.assertEqual(the_exception.error_code, 3)
- assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)
Asserts that the message in a raised exception matches a regex.
- Parameters:
expected_exception – Exception class expected to be raised.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.
- assertRegex(text, expected_regex, msg=None)
Fail the test unless the text matches the regular expression.
- assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)
An equality assertion for ordered sequences (like lists and tuples).
For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.
- Parameters:
seq1 – The first sequence to compare.
seq2 – The second sequence to compare.
seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.
msg – Optional message to use on failure instead of a list of differences.
- assertSetEqual(set1, set2, msg=None)
A set-specific equality assertion.
- Parameters:
set1 – The first set to compare.
set2 – The second set to compare.
msg – Optional message to use on failure instead of a list of differences.
assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).
- assertTrue(expr, msg=None)
Check that the expression is true.
- assertTupleEqual(tuple1, tuple2, msg=None)
A tuple-specific equality assertion.
- Parameters:
tuple1 – The first tuple to compare.
tuple2 – The second tuple to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertWarns(expected_warning, *args, **kwargs)
Fail unless a warning of class warnClass is triggered by the callable when invoked with specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.
If called with the callable and arguments omitted, will return a context object used like this:
with self.assertWarns(SomeWarning):
    do_something()
An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.
The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:
with self.assertWarns(SomeWarning) as cm:
    do_something()
the_warning = cm.warning
self.assertEqual(the_warning.some_attribute, 147)
- assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)
Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.
- Parameters:
expected_warning – Warning class expected to be triggered.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.
- clearGenerated()
Remove the directories that are used for testing.
- countTestCases()
- createLargeMultitaskDataSet(name='QSPRDataset_multi_test', target_props=[{'name': 'HBD', 'task': <TargetTasks.MULTICLASS: 'MULTICLASS'>, 'th': [-1, 1, 2, 100]}, {'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42)
Create a large dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
preparation_settings (dict) – dictionary containing preparation settings
random_state (int) – random state to use for splitting and shuffling
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createLargeTestDataSet(name='QSPRDataset_test_large', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42, n_jobs=1, chunk_size=None)
Create a large dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
preparation_settings (dict) – dictionary containing preparation settings
n_jobs (int) – number of parallel jobs to use
chunk_size (int) – size of chunks to process per job
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createSmallTestDataSet(name='QSPRDataset_test_small', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42)
Create a small dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
preparation_settings (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createTestDataSetFromFrame(df, name='QSPRDataset_test', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], random_state=None, prep=None, n_jobs=1, chunk_size=None)
Create a dataset for testing purposes from the given data frame.
- Parameters:
df (pd.DataFrame) – data frame containing the dataset
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
prep (dict) – dictionary containing preparation settings
n_jobs (int) – number of parallel jobs to use
chunk_size (int) – size of chunks to process per job
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- debug()
Run the test without collecting errors in a TestResult
- defaultTestResult()
- classmethod doClassCleanups()
Execute all class cleanup functions. Normally called for you after tearDownClass.
- doCleanups()
Execute all cleanup functions. Normally called for you after tearDown.
- classmethod enterClassContext(cm)
Same as enterContext, but class-wide.
- enterContext(cm)
Enters the supplied context manager.
If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.
- fail(msg=None)
Fail immediately, with the given message.
- failureException
alias of
AssertionError
- classmethod getAllDescriptors()
Return a list of (ideally) all available descriptor sets. For now they need to be added manually to the list below.
TODO: would be nice to create the list automatically by implementing a descriptor set registry that would hold all installed descriptor sets.
- getBigDF()
Get a large data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- classmethod getDataPrepGrid()
Return a list of many possible combinations of descriptor calculators, splits, feature standardizers, feature filters and data filters. Again, this is not exhaustive, but should cover a lot of cases.
- Returns:
a generator that yields tuples of all possible combinations as stated above, each tuple is defined as: (descriptor_calculator, split, feature_standardizer, feature_filters, data_filters)
- Return type:
grid
- classmethod getDefaultCalculatorCombo()
Make a list of default descriptor calculators that can be used in tests. It creates a calculator with only Morgan fingerprints and one with RDKit descriptors, but also one with both, to test behaviour with multiple descriptor sets. Override this method if you want to test with other descriptor sets and calculator combinations.
- static getDefaultPrep()
Return a dictionary with default preparation settings.
- classmethod getPrepCombos()
Return a list of all possible preparation combinations, as generated by getDataPrepGrid, as well as their names. The generated list can be used to parameterize tests with the given named combinations.
- getSmallDF()
Get a small data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- id()
- longMessage = True
- maxDiff = 640
- run(result=None)
- classmethod setUpClass()
Hook method for setting up class fixture before running tests in the class.
- setUpPaths()
Create the directories that are used for testing.
- shortDescription()
Returns a one-line description of the test, or None if no description has been provided.
The default implementation of this method returns the first line of the specified test method’s docstring.
- skipTest(reason)
Skip this test.
- subTest(msg=<object object>, **params)
Return a context manager that will return the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.
- tearDown()
Remove all files and directories that are used for testing.
- classmethod tearDownClass()
Hook method for deconstructing the class fixture after running all tests in the class.
- validate_split(dataset)
Check if the split has the data it should have after splitting.