qsprpred.data.processing package
Submodules
qsprpred.data.processing.applicability_domain module
- class qsprpred.data.processing.applicability_domain.ApplicabilityDomain(threshold: float | None = None, direction: str | None = None)[source]
Bases:
JSONSerializable, ABC
Define the applicability domain for a dataset.
A class to define the applicability domain for a dataset. A fitted applicability domain can be used to filter out molecules that are not in the applicability domain, or just to check whether a molecule is in it.
Initialize the applicability domain with a threshold.
- Parameters:
- contains(X: DataFrame) DataFrame[source]
Check if the applicability domain contains the features.
- Parameters:
X (pd.DataFrame) – array of features to check
- Returns:
pd.Series of booleans indicating if the features are in the applicability domain
- Return type:
pd.Series
- property direction: str
Return the direction of the threshold.
The direction should be one of ‘>’, ‘<’, ‘>=’ or ‘<=’.
- abstract fit(X: DataFrame) None[source]
Fit the applicability domain model.
- Parameters:
X (pd.DataFrame) – array of features to fit model on
- toFile(filename: str) str
Serialize object to a JSON file. This JSON file should contain all data necessary to reconstruct the object.
- toJSON() str
Serialize object to a JSON string. This JSON string should contain all data necessary to reconstruct the object.
- Returns:
JSON string of the object
- Return type:
json (str)
- abstract transform(X: DataFrame) DataFrame[source]
Transform the features to a score for the applicability domain.
The result could be a boolean array indicating if the features are in the applicability domain or a continuous score indicating a measure of applicability (e.g., a probability or a distance).
- Parameters:
X (pd.DataFrame) – array of features
- Returns:
scores for the applicability domain
- Return type:
pd.Series
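The relationship between `transform`, the threshold and the direction is not spelled out above. One plausible reading is that the scores returned by `transform` are compared against the threshold with the operator named by `direction`. The sketch below illustrates that reading in plain Python; `contains_from_scores` is a hypothetical helper and not part of qsprpred, and plain lists stand in for the pandas objects.

```python
import operator

# Map each documented direction string to its comparison function.
COMPARATORS = {
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
}


def contains_from_scores(scores, threshold, direction):
    """Return one boolean per score: True if the score passes the threshold."""
    if direction not in COMPARATORS:
        raise ValueError(f"direction must be one of {sorted(COMPARATORS)}")
    compare = COMPARATORS[direction]
    return [compare(score, threshold) for score in scores]


# Hypothetical distance-like scores: smaller means closer to the training data.
scores = [0.2, 0.8, 0.5]
in_domain = contains_from_scores(scores, threshold=0.5, direction="<=")
print(in_domain)  # [True, False, True]
```

With a distance-like score, ‘<’ or ‘<=’ would be the natural direction; with a probability-like score, ‘>’ or ‘>=’.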
- class qsprpred.data.processing.applicability_domain.KNNApplicabilityDomain(k: int = 5, alpha: float | None = None, hard_threshold: float | None = None, scaling: str | None = 'robust', dist: str = 'euclidean', scaler_kwargs=None, n_jobs: int = 1, astype: str | None = 'float64')[source]
Bases:
ApplicabilityDomain
Applicability domain defined using K-nearest neighbours.
This class is adapted from the KNNApplicabilityDomain class in the mlchemad package.
Create the k-Nearest Neighbor applicability domain.
- Parameters:
k (int) – number of nearest neighbors
alpha (float) – ratio of inlier samples calculated from the training set; ignored if hard_threshold is set
hard_threshold (float) – samples with a distance greater than or equal to this threshold will be considered outliers
scaling (str) – scaling method; must be one of ‘robust’, ‘minmax’, ‘maxabs’, ‘standard’ or None (default: ‘robust’)
dist (str) – kNN distance to be calculated (default: euclidean); one of {list(dist_fns.keys())}; jaccard is recommended for binary fingerprints.
scaler_kwargs (dict) – additional parameters to supply to the scaler
n_jobs (int) – number of parallel processes used to fit the kNN model
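To make the role of `k` and `hard_threshold` concrete, here is a minimal sketch in plain Python. It assumes the score of a sample is the mean Euclidean distance to its k nearest training points, which is an assumption for illustration and not confirmed by this page. Scaling and the `alpha` option are left out, and `knn_scores` and `in_domain` are hypothetical names.

```python
import math


def knn_scores(train, queries, k):
    """Mean Euclidean distance from each query to its k nearest training points."""
    scores = []
    for q in queries:
        distances = sorted(math.dist(q, t) for t in train)
        scores.append(sum(distances[:k]) / k)
    return scores


def in_domain(train, queries, k, hard_threshold):
    """A query counts as an outlier when its score is >= hard_threshold."""
    return [score < hard_threshold for score in knn_scores(train, queries, k)]


train = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
queries = [(0.5, 0.5), (5.0, 5.0)]
print(in_domain(train, queries, k=2, hard_threshold=1.0))  # [True, False]
```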
- contains(X: DataFrame) DataFrame
Check if the applicability domain contains the features.
- Parameters:
X (pd.DataFrame) – array of features to check
- Returns:
pd.Series of booleans indicating if the features are in the applicability domain
- Return type:
pd.Series
- property direction: str
Return the direction of the threshold.
The direction should be one of ‘>’, ‘<’, ‘>=’ or ‘<=’.
- fit(X)[source]
Fit the applicability domain to the given feature matrix
- Parameters:
X – feature matrix
- toFile(filename: str) str
Serialize object to a JSON file. This JSON file should contain all data necessary to reconstruct the object.
- class qsprpred.data.processing.applicability_domain.MLChemAD(applicability_domain: ApplicabilityDomain, astype: str | None = 'float64')[source]
Bases:
ApplicabilityDomain
Define the applicability domain for a dataset using the MLChemAD package.
This class uses the MLChemAD package to filter out molecules that are not in the applicability domain. The MLChemAD package is available at https://github.com/OlivierBeq/MLChemAD
- Variables:
applicabilityDomain (MLChemApplicabilityDomain) – applicability domain object
fitted (bool) – whether the applicability domain is fitted or not
Initialize the MLChemADFilter with the domain_type attribute.
- Parameters:
- contains(X: DataFrame) DataFrame
Check if the applicability domain contains the features.
- Parameters:
X (pd.DataFrame) – array of features to check
- Returns:
pd.Series of booleans indicating if the features are in the applicability domain
- Return type:
pd.Series
- property direction: str
Return the direction of the threshold.
The direction should be one of ‘>’, ‘<’, ‘>=’ or ‘<=’.
- fit(X: DataFrame) None[source]
Fit the applicability domain model.
- Parameters:
X (pd.DataFrame) – array of features to fit model on
- toFile(filename: str) str
Serialize object to a JSON file. This JSON file should contain all data necessary to reconstruct the object.
qsprpred.data.processing.data_filters module
Filters for QSPR Datasets.
To add a new filter:
- Add a DataFilter subclass for your new filter
- class qsprpred.data.processing.data_filters.CategoryFilter(prop: str, values: list[str], data_set: QSPRDataSet | None = None, keep: bool = False)[source]
Bases:
DataFilter
Filter out values from a column.
- Variables:
Initialize the CategoryFilter with the name, values and keep attributes.
- Parameters:
prop (str) – column based on which to filter.
values (list) – list of values to filter from props.
data_set (QSPRDataSet) – dataset to filter.
keep (bool, optional) – whether to keep or discard the values. Defaults to False.
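The effect of `keep` can be illustrated with a small sketch in plain Python, where rows are dictionaries instead of DataFrame rows. `category_filter` is a hypothetical stand-in, not the library's implementation, and the column names are made up.

```python
def category_filter(rows, prop, values, keep=False):
    """Drop rows whose `prop` is in `values`, or keep only those rows if keep=True."""
    wanted = set(values)
    if keep:
        return [row for row in rows if row[prop] in wanted]
    return [row for row in rows if row[prop] not in wanted]


rows = [
    {"SMILES": "CCO", "source": "patent"},
    {"SMILES": "CCN", "source": "journal"},
    {"SMILES": "CCC", "source": "vendor"},
]
dropped = category_filter(rows, "source", ["patent"])
kept = category_filter(rows, "source", ["patent"], keep=True)
print([r["SMILES"] for r in dropped])  # ['CCN', 'CCC']
print([r["SMILES"] for r in kept])     # ['CCO']
```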
- fit(X: DataFrame, y: None | DataFrame = None)
Fit the step to the dataset
If the step requires fitting to the data, this method should be implemented.
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- fitTransform(X: DataFrame, y: None | DataFrame = None) tuple[DataFrame, DataFrame | None]
Fit the step to the dataset and apply it
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- Returns:
transformed data (pd.DataFrame) and (transformed) target data (pd.DataFrame | None)
- Return type:
tuple[pd.DataFrame, pd.DataFrame | None]
- property fitted: bool
Check if the step is fitted
- Returns:
True if the step is fitted, False otherwise
- Return type:
bool
- getDataSet() QSPRDataSet
Get the data set attached to this object.
- Returns:
The data set attached to this object
- Return type:
QSPRDataSet
- Raises:
ValueError – If no data set is attached to this object.
- setDataSet(dataset: QSPRDataSet | None) None
Set the data set for this object.
- toFile(filename: str) str
Serialize object to a JSON file. This JSON file should contain all data necessary to reconstruct the object.
- toJSON() str
Serialize object to a JSON string. This JSON string should contain all data necessary to reconstruct the object.
- Returns:
JSON string of the object
- Return type:
json (str)
- transform(X: DataFrame, y: DataFrame | None = None) tuple[DataFrame, DataFrame | None][source]
Filter rows from dataframe.
- Parameters:
X (pd.DataFrame) – dataframe to filter.
y (pd.DataFrame, optional) – output dataframe if the filtering method requires it
- Returns:
filtered dataframe (pd.DataFrame) and target dataframe (pd.DataFrame | None)
- Return type:
tuple[pd.DataFrame, pd.DataFrame | None]
- class qsprpred.data.processing.data_filters.DataFilter(**kwargs)[source]
Bases:
Step, DataSetDependent
Filter out some rows from a dataframe.
Initialize the step
- fit(X: DataFrame, y: None | DataFrame = None)
Fit the step to the dataset
If the step requires fitting to the data, this method should be implemented.
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- fitTransform(X: DataFrame, y: None | DataFrame = None) tuple[DataFrame, DataFrame | None]
Fit the step to the dataset and apply it
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- Returns:
transformed data (pd.DataFrame) and (transformed) target data (pd.DataFrame | None)
- Return type:
tuple[pd.DataFrame, pd.DataFrame | None]
- property fitted: bool
Check if the step is fitted
- Returns:
True if the step is fitted, False otherwise
- Return type:
bool
- getDataSet() QSPRDataSet
Get the data set attached to this object.
- Returns:
The data set attached to this object
- Return type:
QSPRDataSet
- Raises:
ValueError – If no data set is attached to this object.
- setDataSet(dataset: QSPRDataSet | None) None
Set the data set for this object.
- toFile(filename: str) str
Serialize object to a JSON file. This JSON file should contain all data necessary to reconstruct the object.
- class qsprpred.data.processing.data_filters.NaNFilter(features: list[str] | None = None, keep: bool = False)[source]
Bases:
DataFilter
Step that removes rows containing NaN values in a specified column
Initialize the step with the columns to check for NaN values
If no columns are specified, all columns are checked for NaN values.
- Parameters:
- fit(X: DataFrame, y: None | DataFrame = None)
Fit the step to the dataset
If the step requires fitting to the data, this method should be implemented.
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- fitTransform(X: DataFrame, y: None | DataFrame = None) tuple[DataFrame, DataFrame | None]
Fit the step to the dataset and apply it
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- Returns:
transformed data (pd.DataFrame) and (transformed) target data (pd.DataFrame | None)
- Return type:
tuple[pd.DataFrame, pd.DataFrame | None]
- property fitted: bool
Check if the step is fitted
- Returns:
True if the step is fitted, False otherwise
- Return type:
bool
- getDataSet() QSPRDataSet
Get the data set attached to this object.
- Returns:
The data set attached to this object
- Return type:
QSPRDataSet
- Raises:
ValueError – If no data set is attached to this object.
- setDataSet(dataset: QSPRDataSet | None) None
Set the data set for this object.
- toFile(filename: str) str
Serialize object to a JSON file. This JSON file should contain all data necessary to reconstruct the object.
- class qsprpred.data.processing.data_filters.OutlierFilter(ad: ApplicabilityDomain)[source]
Bases:
DataFilter
Remove outliers based on an applicability domain
Initialize the OutlierFilter with an applicability domain from MLChemAD.
- Parameters:
ad (MLChemAD | MLChemADApplicabilityDomain) – The applicability domain to use.
- fit(X: DataFrame, y: None | DataFrame = None)[source]
Fit the applicability domain to the data.
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame, optional) – training targets
- fitTransform(X: DataFrame, y: None | DataFrame = None) tuple[DataFrame, DataFrame | None]
Fit the step to the dataset and apply it
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- Returns:
transformed data (pd.DataFrame) and (transformed) target data (pd.DataFrame | None)
- Return type:
tuple[pd.DataFrame, pd.DataFrame | None]
- property fitted: bool
Check if the step is fitted
- Returns:
True if the step is fitted, False otherwise
- Return type:
bool
- getDataSet() QSPRDataSet
Get the data set attached to this object.
- Returns:
The data set attached to this object
- Return type:
QSPRDataSet
- Raises:
ValueError – If no data set is attached to this object.
- setDataSet(dataset: QSPRDataSet | None) None
Set the data set for this object.
- toFile(filename: str) str
Serialize object to a JSON file. This JSON file should contain all data necessary to reconstruct the object.
- toJSON() str
Serialize object to a JSON string. This JSON string should contain all data necessary to reconstruct the object.
- Returns:
JSON string of the object
- Return type:
json (str)
- transform(X: DataFrame, y: DataFrame | None = None) tuple[DataFrame, DataFrame | None][source]
Remove samples outside the applicability domain.
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame, optional) – training targets
- Returns:
filtered training data and targets
- Return type:
tuple[pd.DataFrame, pd.DataFrame]
- class qsprpred.data.processing.data_filters.RepeatsFilter(keep: str | bool = False, timecol: str | None = None, additional_cols: list[str] | None = None, data_set: QSPRDataSet | None = None)[source]
Bases:
DataFilter
Filter out duplicate molecules based on descriptor values.
- Variables:
keep (str) – Determines how duplicate entries are treated: if False, remove all duplicate entries; if True, keep them all; if ‘first’, keep the row of the first entry (based on time); if ‘last’, keep the row of the last entry (based on time). Options: ‘first’, ‘last’, True, False
timeCol (str, optional) – name of column containing time of publication used if keep is ‘first’ or ‘last’
additionalCols (list[str], optional) – additional columns to use for determining duplicates (e.g. proteinid, in case of PCM modelling), so that compounds with same X but different proteinid are not removed.
Initialize the RepeatsFilter with the keep, timecol and additional_cols attributes.
- Parameters:
keep (str|bool, optional) – Determines how duplicate entries are treated: if False, remove all duplicate entries; if True, keep them all; if ‘first’, keep the row of the first entry (based on time); if ‘last’, keep the row of the last entry (based on time). Defaults to False.
timecol (str, optional) – name of column containing time of publication used if keep is ‘first’ or ‘last’. Defaults to None.
additional_cols (list[str], optional) – additional columns to use for determining duplicates (e.g. proteinid, in case of PCM modelling), so that compounds with same X but different proteinid are not removed. Defaults to None.
data_set (QSPRDataSet, optional) – dataset to filter. Defaults to None.
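The four `keep` options and the role of `additional_cols` can be sketched as follows. This is a plain-Python illustration under the assumption that duplicates are rows with identical descriptor values (plus any additional columns); `repeats_filter`, `ids` and the column names are hypothetical, and the library's own logic may differ in detail.

```python
from collections import defaultdict


def repeats_filter(rows, feature_cols, keep=False, timecol=None, additional_cols=()):
    """Handle rows that share the same descriptor values (plus additional columns)."""
    if keep not in (True, False, "first", "last"):
        raise ValueError("keep must be 'first', 'last', True or False")
    if keep is True:
        return list(rows)  # keep every duplicate
    if keep in ("first", "last") and timecol is None:
        raise ValueError("timecol is required when keep is 'first' or 'last'")
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[c] for c in (*feature_cols, *additional_cols))
        groups[key].append(row)
    result = []
    for group in groups.values():
        if len(group) == 1:
            result.append(group[0])
        elif keep in ("first", "last"):
            ordered = sorted(group, key=lambda r: r[timecol])
            result.append(ordered[0] if keep == "first" else ordered[-1])
        # keep is False: every member of a duplicate group is dropped
    return result


def ids(rows):
    return [row["id"] for row in rows]


rows = [
    {"id": "a", "x": 1.0, "year": 2010, "protein": "P1"},
    {"id": "b", "x": 1.0, "year": 2005, "protein": "P1"},
    {"id": "c", "x": 1.0, "year": 2012, "protein": "P2"},
    {"id": "d", "x": 2.0, "year": 2011, "protein": "P1"},
]
print(ids(repeats_filter(rows, ["x"])))                                 # ['d']
print(ids(repeats_filter(rows, ["x"], keep="first", timecol="year")))   # ['b', 'd']
print(ids(repeats_filter(rows, ["x"], additional_cols=["protein"])))    # ['c', 'd']
```

The last call shows why `additional_cols` matters for PCM modelling: compound `c` has the same descriptors as `a` and `b` but a different protein, so it is not treated as a duplicate.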
- fit(X: DataFrame, y: None | DataFrame = None)
Fit the step to the dataset
If the step requires fitting to the data, this method should be implemented.
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- fitTransform(X: DataFrame, y: None | DataFrame = None) tuple[DataFrame, DataFrame | None]
Fit the step to the dataset and apply it
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- Returns:
transformed data (pd.DataFrame) and (transformed) target data (pd.DataFrame | None)
- Return type:
tuple[pd.DataFrame, pd.DataFrame | None]
- property fitted: bool
Check if the step is fitted
- Returns:
True if the step is fitted, False otherwise
- Return type:
bool
- getDataSet() QSPRDataSet
Get the data set attached to this object.
- Returns:
The data set attached to this object
- Return type:
QSPRDataSet
- Raises:
ValueError – If no data set is attached to this object.
- setDataSet(dataset: QSPRDataSet | None) None
Set the data set for this object.
- toFile(filename: str) str
Serialize object to a JSON file. This JSON file should contain all data necessary to reconstruct the object.
- toJSON() str
Serialize object to a JSON string. This JSON string should contain all data necessary to reconstruct the object.
- Returns:
JSON string of the object
- Return type:
json (str)
- transform(X: DataFrame, y: DataFrame | None = None) tuple[DataFrame, DataFrame | None][source]
Filter rows from dataframe.
- Parameters:
X (pandas dataframe) – dataframe to filter
y (pandas dataframe, optional) – output dataframe if the filtering method requires it
- Returns:
filtered dataframe and target dataframe if provided.
- Return type:
tuple[pd.DataFrame, pd.DataFrame | None]
qsprpred.data.processing.feature_filters module
Different filters to select features from the training set.
To add a new feature filter:
- Add a FeatureFilter subclass for your new filter
- class qsprpred.data.processing.feature_filters.BorutaFilter(boruta_feat_selector: BorutaPy = None, seed: int | None = None)[source]
Bases:
FeatureFilter, Randomized
Boruta filter from BorutaPy: Boruta all-relevant feature selection.
- Variables:
featSelector (BorutaPy) – BorutaPy feature selector
droppedFeatures (pd.Index) – columns dropped by Boruta filter
seed (int) – Random state to use for shuffling and other random operations.
Initialize the BorutaFilter class.
- Parameters:
boruta_feat_selector (BorutaPy, optional) – The BorutaPy feature selector. If not provided, a default BorutaPy instance will be created.
seed (int | None, optional) – Random state to use for shuffling and other random operations. If None, the random state set in the BorutaPy instance is used. Defaults to None.
- fit(X: DataFrame, y: DataFrame)[source]
Fit the Boruta filter to the data.
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- fitTransform(X: DataFrame, y: None | DataFrame = None) tuple[DataFrame, DataFrame | None]
Fit the step to the dataset and apply it
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- Returns:
transformed data (pd.DataFrame) and (transformed) target data (pd.DataFrame | None)
- Return type:
tuple[pd.DataFrame, pd.DataFrame | None]
- property fitted: bool
Check if the step is fitted
- Returns:
True if the step is fitted, False otherwise
- Return type:
bool
- classmethod fromFile(filename: str) BorutaFilter[source]
Initialize a new instance from a JSON file.
- toFile(filename: str) str[source]
Serialize object to a JSON file. This JSON file should contain all data necessary to reconstruct the object.
- toJSON() str
Serialize object to a JSON string. This JSON string should contain all data necessary to reconstruct the object.
- Returns:
JSON string of the object
- Return type:
json (str)
- transform(X: DataFrame, y: DataFrame | None = None) tuple[DataFrame, DataFrame][source]
Filter out uninformative features from a dataframe using BorutaPy.
- Parameters:
X (pd.DataFrame) – dataframe to be filtered
y (pd.DataFrame, optional) – output dataframe if the filtering method requires it
- Returns:
the filtered dataframe (pd.DataFrame) and the target dataframe (pd.DataFrame)
- Return type:
tuple[pd.DataFrame, pd.DataFrame]
- class qsprpred.data.processing.feature_filters.FeatureFilter(**kwargs)[source]
Bases:
Step
Filter out uninformative features from a dataframe.
Initialize the step
- fit(X: DataFrame, y: None | DataFrame = None)
Fit the step to the dataset
If the step requires fitting to the data, this method should be implemented.
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- fitTransform(X: DataFrame, y: None | DataFrame = None) tuple[DataFrame, DataFrame | None]
Fit the step to the dataset and apply it
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- Returns:
transformed data (pd.DataFrame) and (transformed) target data (pd.DataFrame | None)
- Return type:
tuple[pd.DataFrame, pd.DataFrame | None]
- property fitted: bool
Check if the step is fitted
- Returns:
True if the step is fitted, False otherwise
- Return type:
bool
- toFile(filename: str) str
Serialize object to a JSON file. This JSON file should contain all data necessary to reconstruct the object.
- toJSON() str
Serialize object to a JSON string. This JSON string should contain all data necessary to reconstruct the object.
- Returns:
JSON string of the object
- Return type:
json (str)
- abstract transform(X: DataFrame, y: DataFrame | None = None) DataFrame[source]
Filter out uninformative features from a dataframe.
- Parameters:
X (pd.DataFrame) – dataframe to be filtered
y (pd.DataFrame, optional) – output dataframe if the filtering method requires it
- Returns:
The filtered pd.DataFrame
- class qsprpred.data.processing.feature_filters.HighCorrelationFilter(th: float)[source]
Bases:
FeatureFilter
Remove features with correlation higher than a given threshold.
- Variables:
th (float) – threshold for correlation
high_corr_cols (pd.Index) – columns with high correlation (if fitted)
Initialize the step
- fit(X: DataFrame, y: None | DataFrame = None)[source]
Find features with correlation higher than a given threshold.
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame, optional) – training targets
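The idea behind this filter can be sketched in plain Python. The sketch assumes Pearson correlation, compares its absolute value against `th`, and drops the later column of each highly correlated pair. All three choices are assumptions made for illustration, and `pearson` and `high_corr_columns` are hypothetical helpers rather than library functions.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equally long, non-constant sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    return cov / norm


def high_corr_columns(columns, th):
    """Return the columns to drop: the later column of each pair with |r| > th."""
    names = list(columns)
    to_drop = []
    for i, first in enumerate(names):
        if first in to_drop:
            continue  # already removed, so it cannot knock out other columns
        for second in names[i + 1:]:
            if second in to_drop:
                continue
            if abs(pearson(columns[first], columns[second])) > th:
                to_drop.append(second)
    return to_drop


columns = {
    "f1": [1.0, 2.0, 3.0, 4.0],
    "f2": [2.0, 4.0, 6.0, 8.0],    # perfectly correlated with f1
    "f3": [1.0, -1.0, 1.0, -1.0],  # weakly correlated with f1
}
print(high_corr_columns(columns, th=0.9))  # ['f2']
```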
- fitTransform(X: DataFrame, y: None | DataFrame = None) tuple[DataFrame, DataFrame | None]
Fit the step to the dataset and apply it
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- Returns:
transformed data (pd.DataFrame) and (transformed) target data (pd.DataFrame | None)
- Return type:
tuple[pd.DataFrame, pd.DataFrame | None]
- property fitted: bool
Check if the step is fitted
- Returns:
True if the step is fitted, False otherwise
- Return type:
bool
- toFile(filename: str) str
Serialize object to a JSON file. This JSON file should contain all data necessary to reconstruct the object.
- toJSON() str
Serialize object to a JSON string. This JSON string should contain all data necessary to reconstruct the object.
- Returns:
JSON string of the object
- Return type:
json (str)
- transform(X: DataFrame, y: DataFrame | None = None) tuple[DataFrame, DataFrame][source]
Filter out high correlation features from a dataframe.
- Parameters:
X (pd.DataFrame) – dataframe to be filtered
y (pd.DataFrame, optional) – output dataframe if the filtering method requires it
- Returns:
the filtered dataframe (pd.DataFrame) and the target dataframe (pd.DataFrame)
- Return type:
tuple[pd.DataFrame, pd.DataFrame]
- class qsprpred.data.processing.feature_filters.LowVarianceFilter(th: float)[source]
Bases:
FeatureFilter
Remove features with variance equal to or lower than a given threshold after MinMax scaling.
- Variables:
th (float) – threshold for removing features
low_var_cols (pd.Index) – columns with low variance (if fitted)
Initialize the step
- fit(X: DataFrame, y: None | DataFrame = None)[source]
Find features with variance equal to or lower than a given threshold after MinMax scaling.
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame, optional) – training targets
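A minimal sketch of the described selection rule, in plain Python: each column is min-max scaled to [0, 1], and columns whose variance is at most `th` are flagged. The use of the population variance (dividing by n) is an assumption, and `low_variance_columns` is a hypothetical helper.

```python
def low_variance_columns(columns, th):
    """Return names of columns whose variance after min-max scaling is <= th."""
    low = []
    for name, values in columns.items():
        lo, hi = min(values), max(values)
        span = hi - lo
        # A constant column scales to all zeros, so its variance is 0.
        scaled = [(v - lo) / span if span else 0.0 for v in values]
        mean = sum(scaled) / len(scaled)
        variance = sum((v - mean) ** 2 for v in scaled) / len(scaled)
        if variance <= th:
            low.append(name)
    return low


columns = {
    "constant": [3.0] * 10,
    "rare_bit": [0.0] * 9 + [1.0],  # variance 0.09 after scaling
    "spread": [0.0, 10.0] * 5,      # variance 0.25 after scaling
}
print(low_variance_columns(columns, th=0.1))  # ['constant', 'rare_bit']
```

Scaling first makes the threshold comparable across features with very different ranges.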
- fitTransform(X: DataFrame, y: None | DataFrame = None) tuple[DataFrame, DataFrame | None]
Fit the step to the dataset and apply it
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- Returns:
transformed data (pd.DataFrame) and (transformed) target data (pd.DataFrame | None)
- Return type:
tuple[pd.DataFrame, pd.DataFrame | None]
- property fitted: bool
Check if the step is fitted
- Returns:
True if the step is fitted, False otherwise
- Return type:
bool
- toFile(filename: str) str
Serialize object to a JSON file. This JSON file should contain all data necessary to reconstruct the object.
- toJSON() str
Serialize object to a JSON string. This JSON string should contain all data necessary to reconstruct the object.
- Returns:
JSON string of the object
- Return type:
json (str)
- transform(X: DataFrame, y: DataFrame | None = None) tuple[DataFrame, DataFrame][source]
Filter out low variance features from a dataframe.
- Parameters:
X (pd.DataFrame) – dataframe to be filtered
y (pd.DataFrame, optional) – output dataframe if the filtering method requires it
- Returns:
the filtered dataframe (pd.DataFrame) and the target dataframe (pd.DataFrame)
- Return type:
tuple[pd.DataFrame, pd.DataFrame]
qsprpred.data.processing.feature_transformers module
This module is used for feature standardization and transformation in a pipeline.
- class qsprpred.data.processing.feature_transformers.FeatureTransformer(**kwargs)[source]
Bases:
Step
Base class for feature transformers
This class is used to standardize or transform feature sets in a pipeline. It should be subclassed to implement specific transformations.
Currently, only the SklearnStep class is implemented, which wraps a scikit-learn transformer.
Initialize the step
- fit(X: DataFrame, y: None | DataFrame = None)
Fit the step to the dataset
If the step requires fitting to the data, this method should be implemented.
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- fitTransform(X: DataFrame, y: None | DataFrame = None) tuple[DataFrame, DataFrame | None]
Fit the step to the dataset and apply it
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- Returns:
transformed data (pd.DataFrame) and (transformed) target data (pd.DataFrame | None)
- Return type:
tuple[pd.DataFrame, pd.DataFrame | None]
- property fitted: bool
Check if the step is fitted
- Returns:
True if the step is fitted, False otherwise
- Return type:
bool
- toFile(filename: str) str
Serialize object to a JSON file. This JSON file should contain all data necessary to reconstruct the object.
- toJSON() str
Serialize object to a JSON string. This JSON string should contain all data necessary to reconstruct the object.
- Returns:
JSON string of the object
- Return type:
json (str)
- abstract transform(X: DataFrame, y: None | DataFrame = None) tuple[DataFrame, DataFrame | None]
Apply the step to the dataset
Note: the step should not modify the original data.
- Parameters:
X (pd.DataFrame) – data to be transformed
y (pd.DataFrame) – target data to be transformed
- Returns:
transformed data (pd.DataFrame) and (transformed) target data (pd.DataFrame | None)
- Return type:
tuple[pd.DataFrame, pd.DataFrame | None]
- class qsprpred.data.processing.feature_transformers.SklearnStep(transformer: BaseEstimator)[source]
Bases:
FeatureTransformer
Step that wraps a scikit-learn transformer
For example, this can be used to wrap a scikit-learn StandardScaler
- Variables:
transformer (BaseEstimator) – scikit-learn transformer to wrap, should have implementations of the fit and transform methods.
Initialize the SklearnStep
- Parameters:
transformer (BaseEstimator) – scikit-learn transformer to wrap, should have implementations of the fit and transform methods.
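The wrapping pattern can be illustrated without scikit-learn. In the sketch below, `MeanCenterer` is a toy object with the required `fit` and `transform` methods, and `WrappedStep` is a hypothetical wrapper that exposes the step interface documented on this page (`fit`, `transform`, `fitTransform`, `fitted`). It assumes that only the features are transformed and that targets pass through unchanged, which this page does not confirm.

```python
class MeanCenterer:
    """Stand-in for a scikit-learn style transformer with fit and transform."""

    def fit(self, X, y=None):
        n = len(X)
        self.means_ = [sum(col) / n for col in zip(*X)]
        return self

    def transform(self, X):
        return [[v - m for v, m in zip(row, self.means_)] for row in X]


class WrappedStep:
    """Hypothetical wrapper exposing a fit/transform/fitTransform interface."""

    def __init__(self, transformer):
        self.transformer = transformer
        self.fitted = False

    def fit(self, X, y=None):
        self.transformer.fit(X, y)
        self.fitted = True
        return self

    def transform(self, X, y=None):
        # Assumption: only features are transformed; targets are returned as-is.
        return self.transformer.transform(X), y

    def fitTransform(self, X, y=None):
        return self.fit(X, y).transform(X, y)


step = WrappedStep(MeanCenterer())
X_new, y_new = step.fitTransform([[1.0, 10.0], [3.0, 30.0]], y=[0, 1])
print(X_new)  # [[-1.0, -10.0], [1.0, 10.0]]
print(y_new)  # [0, 1]
```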
- fit(X: DataFrame, y: None | DataFrame = None)[source]
Fit the transformer to the data
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame | None) – training targets
- fitTransform(X: DataFrame, y: None | DataFrame = None) tuple[DataFrame, DataFrame | None]
Fit the step to the dataset and apply it
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- Returns:
transformed data (pd.DataFrame) and (transformed) target data (pd.DataFrame | None)
- Return type:
tuple[pd.DataFrame, pd.DataFrame | None]
- property fitted: bool
Check if the step is fitted
- Returns:
True if the step is fitted, False otherwise
- Return type:
bool
- toFile(filename: str) str
Serialize object to a JSON file. This JSON file should contain all data necessary to reconstruct the object.
- toJSON() str
Serialize object to a JSON string. This JSON string should contain all data necessary to reconstruct the object.
- Returns:
JSON string of the object
- Return type:
json (str)
- transform(X: DataFrame, y: None | DataFrame = None) tuple[DataFrame, DataFrame | None][source]
Transform the data using the transformer
- Parameters:
X (pd.DataFrame) – data to be transformed
y (pd.DataFrame | None) – target data to be transformed
- Returns:
transformed data (pd.DataFrame) and (transformed) target data (pd.DataFrame | None)
- Return type:
tuple[pd.DataFrame, pd.DataFrame | None]
qsprpred.data.processing.imputers module
- class qsprpred.data.processing.imputers.FeatureImputer(imputer: _BaseImputer, feature_properties: list[str] | None = None)[source]
Bases:
Imputer
Initialize the feature imputer.
- Parameters:
imputer (callable) – imputer function, e.g. from sklearn.impute, should have fit and transform methods
feature_properties (list[str], optional) – feature properties to impute, if None, all features will be imputed. Note that you can set either a DescriptorSet name or a list of feature names prefixed by the DescriptorSet name, e.g. [‘RDKitDesc’, ‘MorganFP_0’, ‘MorganFP_1’]
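How `feature_properties` might select columns can be sketched in plain Python. The matching rule used here (a column is selected if its name equals an entry or starts with the entry followed by an underscore) is an assumption based on the example above, and mean imputation stands in for whatever imputer is passed in. `select_columns` and `mean_impute` are hypothetical helpers.

```python
import math


def select_columns(all_columns, feature_properties=None):
    """Pick columns to impute: all of them, or those matching a name or set prefix."""
    if feature_properties is None:
        return list(all_columns)
    return [
        col for col in all_columns
        if any(col == p or col.startswith(p + "_") for p in feature_properties)
    ]


def mean_impute(columns, feature_properties=None):
    """Replace NaN with the mean of the observed values, in selected columns only."""
    result = {name: list(values) for name, values in columns.items()}
    for name in select_columns(columns, feature_properties):
        observed = [v for v in columns[name] if not math.isnan(v)]
        mean = sum(observed) / len(observed)  # assumes at least one observed value
        result[name] = [mean if math.isnan(v) else v for v in columns[name]]
    return result


nan = float("nan")
columns = {
    "RDKitDesc_MolWt": [100.0, nan, 300.0],
    "MorganFP_0": [1.0, nan, 0.0],
    "MorganFP_1": [nan, 1.0, 1.0],
}
imputed = mean_impute(columns, ["RDKitDesc", "MorganFP_0"])
print(imputed["RDKitDesc_MolWt"])  # [100.0, 200.0, 300.0]
print(imputed["MorganFP_0"])       # [1.0, 0.5, 0.0]
```

`MorganFP_1` is not listed, so its missing value is left untouched.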
- fit(X: DataFrame, y: DataFrame)[source]
Fit the imputer to the dataset
- Parameters:
X (pd.DataFrame) – training data features
y (pd.DataFrame) – training targets
- fitTransform(X: DataFrame, y: None | DataFrame = None) tuple[DataFrame, DataFrame | None]
Fit the step to the dataset and apply it
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- Returns:
transformed data (pd.DataFrame) and (transformed) target data (pd.DataFrame | None)
- Return type:
tuple[pd.DataFrame, pd.DataFrame | None]
- property fitted: bool
Check if the step is fitted
- Returns:
True if the step is fitted, False otherwise
- Return type:
bool
- toFile(filename: str) str
Serialize object to a JSON file. This JSON file should contain all data necessary to reconstruct the object.
- class qsprpred.data.processing.imputers.Imputer(**kwargs)[source]
Bases:
Step
Initialize the step
- fit(X: DataFrame, y: None | DataFrame = None)
Fit the step to the dataset
If the step requires fitting to the data, this method should be implemented.
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- fitTransform(X: DataFrame, y: None | DataFrame = None) tuple[DataFrame, DataFrame | None]
Fit the step to the dataset and apply it
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- Returns:
transformed data (pd.DataFrame) and (transformed) target data (pd.DataFrame | None)
- Return type:
tuple[pd.DataFrame, pd.DataFrame | None]
- property fitted: bool
Check if the step is fitted
- Returns:
True if the step is fitted, False otherwise
- Return type:
bool
- toFile(filename: str) str
Serialize object to a JSON file. This JSON file should contain all data necessary to reconstruct the object.
- toJSON() str
Serialize object to a JSON string. This JSON string should contain all data necessary to reconstruct the object.
- Returns:
JSON string of the object
- Return type:
json (str)
- abstract transform(X: DataFrame, y: None | DataFrame = None) tuple[DataFrame, DataFrame][source]
Impute values in the dataset.
- Parameters:
X (pd.DataFrame) – features (to be imputed)
y (pd.DataFrame) – target data (to be imputed)
- Returns:
(imputed) data (pd.DataFrame) and (imputed) target data (pd.DataFrame)
- Return type:
tuple[pd.DataFrame, pd.DataFrame]
- class qsprpred.data.processing.imputers.TargetImputer(imputer: _BaseImputer, target_properties: list[str] | None = None)[source]
Bases:
Imputer
Initialize the target imputer.
- Parameters:
- fit(X: DataFrame, y: DataFrame)[source]
Fit the imputer to the dataset
- Parameters:
X (pd.DataFrame) – training data features
y (pd.DataFrame) – training targets
- fitTransform(X: DataFrame, y: None | DataFrame = None) tuple[DataFrame, DataFrame | None]
Fit the step to the dataset and apply it
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- Returns:
transformed data (pd.DataFrame) and (transformed) target data (pd.DataFrame | None)
- Return type:
tuple[pd.DataFrame, pd.DataFrame | None]
- property fitted: bool
Check if the step is fitted
- Returns:
True if the step is fitted, False otherwise
- Return type:
bool
- toFile(filename: str) str
Serialize object to a JSON file. This JSON file should contain all data necessary to reconstruct the object.
qsprpred.data.processing.mol_processor module
Abstract class that defines a simple callback interface to process molecules.
- class qsprpred.data.processing.mol_processor.MolProcessor[source]
Bases:
ABC
A callable that processes a list of molecules specified as strings, RDKit molecules, or StoredMol instances. The processor can also accept additional properties related to the molecules if specified by the caller.
- class qsprpred.data.processing.mol_processor.MolProcessorWithID(id_prop: str | None = 'ID')[source]
Bases:
MolProcessor, ABC
A processor that requires a unique identifier for each molecule. Callers are instructed to pass this property with the requiredProps attribute.
- Variables:
idProp (str) – The name of the passed property that contains the molecule’s unique identifier.
Initialize the processor with the name of the property that contains the molecule’s unique identifier.
- Parameters:
id_prop (str) – Name of the property that contains the molecule’s unique identifier. Defaults to “ID”.
- iterMolsAndIDs(mols, props: dict[str, list] | None)[source]
Iterate over molecules and their corresponding IDs regardless of the input molecule format. This is just a helper function that will detect the input and yield the molecule and its ID.
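A simplified sketch of the pairing step, assuming the molecules arrive as a list of strings and the identifiers as a list under the `id_prop` key of `props`. The real method also detects other input formats, which is not reproduced here, and `iter_mols_and_ids` is a hypothetical stand-alone function.

```python
def iter_mols_and_ids(mols, props, id_prop="ID"):
    """Yield (molecule, identifier) pairs, pairing each molecule with props[id_prop]."""
    if props is None or id_prop not in props:
        raise ValueError(f"the '{id_prop}' property is required")
    ids = props[id_prop]
    if len(ids) != len(mols):
        raise ValueError("need exactly one identifier per molecule")
    yield from zip(mols, ids)


smiles = ["CCO", "c1ccccc1"]
props = {"ID": ["mol_0", "mol_1"], "pchembl": [6.1, 7.3]}
print(list(iter_mols_and_ids(smiles, props)))
# [('CCO', 'mol_0'), ('c1ccccc1', 'mol_1')]
```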
qsprpred.data.processing.pipeline module
- class qsprpred.data.processing.pipeline.DatasetPipeline(feature_calculators: list[DescriptorSet] | None = None, steps: dict[str, Step | BaseEstimator] | None = None, fixed: list[str] | None = None, fit_on: dict[str, str] | None = None, apply_to: dict[str, str] | None = None, skip: list[str] | None = None, seed: int | None = None)[source]
Bases:
Pipeline
Pipeline class for applying data preprocessing steps to a QSPRDataset.
- Variables:
feature_calculators (list[DescriptorSet] | None) – List of feature calculators to apply to the dataset. If None, no feature calculators are applied.
originalfeatureNames (list[str] | None) – Original feature names in the dataset before applying the pipeline.
Initialize the DatasetPipeline
- Parameters:
feature_calculators (list[DescriptorSet] | None) – List of feature calculators to apply to the dataset.
steps (dict[str, Step | BaseEstimator]) – Dictionary of named steps in the pipeline, if the step is a scikit-learn transformer, it will be wrapped in a SklearnStep.
fixed (list[str]) – List of step names that should not be fitted, only transformed
fit_on (dict[str, str]) – Settings for which data a step should be fitted on. Either ‘train’, ‘test’ or ‘both’, if not specified the step is fitted on the training data.
apply_to (dict[str, str]) – Settings for which data a step should be applied to. Either ‘train’, ‘test’ or ‘both’, if not specified the step is applied to both.
seed (int | None) – Random state for the pipeline
- addStep(name: str, step: Step, fit_on: str = 'train', apply_to: str = 'both', fixed: bool = False)
Add a step to the pipeline
- apply(X_train: DataFrame, y_train: DataFrame | None = None, X_test: DataFrame | None = None, y_test: DataFrame | None = None, fit: bool = True) tuple[DataFrame, DataFrame | None, DataFrame | None, DataFrame | None]
Apply the pipeline to the data
If fit is True, the pipeline is fitted to the training data and then applied to the train and test data. If fit is False, the pipeline is only applied to the data.
- Parameters:
X_train (pd.DataFrame) – training data to apply the pipeline to
y_train (pd.DataFrame | None) – training target data to apply the pipeline to
X_test (pd.DataFrame | None) – test data to apply the pipeline to
y_test (pd.DataFrame | None) – test target data to apply the pipeline to
fit (bool) – whether to fit the pipeline
- Returns:
X_train (pd.DataFrame): transformed training data
y_train (pd.DataFrame | None): transformed training targets
X_test (pd.DataFrame | None): transformed test data
y_test (pd.DataFrame | None): transformed test targets
- Return type:
tuple[pd.DataFrame, pd.DataFrame | None, pd.DataFrame | None, pd.DataFrame | None]
- applyOnDataSet(dataset: QSPRTable, split: DataSplit | None = None, fit: bool = True, seed: int | None = None) Generator[tuple[DataFrame, DataFrame, DataFrame, DataFrame] | tuple[DataFrame, DataFrame], None, None][source]
Apply the pipeline to the dataset
Note: the random state of the dataset is used to randomize the pipeline when the seed of feature calculators, splits or steps is not set.
- Parameters:
- Yields:
X_train (pd.DataFrame): transformed training data
y_train (pd.DataFrame): transformed training targets
X_test (pd.DataFrame | None): transformed test data if split is not None
y_test (pd.DataFrame | None): transformed test targets if split is not None
- removeSkip(name: str)
Remove a step from the skip list
- Parameters:
name (str) – name of the step to remove from the skip list
- removeStep(name: str)
Remove a step from the pipeline
- Parameters:
name (str) – name of the step to remove
- property skip: list[str]
Get the steps to skip
The steps to skip are not fitted or transformed, but are still present in the pipeline.
- class qsprpred.data.processing.pipeline.Pipeline(steps: dict[str, Step | BaseEstimator] | None = None, fixed: list[str] | None = None, fit_on: dict[str, str] | None = None, apply_to: dict[str, str] | None = None, skip: list[str] | None = None, seed: int | None = None)[source]
Bases: Randomized, JSONSerializable
Pipeline class for sequentially applying data preprocessing steps.
- Variables:
steps (dict[str, Step | BaseEstimator]) – Dictionary of named steps in the pipeline, if the step is a scikit-learn transformer, it will be wrapped in a SklearnStep.
fixed (list[str]) – List of step names that should not be fitted, only transformed
fitOn (dict[str, str]) – Settings for which data a step should be fitted on. Either ‘train’, ‘test’ or ‘both’, if not specified the step is fitted on the training data.
applyTo (dict[str, str]) – Settings for which data a step should be applied to. Either ‘train’, ‘test’ or ‘both’, if not specified the step is applied to both.
randomState (int | None) – Random state for the pipeline
fitted (bool) – Whether the pipeline is fitted
Initialize the Pipeline
- Parameters:
steps (dict[str, Step | BaseEstimator]) – Dictionary of named steps in the pipeline, if the step is a scikit-learn transformer, it will be wrapped in a SklearnStep.
fixed (list[str]) – List of step names that should not be fitted, only transformed
fit_on (dict[str, str]) – Settings for which data a step should be fitted on. Either ‘train’, ‘test’ or ‘both’, if not specified the step is fitted on the training data.
apply_to (dict[str, str]) – Settings for which data a step should be applied to. Either ‘train’, ‘test’ or ‘both’, if not specified the step is applied to both.
seed (int | None) – Random state for the pipeline
- addSkip(name: str)[source]
Add a step to the skip list
- Parameters:
name (str) – name of the step to skip
- addStep(name: str, step: Step, fit_on: str = 'train', apply_to: str = 'both', fixed: bool = False)[source]
Add a step to the pipeline
- apply(X_train: DataFrame, y_train: DataFrame | None = None, X_test: DataFrame | None = None, y_test: DataFrame | None = None, fit: bool = True) tuple[DataFrame, DataFrame | None, DataFrame | None, DataFrame | None][source]
Apply the pipeline to the data
If fit is True, the pipeline is fitted to the training data and then applied to the train and test data. If fit is False, the pipeline is only applied to the data.
- Parameters:
X_train (pd.DataFrame) – training data to apply the pipeline to
y_train (pd.DataFrame | None) – training target data to apply the pipeline to
X_test (pd.DataFrame | None) – test data to apply the pipeline to
y_test (pd.DataFrame | None) – test target data to apply the pipeline to
fit (bool) – whether to fit the pipeline
- Returns:
X_train (pd.DataFrame): transformed training data
y_train (pd.DataFrame | None): transformed training targets
X_test (pd.DataFrame | None): transformed test data
y_test (pd.DataFrame | None): transformed test targets
- Return type:
tuple[pd.DataFrame, pd.DataFrame | None, pd.DataFrame | None, pd.DataFrame | None]
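The effect of the fit flag can be illustrated with a toy example. The sketch below is not the QSPRpred Pipeline: `CenterStep` and `apply_steps` are hypothetical names, the data are plain lists of floats, and every step is assumed to be fitted on the training data and applied to both sets (the documented defaults).

```python
class CenterStep:
    """Toy step: subtracts the mean learned during fit (not a QSPRpred class)."""

    def __init__(self):
        self.mean = None

    def fit(self, X):
        self.mean = sum(X) / len(X)

    def transform(self, X):
        # Return a new list so the input is left untouched.
        return [x - self.mean for x in X]


def apply_steps(steps, X_train, X_test=None, fit=True):
    """Fit each step on the training data if requested, then apply it to both sets."""
    for step in steps:
        if fit:
            step.fit(X_train)
        X_train = step.transform(X_train)
        if X_test is not None:
            X_test = step.transform(X_test)
    return X_train, X_test


step = CenterStep()
train_out, test_out = apply_steps([step], [1.0, 2.0, 3.0], [4.0])
# Reuse the already fitted step on new data without refitting.
new_out, _ = apply_steps([step], [10.0], fit=False)
```

With fit=False the mean learned from the first call is reused, which is the usual reason for applying a fitted pipeline to unseen data.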
- removeSkip(name: str)[source]
Remove a step from the skip list
- Parameters:
name (str) – name of the step to remove from the skip list
- removeStep(name: str)[source]
Remove a step from the pipeline
- Parameters:
name (str) – name of the step to remove
- property skip: list[str]
Get the steps to skip
The steps to skip are not fitted or transformed, but are still present in the pipeline.
qsprpred.data.processing.step module
- class qsprpred.data.processing.step.DummyStep(**kwargs)[source]
Bases: Step
Dummy step that does nothing
Initialize the step
- fit(X: DataFrame, y: None | DataFrame = None)
Fit the step to the dataset
If the step requires fitting to the data, this method should be implemented.
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- fitTransform(X: DataFrame, y: None | DataFrame = None) tuple[DataFrame, DataFrame | None]
Fit the step to the dataset and apply it
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- Returns:
pd.DataFrame: transformed data
pd.DataFrame: (transformed) target data
- Return type:
tuple[pd.DataFrame, pd.DataFrame | None]
- property fitted: bool
Check if the step is fitted
- Returns:
True if the step is fitted, False otherwise
- Return type:
bool
- toFile(filename: str) str
Serialize object to a JSON file. This JSON file should contain all data necessary to reconstruct the object.
- toJSON() str
Serialize object to a JSON string. This JSON string should contain all data necessary to reconstruct the object.
- Returns:
JSON string of the object
- Return type:
json (str)
- transform(X: DataFrame, y: None | DataFrame = None) tuple[DataFrame, DataFrame | None][source]
Just return the input data
- Parameters:
X (pd.DataFrame) – data to be transformed
y (pd.DataFrame | None) – target data to be transformed
- Returns:
pd.DataFrame: unchanged data
pd.DataFrame | None: unchanged target data
- Return type:
tuple[pd.DataFrame, pd.DataFrame | None]
- class qsprpred.data.processing.step.Shuffle(seed: int | None = None)[source]
Bases: Step, Randomized
Step that shuffles the data
- Variables:
randomState (int | None) – Seed to randomize the shuffle.
Initialize the shuffle step
- Parameters:
seed (int | None) – Seed to randomize the shuffle.
- fit(X: DataFrame, y: None | DataFrame = None)
Fit the step to the dataset
If the step requires fitting to the data, this method should be implemented.
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- fitTransform(X: DataFrame, y: None | DataFrame = None) tuple[DataFrame, DataFrame | None]
Fit the step to the dataset and apply it
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- Returns:
pd.DataFrame: transformed data
pd.DataFrame: (transformed) target data
- Return type:
tuple[pd.DataFrame, pd.DataFrame | None]
- property fitted: bool
Check if the step is fitted
- Returns:
True if the step is fitted, False otherwise
- Return type:
bool
- toFile(filename: str) str
Serialize object to a JSON file. This JSON file should contain all data necessary to reconstruct the object.
- toJSON() str
Serialize object to a JSON string. This JSON string should contain all data necessary to reconstruct the object.
- Returns:
JSON string of the object
- Return type:
json (str)
- transform(X: DataFrame, y: None | DataFrame = None) tuple[DataFrame, DataFrame | None][source]
Shuffle the data
- Parameters:
X (pd.DataFrame) – data to be shuffled
y (pd.DataFrame | None) – target data to be shuffled
- Returns:
pd.DataFrame: shuffled data
pd.DataFrame | None: shuffled target data
- Return type:
tuple[pd.DataFrame, pd.DataFrame | None]
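A seeded shuffle that keeps features and targets aligned can be sketched as follows. The function `shuffle_rows` is hypothetical and works on plain lists; it only illustrates the idea of applying one permutation to both X and y and is not how the Shuffle step is implemented.

```python
import random


def shuffle_rows(X, y=None, seed=None):
    """Return shuffled copies of X and y using the same permutation."""
    order = list(range(len(X)))
    random.Random(seed).shuffle(order)
    X_shuffled = [X[i] for i in order]
    y_shuffled = None if y is None else [y[i] for i in order]
    return X_shuffled, y_shuffled


X = [[0.1], [0.2], [0.3], [0.4]]
y = [0, 1, 2, 3]
X_a, y_a = shuffle_rows(X, y, seed=42)
X_b, y_b = shuffle_rows(X, y, seed=42)
```

Using a dedicated `random.Random(seed)` instance keeps the result reproducible for a given seed without touching the global random state.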
- class qsprpred.data.processing.step.Step(**kwargs)[source]
Bases: JSONSerializable
A data preprocessing step that can be applied to a dataset
Initialize the step
- fit(X: DataFrame, y: None | DataFrame = None)[source]
Fit the step to the dataset
If the step requires fitting to the data, this method should be implemented.
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- fitTransform(X: DataFrame, y: None | DataFrame = None) tuple[DataFrame, DataFrame | None][source]
Fit the step to the dataset and apply it
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- Returns:
pd.DataFrame: transformed data
pd.DataFrame: (transformed) target data
- Return type:
tuple[pd.DataFrame, pd.DataFrame | None]
- property fitted: bool
Check if the step is fitted
- Returns:
True if the step is fitted, False otherwise
- Return type:
bool
- toFile(filename: str) str
Serialize object to a JSON file. This JSON file should contain all data necessary to reconstruct the object.
- toJSON() str
Serialize object to a JSON string. This JSON string should contain all data necessary to reconstruct the object.
- Returns:
JSON string of the object
- Return type:
json (str)
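The requirement that the JSON string contain everything needed to reconstruct the object can be illustrated with a small round trip. The class `ScaleBy` and its `to_json` / `from_json` methods are hypothetical and use only the standard library; they are not the JSONSerializable implementation.

```python
import json


class ScaleBy:
    """Toy object with a JSON round trip (not qsprpred's JSONSerializable)."""

    def __init__(self, factor: float):
        self.factor = factor

    def to_json(self) -> str:
        # Store every value needed to rebuild the object.
        return json.dumps({"factor": self.factor})

    @classmethod
    def from_json(cls, text: str) -> "ScaleBy":
        return cls(**json.loads(text))


original = ScaleBy(2.5)
restored = ScaleBy.from_json(original.to_json())
```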
- abstract transform(X: DataFrame, y: None | DataFrame = None) tuple[DataFrame, DataFrame | None][source]
Apply the step to the dataset
Note: the step should not modify the original data.
- Parameters:
X (pd.DataFrame) – data to be transformed
y (pd.DataFrame) – target data to be transformed
- Returns:
pd.DataFrame: transformed data
pd.DataFrame: (transformed) target data
- Return type:
tuple[pd.DataFrame, pd.DataFrame | None]
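A custom step following this interface might look like the sketch below. `ToyStep` is a hypothetical stand-in for the documented Step class (it is not imported from QSPRpred), the data are nested lists rather than DataFrames, and `ClipNegative` is an invented example. The point is that transform builds new objects instead of editing its input.

```python
from abc import ABC, abstractmethod


class ToyStep(ABC):
    """Stand-in for the documented interface (not qsprpred's Step)."""

    def fit(self, X, y=None):
        # Nothing to learn by default.
        return None

    @abstractmethod
    def transform(self, X, y=None):
        ...

    def fit_transform(self, X, y=None):
        self.fit(X, y)
        return self.transform(X, y)


class ClipNegative(ToyStep):
    """Replaces negative feature values with zero and passes the targets through."""

    def transform(self, X, y=None):
        # Build new rows instead of editing X in place.
        return [[max(v, 0.0) for v in row] for row in X], y


X = [[-1.0, 2.0], [3.0, -4.0]]
X_new, y_new = ClipNegative().fit_transform(X, y=[0, 1])
```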
qsprpred.data.processing.target_transformers module
- class qsprpred.data.processing.target_transformers.Discretizer(target: str, th: list[float] | float)[source]
Bases: TargetTransformer
Discretizes the target data into bins.
Note: using this step in a pipeline may break the subsequent model training, as the discretizer does not update the targetProperties of the dataset. It is recommended to use the makeClassification method of the dataset instead; see the documentation of the QSPRDataSet class.
- Variables:
Initialize the discretizer.
- Parameters:
- fit(X: DataFrame, y: None | DataFrame = None)
Fit the step to the dataset
If the step requires fitting to the data, this method should be implemented.
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- fitTransform(X: DataFrame, y: None | DataFrame = None) tuple[DataFrame, DataFrame | None]
Fit the step to the dataset and apply it
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- Returns:
pd.DataFrame: transformed data
pd.DataFrame: (transformed) target data
- Return type:
tuple[pd.DataFrame, pd.DataFrame | None]
- property fitted: bool
Check if the step is fitted
- Returns:
True if the step is fitted, False otherwise
- Return type:
bool
- getIntervals(discrete_values: Series) Series[source]
Transform the discretized values to intervals.
- Parameters:
discrete_values (pd.Series) – discretized values
- Returns:
intervals corresponding to the discretized values
- Return type:
pd.Series
- toFile(filename: str) str
Serialize object to a JSON file. This JSON file should contain all data necessary to reconstruct the object.
- toJSON() str
Serialize object to a JSON string. This JSON string should contain all data necessary to reconstruct the object.
- Returns:
JSON string of the object
- Return type:
json (str)
- transform(X: DataFrame, y: DataFrame | None = None) tuple[DataFrame, DataFrame | None][source]
Discretize the target data into bins.
- Parameters:
X (pd.DataFrame) – features
y (pd.DataFrame | None) – target data to be discretized
- Returns:
pd.DataFrame: data
pd.DataFrame | None: (discretized) target data
- Return type:
tuple[pd.DataFrame, pd.DataFrame | None]
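Binning by thresholds can be sketched with the standard library. The sketch assumes that a list of thresholds is a sorted list of bin edges and that bins are closed on the left, [edge_i, edge_i+1); neither assumption is confirmed by this reference. The functions `discretize` and `get_intervals` are hypothetical and operate on plain lists.

```python
from bisect import bisect_right


def discretize(values, edges):
    """Map each value to the index of the bin [edges[i], edges[i + 1]) it falls in."""
    return [bisect_right(edges, v) - 1 for v in values]


def get_intervals(classes, edges):
    """Translate bin indices back to their (lower, upper) bounds."""
    return [(edges[c], edges[c + 1]) for c in classes]


edges = [-1, 1, 2, 100]
classes = discretize([0, 1, 1.5, 7], edges)
intervals = get_intervals(classes, edges)
```

Mapping the class indices back to bounds is the same kind of lookup that getIntervals describes: it recovers which interval each discretized value stands for.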
- class qsprpred.data.processing.target_transformers.SimpleTargetTransformer(target: str, transformation: Literal['log10', 'log2', 'log', 'sqrt', 'cbrt', 'exp', 'square', 'cube', 'reciprocal'])[source]
Bases: TargetTransformer
Applies a simple transformation to the target data.
- Variables:
transform_dict (dict) – dictionary of available transformations
transformer (callable) – numpy function
Initialize the SimpleTargetTransformer
- Parameters:
- fit(X: DataFrame, y: None | DataFrame = None)
Fit the step to the dataset
If the step requires fitting to the data, this method should be implemented.
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- fitTransform(X: DataFrame, y: None | DataFrame = None) tuple[DataFrame, DataFrame | None]
Fit the step to the dataset and apply it
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- Returns:
pd.DataFrame: transformed data
pd.DataFrame: (transformed) target data
- Return type:
tuple[pd.DataFrame, pd.DataFrame | None]
- property fitted: bool
Check if the step is fitted
- Returns:
True if the step is fitted, False otherwise
- Return type:
bool
- getInverseTransformer() callable[source]
Get the inverse transformer function
- Returns:
inverse transformer function
- Return type:
callable
- getTransformer() callable[source]
Get the transformer function
- Returns:
transformer function
- Return type:
callable
- inverseTransform(X: DataFrame, y: None | DataFrame = None) tuple[DataFrame, DataFrame | None][source]
Inverse transform the data using the inverse transformer
- Parameters:
X (pd.DataFrame) – data to be transformed
y (pd.DataFrame | None) – target data to be transformed
- Returns:
pd.DataFrame: transformed data
pd.DataFrame | None: (transformed) target data
- Return type:
tuple[pd.DataFrame, pd.DataFrame | None]
- inverse_transform_dict = {'cbrt': <function SimpleTargetTransformer.<lambda>>, 'cube': <function SimpleTargetTransformer.<lambda>>, 'exp': <function SimpleTargetTransformer.<lambda>>, 'log': <function SimpleTargetTransformer.<lambda>>, 'log10': <function SimpleTargetTransformer.<lambda>>, 'log2': <function SimpleTargetTransformer.<lambda>>, 'reciprocal': <function SimpleTargetTransformer.<lambda>>, 'sqrt': <function SimpleTargetTransformer.<lambda>>, 'square': <function SimpleTargetTransformer.<lambda>>}
- toFile(filename: str) str
Serialize object to a JSON file. This JSON file should contain all data necessary to reconstruct the object.
- toJSON() str
Serialize object to a JSON string. This JSON string should contain all data necessary to reconstruct the object.
- Returns:
JSON string of the object
- Return type:
json (str)
- transform(X: DataFrame, y: None | DataFrame = None) tuple[DataFrame, DataFrame | None][source]
Transform the data using the transformer
- Parameters:
X (pd.DataFrame) – data to be transformed
y (pd.DataFrame | None) – target data to be transformed
- Returns:
pd.DataFrame: transformed data
pd.DataFrame | None: (transformed) target data
- Return type:
tuple[pd.DataFrame, pd.DataFrame | None]
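The relationship between a transformation and its inverse can be sketched with scalar functions. The dictionaries below reuse the transformation names from the signature, but their contents are an assumption written with the `math` module; the class itself is documented as using numpy functions, and its actual lambdas are not shown in this reference.

```python
import math

# Forward functions keyed by the transformation names listed in the signature.
transform_dict = {
    "log10": math.log10,
    "log2": math.log2,
    "log": math.log,
    "sqrt": math.sqrt,
    "cbrt": lambda v: math.copysign(abs(v) ** (1 / 3), v),
    "exp": math.exp,
    "square": lambda v: v**2,
    "cube": lambda v: v**3,
    "reciprocal": lambda v: 1 / v,
}

# Assumed inverses; each undoes the forward function of the same name.
inverse_transform_dict = {
    "log10": lambda v: 10**v,
    "log2": lambda v: 2**v,
    "log": math.exp,
    "sqrt": lambda v: v**2,
    "cbrt": lambda v: v**3,
    "exp": math.log,
    "square": math.sqrt,  # only recovers non-negative inputs
    "cube": lambda v: math.copysign(abs(v) ** (1 / 3), v),
    "reciprocal": lambda v: 1 / v,
}

targets = [0.5, 1.0, 4.0]
round_trips = {
    name: [inverse_transform_dict[name](transform_dict[name](t)) for t in targets]
    for name in transform_dict
}
```

For positive targets every round trip returns the starting value up to floating-point error; "square" is the one case where the inverse cannot recover the sign of the input.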
- transform_dict = {'cbrt': <function SimpleTargetTransformer.<lambda>>, 'cube': <function SimpleTargetTransformer.<lambda>>, 'exp': <function SimpleTargetTransformer.<lambda>>, 'log': <function SimpleTargetTransformer.<lambda>>, 'log10': <function SimpleTargetTransformer.<lambda>>, 'log2': <function SimpleTargetTransformer.<lambda>>, 'reciprocal': <function SimpleTargetTransformer.<lambda>>, 'sqrt': <function SimpleTargetTransformer.<lambda>>, 'square': <function SimpleTargetTransformer.<lambda>>}
- class qsprpred.data.processing.target_transformers.TargetTransformer(**kwargs)[source]
Bases: Step
Initialize the step
- fit(X: DataFrame, y: None | DataFrame = None)
Fit the step to the dataset
If the step requires fitting to the data, this method should be implemented.
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- fitTransform(X: DataFrame, y: None | DataFrame = None) tuple[DataFrame, DataFrame | None]
Fit the step to the dataset and apply it
- Parameters:
X (pd.DataFrame) – training data
y (pd.DataFrame) – training targets
- Returns:
pd.DataFrame: transformed data
pd.DataFrame: (transformed) target data
- Return type:
tuple[pd.DataFrame, pd.DataFrame | None]
- property fitted: bool
Check if the step is fitted
- Returns:
True if the step is fitted, False otherwise
- Return type:
bool
- toFile(filename: str) str
Serialize object to a JSON file. This JSON file should contain all data necessary to reconstruct the object.
qsprpred.data.processing.tests module
- class qsprpred.data.processing.tests.TestApplicabilityDomain(methodName='runTest')[source]
Bases: DataSetsPathMixIn, QSPRTestCase
Test the applicability domain.
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
- classmethod addClassCleanup(function, /, *args, **kwargs)
Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).
- addCleanup(function, /, *args, **kwargs)
Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.
Cleanup items are called even if setUp fails (unlike tearDown).
- addTypeEqualityFunc(typeobj, function)
Add a type specific assertEqual style function to compare a type.
This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.
- Parameters:
typeobj – The data type to call this function on when both values are of the same type in assertEqual().
function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.
- assertAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
If the two objects compare equal then they will automatically compare almost equal.
- assertCountEqual(first, second, msg=None)
Asserts that two iterables have the same elements, the same number of times, without regard to order.
    self.assertEqual(Counter(list(first)),
                     Counter(list(second)))
Example:
    - [0, 1, 1] and [1, 0, 1] compare equal.
    - [0, 0, 1] and [0, 1] compare unequal.
- assertDictEqual(d1, d2, msg=None)
- assertEndsWith(s, suffix, msg=None)
- assertEqual(first, second, msg=None)
Fail if the two objects are unequal as determined by the ‘==’ operator.
- assertFalse(expr, msg=None)
Check that the expression is false.
- assertGreater(a, b, msg=None)
Just like self.assertTrue(a > b), but with a nicer default message.
- assertGreaterEqual(a, b, msg=None)
Just like self.assertTrue(a >= b), but with a nicer default message.
- assertHasAttr(obj, name, msg=None)
- assertIn(member, container, msg=None)
Just like self.assertTrue(a in b), but with a nicer default message.
- assertIs(expr1, expr2, msg=None)
Just like self.assertTrue(a is b), but with a nicer default message.
- assertIsInstance(obj, cls, msg=None)
Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.
- assertIsNone(obj, msg=None)
Same as self.assertTrue(obj is None), with a nicer default message.
- assertIsNot(expr1, expr2, msg=None)
Just like self.assertTrue(a is not b), but with a nicer default message.
- assertIsNotNone(obj, msg=None)
Included for symmetry with assertIsNone.
- assertIsSubclass(cls, superclass, msg=None)
- assertLess(a, b, msg=None)
Just like self.assertTrue(a < b), but with a nicer default message.
- assertLessEqual(a, b, msg=None)
Just like self.assertTrue(a <= b), but with a nicer default message.
- assertListEqual(list1, list2, msg=None)
A list-specific equality assertion.
- Parameters:
list1 – The first list to compare.
list2 – The second list to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertLogs(logger=None, level=None)
Fail unless a log message of level level or higher is emitted on logger_name or its children. If omitted, level defaults to INFO and logger defaults to the root logger.
This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.
Example:
    with self.assertLogs('foo', level='INFO') as cm:
        logging.getLogger('foo').info('first message')
        logging.getLogger('foo.bar').error('second message')
    self.assertEqual(cm.output, ['INFO:foo:first message',
                                 'ERROR:foo.bar:second message'])
- assertMultiLineEqual(first, second, msg=None)
Assert that two multi-line strings are equal.
- assertNoLogs(logger=None, level=None)
Fail unless no log messages of level level or higher are emitted on logger_name or its children.
This method must be used as a context manager.
- assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
Objects that are equal automatically fail.
- assertNotEndsWith(s, suffix, msg=None)
- assertNotEqual(first, second, msg=None)
Fail if the two objects are equal as determined by the ‘!=’ operator.
- assertNotHasAttr(obj, name, msg=None)
- assertNotIn(member, container, msg=None)
Just like self.assertTrue(a not in b), but with a nicer default message.
- assertNotIsInstance(obj, cls, msg=None)
Included for symmetry with assertIsInstance.
- assertNotIsSubclass(cls, superclass, msg=None)
- assertNotRegex(text, unexpected_regex, msg=None)
Fail the test if the text matches the regular expression.
- assertNotStartsWith(s, prefix, msg=None)
- assertRaises(expected_exception, *args, **kwargs)
Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.
If called with the callable and arguments omitted, will return a context object used like this:
    with self.assertRaises(SomeException):
        do_something()
An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.
The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:
    with self.assertRaises(SomeException) as cm:
        do_something()
    the_exception = cm.exception
    self.assertEqual(the_exception.error_code, 3)
- assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)
Asserts that the message in a raised exception matches a regex.
- Parameters:
expected_exception – Exception class expected to be raised.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.
- assertRegex(text, expected_regex, msg=None)
Fail the test unless the text matches the regular expression.
- assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)
An equality assertion for ordered sequences (like lists and tuples).
For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.
- Parameters:
seq1 – The first sequence to compare.
seq2 – The second sequence to compare.
seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.
msg – Optional message to use on failure instead of a list of differences.
- assertSetEqual(set1, set2, msg=None)
A set-specific equality assertion.
- Parameters:
set1 – The first set to compare.
set2 – The second set to compare.
msg – Optional message to use on failure instead of a list of differences.
assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).
- assertStartsWith(s, prefix, msg=None)
- assertTrue(expr, msg=None)
Check that the expression is true.
- assertTupleEqual(tuple1, tuple2, msg=None)
A tuple-specific equality assertion.
- Parameters:
tuple1 – The first tuple to compare.
tuple2 – The second tuple to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertWarns(expected_warning, *args, **kwargs)
Fail unless a warning of class warnClass is triggered by the callable when invoked with specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.
If called with the callable and arguments omitted, will return a context object used like this:
    with self.assertWarns(SomeWarning):
        do_something()
An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.
The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:
    with self.assertWarns(SomeWarning) as cm:
        do_something()
    the_warning = cm.warning
    self.assertEqual(the_warning.some_attribute, 147)
- assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)
Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.
- Parameters:
expected_warning – Warning class expected to be triggered.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.
- clearGenerated()
Remove the directories that are used for testing.
- countTestCases()
- createLargeMultitaskDataSet(name='QSPRDataset_multi_test', target_props=[{'name': 'HBD', 'task': TargetTasks.MULTICLASS, 'th': [-1, 1, 2, 100]}, {'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42)
Create a large dataset for testing purposes.
- Parameters:
- Returns:
a QSPRDataSet object
- Return type:
- createLargeTestDataSet(name='QSPRDataset_test_large', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42, n_jobs=1, chunk_size=None, drop_empty_target_props=True)
Create a large dataset for testing purposes.
- Parameters:
- Returns:
a QSPRDataSet object
- Return type:
- createSmallTestDataSet(name='QSPRDataset_test_small', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42, drop_empty_target_props=True)
Create a small dataset for testing purposes.
- Parameters:
- Returns:
a QSPRDataSet object
- Return type:
- createTestDataSetFromFrame(df, name='QSPRDataset_test', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=None, n_jobs=1, chunk_size=None, drop_empty_target_props=True)
Create a dataset for testing purposes from the given data frame.
- Parameters:
df (pd.DataFrame) – data frame containing the dataset
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
prep (dict) – dictionary containing preparation settings
n_jobs (int) – number of jobs to use for parallel processing
chunk_size (int) – size of chunks to use per job in parallel processing
- Returns:
a QSPRDataSet object
- Return type:
- debug()
Run the test without collecting errors in a TestResult
- defaultTestResult()
- classmethod doClassCleanups()
Execute all class cleanup functions. Normally called for you after tearDownClass.
- doCleanups()
Execute all cleanup functions. Normally called for you after tearDown.
- classmethod enterClassContext(cm)
Same as enterContext, but class-wide.
- enterContext(cm)
Enters the supplied context manager.
If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.
- fail(msg=None)
Fail immediately, with the given message.
- failureException
alias of AssertionError
- classmethod getAllDescriptorSets()
Return a list of (ideally) all available descriptor sets. For now they need to be added manually to the list below.
TODO: would be nice to create the list automatically by implementing a descriptor set registry that would hold all installed descriptor sets.
- getBigDF()
Get a large data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- classmethod getDataPrepGrid()
Return a list of many possible combinations of descriptor calculators, splits, feature standardizers, feature filters and data filters. Again, this is not exhaustive, but should cover a lot of cases.
- Returns:
a generator that yields tuples of all possible combinations as stated above, each tuple is defined as: (descriptor_calculator, split, feature_standardizer, feature_filters, data_filters)
- Return type:
grid
- classmethod getDefaultCalculatorCombo()
Makes a list of default descriptor calculators that can be used in tests.
It creates a calculator with only morgan fingerprints and rdkit descriptors, but also one with them both to test behaviour with multiple descriptor sets. Override this method if you want to test with other descriptor sets and calculator combinations.
- static getDefaultPrep(add_imputer=None)
Return a dictionary with default preparation settings.
- classmethod getPrepCombos()
Return a list of all possible preparation combinations as generated by getDataPrepGrid as well as their names. The generated list can be used to parameterize tests with the given named combinations.
- getSmallDF()
Get a small data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- getStorage(df, name, n_jobs=1, chunk_size=None)
- id()
- longMessage = True
- maxDiff = 640
- run(result=None)
- classmethod setUpClass()
Hook method for setting up class fixture before running tests in the class.
- setUpPaths()
Create the directories that are used for testing.
- shortDescription()
Returns a one-line description of the test, or None if no description has been provided.
The default implementation of this method returns the first line of the specified test method’s docstring.
- skipTest(reason)
Skip this test.
- subTest(msg=<object object>, **params)
Return a context manager that will run the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.
- tearDown()
Remove all files and directories that are used for testing.
- classmethod tearDownClass()
Hook method for deconstructing the class fixture after running all tests in the class.
- class qsprpred.data.processing.tests.TestDataFilters(methodName='runTest')[source]
Bases:
QSPRTestCase, StepCheckMixIn
Test the data filters, which filter the dataset based on properties.
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
- classmethod addClassCleanup(function, /, *args, **kwargs)
Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).
- addCleanup(function, /, *args, **kwargs)
Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.
Cleanup items are called even if setUp fails (unlike tearDown).
- addTypeEqualityFunc(typeobj, function)
Add a type specific assertEqual style function to compare a type.
This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.
- Parameters:
typeobj – The data type to call this function on when both values are of the same type in assertEqual().
function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.
- assertAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
If the two objects compare equal then they will automatically compare almost equal.
- assertCountEqual(first, second, msg=None)
Asserts that two iterables have the same elements, the same number of times, without regard to order.
- self.assertEqual(Counter(list(first)), Counter(list(second)))
- Example:
[0, 1, 1] and [1, 0, 1] compare equal.
[0, 0, 1] and [0, 1] compare unequal.
- assertDictEqual(d1, d2, msg=None)
- assertEndsWith(s, suffix, msg=None)
- assertEqual(first, second, msg=None)
Fail if the two objects are unequal as determined by the ‘==’ operator.
- assertFalse(expr, msg=None)
Check that the expression is false.
- assertGreater(a, b, msg=None)
Just like self.assertTrue(a > b), but with a nicer default message.
- assertGreaterEqual(a, b, msg=None)
Just like self.assertTrue(a >= b), but with a nicer default message.
- assertHasAttr(obj, name, msg=None)
- assertIn(member, container, msg=None)
Just like self.assertTrue(a in b), but with a nicer default message.
- assertIs(expr1, expr2, msg=None)
Just like self.assertTrue(a is b), but with a nicer default message.
- assertIsInstance(obj, cls, msg=None)
Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.
- assertIsNone(obj, msg=None)
Same as self.assertTrue(obj is None), with a nicer default message.
- assertIsNot(expr1, expr2, msg=None)
Just like self.assertTrue(a is not b), but with a nicer default message.
- assertIsNotNone(obj, msg=None)
Included for symmetry with assertIsNone.
- assertIsSubclass(cls, superclass, msg=None)
- assertLess(a, b, msg=None)
Just like self.assertTrue(a < b), but with a nicer default message.
- assertLessEqual(a, b, msg=None)
Just like self.assertTrue(a <= b), but with a nicer default message.
- assertListEqual(list1, list2, msg=None)
A list-specific equality assertion.
- Parameters:
list1 – The first list to compare.
list2 – The second list to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertLogs(logger=None, level=None)
Fail unless a log message of level level or higher is emitted on logger or its children. If omitted, level defaults to INFO and logger defaults to the root logger.
This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.
Example:
with self.assertLogs('foo', level='INFO') as cm:
    logging.getLogger('foo').info('first message')
    logging.getLogger('foo.bar').error('second message')
self.assertEqual(cm.output, ['INFO:foo:first message',
                             'ERROR:foo.bar:second message'])
- assertMultiLineEqual(first, second, msg=None)
Assert that two multi-line strings are equal.
- assertNoLogs(logger=None, level=None)
Fail unless no log messages of level level or higher are emitted on logger or its children.
This method must be used as a context manager.
- assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
Objects that are equal automatically fail.
- assertNotEndsWith(s, suffix, msg=None)
- assertNotEqual(first, second, msg=None)
Fail if the two objects are equal as determined by the ‘!=’ operator.
- assertNotHasAttr(obj, name, msg=None)
- assertNotIn(member, container, msg=None)
Just like self.assertTrue(a not in b), but with a nicer default message.
- assertNotIsInstance(obj, cls, msg=None)
Included for symmetry with assertIsInstance.
- assertNotIsSubclass(cls, superclass, msg=None)
- assertNotRegex(text, unexpected_regex, msg=None)
Fail the test if the text matches the regular expression.
- assertNotStartsWith(s, prefix, msg=None)
- assertRaises(expected_exception, *args, **kwargs)
Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.
If called with the callable and arguments omitted, will return a context object used like this:
with self.assertRaises(SomeException):
    do_something()
An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.
The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:
with self.assertRaises(SomeException) as cm:
    do_something()
the_exception = cm.exception
self.assertEqual(the_exception.error_code, 3)
- assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)
Asserts that the message in a raised exception matches a regex.
- Parameters:
expected_exception – Exception class expected to be raised.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.
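A minimal sketch of the context-manager form, assuming a made-up parse_threshold helper that exists only for this illustration. The regex is searched for in the string form of the raised exception.

```python
import unittest


def parse_threshold(text):
    """Hypothetical helper used only for this illustration."""
    value = float(text)
    if value < 0:
        raise ValueError(f"threshold must be non-negative, got {value}")
    return value


class ParseThresholdTest(unittest.TestCase):
    def test_negative_rejected(self):
        with self.assertRaisesRegex(ValueError, r"non-negative, got -1\.0") as cm:
            parse_threshold("-1")
        # The caught exception stays available for further checks.
        self.assertIsInstance(cm.exception, ValueError)


result = unittest.TestResult()
unittest.defaultTestLoader.loadTestsFromTestCase(ParseThresholdTest).run(result)
print(result.wasSuccessful())  # True
```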
- assertRegex(text, expected_regex, msg=None)
Fail the test unless the text matches the regular expression.
- assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)
An equality assertion for ordered sequences (like lists and tuples).
For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.
- Parameters:
seq1 – The first sequence to compare.
seq2 – The second sequence to compare.
seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.
msg – Optional message to use on failure instead of a list of differences.
- assertSetEqual(set1, set2, msg=None)
A set-specific equality assertion.
- Parameters:
set1 – The first set to compare.
set2 – The second set to compare.
msg – Optional message to use on failure instead of a list of differences.
assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).
- assertStartsWith(s, prefix, msg=None)
- assertTrue(expr, msg=None)
Check that the expression is true.
- assertTupleEqual(tuple1, tuple2, msg=None)
A tuple-specific equality assertion.
- Parameters:
tuple1 – The first tuple to compare.
tuple2 – The second tuple to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertWarns(expected_warning, *args, **kwargs)
Fail unless a warning of class expected_warning is triggered by the callable when invoked with specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.
If called with the callable and arguments omitted, will return a context object used like this:
with self.assertWarns(SomeWarning):
    do_something()
An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.
The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:
with self.assertWarns(SomeWarning) as cm:
    do_something()
the_warning = cm.warning
self.assertEqual(the_warning.some_attribute, 147)
- assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)
Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.
- Parameters:
expected_warning – Warning class expected to be triggered.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.
- checkFitTransform(step: Step, dataset: QSPRTable, fromfile=False) Tuple[DataFrame, DataFrame | None]
Check basic step fit and transform functionality.
- checkStep(step: Step, dataset: QSPRTable) Tuple[DataFrame, DataFrame | None]
Check basic step functionality and serialization.
- clearGenerated()
Remove the directories that are used for testing.
- countTestCases()
- createLargeMultitaskDataSet(name='QSPRDataset_multi_test', target_props=[{'name': 'HBD', 'task': TargetTasks.MULTICLASS, 'th': [-1, 1, 2, 100]}, {'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42)
Create a large dataset for testing purposes.
- Parameters:
- Returns:
a QSPRDataSet object
- Return type:
- createLargeTestDataSet(name='QSPRDataset_test_large', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42, n_jobs=1, chunk_size=None, drop_empty_target_props=True)
Create a large dataset for testing purposes.
- Parameters:
- Returns:
a QSPRDataSet object
- Return type:
- createSmallTestDataSet(name='QSPRDataset_test_small', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42, drop_empty_target_props=True)
Create a small dataset for testing purposes.
- Parameters:
- Returns:
a QSPRDataSet object
- Return type:
- createTestDataSetFromFrame(df, name='QSPRDataset_test', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=None, n_jobs=1, chunk_size=None, drop_empty_target_props=True)
Create a dataset for testing purposes from the given data frame.
- Parameters:
df (pd.DataFrame) – data frame containing the dataset
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
prep (dict) – dictionary containing preparation settings
n_jobs (int) – number of jobs to use for parallel processing
chunk_size (int) – size of chunks to use per job in parallel processing
- Returns:
a QSPRDataSet object
- Return type:
- debug()
Run the test without collecting errors in a TestResult
- defaultTestResult()
- classmethod doClassCleanups()
Execute all class cleanup functions. Normally called for you after tearDownClass.
- doCleanups()
Execute all cleanup functions. Normally called for you after tearDown.
- classmethod enterClassContext(cm)
Same as enterContext, but class-wide.
- enterContext(cm)
Enters the supplied context manager.
If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.
- fail(msg=None)
Fail immediately, with the given message.
- failureException
alias of AssertionError
- classmethod getAllDescriptorSets()
Return a list of (ideally) all available descriptor sets. For now they need to be added manually to the list below.
TODO: would be nice to create the list automatically by implementing a descriptor set registry that would hold all installed descriptor sets.
- getBigDF()
Get a large data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- classmethod getDataPrepGrid()
Return a list of many possible combinations of descriptor calculators, splits, feature standardizers, feature filters and data filters. Again, this is not exhaustive, but should cover a lot of cases.
- Returns:
a generator that yields tuples of all possible combinations as stated above, each tuple is defined as: (descriptor_calculator, split, feature_standardizer, feature_filters, data_filters)
- Return type:
grid
- classmethod getDefaultCalculatorCombo()
Makes a list of default descriptor calculators that can be used in tests.
It creates one calculator with only Morgan fingerprints, one with only RDKit descriptors, and one with both to test behaviour with multiple descriptor sets. Override this method if you want to test with other descriptor sets and calculator combinations.
- static getDefaultPrep(add_imputer=None)
Return a dictionary with default preparation settings.
- classmethod getPrepCombos()
Return a list of all possible preparation combinations as generated by getDataPrepGrid as well as their names. The generated list can be used to parameterize tests with the given named combinations.
- getSmallDF()
Get a small data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- getStorage(df, name, n_jobs=1, chunk_size=None)
- id()
- longMessage = True
- maxDiff = 640
- run(result=None)
- classmethod setUpClass()
Hook method for setting up class fixture before running tests in the class.
- setUpPaths()
Create the directories that are used for testing.
- shortDescription()
Returns a one-line description of the test, or None if no description has been provided.
The default implementation of this method returns the first line of the specified test method’s docstring.
- skipTest(reason)
Skip this test.
- subTest(msg=<object object>, **params)
Return a context manager that will run the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.
- tearDown()
Hook method for deconstructing the test fixture after testing it.
- classmethod tearDownClass()
Hook method for deconstructing the class fixture after running all tests in the class.
- class qsprpred.data.processing.tests.TestDatasetPipeline(methodName='runTest')[source]
Bases:
DataSetsPathMixIn, QSPRTestCase
Test the dataset pipeline.
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
- classmethod addClassCleanup(function, /, *args, **kwargs)
Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).
- addCleanup(function, /, *args, **kwargs)
Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.
Cleanup items are called even if setUp fails (unlike tearDown).
- addTypeEqualityFunc(typeobj, function)
Add a type specific assertEqual style function to compare a type.
This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.
- Parameters:
typeobj – The data type to call this function on when both values are of the same type in assertEqual().
function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.
- assertAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
If the two objects compare equal then they will automatically compare almost equal.
- assertCountEqual(first, second, msg=None)
Asserts that two iterables have the same elements, the same number of times, without regard to order.
- self.assertEqual(Counter(list(first)), Counter(list(second)))
- Example:
[0, 1, 1] and [1, 0, 1] compare equal.
[0, 0, 1] and [0, 1] compare unequal.
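The two example pairs can be checked directly with collections.Counter, which is the comparison the description refers to. The same_elements function below is a made-up helper for this sketch; the real assertion also copes with unhashable elements, which this simplified version does not.

```python
from collections import Counter


def same_elements(first, second):
    """Hypothetical helper: True if both iterables hold the same elements
    the same number of times, regardless of order (hashable items only)."""
    return Counter(list(first)) == Counter(list(second))


print(same_elements([0, 1, 1], [1, 0, 1]))  # True
print(same_elements([0, 0, 1], [0, 1]))     # False
```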
- assertDictEqual(d1, d2, msg=None)
- assertEndsWith(s, suffix, msg=None)
- assertEqual(first, second, msg=None)
Fail if the two objects are unequal as determined by the ‘==’ operator.
- assertFalse(expr, msg=None)
Check that the expression is false.
- assertGreater(a, b, msg=None)
Just like self.assertTrue(a > b), but with a nicer default message.
- assertGreaterEqual(a, b, msg=None)
Just like self.assertTrue(a >= b), but with a nicer default message.
- assertHasAttr(obj, name, msg=None)
- assertIn(member, container, msg=None)
Just like self.assertTrue(a in b), but with a nicer default message.
- assertIs(expr1, expr2, msg=None)
Just like self.assertTrue(a is b), but with a nicer default message.
- assertIsInstance(obj, cls, msg=None)
Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.
- assertIsNone(obj, msg=None)
Same as self.assertTrue(obj is None), with a nicer default message.
- assertIsNot(expr1, expr2, msg=None)
Just like self.assertTrue(a is not b), but with a nicer default message.
- assertIsNotNone(obj, msg=None)
Included for symmetry with assertIsNone.
- assertIsSubclass(cls, superclass, msg=None)
- assertLess(a, b, msg=None)
Just like self.assertTrue(a < b), but with a nicer default message.
- assertLessEqual(a, b, msg=None)
Just like self.assertTrue(a <= b), but with a nicer default message.
- assertListEqual(list1, list2, msg=None)
A list-specific equality assertion.
- Parameters:
list1 – The first list to compare.
list2 – The second list to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertLogs(logger=None, level=None)
Fail unless a log message of level level or higher is emitted on logger or its children. If omitted, level defaults to INFO and logger defaults to the root logger.
This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.
Example:
with self.assertLogs('foo', level='INFO') as cm:
    logging.getLogger('foo').info('first message')
    logging.getLogger('foo.bar').error('second message')
self.assertEqual(cm.output, ['INFO:foo:first message',
                             'ERROR:foo.bar:second message'])
- assertMultiLineEqual(first, second, msg=None)
Assert that two multi-line strings are equal.
- assertNoLogs(logger=None, level=None)
Fail unless no log messages of level level or higher are emitted on logger or its children.
This method must be used as a context manager.
- assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
Objects that are equal automatically fail.
- assertNotEndsWith(s, suffix, msg=None)
- assertNotEqual(first, second, msg=None)
Fail if the two objects are equal as determined by the ‘!=’ operator.
- assertNotHasAttr(obj, name, msg=None)
- assertNotIn(member, container, msg=None)
Just like self.assertTrue(a not in b), but with a nicer default message.
- assertNotIsInstance(obj, cls, msg=None)
Included for symmetry with assertIsInstance.
- assertNotIsSubclass(cls, superclass, msg=None)
- assertNotRegex(text, unexpected_regex, msg=None)
Fail the test if the text matches the regular expression.
- assertNotStartsWith(s, prefix, msg=None)
- assertRaises(expected_exception, *args, **kwargs)
Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.
If called with the callable and arguments omitted, will return a context object used like this:
with self.assertRaises(SomeException):
    do_something()
An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.
The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:
with self.assertRaises(SomeException) as cm:
    do_something()
the_exception = cm.exception
self.assertEqual(the_exception.error_code, 3)
- assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)
Asserts that the message in a raised exception matches a regex.
- Parameters:
expected_exception – Exception class expected to be raised.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.
- assertRegex(text, expected_regex, msg=None)
Fail the test unless the text matches the regular expression.
- assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)
An equality assertion for ordered sequences (like lists and tuples).
For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.
- Parameters:
seq1 – The first sequence to compare.
seq2 – The second sequence to compare.
seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.
msg – Optional message to use on failure instead of a list of differences.
- assertSetEqual(set1, set2, msg=None)
A set-specific equality assertion.
- Parameters:
set1 – The first set to compare.
set2 – The second set to compare.
msg – Optional message to use on failure instead of a list of differences.
assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).
- assertStartsWith(s, prefix, msg=None)
- assertTrue(expr, msg=None)
Check that the expression is true.
- assertTupleEqual(tuple1, tuple2, msg=None)
A tuple-specific equality assertion.
- Parameters:
tuple1 – The first tuple to compare.
tuple2 – The second tuple to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertWarns(expected_warning, *args, **kwargs)
Fail unless a warning of class expected_warning is triggered by the callable when invoked with specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.
If called with the callable and arguments omitted, will return a context object used like this:
with self.assertWarns(SomeWarning):
    do_something()
An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.
The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:
with self.assertWarns(SomeWarning) as cm:
    do_something()
the_warning = cm.warning
self.assertEqual(the_warning.some_attribute, 147)
- assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)
Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.
- Parameters:
expected_warning – Warning class expected to be triggered.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.
- clearGenerated()
Remove the directories that are used for testing.
- countTestCases()
- createLargeMultitaskDataSet(name='QSPRDataset_multi_test', target_props=[{'name': 'HBD', 'task': TargetTasks.MULTICLASS, 'th': [-1, 1, 2, 100]}, {'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42)
Create a large dataset for testing purposes.
- Parameters:
- Returns:
a QSPRDataSet object
- Return type:
- createLargeTestDataSet(name='QSPRDataset_test_large', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42, n_jobs=1, chunk_size=None, drop_empty_target_props=True)
Create a large dataset for testing purposes.
- Parameters:
- Returns:
a QSPRDataSet object
- Return type:
- createSmallTestDataSet(name='QSPRDataset_test_small', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42, drop_empty_target_props=True)
Create a small dataset for testing purposes.
- Parameters:
- Returns:
a QSPRDataSet object
- Return type:
- createTestDataSetFromFrame(df, name='QSPRDataset_test', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=None, n_jobs=1, chunk_size=None, drop_empty_target_props=True)
Create a dataset for testing purposes from the given data frame.
- Parameters:
df (pd.DataFrame) – data frame containing the dataset
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
prep (dict) – dictionary containing preparation settings
n_jobs (int) – number of jobs to use for parallel processing
chunk_size (int) – size of chunks to use per job in parallel processing
- Returns:
a QSPRDataSet object
- Return type:
- debug()
Run the test without collecting errors in a TestResult
- defaultTestResult()
- classmethod doClassCleanups()
Execute all class cleanup functions. Normally called for you after tearDownClass.
- doCleanups()
Execute all cleanup functions. Normally called for you after tearDown.
- classmethod enterClassContext(cm)
Same as enterContext, but class-wide.
- enterContext(cm)
Enters the supplied context manager.
If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.
- fail(msg=None)
Fail immediately, with the given message.
- failureException
alias of AssertionError
- classmethod getAllDescriptorSets()
Return a list of (ideally) all available descriptor sets. For now they need to be added manually to the list below.
TODO: would be nice to create the list automatically by implementing a descriptor set registry that would hold all installed descriptor sets.
- getBigDF()
Get a large data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- classmethod getDataPrepGrid()
Return a list of many possible combinations of descriptor calculators, splits, feature standardizers, feature filters and data filters. Again, this is not exhaustive, but should cover a lot of cases.
- Returns:
a generator that yields tuples of all possible combinations as stated above, each tuple is defined as: (descriptor_calculator, split, feature_standardizer, feature_filters, data_filters)
- Return type:
grid
- classmethod getDefaultCalculatorCombo()
Makes a list of default descriptor calculators that can be used in tests.
It creates one calculator with only Morgan fingerprints, one with only RDKit descriptors, and one with both to test behaviour with multiple descriptor sets. Override this method if you want to test with other descriptor sets and calculator combinations.
- static getDefaultPrep(add_imputer=None)
Return a dictionary with default preparation settings.
- classmethod getPrepCombos()
Return a list of all possible preparation combinations as generated by getDataPrepGrid as well as their names. The generated list can be used to parameterize tests with the given named combinations.
- getSmallDF()
Get a small data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- getStorage(df, name, n_jobs=1, chunk_size=None)
- id()
- longMessage = True
- maxDiff = 640
- run(result=None)
- classmethod setUpClass()
Hook method for setting up class fixture before running tests in the class.
- setUpPaths()
Create the directories that are used for testing.
- shortDescription()
Returns a one-line description of the test, or None if no description has been provided.
The default implementation of this method returns the first line of the specified test method’s docstring.
- skipTest(reason)
Skip this test.
- subTest(msg=<object object>, **params)
Return a context manager that will run the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.
- tearDown()
Remove all files and directories that are used for testing.
- classmethod tearDownClass()
Hook method for deconstructing the class fixture after running all tests in the class.
- class qsprpred.data.processing.tests.TestDummyStep(methodName='runTest')[source]
Bases: QSPRTestCase, StepCheckMixIn
Test the dummy step.
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
- classmethod addClassCleanup(function, /, *args, **kwargs)
Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).
- addCleanup(function, /, *args, **kwargs)
Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.
Cleanup items are called even if setUp fails (unlike tearDown).
- addTypeEqualityFunc(typeobj, function)
Add a type specific assertEqual style function to compare a type.
This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.
- Parameters:
typeobj – The data type to call this function on when both values are of the same type in assertEqual().
function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.
- assertAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
If the two objects compare equal then they will automatically compare almost equal.
- assertCountEqual(first, second, msg=None)
Asserts that two iterables have the same elements, the same number of times, without regard to order.
    self.assertEqual(Counter(list(first)), Counter(list(second)))
- Example:
[0, 1, 1] and [1, 0, 1] compare equal.
[0, 0, 1] and [0, 1] compare unequal.
- assertDictEqual(d1, d2, msg=None)
- assertEndsWith(s, suffix, msg=None)
- assertEqual(first, second, msg=None)
Fail if the two objects are unequal as determined by the ‘==’ operator.
- assertFalse(expr, msg=None)
Check that the expression is false.
- assertGreater(a, b, msg=None)
Just like self.assertTrue(a > b), but with a nicer default message.
- assertGreaterEqual(a, b, msg=None)
Just like self.assertTrue(a >= b), but with a nicer default message.
- assertHasAttr(obj, name, msg=None)
- assertIn(member, container, msg=None)
Just like self.assertTrue(a in b), but with a nicer default message.
- assertIs(expr1, expr2, msg=None)
Just like self.assertTrue(a is b), but with a nicer default message.
- assertIsInstance(obj, cls, msg=None)
Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.
- assertIsNone(obj, msg=None)
Same as self.assertTrue(obj is None), with a nicer default message.
- assertIsNot(expr1, expr2, msg=None)
Just like self.assertTrue(a is not b), but with a nicer default message.
- assertIsNotNone(obj, msg=None)
Included for symmetry with assertIsNone.
- assertIsSubclass(cls, superclass, msg=None)
- assertLess(a, b, msg=None)
Just like self.assertTrue(a < b), but with a nicer default message.
- assertLessEqual(a, b, msg=None)
Just like self.assertTrue(a <= b), but with a nicer default message.
- assertListEqual(list1, list2, msg=None)
A list-specific equality assertion.
- Parameters:
list1 – The first list to compare.
list2 – The second list to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertLogs(logger=None, level=None)
Fail unless a log message of level level or higher is emitted on logger_name or its children. If omitted, level defaults to INFO and logger defaults to the root logger.
This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.
Example:
    with self.assertLogs('foo', level='INFO') as cm:
        logging.getLogger('foo').info('first message')
        logging.getLogger('foo.bar').error('second message')
    self.assertEqual(cm.output, ['INFO:foo:first message',
                                 'ERROR:foo.bar:second message'])
- assertMultiLineEqual(first, second, msg=None)
Assert that two multi-line strings are equal.
- assertNoLogs(logger=None, level=None)
Fail unless no log messages of level level or higher are emitted on logger_name or its children.
This method must be used as a context manager.
- assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
Objects that are equal automatically fail.
- assertNotEndsWith(s, suffix, msg=None)
- assertNotEqual(first, second, msg=None)
Fail if the two objects are equal as determined by the ‘!=’ operator.
- assertNotHasAttr(obj, name, msg=None)
- assertNotIn(member, container, msg=None)
Just like self.assertTrue(a not in b), but with a nicer default message.
- assertNotIsInstance(obj, cls, msg=None)
Included for symmetry with assertIsInstance.
- assertNotIsSubclass(cls, superclass, msg=None)
- assertNotRegex(text, unexpected_regex, msg=None)
Fail the test if the text matches the regular expression.
- assertNotStartsWith(s, prefix, msg=None)
- assertRaises(expected_exception, *args, **kwargs)
Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.
If called with the callable and arguments omitted, will return a context object used like this:
    with self.assertRaises(SomeException):
        do_something()
An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.
The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:
    with self.assertRaises(SomeException) as cm:
        do_something()
    the_exception = cm.exception
    self.assertEqual(the_exception.error_code, 3)
- assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)
Asserts that the message in a raised exception matches a regex.
- Parameters:
expected_exception – Exception class expected to be raised.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.
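Both calling conventions listed in the parameters above can be sketched with the standard unittest module. The helper parse_threshold and the class ParseThresholdTest are hypothetical and exist only for this example.

```python
import unittest


def parse_threshold(text):
    """Hypothetical helper: parse a non-negative float from a string."""
    value = float(text)
    if value < 0:
        raise ValueError(f"threshold must be non-negative, got {value}")
    return value


class ParseThresholdTest(unittest.TestCase):
    """Hypothetical test case used only for illustration."""

    def test_context_manager_form(self):
        # The regex is searched for in the string form of the exception.
        with self.assertRaisesRegex(ValueError, r"non-negative") as cm:
            parse_threshold("-1.5")
        self.assertIn("-1.5", str(cm.exception))

    def test_callable_form(self):
        # After the regex come the callable and its positional arguments.
        self.assertRaisesRegex(
            ValueError, "could not convert", parse_threshold, "abc"
        )


result = unittest.TestResult()
unittest.defaultTestLoader.loadTestsFromTestCase(ParseThresholdTest).run(result)
print(result.wasSuccessful())
```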
- assertRegex(text, expected_regex, msg=None)
Fail the test unless the text matches the regular expression.
- assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)
An equality assertion for ordered sequences (like lists and tuples).
For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.
- Parameters:
seq1 – The first sequence to compare.
seq2 – The second sequence to compare.
seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.
msg – Optional message to use on failure instead of a list of differences.
- assertSetEqual(set1, set2, msg=None)
A set-specific equality assertion.
- Parameters:
set1 – The first set to compare.
set2 – The second set to compare.
msg – Optional message to use on failure instead of a list of differences.
assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).
- assertStartsWith(s, prefix, msg=None)
- assertTrue(expr, msg=None)
Check that the expression is true.
- assertTupleEqual(tuple1, tuple2, msg=None)
A tuple-specific equality assertion.
- Parameters:
tuple1 – The first tuple to compare.
tuple2 – The second tuple to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertWarns(expected_warning, *args, **kwargs)
Fail unless a warning of class warnClass is triggered by the callable when invoked with specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.
If called with the callable and arguments omitted, will return a context object used like this:
    with self.assertWarns(SomeWarning):
        do_something()
An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.
The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:
    with self.assertWarns(SomeWarning) as cm:
        do_something()
    the_warning = cm.warning
    self.assertEqual(the_warning.some_attribute, 147)
- assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)
Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.
- Parameters:
expected_warning – Warning class expected to be triggered.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.
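A minimal sketch of the context-manager form, using only the standard unittest and warnings modules. The function old_api and the class WarningTest are hypothetical names used for illustration.

```python
import unittest
import warnings


def old_api():
    """Hypothetical function that emits a deprecation warning."""
    warnings.warn("old_api is deprecated, use new_api", DeprecationWarning)
    return 42


class WarningTest(unittest.TestCase):
    """Hypothetical test case used only for illustration."""

    def test_deprecation(self):
        # Only warnings of the given class whose message matches the regex count.
        with self.assertWarnsRegex(DeprecationWarning, r"use new_api") as cm:
            value = old_api()
        self.assertEqual(value, 42)
        # The matching warning instance is kept on the context object.
        self.assertIn("deprecated", str(cm.warning))


result = unittest.TestResult()
unittest.defaultTestLoader.loadTestsFromTestCase(WarningTest).run(result)
print(result.wasSuccessful())
```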
- checkFitTransform(step: Step, dataset: QSPRTable, fromfile=False) Tuple[DataFrame, DataFrame | None]
Check basic step fit and transform functionality.
- checkStep(step: Step, dataset: QSPRTable) Tuple[DataFrame, DataFrame | None]
Check basic step functionality and serialization.
- clearGenerated()
Remove the directories that are used for testing.
- countTestCases()
- createLargeMultitaskDataSet(name='QSPRDataset_multi_test', target_props=[{'name': 'HBD', 'task': TargetTasks.MULTICLASS, 'th': [-1, 1, 2, 100]}, {'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42)
Create a large dataset for testing purposes.
- Parameters:
- Returns:
a QSPRDataSet object
- Return type:
- createLargeTestDataSet(name='QSPRDataset_test_large', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42, n_jobs=1, chunk_size=None, drop_empty_target_props=True)
Create a large dataset for testing purposes.
- Parameters:
- Returns:
a QSPRDataSet object
- Return type:
- createSmallTestDataSet(name='QSPRDataset_test_small', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42, drop_empty_target_props=True)
Create a small dataset for testing purposes.
- Parameters:
- Returns:
a QSPRDataSet object
- Return type:
- createTestDataSetFromFrame(df, name='QSPRDataset_test', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=None, n_jobs=1, chunk_size=None, drop_empty_target_props=True)
Create a dataset for testing purposes from the given data frame.
- Parameters:
df (pd.DataFrame) – data frame containing the dataset
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
prep (dict) – dictionary containing preparation settings
n_jobs (int) – number of jobs to use for parallel processing
chunk_size (int) – size of chunks to use per job in parallel processing
- Returns:
a QSPRDataSet object
- Return type:
- debug()
Run the test without collecting errors in a TestResult
- defaultTestResult()
- classmethod doClassCleanups()
Execute all class cleanup functions. Normally called for you after tearDownClass.
- doCleanups()
Execute all cleanup functions. Normally called for you after tearDown.
- classmethod enterClassContext(cm)
Same as enterContext, but class-wide.
- enterContext(cm)
Enters the supplied context manager.
If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.
- fail(msg=None)
Fail immediately, with the given message.
- failureException
alias of AssertionError
- classmethod getAllDescriptorSets()
Return a list of (ideally) all available descriptor sets. For now they need to be added manually to the list below.
TODO: would be nice to create the list automatically by implementing a descriptor set registry that would hold all installed descriptor sets.
- getBigDF()
Get a large data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- classmethod getDataPrepGrid()
Return a list of many possible combinations of descriptor calculators, splits, feature standardizers, feature filters and data filters. Again, this is not exhaustive, but should cover a lot of cases.
- Returns:
a generator that yields tuples of all possible combinations as stated above, each tuple is defined as: (descriptor_calculator, split, feature_standardizer, feature_filters, data_filters)
- Return type:
grid
- classmethod getDefaultCalculatorCombo()
Makes a list of default descriptor calculators that can be used in tests.
It creates calculators with only Morgan fingerprints and only RDKit descriptors, as well as one with both, to test behaviour with multiple descriptor sets. Override this method if you want to test with other descriptor sets and calculator combinations.
- static getDefaultPrep(add_imputer=None)
Return a dictionary with default preparation settings.
- classmethod getPrepCombos()
Return a list of all possible preparation combinations as generated by getDataPrepGrid, as well as their names. The generated list can be used to parameterize tests with the given named combinations.
- getSmallDF()
Get a small data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- getStorage(df, name, n_jobs=1, chunk_size=None)
- id()
- longMessage = True
- maxDiff = 640
- run(result=None)
- classmethod setUpClass()
Hook method for setting up class fixture before running tests in the class.
- setUpPaths()
Create the directories that are used for testing.
- shortDescription()
Returns a one-line description of the test, or None if no description has been provided.
The default implementation of this method returns the first line of the specified test method’s docstring.
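The default behaviour can be checked directly with the standard unittest module. DescribedTest is a hypothetical class written for this example.

```python
import unittest


class DescribedTest(unittest.TestCase):
    """Hypothetical test case used only for illustration."""

    def test_documented(self):
        """First line becomes the description.

        Further lines are ignored by shortDescription().
        """

    def test_undocumented(self):
        pass


# Instantiate each test by method name and ask for its description.
documented = DescribedTest("test_documented").shortDescription()
undocumented = DescribedTest("test_undocumented").shortDescription()
print(documented, undocumented)
```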
- skipTest(reason)
Skip this test.
- subTest(msg=<object object>, **params)
Return a context manager that will return the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.
- tearDown()
Hook method for deconstructing the test fixture after testing it.
- classmethod tearDownClass()
Hook method for deconstructing the class fixture after running all tests in the class.
- class qsprpred.data.processing.tests.TestFeatureFilters(methodName='runTest')[source]
Bases: QSPRTestCase, StepCheckMixIn
Tests to check if the feature filters work on their own.
Note: This also tests the DataframeDescriptorSet, as it is used to add test descriptors.
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
- classmethod addClassCleanup(function, /, *args, **kwargs)
Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).
- addCleanup(function, /, *args, **kwargs)
Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.
Cleanup items are called even if setUp fails (unlike tearDown).
- addTypeEqualityFunc(typeobj, function)
Add a type specific assertEqual style function to compare a type.
This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.
- Parameters:
typeobj – The data type to call this function on when both values are of the same type in assertEqual().
function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.
- assertAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
If the two objects compare equal then they will automatically compare almost equal.
- assertCountEqual(first, second, msg=None)
Asserts that two iterables have the same elements, the same number of times, without regard to order.
    self.assertEqual(Counter(list(first)), Counter(list(second)))
- Example:
[0, 1, 1] and [1, 0, 1] compare equal.
[0, 0, 1] and [0, 1] compare unequal.
- assertDictEqual(d1, d2, msg=None)
- assertEndsWith(s, suffix, msg=None)
- assertEqual(first, second, msg=None)
Fail if the two objects are unequal as determined by the ‘==’ operator.
- assertFalse(expr, msg=None)
Check that the expression is false.
- assertGreater(a, b, msg=None)
Just like self.assertTrue(a > b), but with a nicer default message.
- assertGreaterEqual(a, b, msg=None)
Just like self.assertTrue(a >= b), but with a nicer default message.
- assertHasAttr(obj, name, msg=None)
- assertIn(member, container, msg=None)
Just like self.assertTrue(a in b), but with a nicer default message.
- assertIs(expr1, expr2, msg=None)
Just like self.assertTrue(a is b), but with a nicer default message.
- assertIsInstance(obj, cls, msg=None)
Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.
- assertIsNone(obj, msg=None)
Same as self.assertTrue(obj is None), with a nicer default message.
- assertIsNot(expr1, expr2, msg=None)
Just like self.assertTrue(a is not b), but with a nicer default message.
- assertIsNotNone(obj, msg=None)
Included for symmetry with assertIsNone.
- assertIsSubclass(cls, superclass, msg=None)
- assertLess(a, b, msg=None)
Just like self.assertTrue(a < b), but with a nicer default message.
- assertLessEqual(a, b, msg=None)
Just like self.assertTrue(a <= b), but with a nicer default message.
- assertListEqual(list1, list2, msg=None)
A list-specific equality assertion.
- Parameters:
list1 – The first list to compare.
list2 – The second list to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertLogs(logger=None, level=None)
Fail unless a log message of level level or higher is emitted on logger_name or its children. If omitted, level defaults to INFO and logger defaults to the root logger.
This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.
Example:
    with self.assertLogs('foo', level='INFO') as cm:
        logging.getLogger('foo').info('first message')
        logging.getLogger('foo.bar').error('second message')
    self.assertEqual(cm.output, ['INFO:foo:first message',
                                 'ERROR:foo.bar:second message'])
- assertMultiLineEqual(first, second, msg=None)
Assert that two multi-line strings are equal.
- assertNoLogs(logger=None, level=None)
Fail unless no log messages of level level or higher are emitted on logger_name or its children.
This method must be used as a context manager.
- assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
Objects that are equal automatically fail.
- assertNotEndsWith(s, suffix, msg=None)
- assertNotEqual(first, second, msg=None)
Fail if the two objects are equal as determined by the ‘!=’ operator.
- assertNotHasAttr(obj, name, msg=None)
- assertNotIn(member, container, msg=None)
Just like self.assertTrue(a not in b), but with a nicer default message.
- assertNotIsInstance(obj, cls, msg=None)
Included for symmetry with assertIsInstance.
- assertNotIsSubclass(cls, superclass, msg=None)
- assertNotRegex(text, unexpected_regex, msg=None)
Fail the test if the text matches the regular expression.
- assertNotStartsWith(s, prefix, msg=None)
- assertRaises(expected_exception, *args, **kwargs)
Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.
If called with the callable and arguments omitted, will return a context object used like this:
    with self.assertRaises(SomeException):
        do_something()
An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.
The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:
    with self.assertRaises(SomeException) as cm:
        do_something()
    the_exception = cm.exception
    self.assertEqual(the_exception.error_code, 3)
- assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)
Asserts that the message in a raised exception matches a regex.
- Parameters:
expected_exception – Exception class expected to be raised.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.
- assertRegex(text, expected_regex, msg=None)
Fail the test unless the text matches the regular expression.
- assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)
An equality assertion for ordered sequences (like lists and tuples).
For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.
- Parameters:
seq1 – The first sequence to compare.
seq2 – The second sequence to compare.
seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.
msg – Optional message to use on failure instead of a list of differences.
- assertSetEqual(set1, set2, msg=None)
A set-specific equality assertion.
- Parameters:
set1 – The first set to compare.
set2 – The second set to compare.
msg – Optional message to use on failure instead of a list of differences.
assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).
- assertStartsWith(s, prefix, msg=None)
- assertTrue(expr, msg=None)
Check that the expression is true.
- assertTupleEqual(tuple1, tuple2, msg=None)
A tuple-specific equality assertion.
- Parameters:
tuple1 – The first tuple to compare.
tuple2 – The second tuple to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertWarns(expected_warning, *args, **kwargs)
Fail unless a warning of class warnClass is triggered by the callable when invoked with specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.
If called with the callable and arguments omitted, will return a context object used like this:
    with self.assertWarns(SomeWarning):
        do_something()
An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.
The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:
    with self.assertWarns(SomeWarning) as cm:
        do_something()
    the_warning = cm.warning
    self.assertEqual(the_warning.some_attribute, 147)
- assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)
Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.
- Parameters:
expected_warning – Warning class expected to be triggered.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.
- checkFitTransform(step: Step, dataset: QSPRTable, fromfile=False) Tuple[DataFrame, DataFrame | None]
Check basic step fit and transform functionality.
- checkStep(step: Step, dataset: QSPRTable) Tuple[DataFrame, DataFrame | None]
Check basic step functionality and serialization.
- clearGenerated()
Remove the directories that are used for testing.
- countTestCases()
- createLargeMultitaskDataSet(name='QSPRDataset_multi_test', target_props=[{'name': 'HBD', 'task': TargetTasks.MULTICLASS, 'th': [-1, 1, 2, 100]}, {'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42)
Create a large dataset for testing purposes.
- Parameters:
- Returns:
a QSPRDataSet object
- Return type:
- createLargeTestDataSet(name='QSPRDataset_test_large', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42, n_jobs=1, chunk_size=None, drop_empty_target_props=True)
Create a large dataset for testing purposes.
- Parameters:
- Returns:
a QSPRDataSet object
- Return type:
- createSmallTestDataSet(name='QSPRDataset_test_small', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42, drop_empty_target_props=True)
Create a small dataset for testing purposes.
- Parameters:
- Returns:
a QSPRDataSet object
- Return type:
- createTestDataSetFromFrame(df, name='QSPRDataset_test', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=None, n_jobs=1, chunk_size=None, drop_empty_target_props=True)
Create a dataset for testing purposes from the given data frame.
- Parameters:
df (pd.DataFrame) – data frame containing the dataset
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
prep (dict) – dictionary containing preparation settings
n_jobs (int) – number of jobs to use for parallel processing
chunk_size (int) – size of chunks to use per job in parallel processing
- Returns:
a QSPRDataSet object
- Return type:
- debug()
Run the test without collecting errors in a TestResult
- defaultTestResult()
- classmethod doClassCleanups()
Execute all class cleanup functions. Normally called for you after tearDownClass.
- doCleanups()
Execute all cleanup functions. Normally called for you after tearDown.
- classmethod enterClassContext(cm)
Same as enterContext, but class-wide.
- enterContext(cm)
Enters the supplied context manager.
If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.
- fail(msg=None)
Fail immediately, with the given message.
- failureException
alias of AssertionError
- classmethod getAllDescriptorSets()
Return a list of (ideally) all available descriptor sets. For now they need to be added manually to the list below.
TODO: would be nice to create the list automatically by implementing a descriptor set registry that would hold all installed descriptor sets.
- getBigDF()
Get a large data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- classmethod getDataPrepGrid()
Return a list of many possible combinations of descriptor calculators, splits, feature standardizers, feature filters and data filters. Again, this is not exhaustive, but should cover a lot of cases.
- Returns:
a generator that yields tuples of all possible combinations as stated above, each tuple is defined as: (descriptor_calculator, split, feature_standardizer, feature_filters, data_filters)
- Return type:
grid
- classmethod getDefaultCalculatorCombo()
Makes a list of default descriptor calculators that can be used in tests.
It creates calculators with only Morgan fingerprints and only RDKit descriptors, as well as one with both, to test behaviour with multiple descriptor sets. Override this method if you want to test with other descriptor sets and calculator combinations.
- static getDefaultPrep(add_imputer=None)
Return a dictionary with default preparation settings.
- classmethod getPrepCombos()
Return a list of all possible preparation combinations as generated by getDataPrepGrid, as well as their names. The generated list can be used to parameterize tests with the given named combinations.
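The idea of a lazily generated grid paired with readable names can be sketched with the standard library alone. This is not the QSPRpred implementation: the string placeholders, the three-element tuples and the functions data_prep_grid and prep_combos are all assumptions made for illustration, whereas the real grid yields five-element tuples of QSPRpred objects.

```python
from itertools import product

# Hypothetical placeholder components; the real grid uses QSPRpred objects.
calculators = ["MorganFP", "RDKitDescs"]
splits = [None, "RandomSplit"]
standardizers = [None, "StandardScaler"]


def data_prep_grid():
    """Lazily yield every (calculator, split, standardizer) combination."""
    yield from product(calculators, splits, standardizers)


def prep_combos():
    """Prefix each combination with a readable name for test parameterization."""
    return [
        ("_".join(str(part) for part in combo), *combo)
        for combo in data_prep_grid()
    ]


combos = prep_combos()
print(len(combos), combos[0])
```

Each entry starts with a name built from its components, so a parameterized test runner can use that name to label the generated test.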
- getSmallDF()
Get a small data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- getStorage(df, name, n_jobs=1, chunk_size=None)
- id()
- longMessage = True
- maxDiff = 640
- run(result=None)
- classmethod setUpClass()
Hook method for setting up class fixture before running tests in the class.
- setUpPaths()
Create the directories that are used for testing.
- shortDescription()
Returns a one-line description of the test, or None if no description has been provided.
The default implementation of this method returns the first line of the specified test method’s docstring.
- skipTest(reason)
Skip this test.
- subTest(msg=<object object>, **params)
Return a context manager that will return the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.
- tearDown()
Hook method for deconstructing the test fixture after testing it.
- classmethod tearDownClass()
Hook method for deconstructing the class fixture after running all tests in the class.
- testBorutaFilter = None
- testBorutaFilter_0(**kw)
Test the Boruta filter, which removes the features which are statistically as relevant as random features [with use_index_cols=True].
- testBorutaFilter_1(**kw)
Test the Boruta filter, which removes the features which are statistically as relevant as random features [with use_index_cols=False].
- testHighCorrelationFilter = None
- testHighCorrelationFilter_0(**kw)
Test the high correlation filter, which drops features with a correlation above a threshold [with use_index_cols=True].
- testHighCorrelationFilter_1(**kw)
Test the high correlation filter, which drops features with a correlation above a threshold [with use_index_cols=False].
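The general idea of dropping features whose correlation exceeds a threshold can be sketched in plain Python. This is an illustrative assumption, not the filter QSPRpred tests here: the functions pearson and drop_correlated, the greedy keep-first strategy, the use of the absolute Pearson coefficient and the 0.9 threshold are all choices made for this example.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def drop_correlated(features, threshold=0.9):
    """Keep a feature only if |r| with every already kept feature is <= threshold."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


features = {
    "f1": [1.0, 2.0, 3.0, 4.0],
    "f2": [2.0, 4.0, 6.0, 8.0],  # a multiple of f1, so it correlates perfectly
    "f3": [1.0, -1.0, 1.0, -1.0],
}
kept = drop_correlated(features)
print(kept)
```

The sketch assumes no feature is constant, since a zero variance would make the coefficient undefined.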
- testLowVarianceFilter = None
- class qsprpred.data.processing.tests.TestFeatureTransformers(methodName='runTest')[source]
Bases:
QSPRTestCase, StepCheckMixIn
Test the sklearn step which wraps a sklearn transformer.
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
- classmethod addClassCleanup(function, /, *args, **kwargs)
Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).
- addCleanup(function, /, *args, **kwargs)
Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.
Cleanup items are called even if setUp fails (unlike tearDown).
- addTypeEqualityFunc(typeobj, function)
Add a type specific assertEqual style function to compare a type.
This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.
- Parameters:
typeobj – The data type to call this function on when both values are of the same type in assertEqual().
function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.
- assertAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
If the two objects compare equal then they will automatically compare almost equal.
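The difference between decimal places and an explicit delta can be illustrated with a short sketch. It uses a bare unittest.TestCase instance, which is sufficient for calling the assert methods outside a test run; the variable names are illustrative.

```python
import unittest

case = unittest.TestCase()  # a bare instance is enough to call assert methods

# Default places=7: the difference of about 1e-8 rounds to zero, so this passes.
case.assertAlmostEqual(1.00000001, 1.0)

# 1234.5 and 1234.6 share four significant digits but differ in the first
# decimal place, so a places-based comparison fails ...
try:
    case.assertAlmostEqual(1234.5, 1234.6, places=1)
    places_check_passed = True
except AssertionError:
    places_check_passed = False

# ... while an explicit tolerance accepts the same pair.
case.assertAlmostEqual(1234.5, 1234.6, delta=0.2)

# Supplying both places and delta is rejected with a TypeError.
try:
    case.assertAlmostEqual(1.0, 1.0001, places=3, delta=0.01)
    both_rejected = False
except TypeError:
    both_rejected = True

print(places_check_passed, both_rejected)
```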
- assertCountEqual(first, second, msg=None)
Asserts that two iterables have the same elements, the same number of times, without regard to order.
- self.assertEqual(Counter(list(first)), Counter(list(second)))
- Example:
[0, 1, 1] and [1, 0, 1] compare equal.
[0, 0, 1] and [0, 1] compare unequal.
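The two cases from the example above can be checked directly with a bare unittest.TestCase instance. The last call is an extra illustration showing that unhashable elements such as lists are also accepted.

```python
import unittest

case = unittest.TestCase()

# Same elements with the same multiplicities: order is irrelevant.
case.assertCountEqual([0, 1, 1], [1, 0, 1])

# Same distinct values, but 0 occurs twice on the left only.
try:
    case.assertCountEqual([0, 0, 1], [0, 1])
    counts_match = True
except AssertionError:
    counts_match = False

# Unhashable elements are compared as well.
case.assertCountEqual([[1, 2], [3]], [[3], [1, 2]])

print(counts_match)
```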
- assertDictEqual(d1, d2, msg=None)
- assertEndsWith(s, suffix, msg=None)
- assertEqual(first, second, msg=None)
Fail if the two objects are unequal as determined by the ‘==’ operator.
- assertFalse(expr, msg=None)
Check that the expression is false.
- assertGreater(a, b, msg=None)
Just like self.assertTrue(a > b), but with a nicer default message.
- assertGreaterEqual(a, b, msg=None)
Just like self.assertTrue(a >= b), but with a nicer default message.
- assertHasAttr(obj, name, msg=None)
- assertIn(member, container, msg=None)
Just like self.assertTrue(a in b), but with a nicer default message.
- assertIs(expr1, expr2, msg=None)
Just like self.assertTrue(a is b), but with a nicer default message.
- assertIsInstance(obj, cls, msg=None)
Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.
- assertIsNone(obj, msg=None)
Same as self.assertTrue(obj is None), with a nicer default message.
- assertIsNot(expr1, expr2, msg=None)
Just like self.assertTrue(a is not b), but with a nicer default message.
- assertIsNotNone(obj, msg=None)
Included for symmetry with assertIsNone.
- assertIsSubclass(cls, superclass, msg=None)
- assertLess(a, b, msg=None)
Just like self.assertTrue(a < b), but with a nicer default message.
- assertLessEqual(a, b, msg=None)
Just like self.assertTrue(a <= b), but with a nicer default message.
- assertListEqual(list1, list2, msg=None)
A list-specific equality assertion.
- Parameters:
list1 – The first list to compare.
list2 – The second list to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertLogs(logger=None, level=None)
Fail unless a log message of level level or higher is emitted on logger_name or its children. If omitted, level defaults to INFO and logger defaults to the root logger.
This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.
Example:
    with self.assertLogs('foo', level='INFO') as cm:
        logging.getLogger('foo').info('first message')
        logging.getLogger('foo.bar').error('second message')
    self.assertEqual(cm.output, ['INFO:foo:first message',
                                 'ERROR:foo.bar:second message'])
- assertMultiLineEqual(first, second, msg=None)
Assert that two multi-line strings are equal.
- assertNoLogs(logger=None, level=None)
Fail unless no log messages of level level or higher are emitted on logger_name or its children.
This method must be used as a context manager.
- assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
Objects that are equal automatically fail.
- assertNotEndsWith(s, suffix, msg=None)
- assertNotEqual(first, second, msg=None)
Fail if the two objects are equal as determined by the ‘!=’ operator.
- assertNotHasAttr(obj, name, msg=None)
- assertNotIn(member, container, msg=None)
Just like self.assertTrue(a not in b), but with a nicer default message.
- assertNotIsInstance(obj, cls, msg=None)
Included for symmetry with assertIsInstance.
- assertNotIsSubclass(cls, superclass, msg=None)
- assertNotRegex(text, unexpected_regex, msg=None)
Fail the test if the text matches the regular expression.
- assertNotStartsWith(s, prefix, msg=None)
- assertRaises(expected_exception, *args, **kwargs)
Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.
If called with the callable and arguments omitted, will return a context object used like this:
    with self.assertRaises(SomeException):
        do_something()
An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.
The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:
    with self.assertRaises(SomeException) as cm:
        do_something()
    the_exception = cm.exception
    self.assertEqual(the_exception.error_code, 3)
- assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)
Asserts that the message in a raised exception matches a regex.
- Parameters:
expected_exception – Exception class expected to be raised.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.
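A short sketch of assertRaisesRegex used as a context manager follows. The function parse_ratio is hypothetical and exists only to raise a ValueError with a predictable message.

```python
import unittest


def parse_ratio(text):
    """Hypothetical helper: parse a float that must lie in [0, 1]."""
    value = float(text)
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"ratio must be in [0, 1], got {value}")
    return value


case = unittest.TestCase()

# The pattern is searched for in str(exception); brackets are escaped
# because they are regex metacharacters.
with case.assertRaisesRegex(ValueError, r"ratio must be in \[0, 1\]") as cm:
    parse_ratio("1.5")

message = str(cm.exception)
print(message)
```

As with assertRaises, the context object keeps the caught exception, so further details can be inspected after the block.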
- assertRegex(text, expected_regex, msg=None)
Fail the test unless the text matches the regular expression.
- assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)
An equality assertion for ordered sequences (like lists and tuples).
For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.
- Parameters:
seq1 – The first sequence to compare.
seq2 – The second sequence to compare.
seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.
msg – Optional message to use on failure instead of a list of differences.
- assertSetEqual(set1, set2, msg=None)
A set-specific equality assertion.
- Parameters:
set1 – The first set to compare.
set2 – The second set to compare.
msg – Optional message to use on failure instead of a list of differences.
assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).
- assertStartsWith(s, prefix, msg=None)
- assertTrue(expr, msg=None)
Check that the expression is true.
- assertTupleEqual(tuple1, tuple2, msg=None)
A tuple-specific equality assertion.
- Parameters:
tuple1 – The first tuple to compare.
tuple2 – The second tuple to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertWarns(expected_warning, *args, **kwargs)
Fail unless a warning of class warnClass is triggered by the callable when invoked with specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.
If called with the callable and arguments omitted, will return a context object used like this:
    with self.assertWarns(SomeWarning):
        do_something()
An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.
The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:
    with self.assertWarns(SomeWarning) as cm:
        do_something()
    the_warning = cm.warning
    self.assertEqual(the_warning.some_attribute, 147)
- assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)
Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.
- Parameters:
expected_warning – Warning class expected to be triggered.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.
- checkFitTransform(step: Step, dataset: QSPRTable, fromfile=False) Tuple[DataFrame, DataFrame | None]
Check basic step fit and transform functionality.
- checkStep(step: Step, dataset: QSPRTable) Tuple[DataFrame, DataFrame | None]
Check basic step functionality and serialization.
- clearGenerated()
Remove the directories that are used for testing.
- countTestCases()
- createLargeMultitaskDataSet(name='QSPRDataset_multi_test', target_props=[{'name': 'HBD', 'task': TargetTasks.MULTICLASS, 'th': [-1, 1, 2, 100]}, {'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42)
Create a large dataset for testing purposes.
- Parameters:
- Returns:
a QSPRDataSet object
- Return type:
- createLargeTestDataSet(name='QSPRDataset_test_large', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42, n_jobs=1, chunk_size=None, drop_empty_target_props=True)
Create a large dataset for testing purposes.
- Parameters:
- Returns:
a QSPRDataSet object
- Return type:
- createSmallTestDataSet(name='QSPRDataset_test_small', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42, drop_empty_target_props=True)
Create a small dataset for testing purposes.
- Parameters:
- Returns:
a QSPRDataSet object
- Return type:
- createTestDataSetFromFrame(df, name='QSPRDataset_test', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=None, n_jobs=1, chunk_size=None, drop_empty_target_props=True)
Create a dataset for testing purposes from the given data frame.
- Parameters:
df (pd.DataFrame) – data frame containing the dataset
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
prep (dict) – dictionary containing preparation settings
n_jobs (int) – number of jobs to use for parallel processing
chunk_size (int) – size of chunks to use per job in parallel processing
- Returns:
a QSPRDataSet object
- Return type:
- debug()
Run the test without collecting errors in a TestResult
- defaultTestResult()
- classmethod doClassCleanups()
Execute all class cleanup functions. Normally called for you after tearDownClass.
- doCleanups()
Execute all cleanup functions. Normally called for you after tearDown.
- classmethod enterClassContext(cm)
Same as enterContext, but class-wide.
- enterContext(cm)
Enters the supplied context manager.
If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.
- fail(msg=None)
Fail immediately, with the given message.
- failureException
alias of AssertionError
- classmethod getAllDescriptorSets()
Return a list of (ideally) all available descriptor sets. For now they need to be added manually to the list below.
TODO: would be nice to create the list automatically by implementing a descriptor set registry that would hold all installed descriptor sets.
- getBigDF()
Get a large data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- classmethod getDataPrepGrid()
Return a list of many possible combinations of descriptor calculators, splits, feature standardizers, feature filters and data filters. Again, this is not exhaustive, but should cover a lot of cases.
- Returns:
a generator that yields tuples of all possible combinations as stated above, each tuple is defined as: (descriptor_calculator, split, feature_standardizer, feature_filters, data_filters)
- Return type:
grid
- classmethod getDefaultCalculatorCombo()
Makes a list of default descriptor calculators that can be used in tests.
It creates one calculator with only Morgan fingerprints, one with only RDKit descriptors, and one with both, to test behaviour with multiple descriptor sets. Override this method if you want to test with other descriptor sets and calculator combinations.
- static getDefaultPrep(add_imputer=None)
Return a dictionary with default preparation settings.
- classmethod getPrepCombos()
Return a list of all possible preparation combinations as generated by
getDataPrepGrid as well as their names. The generated list can be used to parameterize tests with the given named combinations.
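The general idea of a preparation grid with named combinations can be sketched with itertools.product. Everything below is an assumption made for illustration: the option lists hold placeholder strings rather than real calculator, split, standardizer, or filter objects, and the functions get_data_prep_grid and get_prep_combos are stand-ins, not the actual implementations of getDataPrepGrid and getPrepCombos.

```python
import itertools

# Placeholder options (assumed for illustration only).
descriptor_calculators = ["fingerprints", "descriptors"]
splits = [None, "random_split"]
feature_standardizers = [None, "standard_scaler"]
feature_filters = [None, "correlation_filter"]
data_filters = [None, "duplicate_filter"]


def get_data_prep_grid():
    """Yield every (calculator, split, standardizer, feature filter, data filter) tuple."""
    return itertools.product(
        descriptor_calculators,
        splits,
        feature_standardizers,
        feature_filters,
        data_filters,
    )


def get_prep_combos():
    """Prefix each combination with a readable name built from its parts."""
    return [
        ("_".join(str(part) for part in combo), *combo)
        for combo in get_data_prep_grid()
    ]


combos = get_prep_combos()
print(len(combos))
print(combos[0])
```

A list of this shape, where the first element of each tuple is a name, is the kind of input that test parameterization helpers typically accept.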
- getSmallDF()
Get a small data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- getStorage(df, name, n_jobs=1, chunk_size=None)
- id()
- longMessage = True
- maxDiff = 640
- run(result=None)
- classmethod setUpClass()
Hook method for setting up class fixture before running tests in the class.
- setUpPaths()
Create the directories that are used for testing.
- shortDescription()
Returns a one-line description of the test, or None if no description has been provided.
The default implementation of this method returns the first line of the specified test method’s docstring.
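The default behaviour can be checked with a small sketch. The class DocstringTest and its methods are illustrative only.

```python
import unittest


class DocstringTest(unittest.TestCase):
    def test_documented(self):
        """Check the documented behaviour.

        Further lines of the docstring are not part of the short description.
        """

    def test_undocumented(self):
        pass


# Instantiating a test case with a method name does not run the test.
documented = DocstringTest("test_documented").shortDescription()
undocumented = DocstringTest("test_undocumented").shortDescription()

print(documented)
print(undocumented)
```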
- skipTest(reason)
Skip this test.
- subTest(msg=<object object>, **params)
Return a context manager that will return the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.
- tearDown()
Hook method for deconstructing the test fixture after testing it.
- classmethod tearDownClass()
Hook method for deconstructing the class fixture after running all tests in the class.
- class qsprpred.data.processing.tests.TestImputers(methodName='runTest')[source]
Bases:
QSPRTestCase, StepCheckMixIn
Test the sklearn step which wraps a sklearn imputer.
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
- classmethod addClassCleanup(function, /, *args, **kwargs)
Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).
- addCleanup(function, /, *args, **kwargs)
Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.
Cleanup items are called even if setUp fails (unlike tearDown).
- addTypeEqualityFunc(typeobj, function)
Add a type specific assertEqual style function to compare a type.
This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.
- Parameters:
typeobj – The data type to call this function on when both values are of the same type in assertEqual().
function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.
- assertAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
If the two objects compare equal then they will automatically compare almost equal.
- assertCountEqual(first, second, msg=None)
Asserts that two iterables have the same elements, the same number of times, without regard to order.
- self.assertEqual(Counter(list(first)), Counter(list(second)))
- Example:
[0, 1, 1] and [1, 0, 1] compare equal.
[0, 0, 1] and [0, 1] compare unequal.
- assertDictEqual(d1, d2, msg=None)
- assertEndsWith(s, suffix, msg=None)
- assertEqual(first, second, msg=None)
Fail if the two objects are unequal as determined by the ‘==’ operator.
- assertFalse(expr, msg=None)
Check that the expression is false.
- assertGreater(a, b, msg=None)
Just like self.assertTrue(a > b), but with a nicer default message.
- assertGreaterEqual(a, b, msg=None)
Just like self.assertTrue(a >= b), but with a nicer default message.
- assertHasAttr(obj, name, msg=None)
- assertIn(member, container, msg=None)
Just like self.assertTrue(a in b), but with a nicer default message.
- assertIs(expr1, expr2, msg=None)
Just like self.assertTrue(a is b), but with a nicer default message.
- assertIsInstance(obj, cls, msg=None)
Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.
- assertIsNone(obj, msg=None)
Same as self.assertTrue(obj is None), with a nicer default message.
- assertIsNot(expr1, expr2, msg=None)
Just like self.assertTrue(a is not b), but with a nicer default message.
- assertIsNotNone(obj, msg=None)
Included for symmetry with assertIsNone.
- assertIsSubclass(cls, superclass, msg=None)
- assertLess(a, b, msg=None)
Just like self.assertTrue(a < b), but with a nicer default message.
- assertLessEqual(a, b, msg=None)
Just like self.assertTrue(a <= b), but with a nicer default message.
- assertListEqual(list1, list2, msg=None)
A list-specific equality assertion.
- Parameters:
list1 – The first list to compare.
list2 – The second list to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertLogs(logger=None, level=None)
Fail unless a log message of level level or higher is emitted on logger_name or its children. If omitted, level defaults to INFO and logger defaults to the root logger.
This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.
Example:
    with self.assertLogs('foo', level='INFO') as cm:
        logging.getLogger('foo').info('first message')
        logging.getLogger('foo.bar').error('second message')
    self.assertEqual(cm.output, ['INFO:foo:first message',
                                 'ERROR:foo.bar:second message'])
- assertMultiLineEqual(first, second, msg=None)
Assert that two multi-line strings are equal.
- assertNoLogs(logger=None, level=None)
Fail unless no log messages of level level or higher are emitted on logger_name or its children.
This method must be used as a context manager.
- assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
Objects that are equal automatically fail.
- assertNotEndsWith(s, suffix, msg=None)
- assertNotEqual(first, second, msg=None)
Fail if the two objects are equal as determined by the ‘!=’ operator.
- assertNotHasAttr(obj, name, msg=None)
- assertNotIn(member, container, msg=None)
Just like self.assertTrue(a not in b), but with a nicer default message.
- assertNotIsInstance(obj, cls, msg=None)
Included for symmetry with assertIsInstance.
- assertNotIsSubclass(cls, superclass, msg=None)
- assertNotRegex(text, unexpected_regex, msg=None)
Fail the test if the text matches the regular expression.
- assertNotStartsWith(s, prefix, msg=None)
- assertRaises(expected_exception, *args, **kwargs)
Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.
If called with the callable and arguments omitted, will return a context object used like this:
    with self.assertRaises(SomeException):
        do_something()
An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.
The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:
    with self.assertRaises(SomeException) as cm:
        do_something()
    the_exception = cm.exception
    self.assertEqual(the_exception.error_code, 3)
- assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)
Asserts that the message in a raised exception matches a regex.
- Parameters:
expected_exception – Exception class expected to be raised.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.
- assertRegex(text, expected_regex, msg=None)
Fail the test unless the text matches the regular expression.
- assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)
An equality assertion for ordered sequences (like lists and tuples).
For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.
- Parameters:
seq1 – The first sequence to compare.
seq2 – The second sequence to compare.
seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.
msg – Optional message to use on failure instead of a list of differences.
- assertSetEqual(set1, set2, msg=None)
A set-specific equality assertion.
- Parameters:
set1 – The first set to compare.
set2 – The second set to compare.
msg – Optional message to use on failure instead of a list of differences.
assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).
- assertStartsWith(s, prefix, msg=None)
- assertTrue(expr, msg=None)
Check that the expression is true.
- assertTupleEqual(tuple1, tuple2, msg=None)
A tuple-specific equality assertion.
- Parameters:
tuple1 – The first tuple to compare.
tuple2 – The second tuple to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertWarns(expected_warning, *args, **kwargs)
Fail unless a warning of class warnClass is triggered by the callable when invoked with specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.
If called with the callable and arguments omitted, will return a context object used like this:
    with self.assertWarns(SomeWarning):
        do_something()
An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.
The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:
    with self.assertWarns(SomeWarning) as cm:
        do_something()
    the_warning = cm.warning
    self.assertEqual(the_warning.some_attribute, 147)
- assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)
Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.
- Parameters:
expected_warning – Warning class expected to be triggered.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.
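A minimal sketch of assertWarnsRegex as a context manager is shown below. The function legacy_scale is hypothetical; it returns a normal result and emits a DeprecationWarning whose message is matched against the pattern.

```python
import unittest
import warnings


def legacy_scale(values):
    """Hypothetical function that still works but warns about its replacement."""
    warnings.warn(
        "legacy_scale() is deprecated; use scale() instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return [v / max(values) for v in values]


case = unittest.TestCase()

# Only warnings of the expected class whose message matches the pattern count.
with case.assertWarnsRegex(DeprecationWarning, r"deprecated; use \w+\(\)") as cm:
    scaled = legacy_scale([1, 2, 4])

warning_text = str(cm.warning)
print(scaled)
print(warning_text)
```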
- checkFitTransform(step: Step, dataset: QSPRTable, fromfile=False) Tuple[DataFrame, DataFrame | None]
Check basic step fit and transform functionality.
- checkStep(step: Step, dataset: QSPRTable) Tuple[DataFrame, DataFrame | None]
Check basic step functionality and serialization.
- clearGenerated()
Remove the directories that are used for testing.
- countTestCases()
- createLargeMultitaskDataSet(name='QSPRDataset_multi_test', target_props=[{'name': 'HBD', 'task': TargetTasks.MULTICLASS, 'th': [-1, 1, 2, 100]}, {'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42)
Create a large dataset for testing purposes.
- Parameters:
- Returns:
a QSPRDataSet object
- Return type:
- createLargeTestDataSet(name='QSPRDataset_test_large', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42, n_jobs=1, chunk_size=None, drop_empty_target_props=True)
Create a large dataset for testing purposes.
- Parameters:
- Returns:
a QSPRDataSet object
- Return type:
- createSmallTestDataSet(name='QSPRDataset_test_small', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42, drop_empty_target_props=True)
Create a small dataset for testing purposes.
- Parameters:
- Returns:
a QSPRDataSet object
- Return type:
- createTestDataSetFromFrame(df, name='QSPRDataset_test', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=None, n_jobs=1, chunk_size=None, drop_empty_target_props=True)
Create a dataset for testing purposes from the given data frame.
- Parameters:
df (pd.DataFrame) – data frame containing the dataset
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
prep (dict) – dictionary containing preparation settings
n_jobs (int) – number of jobs to use for parallel processing
chunk_size (int) – size of chunks to use per job in parallel processing
- Returns:
a QSPRDataSet object
- Return type:
- debug()
Run the test without collecting errors in a TestResult
- defaultTestResult()
- classmethod doClassCleanups()
Execute all class cleanup functions. Normally called for you after tearDownClass.
- doCleanups()
Execute all cleanup functions. Normally called for you after tearDown.
- classmethod enterClassContext(cm)
Same as enterContext, but class-wide.
- enterContext(cm)
Enters the supplied context manager.
If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.
- fail(msg=None)
Fail immediately, with the given message.
- failureException
alias of AssertionError
- classmethod getAllDescriptorSets()
Return a list of (ideally) all available descriptor sets. For now they need to be added manually to the list below.
TODO: would be nice to create the list automatically by implementing a descriptor set registry that would hold all installed descriptor sets.
- getBigDF()
Get a large data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- classmethod getDataPrepGrid()
Return a list of many possible combinations of descriptor calculators, splits, feature standardizers, feature filters and data filters. Again, this is not exhaustive, but should cover a lot of cases.
- Returns:
a generator that yields tuples of all possible combinations as stated above, each tuple is defined as: (descriptor_calculator, split, feature_standardizer, feature_filters, data_filters)
- Return type:
grid
- classmethod getDefaultCalculatorCombo()
Makes a list of default descriptor calculators that can be used in tests.
It creates one calculator with only Morgan fingerprints, one with only RDKit descriptors, and one with both, to test behaviour with multiple descriptor sets. Override this method if you want to test with other descriptor sets and calculator combinations.
- static getDefaultPrep(add_imputer=None)
Return a dictionary with default preparation settings.
- classmethod getPrepCombos()
Return a list of all possible preparation combinations as generated by
getDataPrepGrid as well as their names. The generated list can be used to parameterize tests with the given named combinations.
- getSmallDF()
Get a small data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- getStorage(df, name, n_jobs=1, chunk_size=None)
- id()
- longMessage = True
- maxDiff = 640
- run(result=None)
- classmethod setUpClass()
Hook method for setting up class fixture before running tests in the class.
- setUpPaths()
Create the directories that are used for testing.
- shortDescription()
Returns a one-line description of the test, or None if no description has been provided.
The default implementation of this method returns the first line of the specified test method’s docstring.
- skipTest(reason)
Skip this test.
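One common reason to call skipTest from inside a test is a missing optional dependency. The sketch below assumes a made-up module name, hypothetical_optional_dep, which is not expected to be installed.

```python
import importlib.util
import unittest

REASON = "hypothetical_optional_dep is not installed"


class OptionalDependencyTest(unittest.TestCase):
    def test_with_optional_module(self):
        # The module name is invented for this illustration.
        if importlib.util.find_spec("hypothetical_optional_dep") is None:
            self.skipTest(REASON)
        self.fail("only reached when the module is importable")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(OptionalDependencyTest)
result = unittest.TestResult()
suite.run(result)

for test, reason in result.skipped:
    print(test.id(), "skipped:", reason)
```

A skipped test is recorded separately from failures and errors, so the run still counts as successful.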
- subTest(msg=<object object>, **params)
Return a context manager that will return the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.
- tearDown()
Hook method for deconstructing the test fixture after testing it.
- classmethod tearDownClass()
Hook method for deconstructing the class fixture after running all tests in the class.
- class qsprpred.data.processing.tests.TestMolProcessor(methodName='runTest')[source]
Bases:
DataSetsPathMixIn, QSPRTestCase
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
- class TestingProcessor(id_prop)[source]
Bases:
MolProcessor
- property requiredProps: list[str]
The properties required by the processor. This is to inform the caller that the processor requires certain properties to be passed to the __call__ method or via the props attribute of StoredMol instances.
- property supportsParallel
Whether the processor supports parallel processing.
- classmethod addClassCleanup(function, /, *args, **kwargs)
Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).
- addCleanup(function, /, *args, **kwargs)
Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.
Cleanup items are called even if setUp fails (unlike tearDown).
- addTypeEqualityFunc(typeobj, function)
Add a type specific assertEqual style function to compare a type.
This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.
- Parameters:
typeobj – The data type to call this function on when both values are of the same type in assertEqual().
function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.
- assertAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
If the two objects compare equal then they will automatically compare almost equal.
- assertCountEqual(first, second, msg=None)
Asserts that two iterables have the same elements, the same number of times, without regard to order.
self.assertEqual(Counter(list(first)), Counter(list(second)))
- Example:
[0, 1, 1] and [1, 0, 1] compare equal.
[0, 0, 1] and [0, 1] compare unequal.
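The Counter comparison quoted above can be tried directly. This sketch assumes hashable elements, which Counter requires; same_elements is a hypothetical helper name.

```python
import unittest
from collections import Counter


def same_elements(first, second):
    """Order-insensitive comparison that still respects multiplicity."""
    return Counter(list(first)) == Counter(list(second))


equal = same_elements([0, 1, 1], [1, 0, 1])
unequal = same_elements([0, 0, 1], [0, 1])

# The assertion method agrees with the helper on both examples.
case = unittest.TestCase()
case.assertCountEqual([0, 1, 1], [1, 0, 1])  # passes silently
try:
    case.assertCountEqual([0, 0, 1], [0, 1])
    raised = False
except AssertionError:
    raised = True

print(equal, unequal, raised)
```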
- assertDictEqual(d1, d2, msg=None)
- assertEndsWith(s, suffix, msg=None)
- assertEqual(first, second, msg=None)
Fail if the two objects are unequal as determined by the ‘==’ operator.
- assertFalse(expr, msg=None)
Check that the expression is false.
- assertGreater(a, b, msg=None)
Just like self.assertTrue(a > b), but with a nicer default message.
- assertGreaterEqual(a, b, msg=None)
Just like self.assertTrue(a >= b), but with a nicer default message.
- assertHasAttr(obj, name, msg=None)
- assertIn(member, container, msg=None)
Just like self.assertTrue(a in b), but with a nicer default message.
- assertIs(expr1, expr2, msg=None)
Just like self.assertTrue(a is b), but with a nicer default message.
- assertIsInstance(obj, cls, msg=None)
Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.
- assertIsNone(obj, msg=None)
Same as self.assertTrue(obj is None), with a nicer default message.
- assertIsNot(expr1, expr2, msg=None)
Just like self.assertTrue(a is not b), but with a nicer default message.
- assertIsNotNone(obj, msg=None)
Included for symmetry with assertIsNone.
- assertIsSubclass(cls, superclass, msg=None)
- assertLess(a, b, msg=None)
Just like self.assertTrue(a < b), but with a nicer default message.
- assertLessEqual(a, b, msg=None)
Just like self.assertTrue(a <= b), but with a nicer default message.
- assertListEqual(list1, list2, msg=None)
A list-specific equality assertion.
- Parameters:
list1 – The first list to compare.
list2 – The second list to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertLogs(logger=None, level=None)
Fail unless a log message of level level or higher is emitted on logger_name or its children. If omitted, level defaults to INFO and logger defaults to the root logger.
This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.
Example:
with self.assertLogs('foo', level='INFO') as cm:
    logging.getLogger('foo').info('first message')
    logging.getLogger('foo.bar').error('second message')
self.assertEqual(cm.output, ['INFO:foo:first message',
                             'ERROR:foo.bar:second message'])
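A self-contained variant of the same pattern, showing both the output and records attributes. The load_config function and the "app.config" logger name are hypothetical and exist only for this sketch.

```python
import logging
import unittest


def load_config(path):
    # Hypothetical function under test: it only logs a warning.
    logging.getLogger("app.config").warning("missing file: %s", path)
    return {}


class LoadConfigTest(unittest.TestCase):
    def test_warns_about_missing_file(self):
        # Child loggers of "app" are captured as well.
        with self.assertLogs("app", level="WARNING") as cm:
            load_config("settings.toml")
        self.assertEqual(
            cm.output, ["WARNING:app.config:missing file: settings.toml"]
        )
        self.assertEqual(cm.records[0].levelno, logging.WARNING)


result = unittest.TestResult()
LoadConfigTest("test_warns_about_missing_file").run(result)
print(result.wasSuccessful())
```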
- assertMultiLineEqual(first, second, msg=None)
Assert that two multi-line strings are equal.
- assertNoLogs(logger=None, level=None)
Fail unless no log messages of level level or higher are emitted on logger_name or its children.
This method must be used as a context manager.
- assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
Objects that are equal automatically fail.
- assertNotEndsWith(s, suffix, msg=None)
- assertNotEqual(first, second, msg=None)
Fail if the two objects are equal as determined by the ‘!=’ operator.
- assertNotHasAttr(obj, name, msg=None)
- assertNotIn(member, container, msg=None)
Just like self.assertTrue(a not in b), but with a nicer default message.
- assertNotIsInstance(obj, cls, msg=None)
Included for symmetry with assertIsInstance.
- assertNotIsSubclass(cls, superclass, msg=None)
- assertNotRegex(text, unexpected_regex, msg=None)
Fail the test if the text matches the regular expression.
- assertNotStartsWith(s, prefix, msg=None)
- assertRaises(expected_exception, *args, **kwargs)
Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.
If called with the callable and arguments omitted, will return a context object used like this:
with self.assertRaises(SomeException):
    do_something()
An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.
The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:
with self.assertRaises(SomeException) as cm:
    do_something()
the_exception = cm.exception
self.assertEqual(the_exception.error_code, 3)
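Both calling conventions can be compared side by side in a runnable sketch. parse_positive is a hypothetical function invented for the example.

```python
import unittest


def parse_positive(text):
    # Hypothetical function under test.
    value = int(text)
    if value <= 0:
        raise ValueError(f"expected a positive integer, got {value}")
    return value


class ParsePositiveTest(unittest.TestCase):
    def test_callable_form(self):
        # The callable and its arguments are passed separately.
        self.assertRaises(ValueError, parse_positive, "-3")

    def test_context_form(self):
        # The context form keeps the exception for later inspection.
        with self.assertRaises(ValueError) as cm:
            parse_positive("0")
        self.assertIn("got 0", str(cm.exception))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ParsePositiveTest)
result = unittest.TestResult()
suite.run(result)
print(result.testsRun, result.wasSuccessful())
```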
- assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)
Asserts that the message in a raised exception matches a regex.
- Parameters:
expected_exception – Exception class expected to be raised.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.
- assertRegex(text, expected_regex, msg=None)
Fail the test unless the text matches the regular expression.
- assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)
An equality assertion for ordered sequences (like lists and tuples).
For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.
- Parameters:
seq1 – The first sequence to compare.
seq2 – The second sequence to compare.
seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.
msg – Optional message to use on failure instead of a list of differences.
- assertSetEqual(set1, set2, msg=None)
A set-specific equality assertion.
- Parameters:
set1 – The first set to compare.
set2 – The second set to compare.
msg – Optional message to use on failure instead of a list of differences.
assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).
- assertStartsWith(s, prefix, msg=None)
- assertTrue(expr, msg=None)
Check that the expression is true.
- assertTupleEqual(tuple1, tuple2, msg=None)
A tuple-specific equality assertion.
- Parameters:
tuple1 – The first tuple to compare.
tuple2 – The second tuple to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertWarns(expected_warning, *args, **kwargs)
Fail unless a warning of class warnClass is triggered by the callable when invoked with specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.
If called with the callable and arguments omitted, will return a context object used like this:
with self.assertWarns(SomeWarning):
    do_something()
An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.
The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:
with self.assertWarns(SomeWarning) as cm:
    do_something()
the_warning = cm.warning
self.assertEqual(the_warning.some_attribute, 147)
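A runnable sketch of the context-manager form, inspecting the captured warning afterwards. old_api is a hypothetical deprecated function used only for illustration.

```python
import unittest
import warnings


def old_api():
    # Hypothetical deprecated function.
    warnings.warn("old_api() is deprecated", DeprecationWarning, stacklevel=2)
    return 42


class OldApiTest(unittest.TestCase):
    def test_deprecation(self):
        with self.assertWarns(DeprecationWarning) as cm:
            value = old_api()
        # The call still returns normally; only the warning is captured.
        self.assertEqual(value, 42)
        self.assertIn("deprecated", str(cm.warning))


result = unittest.TestResult()
OldApiTest("test_deprecation").run(result)
print(result.wasSuccessful())
```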
- assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)
Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.
- Parameters:
expected_warning – Warning class expected to be triggered.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.
- clearGenerated()
Remove the directories that are used for testing.
- countTestCases()
- createLargeMultitaskDataSet(name='QSPRDataset_multi_test', target_props=[{'name': 'HBD', 'task': TargetTasks.MULTICLASS, 'th': [-1, 1, 2, 100]}, {'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42)
Create a large dataset for testing purposes.
- Parameters:
- Returns:
a QSPRDataSet object
- Return type:
- createLargeTestDataSet(name='QSPRDataset_test_large', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42, n_jobs=1, chunk_size=None, drop_empty_target_props=True)
Create a large dataset for testing purposes.
- Parameters:
- Returns:
a QSPRDataSet object
- Return type:
- createSmallTestDataSet(name='QSPRDataset_test_small', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42, drop_empty_target_props=True)
Create a small dataset for testing purposes.
- Parameters:
- Returns:
a QSPRDataSet object
- Return type:
- createTestDataSetFromFrame(df, name='QSPRDataset_test', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=None, n_jobs=1, chunk_size=None, drop_empty_target_props=True)
Create a dataset for testing purposes from the given data frame.
- Parameters:
df (pd.DataFrame) – data frame containing the dataset
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
prep (dict) – dictionary containing preparation settings
n_jobs (int) – number of jobs to use for parallel processing
chunk_size (int) – size of chunks to use per job in parallel processing
- Returns:
a QSPRDataSet object
- Return type:
- debug()
Run the test without collecting errors in a TestResult
- defaultTestResult()
- classmethod doClassCleanups()
Execute all class cleanup functions. Normally called for you after tearDownClass.
- doCleanups()
Execute all cleanup functions. Normally called for you after tearDown.
- classmethod enterClassContext(cm)
Same as enterContext, but class-wide.
- enterContext(cm)
Enters the supplied context manager.
If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.
- fail(msg=None)
Fail immediately, with the given message.
- failureException
alias of AssertionError
- classmethod getAllDescriptorSets()
Return a list of (ideally) all available descriptor sets. For now they need to be added manually to the list below.
TODO: would be nice to create the list automatically by implementing a descriptor set registry that would hold all installed descriptor sets.
- getBigDF()
Get a large data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- classmethod getDataPrepGrid()
Return a list of many possible combinations of descriptor calculators, splits, feature standardizers, feature filters and data filters. Again, this is not exhaustive, but should cover a lot of cases.
- Returns:
a generator that yields tuples of all possible combinations as stated above, each tuple is defined as: (descriptor_calculator, split, feature_standardizer, feature_filters, data_filters)
- Return type:
grid
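One way such a grid could be produced is with itertools.product, yielding tuples in the order stated above. This is a sketch under assumptions, not the method's actual implementation: the option lists below are placeholder strings, whereas the real method presumably combines calculator, splitter, standardizer and filter objects.

```python
from itertools import product

# Placeholder options (assumptions for illustration only).
descriptor_calculators = ["morgan", "rdkit"]
splits = [None, "random_split"]
feature_standardizers = [None, "standard_scaler"]
feature_filters = [None, "low_variance_filter"]
data_filters = [None, "category_filter"]


def data_prep_grid():
    """Yield (descriptor_calculator, split, feature_standardizer,
    feature_filters, data_filters) tuples for every combination."""
    yield from product(
        descriptor_calculators,
        splits,
        feature_standardizers,
        feature_filters,
        data_filters,
    )


grid = list(data_prep_grid())
print(len(grid))  # 2 * 2 * 2 * 2 * 2 combinations
print(grid[0])
```

Because the result is a generator, it can be consumed only once; wrap it in list() when the combinations need to be reused.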
- classmethod getDefaultCalculatorCombo()
Makes a list of default descriptor calculators that can be used in tests.
It creates a calculator with only morgan fingerprints and rdkit descriptors, but also one with them both to test behaviour with multiple descriptor sets. Override this method if you want to test with other descriptor sets and calculator combinations.
- static getDefaultPrep(add_imputer=None)
Return a dictionary with default preparation settings.
- classmethod getPrepCombos()
Return a list of all possible preparation combinations as generated by getDataPrepGrid, as well as their names. The generated list can be used to parameterize tests with the given named combinations.
- getSmallDF()
Get a small data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- getStorage(df, name, n_jobs=1, chunk_size=None)
- id()
- longMessage = True
- maxDiff = 640
- run(result=None)
- classmethod setUpClass()
Hook method for setting up class fixture before running tests in the class.
- setUpPaths()
Create the directories that are used for testing.
- shortDescription()
Returns a one-line description of the test, or None if no description has been provided.
The default implementation of this method returns the first line of the specified test method’s docstring.
- skipTest(reason)
Skip this test.
- subTest(msg=<object object>, **params)
Return a context manager that will return the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.
- tearDown()
Remove all files and directories that are used for testing.
- classmethod tearDownClass()
Hook method for deconstructing the class fixture after running all tests in the class.
- testMolProcess = None
- testMolProcess_00_1_50_None_True_None_None(**kw)
- testMolProcess_01_1_50_None_True_None__a_1_(**kw)
- testMolProcess_02_1_50_None_True__1_2__None(**kw)
- testMolProcess_03_1_50_None_True__1_2___a_1_(**kw)
- testMolProcess_04_1_50_None_False_None_None(**kw)
- testMolProcess_05_1_50_None_False_None__a_1_(**kw)
- testMolProcess_06_1_50_None_False__1_2__None(**kw)
- testMolProcess_07_1_50_None_False__1_2___a_1_(**kw)
- testMolProcess_08_1_50__fu_CL__True_None_None(**kw)
- testMolProcess_09_1_50__fu_CL__True_None__a_1_(**kw)
- testMolProcess_10_1_50__fu_CL__True__1_2__None(**kw)
- testMolProcess_11_1_50__fu_CL__True__1_2___a_1_(**kw)
- testMolProcess_12_1_50__fu_CL__False_None_None(**kw)
- testMolProcess_13_1_50__fu_CL__False_None__a_1_(**kw)
- testMolProcess_14_1_50__fu_CL__False__1_2__None(**kw)
- testMolProcess_15_1_50__fu_CL__False__1_2___a_1_(**kw)
- testMolProcess_16_1_50__SMILES__True_None_None(**kw)
- testMolProcess_17_1_50__SMILES__True_None__a_1_(**kw)
- testMolProcess_18_1_50__SMILES__True__1_2__None(**kw)
- testMolProcess_19_1_50__SMILES__True__1_2___a_1_(**kw)
- testMolProcess_20_1_50__SMILES__False_None_None(**kw)
- testMolProcess_21_1_50__SMILES__False_None__a_1_(**kw)
- testMolProcess_22_1_50__SMILES__False__1_2__None(**kw)
- testMolProcess_23_1_50__SMILES__False__1_2___a_1_(**kw)
- testMolProcess_24_1_None_None_True_None_None(**kw)
- testMolProcess_25_1_None_None_True_None__a_1_(**kw)
- testMolProcess_26_1_None_None_True__1_2__None(**kw)
- testMolProcess_27_1_None_None_True__1_2___a_1_(**kw)
- testMolProcess_28_1_None_None_False_None_None(**kw)
- testMolProcess_29_1_None_None_False_None__a_1_(**kw)
- testMolProcess_30_1_None_None_False__1_2__None(**kw)
- testMolProcess_31_1_None_None_False__1_2___a_1_(**kw)
- testMolProcess_32_1_None__fu_CL__True_None_None(**kw)
- testMolProcess_33_1_None__fu_CL__True_None__a_1_(**kw)
- testMolProcess_34_1_None__fu_CL__True__1_2__None(**kw)
- testMolProcess_35_1_None__fu_CL__True__1_2___a_1_(**kw)
- testMolProcess_36_1_None__fu_CL__False_None_None(**kw)
- testMolProcess_37_1_None__fu_CL__False_None__a_1_(**kw)
- testMolProcess_38_1_None__fu_CL__False__1_2__None(**kw)
- testMolProcess_39_1_None__fu_CL__False__1_2___a_1_(**kw)
- testMolProcess_40_1_None__SMILES__True_None_None(**kw)
- testMolProcess_41_1_None__SMILES__True_None__a_1_(**kw)
- testMolProcess_42_1_None__SMILES__True__1_2__None(**kw)
- testMolProcess_43_1_None__SMILES__True__1_2___a_1_(**kw)
- testMolProcess_44_1_None__SMILES__False_None_None(**kw)
- testMolProcess_45_1_None__SMILES__False_None__a_1_(**kw)
- testMolProcess_46_1_None__SMILES__False__1_2__None(**kw)
- testMolProcess_47_1_None__SMILES__False__1_2___a_1_(**kw)
- testMolProcess_48_2_50_None_True_None_None(**kw)
- testMolProcess_49_2_50_None_True_None__a_1_(**kw)
- testMolProcess_50_2_50_None_True__1_2__None(**kw)
- testMolProcess_51_2_50_None_True__1_2___a_1_(**kw)
- testMolProcess_52_2_50_None_False_None_None(**kw)
- testMolProcess_53_2_50_None_False_None__a_1_(**kw)
- testMolProcess_54_2_50_None_False__1_2__None(**kw)
- testMolProcess_55_2_50_None_False__1_2___a_1_(**kw)
- testMolProcess_56_2_50__fu_CL__True_None_None(**kw)
- testMolProcess_57_2_50__fu_CL__True_None__a_1_(**kw)
- testMolProcess_58_2_50__fu_CL__True__1_2__None(**kw)
- testMolProcess_59_2_50__fu_CL__True__1_2___a_1_(**kw)
- testMolProcess_60_2_50__fu_CL__False_None_None(**kw)
- testMolProcess_61_2_50__fu_CL__False_None__a_1_(**kw)
- testMolProcess_62_2_50__fu_CL__False__1_2__None(**kw)
- testMolProcess_63_2_50__fu_CL__False__1_2___a_1_(**kw)
- testMolProcess_64_2_50__SMILES__True_None_None(**kw)
- testMolProcess_65_2_50__SMILES__True_None__a_1_(**kw)
- testMolProcess_66_2_50__SMILES__True__1_2__None(**kw)
- testMolProcess_67_2_50__SMILES__True__1_2___a_1_(**kw)
- testMolProcess_68_2_50__SMILES__False_None_None(**kw)
- testMolProcess_69_2_50__SMILES__False_None__a_1_(**kw)
- testMolProcess_70_2_50__SMILES__False__1_2__None(**kw)
- testMolProcess_71_2_50__SMILES__False__1_2___a_1_(**kw)
- testMolProcess_72_2_None_None_True_None_None(**kw)
- testMolProcess_73_2_None_None_True_None__a_1_(**kw)
- testMolProcess_74_2_None_None_True__1_2__None(**kw)
- testMolProcess_75_2_None_None_True__1_2___a_1_(**kw)
- testMolProcess_76_2_None_None_False_None_None(**kw)
- testMolProcess_77_2_None_None_False_None__a_1_(**kw)
- testMolProcess_78_2_None_None_False__1_2__None(**kw)
- testMolProcess_79_2_None_None_False__1_2___a_1_(**kw)
- testMolProcess_80_2_None__fu_CL__True_None_None(**kw)
- testMolProcess_81_2_None__fu_CL__True_None__a_1_(**kw)
- testMolProcess_82_2_None__fu_CL__True__1_2__None(**kw)
- testMolProcess_83_2_None__fu_CL__True__1_2___a_1_(**kw)
- testMolProcess_84_2_None__fu_CL__False_None_None(**kw)
- testMolProcess_85_2_None__fu_CL__False_None__a_1_(**kw)
- testMolProcess_86_2_None__fu_CL__False__1_2__None(**kw)
- testMolProcess_87_2_None__fu_CL__False__1_2___a_1_(**kw)
- testMolProcess_88_2_None__SMILES__True_None_None(**kw)
- testMolProcess_89_2_None__SMILES__True_None__a_1_(**kw)
- testMolProcess_90_2_None__SMILES__True__1_2__None(**kw)
- testMolProcess_91_2_None__SMILES__True__1_2___a_1_(**kw)
- testMolProcess_92_2_None__SMILES__False_None_None(**kw)
- testMolProcess_93_2_None__SMILES__False_None__a_1_(**kw)
- testMolProcess_94_2_None__SMILES__False__1_2__None(**kw)
- testMolProcess_95_2_None__SMILES__False__1_2___a_1_(**kw)
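The 96 generated names above are consistent with a Cartesian product of six option lists (2 × 2 × 3 × 2 × 2 × 2 = 96). The sketch below reconstructs the names from that observation alone. The variable names and the meaning attached to each position are assumptions, and safe_name is a hypothetical helper that replaces runs of non-alphanumeric characters with an underscore; none of this is taken from the test suite's source.

```python
import re
from itertools import product

# Option lists inferred from the generated names (labels are assumptions).
n_jobs_options = [1, 2]
chunk_size_options = [50, None]
props_options = [None, ["fu", "CL"], ["SMILES"]]
flag_options = [True, False]
args_options = [None, (1, 2)]
kwargs_options = [None, {"a": 1}]


def safe_name(text):
    """Replace every run of characters outside [a-zA-Z0-9_] with '_'."""
    return re.sub(r"[^a-zA-Z0-9_]+", "_", text)


combos = list(
    product(
        n_jobs_options,
        chunk_size_options,
        props_options,
        flag_options,
        args_options,
        kwargs_options,
    )
)
width = len(str(len(combos) - 1))  # zero-pad the index to a fixed width
names = [
    f"testMolProcess_{i:0{width}d}_"
    + safe_name("_".join(str(value) for value in combo))
    for i, combo in enumerate(combos)
]

print(len(names))
print(names[0])
print(names[-1])
```

The last option list varies fastest, which matches the listing: consecutive names differ first in the final component.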
- class qsprpred.data.processing.tests.TestPipeline(methodName='runTest')[source]
Bases:
DataSetsPathMixIn, QSPRTestCase
Test the dataset pipeline.
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
- classmethod addClassCleanup(function, /, *args, **kwargs)
Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).
- addCleanup(function, /, *args, **kwargs)
Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.
Cleanup items are called even if setUp fails (unlike tearDown).
- addTypeEqualityFunc(typeobj, function)
Add a type specific assertEqual style function to compare a type.
This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.
- Parameters:
typeobj – The data type to call this function on when both values are of the same type in assertEqual().
function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.
- assertAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
If the two objects compare equal then they will automatically compare almost equal.
- assertCountEqual(first, second, msg=None)
Asserts that two iterables have the same elements, the same number of times, without regard to order.
self.assertEqual(Counter(list(first)), Counter(list(second)))
- Example:
[0, 1, 1] and [1, 0, 1] compare equal.
[0, 0, 1] and [0, 1] compare unequal.
- assertDictEqual(d1, d2, msg=None)
- assertEndsWith(s, suffix, msg=None)
- assertEqual(first, second, msg=None)
Fail if the two objects are unequal as determined by the ‘==’ operator.
- assertFalse(expr, msg=None)
Check that the expression is false.
- assertGreater(a, b, msg=None)
Just like self.assertTrue(a > b), but with a nicer default message.
- assertGreaterEqual(a, b, msg=None)
Just like self.assertTrue(a >= b), but with a nicer default message.
- assertHasAttr(obj, name, msg=None)
- assertIn(member, container, msg=None)
Just like self.assertTrue(a in b), but with a nicer default message.
- assertIs(expr1, expr2, msg=None)
Just like self.assertTrue(a is b), but with a nicer default message.
- assertIsInstance(obj, cls, msg=None)
Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.
- assertIsNone(obj, msg=None)
Same as self.assertTrue(obj is None), with a nicer default message.
- assertIsNot(expr1, expr2, msg=None)
Just like self.assertTrue(a is not b), but with a nicer default message.
- assertIsNotNone(obj, msg=None)
Included for symmetry with assertIsNone.
- assertIsSubclass(cls, superclass, msg=None)
- assertLess(a, b, msg=None)
Just like self.assertTrue(a < b), but with a nicer default message.
- assertLessEqual(a, b, msg=None)
Just like self.assertTrue(a <= b), but with a nicer default message.
- assertListEqual(list1, list2, msg=None)
A list-specific equality assertion.
- Parameters:
list1 – The first list to compare.
list2 – The second list to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertLogs(logger=None, level=None)
Fail unless a log message of level level or higher is emitted on logger_name or its children. If omitted, level defaults to INFO and logger defaults to the root logger.
This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.
Example:
with self.assertLogs('foo', level='INFO') as cm:
    logging.getLogger('foo').info('first message')
    logging.getLogger('foo.bar').error('second message')
self.assertEqual(cm.output, ['INFO:foo:first message',
                             'ERROR:foo.bar:second message'])
- assertMultiLineEqual(first, second, msg=None)
Assert that two multi-line strings are equal.
- assertNoLogs(logger=None, level=None)
Fail unless no log messages of level level or higher are emitted on logger_name or its children.
This method must be used as a context manager.
- assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
Objects that are equal automatically fail.
- assertNotEndsWith(s, suffix, msg=None)
- assertNotEqual(first, second, msg=None)
Fail if the two objects are equal as determined by the ‘!=’ operator.
- assertNotHasAttr(obj, name, msg=None)
- assertNotIn(member, container, msg=None)
Just like self.assertTrue(a not in b), but with a nicer default message.
- assertNotIsInstance(obj, cls, msg=None)
Included for symmetry with assertIsInstance.
- assertNotIsSubclass(cls, superclass, msg=None)
- assertNotRegex(text, unexpected_regex, msg=None)
Fail the test if the text matches the regular expression.
- assertNotStartsWith(s, prefix, msg=None)
- assertRaises(expected_exception, *args, **kwargs)
Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.
If called with the callable and arguments omitted, will return a context object used like this:
with self.assertRaises(SomeException):
    do_something()
An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.
The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:
with self.assertRaises(SomeException) as cm:
    do_something()
the_exception = cm.exception
self.assertEqual(the_exception.error_code, 3)
- assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)
Asserts that the message in a raised exception matches a regex.
- Parameters:
expected_exception – Exception class expected to be raised.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.
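The parameters above can be exercised in both the callable and the context-manager form. In this sketch, set_threshold is a hypothetical function made up for the example. The pattern is searched for in the exception message, so regex metacharacters such as brackets need escaping.

```python
import unittest


def set_threshold(value):
    # Hypothetical function under test.
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"threshold must be in [0, 1], got {value}")
    return value


class ThresholdTest(unittest.TestCase):
    def test_callable_form(self):
        # args: the callable followed by its positional arguments.
        self.assertRaisesRegex(ValueError, r"got 1\.5", set_threshold, 1.5)

    def test_context_form(self):
        with self.assertRaisesRegex(ValueError, r"must be in \[0, 1\]"):
            set_threshold(-0.1)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ThresholdTest)
result = unittest.TestResult()
suite.run(result)
print(result.testsRun, result.wasSuccessful())
```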
- assertRegex(text, expected_regex, msg=None)
Fail the test unless the text matches the regular expression.
- assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)
An equality assertion for ordered sequences (like lists and tuples).
For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.
- Parameters:
seq1 – The first sequence to compare.
seq2 – The second sequence to compare.
seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.
msg – Optional message to use on failure instead of a list of differences.
- assertSetEqual(set1, set2, msg=None)
A set-specific equality assertion.
- Parameters:
set1 – The first set to compare.
set2 – The second set to compare.
msg – Optional message to use on failure instead of a list of differences.
assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).
- assertStartsWith(s, prefix, msg=None)
- assertTrue(expr, msg=None)
Check that the expression is true.
- assertTupleEqual(tuple1, tuple2, msg=None)
A tuple-specific equality assertion.
- Parameters:
tuple1 – The first tuple to compare.
tuple2 – The second tuple to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertWarns(expected_warning, *args, **kwargs)
Fail unless a warning of class warnClass is triggered by the callable when invoked with specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.
If called with the callable and arguments omitted, will return a context object used like this:
with self.assertWarns(SomeWarning):
    do_something()
An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.
The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:
with self.assertWarns(SomeWarning) as cm:
    do_something()
the_warning = cm.warning
self.assertEqual(the_warning.some_attribute, 147)
- assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)
Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.
- Parameters:
expected_warning – Warning class expected to be triggered.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.
- clearGenerated()
Remove the directories that are used for testing.
- countTestCases()
- createLargeMultitaskDataSet(name='QSPRDataset_multi_test', target_props=[{'name': 'HBD', 'task': TargetTasks.MULTICLASS, 'th': [-1, 1, 2, 100]}, {'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42)
Create a large dataset for testing purposes.
- Parameters:
- Returns:
a QSPRDataSet object
- Return type:
- createLargeTestDataSet(name='QSPRDataset_test_large', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42, n_jobs=1, chunk_size=None, drop_empty_target_props=True)
Create a large dataset for testing purposes.
- Parameters:
- Returns:
a QSPRDataSet object
- Return type:
- createSmallTestDataSet(name='QSPRDataset_test_small', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42, drop_empty_target_props=True)
Create a small dataset for testing purposes.
- Parameters:
- Returns:
a QSPRDataSet object
- Return type:
- createTestDataSetFromFrame(df, name='QSPRDataset_test', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=None, n_jobs=1, chunk_size=None, drop_empty_target_props=True)
Create a dataset for testing purposes from the given data frame.
- Parameters:
df (pd.DataFrame) – data frame containing the dataset
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
prep (dict) – dictionary containing preparation settings
n_jobs (int) – number of jobs to use for parallel processing
chunk_size (int) – size of chunks to use per job in parallel processing
- Returns:
a QSPRDataSet object
- Return type:
- debug()
Run the test without collecting errors in a TestResult
- defaultTestResult()
- classmethod doClassCleanups()
Execute all class cleanup functions. Normally called for you after tearDownClass.
- doCleanups()
Execute all cleanup functions. Normally called for you after tearDown.
- classmethod enterClassContext(cm)
Same as enterContext, but class-wide.
- enterContext(cm)
Enters the supplied context manager.
If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.
- fail(msg=None)
Fail immediately, with the given message.
- failureException
alias of
AssertionError
- classmethod getAllDescriptorSets()
Return a list of (ideally) all available descriptor sets. For now they need to be added manually to the list below.
TODO: would be nice to create the list automatically by implementing a descriptor set registry that would hold all installed descriptor sets.
- getBigDF()
Get a large data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- classmethod getDataPrepGrid()
Return a list of many possible combinations of descriptor calculators, splits, feature standardizers, feature filters and data filters. Again, this is not exhaustive, but should cover a lot of cases.
- Returns:
a generator that yields tuples of all possible combinations as stated above, each tuple is defined as: (descriptor_calculator, split, feature_standardizer, feature_filters, data_filters)
- Return type:
grid
- classmethod getDefaultCalculatorCombo()
Makes a list of default descriptor calculators that can be used in tests.
It creates a calculator with only morgan fingerprints and rdkit descriptors, but also one with them both to test behaviour with multiple descriptor sets. Override this method if you want to test with other descriptor sets and calculator combinations.
- static getDefaultPrep(add_imputer=None)
Return a dictionary with default preparation settings.
- classmethod getPrepCombos()
Return a list of all possible preparation combinations as generated by getDataPrepGrid as well as their names. The generated list can be used to parameterize tests with the given named combinations.
- getSmallDF()
Get a small data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- getStorage(df, name, n_jobs=1, chunk_size=None)
- id()
- longMessage = True
- maxDiff = 640
- run(result=None)
- classmethod setUpClass()
Hook method for setting up class fixture before running tests in the class.
- setUpPaths()
Create the directories that are used for testing.
- shortDescription()
Returns a one-line description of the test, or None if no description has been provided.
The default implementation of this method returns the first line of the specified test method’s docstring.
- skipTest(reason)
Skip this test.
- subTest(msg=<object object>, **params)
Return a context manager that will run the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.
- tearDown()
Remove all files and directories that are used for testing.
- classmethod tearDownClass()
Hook method for deconstructing the class fixture after running all tests in the class.
- class qsprpred.data.processing.tests.TestShuffle(methodName='runTest')[source]
Bases:
QSPRTestCase, StepCheckMixIn
Test the shuffle step in the pipeline.
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
- classmethod addClassCleanup(function, /, *args, **kwargs)
Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).
- addCleanup(function, /, *args, **kwargs)
Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.
Cleanup items are called even if setUp fails (unlike tearDown).
- addTypeEqualityFunc(typeobj, function)
Add a type specific assertEqual style function to compare a type.
This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.
- Parameters:
typeobj – The data type to call this function on when both values are of the same type in assertEqual().
function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.
- assertAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
If the two objects compare equal then they will automatically compare almost equal.
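A short sketch of the difference between `places` and `delta`. The numeric values and the class name are chosen purely for illustration.

```python
import unittest


class AlmostEqualExample(unittest.TestCase):
    def test_places(self):
        # The difference rounds to zero at the default 7 decimal places.
        self.assertAlmostEqual(0.1 + 0.2, 0.3)
        # places counts decimal places, not significant digits:
        # a difference of about 0.0001 vanishes at 3 places but not at 4.
        self.assertAlmostEqual(1234.5678, 1234.5679, places=3)
        self.assertNotAlmostEqual(1234.5678, 1234.5679, places=4)

    def test_delta(self):
        # With delta, the absolute difference must not exceed delta.
        self.assertAlmostEqual(100.0, 100.4, delta=0.5)
        # places and delta are mutually exclusive.
        with self.assertRaises(TypeError):
            self.assertAlmostEqual(1.0, 1.0001, places=3, delta=0.1)


# Run the example tests and collect the outcome in a plain TestResult.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AlmostEqualExample)
result = unittest.TestResult()
suite.run(result)
```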
- assertCountEqual(first, second, msg=None)
Asserts that two iterables have the same elements, the same number of times, without regard to order.
    self.assertEqual(Counter(list(first)),
                     Counter(list(second)))
- Example:
[0, 1, 1] and [1, 0, 1] compare equal.
[0, 0, 1] and [0, 1] compare unequal.
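The two cases from the example above can be checked with a small sketch like this one. The class name is hypothetical.

```python
import unittest
from collections import Counter


class CountEqualExample(unittest.TestCase):
    def test_order_is_ignored(self):
        self.assertCountEqual([0, 1, 1], [1, 0, 1])
        # Equivalent comparison for hashable elements:
        self.assertEqual(Counter([0, 1, 1]), Counter([1, 0, 1]))

    def test_multiplicity_matters(self):
        # Same distinct elements, but 0 occurs a different number of times.
        with self.assertRaises(AssertionError):
            self.assertCountEqual([0, 0, 1], [0, 1])

    def test_unhashable_elements(self):
        # Unlike Counter, assertCountEqual also accepts unhashable elements.
        self.assertCountEqual([[1], [2]], [[2], [1]])


# Run the example tests and collect the outcome in a plain TestResult.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(CountEqualExample)
result = unittest.TestResult()
suite.run(result)
```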
- assertDictEqual(d1, d2, msg=None)
- assertEndsWith(s, suffix, msg=None)
- assertEqual(first, second, msg=None)
Fail if the two objects are unequal as determined by the ‘==’ operator.
- assertFalse(expr, msg=None)
Check that the expression is false.
- assertGreater(a, b, msg=None)
Just like self.assertTrue(a > b), but with a nicer default message.
- assertGreaterEqual(a, b, msg=None)
Just like self.assertTrue(a >= b), but with a nicer default message.
- assertHasAttr(obj, name, msg=None)
- assertIn(member, container, msg=None)
Just like self.assertTrue(a in b), but with a nicer default message.
- assertIs(expr1, expr2, msg=None)
Just like self.assertTrue(a is b), but with a nicer default message.
- assertIsInstance(obj, cls, msg=None)
Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.
- assertIsNone(obj, msg=None)
Same as self.assertTrue(obj is None), with a nicer default message.
- assertIsNot(expr1, expr2, msg=None)
Just like self.assertTrue(a is not b), but with a nicer default message.
- assertIsNotNone(obj, msg=None)
Included for symmetry with assertIsNone.
- assertIsSubclass(cls, superclass, msg=None)
- assertLess(a, b, msg=None)
Just like self.assertTrue(a < b), but with a nicer default message.
- assertLessEqual(a, b, msg=None)
Just like self.assertTrue(a <= b), but with a nicer default message.
- assertListEqual(list1, list2, msg=None)
A list-specific equality assertion.
- Parameters:
list1 – The first list to compare.
list2 – The second list to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertLogs(logger=None, level=None)
Fail unless a log message of level level or higher is emitted on logger or its children. If omitted, level defaults to INFO and logger defaults to the root logger.
This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.
Example:
    with self.assertLogs('foo', level='INFO') as cm:
        logging.getLogger('foo').info('first message')
        logging.getLogger('foo.bar').error('second message')
    self.assertEqual(cm.output, ['INFO:foo:first message',
                                 'ERROR:foo.bar:second message'])
- assertMultiLineEqual(first, second, msg=None)
Assert that two multi-line strings are equal.
- assertNoLogs(logger=None, level=None)
Fail unless no log messages of level level or higher are emitted on logger or its children.
This method must be used as a context manager.
- assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
Objects that are equal automatically fail.
- assertNotEndsWith(s, suffix, msg=None)
- assertNotEqual(first, second, msg=None)
Fail if the two objects are equal as determined by the ‘!=’ operator.
- assertNotHasAttr(obj, name, msg=None)
- assertNotIn(member, container, msg=None)
Just like self.assertTrue(a not in b), but with a nicer default message.
- assertNotIsInstance(obj, cls, msg=None)
Included for symmetry with assertIsInstance.
- assertNotIsSubclass(cls, superclass, msg=None)
- assertNotRegex(text, unexpected_regex, msg=None)
Fail the test if the text matches the regular expression.
- assertNotStartsWith(s, prefix, msg=None)
- assertRaises(expected_exception, *args, **kwargs)
Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.
If called with the callable and arguments omitted, will return a context object used like this:
    with self.assertRaises(SomeException):
        do_something()
An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.
The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:
    with self.assertRaises(SomeException) as cm:
        do_something()
    the_exception = cm.exception
    self.assertEqual(the_exception.error_code, 3)
- assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)
Asserts that the message in a raised exception matches a regex.
- Parameters:
expected_exception – Exception class expected to be raised.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.
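Both calling styles can be sketched as follows. `parse_threshold` is a hypothetical helper written for this example and is not part of qsprpred.

```python
import unittest


def parse_threshold(text):
    # Hypothetical helper that rejects negative thresholds.
    value = float(text)
    if value < 0:
        raise ValueError(f"threshold must be non-negative, got {value}")
    return value


class RaisesRegexExample(unittest.TestCase):
    def test_context_manager_form(self):
        # The regex is searched for in str(exception).
        with self.assertRaisesRegex(ValueError, r"non-negative, got -1\.5") as cm:
            parse_threshold("-1.5")
        self.assertIsInstance(cm.exception, ValueError)

    def test_callable_form(self):
        # The callable and its positional arguments follow the regex.
        self.assertRaisesRegex(ValueError, "non-negative", parse_threshold, "-2")


# Run the example tests and collect the outcome in a plain TestResult.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(RaisesRegexExample)
result = unittest.TestResult()
suite.run(result)
```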
- assertRegex(text, expected_regex, msg=None)
Fail the test unless the text matches the regular expression.
- assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)
An equality assertion for ordered sequences (like lists and tuples).
For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.
- Parameters:
seq1 – The first sequence to compare.
seq2 – The second sequence to compare.
seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.
msg – Optional message to use on failure instead of a list of differences.
- assertSetEqual(set1, set2, msg=None)
A set-specific equality assertion.
- Parameters:
set1 – The first set to compare.
set2 – The second set to compare.
msg – Optional message to use on failure instead of a list of differences.
assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).
- assertStartsWith(s, prefix, msg=None)
- assertTrue(expr, msg=None)
Check that the expression is true.
- assertTupleEqual(tuple1, tuple2, msg=None)
A tuple-specific equality assertion.
- Parameters:
tuple1 – The first tuple to compare.
tuple2 – The second tuple to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertWarns(expected_warning, *args, **kwargs)
Fail unless a warning of class expected_warning is triggered by the callable when invoked with specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.
If called with the callable and arguments omitted, will return a context object used like this:
    with self.assertWarns(SomeWarning):
        do_something()
An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.
The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:
    with self.assertWarns(SomeWarning) as cm:
        do_something()
    the_warning = cm.warning
    self.assertEqual(the_warning.some_attribute, 147)
- assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)
Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.
- Parameters:
expected_warning – Warning class expected to be triggered.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.
- checkFitTransform(step: Step, dataset: QSPRTable, fromfile=False) Tuple[DataFrame, DataFrame | None]
Check basic step fit and transform functionality.
- checkStep(step: Step, dataset: QSPRTable) Tuple[DataFrame, DataFrame | None]
Check basic step functionality and serialization.
- clearGenerated()
Remove the directories that are used for testing.
- countTestCases()
- createLargeMultitaskDataSet(name='QSPRDataset_multi_test', target_props=[{'name': 'HBD', 'task': TargetTasks.MULTICLASS, 'th': [-1, 1, 2, 100]}, {'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42)
Create a large dataset for testing purposes.
- Parameters:
- Returns:
a QSPRDataSet object
- Return type:
- createLargeTestDataSet(name='QSPRDataset_test_large', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42, n_jobs=1, chunk_size=None, drop_empty_target_props=True)
Create a large dataset for testing purposes.
- Parameters:
- Returns:
a QSPRDataSet object
- Return type:
- createSmallTestDataSet(name='QSPRDataset_test_small', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42, drop_empty_target_props=True)
Create a small dataset for testing purposes.
- Parameters:
- Returns:
a QSPRDataSet object
- Return type:
- createTestDataSetFromFrame(df, name='QSPRDataset_test', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=None, n_jobs=1, chunk_size=None, drop_empty_target_props=True)
Create a dataset for testing purposes from the given data frame.
- Parameters:
df (pd.DataFrame) – data frame containing the dataset
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
prep (dict) – dictionary containing preparation settings
n_jobs (int) – number of jobs to use for parallel processing
chunk_size (int) – size of chunks to use per job in parallel processing
- Returns:
a QSPRDataSet object
- Return type:
- debug()
Run the test without collecting errors in a TestResult
- defaultTestResult()
- classmethod doClassCleanups()
Execute all class cleanup functions. Normally called for you after tearDownClass.
- doCleanups()
Execute all cleanup functions. Normally called for you after tearDown.
- classmethod enterClassContext(cm)
Same as enterContext, but class-wide.
- enterContext(cm)
Enters the supplied context manager.
If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.
- fail(msg=None)
Fail immediately, with the given message.
- failureException
alias of
AssertionError
- classmethod getAllDescriptorSets()
Return a list of (ideally) all available descriptor sets. For now they need to be added manually to the list below.
TODO: would be nice to create the list automatically by implementing a descriptor set registry that would hold all installed descriptor sets.
- getBigDF()
Get a large data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- classmethod getDataPrepGrid()
Return a list of many possible combinations of descriptor calculators, splits, feature standardizers, feature filters and data filters. Again, this is not exhaustive, but should cover a lot of cases.
- Returns:
a generator that yields tuples of all possible combinations as stated above, each tuple is defined as: (descriptor_calculator, split, feature_standardizer, feature_filters, data_filters)
- Return type:
grid
- classmethod getDefaultCalculatorCombo()
Makes a list of default descriptor calculators that can be used in tests.
It creates a calculator with only morgan fingerprints and rdkit descriptors, but also one with them both to test behaviour with multiple descriptor sets. Override this method if you want to test with other descriptor sets and calculator combinations.
- static getDefaultPrep(add_imputer=None)
Return a dictionary with default preparation settings.
- classmethod getPrepCombos()
Return a list of all possible preparation combinations as generated by getDataPrepGrid as well as their names. The generated list can be used to parameterize tests with the given named combinations.
- getSmallDF()
Get a small data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- getStorage(df, name, n_jobs=1, chunk_size=None)
- id()
- longMessage = True
- maxDiff = 640
- run(result=None)
- classmethod setUpClass()
Hook method for setting up class fixture before running tests in the class.
- setUpPaths()
Create the directories that are used for testing.
- shortDescription()
Returns a one-line description of the test, or None if no description has been provided.
The default implementation of this method returns the first line of the specified test method’s docstring.
- skipTest(reason)
Skip this test.
- subTest(msg=<object object>, **params)
Return a context manager that will run the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.
- tearDown()
Hook method for deconstructing the test fixture after testing it.
- classmethod tearDownClass()
Hook method for deconstructing the class fixture after running all tests in the class.
- class qsprpred.data.processing.tests.TestTargetTransformers(methodName='runTest')[source]
Bases:
QSPRTestCase, StepCheckMixIn
Test the sklearn step which wraps a sklearn transformer for targets.
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
- classmethod addClassCleanup(function, /, *args, **kwargs)
Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).
- addCleanup(function, /, *args, **kwargs)
Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.
Cleanup items are called even if setUp fails (unlike tearDown).
- addTypeEqualityFunc(typeobj, function)
Add a type specific assertEqual style function to compare a type.
This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.
- Parameters:
typeobj – The data type to call this function on when both values are of the same type in assertEqual().
function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.
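A sketch of registering a type-specific comparison. The `Point` class and the `assertPointEqual` method are hypothetical names introduced for this illustration.

```python
import unittest


class Point:
    # Hypothetical value type used only for this example.
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)


class TypeEqualityExample(unittest.TestCase):
    def setUp(self):
        # assertEqual dispatches here when both arguments are exactly Point.
        self.addTypeEqualityFunc(Point, self.assertPointEqual)

    def assertPointEqual(self, first, second, msg=None):
        if first != second:
            standard = (
                f"Point({first.x}, {first.y}) != Point({second.x}, {second.y})"
            )
            raise self.failureException(msg or standard)

    def test_equal_points(self):
        self.assertEqual(Point(1, 2), Point(1, 2))

    def test_custom_message_on_mismatch(self):
        # The registered function supplies the readable failure message.
        with self.assertRaisesRegex(AssertionError, r"Point\(1, 2\) != Point\(1, 3\)"):
            self.assertEqual(Point(1, 2), Point(1, 3))


# Run the example tests and collect the outcome in a plain TestResult.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TypeEqualityExample)
result = unittest.TestResult()
suite.run(result)
```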
- assertAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
If the two objects compare equal then they will automatically compare almost equal.
- assertCountEqual(first, second, msg=None)
Asserts that two iterables have the same elements, the same number of times, without regard to order.
    self.assertEqual(Counter(list(first)),
                     Counter(list(second)))
- Example:
[0, 1, 1] and [1, 0, 1] compare equal.
[0, 0, 1] and [0, 1] compare unequal.
- assertDictEqual(d1, d2, msg=None)
- assertEndsWith(s, suffix, msg=None)
- assertEqual(first, second, msg=None)
Fail if the two objects are unequal as determined by the ‘==’ operator.
- assertFalse(expr, msg=None)
Check that the expression is false.
- assertGreater(a, b, msg=None)
Just like self.assertTrue(a > b), but with a nicer default message.
- assertGreaterEqual(a, b, msg=None)
Just like self.assertTrue(a >= b), but with a nicer default message.
- assertHasAttr(obj, name, msg=None)
- assertIn(member, container, msg=None)
Just like self.assertTrue(a in b), but with a nicer default message.
- assertIs(expr1, expr2, msg=None)
Just like self.assertTrue(a is b), but with a nicer default message.
- assertIsInstance(obj, cls, msg=None)
Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.
- assertIsNone(obj, msg=None)
Same as self.assertTrue(obj is None), with a nicer default message.
- assertIsNot(expr1, expr2, msg=None)
Just like self.assertTrue(a is not b), but with a nicer default message.
- assertIsNotNone(obj, msg=None)
Included for symmetry with assertIsNone.
- assertIsSubclass(cls, superclass, msg=None)
- assertLess(a, b, msg=None)
Just like self.assertTrue(a < b), but with a nicer default message.
- assertLessEqual(a, b, msg=None)
Just like self.assertTrue(a <= b), but with a nicer default message.
- assertListEqual(list1, list2, msg=None)
A list-specific equality assertion.
- Parameters:
list1 – The first list to compare.
list2 – The second list to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertLogs(logger=None, level=None)
Fail unless a log message of level level or higher is emitted on logger or its children. If omitted, level defaults to INFO and logger defaults to the root logger.
This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.
Example:
    with self.assertLogs('foo', level='INFO') as cm:
        logging.getLogger('foo').info('first message')
        logging.getLogger('foo.bar').error('second message')
    self.assertEqual(cm.output, ['INFO:foo:first message',
                                 'ERROR:foo.bar:second message'])
- assertMultiLineEqual(first, second, msg=None)
Assert that two multi-line strings are equal.
- assertNoLogs(logger=None, level=None)
Fail unless no log messages of level level or higher are emitted on logger or its children.
This method must be used as a context manager.
- assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
Objects that are equal automatically fail.
- assertNotEndsWith(s, suffix, msg=None)
- assertNotEqual(first, second, msg=None)
Fail if the two objects are equal as determined by the ‘!=’ operator.
- assertNotHasAttr(obj, name, msg=None)
- assertNotIn(member, container, msg=None)
Just like self.assertTrue(a not in b), but with a nicer default message.
- assertNotIsInstance(obj, cls, msg=None)
Included for symmetry with assertIsInstance.
- assertNotIsSubclass(cls, superclass, msg=None)
- assertNotRegex(text, unexpected_regex, msg=None)
Fail the test if the text matches the regular expression.
- assertNotStartsWith(s, prefix, msg=None)
- assertRaises(expected_exception, *args, **kwargs)
Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.
If called with the callable and arguments omitted, will return a context object used like this:
    with self.assertRaises(SomeException):
        do_something()
An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.
The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:
    with self.assertRaises(SomeException) as cm:
        do_something()
    the_exception = cm.exception
    self.assertEqual(the_exception.error_code, 3)
- assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)
Asserts that the message in a raised exception matches a regex.
- Parameters:
expected_exception – Exception class expected to be raised.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.
- assertRegex(text, expected_regex, msg=None)
Fail the test unless the text matches the regular expression.
- assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)
An equality assertion for ordered sequences (like lists and tuples).
For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.
- Parameters:
seq1 – The first sequence to compare.
seq2 – The second sequence to compare.
seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.
msg – Optional message to use on failure instead of a list of differences.
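The effect of `seq_type` can be sketched as follows. The class name is hypothetical.

```python
import unittest


class SequenceEqualExample(unittest.TestCase):
    def test_mixed_types_compare_by_content(self):
        # Without seq_type, a list and a tuple with equal items pass.
        self.assertSequenceEqual([1, 2, 3], (1, 2, 3))

    def test_seq_type_is_enforced(self):
        # With seq_type=list, the tuple is rejected despite equal items.
        with self.assertRaises(AssertionError):
            self.assertSequenceEqual([1, 2, 3], (1, 2, 3), seq_type=list)

    def test_order_matters(self):
        with self.assertRaises(AssertionError):
            self.assertSequenceEqual([1, 2, 3], [3, 2, 1])


# Run the example tests and collect the outcome in a plain TestResult.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SequenceEqualExample)
result = unittest.TestResult()
suite.run(result)
```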
- assertSetEqual(set1, set2, msg=None)
A set-specific equality assertion.
- Parameters:
set1 – The first set to compare.
set2 – The second set to compare.
msg – Optional message to use on failure instead of a list of differences.
assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).
- assertStartsWith(s, prefix, msg=None)
- assertTrue(expr, msg=None)
Check that the expression is true.
- assertTupleEqual(tuple1, tuple2, msg=None)
A tuple-specific equality assertion.
- Parameters:
tuple1 – The first tuple to compare.
tuple2 – The second tuple to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertWarns(expected_warning, *args, **kwargs)
Fail unless a warning of class expected_warning is triggered by the callable when invoked with specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.
If called with the callable and arguments omitted, will return a context object used like this:
    with self.assertWarns(SomeWarning):
        do_something()
An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.
The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:
    with self.assertWarns(SomeWarning) as cm:
        do_something()
    the_warning = cm.warning
    self.assertEqual(the_warning.some_attribute, 147)
- assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)
Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.
- Parameters:
expected_warning – Warning class expected to be triggered.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.
- checkFitTransform(step: Step, dataset: QSPRTable, fromfile=False) Tuple[DataFrame, DataFrame | None]
Check basic step fit and transform functionality.
- checkStep(step: Step, dataset: QSPRTable) Tuple[DataFrame, DataFrame | None]
Check basic step functionality and serialization.
- clearGenerated()
Remove the directories that are used for testing.
- countTestCases()
- createLargeMultitaskDataSet(name='QSPRDataset_multi_test', target_props=[{'name': 'HBD', 'task': TargetTasks.MULTICLASS, 'th': [-1, 1, 2, 100]}, {'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42)
Create a large dataset for testing purposes.
- Parameters:
- Returns:
a QSPRDataSet object
- Return type:
- createLargeTestDataSet(name='QSPRDataset_test_large', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42, n_jobs=1, chunk_size=None, drop_empty_target_props=True)
Create a large dataset for testing purposes.
- Parameters:
- Returns:
a QSPRDataSet object
- Return type:
- createSmallTestDataSet(name='QSPRDataset_test_small', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=42, drop_empty_target_props=True)
Create a small dataset for testing purposes.
- Parameters:
- Returns:
a QSPRDataSet object
- Return type:
- createTestDataSetFromFrame(df, name='QSPRDataset_test', target_props=[{'name': 'CL', 'task': TargetTasks.REGRESSION}], random_state=None, n_jobs=1, chunk_size=None, drop_empty_target_props=True)
Create a dataset for testing purposes from the given data frame.
- Parameters:
df (pd.DataFrame) – data frame containing the dataset
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
prep (dict) – dictionary containing preparation settings
n_jobs (int) – number of jobs to use for parallel processing
chunk_size (int) – size of chunks to use per job in parallel processing
- Returns:
a QSPRDataSet object
- Return type:
- debug()
Run the test without collecting errors in a TestResult
- defaultTestResult()
- classmethod doClassCleanups()
Execute all class cleanup functions. Normally called for you after tearDownClass.
- doCleanups()
Execute all cleanup functions. Normally called for you after tearDown.
- classmethod enterClassContext(cm)
Same as enterContext, but class-wide.
- enterContext(cm)
Enters the supplied context manager.
If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.
- fail(msg=None)
Fail immediately, with the given message.
- failureException
alias of AssertionError
- classmethod getAllDescriptorSets()
Return a list of (ideally) all available descriptor sets. For now they need to be added manually to the list below.
TODO: would be nice to create the list automatically by implementing a descriptor set registry that would hold all installed descriptor sets.
- getBigDF()
Get a large data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- classmethod getDataPrepGrid()
Return a list of many possible combinations of descriptor calculators, splits, feature standardizers, feature filters and data filters. Again, this is not exhaustive, but should cover a lot of cases.
- Returns:
a generator that yields tuples of all possible combinations as stated above. Each tuple is defined as: (descriptor_calculator, split, feature_standardizer, feature_filters, data_filters)
- Return type:
grid
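A minimal sketch of how a grid of this shape could be assembled, assuming it is a Cartesian product of the individual option lists. The helper name make_prep_grid and the string placeholders are hypothetical stand-ins, not QSPRpred objects, and this is not necessarily how getDataPrepGrid is implemented.

```python
from itertools import product


def make_prep_grid(calculators, splits, standardizers, feature_filters, data_filters):
    """Yield one tuple per combination of the option lists (hypothetical helper)."""
    # itertools.product iterates lazily over the Cartesian product,
    # so combinations are produced one at a time.
    yield from product(
        calculators, splits, standardizers, feature_filters, data_filters
    )


# Placeholder option names; a real test would pass actual objects (or None).
grid = make_prep_grid(
    calculators=["morgan", "rdkit"],
    splits=[None, "random"],
    standardizers=[None, "standard"],
    feature_filters=[None],
    data_filters=[None, "category"],
)
combos = list(grid)
print(len(combos))  # 2 * 2 * 2 * 1 * 2 = 16 combinations
print(combos[0])
```

Because the number of combinations is the product of the list lengths, each added option multiplies the number of test cases, which is one reason to keep such a grid representative instead of exhaustive.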
- classmethod getDefaultCalculatorCombo()
Makes a list of default descriptor calculators that can be used in tests.
It creates one calculator with only Morgan fingerprints and one with only RDKit descriptors, plus one that combines both to test behaviour with multiple descriptor sets. Override this method if you want to test with other descriptor sets and calculator combinations.
- static getDefaultPrep(add_imputer=None)
Return a dictionary with default preparation settings.
- classmethod getPrepCombos()
Return a list of all possible preparation combinations as generated by getDataPrepGrid, as well as their names. The generated list can be used to parameterize tests with the given named combinations.
- getSmallDF()
Get a small data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- getStorage(df, name, n_jobs=1, chunk_size=None)
- id()
- longMessage = True
- maxDiff = 640
- run(result=None)
- classmethod setUpClass()
Hook method for setting up class fixture before running tests in the class.
- setUpPaths()
Create the directories that are used for testing.
- shortDescription()
Returns a one-line description of the test, or None if no description has been provided.
The default implementation of this method returns the first line of the specified test method’s docstring.
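The default behaviour can be checked with a small standard-library example. The class ExampleTest below is hypothetical and only serves to show what shortDescription returns with and without a docstring.

```python
import unittest


class ExampleTest(unittest.TestCase):
    def test_documented(self):
        """Check that addition works.

        Only the first line above is used as the short description.
        """
        self.assertEqual(1 + 1, 2)

    def test_undocumented(self):
        self.assertTrue(True)


documented = ExampleTest("test_documented").shortDescription()
undocumented = ExampleTest("test_undocumented").shortDescription()
print(documented)    # Check that addition works.
print(undocumented)  # None
```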
- skipTest(reason)
Skip this test.
- subTest(msg=<object object>, **params)
Return a context manager that will run the enclosed block of code as a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.
- tearDown()
Hook method for deconstructing the test fixture after testing it.
- classmethod tearDownClass()
Hook method for deconstructing the class fixture after running all tests in the class.