qsprpred.models package
Subpackages
Submodules
qsprpred.models.early_stopping module
Early stopping for training of models.
- class qsprpred.models.early_stopping.EarlyStopping(mode: ~qsprpred.models.early_stopping.EarlyStoppingMode = EarlyStoppingMode.NOT_RECORDING, num_epochs: int | None = None, aggregate_func: ~typing.Callable[[list[int]], int] = <function mean>)[source]
Bases:
JSONSerializable
Early stopping tracker for training of QSPRpred models.
An instance of this class tracks the number of epochs a model was trained for when early stopping is used (mode RECORDING). This information can then be used to determine the optimal number of epochs for a training run without early stopping (mode OPTIMAL). The optimal number of epochs is determined by aggregating the epoch counts of previous training runs with early stopping; the aggregation function can be specified by the user. The number of epochs for a training run without early stopping can also be specified manually (mode FIXED). Models can also be trained with early stopping without recording the number of epochs (mode NOT_RECORDING), which is useful e.g. when hyperparameter tuning is performed with early stopping.
- Variables:
mode (EarlyStoppingMode) – early stopping mode
numEpochs (int) – number of epochs to train in FIXED mode.
aggregatefunc (function) – numpy function to aggregate trained epochs in OPTIMAL mode. Defaults to np.mean.
trainedEpochs (list[int]) – list of number of epochs trained in a model training with early stopping on RECORDING mode.
Initialize early stopping.
- Parameters:
mode (EarlyStoppingMode) – early stopping mode
num_epochs (int, optional) – number of epochs to train in FIXED mode.
aggregate_func (function, optional) – numpy function to aggregate trained epochs in OPTIMAL mode. Note, non-numpy functions are not supported.
- recordEpochs(epochs: int)[source]
Record number of epochs.
- Parameters:
epochs (int) – number of epochs
- toFile(filename: str) → str
Serialize object to a JSON file. This JSON file should contain all data necessary to reconstruct the object.
- class qsprpred.models.early_stopping.EarlyStoppingMode(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]
Bases:
Enum
Enum representing the type of early stopping to use.
- Variables:
NOT_RECORDING (str) – early stopping, not recording number of epochs
RECORDING (str) – early stopping, recording number of epochs
FIXED (str) – no early stopping, specified number of epochs
OPTIMAL (str) – no early stopping, optimal number of epochs determined by previous training runs with early stopping (e.g. average number of epochs trained in cross validation with early stopping)
- FIXED = 'FIXED'
- NOT_RECORDING = 'NOT_RECORDING'
- OPTIMAL = 'OPTIMAL'
- RECORDING = 'RECORDING'
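The interplay of the four modes can be illustrated with a minimal sketch. It does not use the library: `Mode` and `EpochTracker` are hypothetical stand-ins, `statistics.mean` replaces `np.mean`, and the rule that epochs are only stored in RECORDING mode is an assumption based on the description above.

```python
from enum import Enum
from statistics import mean


class Mode(Enum):
    # Stand-in for EarlyStoppingMode, using the member names documented above.
    NOT_RECORDING = "NOT_RECORDING"
    RECORDING = "RECORDING"
    FIXED = "FIXED"
    OPTIMAL = "OPTIMAL"


class EpochTracker:
    """Hypothetical stand-in for EarlyStopping, not the library class."""

    def __init__(self, mode=Mode.NOT_RECORDING, num_epochs=None, aggregate_func=mean):
        self.mode = mode
        self.num_epochs = num_epochs
        self.aggregate_func = aggregate_func
        self.trained_epochs = []

    def record_epochs(self, epochs):
        # Assumption: only RECORDING mode stores the epoch a run stopped at.
        if self.mode is Mode.RECORDING:
            self.trained_epochs.append(epochs)

    def epochs_to_train(self):
        # FIXED uses the manual value, OPTIMAL aggregates the recorded runs.
        if self.mode is Mode.FIXED:
            return self.num_epochs
        if self.mode is Mode.OPTIMAL:
            if not self.trained_epochs:
                raise ValueError("no epochs recorded yet")
            return int(self.aggregate_func(self.trained_epochs))
        return None  # with early stopping, the training loop decides


tracker = EpochTracker(mode=Mode.RECORDING)
for stopped_at in (40, 50, 60):  # e.g. three cross-validation folds
    tracker.record_epochs(stopped_at)

tracker.mode = Mode.OPTIMAL
optimal = tracker.epochs_to_train()  # aggregate of the three recorded runs
```

Switching the same tracker from RECORDING to OPTIMAL mirrors the workflow described above: record during cross-validation, then reuse the aggregate for the final fit.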
qsprpred.models.hyperparam_optimization module
Module for hyperparameter optimization of QSPRModels.
- class qsprpred.models.hyperparam_optimization.GridSearchOptimization(param_grid: dict, model_assessor: ~qsprpred.models.assessment.methods.ModelAssessor, score_aggregation: ~typing.Callable = <function mean>, monitor: ~qsprpred.models.monitors.HyperparameterOptimizationMonitor | None = None)[source]
Bases:
HyperparameterOptimization
Class for hyperparameter optimization of QSPRModels using GridSearch.
Initialize the class.
- Parameters:
param_grid (dict) – dictionary with parameter names as keys and lists of parameter settings to try as values
model_assessor (ModelAssessor) – assessment method to use for the optimization
score_aggregation (Callable) – function to aggregate the scores of different folds if the assessment method returns multiple predictions (default: np.mean)
monitor (HyperparameterOptimizationMonitor) – monitor for the optimization, if None, a BaseMonitor is used
- optimize(model: QSPRModel, ds: QSPRDataset, save_params: bool = True, refit_optimal: bool = False, **kwargs) → dict [source]
Optimize the hyperparameters of the model.
- Parameters:
model (QSPRModel) – the model to optimize
ds (QSPRDataset) – dataset to use for the optimization
save_params (bool) – whether to set and save the best parameters to the model after optimization
refit_optimal (bool) – whether to refit the model with the optimal parameters on the entire training set after optimization. This implies ‘save_params=True’.
**kwargs – additional arguments for the assessment method
- Returns:
best parameters found during optimization
- Return type:
dict
- saveResults(model: QSPRModel, ds: QSPRDataset, save_params: bool, refit_optimal: bool)
Handles saving of optimization results.
- Parameters:
model (QSPRModel) – model that was optimized
ds (QSPRDataset) – dataset used in the optimization
save_params (bool) – whether to re-initialize the model with the best parameters
refit_optimal (bool) – same as ‘save_params’, but also refits the model on the entire training set
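As a rough illustration of what a grid search over `param_grid` with a `score_aggregation` function involves, the sketch below enumerates every combination and keeps the best aggregated score. It is not the library implementation: `grid_search` and `toy_assess` are hypothetical, the assessor is reduced to a callable that returns one made-up score per fold, and higher scores are assumed to be better.

```python
from itertools import product
from statistics import mean


def grid_search(param_grid, assess, score_aggregation=mean):
    """Try every combination in param_grid and keep the best-scoring one.

    `assess` is a hypothetical callable that returns one score per fold.
    """
    names = list(param_grid)
    best_score, best_params = float("-inf"), None
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_aggregation(assess(params))
        if score > best_score:
            best_score, best_params = score, params
    return best_params, best_score


def toy_assess(params):
    # Made-up fold scores that peak at n_estimators=100 and max_depth=5.
    base = 1.0
    base -= abs(params["n_estimators"] - 100) / 1000
    base -= abs(params["max_depth"] - 5) / 100
    return [base - 0.01, base, base + 0.01]


grid = {"n_estimators": [50, 100, 200], "max_depth": [3, 5]}
best_params, best_score = grid_search(grid, toy_assess)
```

The number of assessments grows with the product of the list lengths in the grid (six here), which is the main cost to keep in mind when choosing the grid.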
- class qsprpred.models.hyperparam_optimization.HyperparameterOptimization(param_grid: dict, model_assessor: ModelAssessor, score_aggregation: Callable[[Iterable], float], monitor: HyperparameterOptimizationMonitor | None = None)[source]
Bases:
ABC
Base class for hyperparameter optimization.
- Variables:
runAssessment (ModelAssessor) – evaluation method to use
scoreAggregation (Callable[[Iterable], float]) – function to aggregate scores
paramGrid (dict) – dictionary of parameters to optimize
monitor (HyperparameterOptimizationMonitor) – monitor to track the optimization
bestScore (float) – best score found during optimization
bestParams (dict) – best parameters found during optimization
Initialize the hyperparameter optimization class.
- Parameters:
param_grid (dict) – dictionary of parameters to optimize
model_assessor (ModelAssessor) – assessment method to use for determining the best parameters
score_aggregation (Callable[[Iterable], float]) – function to aggregate scores
monitor (HyperparameterOptimizationMonitor) – monitor to track the optimization, if None, a BaseMonitor is used
- abstract optimize(model: QSPRModel, ds: QSPRDataset, refit_optimal: bool = False) → dict [source]
Optimize the model hyperparameters.
- Parameters:
model (QSPRModel) – model to optimize
ds (QSPRDataset) – dataset to use for the optimization
refit_optimal (bool) – whether to refit the model with the optimal parameters on the entire training set after optimization
- Returns:
dictionary of best parameters
- Return type:
dict
- saveResults(model: QSPRModel, ds: QSPRDataset, save_params: bool, refit_optimal: bool)[source]
Handles saving of optimization results.
- Parameters:
model (QSPRModel) – model that was optimized
ds (QSPRDataset) – dataset used in the optimization
save_params (bool) – whether to re-initialize the model with the best parameters
refit_optimal (bool) – same as ‘save_params’, but also refits the model on the entire training set
- class qsprpred.models.hyperparam_optimization.OptunaOptimization(param_grid: dict, model_assessor: ~qsprpred.models.assessment.methods.ModelAssessor, score_aggregation: ~typing.Callable[[~typing.Iterable], float] = <function mean>, monitor: ~qsprpred.models.monitors.HyperparameterOptimizationMonitor | None = None, n_trials: int = 100, n_jobs: int = 1)[source]
Bases:
HyperparameterOptimization
Class for hyperparameter optimization of QSPRModels using Optuna.
- Variables:
- Example of OptunaOptimization for scikit-learn’s MLPClassifier:
>>> model = SklearnModel(base_dir=".",
>>>                      alg=MLPClassifier(), alg_name="MLP")
>>> search_space = {
>>>     "learning_rate_init": ["float", 1e-5, 1e-3],
>>>     "power_t": ["discrete_uniform", 0.2, 0.8, 0.1],
>>>     "momentum": ["float", 0.0, 1.0],
>>> }
>>> optimizer = OptunaOptimization(
>>>     scoring="average_precision",
>>>     param_grid=search_space,
>>>     n_trials=10
>>> )
>>> best_params = optimizer.optimize(model, dataset)  # dataset is a QSPRDataset
- Available suggestion types:
["categorical", "discrete_uniform", "float", "int", "loguniform", "uniform"]
Initialize the class for hyperparameter optimization of QSPRModels using Optuna.
- Parameters:
param_grid (dict) – search space for Bayesian optimization; keys are the parameter names, values are lists whose first element is the suggestion type of the parameter and whose remaining elements are the parameter bounds or values.
model_assessor (ModelAssessor) – assessment method to use for the optimization (default: CrossValAssessor)
score_aggregation (Callable) – function to aggregate the scores of different folds if the assessment method returns multiple predictions
monitor (HyperparameterOptimizationMonitor) – monitor for the optimization, if None, a BaseMonitor is used
n_trials (int) – number of trials for Bayesian optimization
n_jobs (int) – number of jobs to run in parallel. At the moment only n_jobs=1 is supported.
- objective(trial: Trial, model: QSPRModel, ds: QSPRDataset, **kwargs) → float [source]
Objective for Bayesian optimization.
- Parameters:
trial (optuna.trial.Trial) – trial object for the optimization
model (QSPRModel) – the model to optimize
ds (QSPRDataset) – dataset to use for the optimization
**kwargs – additional arguments for the assessment method
- Returns:
score of the model with the current parameters
- Return type:
float
- optimize(model: QSPRModel, ds: QSPRDataset, save_params: bool = True, refit_optimal: bool = False, **kwargs) → dict [source]
Bayesian optimization of hyperparameters using Optuna.
- Parameters:
model (QSPRModel) – the model to optimize
ds (QSPRDataset) – dataset to use for the optimization
save_params (bool) – whether to set and save the best parameters to the model after optimization
refit_optimal (bool) – Whether to refit the model with the optimal parameters on the entire training set after optimization. This implies ‘save_params=True’.
**kwargs – additional arguments for the assessment method
- Returns:
best parameters found during optimization
- Return type:
dict
- saveResults(model: QSPRModel, ds: QSPRDataset, save_params: bool, refit_optimal: bool)
Handles saving of optimization results.
- Parameters:
model (QSPRModel) – model that was optimized
ds (QSPRDataset) – dataset used in the optimization
save_params (bool) – whether to re-initialize the model with the best parameters
refit_optimal (bool) – same as ‘save_params’, but also refits the model on the entire training set
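To make the search-space format more concrete, the sketch below reads a `param_grid` in the list layout described above and draws one parameter set from it. Plain random sampling is used as a stand-in; it is not how Optuna chooses values, and `sample_params` is a hypothetical helper. The layout assumed for "categorical" (a single list of choices after the type) is also an assumption.

```python
import math
import random


def sample_params(search_space, rng):
    """Draw one parameter set from a search space in the list format above.

    Random sampling is only a stand-in for an Optuna trial.
    """
    params = {}
    for name, (kind, *args) in search_space.items():
        if kind in ("float", "uniform"):
            low, high = args
            params[name] = rng.uniform(low, high)
        elif kind == "loguniform":
            low, high = args
            params[name] = math.exp(rng.uniform(math.log(low), math.log(high)))
        elif kind == "int":
            low, high = args
            params[name] = rng.randint(low, high)
        elif kind == "discrete_uniform":
            low, high, step = args
            n_steps = int(round((high - low) / step))
            params[name] = low + step * rng.randint(0, n_steps)
        elif kind == "categorical":
            (choices,) = args  # assumed layout: ["categorical", [choices]]
            params[name] = rng.choice(choices)
        else:
            raise ValueError(f"unknown suggestion type: {kind}")
    return params


search_space = {
    "learning_rate_init": ["float", 1e-5, 1e-3],
    "power_t": ["discrete_uniform", 0.2, 0.8, 0.1],
    "momentum": ["float", 0.0, 1.0],
}
rng = random.Random(0)
samples = [sample_params(search_space, rng) for _ in range(50)]
```

Each sampled dictionary has the same keys as the search space, which is the shape an objective function would receive before assessing the model with those parameters.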
qsprpred.models.model module
This module holds the base class for QSPR models; concrete model types should subclass it.
- class qsprpred.models.model.QSPRModel(base_dir: str, alg: Type | None = None, name: str | None = None, parameters: dict | None = None, autoload=True, random_state: int | None = None)[source]
Bases:
JSONSerializable, ABC
The definition of the common model interface for the package.
The QSPRModel handles model initialization, fitting, predicting and saving.
- Variables:
name (str) – name of the model
data (QSPRDataset) – data set used to train the model
alg (Type) – estimator class
parameters (dict) – dictionary of algorithm specific parameters
estimator (Any) – the underlying estimator instance of the type specified in QSPRModel.alg, if QSPRModel.fit or optimization was performed
featureCalculators (MoleculeDescriptorsCalculator) – feature calculator instance taken from the data set or deserialized from file if the model is loaded without data
featureStandardizer (SKLearnStandardizer) – feature standardizer instance taken from the data set or deserialized from file if the model is loaded without data
baseDir (str) – base directory of the model; the model files are stored in the subdirectory {baseDir}/{name}/
earlyStopping (EarlyStopping) – early stopping tracker for training of QSPRpred models that support early stopping (e.g. neural networks)
randomState (int) – Random state to use for all random operations for reproducibility.
Initialize a QSPR model instance.
If the model is loaded from file, the data set is not required. Note that the data set is required for fitting and optimization.
- Parameters:
base_dir (str) – base directory of the model; the model files are stored in the subdirectory {baseDir}/{name}/
alg (Type) – estimator class
name (str) – name of the model
parameters (dict) – dictionary of algorithm specific parameters
autoload (bool) – if True, the estimator is loaded from the serialized file if it exists, otherwise a new instance of alg is created
random_state (int) – Random state to use for shuffling and other random operations.
- checkData(ds: QSPRDataset, exception: bool = True) → bool [source]
Check if the model has a data set.
- Parameters:
ds (QSPRDataset) – data set to check
exception (bool) – if true, an exception is raised if no data is set
- Returns:
True if data is set, False otherwise (if exception is False)
- Return type:
bool
- property classPath: str
Return the fully qualified class path of the model.
- Returns:
class path of the model
- Return type:
str
- convertToNumpy(X: DataFrame | ndarray | QSPRDataset, y: DataFrame | ndarray | QSPRDataset | None = None) → tuple[numpy.ndarray, numpy.ndarray] | ndarray [source]
Convert the given data matrix and target matrix to np.ndarray format.
- Parameters:
X (pd.DataFrame, np.ndarray, QSPRDataset) – data matrix
y (pd.DataFrame, np.ndarray, QSPRDataset) – target matrix
- Returns:
data matrix and/or target matrix in np.ndarray format
- createPredictionDatasetFromMols(mols: list[str | rdkit.Chem.rdchem.Mol], smiles_standardizer: str | Callable[[str], str] = 'chembl', n_jobs: int = 1, fill_value: float = nan) → tuple[qsprpred.data.tables.qspr.QSPRDataset, numpy.ndarray] [source]
Create a QSPRDataset instance from a list of SMILES strings.
- Parameters:
- Returns:
a tuple containing the QSPRDataset instance and a boolean mask indicating which molecules failed to be processed
- Return type:
tuple[QSPRDataset, np.ndarray]
- abstract fit(X: DataFrame | ndarray, y: DataFrame | ndarray, estimator: Any = None, mode: EarlyStoppingMode = EarlyStoppingMode.NOT_RECORDING, monitor: FitMonitor = None, **kwargs) → Any | tuple[Any, int] | None [source]
Fit the model to the given data matrix or QSPRDataset.
Note. convertToNumpy can be called here, to convert the input data to np.ndarray format.
Note. if no estimator is given, the estimator instance of the model is used.
Note. if a model supports early stopping, the fit function should have the early_stopping decorator and the mode argument should be used to set the early stopping mode. If the model does not support early stopping, the mode argument is ignored.
- Parameters:
X (pd.DataFrame, np.ndarray) – data matrix to fit
y (pd.DataFrame, np.ndarray) – target matrix to fit
estimator (Any) – estimator instance to use for fitting
mode (EarlyStoppingMode) – early stopping mode
monitor (FitMonitor) – monitor for the fitting process, if None, the base monitor is used
kwargs – additional arguments to pass to the fit method of the estimator
- Returns:
fitted estimator instance; in case of early stopping, also the number of iterations (int) after which the model stopped training
- Return type:
Any
- fitDataset(ds: QSPRDataset, monitor=None, mode=EarlyStoppingMode.OPTIMAL, save_model=True, save_data=False, **kwargs) → str [source]
Train model on the whole attached data set.
** IMPORTANT ** For models that support EarlyStopping, CrossValAssessor should be run first, so that the average number of epochs from the cross-validation with early stopping can be used for fitting the model.
- Parameters:
ds (QSPRDataset) – data set to fit this model on
monitor (FitMonitor) – monitor for the fitting process, if None, the base monitor is used
mode (EarlyStoppingMode) – early stopping mode for models that support early stopping. By default, the model is fitted for the ‘optimal’ number of epochs at which training previously stopped during model assessment on the train or test set, which avoids setting aside extra data for a validation set.
save_model (bool) – save the model to file
save_data (bool) – save the supplied dataset to file
kwargs – additional arguments to pass to fit
- Returns:
path to the saved model, if save_model is True
- Return type:
str
- getParameters(new_parameters) → dict | None [source]
Get the model parameters combined with the given parameters.
If both the model and the given parameters contain the same key, the value from the given parameters is used.
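The precedence rule described here can be sketched with plain dictionaries. `merge_parameters` is a hypothetical helper, not the library method, and returning None when both inputs are None is an assumption drawn from the `dict | None` return annotation.

```python
def merge_parameters(model_parameters, new_parameters):
    """Combine two parameter dicts; values in new_parameters win on conflicts."""
    if model_parameters is None and new_parameters is None:
        return None  # assumption: nothing to combine
    merged = dict(model_parameters or {})
    merged.update(new_parameters or {})
    return merged


current = {"n_estimators": 100, "max_depth": 5}
combined = merge_parameters(current, {"max_depth": 8, "n_jobs": 2})
```

Because a copy is updated, the model's stored parameters (`current` here) are left unchanged by the merge.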
- static handleInvalidsInPredictions(mols: list[str], predictions: ndarray | list[numpy.ndarray], failed_mask: ndarray) → ndarray [source]
Replace invalid predictions with None.
- Parameters:
mols (MoleculeTable) – molecules for which the predictions were made
predictions (np.ndarray) – predictions made by the model
failed_mask (np.ndarray) – boolean mask of failed predictions
- Returns:
predictions with invalids replaced by None
- Return type:
np.ndarray
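One way to picture this step is with plain lists instead of arrays. The sketch assumes that `predictions` holds rows only for the molecules that were processed successfully and that `failed_mask` has one entry per input molecule; `fill_invalid_predictions` is a hypothetical helper, not the library method.

```python
def fill_invalid_predictions(mols, predictions, failed_mask):
    """Return one entry per input molecule, with None where processing failed.

    Assumes `predictions` only holds rows for molecules that did not fail.
    """
    if len(mols) != len(failed_mask):
        raise ValueError("failed_mask needs one entry per molecule")
    valid = iter(predictions)
    return [None if failed else next(valid) for failed in failed_mask]


mols = ["CCO", "not-a-smiles", "c1ccccc1"]
failed_mask = [False, True, False]
predictions = [[6.2], [7.1]]  # made-up values for the two valid molecules
aligned = fill_invalid_predictions(mols, predictions, failed_mask)
```

The result lines up with the input list again, so callers can index predictions by the position of the original molecule.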
- initFromDataset(data: QSPRDataset | None)[source]
- initRandomState(random_state)[source]
Set random state if applicable. Defaults to the random state of the dataset if no random state is provided.
- Parameters:
random_state (int) – Random state to use for shuffling and other random operations.
- property isMultiTask: bool
Return whether the model is a multitask model, taken from the data set or deserialized from file if the model is loaded without data.
- Returns:
True if model is a multitask model
- Return type:
bool
- abstract loadEstimator(params: dict | None = None) → object [source]
Initialize estimator instance with the given parameters.
If params is None, the default parameters will be used.
- abstract loadEstimatorFromFile(params: dict | None = None) → object [source]
Load estimator instance from file and apply the given parameters.
- classmethod loadParamsGrid(fname: str, optim_type: str, model_types: str) → ndarray [source]
Load parameter grids for Bayesian or grid search parameter optimization from a JSON file.
- Parameters:
- Returns:
array with three columns containing modeltype, optimization type (grid or bayes) and model type
- Return type:
np.ndarray
- property optimalEpochs: int | None
Return the optimal number of epochs for early stopping.
- Returns:
optimal number of epochs
- Return type:
int | None
- property outDir: str
Return output directory of the model; the model files are stored in this directory ({baseDir}/{name}).
- Returns:
output directory of the model
- Return type:
str
- property outPrefix: str
Return output prefix of the model files.
The model files are stored with this prefix (e.g. {outPrefix}_meta.json).
- Returns:
output prefix of the model files
- Return type:
str
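The directory layout implied by baseDir, outDir and outPrefix can be sketched as follows. The form of outDir ({baseDir}/{name}) and the `_meta.json` suffix come from the descriptions above; that outPrefix is the model name inside outDir is an assumption, and `model_paths` is a hypothetical helper.

```python
from pathlib import PurePosixPath


def model_paths(base_dir, name):
    """Build the output directory, file prefix and metafile path of a model."""
    out_dir = PurePosixPath(base_dir) / name   # {baseDir}/{name}
    out_prefix = out_dir / name                # assumption: {outDir}/{name}
    meta_file = f"{out_prefix}_meta.json"      # {outPrefix}_meta.json
    return str(out_dir), str(out_prefix), meta_file


out_dir, out_prefix, meta_file = model_paths("models", "MLP")
```

`PurePosixPath` is used only to keep the example output independent of the operating system.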
- abstract predict(X: DataFrame | ndarray | QSPRDataset, estimator: Any = None) → ndarray [source]
Make predictions for the given data matrix or QSPRDataset.
Note. convertToNumpy can be called here, to convert the input data to np.ndarray format.
Note. if no estimator is given, the estimator instance of the model is used.
- Parameters:
X (pd.DataFrame, np.ndarray, QSPRDataset) – data matrix to predict
estimator (Any) – estimator instance to use for predicting
- Returns:
2D array containing the predictions, where each row corresponds to a sample in the data and each column to a target property
- Return type:
np.ndarray
- predictDataset(dataset: QSPRDataset, use_probas: bool = False) → ndarray | list[numpy.ndarray] [source]
Make predictions for the given dataset.
- Parameters:
dataset – a QSPRDataset instance
use_probas – use probabilities if this is a classification model
- Returns:
an array of predictions or a list of arrays of predictions (for classification models with use_probas=True)
- Return type:
np.ndarray | list[np.ndarray]
- predictMols(mols: List[str | Mol], use_probas: bool = False, smiles_standardizer: str | callable = 'chembl', n_jobs: int = 1, fill_value: float = nan, use_applicability_domain: bool = False) → ndarray | list[numpy.ndarray] [source]
Make predictions for the given molecules.
- Parameters:
mols (List[str | Mol]) – list of SMILES strings
use_probas (bool) – use probabilities for classification models
smiles_standardizer – either chembl, old, or a partial function that reads and standardizes SMILES.
n_jobs – Number of jobs to use for parallel processing.
fill_value – Value to use for missing values in the feature matrix.
use_applicability_domain – Use applicability domain to return if a molecule is within the applicability domain of the model.
- Returns:
- an array of predictions or a list of arrays of predictions (for classification models with use_probas=True)
- np.ndarray[bool]: boolean mask indicating which molecules fall within the applicability domain of the model
- Return type:
np.ndarray | list[np.ndarray]
- abstract predictProba(X: DataFrame | ndarray | QSPRDataset, estimator: Any = None) → list[numpy.ndarray] [source]
Make predictions for the given data matrix or QSPRDataset, but use probabilities for classification models. Does not work with regression models.
Note. convertToNumpy can be called here, to convert the input data to np.ndarray format.
Note. if no estimator is given, the estimator instance of the model is used.
- Parameters:
X (pd.DataFrame, np.ndarray, QSPRDataset) – data matrix to predict
estimator (Any) – estimator instance to use for predicting
- Returns:
a list of 2D arrays containing the probabilities for each class, where each array corresponds to a target property, each row to a sample in the data and each column to a class
- Return type:
list[np.ndarray]
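The nesting of the returned probabilities (one 2D array per target property, one row per sample, one column per class) can be illustrated with nested lists in place of arrays. The numbers are made up, and `predicted_classes` is a hypothetical helper that picks the most probable class per sample.

```python
# Two target properties, three samples each: the first target has two
# classes, the second has three. All values are made up.
probas = [
    [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]],
    [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6], [0.2, 0.5, 0.3]],
]


def predicted_classes(probas):
    """Pick the most probable class index per sample, for each target."""
    return [
        [max(range(len(row)), key=row.__getitem__) for row in target]
        for target in probas
    ]


classes = predicted_classes(probas)
n_targets = len(probas)
n_samples = len(probas[0])
```

Ties, such as the third sample of the first target, resolve to the lowest class index in this sketch.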
- save(save_estimator=False)[source]
Save model to file.
- Parameters:
save_estimator (bool) – Explicitly save the estimator to file, if True. Note that some models may save the estimator by default even if this argument is False.
- Returns:
absolute path to the metafile of the saved model
absolute path to the saved estimator (str), if save_estimator is True
- Return type:
- abstract saveEstimator() → str [source]
Save the underlying estimator to file.
- Returns:
absolute path to the saved estimator
- Return type:
path (str)
- setParams(params: dict | None, reset_estimator: bool = True)[source]
Set model parameters. The estimator is also updated with the new parameters if ‘reset_estimator’ is True.
- abstract property supportsEarlyStopping: bool
Return whether the model supports early stopping.
- Returns:
True if the model supports early stopping
- Return type:
bool
- property task: ModelTasks
Return the task of the model, taken from the data set or deserialized from file if the model is loaded without data.
- Returns:
task of the model
- Return type:
ModelTasks
qsprpred.models.monitors module
- class qsprpred.models.monitors.AssessorMonitor[source]
Bases:
FitMonitor
Base class for monitoring the assessment of a model.
- abstract onAssessmentEnd(predictions: DataFrame)[source]
Called after the assessment has finished.
- Parameters:
predictions (pd.DataFrame) – predictions of the assessment
- abstract onAssessmentStart(model: QSPRModel, data: QSPRDataset, assesment_type: str)[source]
Called before the assessment has started.
- Parameters:
model (QSPRModel) – model to assess
data (QSPRDataset) – data set used in assessment
assesment_type (str) – type of assessment
- abstract onBatchStart(batch: int)
Called before each batch of the training.
- Parameters:
batch (int) – index of the current batch
- abstract onEpochEnd(epoch: int, train_loss: float, val_loss: float | None = None)
Called after each epoch of the training.
- abstract onEpochStart(epoch: int)
Called before each epoch of the training.
- Parameters:
epoch (int) – index of the current epoch
- abstract onFitEnd(estimator: Any, best_epoch: int | None = None)
Called after the training has finished.
- Parameters:
estimator (Any) – estimator that was fitted
best_epoch (int | None) – index of the best epoch
- abstract onFitStart(model: QSPRModel, X_train: ndarray, y_train: ndarray, X_val: ndarray | None = None, y_val: ndarray | None = None)
Called before the training has started.
- Parameters:
model (QSPRModel) – model to be fitted
X_train (np.ndarray) – training data
y_train (np.ndarray) – training targets
X_val (np.ndarray | None) – validation data, used for early stopping
y_val (np.ndarray | None) – validation targets, used for early stopping
- abstract onFoldEnd(model_fit: Any | tuple[Any, int], fold_predictions: DataFrame)[source]
Called after each fold of the assessment.
- abstract onFoldStart(fold: int, X_train: ndarray, y_train: ndarray, X_test: ndarray, y_test: ndarray)[source]
Called before each fold of the assessment.
- Parameters:
fold (int) – index of the current fold
X_train (np.ndarray) – training data of the current fold
y_train (np.ndarray) – training targets of the current fold
X_test (np.ndarray) – test data of the current fold
y_test (np.ndarray) – test targets of the current fold
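The order in which the fit callbacks fire can be sketched with a toy training loop. Nothing here uses the library: `RecordingMonitor` only borrows the callback names documented above, `toy_fit` is a hypothetical loop, and the patience-based stopping rule and the made-up losses are assumptions chosen for illustration.

```python
class RecordingMonitor:
    """Hypothetical monitor using the callback names documented above."""

    def __init__(self):
        self.events = []
        self.fit_log = []
        self.best_epoch = None

    def onFitStart(self, model, X_train, y_train, X_val=None, y_val=None):
        self.events.append("fit_start")

    def onEpochStart(self, epoch):
        self.events.append(f"epoch_start:{epoch}")

    def onBatchStart(self, batch):
        self.events.append(f"batch_start:{batch}")

    def onEpochEnd(self, epoch, train_loss, val_loss=None):
        self.fit_log.append((epoch, train_loss, val_loss))

    def onFitEnd(self, estimator, best_epoch=None):
        self.events.append("fit_end")
        self.best_epoch = best_epoch


def toy_fit(monitor, val_losses, patience=2):
    """Toy loop: stop once the validation loss has not improved for `patience` epochs."""
    monitor.onFitStart(model=None, X_train=[], y_train=[])
    best_epoch, best_loss = None, float("inf")
    for epoch, val_loss in enumerate(val_losses):
        monitor.onEpochStart(epoch)
        monitor.onBatchStart(0)  # a single batch per epoch in this sketch
        monitor.onEpochEnd(epoch, train_loss=val_loss / 2, val_loss=val_loss)
        if val_loss < best_loss:
            best_epoch, best_loss = epoch, val_loss
        elif epoch - best_epoch >= patience:
            break
    monitor.onFitEnd(estimator=None, best_epoch=best_epoch)
    return best_epoch


monitor = RecordingMonitor()
stopped_best = toy_fit(monitor, [1.0, 0.8, 0.7, 0.75, 0.9, 0.6])
```

In this run the loop stops before reaching the last loss value, so the final entry of the list is never seen by the monitor.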
- class qsprpred.models.monitors.BaseMonitor[source]
Bases:
HyperparameterOptimizationMonitor
Base monitor for the fitting, training and optimization of a model.
Information about the fitting, training and optimization process is stored internally, but not logged. This class can be used as a base class for other monitors that do log the information elsewhere.
If used to monitor hyperparameter optimization, the information about the underlying assessments and fits is stored in the assessments and fits attributes, respectively. If used to monitor assessment, the information about the fits is stored in the fits attribute.
- Variables:
config (dict) – configuration of the hyperparameter optimization
bestScore (float) – best score found during optimization
bestParameters (dict) – best parameters found during optimization
assessments (dict) – dictionary of assessments, keyed by the iteration number (each assessment includes: assessmentModel, assessmentDataset, foldData, predictions, estimators, fits)
scores (pd.DataFrame) – scores for each hyperparameter search iteration
model (QSPRModel) – model to optimize
data (QSPRDataset) – dataset used in optimization
assessmentType (str) – type of current assessment
assessmentModel (QSPRModel) – model to assess in current assessment
assessmentDataset (QSPRDataset) – data set used in current assessment
foldData (dict) – dictionary of input data, keyed by the fold index, of the current assessment
predictions (pd.DataFrame) – predictions for the dataset of the current assessment
estimators (dict) – dictionary of fitted estimators, keyed by the fold index of the current assessment
currentFold (int) – index of the current fold of the current assessment
fits (dict) – dictionary of fit data, keyed by the fold index of the current assessment (each fit includes: fitData, fitLog, batchLog, bestEstimator, bestEpoch)
fitData (dict) – dictionary of input data of the current fit of the current assessment
fitModel (QSPRModel) – model to fit in current fit of the current assessment
fitLog (pd.DataFrame) – log of the training process of the current fit of the current assessment
batchLog (pd.DataFrame) – log of the training process per batch of the current fit of the current assessment
currentEpoch (int) – index of the current epoch of the current fit of the current assessment
currentBatch (int) – index of the current batch of the current fit of the current assessment
bestEstimator (Any) – best estimator of the current fit of the current assessment
bestEpoch (int) – index of the best epoch of the current fit of the current assessment
- onAssessmentEnd(predictions: DataFrame)[source]
Called after the assessment has finished.
- Parameters:
predictions (pd.DataFrame) – predictions of the assessment
- onAssessmentStart(model: QSPRModel, data: QSPRDataset, assesment_type: str)[source]
Called before the assessment has started.
- Parameters:
model (QSPRModel) – model to assess
data (QSPRDataset) – data set used in assessment
assesment_type (str) – type of assessment
- onBatchStart(batch: int)[source]
Called before each batch of the training.
- Parameters:
batch (int) – index of the current batch
- onEpochEnd(epoch: int, train_loss: float, val_loss: float | None = None)[source]
Called after each epoch of the training.
- onEpochStart(epoch: int)[source]
Called before each epoch of the training.
- Parameters:
epoch (int) – index of the current epoch
- onFitEnd(estimator: Any, best_epoch: int | None = None)[source]
Called after the training has finished.
- Parameters:
estimator (Any) – estimator that was fitted
best_epoch (int | None) – index of the best epoch
- onFitStart(model: QSPRModel, X_train: ndarray, y_train: ndarray, X_val: ndarray | None = None, y_val: ndarray | None = None)[source]
Called before the training has started.
- Parameters:
model (QSPRModel) – model to be fitted
data (QSPRDataset) – data set used in training
X_train (np.ndarray) – training data
y_train (np.ndarray) – training targets
X_val (np.ndarray | None) – validation data, used for early stopping
y_val (np.ndarray | None) – validation targets, used for early stopping
- onFoldEnd(model_fit: Any | tuple[Any, int], fold_predictions: DataFrame)[source]
Called after each fold of the assessment.
- onFoldStart(fold: int, X_train: ndarray, y_train: ndarray, X_test: ndarray, y_test: ndarray)[source]
Called before each fold of the assessment.
- Parameters:
fold (int) – index of the current fold
X_train (np.ndarray) – training data of the current fold
y_train (np.ndarray) – training targets of the current fold
X_test (np.ndarray) – test data of the current fold
y_test (np.ndarray) – test targets of the current fold
- onIterationEnd(score: float, scores: list[float])[source]
Called after each iteration of the hyperparameter optimization.
- onIterationStart(params: dict)[source]
Called before each iteration of the hyperparameter optimization.
- Parameters:
params (dict) – parameters used for the current iteration
- onOptimizationEnd(best_score: float, best_parameters: dict)[source]
Called after the hyperparameter optimization has finished.
- onOptimizationStart(model: QSPRModel, data: QSPRDataset, config: dict, optimization_type: str)[source]
Called before the hyperparameter optimization has started.
- Parameters:
model (QSPRModel) – model to optimize
data (QSPRDataset) – data set used in optimization
config (dict) – configuration of the hyperparameter optimization
optimization_type (str) – type of hyperparameter optimization
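The nested bookkeeping described for this class (assessments keyed by iteration number, fits keyed by fold index) can be pictured with a small stand-in. `OptimizationLog` is hypothetical and stores far less than the attributes listed above; the trial parameters and scores are made up.

```python
class OptimizationLog:
    """Hypothetical stand-in that mirrors the callback names documented above."""

    def __init__(self):
        self.assessments = {}  # iteration number -> data of that assessment
        self.scores = []       # one entry per search iteration
        self.bestScore = None
        self.bestParameters = None
        self._iteration = -1
        self._fold = None

    def onIterationStart(self, params):
        self._iteration += 1
        self.assessments[self._iteration] = {"params": params, "fits": {}}

    def onFoldStart(self, fold, X_train, y_train, X_test, y_test):
        self._fold = fold

    def onFoldEnd(self, model_fit, fold_predictions):
        # Fit data is keyed by the fold index of the current assessment.
        self.assessments[self._iteration]["fits"][self._fold] = model_fit

    def onIterationEnd(self, score, scores):
        self.scores.append({"aggregated": score, "folds": list(scores)})

    def onOptimizationEnd(self, best_score, best_parameters):
        self.bestScore = best_score
        self.bestParameters = best_parameters


log = OptimizationLog()
trials = [({"alpha": 0.1}, [0.70, 0.74]), ({"alpha": 1.0}, [0.80, 0.84])]
for params, fold_scores in trials:
    log.onIterationStart(params)
    for fold, _ in enumerate(fold_scores):
        log.onFoldStart(fold, [], [], [], [])
        log.onFoldEnd(model_fit=f"estimator-{fold}", fold_predictions=None)
    log.onIterationEnd(sum(fold_scores) / len(fold_scores), fold_scores)

best = max(range(len(trials)), key=lambda i: log.scores[i]["aggregated"])
log.onOptimizationEnd(log.scores[best]["aggregated"], trials[best][0])
```

Keeping the per-fold data inside the per-iteration entry means that results of earlier iterations stay available after the search has moved on.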
- class qsprpred.models.monitors.FileMonitor(save_optimization: bool = True, save_assessments: bool = True, save_fits: bool = True)[source]
Bases:
BaseMonitor
Monitor hyperparameter optimization, assessment and fitting to files.
- Parameters:
- onAssessmentEnd(predictions: DataFrame)[source]
Called after the assessment has finished.
- Parameters:
predictions (pd.DataFrame) – predictions of the assessment
- onAssessmentStart(model: QSPRModel, data: QSPRDataset, assesment_type: str)[source]
Called before the assessment has started.
- Parameters:
model (QSPRModel) – model to assess
data (QSPRDataset) – data set used in assessment
assesment_type (str) – type of assessment
- onBatchStart(batch: int)
Called before each batch of the training.
- Parameters:
batch (int) – index of the current batch
- onEpochEnd(epoch: int, train_loss: float, val_loss: float | None = None)
Called after each epoch of the training.
- onEpochStart(epoch: int)
Called before each epoch of the training.
- Parameters:
epoch (int) – index of the current epoch
- onFitEnd(estimator: Any, best_epoch: int | None = None)[source]
Called after the training has finished.
- Parameters:
estimator (Any) – estimator that was fitted
best_epoch (int | None) – index of the best epoch
- onFitStart(model: QSPRModel, X_train: ndarray, y_train: ndarray, X_val: ndarray | None = None, y_val: ndarray | None = None)[source]
Called before the training has started.
- Parameters:
model (QSPRModel) – model to be fitted
X_train (np.ndarray) – training data
y_train (np.ndarray) – training targets
X_val (np.ndarray | None) – validation data, used for early stopping
y_val (np.ndarray | None) – validation targets, used for early stopping
- onFoldEnd(model_fit: Any | tuple[Any, int], fold_predictions: DataFrame)
Called after each fold of the assessment.
- onFoldStart(fold: int, X_train: ndarray, y_train: ndarray, X_test: ndarray, y_test: ndarray)
Called before each fold of the assessment.
- Parameters:
fold (int) – index of the current fold
X_train (np.ndarray) – training data of the current fold
y_train (np.ndarray) – training targets of the current fold
X_test (np.ndarray) – test data of the current fold
y_test (np.ndarray) – test targets of the current fold
- onIterationEnd(score: float, scores: list[float])[source]
Called after each iteration of the hyperparameter optimization.
- onIterationStart(params: dict)[source]
Called before each iteration of the hyperparameter optimization.
- Parameters:
params (dict) – parameters used for the current iteration
- onOptimizationEnd(best_score: float, best_parameters: dict)
Called after the hyperparameter optimization has finished.
- onOptimizationStart(model: QSPRModel, data: QSPRDataset, config: dict, optimization_type: str)[source]
Called before the hyperparameter optimization has started.
- Parameters:
model (QSPRModel) – model to optimize
data (QSPRDataset) – data set used in optimization
config (dict) – configuration of the hyperparameter optimization
optimization_type (str) – type of hyperparameter optimization
- class qsprpred.models.monitors.FitMonitor[source]
Bases:
JSONSerializable, ABC
Base class for monitoring the fitting of a model.
- abstract onBatchStart(batch: int)[source]
Called before each batch of the training.
- Parameters:
batch (int) – index of the current batch
- abstract onEpochEnd(epoch: int, train_loss: float, val_loss: float | None = None)[source]
Called after each epoch of the training.
- abstract onEpochStart(epoch: int)[source]
Called before each epoch of the training.
- Parameters:
epoch (int) – index of the current epoch
- abstract onFitEnd(estimator: Any, best_epoch: int | None = None)[source]
Called after the training has finished.
- Parameters:
estimator (Any) – estimator that was fitted
best_epoch (int | None) – index of the best epoch
- abstract onFitStart(model: QSPRModel, X_train: ndarray, y_train: ndarray, X_val: ndarray | None = None, y_val: ndarray | None = None)[source]
Called before the training has started.
- Parameters:
model (QSPRModel) – model to be fitted
X_train (np.ndarray) – training data
y_train (np.ndarray) – training targets
X_val (np.ndarray | None) – validation data, used for early stopping
y_val (np.ndarray | None) – validation targets, used for early stopping
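The hook names above suggest a call order: onFitStart once, then onEpochStart, onBatchStart and onEpochEnd inside the training loop, then onFitEnd once. The following is a minimal stand-alone sketch of that flow. LossHistoryMonitor and toy_training_loop are hypothetical names, the loss values are made up, and nothing here imports qsprpred; a real monitor would subclass FitMonitor instead of being a plain class.

```python
class LossHistoryMonitor:
    """Hypothetical monitor that mirrors the FitMonitor hook names."""

    def __init__(self):
        self.history = []  # (epoch, train_loss, val_loss) tuples
        self.n_batches = 0
        self.best_epoch = None

    def onFitStart(self, model, X_train, y_train, X_val=None, y_val=None):
        # Reset the state so the monitor can be reused for several fits.
        self.history.clear()
        self.n_batches = 0
        self.best_epoch = None

    def onEpochStart(self, epoch):
        pass  # nothing to record before an epoch in this sketch

    def onBatchStart(self, batch):
        self.n_batches += 1

    def onEpochEnd(self, epoch, train_loss, val_loss=None):
        self.history.append((epoch, train_loss, val_loss))

    def onFitEnd(self, estimator, best_epoch=None):
        self.best_epoch = best_epoch


def toy_training_loop(monitor, n_epochs=3, n_batches=2):
    """Stand-in for a training loop; it only drives the hooks."""
    monitor.onFitStart(model=None, X_train=[], y_train=[])
    for epoch in range(n_epochs):
        monitor.onEpochStart(epoch)
        for batch in range(n_batches):
            monitor.onBatchStart(batch)
        monitor.onEpochEnd(epoch, train_loss=1.0 / (epoch + 1))
    monitor.onFitEnd(estimator=None, best_epoch=n_epochs - 1)


monitor = LossHistoryMonitor()
toy_training_loop(monitor)
```

After the loop, monitor.history holds one entry per epoch and monitor.best_epoch holds the value passed to onFitEnd.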
- class qsprpred.models.monitors.HyperparameterOptimizationMonitor[source]
Bases:
AssessorMonitor
Base class for monitoring the hyperparameter optimization of a model.
- abstract onAssessmentEnd(predictions: DataFrame)
Called after the assessment has finished.
- Parameters:
predictions (pd.DataFrame) – predictions of the assessment
- abstract onAssessmentStart(model: QSPRModel, data: QSPRDataset, assesment_type: str)
Called before the assessment has started.
- Parameters:
model (QSPRModel) – model to assess
data (QSPRDataset) – data set used in assessment
assesment_type (str) – type of assessment
- abstract onBatchStart(batch: int)
Called before each batch of the training.
- Parameters:
batch (int) – index of the current batch
- abstract onEpochEnd(epoch: int, train_loss: float, val_loss: float | None = None)
Called after each epoch of the training.
- abstract onEpochStart(epoch: int)
Called before each epoch of the training.
- Parameters:
epoch (int) – index of the current epoch
- abstract onFitEnd(estimator: Any, best_epoch: int | None = None)
Called after the training has finished.
- Parameters:
estimator (Any) – estimator that was fitted
best_epoch (int | None) – index of the best epoch
- abstract onFitStart(model: QSPRModel, X_train: ndarray, y_train: ndarray, X_val: ndarray | None = None, y_val: ndarray | None = None)
Called before the training has started.
- Parameters:
model (QSPRModel) – model to be fitted
X_train (np.ndarray) – training data
y_train (np.ndarray) – training targets
X_val (np.ndarray | None) – validation data, used for early stopping
y_val (np.ndarray | None) – validation targets, used for early stopping
- abstract onFoldEnd(model_fit: Any | tuple[Any, int], fold_predictions: DataFrame)
Called after each fold of the assessment.
- abstract onFoldStart(fold: int, X_train: ndarray, y_train: ndarray, X_test: ndarray, y_test: ndarray)
Called before each fold of the assessment.
- Parameters:
fold (int) – index of the current fold
X_train (np.ndarray) – training data of the current fold
y_train (np.ndarray) – training targets of the current fold
X_test (np.ndarray) – test data of the current fold
y_test (np.ndarray) – test targets of the current fold
- abstract onIterationEnd(score: float, scores: list[float])[source]
Called after each iteration of the hyperparameter optimization.
- abstract onIterationStart(params: dict)[source]
Called before each iteration of the hyperparameter optimization.
- Parameters:
params (dict) – parameters used for the current iteration
- abstract onOptimizationEnd(best_score: float, best_parameters: dict)[source]
Called after the hyperparameter optimization has finished.
- abstract onOptimizationStart(model: QSPRModel, data: QSPRDataset, config: dict, optimization_type: str)[source]
Called before the hyperparameter optimization has started.
- Parameters:
model (QSPRModel) – model to optimize
data (QSPRDataset) – data set used in optimization
config (dict) – configuration of the hyperparameter optimization
optimization_type (str) – type of hyperparameter optimization
- class qsprpred.models.monitors.ListMonitor(monitors: list[qsprpred.models.monitors.HyperparameterOptimizationMonitor])[source]
Bases:
HyperparameterOptimizationMonitor
Monitor that combines multiple monitors.
- Variables:
monitors (list[HyperparameterOptimizationMonitor]) – list of monitors
Initialize the monitor.
- Parameters:
monitors (list[HyperparameterOptimizationMonitor]) – list of monitors
- onAssessmentEnd(predictions: DataFrame)[source]
Called after the assessment has finished.
- Parameters:
predictions (pd.DataFrame) – predictions of the assessment
- onAssessmentStart(model: QSPRModel, data: QSPRDataset, assesment_type: str)[source]
Called before the assessment has started.
- Parameters:
model (QSPRModel) – model to assess
data (QSPRDataset) – data set used in assessment
assesment_type (str) – type of assessment
- onBatchStart(batch: int)[source]
Called before each batch of the training.
- Parameters:
batch (int) – index of the current batch
- onEpochEnd(epoch: int, train_loss: float, val_loss: float | None = None)[source]
Called after each epoch of the training.
- onEpochStart(epoch: int)[source]
Called before each epoch of the training.
- Parameters:
epoch (int) – index of the current epoch
- onFitEnd(estimator: Any, best_epoch: int | None = None)[source]
Called after the training has finished.
- Parameters:
estimator (Any) – estimator that was fitted
best_epoch (int | None) – index of the best epoch
- onFitStart(model: QSPRModel, X_train: ndarray, y_train: ndarray, X_val: ndarray | None = None, y_val: ndarray | None = None)[source]
Called before the training has started.
- Parameters:
model (QSPRModel) – model to be fitted
X_train (np.ndarray) – training data
y_train (np.ndarray) – training targets
X_val (np.ndarray | None) – validation data, used for early stopping
y_val (np.ndarray | None) – validation targets, used for early stopping
- onFoldEnd(model_fit: Any | tuple[Any, int], fold_predictions: DataFrame)[source]
Called after each fold of the assessment.
- onFoldStart(fold: int, X_train: ndarray, y_train: ndarray, X_test: ndarray, y_test: ndarray)[source]
Called before each fold of the assessment.
- Parameters:
fold (int) – index of the current fold
X_train (np.ndarray) – training data of the current fold
y_train (np.ndarray) – training targets of the current fold
X_test (np.ndarray) – test data of the current fold
y_test (np.ndarray) – test targets of the current fold
- onIterationEnd(score: float, scores: list[float])[source]
Called after each iteration of the hyperparameter optimization.
- onIterationStart(params: dict)[source]
Called before each iteration of the hyperparameter optimization.
- Parameters:
params (dict) – parameters used for the current iteration
- onOptimizationEnd(best_score: float, best_parameters: dict)[source]
Called after the hyperparameter optimization has finished.
- onOptimizationStart(model: QSPRModel, data: QSPRDataset, config: dict, optimization_type: str)[source]
Called before the hyperparameter optimization has started.
- Parameters:
model (QSPRModel) – model to optimize
data (QSPRDataset) – data set used in optimization
config (dict) – configuration of the hyperparameter optimization
optimization_type (str) – type of hyperparameter optimization
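ListMonitor is described as a monitor that combines multiple monitors. One plausible reading is that every hook call is forwarded to each monitor in the list. The implementation is not shown here, so the sketch below is an assumption that illustrates the pattern with two hooks only; RecordingMonitor and FanOutMonitor are hypothetical stand-ins and do not import qsprpred.

```python
class RecordingMonitor:
    """Hypothetical monitor that stores the hook calls it receives."""

    def __init__(self, name):
        self.name = name
        self.calls = []

    def onIterationStart(self, params):
        self.calls.append(("onIterationStart", params))

    def onIterationEnd(self, score, scores):
        self.calls.append(("onIterationEnd", score, scores))


class FanOutMonitor:
    """Hypothetical combiner: forwards each hook to all wrapped monitors."""

    def __init__(self, monitors):
        self.monitors = list(monitors)

    def onIterationStart(self, params):
        for monitor in self.monitors:
            monitor.onIterationStart(params)

    def onIterationEnd(self, score, scores):
        for monitor in self.monitors:
            monitor.onIterationEnd(score, scores)


first = RecordingMonitor("first")
second = RecordingMonitor("second")
combined = FanOutMonitor([first, second])

# One optimization iteration with made-up parameters and scores.
combined.onIterationStart({"n_estimators": 100})
combined.onIterationEnd(score=0.8, scores=[0.75, 0.85])
```

With this design, code that accepts a single monitor can be given several (for example a file-based one and a Weights & Biases one) without changing the caller.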
- class qsprpred.models.monitors.NullMonitor[source]
Bases:
HyperparameterOptimizationMonitor
Monitor that does nothing.
- onAssessmentEnd(predictions: DataFrame)[source]
Called after the assessment has finished.
- Parameters:
predictions (pd.DataFrame) – predictions of the assessment
- onAssessmentStart(model: QSPRModel, data: QSPRDataset, assesment_type: str)[source]
Called before the assessment has started.
- Parameters:
model (QSPRModel) – model to assess
data (QSPRDataset) – data set used in assessment
assesment_type (str) – type of assessment
- onBatchStart(batch: int)[source]
Called before each batch of the training.
- Parameters:
batch (int) – index of the current batch
- onEpochEnd(epoch: int, train_loss: float, val_loss: float | None = None)[source]
Called after each epoch of the training.
- onEpochStart(epoch: int)[source]
Called before each epoch of the training.
- Parameters:
epoch (int) – index of the current epoch
- onFitEnd(estimator: Any, best_epoch: int | None = None)[source]
Called after the training has finished.
- Parameters:
estimator (Any) – estimator that was fitted
best_epoch (int | None) – index of the best epoch
- onFitStart(model: QSPRModel, X_train: ndarray, y_train: ndarray, X_val: ndarray | None = None, y_val: ndarray | None = None)[source]
Called before the training has started.
- Parameters:
model (QSPRModel) – model to be fitted
X_train (np.ndarray) – training data
y_train (np.ndarray) – training targets
X_val (np.ndarray | None) – validation data, used for early stopping
y_val (np.ndarray | None) – validation targets, used for early stopping
- onFoldEnd(model_fit: Any | tuple[Any, int], fold_predictions: DataFrame)[source]
Called after each fold of the assessment.
- onFoldStart(fold: int, X_train: ndarray, y_train: ndarray, X_test: ndarray, y_test: ndarray)[source]
Called before each fold of the assessment.
- Parameters:
fold (int) – index of the current fold
X_train (np.ndarray) – training data of the current fold
y_train (np.ndarray) – training targets of the current fold
X_test (np.ndarray) – test data of the current fold
y_test (np.ndarray) – test targets of the current fold
- onIterationEnd(score: float, scores: list[float])[source]
Called after each iteration of the hyperparameter optimization.
- onIterationStart(params: dict)[source]
Called before each iteration of the hyperparameter optimization.
- Parameters:
params (dict) – parameters used for the current iteration
- onOptimizationEnd(best_score: float, best_parameters: dict)[source]
Called after the hyperparameter optimization has finished.
- onOptimizationStart(model: QSPRModel, data: QSPRDataset, config: dict, optimization_type: str)[source]
Called before the hyperparameter optimization has started.
- Parameters:
model (QSPRModel) – model to optimize
data (QSPRDataset) – data set used in optimization
config (dict) – configuration of the hyperparameter optimization
optimization_type (str) – type of hyperparameter optimization
- class qsprpred.models.monitors.WandBMonitor(project_name: str, **kwargs)[source]
Bases:
BaseMonitor
Monitor hyperparameter optimization and assessment with Weights & Biases.
- Parameters:
project_name (str) – name of the project to log to
kwargs – additional keyword arguments for wandb.init
- onAssessmentEnd(predictions: DataFrame)
Called after the assessment has finished.
- Parameters:
predictions (pd.DataFrame) – predictions of the assessment
- onAssessmentStart(model: QSPRModel, data: QSPRDataset, assesment_type: str)
Called before the assessment has started.
- Parameters:
model (QSPRModel) – model to assess
data (QSPRDataset) – data set used in assessment
assesment_type (str) – type of assessment
- onBatchStart(batch: int)
Called before each batch of the training.
- Parameters:
batch (int) – index of the current batch
- onEpochEnd(epoch: int, train_loss: float, val_loss: float | None = None)[source]
Called after each epoch of the training.
- onEpochStart(epoch: int)
Called before each epoch of the training.
- Parameters:
epoch (int) – index of the current epoch
- onFitEnd(estimator: Any, best_epoch: int | None = None)[source]
Called after the training has finished.
- Parameters:
estimator (Any) – estimator that was fitted
best_epoch (int | None) – index of the best epoch
- onFitStart(model: QSPRModel, X_train: ndarray, y_train: ndarray, X_val: ndarray | None = None, y_val: ndarray | None = None)[source]
Called before the training has started.
- Parameters:
model (QSPRModel) – model to train
- onFoldEnd(model_fit: Any | tuple[Any, int], fold_predictions: DataFrame)[source]
Called after each fold of the assessment.
- Parameters:
model_fit (Any |tuple[Any, int]) – fitted estimator of the current fold
- onFoldStart(fold: int, X_train: ndarray, y_train: ndarray, X_test: ndarray, y_test: ndarray)[source]
Called before each fold of the assessment.
- Parameters:
fold (int) – index of the current fold
X_train (np.ndarray) – training data of the current fold
y_train (np.ndarray) – training targets of the current fold
X_test (np.ndarray) – test data of the current fold
y_test (np.ndarray) – test targets of the current fold
- onIterationEnd(score: float, scores: list[float])
Called after each iteration of the hyperparameter optimization.
- onIterationStart(params: dict)
Called before each iteration of the hyperparameter optimization.
- Parameters:
params (dict) – parameters used for the current iteration
- onOptimizationEnd(best_score: float, best_parameters: dict)
Called after the hyperparameter optimization has finished.
- onOptimizationStart(model: QSPRModel, data: QSPRDataset, config: dict, optimization_type: str)
Called before the hyperparameter optimization has started.
- Parameters:
model (QSPRModel) – model to optimize
data (QSPRDataset) – data set used in optimization
config (dict) – configuration of the hyperparameter optimization
optimization_type (str) – type of hyperparameter optimization
qsprpred.models.scikit_learn module
This module contains the QSPRModel classes.
At the moment there is a class for sklearn type models. A class for a PyTorch DNN model can be found in qsprpred.deep. To add more model types, add a model class that implements the QSPRModel interface.
- class qsprpred.models.scikit_learn.SklearnModel(base_dir: str, alg=None, name: str | None = None, parameters: dict | None = None, autoload: bool = True, random_state: int | None = None)[source]
Bases:
QSPRModel
QSPRModel class for sklearn type models.
Wrap your sklearn model class in this class to use it with the QSPRModel interface.
Initialize SklearnModel model.
- Parameters:
- checkData(ds: QSPRDataset, exception: bool = True) bool
Check if the model has a data set.
- Parameters:
ds (QSPRDataset) – data set to check
exception (bool) – if true, an exception is raised if no data is set
- Returns:
True if data is set, False otherwise (if exception is False)
- Return type:
- property classPath: str
Return the fully qualified path of the model.
- Returns:
class path of the model
- Return type:
- cleanFiles()
Clean up the model files.
Removes the model directory and all its contents.
- convertToNumpy(X: DataFrame | ndarray | QSPRDataset, y: DataFrame | ndarray | QSPRDataset | None = None) tuple[numpy.ndarray, numpy.ndarray] | ndarray
Convert the given data matrix and target matrix to np.ndarray format.
- Parameters:
X (pd.DataFrame, np.ndarray, QSPRDataset) – data matrix
y (pd.DataFrame, np.ndarray, QSPRDataset) – target matrix
- Returns:
data matrix and/or target matrix in np.ndarray format
- createPredictionDatasetFromMols(mols: list[str | rdkit.Chem.rdchem.Mol], smiles_standardizer: str | Callable[[str], str] = 'chembl', n_jobs: int = 1, fill_value: float = nan) tuple[qsprpred.data.tables.qspr.QSPRDataset, numpy.ndarray]
Create a QSPRDataset instance from a list of SMILES strings.
- Parameters:
- Returns:
a tuple containing the QSPRDataset instance and a boolean mask indicating which molecules failed to be processed
- Return type:
- fit(X: DataFrame | ndarray, y: DataFrame | ndarray, estimator: Any = None, mode: Any = None, monitor: None = None, **kwargs)[source]
Fit the model to the given data matrix or QSPRDataset.
Note. convertToNumpy can be called here, to convert the input data to np.ndarray format.
Note. if no estimator is given, the estimator instance of the model is used.
Note. if a model supports early stopping, the fit function should have the early_stopping decorator and the mode argument should be used to set the early stopping mode. If the model does not support early stopping, the mode argument is ignored.
- Parameters:
X (pd.DataFrame, np.ndarray) – data matrix to fit
y (pd.DataFrame, np.ndarray) – target matrix to fit
estimator (Any) – estimator instance to use for fitting
mode (EarlyStoppingMode) – early stopping mode
monitor (FitMonitor) – monitor for the fitting process, if None, the base monitor is used
kwargs – additional arguments to pass to the fit method of the estimator
- Returns:
fitted estimator instance
int: in case of early stopping, the number of iterations after which the model stopped training
- Return type:
Any
- fitDataset(ds: QSPRDataset, monitor=None, mode=EarlyStoppingMode.OPTIMAL, save_model=True, save_data=False, **kwargs) str
Train model on the whole attached data set.
** IMPORTANT ** For models that support early stopping, CrossValAssessor should be run first, so that the average number of epochs from the cross-validation with early stopping can be used for fitting the model.
- Parameters:
ds (QSPRDataset) – data set to fit this model on
monitor (FitMonitor) – monitor for the fitting process, if None, the base monitor is used
mode (EarlyStoppingMode) – early stopping mode for models that support early stopping, by default fit the ‘optimal’ number of epochs previously stopped at in model assessment on train or test set, to avoid the use of extra data for a validation set.
save_model (bool) – save the model to file
save_data (bool) – save the supplied dataset to file
kwargs – additional arguments to pass to fit
- Returns:
path to the saved model, if save_model is True
- Return type:
- getParameters(new_parameters) dict | None
Get the model parameters combined with the given parameters.
If both the model and the given parameters contain the same key, the value from the given parameters is used.
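The precedence rule above (values from the given parameters win on a key clash) matches an ordinary dictionary update. The helper below is a hypothetical stand-alone sketch of that rule and is not the library's code; the parameter names in the example are made up, and the case in which the real method returns None is not covered.

```python
def merge_parameters(model_parameters, new_parameters):
    """Hypothetical helper: combine two parameter dicts, new values win."""
    merged = dict(model_parameters or {})  # copy, so the input is not mutated
    merged.update(new_parameters or {})    # keys in new_parameters override
    return merged


model_parameters = {"n_estimators": 100, "max_depth": 5}
new_parameters = {"max_depth": 10, "min_samples_leaf": 2}
merged = merge_parameters(model_parameters, new_parameters)
```

Here merged keeps n_estimators from the model, takes max_depth from the given parameters, and gains min_samples_leaf; the two input dictionaries are left unchanged.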
- static handleInvalidsInPredictions(mols: list[str], predictions: ndarray | list[numpy.ndarray], failed_mask: ndarray) ndarray
Replace invalid predictions with None.
- Parameters:
mols (MoleculeTable) – molecules for which the predictions were made
predictions (np.ndarray) – predictions made by the model
failed_mask (np.ndarray) – boolean mask of failed predictions
- Returns:
predictions with invalids replaced by None
- Return type:
np.ndarray
- initFromDataset(data: QSPRDataset | None)
- initRandomState(random_state)
Set random state if applicable. Defaults to random state of dataset if no random state is provided.
- Parameters:
random_state (int) – Random state to use for shuffling and other random operations.
- property isMultiTask: bool
Return if model is a multitask model, taken from the data set or deserialized from file if the model is loaded without data.
- Returns:
True if model is a multitask model
- Return type:
- loadEstimator(params: dict | None = None) Any [source]
Load estimator from alg and params.
- Parameters:
params (dict) – parameters
- loadEstimatorFromFile(params: dict | None = None, fallback_load: bool = True)[source]
Load estimator from file.
- classmethod loadParamsGrid(fname: str, optim_type: str, model_types: str) ndarray
Load parameter grids for bayes or grid search parameter optimization from json file.
- Parameters:
- Returns:
array with three columns containing modeltype, optimization type (grid or bayes) and model type
- Return type:
np.ndarray
- property optimalEpochs: int | None
Return the optimal number of epochs for early stopping.
- Returns:
optimal number of epochs
- Return type:
int | None
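The early stopping tracker described at the top of this page derives the optimal number of epochs by aggregating the epochs recorded in earlier trainings with early stopping, using the mean by default. The sketch below illustrates that idea with the standard library only. aggregate_epochs is a hypothetical helper, the epoch counts are made up, and both the rounding to an integer and the None result for an empty record are assumptions made for this example.

```python
from statistics import mean


def aggregate_epochs(trained_epochs, aggregate_func=mean):
    """Hypothetical helper: collapse recorded epoch counts into one number."""
    if not trained_epochs:
        return None  # assumption: nothing recorded means no optimum yet
    # Assumption: round the aggregate so it can be used as an epoch count.
    return int(round(aggregate_func(trained_epochs)))


# Made-up epoch counts, e.g. one per cross-validation fold.
recorded = [12, 18, 15]
optimal = aggregate_epochs(recorded)
conservative = aggregate_epochs(recorded, aggregate_func=max)
```

Swapping the aggregation function changes the policy: the mean gives a typical stopping point, while max trains for as long as the longest recorded run.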
- property outDir: str
Return output directory of the model. The model files are stored in this directory ({baseDir}/{name}).
- Returns:
output directory of the model
- Return type:
- property outPrefix: str
Return output prefix of the model files.
The model files are stored with this prefix (e.g. {outPrefix}_meta.json).
- Returns:
output prefix of the model files
- Return type:
- predict(X: DataFrame | ndarray | QSPRDataset, estimator: Any = None)[source]
See QSPRModel.predict.
- predictDataset(dataset: QSPRDataset, use_probas: bool = False) ndarray | list[numpy.ndarray]
Make predictions for the given dataset.
- Parameters:
dataset – a QSPRDataset instance
use_probas – use probabilities if this is a classification model
- Returns:
an array of predictions or a list of arrays of predictions (for classification models with use_probas=True)
- Return type:
np.ndarray | list[np.ndarray]
- predictMols(mols: List[str | Mol], use_probas: bool = False, smiles_standardizer: str | callable = 'chembl', n_jobs: int = 1, fill_value: float = nan, use_applicability_domain: bool = False) ndarray | list[numpy.ndarray]
Make predictions for the given molecules.
- Parameters:
mols (List[str | Mol]) – list of SMILES strings
use_probas (bool) – use probabilities for classification models
smiles_standardizer – either chembl, old, or a partial function that reads and standardizes smiles.
n_jobs – Number of jobs to use for parallel processing.
fill_value – Value to use for missing values in the feature matrix.
use_applicability_domain – Use applicability domain to return if a molecule is within the applicability domain of the model.
- Returns:
- an array of predictions or a list of arrays of predictions (for classification models with use_probas=True)
- np.ndarray[bool]: boolean mask indicating which molecules fall within the applicability domain of the model
- Return type:
np.ndarray | list[np.ndarray]
- predictProba(X: DataFrame | ndarray | QSPRDataset, estimator: Any = None)[source]
- save(save_estimator=False)
Save model to file.
- Parameters:
save_estimator (bool) – Explicitly save the estimator to file, if True. Note that some models may save the estimator by default even if this argument is False.
- Returns:
absolute path to the metafile of the saved model
str: absolute path to the saved estimator, if save_estimator is True
- Return type:
- setParams(params: dict | None, reset_estimator: bool = True)
Set model parameters. The estimator is also updated with the new parameters if reset_estimator is True.
- property task: ModelTasks
Return the task of the model, taken from the data set or deserialized from file if the model is loaded without data.
- Returns:
task of the model
- Return type:
qsprpred.models.tests module
This module holds the tests for functions regarding QSPR modelling.
- class qsprpred.models.tests.SklearnBaseModelTestCase(methodName='runTest')[source]
Bases:
ModelDataSetsPathMixIn, ModelCheckMixIn, QSPRTestCase
This class holds the tests for the SklearnModel class.
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
- classmethod addClassCleanup(function, /, *args, **kwargs)
Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).
- addCleanup(function, /, *args, **kwargs)
Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.
Cleanup items are called even if setUp fails (unlike tearDown).
- addTypeEqualityFunc(typeobj, function)
Add a type specific assertEqual style function to compare a type.
This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.
- Parameters:
typeobj – The data type to call this function on when both values are of the same type in assertEqual().
function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.
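A short sketch of registering a type-specific equality function follows. Point and assertPointEqual are hypothetical names introduced for this example. The registered function is used by assertEqual only when both values are exactly of the registered type.

```python
import unittest


class Point:
    """Small value type used only for this example."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class PointEqualityExample(unittest.TestCase):
    def setUp(self):
        # Route assertEqual(Point, Point) through the custom comparison.
        self.addTypeEqualityFunc(Point, self.assertPointEqual)

    def assertPointEqual(self, first, second, msg=None):
        if (first.x, first.y) != (second.x, second.y):
            standard = (
                f"Point({first.x}, {first.y}) != Point({second.x}, {second.y})"
            )
            raise self.failureException(msg or standard)

    def test_equal_points(self):
        self.assertEqual(Point(1, 2), Point(1, 2))

    def test_unequal_points_give_readable_message(self):
        with self.assertRaises(AssertionError) as cm:
            self.assertEqual(Point(1, 2), Point(3, 4))
        self.assertIn("Point(1, 2) != Point(3, 4)", str(cm.exception))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(PointEqualityExample)
result = unittest.TestResult()
suite.run(result)
```

Point defines no __eq__ here, so without the registered function the first test would fall back to identity comparison and fail.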
- assertAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
If the two objects compare equal then they will automatically compare almost equal.
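The two comparison modes (rounding to a number of decimal places, or an absolute delta) behave differently. The values in the example below are chosen only to illustrate that difference.

```python
import unittest


class AlmostEqualExample(unittest.TestCase):
    def test_default_places(self):
        # The difference rounds to zero at 7 decimal places.
        self.assertAlmostEqual(1.00000001, 1.0)

    def test_explicit_places(self):
        # round(1.234 - 1.231, 2) is 0.0, so this passes.
        self.assertAlmostEqual(1.234, 1.231, places=2)

    def test_delta(self):
        # With delta, the absolute difference must not exceed it.
        self.assertAlmostEqual(100, 103, delta=5)

    def test_too_far_apart(self):
        with self.assertRaises(AssertionError):
            self.assertAlmostEqual(1.0, 1.1, places=2)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(AlmostEqualExample)
result = unittest.TestResult()
suite.run(result)
```

places and delta are mutually exclusive; passing both raises a TypeError.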
- assertCountEqual(first, second, msg=None)
Asserts that two iterables have the same elements, the same number of times, without regard to order.
self.assertEqual(Counter(list(first)), Counter(list(second)))
Example:
- [0, 1, 1] and [1, 0, 1] compare equal.
- [0, 0, 1] and [0, 1] compare unequal.
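The two cases from the example can be run as an actual test case. The sketch also shows the Counter comparison alongside assertCountEqual; the class name is made up.

```python
import unittest
from collections import Counter


class CountEqualExample(unittest.TestCase):
    def test_same_elements_different_order(self):
        self.assertCountEqual([0, 1, 1], [1, 0, 1])
        # Same check written with Counter, for hashable elements.
        self.assertEqual(Counter([0, 1, 1]), Counter([1, 0, 1]))

    def test_different_multiplicity(self):
        # 0 occurs twice on the left but once on the right.
        with self.assertRaises(AssertionError):
            self.assertCountEqual([0, 0, 1], [0, 1])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(CountEqualExample)
result = unittest.TestResult()
suite.run(result)
```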
- assertDictEqual(d1, d2, msg=None)
- assertEqual(first, second, msg=None)
Fail if the two objects are unequal as determined by the ‘==’ operator.
- assertFalse(expr, msg=None)
Check that the expression is false.
- assertGreater(a, b, msg=None)
Just like self.assertTrue(a > b), but with a nicer default message.
- assertGreaterEqual(a, b, msg=None)
Just like self.assertTrue(a >= b), but with a nicer default message.
- assertIn(member, container, msg=None)
Just like self.assertTrue(a in b), but with a nicer default message.
- assertIs(expr1, expr2, msg=None)
Just like self.assertTrue(a is b), but with a nicer default message.
- assertIsInstance(obj, cls, msg=None)
Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.
- assertIsNone(obj, msg=None)
Same as self.assertTrue(obj is None), with a nicer default message.
- assertIsNot(expr1, expr2, msg=None)
Just like self.assertTrue(a is not b), but with a nicer default message.
- assertIsNotNone(obj, msg=None)
Included for symmetry with assertIsNone.
- assertLess(a, b, msg=None)
Just like self.assertTrue(a < b), but with a nicer default message.
- assertLessEqual(a, b, msg=None)
Just like self.assertTrue(a <= b), but with a nicer default message.
- assertListEqual(list1, list2, msg=None)
A list-specific equality assertion.
- Parameters:
list1 – The first list to compare.
list2 – The second list to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertLogs(logger=None, level=None)
Fail unless a log message of level level or higher is emitted on logger_name or its children. If omitted, level defaults to INFO and logger defaults to the root logger.
This method must be used as a context manager, and will yield a recording object with two attributes:
output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.
Example:
    with self.assertLogs('foo', level='INFO') as cm:
        logging.getLogger('foo').info('first message')
        logging.getLogger('foo.bar').error('second message')
    self.assertEqual(cm.output, ['INFO:foo:first message',
                                 'ERROR:foo.bar:second message'])
- assertMultiLineEqual(first, second, msg=None)
Assert that two multi-line strings are equal.
- assertNoLogs(logger=None, level=None)
Fail unless no log messages of level level or higher are emitted on logger_name or its children.
This method must be used as a context manager.
- assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
Objects that are equal automatically fail.
- assertNotEqual(first, second, msg=None)
Fail if the two objects are equal as determined by the ‘!=’ operator.
- assertNotIn(member, container, msg=None)
Just like self.assertTrue(a not in b), but with a nicer default message.
- assertNotIsInstance(obj, cls, msg=None)
Included for symmetry with assertIsInstance.
- assertNotRegex(text, unexpected_regex, msg=None)
Fail the test if the text matches the regular expression.
- assertRaises(expected_exception, *args, **kwargs)
Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.
If called with the callable and arguments omitted, will return a context object used like this:
    with self.assertRaises(SomeException):
        do_something()
An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.
The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:
    with self.assertRaises(SomeException) as cm:
        do_something()
    the_exception = cm.exception
    self.assertEqual(the_exception.error_code, 3)
- assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)
Asserts that the message in a raised exception matches a regex.
- Parameters:
expected_exception – Exception class expected to be raised.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.
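A minimal example of the context-manager form follows, using an exception raised by the built-in int(). The regular expression is searched for in the exception message, so a partial match is enough. The class name is made up.

```python
import unittest


class RaisesRegexExample(unittest.TestCase):
    def test_message_matches(self):
        # int() raises ValueError with a message containing "invalid literal".
        with self.assertRaisesRegex(ValueError, r"invalid literal"):
            int("not a number")

    def test_message_does_not_match(self):
        # Right exception type but wrong message, so the inner assertion fails.
        with self.assertRaises(AssertionError):
            with self.assertRaisesRegex(ValueError, r"division by zero"):
                int("not a number")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(RaisesRegexExample)
result = unittest.TestResult()
suite.run(result)
```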
- assertRegex(text, expected_regex, msg=None)
Fail the test unless the text matches the regular expression.
- assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)
An equality assertion for ordered sequences (like lists and tuples).
For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.
- Parameters:
seq1 – The first sequence to compare.
seq2 – The second sequence to compare.
seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.
msg – Optional message to use on failure instead of a list of differences.
- assertSetEqual(set1, set2, msg=None)
A set-specific equality assertion.
- Parameters:
set1 – The first set to compare.
set2 – The second set to compare.
msg – Optional message to use on failure instead of a list of differences.
assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).
- assertTrue(expr, msg=None)
Check that the expression is true.
- assertTupleEqual(tuple1, tuple2, msg=None)
A tuple-specific equality assertion.
- Parameters:
tuple1 – The first tuple to compare.
tuple2 – The second tuple to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertWarns(expected_warning, *args, **kwargs)
Fail unless a warning of class warnClass is triggered by the callable when invoked with specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.
If called with the callable and arguments omitted, will return a context object used like this:
    with self.assertWarns(SomeWarning):
        do_something()
An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.
The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:
    with self.assertWarns(SomeWarning) as cm:
        do_something()
    the_warning = cm.warning
    self.assertEqual(the_warning.some_attribute, 147)
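The context-manager form can be tried out with a small self-contained sketch. The legacy_mean function and the WarnExample class are hypothetical and exist only to trigger a DeprecationWarning. The example inspects the captured warning through the warning attribute described above.

```python
import unittest
import warnings


def legacy_mean(values):
    """Hypothetical helper that warns before computing the mean."""
    warnings.warn("legacy_mean is deprecated", DeprecationWarning, stacklevel=2)
    return sum(values) / len(values)


class WarnExample(unittest.TestCase):
    def test_deprecation_is_reported(self):
        with self.assertWarns(DeprecationWarning) as cm:
            value = legacy_mean([1, 2, 3])
        # The function still returns its result.
        self.assertEqual(value, 2)
        # cm.warning holds the first matching warning instance.
        self.assertIn("deprecated", str(cm.warning))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(WarnExample)
result = unittest.TestResult()
suite.run(result)
```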
- assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)
Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.
- Parameters:
expected_warning – Warning class expected to be triggered.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.
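The following sketch shows how the regular expression narrows the match. The warning texts and the WarnRegexExample class are made up for the illustration. A warning of the right class whose message does not match the pattern does not satisfy the assertion.

```python
import unittest
import warnings


class WarnRegexExample(unittest.TestCase):
    def test_matching_message(self):
        # The pattern is searched for inside the warning message.
        with self.assertWarnsRegex(UserWarning, r"epochs=\d+"):
            warnings.warn("stopping early at epochs=12", UserWarning)

    def test_non_matching_message(self):
        # Right class, wrong message: the inner assertion fails.
        with self.assertRaises(AssertionError):
            with self.assertWarnsRegex(UserWarning, r"epochs=\d+"):
                warnings.warn("unrelated message", UserWarning)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(WarnRegexExample)
result = unittest.TestResult()
suite.run(result)
```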
- checkOptimization(model: QSPRModel, ds: QSPRDataset, optimizer: HyperparameterOptimization)
- clearGenerated()
Remove the directories that are used for testing.
- countTestCases()
- createLargeMultitaskDataSet(name='QSPRDataset_multi_test', target_props=[{'name': 'HBD', 'task': <TargetTasks.MULTICLASS: 'MULTICLASS'>, 'th': [-1, 1, 2, 100]}, {'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42)
Create a large dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
preparation_settings (dict) – dictionary containing preparation settings
random_state (int) – random state to use for splitting and shuffling
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createLargeTestDataSet(name='QSPRDataset_test_large', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42, n_jobs=1, chunk_size=None)
Create a large dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
preparation_settings (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createSmallTestDataSet(name='QSPRDataset_test_small', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42)
Create a small dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
preparation_settings (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createTestDataSetFromFrame(df, name='QSPRDataset_test', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], random_state=None, prep=None, n_jobs=1, chunk_size=None)
Create a dataset for testing purposes from the given data frame.
- Parameters:
df (pd.DataFrame) – data frame containing the dataset
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
prep (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- debug()
Run the test without collecting errors in a TestResult
- defaultTestResult()
- classmethod doClassCleanups()
Execute all class cleanup functions. Normally called for you after tearDownClass.
- doCleanups()
Execute all cleanup functions. Normally called for you after tearDown.
- classmethod enterClassContext(cm)
Same as enterContext, but class-wide.
- enterContext(cm)
Enters the supplied context manager.
If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.
- fail(msg=None)
Fail immediately, with the given message.
- failureException
alias of AssertionError
- fitTest(model: QSPRModel, ds: QSPRDataset)
Test model fitting, optimization and evaluation.
- Parameters:
model (QSPRModel) – The model to test.
ds (QSPRDataset) – The dataset to use for testing.
- classmethod getAllDescriptors()
Return a list of (ideally) all available descriptor sets. For now they need to be added manually to the list below.
TODO: would be nice to create the list automatically by implementing a descriptor set registry that would hold all installed descriptor sets.
- getBigDF()
Get a large data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- classmethod getDataPrepGrid()
Return a list of many possible combinations of descriptor calculators, splits, feature standardizers, feature filters and data filters. Again, this is not exhaustive, but should cover a lot of cases.
- Returns:
a generator that yields tuples of all possible combinations as stated above, each tuple is defined as: (descriptor_calculator, split, feature_standardizer, feature_filters, data_filters)
- Return type:
grid
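The shape of such a grid can be sketched with itertools.product. This is not the QSPRpred implementation: the data_prep_grid function and the placeholder strings standing in for calculators, splits and standardizers are assumptions made purely for illustration. Only the tuple order follows the description above.

```python
from itertools import product


def data_prep_grid(calculators, splits, standardizers, feature_filters, data_filters):
    """Yield one tuple per combination, in the order
    (descriptor_calculator, split, feature_standardizer, feature_filters, data_filters).
    """
    yield from product(calculators, splits, standardizers, feature_filters, data_filters)


# Placeholder values; real code would pass actual objects instead of strings.
grid = list(
    data_prep_grid(
        calculators=["morgan", "rdkit"],
        splits=["random", "scaffold"],
        standardizers=[None, "standard"],
        feature_filters=[None],
        data_filters=[None],
    )
)
```

With two calculators, two splits and two standardizers the grid holds 2 * 2 * 2 = 8 combinations, which is why such grids grow quickly as options are added.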
- classmethod getDefaultCalculatorCombo()
Makes a list of default descriptor calculators that can be used in tests. It creates calculators with only Morgan fingerprints or only RDKit descriptors, and also one with both to test behaviour with multiple descriptor sets. Override this method if you want to test with other descriptor sets and calculator combinations.
- static getDefaultPrep()
Return a dictionary with default preparation settings.
- getModel(name: str, alg: Type | None = None, parameters: dict | None = None, random_state: int | None = None)[source]
Create a SklearnModel model.
- Parameters:
name (str) – the name of the model
alg (Type, optional) – the algorithm to use. Defaults to None.
dataset (QSPRDataset, optional) – the dataset to use. Defaults to None.
parameters (dict, optional) – the parameters to use. Defaults to None.
random_state (int, optional) – Random state to use for shuffling and other random operations. Defaults to None.
- Returns:
the model
- Return type:
SklearnModel
- classmethod getPrepCombos()
Return a list of all possible preparation combinations as generated by getDataPrepGrid as well as their names. The generated list can be used to parameterize tests with the given named combinations.
- getSmallDF()
Get a small data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- property gridFile
- id()
- longMessage = True
- maxDiff = 640
- predictorTest(model: QSPRModel, dataset: QSPRDataset, comparison_model: QSPRModel | None = None, expect_equal_result=True, **pred_kwargs)
Test model predictions.
Checks if the shape of the predictions is as expected and if the predictions of the predictMols function are consistent with the predictions of the predict/predictProba functions. Also checks if the predictions of the model are the same as the predictions of the comparison model if given.
- Parameters:
model (QSPRModel) – The model to make predictions with.
dataset (QSPRDataset) – The dataset to make predictions for.
comparison_model (QSPRModel) – another model to compare the predictions with.
expect_equal_result (bool) – Whether the expected result should be equal or not equal to the predictions of the comparison model.
**pred_kwargs – Extra keyword arguments to pass to the predictor’s predictMols method.
- run(result=None)
- classmethod setUpClass()
Hook method for setting up class fixture before running tests in the class.
- setUpPaths()
Set up the test environment.
- shortDescription()
Returns a one-line description of the test, or None if no description has been provided.
The default implementation of this method returns the first line of the specified test method’s docstring.
- skipTest(reason)
Skip this test.
- subTest(msg=<object object>, **params)
Return a context manager that will run the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.
- tearDown()
Remove all files and directories that are used for testing.
- classmethod tearDownClass()
Hook method for deconstructing the class fixture after running all tests in the class.
- validate_split(dataset)
Check if the split has the data it should have after splitting.
- class qsprpred.models.tests.TestAttachedApplicabilityDomain(methodName='runTest')[source]
Bases: ModelDataSetsPathMixIn, QSPRTestCase
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
- classmethod addClassCleanup(function, /, *args, **kwargs)
Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).
- addCleanup(function, /, *args, **kwargs)
Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.
Cleanup items are called even if setUp fails (unlike tearDown).
- addTypeEqualityFunc(typeobj, function)
Add a type specific assertEqual style function to compare a type.
This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.
- Parameters:
typeobj – The data type to call this function on when both values are of the same type in assertEqual().
function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.
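As a rough sketch of how a type-specific equality function might be registered, the example below defines a hypothetical Interval class and an assertIntervalEqual helper. Neither is part of QSPRpred or unittest. assertEqual dispatches to the registered function only when both arguments are exactly of the registered type.

```python
import unittest


class Interval:
    """Hypothetical value type used only for this example."""

    def __init__(self, low, high):
        self.low = low
        self.high = high

    def __eq__(self, other):
        return (self.low, self.high) == (other.low, other.high)


class IntervalTest(unittest.TestCase):
    def setUp(self):
        # Route assertEqual(Interval, Interval) to the custom comparison.
        self.addTypeEqualityFunc(Interval, self.assertIntervalEqual)

    def assertIntervalEqual(self, first, second, msg=None):
        if first != second:
            standard = f"[{first.low}, {first.high}] != [{second.low}, {second.high}]"
            raise self.failureException(msg or standard)

    def test_custom_message(self):
        with self.assertRaises(AssertionError) as cm:
            self.assertEqual(Interval(0, 1), Interval(0, 2))
        # The failure message comes from the registered function.
        self.assertIn("[0, 1] != [0, 2]", str(cm.exception))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(IntervalTest)
result = unittest.TestResult()
suite.run(result)
```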
- assertAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
If the two objects compare equal then they will automatically compare almost equal.
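The difference between places and delta can be illustrated with a few calls. This is a sketch using only the standard library, and the AlmostEqualExample class is hypothetical. Note that places and delta are mutually exclusive.

```python
import unittest


class AlmostEqualExample(unittest.TestCase):
    def test_default_places(self):
        # The difference rounds to zero at 7 decimal places.
        self.assertAlmostEqual(0.1 + 0.2, 0.3)

    def test_delta(self):
        # A difference of about 0.4 is within the allowed delta of 0.5.
        self.assertAlmostEqual(100.0, 100.4, delta=0.5)

    def test_places_too_strict(self):
        # A difference of about 0.001 does not round to zero at 3 places.
        with self.assertRaises(AssertionError):
            self.assertAlmostEqual(1.0, 1.001, places=3)

    def test_places_and_delta_together(self):
        # Supplying both arguments is an error.
        with self.assertRaises(TypeError):
            self.assertAlmostEqual(1.0, 1.01, places=2, delta=0.1)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(AlmostEqualExample)
result = unittest.TestResult()
suite.run(result)
```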
- assertCountEqual(first, second, msg=None)
Asserts that two iterables have the same elements, the same number of times, without regard to order.
    self.assertEqual(Counter(list(first)), Counter(list(second)))
- Example:
[0, 1, 1] and [1, 0, 1] compare equal.
[0, 0, 1] and [0, 1] compare unequal.
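The two cases listed above can be checked directly. The sketch below also includes a comparison of unhashable items, which assertCountEqual handles even though collections.Counter could not. The CountEqualExample class is hypothetical.

```python
import unittest


class CountEqualExample(unittest.TestCase):
    def test_same_elements_different_order(self):
        self.assertCountEqual([0, 1, 1], [1, 0, 1])

    def test_different_counts(self):
        # 0 appears twice on the left but only once on the right.
        with self.assertRaises(AssertionError):
            self.assertCountEqual([0, 0, 1], [0, 1])

    def test_unhashable_items(self):
        # Lists are unhashable, yet the comparison still works.
        self.assertCountEqual([[1], [2]], [[2], [1]])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(CountEqualExample)
result = unittest.TestResult()
suite.run(result)
```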
- assertDictEqual(d1, d2, msg=None)
- assertEqual(first, second, msg=None)
Fail if the two objects are unequal as determined by the ‘==’ operator.
- assertFalse(expr, msg=None)
Check that the expression is false.
- assertGreater(a, b, msg=None)
Just like self.assertTrue(a > b), but with a nicer default message.
- assertGreaterEqual(a, b, msg=None)
Just like self.assertTrue(a >= b), but with a nicer default message.
- assertIn(member, container, msg=None)
Just like self.assertTrue(a in b), but with a nicer default message.
- assertIs(expr1, expr2, msg=None)
Just like self.assertTrue(a is b), but with a nicer default message.
- assertIsInstance(obj, cls, msg=None)
Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.
- assertIsNone(obj, msg=None)
Same as self.assertTrue(obj is None), with a nicer default message.
- assertIsNot(expr1, expr2, msg=None)
Just like self.assertTrue(a is not b), but with a nicer default message.
- assertIsNotNone(obj, msg=None)
Included for symmetry with assertIsNone.
- assertLess(a, b, msg=None)
Just like self.assertTrue(a < b), but with a nicer default message.
- assertLessEqual(a, b, msg=None)
Just like self.assertTrue(a <= b), but with a nicer default message.
- assertListEqual(list1, list2, msg=None)
A list-specific equality assertion.
- Parameters:
list1 – The first list to compare.
list2 – The second list to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertLogs(logger=None, level=None)
Fail unless a log message of level level or higher is emitted on logger or its children. If omitted, level defaults to INFO and logger defaults to the root logger.
This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.
Example:
    with self.assertLogs('foo', level='INFO') as cm:
        logging.getLogger('foo').info('first message')
        logging.getLogger('foo.bar').error('second message')
    self.assertEqual(cm.output, ['INFO:foo:first message',
                                 'ERROR:foo.bar:second message'])
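A complementary sketch of the level threshold: messages below the requested level are not recorded. The logger name "trainer", the messages and the LogLevelExample class are made up for this illustration.

```python
import logging
import unittest


class LogLevelExample(unittest.TestCase):
    def test_only_warning_and_above(self):
        with self.assertLogs("trainer", level="WARNING") as cm:
            log = logging.getLogger("trainer")
            log.info("epoch 1 done")  # below WARNING, not recorded
            log.warning("loss is NaN")
        self.assertEqual(cm.output, ["WARNING:trainer:loss is NaN"])
        # records holds the matching LogRecord objects.
        self.assertEqual(cm.records[0].levelname, "WARNING")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(LogLevelExample)
result = unittest.TestResult()
suite.run(result)
```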
- assertMultiLineEqual(first, second, msg=None)
Assert that two multi-line strings are equal.
- assertNoLogs(logger=None, level=None)
Fail unless no log messages of level level or higher are emitted on logger or its children.
This method must be used as a context manager.
- assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
Objects that are equal automatically fail.
- assertNotEqual(first, second, msg=None)
Fail if the two objects are equal as determined by the ‘!=’ operator.
- assertNotIn(member, container, msg=None)
Just like self.assertTrue(a not in b), but with a nicer default message.
- assertNotIsInstance(obj, cls, msg=None)
Included for symmetry with assertIsInstance.
- assertNotRegex(text, unexpected_regex, msg=None)
Fail the test if the text matches the regular expression.
- assertRaises(expected_exception, *args, **kwargs)
Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.
If called with the callable and arguments omitted, will return a context object used like this:
    with self.assertRaises(SomeException):
        do_something()
An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.
The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:
    with self.assertRaises(SomeException) as cm:
        do_something()
    the_exception = cm.exception
    self.assertEqual(the_exception.error_code, 3)
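Both calling styles can be compared side by side in a runnable sketch. The parse_epochs function is hypothetical and serves only to raise a ValueError. The first test passes the callable and its arguments, the second uses the context manager and inspects the exception.

```python
import unittest


def parse_epochs(text):
    """Hypothetical parser that rejects non-positive epoch counts."""
    value = int(text)
    if value <= 0:
        raise ValueError(f"epochs must be positive, got {value}")
    return value


class RaisesExample(unittest.TestCase):
    def test_callable_form(self):
        # The callable and its arguments follow the exception class.
        self.assertRaises(ValueError, parse_epochs, "0")

    def test_context_manager_form(self):
        with self.assertRaises(ValueError) as cm:
            parse_epochs("-3")
        # The raised exception is available for further checks.
        self.assertIn("-3", str(cm.exception))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(RaisesExample)
result = unittest.TestResult()
suite.run(result)
```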
- assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)
Asserts that the message in a raised exception matches a regex.
- Parameters:
expected_exception – Exception class expected to be raised.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.
- assertRegex(text, expected_regex, msg=None)
Fail the test unless the text matches the regular expression.
- assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)
An equality assertion for ordered sequences (like lists and tuples).
For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.
- Parameters:
seq1 – The first sequence to compare.
seq2 – The second sequence to compare.
seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.
msg – Optional message to use on failure instead of a list of differences.
- assertSetEqual(set1, set2, msg=None)
A set-specific equality assertion.
- Parameters:
set1 – The first set to compare.
set2 – The second set to compare.
msg – Optional message to use on failure instead of a list of differences.
assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).
- assertTrue(expr, msg=None)
Check that the expression is true.
- assertTupleEqual(tuple1, tuple2, msg=None)
A tuple-specific equality assertion.
- Parameters:
tuple1 – The first tuple to compare.
tuple2 – The second tuple to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertWarns(expected_warning, *args, **kwargs)
Fail unless a warning of class expected_warning is triggered by the callable when invoked with the specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.
If called with the callable and arguments omitted, will return a context object used like this:
    with self.assertWarns(SomeWarning):
        do_something()
An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.
The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:
    with self.assertWarns(SomeWarning) as cm:
        do_something()
    the_warning = cm.warning
    self.assertEqual(the_warning.some_attribute, 147)
- assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)
Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.
- Parameters:
expected_warning – Warning class expected to be triggered.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.
- clearGenerated()
Remove the directories that are used for testing.
- countTestCases()
- createLargeMultitaskDataSet(name='QSPRDataset_multi_test', target_props=[{'name': 'HBD', 'task': <TargetTasks.MULTICLASS: 'MULTICLASS'>, 'th': [-1, 1, 2, 100]}, {'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42)
Create a large dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
preparation_settings (dict) – dictionary containing preparation settings
random_state (int) – random state to use for splitting and shuffling
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createLargeTestDataSet(name='QSPRDataset_test_large', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42, n_jobs=1, chunk_size=None)
Create a large dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
preparation_settings (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createSmallTestDataSet(name='QSPRDataset_test_small', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42)
Create a small dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
preparation_settings (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createTestDataSetFromFrame(df, name='QSPRDataset_test', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], random_state=None, prep=None, n_jobs=1, chunk_size=None)
Create a dataset for testing purposes from the given data frame.
- Parameters:
df (pd.DataFrame) – data frame containing the dataset
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
prep (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- debug()
Run the test without collecting errors in a TestResult
- defaultTestResult()
- classmethod doClassCleanups()
Execute all class cleanup functions. Normally called for you after tearDownClass.
- doCleanups()
Execute all cleanup functions. Normally called for you after tearDown.
- classmethod enterClassContext(cm)
Same as enterContext, but class-wide.
- enterContext(cm)
Enters the supplied context manager.
If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.
- fail(msg=None)
Fail immediately, with the given message.
- failureException
alias of AssertionError
- classmethod getAllDescriptors()
Return a list of (ideally) all available descriptor sets. For now they need to be added manually to the list below.
TODO: would be nice to create the list automatically by implementing a descriptor set registry that would hold all installed descriptor sets.
- getBigDF()
Get a large data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- classmethod getDataPrepGrid()
Return a list of many possible combinations of descriptor calculators, splits, feature standardizers, feature filters and data filters. Again, this is not exhaustive, but should cover a lot of cases.
- Returns:
a generator that yields tuples of all possible combinations as stated above, each tuple is defined as: (descriptor_calculator, split, feature_standardizer, feature_filters, data_filters)
- Return type:
grid
- classmethod getDefaultCalculatorCombo()
Makes a list of default descriptor calculators that can be used in tests. It creates calculators with only Morgan fingerprints or only RDKit descriptors, and also one with both to test behaviour with multiple descriptor sets. Override this method if you want to test with other descriptor sets and calculator combinations.
- static getDefaultPrep()
Return a dictionary with default preparation settings.
- classmethod getPrepCombos()
Return a list of all possible preparation combinations as generated by getDataPrepGrid as well as their names. The generated list can be used to parameterize tests with the given named combinations.
- getSmallDF()
Get a small data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- id()
- longMessage = True
- maxDiff = 640
- run(result=None)
- classmethod setUpClass()
Hook method for setting up class fixture before running tests in the class.
- setUpPaths()
Set up the test environment.
- shortDescription()
Returns a one-line description of the test, or None if no description has been provided.
The default implementation of this method returns the first line of the specified test method’s docstring.
- skipTest(reason)
Skip this test.
- subTest(msg=<object object>, **params)
Return a context manager that will run the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.
- tearDown()
Remove all files and directories that are used for testing.
- classmethod tearDownClass()
Hook method for deconstructing the class fixture after running all tests in the class.
- validate_split(dataset)
Check if the split has the data it should have after splitting.
- class qsprpred.models.tests.TestEarlyStopping(methodName='runTest')[source]
Bases: ModelDataSetsPathMixIn, TestCase
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
- classmethod addClassCleanup(function, /, *args, **kwargs)
Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).
- addCleanup(function, /, *args, **kwargs)
Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.
Cleanup items are called even if setUp fails (unlike tearDown).
- addTypeEqualityFunc(typeobj, function)
Add a type specific assertEqual style function to compare a type.
This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.
- Parameters:
typeobj – The data type to call this function on when both values are of the same type in assertEqual().
function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.
- assertAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
If the two objects compare equal then they will automatically compare almost equal.
- assertCountEqual(first, second, msg=None)
Asserts that two iterables have the same elements, the same number of times, without regard to order.
    self.assertEqual(Counter(list(first)), Counter(list(second)))
- Example:
[0, 1, 1] and [1, 0, 1] compare equal.
[0, 0, 1] and [0, 1] compare unequal.
- assertDictEqual(d1, d2, msg=None)
- assertEqual(first, second, msg=None)
Fail if the two objects are unequal as determined by the ‘==’ operator.
- assertFalse(expr, msg=None)
Check that the expression is false.
- assertGreater(a, b, msg=None)
Just like self.assertTrue(a > b), but with a nicer default message.
- assertGreaterEqual(a, b, msg=None)
Just like self.assertTrue(a >= b), but with a nicer default message.
- assertIn(member, container, msg=None)
Just like self.assertTrue(a in b), but with a nicer default message.
- assertIs(expr1, expr2, msg=None)
Just like self.assertTrue(a is b), but with a nicer default message.
- assertIsInstance(obj, cls, msg=None)
Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.
- assertIsNone(obj, msg=None)
Same as self.assertTrue(obj is None), with a nicer default message.
- assertIsNot(expr1, expr2, msg=None)
Just like self.assertTrue(a is not b), but with a nicer default message.
- assertIsNotNone(obj, msg=None)
Included for symmetry with assertIsNone.
- assertLess(a, b, msg=None)
Just like self.assertTrue(a < b), but with a nicer default message.
- assertLessEqual(a, b, msg=None)
Just like self.assertTrue(a <= b), but with a nicer default message.
- assertListEqual(list1, list2, msg=None)
A list-specific equality assertion.
- Parameters:
list1 – The first list to compare.
list2 – The second list to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertLogs(logger=None, level=None)
Fail unless a log message of level level or higher is emitted on logger or its children. If omitted, level defaults to INFO and logger defaults to the root logger.
This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.
Example:
    with self.assertLogs('foo', level='INFO') as cm:
        logging.getLogger('foo').info('first message')
        logging.getLogger('foo.bar').error('second message')
    self.assertEqual(cm.output, ['INFO:foo:first message',
                                 'ERROR:foo.bar:second message'])
- assertMultiLineEqual(first, second, msg=None)
Assert that two multi-line strings are equal.
- assertNoLogs(logger=None, level=None)
Fail unless no log messages of level level or higher are emitted on logger or its children.
This method must be used as a context manager.
- assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
Objects that are equal automatically fail.
- assertNotEqual(first, second, msg=None)
Fail if the two objects are equal as determined by the ‘!=’ operator.
- assertNotIn(member, container, msg=None)
Just like self.assertTrue(a not in b), but with a nicer default message.
- assertNotIsInstance(obj, cls, msg=None)
Included for symmetry with assertIsInstance.
- assertNotRegex(text, unexpected_regex, msg=None)
Fail the test if the text matches the regular expression.
- assertRaises(expected_exception, *args, **kwargs)
Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.
If called with the callable and arguments omitted, will return a context object used like this:
    with self.assertRaises(SomeException):
        do_something()
An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.
The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:
    with self.assertRaises(SomeException) as cm:
        do_something()
    the_exception = cm.exception
    self.assertEqual(the_exception.error_code, 3)
- assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)
Asserts that the message in a raised exception matches a regex.
- Parameters:
expected_exception – Exception class expected to be raised.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.
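A minimal sketch of matching on the exception message: the check_mode function and its set of accepted strings are hypothetical and are not the QSPRpred API. The pattern is searched for within the message, so a partial match is sufficient, and an exception of the right class with a non-matching message fails the assertion.

```python
import unittest


def check_mode(mode):
    """Hypothetical validator used only for this example."""
    if mode not in {"RECORDING", "FIXED"}:
        raise ValueError(f"unknown mode: {mode!r}")
    return mode


class RaisesRegexExample(unittest.TestCase):
    def test_context_manager_form(self):
        with self.assertRaisesRegex(ValueError, r"unknown mode: 'FAST'"):
            check_mode("FAST")

    def test_callable_form(self):
        # A partial match on the message is enough.
        self.assertRaisesRegex(ValueError, "unknown mode", check_mode, "FAST")

    def test_message_mismatch(self):
        # Right exception class, but the message does not match the pattern.
        with self.assertRaises(AssertionError):
            with self.assertRaisesRegex(ValueError, r"^invalid"):
                check_mode("FAST")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(RaisesRegexExample)
result = unittest.TestResult()
suite.run(result)
```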
- assertRegex(text, expected_regex, msg=None)
Fail the test unless the text matches the regular expression.
- assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)
An equality assertion for ordered sequences (like lists and tuples).
For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.
- Parameters:
seq1 – The first sequence to compare.
seq2 – The second sequence to compare.
seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.
msg – Optional message to use on failure instead of a list of differences.
- assertSetEqual(set1, set2, msg=None)
A set-specific equality assertion.
- Parameters:
set1 – The first set to compare.
set2 – The second set to compare.
msg – Optional message to use on failure instead of a list of differences.
assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).
- assertTrue(expr, msg=None)
Check that the expression is true.
- assertTupleEqual(tuple1, tuple2, msg=None)
A tuple-specific equality assertion.
- Parameters:
tuple1 – The first tuple to compare.
tuple2 – The second tuple to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertWarns(expected_warning, *args, **kwargs)
Fail unless a warning of class expected_warning is triggered by the callable when invoked with specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.
If called with the callable and arguments omitted, will return a context object used like this:
with self.assertWarns(SomeWarning):
    do_something()
An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.
The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:
with self.assertWarns(SomeWarning) as cm:
    do_something()
the_warning = cm.warning
self.assertEqual(the_warning.some_attribute, 147)
- assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)
Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.
- Parameters:
expected_warning – Warning class expected to be triggered.
expected_regex – Regex (re.Pattern object or string) expected to be found in warning message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.
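A minimal sketch of assertWarnsRegex in its context-manager form is shown below. The function legacy_fit is a hypothetical stand-in that emits a DeprecationWarning; it is not a QSPRpred function.

```python
import unittest
import warnings


def legacy_fit():
    """Hypothetical function that emits a deprecation warning."""
    warnings.warn("legacy_fit is deprecated, use fit instead", DeprecationWarning)
    return "fitted"


class LegacyFitTest(unittest.TestCase):
    def test_warning_message(self):
        # Only warnings of the given class whose message matches the regex count.
        with self.assertWarnsRegex(DeprecationWarning, r"use fit") as cm:
            outcome = legacy_fit()
        self.assertEqual(outcome, "fitted")
        # The first matching warning is kept on the context object.
        self.assertIsInstance(cm.warning, DeprecationWarning)
        self.assertIn("deprecated", str(cm.warning))


# Run the example test programmatically and collect the outcome.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(LegacyFitTest)
result = unittest.TestResult()
suite.run(result)
print(result.testsRun, result.wasSuccessful())
```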
- clearGenerated()
Remove the directories that are used for testing.
- countTestCases()
- createLargeMultitaskDataSet(name='QSPRDataset_multi_test', target_props=[{'name': 'HBD', 'task': <TargetTasks.MULTICLASS: 'MULTICLASS'>, 'th': [-1, 1, 2, 100]}, {'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42)
Create a large dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
preparation_settings (dict) – dictionary containing preparation settings
random_state (int) – random state to use for splitting and shuffling
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createLargeTestDataSet(name='QSPRDataset_test_large', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42, n_jobs=1, chunk_size=None)
Create a large dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
preparation_settings (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createSmallTestDataSet(name='QSPRDataset_test_small', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42)
Create a small dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
preparation_settings (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createTestDataSetFromFrame(df, name='QSPRDataset_test', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], random_state=None, prep=None, n_jobs=1, chunk_size=None)
Create a dataset for testing purposes from the given data frame.
- Parameters:
df (pd.DataFrame) – data frame containing the dataset
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
prep (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- debug()
Run the test without collecting errors in a TestResult.
- defaultTestResult()
- classmethod doClassCleanups()
Execute all class cleanup functions. Normally called for you after tearDownClass.
- doCleanups()
Execute all cleanup functions. Normally called for you after tearDown.
- classmethod enterClassContext(cm)
Same as enterContext, but class-wide.
- enterContext(cm)
Enters the supplied context manager.
If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.
- fail(msg=None)
Fail immediately, with the given message.
- failureException
alias of AssertionError
- classmethod getAllDescriptors()
Return a list of (ideally) all available descriptor sets. For now they need to be added manually to the list below.
TODO: would be nice to create the list automatically by implementing a descriptor set registry that would hold all installed descriptor sets.
- getBigDF()
Get a large data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- classmethod getDataPrepGrid()
Return a list of many possible combinations of descriptor calculators, splits, feature standardizers, feature filters and data filters. Again, this is not exhaustive, but should cover a lot of cases.
- Returns:
a generator that yields tuples of all possible combinations as stated above; each tuple is defined as (descriptor_calculator, split, feature_standardizer, feature_filters, data_filters)
- Return type:
grid
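The tuple layout described above can be pictured with a small generator built on itertools.product. This is only a sketch of the idea: the option lists hold placeholder strings rather than real QSPRpred objects, and data_prep_grid is a hypothetical stand-in, not the actual implementation of getDataPrepGrid.

```python
import itertools

# Placeholder option lists (assumptions); real tests would use QSPRpred objects.
descriptor_calculators = ["morgan", "rdkit"]
splits = [None, "random_split"]
feature_standardizers = [None, "standard_scaler"]
feature_filters = [None, "low_variance"]
data_filters = [None, "category_filter"]


def data_prep_grid():
    """Lazily yield every combination as a 5-tuple.

    Tuple order: (descriptor_calculator, split, feature_standardizer,
    feature_filters, data_filters).
    """
    yield from itertools.product(
        descriptor_calculators,
        splits,
        feature_standardizers,
        feature_filters,
        data_filters,
    )


combos = list(data_prep_grid())
print(len(combos))
print(combos[0])
```

Because the grid is a generator, combinations are produced on demand; wrapping it in list() as above is only needed when the combinations must be counted or reused.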
- classmethod getDefaultCalculatorCombo()
Makes a list of default descriptor calculators that can be used in tests. It creates calculators with only Morgan fingerprints and only RDKit descriptors, but also one with both of them, to test behaviour with multiple descriptor sets. Override this method if you want to test with other descriptor sets and calculator combinations.
- static getDefaultPrep()
Return a dictionary with default preparation settings.
- classmethod getPrepCombos()
Return a list of all possible preparation combinations as generated by getDataPrepGrid as well as their names. The generated list can be used to parameterize tests with the given named combinations.
- getSmallDF()
Get a small data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- id()
- longMessage = True
- maxDiff = 640
- run(result=None)
- classmethod setUpClass()
Hook method for setting up class fixture before running tests in the class.
- setUpPaths()
Set up the test environment.
- shortDescription()
Returns a one-line description of the test, or None if no description has been provided.
The default implementation of this method returns the first line of the specified test method’s docstring.
- skipTest(reason)
Skip this test.
- subTest(msg=<object object>, **params)
Return a context manager that will run the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.
- tearDown()
Remove all files and directories that are used for testing.
- classmethod tearDownClass()
Hook method for deconstructing the class fixture after running all tests in the class.
- validate_split(dataset)
Check if the split has the data it should have after splitting.
- class qsprpred.models.tests.TestMetrics(methodName='runTest')[source]
Bases:
TestCase
Test the SklearnMetrics from the metrics module.
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
- classmethod addClassCleanup(function, /, *args, **kwargs)
Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).
- addCleanup(function, /, *args, **kwargs)
Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.
Cleanup items are called even if setUp fails (unlike tearDown).
- addTypeEqualityFunc(typeobj, function)
Add a type specific assertEqual style function to compare a type.
This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.
- Parameters:
typeobj – The data type to call this function on when both values are of the same type in assertEqual().
function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.
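A minimal sketch of registering a type-specific comparison is given below. The Point dataclass and assertPointEqual are hypothetical names introduced for this example only.

```python
import unittest
from dataclasses import dataclass


@dataclass
class Point:
    x: float
    y: float


class PointTest(unittest.TestCase):
    def setUp(self):
        # Used by assertEqual only when both arguments are exactly of type Point.
        self.addTypeEqualityFunc(Point, self.assertPointEqual)

    def assertPointEqual(self, first, second, msg=None):
        if first != second:
            standard = (
                f"Points differ: ({first.x}, {first.y}) != ({second.x}, {second.y})"
            )
            self.fail(msg or standard)

    def test_equal(self):
        self.assertEqual(Point(1.0, 2.0), Point(1.0, 2.0))

    def test_unequal_message(self):
        with self.assertRaisesRegex(AssertionError, "Points differ"):
            self.assertEqual(Point(1.0, 2.0), Point(1.0, 3.0))


# Run the example tests programmatically and collect the outcome.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(PointTest)
result = unittest.TestResult()
suite.run(result)
print(result.testsRun, result.wasSuccessful())
```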
- assertAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
If the two objects compare equal then they will automatically compare almost equal.
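The difference between places and delta can be sketched as follows. The values are arbitrary and chosen only to illustrate the rounding rule.

```python
import unittest


class AlmostEqualTest(unittest.TestCase):
    def test_places(self):
        # Default places=7: round(1.00000001 - 1.0, 7) == 0, so this passes.
        self.assertAlmostEqual(1.00000001, 1.0)
        # round(0.12 - 0.1, 1) == 0.0, so one decimal place is enough here.
        self.assertAlmostEqual(0.12, 0.1, places=1)

    def test_delta(self):
        # With delta, the absolute difference must not exceed the given value.
        self.assertAlmostEqual(100.0, 104.0, delta=5.0)

    def test_places_and_delta_are_exclusive(self):
        # Supplying both keywords for unequal values is rejected.
        with self.assertRaises(TypeError):
            self.assertAlmostEqual(1.0, 1.0001, places=2, delta=0.1)


# Run the example tests programmatically and collect the outcome.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AlmostEqualTest)
result = unittest.TestResult()
suite.run(result)
print(result.testsRun, result.wasSuccessful())
```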
- assertCountEqual(first, second, msg=None)
Asserts that two iterables have the same elements, the same number of times, without regard to order.
Equivalent to:
self.assertEqual(Counter(list(first)), Counter(list(second)))
- Example:
[0, 1, 1] and [1, 0, 1] compare equal.
[0, 0, 1] and [0, 1] compare unequal.
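The two examples above can be written out as a short test. The class name is hypothetical.

```python
import unittest
from collections import Counter


class CountEqualTest(unittest.TestCase):
    def test_same_elements_any_order(self):
        self.assertCountEqual([0, 1, 1], [1, 0, 1])
        # The same check expressed through Counter objects.
        self.assertEqual(Counter([0, 1, 1]), Counter([1, 0, 1]))

    def test_different_counts(self):
        # 0 occurs twice in the first iterable but once in the second.
        with self.assertRaises(AssertionError):
            self.assertCountEqual([0, 0, 1], [0, 1])


# Run the example tests programmatically and collect the outcome.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(CountEqualTest)
result = unittest.TestResult()
suite.run(result)
print(result.testsRun, result.wasSuccessful())
```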
- assertDictEqual(d1, d2, msg=None)
- assertEqual(first, second, msg=None)
Fail if the two objects are unequal as determined by the ‘==’ operator.
- assertFalse(expr, msg=None)
Check that the expression is false.
- assertGreater(a, b, msg=None)
Just like self.assertTrue(a > b), but with a nicer default message.
- assertGreaterEqual(a, b, msg=None)
Just like self.assertTrue(a >= b), but with a nicer default message.
- assertIn(member, container, msg=None)
Just like self.assertTrue(a in b), but with a nicer default message.
- assertIs(expr1, expr2, msg=None)
Just like self.assertTrue(a is b), but with a nicer default message.
- assertIsInstance(obj, cls, msg=None)
Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.
- assertIsNone(obj, msg=None)
Same as self.assertTrue(obj is None), with a nicer default message.
- assertIsNot(expr1, expr2, msg=None)
Just like self.assertTrue(a is not b), but with a nicer default message.
- assertIsNotNone(obj, msg=None)
Included for symmetry with assertIsNone.
- assertLess(a, b, msg=None)
Just like self.assertTrue(a < b), but with a nicer default message.
- assertLessEqual(a, b, msg=None)
Just like self.assertTrue(a <= b), but with a nicer default message.
- assertListEqual(list1, list2, msg=None)
A list-specific equality assertion.
- Parameters:
list1 – The first list to compare.
list2 – The second list to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertLogs(logger=None, level=None)
Fail unless a log message of level level or higher is emitted on logger or its children. If omitted, level defaults to INFO and logger defaults to the root logger.
This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.
Example:
with self.assertLogs('foo', level='INFO') as cm:
    logging.getLogger('foo').info('first message')
    logging.getLogger('foo.bar').error('second message')
self.assertEqual(cm.output, ['INFO:foo:first message',
                             'ERROR:foo.bar:second message'])
- assertMultiLineEqual(first, second, msg=None)
Assert that two multi-line strings are equal.
- assertNoLogs(logger=None, level=None)
Fail unless no log messages of level level or higher are emitted on logger or its children.
This method must be used as a context manager.
- assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
Objects that are equal automatically fail.
- assertNotEqual(first, second, msg=None)
Fail if the two objects are equal as determined by the ‘!=’ operator.
- assertNotIn(member, container, msg=None)
Just like self.assertTrue(a not in b), but with a nicer default message.
- assertNotIsInstance(obj, cls, msg=None)
Included for symmetry with assertIsInstance.
- assertNotRegex(text, unexpected_regex, msg=None)
Fail the test if the text matches the regular expression.
- assertRaises(expected_exception, *args, **kwargs)
Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.
If called with the callable and arguments omitted, will return a context object used like this:
with self.assertRaises(SomeException):
    do_something()
An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.
The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:
with self.assertRaises(SomeException) as cm:
    do_something()
the_exception = cm.exception
self.assertEqual(the_exception.error_code, 3)
- assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)
Asserts that the message in a raised exception matches a regex.
- Parameters:
expected_exception – Exception class expected to be raised.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.
- assertRegex(text, expected_regex, msg=None)
Fail the test unless the text matches the regular expression.
- assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)
An equality assertion for ordered sequences (like lists and tuples).
For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.
- Parameters:
seq1 – The first sequence to compare.
seq2 – The second sequence to compare.
seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.
msg – Optional message to use on failure instead of a list of differences.
- assertSetEqual(set1, set2, msg=None)
A set-specific equality assertion.
- Parameters:
set1 – The first set to compare.
set2 – The second set to compare.
msg – Optional message to use on failure instead of a list of differences.
assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).
- assertTrue(expr, msg=None)
Check that the expression is true.
- assertTupleEqual(tuple1, tuple2, msg=None)
A tuple-specific equality assertion.
- Parameters:
tuple1 – The first tuple to compare.
tuple2 – The second tuple to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertWarns(expected_warning, *args, **kwargs)
Fail unless a warning of class expected_warning is triggered by the callable when invoked with specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.
If called with the callable and arguments omitted, will return a context object used like this:
with self.assertWarns(SomeWarning):
    do_something()
An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.
The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:
with self.assertWarns(SomeWarning) as cm:
    do_something()
the_warning = cm.warning
self.assertEqual(the_warning.some_attribute, 147)
- assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)
Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.
- Parameters:
expected_warning – Warning class expected to be triggered.
expected_regex – Regex (re.Pattern object or string) expected to be found in warning message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.
- countTestCases()
- debug()
Run the test without collecting errors in a TestResult.
- defaultTestResult()
- classmethod doClassCleanups()
Execute all class cleanup functions. Normally called for you after tearDownClass.
- doCleanups()
Execute all cleanup functions. Normally called for you after tearDown.
- classmethod enterClassContext(cm)
Same as enterContext, but class-wide.
- enterContext(cm)
Enters the supplied context manager.
If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.
- fail(msg=None)
Fail immediately, with the given message.
- failureException
alias of AssertionError
- id()
- longMessage = True
- maxDiff = 640
- run(result=None)
- sample_data(task: ModelTasks, use_proba: bool = False)[source]
Sample data for testing.
- setUp()
Hook method for setting up the test fixture before exercising it.
- classmethod setUpClass()
Hook method for setting up class fixture before running tests in the class.
- shortDescription()
Returns a one-line description of the test, or None if no description has been provided.
The default implementation of this method returns the first line of the specified test method’s docstring.
- skipTest(reason)
Skip this test.
- subTest(msg=<object object>, **params)
Return a context manager that will run the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.
- tearDown()
Hook method for deconstructing the test fixture after testing it.
- classmethod tearDownClass()
Hook method for deconstructing the class fixture after running all tests in the class.
- class qsprpred.models.tests.TestMonitors(methodName='runTest')[source]
Bases:
MonitorsCheckMixIn, TestCase
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
- classmethod addClassCleanup(function, /, *args, **kwargs)
Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).
- addCleanup(function, /, *args, **kwargs)
Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.
Cleanup items are called even if setUp fails (unlike tearDown).
- addTypeEqualityFunc(typeobj, function)
Add a type specific assertEqual style function to compare a type.
This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.
- Parameters:
typeobj – The data type to call this function on when both values are of the same type in assertEqual().
function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.
- assertAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
If the two objects compare equal then they will automatically compare almost equal.
- assertCountEqual(first, second, msg=None)
Asserts that two iterables have the same elements, the same number of times, without regard to order.
Equivalent to:
self.assertEqual(Counter(list(first)), Counter(list(second)))
- Example:
[0, 1, 1] and [1, 0, 1] compare equal.
[0, 0, 1] and [0, 1] compare unequal.
- assertDictEqual(d1, d2, msg=None)
- assertEqual(first, second, msg=None)
Fail if the two objects are unequal as determined by the ‘==’ operator.
- assertFalse(expr, msg=None)
Check that the expression is false.
- assertGreater(a, b, msg=None)
Just like self.assertTrue(a > b), but with a nicer default message.
- assertGreaterEqual(a, b, msg=None)
Just like self.assertTrue(a >= b), but with a nicer default message.
- assertIn(member, container, msg=None)
Just like self.assertTrue(a in b), but with a nicer default message.
- assertIs(expr1, expr2, msg=None)
Just like self.assertTrue(a is b), but with a nicer default message.
- assertIsInstance(obj, cls, msg=None)
Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.
- assertIsNone(obj, msg=None)
Same as self.assertTrue(obj is None), with a nicer default message.
- assertIsNot(expr1, expr2, msg=None)
Just like self.assertTrue(a is not b), but with a nicer default message.
- assertIsNotNone(obj, msg=None)
Included for symmetry with assertIsNone.
- assertLess(a, b, msg=None)
Just like self.assertTrue(a < b), but with a nicer default message.
- assertLessEqual(a, b, msg=None)
Just like self.assertTrue(a <= b), but with a nicer default message.
- assertListEqual(list1, list2, msg=None)
A list-specific equality assertion.
- Parameters:
list1 – The first list to compare.
list2 – The second list to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertLogs(logger=None, level=None)
Fail unless a log message of level level or higher is emitted on logger or its children. If omitted, level defaults to INFO and logger defaults to the root logger.
This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.
Example:
with self.assertLogs('foo', level='INFO') as cm:
    logging.getLogger('foo').info('first message')
    logging.getLogger('foo.bar').error('second message')
self.assertEqual(cm.output, ['INFO:foo:first message',
                             'ERROR:foo.bar:second message'])
- assertMultiLineEqual(first, second, msg=None)
Assert that two multi-line strings are equal.
- assertNoLogs(logger=None, level=None)
Fail unless no log messages of level level or higher are emitted on logger or its children.
This method must be used as a context manager.
- assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
Objects that are equal automatically fail.
- assertNotEqual(first, second, msg=None)
Fail if the two objects are equal as determined by the ‘!=’ operator.
- assertNotIn(member, container, msg=None)
Just like self.assertTrue(a not in b), but with a nicer default message.
- assertNotIsInstance(obj, cls, msg=None)
Included for symmetry with assertIsInstance.
- assertNotRegex(text, unexpected_regex, msg=None)
Fail the test if the text matches the regular expression.
- assertRaises(expected_exception, *args, **kwargs)
Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.
If called with the callable and arguments omitted, will return a context object used like this:
with self.assertRaises(SomeException):
    do_something()
An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.
The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:
with self.assertRaises(SomeException) as cm:
    do_something()
the_exception = cm.exception
self.assertEqual(the_exception.error_code, 3)
- assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)
Asserts that the message in a raised exception matches a regex.
- Parameters:
expected_exception – Exception class expected to be raised.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.
- assertRegex(text, expected_regex, msg=None)
Fail the test unless the text matches the regular expression.
- assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)
An equality assertion for ordered sequences (like lists and tuples).
For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.
- Parameters:
seq1 – The first sequence to compare.
seq2 – The second sequence to compare.
seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.
msg – Optional message to use on failure instead of a list of differences.
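The effect of seq_type can be sketched with a list and a tuple holding the same items. The test class is hypothetical.

```python
import unittest


class SequenceTypeTest(unittest.TestCase):
    def test_list_vs_tuple_without_seq_type(self):
        # With seq_type=None, only the items are compared, not the container type.
        self.assertSequenceEqual([1, 2, 3], (1, 2, 3))

    def test_seq_type_enforced(self):
        # With seq_type=list, the tuple argument is rejected.
        with self.assertRaises(AssertionError):
            self.assertSequenceEqual([1, 2, 3], (1, 2, 3), seq_type=list)


# Run the example tests programmatically and collect the outcome.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SequenceTypeTest)
result = unittest.TestResult()
suite.run(result)
print(result.testsRun, result.wasSuccessful())
```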
- assertSetEqual(set1, set2, msg=None)
A set-specific equality assertion.
- Parameters:
set1 – The first set to compare.
set2 – The second set to compare.
msg – Optional message to use on failure instead of a list of differences.
assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).
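As a sketch of the duck typing mentioned above, a set can be compared with a frozenset, while an argument without a difference method, such as a list, makes the assertion fail. The test class is hypothetical.

```python
import unittest


class SetEqualTest(unittest.TestCase):
    def test_set_and_frozenset(self):
        # Both types provide difference(), so they can be compared directly.
        self.assertSetEqual({1, 2}, frozenset({2, 1}))

    def test_difference_is_reported(self):
        with self.assertRaisesRegex(AssertionError, "first set but not the second"):
            self.assertSetEqual({1, 2, 3}, {1, 2})

    def test_list_is_rejected(self):
        # A list has no difference() method, so the assertion fails.
        with self.assertRaises(AssertionError):
            self.assertSetEqual([1, 2], {1, 2})


# Run the example tests programmatically and collect the outcome.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SetEqualTest)
result = unittest.TestResult()
suite.run(result)
print(result.testsRun, result.wasSuccessful())
```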
- assertTrue(expr, msg=None)
Check that the expression is true.
- assertTupleEqual(tuple1, tuple2, msg=None)
A tuple-specific equality assertion.
- Parameters:
tuple1 – The first tuple to compare.
tuple2 – The second tuple to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertWarns(expected_warning, *args, **kwargs)
Fail unless a warning of class expected_warning is triggered by the callable when invoked with specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.
If called with the callable and arguments omitted, will return a context object used like this:
with self.assertWarns(SomeWarning):
    do_something()
An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.
The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:
with self.assertWarns(SomeWarning) as cm:
    do_something()
the_warning = cm.warning
self.assertEqual(the_warning.some_attribute, 147)
- assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)
Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.
- Parameters:
expected_warning – Warning class expected to be triggered.
expected_regex – Regex (re.Pattern object or string) expected to be found in warning message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.
- baseMonitorTest(monitor: BaseMonitor, monitor_type: Literal['hyperparam', 'crossval', 'test', 'fit'], neural_net: bool)
Test the base monitor.
- checkOptimization(model: QSPRModel, ds: QSPRDataset, optimizer: HyperparameterOptimization)
- clearGenerated()
Remove the directories that are used for testing.
- countTestCases()
- createLargeMultitaskDataSet(name='QSPRDataset_multi_test', target_props=[{'name': 'HBD', 'task': <TargetTasks.MULTICLASS: 'MULTICLASS'>, 'th': [-1, 1, 2, 100]}, {'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42)
Create a large dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
preparation_settings (dict) – dictionary containing preparation settings
random_state (int) – random state to use for splitting and shuffling
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createLargeTestDataSet(name='QSPRDataset_test_large', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42, n_jobs=1, chunk_size=None)
Create a large dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
preparation_settings (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createSmallTestDataSet(name='QSPRDataset_test_small', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42)
Create a small dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
preparation_settings (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createTestDataSetFromFrame(df, name='QSPRDataset_test', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], random_state=None, prep=None, n_jobs=1, chunk_size=None)
Create a dataset for testing purposes from the given data frame.
- Parameters:
df (pd.DataFrame) – data frame containing the dataset
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
prep (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- debug()
Run the test without collecting errors in a TestResult
- defaultTestResult()
- classmethod doClassCleanups()
Execute all class cleanup functions. Normally called for you after tearDownClass.
- doCleanups()
Execute all cleanup functions. Normally called for you after tearDown.
- classmethod enterClassContext(cm)
Same as enterContext, but class-wide.
- enterContext(cm)
Enters the supplied context manager.
If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.
- fail(msg=None)
Fail immediately, with the given message.
- failureException
alias of AssertionError
- fileMonitorTest(monitor: FileMonitor, monitor_type: Literal['hyperparam', 'crossval', 'test', 'fit'], neural_net: bool)
Test if the correct files are generated
- fitTest(model: QSPRModel, ds: QSPRDataset)
Test model fitting, optimization and evaluation.
- Parameters:
model (QSPRModel) – The model to test.
ds (QSPRDataset) – The dataset to use for testing.
- classmethod getAllDescriptors()
Return a list of (ideally) all available descriptor sets. For now they need to be added manually to the list below.
TODO: would be nice to create the list automatically by implementing a descriptor set registry that would hold all installed descriptor sets.
- getBigDF()
Get a large data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- classmethod getDataPrepGrid()
Return a list of many possible combinations of descriptor calculators, splits, feature standardizers, feature filters and data filters. Again, this is not exhaustive, but should cover a lot of cases.
- Returns:
a generator that yields tuples of all possible combinations as stated above, each tuple is defined as: (descriptor_calculator, split, feature_standardizer, feature_filters, data_filters)
- Return type:
grid
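A minimal sketch of how a grid of this shape could be generated, assuming it is the Cartesian product of the individual option lists. The option values below are hypothetical placeholder strings, not QSPRpred objects, and the actual method may build its grid differently.

```python
import itertools

# Hypothetical placeholder options; real tests would use actual objects.
descriptor_calculators = ["calc_a", "calc_b"]
splits = [None, "split_a"]
feature_standardizers = [None, "scaler_a"]
feature_filters = [None, "feature_filter_a"]
data_filters = [None, "data_filter_a"]


def prep_grid():
    """Yield (descriptor_calculator, split, feature_standardizer,
    feature_filters, data_filters) tuples for every combination."""
    yield from itertools.product(
        descriptor_calculators,
        splits,
        feature_standardizers,
        feature_filters,
        data_filters,
    )


grid = list(prep_grid())
print(len(grid), grid[0])
```

Because the result is a generator, combinations are only materialised when iterated, which keeps memory use low for large grids.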
- classmethod getDefaultCalculatorCombo()
Makes a list of default descriptor calculators that can be used in tests. It creates calculators with only Morgan fingerprints and with only RDKit descriptors, as well as one with both, to test behaviour with multiple descriptor sets. Override this method if you want to test with other descriptor sets and calculator combinations.
- static getDefaultPrep()
Return a dictionary with default preparation settings.
- classmethod getPrepCombos()
Return a list of all possible preparation combinations as generated by getDataPrepGrid as well as their names. The generated list can be used to parameterize tests with the given named combinations.
- getSmallDF()
Get a small data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- property gridFile
- id()
- listMonitorTest(monitor: ListMonitor, monitor_type: Literal['hyperparam', 'crossval', 'test', 'fit'], neural_net: bool)
- longMessage = True
- maxDiff = 640
- predictorTest(model: QSPRModel, dataset: QSPRDataset, comparison_model: QSPRModel | None = None, expect_equal_result=True, **pred_kwargs)
Test model predictions.
Checks if the shape of the predictions is as expected and if the predictions of the predictMols function are consistent with the predictions of the predict/predictProba functions. Also checks if the predictions of the model are the same as the predictions of the comparison model if given.
- Parameters:
model (QSPRModel) – The model to make predictions with.
dataset (QSPRDataset) – The dataset to make predictions for.
comparison_model (QSPRModel) – another model to compare the predictions with.
expect_equal_result (bool) – Whether the expected result should be equal or not equal to the predictions of the comparison model.
**pred_kwargs – Extra keyword arguments to pass to the predictor’s predictMols method.
- run(result=None)
- runMonitorTest(model, data, monitor_type, test_method, nerual_net, *args, **kwargs)
- classmethod setUpClass()
Hook method for setting up class fixture before running tests in the class.
- setUpPaths()
Set up the test environment.
- shortDescription()
Returns a one-line description of the test, or None if no description has been provided.
The default implementation of this method returns the first line of the specified test method’s docstring.
- skipTest(reason)
Skip this test.
- subTest(msg=<object object>, **params)
Return a context manager that will return the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.
- tearDown()
Remove all files and directories that are used for testing.
- classmethod tearDownClass()
Hook method for deconstructing the class fixture after running all tests in the class.
- trainModelWithMonitoring(model: ~qsprpred.models.model.QSPRModel, ds: ~qsprpred.data.tables.qspr.QSPRDataset, hyperparam_monitor: ~qsprpred.models.monitors.HyperparameterOptimizationMonitor, crossval_monitor: ~qsprpred.models.monitors.AssessorMonitor, test_monitor: ~qsprpred.models.monitors.AssessorMonitor, fit_monitor: ~qsprpred.models.monitors.FitMonitor) -> (<class 'qsprpred.models.monitors.HyperparameterOptimizationMonitor'>, <class 'qsprpred.models.monitors.AssessorMonitor'>, <class 'qsprpred.models.monitors.AssessorMonitor'>, <class 'qsprpred.models.monitors.FitMonitor'>)
- validate_split(dataset)
Check if the split has the data it should have after splitting.
- class qsprpred.models.tests.TestSklearnClassification(methodName='runTest')[source]
Bases:
SklearnBaseModelTestCase
Test the SklearnModel class for classification models.
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
- classmethod addClassCleanup(function, /, *args, **kwargs)
Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).
- addCleanup(function, /, *args, **kwargs)
Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.
Cleanup items are called even if setUp fails (unlike tearDown).
- addTypeEqualityFunc(typeobj, function)
Add a type specific assertEqual style function to compare a type.
This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.
- Parameters:
typeobj – The data type to call this function on when both values are of the same type in assertEqual().
function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.
- assertAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
If the two objects compare equal then they will automatically compare almost equal.
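The difference between `places` and `delta`, and between decimal places and significant digits, can be seen in a brief sketch. The numeric values are arbitrary illustrations.

```python
import unittest

tc = unittest.TestCase()

# Default places=7: the difference of about 1e-8 rounds to zero, so this passes.
tc.assertAlmostEqual(1.00000001, 1.0)

# delta gives an absolute tolerance instead of a number of decimal places.
tc.assertAlmostEqual(100.0, 100.4, delta=0.5)

# The two values agree in their first seven significant digits,
# but they differ at the fourth decimal place, so places=4 fails.
try:
    tc.assertAlmostEqual(1234.5678, 1234.5679, places=4)
    places_check_failed = False
except AssertionError:
    places_check_failed = True

# With places=3 the same comparison passes.
tc.assertAlmostEqual(1234.5678, 1234.5679, places=3)
print(places_check_failed)
```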
- assertCountEqual(first, second, msg=None)
Asserts that two iterables have the same elements, the same number of times, without regard to order.
    self.assertEqual(Counter(list(first)), Counter(list(second)))
- Example:
[0, 1, 1] and [1, 0, 1] compare equal.
[0, 0, 1] and [0, 1] compare unequal.
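The two cases from the example above can be run directly, as in this short sketch:

```python
import unittest
from collections import Counter

tc = unittest.TestCase()

# Same elements with the same multiplicities, in a different order: passes.
tc.assertCountEqual([0, 1, 1], [1, 0, 1])
same_counts = Counter([0, 1, 1]) == Counter([1, 0, 1])

# The element 0 occurs twice on the left but once on the right: fails.
try:
    tc.assertCountEqual([0, 0, 1], [0, 1])
    unequal_detected = False
except AssertionError:
    unequal_detected = True

print(same_counts, unequal_detected)
```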
- assertDictEqual(d1, d2, msg=None)
- assertEqual(first, second, msg=None)
Fail if the two objects are unequal as determined by the ‘==’ operator.
- assertFalse(expr, msg=None)
Check that the expression is false.
- assertGreater(a, b, msg=None)
Just like self.assertTrue(a > b), but with a nicer default message.
- assertGreaterEqual(a, b, msg=None)
Just like self.assertTrue(a >= b), but with a nicer default message.
- assertIn(member, container, msg=None)
Just like self.assertTrue(a in b), but with a nicer default message.
- assertIs(expr1, expr2, msg=None)
Just like self.assertTrue(a is b), but with a nicer default message.
- assertIsInstance(obj, cls, msg=None)
Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.
- assertIsNone(obj, msg=None)
Same as self.assertTrue(obj is None), with a nicer default message.
- assertIsNot(expr1, expr2, msg=None)
Just like self.assertTrue(a is not b), but with a nicer default message.
- assertIsNotNone(obj, msg=None)
Included for symmetry with assertIsNone.
- assertLess(a, b, msg=None)
Just like self.assertTrue(a < b), but with a nicer default message.
- assertLessEqual(a, b, msg=None)
Just like self.assertTrue(a <= b), but with a nicer default message.
- assertListEqual(list1, list2, msg=None)
A list-specific equality assertion.
- Parameters:
list1 – The first list to compare.
list2 – The second list to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertLogs(logger=None, level=None)
Fail unless a log message of level level or higher is emitted on logger_name or its children. If omitted, level defaults to INFO and logger defaults to the root logger.
This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.
Example:
    with self.assertLogs('foo', level='INFO') as cm:
        logging.getLogger('foo').info('first message')
        logging.getLogger('foo.bar').error('second message')
    self.assertEqual(cm.output, ['INFO:foo:first message',
                                 'ERROR:foo.bar:second message'])
- assertMultiLineEqual(first, second, msg=None)
Assert that two multi-line strings are equal.
- assertNoLogs(logger=None, level=None)
Fail unless no log messages of level level or higher are emitted on logger_name or its children.
This method must be used as a context manager.
- assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
Objects that are equal automatically fail.
- assertNotEqual(first, second, msg=None)
Fail if the two objects are equal as determined by the ‘!=’ operator.
- assertNotIn(member, container, msg=None)
Just like self.assertTrue(a not in b), but with a nicer default message.
- assertNotIsInstance(obj, cls, msg=None)
Included for symmetry with assertIsInstance.
- assertNotRegex(text, unexpected_regex, msg=None)
Fail the test if the text matches the regular expression.
- assertRaises(expected_exception, *args, **kwargs)
Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.
If called with the callable and arguments omitted, will return a context object used like this:
    with self.assertRaises(SomeException):
        do_something()
An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.
The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:
    with self.assertRaises(SomeException) as cm:
        do_something()
    the_exception = cm.exception
    self.assertEqual(the_exception.error_code, 3)
- assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)
Asserts that the message in a raised exception matches a regex.
- Parameters:
expected_exception – Exception class expected to be raised.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.
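A minimal sketch of both calling styles follows. The `parse_threshold` function is a hypothetical helper written only for this illustration.

```python
import unittest


def parse_threshold(text):  # hypothetical helper
    value = float(text)
    if value < 0:
        raise ValueError(f"threshold must be non-negative, got {value}")
    return value


tc = unittest.TestCase()

# Context-manager form: the regex is searched for in str(exception).
with tc.assertRaisesRegex(ValueError, r"non-negative, got -1\.0") as cm:
    parse_threshold("-1")
caught_message = str(cm.exception)

# Callable form: the function and its arguments follow the regex.
tc.assertRaisesRegex(ValueError, "non-negative", parse_threshold, "-2")
print(caught_message)
```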
- assertRegex(text, expected_regex, msg=None)
Fail the test unless the text matches the regular expression.
- assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)
An equality assertion for ordered sequences (like lists and tuples).
For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.
- Parameters:
seq1 – The first sequence to compare.
seq2 – The second sequence to compare.
seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.
msg – Optional message to use on failure instead of a list of differences.
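The effect of `seq_type` can be sketched as follows. The behaviour shown for mixed list and tuple arguments reflects CPython's implementation as far as I am aware, so treat it as an assumption rather than a documented guarantee.

```python
import unittest

tc = unittest.TestCase()

# Without seq_type, a list and a tuple with equal items are accepted.
tc.assertSequenceEqual([1, 2, 3], (1, 2, 3))

# With seq_type=list, both arguments must be list instances.
try:
    tc.assertSequenceEqual([1, 2, 3], (1, 2, 3), seq_type=list)
    type_enforced = False
except AssertionError:
    type_enforced = True

print(type_enforced)
```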
- assertSetEqual(set1, set2, msg=None)
A set-specific equality assertion.
- Parameters:
set1 – The first set to compare.
set2 – The second set to compare.
msg – Optional message to use on failure instead of a list of differences.
assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).
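A short sketch of the duck typing mentioned above: a `set` and a `frozenset` compare equal because both support `difference`, and a mismatch reports the differing items.

```python
import unittest

tc = unittest.TestCase()

# Different set types with the same members: passes.
tc.assertSetEqual({1, 2, 3}, frozenset([3, 2, 1]))

# Differing members: the failure message lists what each side is missing.
try:
    tc.assertSetEqual({1, 2}, {2, 3})
    message = ""
except AssertionError as err:
    message = str(err)

print(message)
```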
- assertTrue(expr, msg=None)
Check that the expression is true.
- assertTupleEqual(tuple1, tuple2, msg=None)
A tuple-specific equality assertion.
- Parameters:
tuple1 – The first tuple to compare.
tuple2 – The second tuple to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertWarns(expected_warning, *args, **kwargs)
Fail unless a warning of class warnClass is triggered by the callable when invoked with specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.
If called with the callable and arguments omitted, will return a context object used like this:
    with self.assertWarns(SomeWarning):
        do_something()
An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.
The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:
    with self.assertWarns(SomeWarning) as cm:
        do_something()
    the_warning = cm.warning
    self.assertEqual(the_warning.some_attribute, 147)
- assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)
Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.
- Parameters:
expected_warning – Warning class expected to be triggered.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.
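A minimal sketch of the context-manager form follows. The `legacy_api` function is a hypothetical example that emits a `DeprecationWarning`.

```python
import unittest
import warnings


def legacy_api():  # hypothetical function that emits a warning
    warnings.warn("legacy_api() is deprecated", DeprecationWarning, stacklevel=2)
    return 42


tc = unittest.TestCase()

# Only warnings whose message matches the regex count as a match.
with tc.assertWarnsRegex(DeprecationWarning, r"legacy_api\(\) is deprecated") as cm:
    value = legacy_api()

warning_text = str(cm.warning)
print(value, warning_text)
```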
- checkOptimization(model: QSPRModel, ds: QSPRDataset, optimizer: HyperparameterOptimization)
- clearGenerated()
Remove the directories that are used for testing.
- countTestCases()
- createLargeMultitaskDataSet(name='QSPRDataset_multi_test', target_props=[{'name': 'HBD', 'task': <TargetTasks.MULTICLASS: 'MULTICLASS'>, 'th': [-1, 1, 2, 100]}, {'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42)
Create a large dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
preparation_settings (dict) – dictionary containing preparation settings
random_state (int) – random state to use for splitting and shuffling
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createLargeTestDataSet(name='QSPRDataset_test_large', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42, n_jobs=1, chunk_size=None)
Create a large dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
preparation_settings (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createSmallTestDataSet(name='QSPRDataset_test_small', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42)
Create a small dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
preparation_settings (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createTestDataSetFromFrame(df, name='QSPRDataset_test', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], random_state=None, prep=None, n_jobs=1, chunk_size=None)
Create a dataset for testing purposes from the given data frame.
- Parameters:
df (pd.DataFrame) – data frame containing the dataset
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
prep (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- debug()
Run the test without collecting errors in a TestResult
- defaultTestResult()
- classmethod doClassCleanups()
Execute all class cleanup functions. Normally called for you after tearDownClass.
- doCleanups()
Execute all cleanup functions. Normally called for you after tearDown.
- classmethod enterClassContext(cm)
Same as enterContext, but class-wide.
- enterContext(cm)
Enters the supplied context manager.
If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.
- fail(msg=None)
Fail immediately, with the given message.
- failureException
alias of AssertionError
- fitTest(model: QSPRModel, ds: QSPRDataset)
Test model fitting, optimization and evaluation.
- Parameters:
model (QSPRModel) – The model to test.
ds (QSPRDataset) – The dataset to use for testing.
- classmethod getAllDescriptors()
Return a list of (ideally) all available descriptor sets. For now they need to be added manually to the list below.
TODO: would be nice to create the list automatically by implementing a descriptor set registry that would hold all installed descriptor sets.
- getBigDF()
Get a large data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- classmethod getDataPrepGrid()
Return a list of many possible combinations of descriptor calculators, splits, feature standardizers, feature filters and data filters. Again, this is not exhaustive, but should cover a lot of cases.
- Returns:
a generator that yields tuples of all possible combinations as stated above, each tuple is defined as: (descriptor_calculator, split, feature_standardizer, feature_filters, data_filters)
- Return type:
grid
- classmethod getDefaultCalculatorCombo()
Makes a list of default descriptor calculators that can be used in tests. It creates calculators with only Morgan fingerprints and with only RDKit descriptors, as well as one with both, to test behaviour with multiple descriptor sets. Override this method if you want to test with other descriptor sets and calculator combinations.
- static getDefaultPrep()
Return a dictionary with default preparation settings.
- getModel(name: str, alg: Type | None = None, parameters: dict | None = None, random_state: int | None = None)
Create a SklearnModel model.
- Parameters:
name (str) – the name of the model
alg (Type, optional) – the algorithm to use. Defaults to None.
dataset (QSPRDataset, optional) – the dataset to use. Defaults to None.
parameters (dict, optional) – the parameters to use. Defaults to None.
random_state (int, optional) – Random state to use for shuffling and other random operations. Defaults to None.
- Returns:
the model
- Return type:
SklearnModel
- classmethod getPrepCombos()
Return a list of all possible preparation combinations as generated by getDataPrepGrid as well as their names. The generated list can be used to parameterize tests with the given named combinations.
- getSmallDF()
Get a small data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- property gridFile
- id()
- longMessage = True
- maxDiff = 640
- predictorTest(model: QSPRModel, dataset: QSPRDataset, comparison_model: QSPRModel | None = None, expect_equal_result=True, **pred_kwargs)
Test model predictions.
Checks if the shape of the predictions is as expected and if the predictions of the predictMols function are consistent with the predictions of the predict/predictProba functions. Also checks if the predictions of the model are the same as the predictions of the comparison model if given.
- Parameters:
model (QSPRModel) – The model to make predictions with.
dataset (QSPRDataset) – The dataset to make predictions for.
comparison_model (QSPRModel) – another model to compare the predictions with.
expect_equal_result (bool) – Whether the expected result should be equal or not equal to the predictions of the comparison model.
**pred_kwargs – Extra keyword arguments to pass to the predictor’s predictMols method.
- run(result=None)
- setUp()
Hook method for setting up the test fixture before exercising it.
- classmethod setUpClass()
Hook method for setting up class fixture before running tests in the class.
- setUpPaths()
Set up the test environment.
- shortDescription()
Returns a one-line description of the test, or None if no description has been provided.
The default implementation of this method returns the first line of the specified test method’s docstring.
- skipTest(reason)
Skip this test.
- subTest(msg=<object object>, **params)
Return a context manager that will return the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.
- tearDown()
Remove all files and directories that are used for testing.
- classmethod tearDownClass()
Hook method for deconstructing the class fixture after running all tests in the class.
- testClassificationBasicFit = None
- testClassificationBasicFit_00_RFC_SINGLECLASS(**kw)
Test model training for classification models [with _=’RFC_SINGLECLASS’, task=<TargetTasks.SINGLECLASS: ‘SINGLECLASS’>, th=[6.5], model_name=’RFC’, model_class=<class ‘sklearn.ensemble._forest.RandomForestClassifier’>, random_state=[None]].
- testClassificationBasicFit_01_RFC_SINGLECLASS(**kw)
Test model training for classification models [with _=’RFC_SINGLECLASS’, task=<TargetTasks.SINGLECLASS: ‘SINGLECLASS’>, th=[6.5], model_name=’RFC’, model_class=<class ‘sklearn.ensemble._forest.RandomForestClassifier’>, random_state=[1, 42]].
- testClassificationBasicFit_02_RFC_SINGLECLASS(**kw)
Test model training for classification models [with _=’RFC_SINGLECLASS’, task=<TargetTasks.SINGLECLASS: ‘SINGLECLASS’>, th=[6.5], model_name=’RFC’, model_class=<class ‘sklearn.ensemble._forest.RandomForestClassifier’>, random_state=[42, 42]].
- testClassificationBasicFit_03_RFC_MULTICLASS(**kw)
Test model training for classification models [with _=’RFC_MULTICLASS’, task=<TargetTasks.MULTICLASS: ‘MULTICLASS’>, th=[0, 2, 10, 1100], model_name=’RFC’, model_class=<class ‘sklearn.ensemble._forest.RandomForestClassifier’>, random_state=[None]].
- testClassificationBasicFit_04_RFC_MULTICLASS(**kw)
Test model training for classification models [with _=’RFC_MULTICLASS’, task=<TargetTasks.MULTICLASS: ‘MULTICLASS’>, th=[0, 2, 10, 1100], model_name=’RFC’, model_class=<class ‘sklearn.ensemble._forest.RandomForestClassifier’>, random_state=[1, 42]].
- testClassificationBasicFit_05_RFC_MULTICLASS(**kw)
Test model training for classification models [with _=’RFC_MULTICLASS’, task=<TargetTasks.MULTICLASS: ‘MULTICLASS’>, th=[0, 2, 10, 1100], model_name=’RFC’, model_class=<class ‘sklearn.ensemble._forest.RandomForestClassifier’>, random_state=[42, 42]].
- testClassificationBasicFit_06_XGBC_SINGLECLASS(**kw)
Test model training for classification models [with _=’XGBC_SINGLECLASS’, task=<TargetTasks.SINGLECLASS: ‘SINGLECLASS’>, th=[6.5], model_name=’XGBC’, model_class=<class ‘xgboost.sklearn.XGBClassifier’>, random_state=[None]].
- testClassificationBasicFit_07_XGBC_SINGLECLASS(**kw)
Test model training for classification models [with _=’XGBC_SINGLECLASS’, task=<TargetTasks.SINGLECLASS: ‘SINGLECLASS’>, th=[6.5], model_name=’XGBC’, model_class=<class ‘xgboost.sklearn.XGBClassifier’>, random_state=[1, 42]].
- testClassificationBasicFit_08_XGBC_SINGLECLASS(**kw)
Test model training for classification models [with _=’XGBC_SINGLECLASS’, task=<TargetTasks.SINGLECLASS: ‘SINGLECLASS’>, th=[6.5], model_name=’XGBC’, model_class=<class ‘xgboost.sklearn.XGBClassifier’>, random_state=[42, 42]].
- testClassificationBasicFit_09_XGBC_MULTICLASS(**kw)
Test model training for classification models [with _=’XGBC_MULTICLASS’, task=<TargetTasks.MULTICLASS: ‘MULTICLASS’>, th=[0, 2, 10, 1100], model_name=’XGBC’, model_class=<class ‘xgboost.sklearn.XGBClassifier’>, random_state=[None]].
- testClassificationBasicFit_10_XGBC_MULTICLASS(**kw)
Test model training for classification models [with _=’XGBC_MULTICLASS’, task=<TargetTasks.MULTICLASS: ‘MULTICLASS’>, th=[0, 2, 10, 1100], model_name=’XGBC’, model_class=<class ‘xgboost.sklearn.XGBClassifier’>, random_state=[1, 42]].
- testClassificationBasicFit_11_XGBC_MULTICLASS(**kw)
Test model training for classification models [with _=’XGBC_MULTICLASS’, task=<TargetTasks.MULTICLASS: ‘MULTICLASS’>, th=[0, 2, 10, 1100], model_name=’XGBC’, model_class=<class ‘xgboost.sklearn.XGBClassifier’>, random_state=[42, 42]].
- testClassificationBasicFit_12_SVC_SINGLECLASS(**kw)
Test model training for classification models [with _=’SVC_SINGLECLASS’, task=<TargetTasks.SINGLECLASS: ‘SINGLECLASS’>, th=[6.5], model_name=’SVC’, model_class=<class ‘sklearn.svm._classes.SVC’>, random_state=[None]].
- testClassificationBasicFit_13_SVC_MULTICLASS(**kw)
Test model training for classification models [with _=’SVC_MULTICLASS’, task=<TargetTasks.MULTICLASS: ‘MULTICLASS’>, th=[0, 2, 10, 1100], model_name=’SVC’, model_class=<class ‘sklearn.svm._classes.SVC’>, random_state=[None]].
- testClassificationBasicFit_14_KNNC_SINGLECLASS(**kw)
Test model training for classification models [with _=’KNNC_SINGLECLASS’, task=<TargetTasks.SINGLECLASS: ‘SINGLECLASS’>, th=[6.5], model_name=’KNNC’, model_class=<class ‘sklearn.neighbors._classification.KNeighborsClassifier’>, random_state=[None]].
- testClassificationBasicFit_15_KNNC_MULTICLASS(**kw)
Test model training for classification models [with _=’KNNC_MULTICLASS’, task=<TargetTasks.MULTICLASS: ‘MULTICLASS’>, th=[0, 2, 10, 1100], model_name=’KNNC’, model_class=<class ‘sklearn.neighbors._classification.KNeighborsClassifier’>, random_state=[None]].
- testClassificationBasicFit_16_NB_SINGLECLASS(**kw)
Test model training for classification models [with _=’NB_SINGLECLASS’, task=<TargetTasks.SINGLECLASS: ‘SINGLECLASS’>, th=[6.5], model_name=’NB’, model_class=<class ‘sklearn.naive_bayes.GaussianNB’>, random_state=[None]].
- testClassificationBasicFit_17_NB_MULTICLASS(**kw)
Test model training for classification models [with _=’NB_MULTICLASS’, task=<TargetTasks.MULTICLASS: ‘MULTICLASS’>, th=[0, 2, 10, 1100], model_name=’NB’, model_class=<class ‘sklearn.naive_bayes.GaussianNB’>, random_state=[None]].
- validate_split(dataset)
Check if the split has the data it should have after splitting.
- class qsprpred.models.tests.TestSklearnClassificationMultiTask(methodName='runTest')[source]
Bases:
SklearnBaseModelTestCase
Test the SklearnModel class for multi-task classification models.
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
- classmethod addClassCleanup(function, /, *args, **kwargs)
Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).
- addCleanup(function, /, *args, **kwargs)
Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.
Cleanup items are called even if setUp fails (unlike tearDown).
- addTypeEqualityFunc(typeobj, function)
Add a type specific assertEqual style function to compare a type.
This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.
- Parameters:
typeobj – The data type to call this function on when both values are of the same type in assertEqual().
function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.
- assertAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
If the two objects compare equal then they will automatically compare almost equal.
- assertCountEqual(first, second, msg=None)
Asserts that two iterables have the same elements, the same number of times, without regard to order.
    self.assertEqual(Counter(list(first)), Counter(list(second)))
- Example:
[0, 1, 1] and [1, 0, 1] compare equal.
[0, 0, 1] and [0, 1] compare unequal.
- assertDictEqual(d1, d2, msg=None)
- assertEqual(first, second, msg=None)
Fail if the two objects are unequal as determined by the ‘==’ operator.
- assertFalse(expr, msg=None)
Check that the expression is false.
- assertGreater(a, b, msg=None)
Just like self.assertTrue(a > b), but with a nicer default message.
- assertGreaterEqual(a, b, msg=None)
Just like self.assertTrue(a >= b), but with a nicer default message.
- assertIn(member, container, msg=None)
Just like self.assertTrue(a in b), but with a nicer default message.
- assertIs(expr1, expr2, msg=None)
Just like self.assertTrue(a is b), but with a nicer default message.
- assertIsInstance(obj, cls, msg=None)
Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.
- assertIsNone(obj, msg=None)
Same as self.assertTrue(obj is None), with a nicer default message.
- assertIsNot(expr1, expr2, msg=None)
Just like self.assertTrue(a is not b), but with a nicer default message.
- assertIsNotNone(obj, msg=None)
Included for symmetry with assertIsNone.
- assertLess(a, b, msg=None)
Just like self.assertTrue(a < b), but with a nicer default message.
- assertLessEqual(a, b, msg=None)
Just like self.assertTrue(a <= b), but with a nicer default message.
- assertListEqual(list1, list2, msg=None)
A list-specific equality assertion.
- Parameters:
list1 – The first list to compare.
list2 – The second list to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertLogs(logger=None, level=None)
Fail unless a log message of level level or higher is emitted on logger or its children. If omitted, level defaults to INFO and logger defaults to the root logger.
This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.
Example:
with self.assertLogs('foo', level='INFO') as cm:
    logging.getLogger('foo').info('first message')
    logging.getLogger('foo.bar').error('second message')
self.assertEqual(cm.output, ['INFO:foo:first message', 'ERROR:foo.bar:second message'])
- assertMultiLineEqual(first, second, msg=None)
Assert that two multi-line strings are equal.
- assertNoLogs(logger=None, level=None)
Fail unless no log messages of level level or higher are emitted on logger or its children.
This method must be used as a context manager.
- assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
Objects that are equal automatically fail.
- assertNotEqual(first, second, msg=None)
Fail if the two objects are equal as determined by the ‘!=’ operator.
- assertNotIn(member, container, msg=None)
Just like self.assertTrue(a not in b), but with a nicer default message.
- assertNotIsInstance(obj, cls, msg=None)
Included for symmetry with assertIsInstance.
- assertNotRegex(text, unexpected_regex, msg=None)
Fail the test if the text matches the regular expression.
- assertRaises(expected_exception, *args, **kwargs)
Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.
If called with the callable and arguments omitted, will return a context object used like this:
with self.assertRaises(SomeException):
    do_something()
An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.
The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:
with self.assertRaises(SomeException) as cm:
    do_something()
the_exception = cm.exception
self.assertEqual(the_exception.error_code, 3)
- assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)
Asserts that the message in a raised exception matches a regex.
- Parameters:
expected_exception – Exception class expected to be raised.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.
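A short sketch of both calling conventions for assertRaisesRegex: the callable form, where args carries the function and its arguments, and the context-manager form. The class name RaisesRegexDemo and the helper run_case are illustrative only.

```python
import io
import unittest


class RaisesRegexDemo(unittest.TestCase):
    def test_callable_form(self):
        # int("abc") raises ValueError("invalid literal for int() ...").
        self.assertRaisesRegex(ValueError, r"invalid literal", int, "abc")

    def test_context_manager_form(self):
        with self.assertRaisesRegex(ZeroDivisionError, r"division") as cm:
            1 / 0
        # The raised exception stays available for further checks.
        self.assertIsInstance(cm.exception, ZeroDivisionError)

    def test_wrong_message_fails(self):
        # The exception type matches, but the message does not.
        with self.assertRaises(AssertionError):
            with self.assertRaisesRegex(ValueError, r"no such text"):
                int("abc")


def run_case(case):
    """Run a TestCase class quietly and return its TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)


result = run_case(RaisesRegexDemo)
print(result.testsRun, result.wasSuccessful())
```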
- assertRegex(text, expected_regex, msg=None)
Fail the test unless the text matches the regular expression.
- assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)
An equality assertion for ordered sequences (like lists and tuples).
For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.
- Parameters:
seq1 – The first sequence to compare.
seq2 – The second sequence to compare.
seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.
msg – Optional message to use on failure instead of a list of differences.
- assertSetEqual(set1, set2, msg=None)
A set-specific equality assertion.
- Parameters:
set1 – The first set to compare.
set2 – The second set to compare.
msg – Optional message to use on failure instead of a list of differences.
assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).
- assertTrue(expr, msg=None)
Check that the expression is true.
- assertTupleEqual(tuple1, tuple2, msg=None)
A tuple-specific equality assertion.
- Parameters:
tuple1 – The first tuple to compare.
tuple2 – The second tuple to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertWarns(expected_warning, *args, **kwargs)
Fail unless a warning of class expected_warning is triggered by the callable when invoked with specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.
If called with the callable and arguments omitted, will return a context object used like this:
with self.assertWarns(SomeWarning):
    do_something()
An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.
The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:
with self.assertWarns(SomeWarning) as cm:
    do_something()
the_warning = cm.warning
self.assertEqual(the_warning.some_attribute, 147)
- assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)
Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.
- Parameters:
expected_warning – Warning class expected to be triggered.
expected_regex – Regex (re.Pattern object or string) expected to be found in the warning message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.
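The following sketch shows assertWarnsRegex in both forms. legacy_mean is a hypothetical helper written for this example; it is not a QSPRpred function.

```python
import io
import unittest
import warnings


def legacy_mean(values):
    """Hypothetical helper that warns before computing a mean."""
    warnings.warn("legacy_mean is deprecated; use statistics.mean", DeprecationWarning)
    return sum(values) / len(values)


class WarnsRegexDemo(unittest.TestCase):
    def test_context_manager_form(self):
        with self.assertWarnsRegex(DeprecationWarning, r"deprecated") as cm:
            value = legacy_mean([1, 2, 3])
        self.assertEqual(value, 2.0)
        # cm.warning is the first warning instance that matched.
        self.assertIn("statistics.mean", str(cm.warning))

    def test_callable_form(self):
        # args carries the callable followed by its positional arguments.
        self.assertWarnsRegex(DeprecationWarning, r"^legacy_mean", legacy_mean, [4])


def run_case(case):
    """Run a TestCase class quietly and return its TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)


result = run_case(WarnsRegexDemo)
print(result.testsRun, result.wasSuccessful())
```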
- checkOptimization(model: QSPRModel, ds: QSPRDataset, optimizer: HyperparameterOptimization)
- clearGenerated()
Remove the directories that are used for testing.
- countTestCases()
- createLargeMultitaskDataSet(name='QSPRDataset_multi_test', target_props=[{'name': 'HBD', 'task': <TargetTasks.MULTICLASS: 'MULTICLASS'>, 'th': [-1, 1, 2, 100]}, {'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42)
Create a large dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
preparation_settings (dict) – dictionary containing preparation settings
random_state (int) – random state to use for splitting and shuffling
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createLargeTestDataSet(name='QSPRDataset_test_large', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42, n_jobs=1, chunk_size=None)
Create a large dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
preparation_settings (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createSmallTestDataSet(name='QSPRDataset_test_small', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42)
Create a small dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
preparation_settings (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createTestDataSetFromFrame(df, name='QSPRDataset_test', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], random_state=None, prep=None, n_jobs=1, chunk_size=None)
Create a dataset for testing purposes from the given data frame.
- Parameters:
df (pd.DataFrame) – data frame containing the dataset
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
prep (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- debug()
Run the test without collecting errors in a TestResult
- defaultTestResult()
- classmethod doClassCleanups()
Execute all class cleanup functions. Normally called for you after tearDownClass.
- doCleanups()
Execute all cleanup functions. Normally called for you after tearDown.
- classmethod enterClassContext(cm)
Same as enterContext, but class-wide.
- enterContext(cm)
Enters the supplied context manager.
If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.
- fail(msg=None)
Fail immediately, with the given message.
- failureException
alias of AssertionError
- fitTest(model: QSPRModel, ds: QSPRDataset)
Test model fitting, optimization and evaluation.
- Parameters:
model (QSPRModel) – The model to test.
ds (QSPRDataset) – The dataset to use for testing.
- classmethod getAllDescriptors()
Return a list of (ideally) all available descriptor sets. For now they need to be added manually to the list below.
TODO: would be nice to create the list automatically by implementing a descriptor set registry that would hold all installed descriptor sets.
- getBigDF()
Get a large data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- classmethod getDataPrepGrid()
Return a list of many possible combinations of descriptor calculators, splits, feature standardizers, feature filters and data filters. Again, this is not exhaustive, but should cover a lot of cases.
- Returns:
a generator that yields tuples of all possible combinations as stated above, each tuple is defined as: (descriptor_calculator, split, feature_standardizer, feature_filters, data_filters)
- Return type:
grid
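How getDataPrepGrid builds its grid internally is not shown in this reference. As a rough sketch of the idea of yielding every combination as a 5-tuple, the standard-library itertools.product can be used. All option lists below are hypothetical string placeholders; in QSPRpred they would be descriptor calculators, splits, standardizers and filter objects.

```python
import itertools

# Hypothetical stand-ins for the real preparation components.
descriptor_calculators = ["morgan", "rdkit"]
splits = [None, "random_split"]
feature_standardizers = [None, "standard_scaler"]
feature_filters = [None, "low_variance"]
data_filters = [None, "category_filter"]


def data_prep_grid():
    """Yield (descriptor_calculator, split, feature_standardizer,
    feature_filters, data_filters) tuples for every combination."""
    yield from itertools.product(
        descriptor_calculators,
        splits,
        feature_standardizers,
        feature_filters,
        data_filters,
    )


grid = list(data_prep_grid())
print(len(grid))
print(grid[0])
```

Because the result is a generator, combinations are produced lazily; wrapping it in list() as above is only needed when the full grid has to be counted or reused.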
- classmethod getDefaultCalculatorCombo()
Makes a list of default descriptor calculators that can be used in tests. It creates calculators with only Morgan fingerprints and only RDKit descriptors, and one that combines both to test behaviour with multiple descriptor sets. Override this method if you want to test with other descriptor sets and calculator combinations.
- static getDefaultPrep()
Return a dictionary with default preparation settings.
- getModel(name: str, alg: Type | None = None, parameters: dict | None = None, random_state: int | None = None)
Create a SklearnModel model.
- Parameters:
name (str) – the name of the model
alg (Type, optional) – the algorithm to use. Defaults to None.
dataset (QSPRDataset, optional) – the dataset to use. Defaults to None.
parameters (dict, optional) – the parameters to use. Defaults to None.
random_state (int, optional) – Random state to use for shuffling and other random operations. Defaults to None.
- Returns:
the model
- Return type:
- classmethod getPrepCombos()
Return a list of all possible preparation combinations as generated by getDataPrepGrid as well as their names. The generated list can be used to parameterize tests with the given named combinations.
- getSmallDF()
Get a small data frame for testing purposes.
- Returns:
a
pandas.DataFrame
containing the dataset- Return type:
pd.DataFrame
- property gridFile
- id()
- longMessage = True
- maxDiff = 640
- predictorTest(model: QSPRModel, dataset: QSPRDataset, comparison_model: QSPRModel | None = None, expect_equal_result=True, **pred_kwargs)
Test model predictions.
Checks if the shape of the predictions is as expected and if the predictions of the predictMols function are consistent with the predictions of the predict/predictProba functions. Also checks if the predictions of the model are the same as the predictions of the comparison model if given.
- Parameters:
model (QSPRModel) – The model to make predictions with.
dataset (QSPRDataset) – The dataset to make predictions for.
comparison_model (QSPRModel) – another model to compare the predictions with.
expect_equal_result (bool) – Whether the expected result should be equal or not equal to the predictions of the comparison model.
**pred_kwargs – Extra keyword arguments to pass to the predictor’s predictMols method.
- run(result=None)
- setUp()
Hook method for setting up the test fixture before exercising it.
- classmethod setUpClass()
Hook method for setting up class fixture before running tests in the class.
- setUpPaths()
Set up the test environment.
- shortDescription()
Returns a one-line description of the test, or None if no description has been provided.
The default implementation of this method returns the first line of the specified test method’s docstring.
- skipTest(reason)
Skip this test.
- subTest(msg=<object object>, **params)
Return a context manager that will run the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.
- tearDown()
Remove all files and directories that are used for testing.
- classmethod tearDownClass()
Hook method for deconstructing the class fixture after running all tests in the class.
- testClassificationMultiTaskFit = None
- testClassificationMultiTaskFit_0_RFC(**kw)
Test model training for multitask classification models [with _='RFC', model_name='RFC', model_class=<class 'sklearn.ensemble._forest.RandomForestClassifier'>, random_state=[None]].
- testClassificationMultiTaskFit_1_RFC(**kw)
Test model training for multitask classification models [with _='RFC', model_name='RFC', model_class=<class 'sklearn.ensemble._forest.RandomForestClassifier'>, random_state=[1, 42]].
- testClassificationMultiTaskFit_2_RFC(**kw)
Test model training for multitask classification models [with _='RFC', model_name='RFC', model_class=<class 'sklearn.ensemble._forest.RandomForestClassifier'>, random_state=[42, 42]].
- testClassificationMultiTaskFit_3_KNNC(**kw)
Test model training for multitask classification models [with _='KNNC', model_name='KNNC', model_class=<class 'sklearn.neighbors._classification.KNeighborsClassifier'>, random_state=[None]].
- validate_split(dataset)
Check if the split has the data it should have after splitting.
- class qsprpred.models.tests.TestSklearnRegression(methodName='runTest')[source]
Bases:
SklearnBaseModelTestCase
Test the SklearnModel class for regression models.
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
- classmethod addClassCleanup(function, /, *args, **kwargs)
Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).
- addCleanup(function, /, *args, **kwargs)
Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.
Cleanup items are called even if setUp fails (unlike tearDown).
- addTypeEqualityFunc(typeobj, function)
Add a type specific assertEqual style function to compare a type.
This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.
- Parameters:
typeobj – The data type to call this function on when both values are of the same type in assertEqual().
function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.
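As a sketch of how a type equality function is registered and used, the example below compares instances of a hypothetical Point class that defines no __eq__ of its own. Point, PointDemo and assertPointEqual are names invented for this example.

```python
import io
import unittest


class Point:
    """Hypothetical value type without __eq__, so == compares identity."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class PointDemo(unittest.TestCase):
    def setUp(self):
        # assertEqual will call assertPointEqual when both values are Points.
        self.addTypeEqualityFunc(Point, self.assertPointEqual)

    def assertPointEqual(self, first, second, msg=None):
        if (first.x, first.y) != (second.x, second.y):
            default = f"Point({first.x}, {first.y}) != Point({second.x}, {second.y})"
            raise self.failureException(msg or default)

    def test_equal_points(self):
        # Passes even though Point(1, 2) == Point(1, 2) is False.
        self.assertEqual(Point(1, 2), Point(1, 2))

    def test_unequal_points(self):
        with self.assertRaisesRegex(AssertionError, r"Point\(1, 2\) != Point\(3, 4\)"):
            self.assertEqual(Point(1, 2), Point(3, 4))


def run_case(case):
    """Run a TestCase class quietly and return its TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)


result = run_case(PointDemo)
print(result.testsRun, result.wasSuccessful())
```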
- assertAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
If the two objects compare equal then they will automatically compare almost equal.
- assertCountEqual(first, second, msg=None)
Asserts that two iterables have the same elements, the same number of times, without regard to order.
- self.assertEqual(Counter(list(first)), Counter(list(second)))
- Example:
[0, 1, 1] and [1, 0, 1] compare equal.
[0, 0, 1] and [0, 1] compare unequal.
- assertDictEqual(d1, d2, msg=None)
- assertEqual(first, second, msg=None)
Fail if the two objects are unequal as determined by the ‘==’ operator.
- assertFalse(expr, msg=None)
Check that the expression is false.
- assertGreater(a, b, msg=None)
Just like self.assertTrue(a > b), but with a nicer default message.
- assertGreaterEqual(a, b, msg=None)
Just like self.assertTrue(a >= b), but with a nicer default message.
- assertIn(member, container, msg=None)
Just like self.assertTrue(a in b), but with a nicer default message.
- assertIs(expr1, expr2, msg=None)
Just like self.assertTrue(a is b), but with a nicer default message.
- assertIsInstance(obj, cls, msg=None)
Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.
- assertIsNone(obj, msg=None)
Same as self.assertTrue(obj is None), with a nicer default message.
- assertIsNot(expr1, expr2, msg=None)
Just like self.assertTrue(a is not b), but with a nicer default message.
- assertIsNotNone(obj, msg=None)
Included for symmetry with assertIsNone.
- assertLess(a, b, msg=None)
Just like self.assertTrue(a < b), but with a nicer default message.
- assertLessEqual(a, b, msg=None)
Just like self.assertTrue(a <= b), but with a nicer default message.
- assertListEqual(list1, list2, msg=None)
A list-specific equality assertion.
- Parameters:
list1 – The first list to compare.
list2 – The second list to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertLogs(logger=None, level=None)
Fail unless a log message of level level or higher is emitted on logger or its children. If omitted, level defaults to INFO and logger defaults to the root logger.
This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.
Example:
with self.assertLogs('foo', level='INFO') as cm:
    logging.getLogger('foo').info('first message')
    logging.getLogger('foo.bar').error('second message')
self.assertEqual(cm.output, ['INFO:foo:first message', 'ERROR:foo.bar:second message'])
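The records attribute can be inspected in the same way as output. The sketch below, with the illustrative logger name demo, also shows that messages below the requested level are not captured.

```python
import io
import logging
import unittest


class LogsDemo(unittest.TestCase):
    def test_records(self):
        with self.assertLogs("demo", level="WARNING") as cm:
            logging.getLogger("demo").info("below the threshold, not captured")
            logging.getLogger("demo.child").warning("disk at %d%%", 91)
        self.assertEqual(cm.output, ["WARNING:demo.child:disk at 91%"])
        # Each entry in cm.records is a logging.LogRecord.
        record = cm.records[0]
        self.assertEqual(record.levelno, logging.WARNING)
        self.assertEqual(record.name, "demo.child")
        self.assertEqual(record.getMessage(), "disk at 91%")


def run_case(case):
    """Run a TestCase class quietly and return its TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)


result = run_case(LogsDemo)
print(result.testsRun, result.wasSuccessful())
```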
- assertMultiLineEqual(first, second, msg=None)
Assert that two multi-line strings are equal.
- assertNoLogs(logger=None, level=None)
Fail unless no log messages of level level or higher are emitted on logger or its children.
This method must be used as a context manager.
- assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
Objects that are equal automatically fail.
- assertNotEqual(first, second, msg=None)
Fail if the two objects are equal as determined by the ‘!=’ operator.
- assertNotIn(member, container, msg=None)
Just like self.assertTrue(a not in b), but with a nicer default message.
- assertNotIsInstance(obj, cls, msg=None)
Included for symmetry with assertIsInstance.
- assertNotRegex(text, unexpected_regex, msg=None)
Fail the test if the text matches the regular expression.
- assertRaises(expected_exception, *args, **kwargs)
Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.
If called with the callable and arguments omitted, will return a context object used like this:
with self.assertRaises(SomeException):
    do_something()
An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.
The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:
with self.assertRaises(SomeException) as cm:
    do_something()
the_exception = cm.exception
self.assertEqual(the_exception.error_code, 3)
- assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)
Asserts that the message in a raised exception matches a regex.
- Parameters:
expected_exception – Exception class expected to be raised.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.
- assertRegex(text, expected_regex, msg=None)
Fail the test unless the text matches the regular expression.
- assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)
An equality assertion for ordered sequences (like lists and tuples).
For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.
- Parameters:
seq1 – The first sequence to compare.
seq2 – The second sequence to compare.
seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.
msg – Optional message to use on failure instead of a list of differences.
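A brief sketch of what seq_type changes: without it, a list and a tuple holding equal items are accepted, while passing seq_type=list rejects the tuple. SequenceDemo and run_case are illustrative names.

```python
import io
import unittest


class SequenceDemo(unittest.TestCase):
    def test_mixed_types_allowed_by_default(self):
        # With seq_type=None only the items are compared.
        self.assertSequenceEqual([1, 2, 3], (1, 2, 3))

    def test_seq_type_is_enforced(self):
        # The second argument is a tuple, not a list.
        with self.assertRaises(AssertionError):
            self.assertSequenceEqual([1, 2, 3], (1, 2, 3), seq_type=list)

    def test_difference_is_reported(self):
        # The failure message names the index of the first mismatch.
        with self.assertRaisesRegex(AssertionError, "First differing element 1"):
            self.assertSequenceEqual([1, 2, 3], [1, 9, 3])


def run_case(case):
    """Run a TestCase class quietly and return its TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)


result = run_case(SequenceDemo)
print(result.testsRun, result.wasSuccessful())
```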
- assertSetEqual(set1, set2, msg=None)
A set-specific equality assertion.
- Parameters:
set1 – The first set to compare.
set2 – The second set to compare.
msg – Optional message to use on failure instead of a list of differences.
assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).
- assertTrue(expr, msg=None)
Check that the expression is true.
- assertTupleEqual(tuple1, tuple2, msg=None)
A tuple-specific equality assertion.
- Parameters:
tuple1 – The first tuple to compare.
tuple2 – The second tuple to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertWarns(expected_warning, *args, **kwargs)
Fail unless a warning of class expected_warning is triggered by the callable when invoked with specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.
If called with the callable and arguments omitted, will return a context object used like this:
with self.assertWarns(SomeWarning):
    do_something()
An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.
The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:
with self.assertWarns(SomeWarning) as cm:
    do_something()
the_warning = cm.warning
self.assertEqual(the_warning.some_attribute, 147)
- assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)
Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.
- Parameters:
expected_warning – Warning class expected to be triggered.
expected_regex – Regex (re.Pattern object or string) expected to be found in the warning message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.
- checkOptimization(model: QSPRModel, ds: QSPRDataset, optimizer: HyperparameterOptimization)
- clearGenerated()
Remove the directories that are used for testing.
- countTestCases()
- createLargeMultitaskDataSet(name='QSPRDataset_multi_test', target_props=[{'name': 'HBD', 'task': <TargetTasks.MULTICLASS: 'MULTICLASS'>, 'th': [-1, 1, 2, 100]}, {'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42)
Create a large dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
preparation_settings (dict) – dictionary containing preparation settings
random_state (int) – random state to use for splitting and shuffling
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createLargeTestDataSet(name='QSPRDataset_test_large', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42, n_jobs=1, chunk_size=None)
Create a large dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
preparation_settings (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createSmallTestDataSet(name='QSPRDataset_test_small', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42)
Create a small dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
preparation_settings (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createTestDataSetFromFrame(df, name='QSPRDataset_test', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], random_state=None, prep=None, n_jobs=1, chunk_size=None)
Create a dataset for testing purposes from the given data frame.
- Parameters:
df (pd.DataFrame) – data frame containing the dataset
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
prep (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- debug()
Run the test without collecting errors in a TestResult
- defaultTestResult()
- classmethod doClassCleanups()
Execute all class cleanup functions. Normally called for you after tearDownClass.
- doCleanups()
Execute all cleanup functions. Normally called for you after tearDown.
- classmethod enterClassContext(cm)
Same as enterContext, but class-wide.
- enterContext(cm)
Enters the supplied context manager.
If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.
- fail(msg=None)
Fail immediately, with the given message.
- failureException
alias of AssertionError
- fitTest(model: QSPRModel, ds: QSPRDataset)
Test model fitting, optimization and evaluation.
- Parameters:
model (QSPRModel) – The model to test.
ds (QSPRDataset) – The dataset to use for testing.
- classmethod getAllDescriptors()
Return a list of (ideally) all available descriptor sets. For now they need to be added manually to the list below.
TODO: would be nice to create the list automatically by implementing a descriptor set registry that would hold all installed descriptor sets.
- getBigDF()
Get a large data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- classmethod getDataPrepGrid()
Return a list of many possible combinations of descriptor calculators, splits, feature standardizers, feature filters and data filters. Again, this is not exhaustive, but should cover a lot of cases.
- Returns:
a generator that yields tuples of all possible combinations as stated above, each tuple is defined as: (descriptor_calculator, split, feature_standardizer, feature_filters, data_filters)
- Return type:
grid
- classmethod getDefaultCalculatorCombo()
Makes a list of default descriptor calculators that can be used in tests. It creates calculators with only Morgan fingerprints and only RDKit descriptors, and one that combines both to test behaviour with multiple descriptor sets. Override this method if you want to test with other descriptor sets and calculator combinations.
- static getDefaultPrep()
Return a dictionary with default preparation settings.
- getModel(name: str, alg: Type | None = None, parameters: dict | None = None, random_state: int | None = None)
Create a SklearnModel model.
- Parameters:
name (str) – the name of the model
alg (Type, optional) – the algorithm to use. Defaults to None.
dataset (QSPRDataset, optional) – the dataset to use. Defaults to None.
parameters (dict, optional) – the parameters to use. Defaults to None.
random_state (int, optional) – Random state to use for shuffling and other random operations. Defaults to None.
- Returns:
the model
- Return type:
- classmethod getPrepCombos()
Return a list of all possible preparation combinations as generated by getDataPrepGrid as well as their names. The generated list can be used to parameterize tests with the given named combinations.
- getSmallDF()
Get a small data frame for testing purposes.
- Returns:
a
pandas.DataFrame
containing the dataset- Return type:
pd.DataFrame
- property gridFile
- id()
- longMessage = True
- maxDiff = 640
- predictorTest(model: QSPRModel, dataset: QSPRDataset, comparison_model: QSPRModel | None = None, expect_equal_result=True, **pred_kwargs)
Test model predictions.
Checks if the shape of the predictions is as expected and if the predictions of the predictMols function are consistent with the predictions of the predict/predictProba functions. Also checks if the predictions of the model are the same as the predictions of the comparison model if given.
- Parameters:
model (QSPRModel) – The model to make predictions with.
dataset (QSPRDataset) – The dataset to make predictions for.
comparison_model (QSPRModel) – another model to compare the predictions with.
expect_equal_result (bool) – Whether the expected result should be equal or not equal to the predictions of the comparison model.
**pred_kwargs – Extra keyword arguments to pass to the predictor’s predictMols method.
- run(result=None)
- setUp()
Hook method for setting up the test fixture before exercising it.
- classmethod setUpClass()
Hook method for setting up class fixture before running tests in the class.
- setUpPaths()
Set up the test environment.
- shortDescription()
Returns a one-line description of the test, or None if no description has been provided.
The default implementation of this method returns the first line of the specified test method’s docstring.
- skipTest(reason)
Skip this test.
- subTest(msg=<object object>, **params)
Return a context manager that will run the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.
- tearDown()
Remove all files and directories that are used for testing.
- classmethod tearDownClass()
Hook method for deconstructing the class fixture after running all tests in the class.
- testRegressionBasicFit = None
- testRegressionBasicFit_0_RFR(**kw)
Test model training for regression models [with _=’RFR’, task=<TargetTasks.REGRESSION: ‘REGRESSION’>, model_name=’RFR’, model_class=<class ‘sklearn.ensemble._forest.RandomForestRegressor’>, random_state=[None]].
- testRegressionBasicFit_1_RFR(**kw)
Test model training for regression models [with _=’RFR’, task=<TargetTasks.REGRESSION: ‘REGRESSION’>, model_name=’RFR’, model_class=<class ‘sklearn.ensemble._forest.RandomForestRegressor’>, random_state=[1, 42]].
- testRegressionBasicFit_2_RFR(**kw)
Test model training for regression models [with _=’RFR’, task=<TargetTasks.REGRESSION: ‘REGRESSION’>, model_name=’RFR’, model_class=<class ‘sklearn.ensemble._forest.RandomForestRegressor’>, random_state=[42, 42]].
- testRegressionBasicFit_3_XGBR(**kw)
Test model training for regression models [with _=’XGBR’, task=<TargetTasks.REGRESSION: ‘REGRESSION’>, model_name=’XGBR’, model_class=<class ‘xgboost.sklearn.XGBRegressor’>, random_state=[None]].
- testRegressionBasicFit_4_XGBR(**kw)
Test model training for regression models [with _=’XGBR’, task=<TargetTasks.REGRESSION: ‘REGRESSION’>, model_name=’XGBR’, model_class=<class ‘xgboost.sklearn.XGBRegressor’>, random_state=[1, 42]].
- testRegressionBasicFit_5_XGBR(**kw)
Test model training for regression models [with _=’XGBR’, task=<TargetTasks.REGRESSION: ‘REGRESSION’>, model_name=’XGBR’, model_class=<class ‘xgboost.sklearn.XGBRegressor’>, random_state=[42, 42]].
- testRegressionBasicFit_6_PLSR(**kw)
Test model training for regression models [with _=’PLSR’, task=<TargetTasks.REGRESSION: ‘REGRESSION’>, model_name=’PLSR’, model_class=<class ‘sklearn.cross_decomposition._pls.PLSRegression’>, random_state=[None]].
- testRegressionBasicFit_7_SVR(**kw)
Test model training for regression models [with _=’SVR’, task=<TargetTasks.REGRESSION: ‘REGRESSION’>, model_name=’SVR’, model_class=<class ‘sklearn.svm._classes.SVR’>, random_state=[None]].
- testRegressionBasicFit_8_KNNR(**kw)
Test model training for regression models [with _=’KNNR’, task=<TargetTasks.REGRESSION: ‘REGRESSION’>, model_name=’KNNR’, model_class=<class ‘sklearn.neighbors._regression.KNeighborsRegressor’>, random_state=[None]].
- validate_split(dataset)
Check if the split has the data it should have after splitting.
- class qsprpred.models.tests.TestSklearnRegressionMultiTask(methodName='runTest')[source]
Bases:
SklearnBaseModelTestCase
Test the SklearnModel class for multi-task regression models.
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
- classmethod addClassCleanup(function, /, *args, **kwargs)
Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).
- addCleanup(function, /, *args, **kwargs)
Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.
Cleanup items are called even if setUp fails (unlike tearDown).
- addTypeEqualityFunc(typeobj, function)
Add a type specific assertEqual style function to compare a type.
This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.
- Parameters:
typeobj – The data type to call this function on when both values are of the same type in assertEqual().
function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.
- assertAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
If the two objects compare equal then they will automatically compare almost equal.
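The difference between places and delta can be sketched as follows, using only standard unittest. ToleranceTest is a hypothetical class for illustration.

```python
import unittest


class ToleranceTest(unittest.TestCase):
    def test_default_places(self):
        # The floating-point error of 0.1 + 0.2 vanishes when the
        # difference is rounded to 7 decimal places.
        self.assertAlmostEqual(0.1 + 0.2, 0.3)

    def test_explicit_places(self):
        # A difference of about 0.0001 rounds to zero at 3 places.
        self.assertAlmostEqual(1.0001, 1.0002, places=3)

    def test_delta(self):
        # With delta, the absolute difference must not exceed delta.
        self.assertAlmostEqual(100.0, 100.4, delta=0.5)

    def test_too_far_apart(self):
        # A difference of 0.1 does not round to zero at 2 places.
        with self.assertRaises(AssertionError):
            self.assertAlmostEqual(1.0, 1.1, places=2)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ToleranceTest)
result = unittest.TestResult()
suite.run(result)
```

Note that places and delta are alternatives; passing both raises a TypeError.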
- assertCountEqual(first, second, msg=None)
Asserts that two iterables have the same elements, the same number of times, without regard to order.
- self.assertEqual(Counter(list(first)), Counter(list(second)))
- Example:
[0, 1, 1] and [1, 0, 1] compare equal.
[0, 0, 1] and [0, 1] compare unequal.
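The two cases listed in the example can be checked directly. This is a small sketch with a hypothetical CountEqualTest class, using only the standard library.

```python
import unittest
from collections import Counter


class CountEqualTest(unittest.TestCase):
    def test_same_elements_any_order(self):
        self.assertCountEqual([0, 1, 1], [1, 0, 1])
        # The same comparison spelled out with Counter.
        self.assertEqual(Counter([0, 1, 1]), Counter([1, 0, 1]))

    def test_different_multiplicity(self):
        # 0 occurs twice on the left but only once on the right.
        with self.assertRaises(AssertionError):
            self.assertCountEqual([0, 0, 1], [0, 1])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(CountEqualTest)
result = unittest.TestResult()
suite.run(result)
```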
- assertDictEqual(d1, d2, msg=None)
- assertEqual(first, second, msg=None)
Fail if the two objects are unequal as determined by the ‘==’ operator.
- assertFalse(expr, msg=None)
Check that the expression is false.
- assertGreater(a, b, msg=None)
Just like self.assertTrue(a > b), but with a nicer default message.
- assertGreaterEqual(a, b, msg=None)
Just like self.assertTrue(a >= b), but with a nicer default message.
- assertIn(member, container, msg=None)
Just like self.assertTrue(a in b), but with a nicer default message.
- assertIs(expr1, expr2, msg=None)
Just like self.assertTrue(a is b), but with a nicer default message.
- assertIsInstance(obj, cls, msg=None)
Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.
- assertIsNone(obj, msg=None)
Same as self.assertTrue(obj is None), with a nicer default message.
- assertIsNot(expr1, expr2, msg=None)
Just like self.assertTrue(a is not b), but with a nicer default message.
- assertIsNotNone(obj, msg=None)
Included for symmetry with assertIsNone.
- assertLess(a, b, msg=None)
Just like self.assertTrue(a < b), but with a nicer default message.
- assertLessEqual(a, b, msg=None)
Just like self.assertTrue(a <= b), but with a nicer default message.
- assertListEqual(list1, list2, msg=None)
A list-specific equality assertion.
- Parameters:
list1 – The first list to compare.
list2 – The second list to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertLogs(logger=None, level=None)
Fail unless a log message of level level or higher is emitted on logger_name or its children. If omitted, level defaults to INFO and logger defaults to the root logger.
This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.
Example:
    with self.assertLogs('foo', level='INFO') as cm:
        logging.getLogger('foo').info('first message')
        logging.getLogger('foo.bar').error('second message')
    self.assertEqual(cm.output, ['INFO:foo:first message',
                                 'ERROR:foo.bar:second message'])
- assertMultiLineEqual(first, second, msg=None)
Assert that two multi-line strings are equal.
- assertNoLogs(logger=None, level=None)
Fail unless no log messages of level level or higher are emitted on logger_name or its children.
This method must be used as a context manager.
- assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
Objects that are equal automatically fail.
- assertNotEqual(first, second, msg=None)
Fail if the two objects are equal as determined by the ‘!=’ operator.
- assertNotIn(member, container, msg=None)
Just like self.assertTrue(a not in b), but with a nicer default message.
- assertNotIsInstance(obj, cls, msg=None)
Included for symmetry with assertIsInstance.
- assertNotRegex(text, unexpected_regex, msg=None)
Fail the test if the text matches the regular expression.
- assertRaises(expected_exception, *args, **kwargs)
Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.
If called with the callable and arguments omitted, will return a context object used like this:
    with self.assertRaises(SomeException):
        do_something()
An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.
The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:
    with self.assertRaises(SomeException) as cm:
        do_something()
    the_exception = cm.exception
    self.assertEqual(the_exception.error_code, 3)
- assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)
Asserts that the message in a raised exception matches a regex.
- Parameters:
expected_exception – Exception class expected to be raised.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.
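Both calling styles of assertRaisesRegex can be sketched as below. parse_positive and ParseTest are hypothetical names introduced only for this example.

```python
import unittest


def parse_positive(text):
    """Hypothetical helper: parse a strictly positive integer."""
    value = int(text)
    if value <= 0:
        raise ValueError(f"expected a positive integer, got {value}")
    return value


class ParseTest(unittest.TestCase):
    def test_callable_form(self):
        # The callable and its arguments follow the regex.
        self.assertRaisesRegex(
            ValueError, r"positive integer", parse_positive, "-3"
        )

    def test_context_manager_form(self):
        # The regex is searched for anywhere in the exception message.
        with self.assertRaisesRegex(ValueError, r"got -\d+") as cm:
            parse_positive("-3")
        self.assertEqual(
            str(cm.exception), "expected a positive integer, got -3"
        )


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ParseTest)
result = unittest.TestResult()
suite.run(result)
```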
- assertRegex(text, expected_regex, msg=None)
Fail the test unless the text matches the regular expression.
- assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)
An equality assertion for ordered sequences (like lists and tuples).
For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.
- Parameters:
seq1 – The first sequence to compare.
seq2 – The second sequence to compare.
seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.
msg – Optional message to use on failure instead of a list of differences.
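A short sketch of the effect of seq_type, using only standard unittest; SequenceTest is a hypothetical class for illustration.

```python
import unittest


class SequenceTest(unittest.TestCase):
    def test_no_type_enforced(self):
        # With seq_type=None, a list and a tuple holding equal items
        # in the same order are accepted.
        self.assertSequenceEqual([1, 2, 3], (1, 2, 3))

    def test_type_enforced(self):
        # With seq_type=list, the tuple is rejected before any
        # element comparison takes place.
        with self.assertRaises(AssertionError):
            self.assertSequenceEqual([1, 2, 3], (1, 2, 3), seq_type=list)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(SequenceTest)
result = unittest.TestResult()
suite.run(result)
```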
- assertSetEqual(set1, set2, msg=None)
A set-specific equality assertion.
- Parameters:
set1 – The first set to compare.
set2 – The second set to compare.
msg – Optional message to use on failure instead of a list of differences.
assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).
- assertTrue(expr, msg=None)
Check that the expression is true.
- assertTupleEqual(tuple1, tuple2, msg=None)
A tuple-specific equality assertion.
- Parameters:
tuple1 – The first tuple to compare.
tuple2 – The second tuple to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertWarns(expected_warning, *args, **kwargs)
Fail unless a warning of class warnClass is triggered by the callable when invoked with specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.
If called with the callable and arguments omitted, will return a context object used like this:
    with self.assertWarns(SomeWarning):
        do_something()
An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.
The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:
    with self.assertWarns(SomeWarning) as cm:
        do_something()
    the_warning = cm.warning
    self.assertEqual(the_warning.some_attribute, 147)
- assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)
Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.
- Parameters:
expected_warning – Warning class expected to be triggered.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.
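A minimal sketch of assertWarnsRegex as a context manager. old_api and WarnTest are hypothetical names and do not refer to anything in QSPRpred.

```python
import unittest
import warnings


def old_api():
    """Hypothetical function that emits a deprecation warning."""
    warnings.warn("old_api() is deprecated; use new_api()", DeprecationWarning)
    return 42


class WarnTest(unittest.TestCase):
    def test_deprecation_message(self):
        with self.assertWarnsRegex(DeprecationWarning, r"use new_api") as cm:
            value = old_api()
        # The wrapped code still runs to completion.
        self.assertEqual(value, 42)
        # The first matching warning is kept on the context object.
        self.assertIsInstance(cm.warning, DeprecationWarning)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(WarnTest)
result = unittest.TestResult()
suite.run(result)
```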
- checkOptimization(model: QSPRModel, ds: QSPRDataset, optimizer: HyperparameterOptimization)
- clearGenerated()
Remove the directories that are used for testing.
- countTestCases()
- createLargeMultitaskDataSet(name='QSPRDataset_multi_test', target_props=[{'name': 'HBD', 'task': <TargetTasks.MULTICLASS: 'MULTICLASS'>, 'th': [-1, 1, 2, 100]}, {'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42)
Create a large dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
preparation_settings (dict) – dictionary containing preparation settings
random_state (int) – random state to use for splitting and shuffling
- Returns:
a QSPRDataset object
- Return type:
- createLargeTestDataSet(name='QSPRDataset_test_large', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42, n_jobs=1, chunk_size=None)
Create a large dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
preparation_settings (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
- createSmallTestDataSet(name='QSPRDataset_test_small', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42)
Create a small dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
preparation_settings (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
- createTestDataSetFromFrame(df, name='QSPRDataset_test', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], random_state=None, prep=None, n_jobs=1, chunk_size=None)
Create a dataset for testing purposes from the given data frame.
- Parameters:
df (pd.DataFrame) – data frame containing the dataset
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
prep (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
- debug()
Run the test without collecting errors in a TestResult
- defaultTestResult()
- classmethod doClassCleanups()
Execute all class cleanup functions. Normally called for you after tearDownClass.
- doCleanups()
Execute all cleanup functions. Normally called for you after tearDown.
- classmethod enterClassContext(cm)
Same as enterContext, but class-wide.
- enterContext(cm)
Enters the supplied context manager.
If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.
- fail(msg=None)
Fail immediately, with the given message.
- failureException
alias of
AssertionError
- fitTest(model: QSPRModel, ds: QSPRDataset)
Test model fitting, optimization and evaluation.
- Parameters:
model (QSPRModel) – The model to test.
ds (QSPRDataset) – The dataset to use for testing.
- classmethod getAllDescriptors()
Return a list of (ideally) all available descriptor sets. For now they need to be added manually to the list below.
TODO: would be nice to create the list automatically by implementing a descriptor set registry that would hold all installed descriptor sets.
- getBigDF()
Get a large data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- classmethod getDataPrepGrid()
Return a list of many possible combinations of descriptor calculators, splits, feature standardizers, feature filters and data filters. Again, this is not exhaustive, but should cover a lot of cases.
- Returns:
a generator that yields tuples of all possible combinations as stated above, each tuple is defined as: (descriptor_calculator, split, feature_standardizer, feature_filters, data_filters)
- Return type:
grid
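The idea of a grid of preparation combinations can be sketched with itertools.product. This is not QSPRpred's implementation: the option lists below are hypothetical string placeholders standing in for real calculator, split, standardizer and filter objects, and data_prep_grid is a made-up helper name.

```python
from itertools import product

# Hypothetical placeholder options; a real grid would hold objects.
descriptor_calculators = ["MorganFP", "RDKitDescs"]
splits = [None, "RandomSplit"]
feature_standardizers = [None, "StandardScaler"]
feature_filters = [None, "LowVariance"]
data_filters = [None]


def data_prep_grid():
    """Lazily yield every combination as a 5-tuple, in the order
    (descriptor_calculator, split, feature_standardizer,
    feature_filters, data_filters)."""
    return product(
        descriptor_calculators,
        splits,
        feature_standardizers,
        feature_filters,
        data_filters,
    )


combos = list(data_prep_grid())
# Readable names, e.g. for labelling parameterized test cases.
names = ["_".join(str(part) for part in combo) for combo in combos]
```

The grid grows multiplicatively with each option list, which is one reason a test grid of this kind is typically kept non-exhaustive.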
- classmethod getDefaultCalculatorCombo()
Makes a list of default descriptor calculators that can be used in tests. It creates one calculator with only Morgan fingerprints, one with only RDKit descriptors, and one with both, to test behaviour with multiple descriptor sets. Override this method if you want to test with other descriptor sets and calculator combinations.
- static getDefaultPrep()
Return a dictionary with default preparation settings.
- getModel(name: str, alg: Type | None = None, parameters: dict | None = None, random_state: int | None = None)
Create a SklearnModel model.
- Parameters:
name (str) – the name of the model
alg (Type, optional) – the algorithm to use. Defaults to None.
dataset (QSPRDataset, optional) – the dataset to use. Defaults to None.
parameters (dict, optional) – the parameters to use. Defaults to None.
random_state (int, optional) – Random state to use for shuffling and other random operations. Defaults to None.
- Returns:
the model
- Return type:
- classmethod getPrepCombos()
Return a list of all possible preparation combinations as generated by getDataPrepGrid as well as their names. The generated list can be used to parameterize tests with the given named combinations.
- getSmallDF()
Get a small data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- property gridFile
- id()
- longMessage = True
- maxDiff = 640
- predictorTest(model: QSPRModel, dataset: QSPRDataset, comparison_model: QSPRModel | None = None, expect_equal_result=True, **pred_kwargs)
Test model predictions.
Checks if the shape of the predictions is as expected and if the predictions of the predictMols function are consistent with the predictions of the predict/predictProba functions. Also checks if the predictions of the model are the same as the predictions of the comparison model if given.
- Parameters:
model (QSPRModel) – The model to make predictions with.
dataset (QSPRDataset) – The dataset to make predictions for.
comparison_model (QSPRModel) – another model to compare the predictions with.
expect_equal_result (bool) – Whether the expected result should be equal or not equal to the predictions of the comparison model.
**pred_kwargs – Extra keyword arguments to pass to the predictor’s predictMols method.
- run(result=None)
- setUp()
Hook method for setting up the test fixture before exercising it.
- classmethod setUpClass()
Hook method for setting up class fixture before running tests in the class.
- setUpPaths()
Set up the test environment.
- shortDescription()
Returns a one-line description of the test, or None if no description has been provided.
The default implementation of this method returns the first line of the specified test method’s docstring.
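The default behaviour can be seen with a small sketch; DocTest and its methods are hypothetical.

```python
import unittest


class DocTest(unittest.TestCase):
    def test_documented(self):
        """Check that addition works.

        Further details are not part of the short description.
        """
        self.assertEqual(1 + 1, 2)

    def test_undocumented(self):
        self.assertTrue(True)


# Only the first docstring line is used; no docstring yields None.
documented = DocTest("test_documented").shortDescription()
undocumented = DocTest("test_undocumented").shortDescription()
```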
- skipTest(reason)
Skip this test.
- subTest(msg=<object object>, **params)
Return a context manager that will run the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.
- tearDown()
Remove all files and directories that are used for testing.
- classmethod tearDownClass()
Hook method for deconstructing the class fixture after running all tests in the class.
- testRegressionMultiTaskFit = None
- testRegressionMultiTaskFit_0_RFR(**kw)
Test model training for multitask regression models [with _=’RFR’, model_name=’RFR’, model_class=<class ‘sklearn.ensemble._forest.RandomForestRegressor’>, random_state=[None]].
- testRegressionMultiTaskFit_1_RFR(**kw)
Test model training for multitask regression models [with _=’RFR’, model_name=’RFR’, model_class=<class ‘sklearn.ensemble._forest.RandomForestRegressor’>, random_state=[1, 42]].
- testRegressionMultiTaskFit_2_RFR(**kw)
Test model training for multitask regression models [with _=’RFR’, model_name=’RFR’, model_class=<class ‘sklearn.ensemble._forest.RandomForestRegressor’>, random_state=[42, 42]].
- testRegressionMultiTaskFit_3_KNNR(**kw)
Test model training for multitask regression models [with _=’KNNR’, model_name=’KNNR’, model_class=<class ‘sklearn.neighbors._regression.KNeighborsRegressor’>, random_state=[None]].
- validate_split(dataset)
Check if the split has the data it should have after splitting.
- class qsprpred.models.tests.TestSklearnSerialization(methodName='runTest')[source]
Bases:
SklearnBaseModelTestCase
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
- classmethod addClassCleanup(function, /, *args, **kwargs)
Same as addCleanup, except the cleanup items are called even if setUpClass fails (unlike tearDownClass).
- addCleanup(function, /, *args, **kwargs)
Add a function, with arguments, to be called when the test is completed. Functions added are called on a LIFO basis and are called after tearDown on test failure or success.
Cleanup items are called even if setUp fails (unlike tearDown).
- addTypeEqualityFunc(typeobj, function)
Add a type specific assertEqual style function to compare a type.
This method is for use by TestCase subclasses that need to register their own type equality functions to provide nicer error messages.
- Parameters:
typeobj – The data type to call this function on when both values are of the same type in assertEqual().
function – The callable taking two arguments and an optional msg= argument that raises self.failureException with a useful error message when the two arguments are not equal.
- assertAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
If the two objects compare equal then they will automatically compare almost equal.
- assertCountEqual(first, second, msg=None)
Asserts that two iterables have the same elements, the same number of times, without regard to order.
- self.assertEqual(Counter(list(first)), Counter(list(second)))
- Example:
[0, 1, 1] and [1, 0, 1] compare equal.
[0, 0, 1] and [0, 1] compare unequal.
- assertDictEqual(d1, d2, msg=None)
- assertEqual(first, second, msg=None)
Fail if the two objects are unequal as determined by the ‘==’ operator.
- assertFalse(expr, msg=None)
Check that the expression is false.
- assertGreater(a, b, msg=None)
Just like self.assertTrue(a > b), but with a nicer default message.
- assertGreaterEqual(a, b, msg=None)
Just like self.assertTrue(a >= b), but with a nicer default message.
- assertIn(member, container, msg=None)
Just like self.assertTrue(a in b), but with a nicer default message.
- assertIs(expr1, expr2, msg=None)
Just like self.assertTrue(a is b), but with a nicer default message.
- assertIsInstance(obj, cls, msg=None)
Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.
- assertIsNone(obj, msg=None)
Same as self.assertTrue(obj is None), with a nicer default message.
- assertIsNot(expr1, expr2, msg=None)
Just like self.assertTrue(a is not b), but with a nicer default message.
- assertIsNotNone(obj, msg=None)
Included for symmetry with assertIsNone.
- assertLess(a, b, msg=None)
Just like self.assertTrue(a < b), but with a nicer default message.
- assertLessEqual(a, b, msg=None)
Just like self.assertTrue(a <= b), but with a nicer default message.
- assertListEqual(list1, list2, msg=None)
A list-specific equality assertion.
- Parameters:
list1 – The first list to compare.
list2 – The second list to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertLogs(logger=None, level=None)
Fail unless a log message of level level or higher is emitted on logger_name or its children. If omitted, level defaults to INFO and logger defaults to the root logger.
This method must be used as a context manager, and will yield a recording object with two attributes: output and records. At the end of the context manager, the output attribute will be a list of the matching formatted log messages and the records attribute will be a list of the corresponding LogRecord objects.
Example:
    with self.assertLogs('foo', level='INFO') as cm:
        logging.getLogger('foo').info('first message')
        logging.getLogger('foo.bar').error('second message')
    self.assertEqual(cm.output, ['INFO:foo:first message',
                                 'ERROR:foo.bar:second message'])
- assertMultiLineEqual(first, second, msg=None)
Assert that two multi-line strings are equal.
- assertNoLogs(logger=None, level=None)
Fail unless no log messages of level level or higher are emitted on logger_name or its children.
This method must be used as a context manager.
- assertNotAlmostEqual(first, second, places=None, msg=None, delta=None)
Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.
Note that decimal places (from zero) are usually not the same as significant digits (measured from the most significant digit).
Objects that are equal automatically fail.
- assertNotEqual(first, second, msg=None)
Fail if the two objects are equal as determined by the ‘!=’ operator.
- assertNotIn(member, container, msg=None)
Just like self.assertTrue(a not in b), but with a nicer default message.
- assertNotIsInstance(obj, cls, msg=None)
Included for symmetry with assertIsInstance.
- assertNotRegex(text, unexpected_regex, msg=None)
Fail the test if the text matches the regular expression.
- assertRaises(expected_exception, *args, **kwargs)
Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments. If a different type of exception is raised, it will not be caught, and the test case will be deemed to have suffered an error, exactly as for an unexpected exception.
If called with the callable and arguments omitted, will return a context object used like this:
    with self.assertRaises(SomeException):
        do_something()
An optional keyword argument ‘msg’ can be provided when assertRaises is used as a context object.
The context manager keeps a reference to the exception as the ‘exception’ attribute. This allows you to inspect the exception after the assertion:
    with self.assertRaises(SomeException) as cm:
        do_something()
    the_exception = cm.exception
    self.assertEqual(the_exception.error_code, 3)
- assertRaisesRegex(expected_exception, expected_regex, *args, **kwargs)
Asserts that the message in a raised exception matches a regex.
- Parameters:
expected_exception – Exception class expected to be raised.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertRaisesRegex is used as a context manager.
- assertRegex(text, expected_regex, msg=None)
Fail the test unless the text matches the regular expression.
- assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)
An equality assertion for ordered sequences (like lists and tuples).
For the purposes of this function, a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator.
- Parameters:
seq1 – The first sequence to compare.
seq2 – The second sequence to compare.
seq_type – The expected datatype of the sequences, or None if no datatype should be enforced.
msg – Optional message to use on failure instead of a list of differences.
- assertSetEqual(set1, set2, msg=None)
A set-specific equality assertion.
- Parameters:
set1 – The first set to compare.
set2 – The second set to compare.
msg – Optional message to use on failure instead of a list of differences.
assertSetEqual uses ducktyping to support different types of sets, and is optimized for sets specifically (parameters must support a difference method).
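The duck-typed comparison can be sketched as follows; SetTest is a hypothetical class and only standard unittest is used.

```python
import unittest


class SetTest(unittest.TestCase):
    def test_set_and_frozenset(self):
        # Both types provide difference(), so they can be compared.
        self.assertSetEqual({1, 2}, frozenset({2, 1}))

    def test_reports_both_directions(self):
        with self.assertRaises(AssertionError) as cm:
            self.assertSetEqual({1, 2, 3}, {2, 3, 4})
        message = str(cm.exception)
        # Items unique to each side are listed in the message.
        self.assertIn("1", message)
        self.assertIn("4", message)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(SetTest)
result = unittest.TestResult()
suite.run(result)
```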
- assertTrue(expr, msg=None)
Check that the expression is true.
- assertTupleEqual(tuple1, tuple2, msg=None)
A tuple-specific equality assertion.
- Parameters:
tuple1 – The first tuple to compare.
tuple2 – The second tuple to compare.
msg – Optional message to use on failure instead of a list of differences.
- assertWarns(expected_warning, *args, **kwargs)
Fail unless a warning of class warnClass is triggered by the callable when invoked with specified positional and keyword arguments. If a different type of warning is triggered, it will not be handled: depending on the other warning filtering rules in effect, it might be silenced, printed out, or raised as an exception.
If called with the callable and arguments omitted, will return a context object used like this:
    with self.assertWarns(SomeWarning):
        do_something()
An optional keyword argument ‘msg’ can be provided when assertWarns is used as a context object.
The context manager keeps a reference to the first matching warning as the ‘warning’ attribute; similarly, the ‘filename’ and ‘lineno’ attributes give you information about the line of Python code from which the warning was triggered. This allows you to inspect the warning after the assertion:
    with self.assertWarns(SomeWarning) as cm:
        do_something()
    the_warning = cm.warning
    self.assertEqual(the_warning.some_attribute, 147)
- assertWarnsRegex(expected_warning, expected_regex, *args, **kwargs)
Asserts that the message in a triggered warning matches a regexp. Basic functioning is similar to assertWarns() with the addition that only warnings whose messages also match the regular expression are considered successful matches.
- Parameters:
expected_warning – Warning class expected to be triggered.
expected_regex – Regex (re.Pattern object or string) expected to be found in error message.
args – Function to be called and extra positional args.
kwargs – Extra kwargs.
msg – Optional message used in case of failure. Can only be used when assertWarnsRegex is used as a context manager.
- checkOptimization(model: QSPRModel, ds: QSPRDataset, optimizer: HyperparameterOptimization)
- clearGenerated()
Remove the directories that are used for testing.
- countTestCases()
- createLargeMultitaskDataSet(name='QSPRDataset_multi_test', target_props=[{'name': 'HBD', 'task': <TargetTasks.MULTICLASS: 'MULTICLASS'>, 'th': [-1, 1, 2, 100]}, {'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42)
Create a large dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
preparation_settings (dict) – dictionary containing preparation settings
random_state (int) – random state to use for splitting and shuffling
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createLargeTestDataSet(name='QSPRDataset_test_large', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42, n_jobs=1, chunk_size=None)
Create a large dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
preparation_settings (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createSmallTestDataSet(name='QSPRDataset_test_small', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], preparation_settings=None, random_state=42)
Create a small dataset for testing purposes.
- Parameters:
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
preparation_settings (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- createTestDataSetFromFrame(df, name='QSPRDataset_test', target_props=[{'name': 'CL', 'task': <TargetTasks.REGRESSION: 'REGRESSION'>}], random_state=None, prep=None, n_jobs=1, chunk_size=None)
Create a dataset for testing purposes from the given data frame.
- Parameters:
df (pd.DataFrame) – data frame containing the dataset
name (str) – name of the dataset
target_props (List of dicts or TargetProperty) – list of target properties
random_state (int) – random state to use for splitting and shuffling
prep (dict) – dictionary containing preparation settings
- Returns:
a QSPRDataset object
- Return type:
QSPRDataset
- debug()
Run the test without collecting errors in a TestResult
- defaultTestResult()
- classmethod doClassCleanups()
Execute all class cleanup functions. Normally called for you after tearDownClass.
- doCleanups()
Execute all cleanup functions. Normally called for you after tearDown.
- classmethod enterClassContext(cm)
Same as enterContext, but class-wide.
- enterContext(cm)
Enters the supplied context manager.
If successful, also adds its __exit__ method as a cleanup function and returns the result of the __enter__ method.
- fail(msg=None)
Fail immediately, with the given message.
- failureException
alias of AssertionError
- fitTest(model: QSPRModel, ds: QSPRDataset)
Test model fitting, optimization and evaluation.
- Parameters:
model (QSPRModel) – The model to test.
ds (QSPRDataset) – The dataset to use for testing.
- classmethod getAllDescriptors()
Return a list of (ideally) all available descriptor sets. For now they need to be added manually to the list below.
TODO: would be nice to create the list automatically by implementing a descriptor set registry that would hold all installed descriptor sets.
- getBigDF()
Get a large data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- classmethod getDataPrepGrid()
Return a list of many possible combinations of descriptor calculators, splits, feature standardizers, feature filters and data filters. Again, this is not exhaustive, but should cover a lot of cases.
- Returns:
a generator that yields tuples of all possible combinations as stated above. Each tuple is defined as: (descriptor_calculator, split, feature_standardizer, feature_filters, data_filters)
- Return type:
grid
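The kind of grid described above can be sketched with `itertools.product`. This is not the QSPRpred implementation: the option lists below hold placeholder strings (all hypothetical) where the real method would use descriptor calculators, splits, standardizers and filters, and `data_prep_grid` is a name invented for the example.

```python
from itertools import product

# Placeholder options (hypothetical); None stands for "step not used".
descriptor_calculators = ["calc_a", "calc_b"]
splits = [None, "random_split"]
feature_standardizers = [None, "standard_scaler"]
feature_filters = [None, "correlation_filter"]
data_filters = [None, "duplicates_filter"]


def data_prep_grid():
    """Lazily yield every combination as a 5-tuple in the documented order:
    (descriptor_calculator, split, feature_standardizer,
     feature_filters, data_filters)."""
    return product(
        descriptor_calculators,
        splits,
        feature_standardizers,
        feature_filters,
        data_filters,
    )


# Materialize the grid to inspect its size: 2 * 2 * 2 * 2 * 2 combinations.
grid = list(data_prep_grid())
```

Because `product` is lazy, the combinations are only built when iterated, which keeps large grids cheap to define.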
- classmethod getDefaultCalculatorCombo()
Makes a list of default descriptor calculators that can be used in tests. It creates one calculator with only Morgan fingerprints, one with only RDKit descriptors, and one that combines both to test behaviour with multiple descriptor sets. Override this method if you want to test with other descriptor sets and calculator combinations.
- static getDefaultPrep()
Return a dictionary with default preparation settings.
- getModel(name: str, alg: Type | None = None, parameters: dict | None = None, random_state: int | None = None)
Create a SklearnModel model.
- Parameters:
name (str) – the name of the model
alg (Type, optional) – the algorithm to use. Defaults to None.
dataset (QSPRDataset, optional) – the dataset to use. Defaults to None.
parameters (dict, optional) – the parameters to use. Defaults to None.
random_state (int, optional) – Random state to use for shuffling and other random operations. Defaults to None.
- Returns:
the model
- Return type:
- classmethod getPrepCombos()
Return a list of all possible preparation combinations as generated by getDataPrepGrid as well as their names. The generated list can be used to parameterize tests with the given named combinations.
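One way to picture "combinations as well as their names" is to prepend a readable label to each tuple. The sketch below is an assumption about the general idea, not the actual QSPRpred code: `get_name`, the placeholder strings and the naming scheme are all hypothetical.

```python
from itertools import product


def get_name(part):
    """Hypothetical naming helper: 'None' for unused steps, str() otherwise."""
    return "None" if part is None else str(part)


# A reduced two-axis grid of placeholder options (hypothetical).
grid = product(["calc_a", "calc_b"], [None, "random_split"])

# Each entry is (name, *combination); the name can label a parameterized test.
named_combos = [
    ("_".join(get_name(part) for part in combo), *combo) for combo in grid
]

names = [entry[0] for entry in named_combos]
```

A list in this shape can be passed to a test parameterization helper, so that each generated test reports the name of the combination it exercised.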
- getSmallDF()
Get a small data frame for testing purposes.
- Returns:
a pandas.DataFrame containing the dataset
- Return type:
pd.DataFrame
- property gridFile
- id()
- longMessage = True
- maxDiff = 640
- predictorTest(model: QSPRModel, dataset: QSPRDataset, comparison_model: QSPRModel | None = None, expect_equal_result=True, **pred_kwargs)
Test model predictions.
Checks if the shape of the predictions is as expected and if the predictions of the predictMols function are consistent with the predictions of the predict/predictProba functions. Also checks if the predictions of the model are the same as the predictions of the comparison model if given.
- Parameters:
model (QSPRModel) – The model to make predictions with.
dataset (QSPRDataset) – The dataset to make predictions for.
comparison_model (QSPRModel) – another model to compare the predictions with.
expect_equal_result (bool) – Whether the expected result should be equal or not equal to the predictions of the comparison model.
**pred_kwargs – Extra keyword arguments to pass to the predictor’s predictMols method.
- run(result=None)
- setUp()
Hook method for setting up the test fixture before exercising it.
- classmethod setUpClass()
Hook method for setting up class fixture before running tests in the class.
- setUpPaths()
Set up the test environment.
- shortDescription()
Returns a one-line description of the test, or None if no description has been provided.
The default implementation of this method returns the first line of the specified test method’s docstring.
- skipTest(reason)
Skip this test.
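A minimal sketch of skipping from inside a test body follows. The flag `backend_available` and the class `SkipExample` are hypothetical and stand in for whatever condition a real test would check.

```python
import unittest


class SkipExample(unittest.TestCase):
    def test_optional_backend(self):
        backend_available = False  # hypothetical condition
        if not backend_available:
            # Raises unittest.SkipTest; the rest of the method is not run.
            self.skipTest("optional backend is not installed")
        self.fail("not reached when the test is skipped")


# Run the test case programmatically and collect the outcome.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SkipExample)
result = unittest.TestResult()
suite.run(result)

# Each entry of result.skipped is a (test, reason) pair.
skip_reasons = [reason for _test, reason in result.skipped]
```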
- subTest(msg=<object object>, **params)
Return a context manager that will run the enclosed block of code in a subtest identified by the optional message and keyword parameters. A failure in the subtest marks the test case as failed but resumes execution at the end of the enclosed block, allowing further test code to be executed.
- tearDown()
Remove all files and directories that are used for testing.
- classmethod tearDownClass()
Hook method for deconstructing the class fixture after running all tests in the class.
- validate_split(dataset)
Check if the split has the data it should have after splitting.