Overview of available features
Data Sources
DataSource
: Base class for data sources.
Data sources are used to load data from a source programmatically.
Papyrus
: Papyrus (See data collection with Papyrus tutorial.)
Data Filters
DataFilter
: Base class for data filters.
Data filters are used to filter data based on some criteria. Examples can be found in the data preparation tutorial.
CategoryFilter
: CategoryFilter
RepeatsFilter
: RepeatsFilter
Descriptor Sets
DescriptorSet
: Base class for descriptor sets.
Descriptor sets are used to calculate molecular descriptors for a set of molecules. Examples can be found in the descriptor calculation tutorial.
DrugExPhyschem
: DrugExPhyschem
PredictorDesc
: PredictorDesc
RDKitDescs
: RDKitDescs
SmilesDescs
: SmilesDescs
TanimotoDistances
: TanimotoDistances
DataFrameDescriptorSet
: DataFrameDescriptorSet
Fingerprint
: Fingerprint
AtomPairFP
: AtomPairFP
AvalonFP
: AvalonFP
LayeredFP
: LayeredFP
MaccsFP
: MaccsFP
MorganFP
: MorganFP
PatternFP
: PatternFP
RDKitFP
: RDKitFP
RDKitMACCSFP
: RDKitMACCSFP
TopologicalFP
: TopologicalFP
ExtendedValenceSignature
: ExtendedValenceSignature
Mold2
: Mold2
Mordred
: Mordred
PaDEL
: PaDEL
ProteinDescriptorSet
: ProteinDescriptorSet
ProDec
: ProDec
Fingerprint
: Fingerprint
CDKAtomPairs2DFP
: CDKAtomPairs2DFP
CDKEStateFP
: CDKEStateFP
CDKExtendedFP
: CDKExtendedFP
CDKFP
: CDKFP
CDKGraphOnlyFP
: CDKGraphOnlyFP
CDKKlekotaRothFP
: CDKKlekotaRothFP
CDKMACCSFP
: CDKMACCSFP
CDKPubchemFP
: CDKPubchemFP
CDKSubstructureFP
: CDKSubstructureFP
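TanimotoDistances, listed among the descriptor sets above, refers to distances between molecular fingerprints. As a rough illustration of the underlying idea only, and not of how QSPRpred computes it, the sketch below calculates a Tanimoto distance in plain Python. The function names and the representation of a fingerprint as a set of on-bit indices are assumptions made for this example.

```python
from __future__ import annotations


def tanimoto_similarity(a: set[int], b: set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    if not a and not b:
        # Convention chosen for this sketch: two empty fingerprints are identical.
        return 1.0
    return len(a & b) / len(a | b)


def tanimoto_distance(a: set[int], b: set[int]) -> float:
    """Distance defined as one minus the Tanimoto similarity."""
    return 1.0 - tanimoto_similarity(a, b)


# Hypothetical fingerprints, given as the indices of the bits that are set.
fp_a = {1, 4, 7, 9}
fp_b = {1, 4, 8}
distance = tanimoto_distance(fp_a, fp_b)  # 2 shared bits out of 5 distinct bits
```

A distance of 0 means the two fingerprints set exactly the same bits, and a distance of 1 means they share none.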
Data Splitters
DataSplit
: Base class for data splitters.
Data splitters are used to split data into training and test sets. Examples can be found in the data splitting tutorial.
RandomSplit
: RandomSplit
ScaffoldSplit
: ScaffoldSplit
TemporalSplit
: TemporalSplit
ManualSplit
: ManualSplit
BootstrapSplit
: BootstrapSplit
GBMTDataSplit
: GBMTDataSplit
GBMTRandomSplit
: GBMTRandomSplit
ClusterSplit
: ClusterSplit
LeaveTargetsOut
: LeaveTargetsOut
PCMSplit
: PCMSplit
TemporalPerTarget
: TemporalPerTarget
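To make the idea of splitting data into training and test sets concrete, here is a minimal sketch of a time-based split in plain Python. It does not use the QSPRpred splitter classes listed above. The function name, the record layout, and the `year` field are all assumptions chosen for illustration.

```python
def temporal_split(records, time_key, cutoff):
    """Put records dated after `cutoff` in the test set and the rest in training."""
    train = [r for r in records if r[time_key] <= cutoff]
    test = [r for r in records if r[time_key] > cutoff]
    return train, test


# Hypothetical data set: each record has a SMILES string and a publication year.
records = [
    {"smiles": "CCO", "year": 2012},
    {"smiles": "c1ccccc1", "year": 2015},
    {"smiles": "CC(=O)O", "year": 2019},
    {"smiles": "CCN", "year": 2021},
]

# Everything published up to and including 2015 is used for training.
train, test = temporal_split(records, time_key="year", cutoff=2015)
```

Splitting on time rather than at random mimics prospective use, where a model trained on older data is applied to newer compounds.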
Feature Filters
FeatureFilter
: Base class for feature filters.
Feature filters are used to filter features based on some criteria. Examples can be found in the data preparation tutorial.
HighCorrelationFilter
: HighCorrelationFilter
LowVarianceFilter
: LowVarianceFilter
BorutaFilter
: BorutaFilter (numpy version restricted to numpy<1.24.0)
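As an illustration of what filtering features on a criterion can look like, the following sketch drops columns whose variance falls below a threshold. It uses only the Python standard library and is not the QSPRpred LowVarianceFilter. The function names, the column names, and the threshold value are assumptions made for this example.

```python
from statistics import pvariance


def low_variance_columns(table, threshold):
    """Return names of columns whose population variance is below `threshold`."""
    return [name for name, values in table.items() if pvariance(values) < threshold]


def drop_low_variance(table, threshold):
    """Return a copy of `table` without the low-variance columns."""
    dropped = set(low_variance_columns(table, threshold))
    return {name: values for name, values in table.items() if name not in dropped}


# Hypothetical feature table: column name -> one value per molecule.
features = {
    "mol_weight": [180.2, 46.1, 78.1, 60.1],
    "constant_flag": [1.0, 1.0, 1.0, 1.0],  # carries no information
    "logp": [1.2, -0.3, 2.1, -0.2],
}

filtered = drop_low_variance(features, threshold=0.01)
```

In this toy table only `constant_flag` is removed, because its values never change across molecules.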
Models
QSPRModel
: Base class for models.
Models are used to predict properties of molecules. A general example can be found in the quick start tutorial. More detailed information can be found throughout the basic and advanced modelling tutorials.
SklearnModel
: SklearnModel
PCMModel
: PCMModel (See PCM tutorial.)
More information can be found in the deep learning tutorial.
DNNModel
: DNNModel
ChempropModel
: ChempropModel (See Chemprop tutorial.)
PyBoostModel
: PyBoostModel
Metrics
Metric
: Base class for metrics.
Metrics are used to evaluate the performance of models. More information can be found in the model assessment tutorial.
SklearnMetrics
: SklearnMetrics
MaskedMetric
: MaskedMetric
CalibrationError
: CalibrationError
BEDROC
: BEDROC
EnrichmentFactor
: EnrichmentFactor
RobustInitialEnhancement
: RobustInitialEnhancement
Prevalence
: Prevalence
Sensitivity
: Sensitivity
Specificity
: Specificity
PositivePredictivity
: PositivePredictivity
NegativePredictivity
: NegativePredictivity
CohenKappa
: CohenKappa
BalancedPositivePredictivity
: BalancedPositivePredictivity
BalancedNegativePredictivity
: BalancedNegativePredictivity
BalancedMatthewsCorrcoeff
: BalancedMatthewsCorrcoeff
BalancedCohenKappa
: BalancedCohenKappa
KSlope
: KSlope
R20
: R20
KPrimeSlope
: KPrimeSlope
RPrime20
: RPrime20
Pearson
: Pearson
Spearman
: Spearman
Kendall
: Kendall
AverageFoldError
: AverageFoldError
AbsoluteAverageFoldError
: AbsoluteAverageFoldError
PercentageWithinFoldError
: PercentageWithinFoldError
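Several of the classification metrics named above have standard textbook definitions based on the confusion matrix. The sketch below computes a few of them in plain Python to show what they measure. It is not the QSPRpred implementation, the function names are assumptions for this example, and labels are assumed to be 0 (negative) or 1 (positive).

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def classification_summary(y_true, y_pred):
    """Textbook definitions; assumes every denominator is non-zero."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "prevalence": (tp + fn) / (tp + tn + fp + fn),  # fraction of actual positives
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "positive_predictivity": tp / (tp + fp),  # precision
        "negative_predictivity": tn / (tn + fn),
    }


# Hypothetical labels and predictions for eight molecules.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
summary = classification_summary(y_true, y_pred)
```

The sketch deliberately skips edge cases such as a data set with no positives, where some of these ratios are undefined.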
Model Assessors
ModelAssessor
: Base class for model assessors.
Model assessors are used to assess the performance of models. More information can be found in the model assessment tutorial.
CrossValAssessor
: CrossValAssessor
TestSetAssessor
: TestSetAssessor
Hyperparameter Optimizers
HyperparameterOptimization
: Base class for hyperparameter optimizers.
Hyperparameter optimizers are used to optimize the hyperparameters of models. More information can be found in the hyperparameter optimization tutorial.
GridSearchOptimization
: GridSearchOptimization
OptunaOptimization
: OptunaOptimization
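The general idea behind a grid search can be sketched in a few lines of plain Python: evaluate every combination of candidate values and keep the best-scoring one. This is a conceptual illustration, not the QSPRpred GridSearchOptimization class. The function names, the parameter names `depth` and `rate`, and the toy scoring function are assumptions standing in for a real model and its validation score.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Evaluate every combination in `param_grid` and return the best one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:  # higher is better; ties keep the first hit
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params):
    """Stand-in for a validation score; peaks at depth=4, rate=0.1."""
    return -((params["depth"] - 4) ** 2) - abs(params["rate"] - 0.1)


# Hypothetical search space: 3 x 3 = 9 combinations are evaluated.
grid = {"depth": [2, 4, 8], "rate": [0.01, 0.1, 1.0]}
best_params, best_score = grid_search(grid, toy_score)
```

An exhaustive grid grows multiplicatively with each added parameter, which is one reason samplers such as those in Optuna are often preferred for larger search spaces.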
Model Plots
ModelPlot
: Base class for model plots.
Model plots are used to visualize the performance of models. Examples can be found throughout the basic and advanced modelling tutorials.
RegressionPlot
: RegressionPlot
CorrelationPlot
: CorrelationPlot
WilliamsPlot
: WilliamsPlot
ClassifierPlot
: ClassifierPlot
ROCPlot
: ROCPlot
PRCPlot
: PRCPlot
CalibrationPlot
: CalibrationPlot
MetricsPlot
: MetricsPlot
ConfusionMatrixPlot
: ConfusionMatrixPlot
Monitors
FitMonitor
: Base class for monitoring model fitting.
AssessorMonitor
: Base class for monitoring model assessment (subclass of FitMonitor).
HyperparameterOptimizationMonitor
: Base class for monitoring hyperparameter optimization (subclass of AssessorMonitor).
Monitors are used to monitor the training of models. More information can be found in the model monitoring tutorial.
NullMonitor
: NullMonitor
ListMonitor
: ListMonitor
BaseMonitor
: BaseMonitor
FileMonitor
: FileMonitor
WandBMonitor
: WandBMonitor
Scaffolds
Scaffold
: Base class for scaffolds.
Classes for calculating molecular scaffolds of different kinds.
Murcko
: Murcko
BemisMurcko
: BemisMurcko
Clustering
MoleculeClusters
: Base class for clustering molecules.
Classes for clustering molecules.
RandomClusters
: RandomClusters
ScaffoldClusters
: ScaffoldClusters
FPSimilarityClusters
: FPSimilarityClusters
FPSimilarityMaxMinClusters
: FPSimilarityMaxMinClusters
FPSimilarityLeaderPickerClusters
: FPSimilarityLeaderPickerClusters