Overview of available features
Data Sources
DataSource: Base class for data sources.
Data sources are used to load data from a source programmatically.
Papyrus: Papyrus (See data collection with Papyrus tutorial.)
Data Filters
DataFilter: Base class for data filters.
Data filters are used to filter data based on some criteria. Examples can be found in the data preparation tutorial.
CategoryFilter: CategoryFilter
RepeatsFilter: RepeatsFilter
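To make the idea of filtering data on a criterion concrete, here is a minimal sketch in plain Python. It does not use the library's own classes: the function name `category_filter` and its parameters `column`, `values` and `keep` are hypothetical and chosen only for illustration.

```python
from typing import Any


def category_filter(
    rows: list[dict[str, Any]],
    column: str,
    values: list[Any],
    keep: bool = False,
) -> list[dict[str, Any]]:
    """Keep (keep=True) or drop (keep=False) rows whose `column` is in `values`.

    Hypothetical helper for illustration; not the library's CategoryFilter API.
    """
    wanted = set(values)
    return [row for row in rows if (row.get(column) in wanted) == keep]


# Toy data set: three molecules with a made-up "source" annotation.
data = [
    {"SMILES": "CCO", "source": "ChEMBL"},
    {"SMILES": "c1ccccc1", "source": "in-house"},
    {"SMILES": "CC(=O)O", "source": "ChEMBL"},
]

kept = category_filter(data, column="source", values=["ChEMBL"], keep=True)
dropped = category_filter(data, column="source", values=["ChEMBL"], keep=False)
```

With `keep=True` only rows matching the listed categories survive; with `keep=False` the same rows are removed instead.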
Descriptor Sets
DescriptorSet: Base class for descriptor sets.
Descriptor sets are used to calculate molecular descriptors for a set of molecules. Examples can be found in the descriptor calculation tutorial.
DrugExPhyschem: DrugExPhyschem
PredictorDesc: PredictorDesc
RDKitDescs: RDKitDescs
SmilesDescs: SmilesDescs
TanimotoDistances: TanimotoDistances
DataFrameDescriptorSet: DataFrameDescriptorSet
Fingerprint: Fingerprint
    AtomPairFP: AtomPairFP
    AvalonFP: AvalonFP
    LayeredFP: LayeredFP
    MaccsFP: MaccsFP
    MorganFP: MorganFP
    PatternFP: PatternFP
    RDKitFP: RDKitFP
    RDKitMACCSFP: RDKitMACCSFP
    TopologicalFP: TopologicalFP

ExtendedValenceSignature: ExtendedValenceSignature
Mold2: Mold2
Mordred: Mordred
PaDEL: PaDEL
ProteinDescriptorSet: ProteinDescriptorSet
    ProDec: ProDec

Fingerprint: Fingerprint
    CDKAtomPairs2DFP: CDKAtomPairs2DFP
    CDKEStateFP: CDKEStateFP
    CDKExtendedFP: CDKExtendedFP
    CDKFP: CDKFP
    CDKGraphOnlyFP: CDKGraphOnlyFP
    CDKKlekotaRothFP: CDKKlekotaRothFP
    CDKMACCSFP: CDKMACCSFP
    CDKPubchemFP: CDKPubchemFP
    CDKSubstructureFP: CDKSubstructureFP
Data Splitters
DataSplit: Base class for data splitters.
Data splitters are used to split data into training and test sets. Examples can be found in the data splitting tutorial.
RandomSplit: RandomSplit
ScaffoldSplit: ScaffoldSplitter
TemporalSplit: StratifiedSplitter
ManualSplit: ManualSplit
BootstrapSplit: BootstrapSplit
GBMTDataSplit: GBMTDataSplit
GBMTRandomSplit: GBMTRandomSplit
ClusterSplit: ClusterSplit

LeaveTargetsOut: LeaveTargetsOut
PCMSplit: PCMSplit
TemporalPerTarget: TemporalPerTarget
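The difference between a random split and a group-aware split (the idea behind scaffold or cluster splits) can be sketched with the standard library alone. The functions `random_split` and `group_split` below are hypothetical illustrations, and the greedy group assignment is just one simple strategy; the splitters listed above may work differently.

```python
import random


def random_split(
    n_samples: int, test_fraction: float = 0.2, seed: int = 42
) -> tuple[list[int], list[int]]:
    """Shuffle sample indices reproducibly and cut off a test fraction."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return sorted(indices[n_test:]), sorted(indices[:n_test])


def group_split(
    groups: list[str], test_fraction: float = 0.2
) -> tuple[list[int], list[int]]:
    """Assign whole groups (e.g. scaffolds or clusters) to one side of the split.

    Greedy assumption for this sketch: smallest groups go to the test set
    first, as long as the target test size is not exceeded.
    """
    members: dict[str, list[int]] = {}
    for idx, group in enumerate(groups):
        members.setdefault(group, []).append(idx)
    n_test_target = round(len(groups) * test_fraction)
    train: list[int] = []
    test: list[int] = []
    for group in sorted(members, key=lambda g: (len(members[g]), g)):
        fits = len(test) + len(members[group]) <= n_test_target
        (test if fits else train).extend(members[group])
    return sorted(train), sorted(test)


train_idx, test_idx = random_split(10, test_fraction=0.2, seed=0)

# Made-up group labels standing in for scaffolds or cluster identifiers.
labels = ["A", "A", "A", "B", "B", "C", "D", "D", "A", "C"]
group_train, group_test = group_split(labels, test_fraction=0.2)
```

In the group-aware variant no group label appears on both sides of the split, which is what makes such splits a harder test of generalization than a purely random one.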
Feature Filters
FeatureFilter: Base class for feature filters.
Feature filters are used to filter features based on some criteria. Examples can be found in the data preparation tutorial.
HighCorrelationFilter: HighCorrelationFilter
LowVarianceFilter: LowVarianceFilter
BorutaFilter: BorutaFilter (numpy version restricted to numpy<1.24.0)
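As an illustration of what a feature filter does, the following sketch drops features whose variance does not exceed a threshold. It is a plain-Python approximation of the concept; the function `low_variance_filter`, its `threshold` parameter and the feature values are assumptions made for this example, not the library's implementation.

```python
from statistics import pvariance


def low_variance_filter(
    features: dict[str, list[float]], threshold: float = 0.0
) -> dict[str, list[float]]:
    """Keep only the features whose population variance exceeds `threshold`."""
    return {
        name: values
        for name, values in features.items()
        if pvariance(values) > threshold
    }


# Made-up feature matrix, stored column-wise for simplicity.
features = {
    "mol_weight": [180.2, 46.1, 78.1, 60.1],
    "has_carbon": [1.0, 1.0, 1.0, 1.0],  # constant, carries no information
    "logp": [1.2, -0.3, 2.1, -0.2],
}

selected = low_variance_filter(features, threshold=0.01)
```

The constant `has_carbon` column is removed because it cannot help a model distinguish between molecules.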
Models
QSPRModel: Base class for models.
Models are used to predict properties of molecules. A general example can be found in the quick start tutorial. More detailed information can be found throughout the basic and advanced modelling tutorials.
SklearnModel: SklearnModel
PCMModel: PCMModel (See PCM tutorial.)
More information on the deep learning models can be found in the deep learning tutorial.
DNNModel: DNNModel
ChempropModel: ChempropModel (See Chemprop tutorial.)
PyBoostModel: PyBoostModel
Metrics
Metric: Base class for metrics.
Metrics are used to evaluate the performance of models. More information can be found in the model assessment tutorial.
SklearnMetrics: SklearnMetrics
MaskedMetric: MaskedMetric
CalibrationError: CalibrationError
BEDROC: BEDROC
EnrichmentFactor: EnrichmentFactor
RobustInitialEnhancement: RobustInitialEnhancement
Prevalence: Prevalence
Sensitivity: Sensitivity
Specificity: Specificity
PositivePredictivity: PositivePredictivity
NegativePredictivity: NegativePredictivity
CohenKappa: CohenKappa
BalancedPositivePredictivity: BalancedPositivePredictivity
BalancedNegativePredictivity: BalancedNegativePredictivity
BalancedMatthewsCorrcoeff: BalancedMatthewsCorrcoeff
BalancedCohenKappa: BalancedCohenKappa
KSlope: KSlope
R20: R20
KPrimeSlope: KPrimeSlope
RPrime20: RPrime20
Pearson: Pearson
Spearman: Spearman
Kendall: Kendall
AverageFoldError: AverageFoldError
AbsoluteAverageFoldError: AbsoluteAverageFoldError
PercentageWithinFoldError: PercentageWithinFoldError
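Several of the classification metrics named above (prevalence, sensitivity, specificity) follow directly from the confusion matrix. The sketch below computes them with their standard textbook definitions in plain Python; the function names are hypothetical and the code is not taken from the library.

```python
def confusion_counts(y_true: list[int], y_pred: list[int]) -> tuple[int, int, int, int]:
    """Return (tp, tn, fp, fn) for binary labels encoded as 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def sensitivity(y_true: list[int], y_pred: list[int]) -> float:
    """Fraction of actual positives predicted positive (undefined without positives)."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn)


def specificity(y_true: list[int], y_pred: list[int]) -> float:
    """Fraction of actual negatives predicted negative (undefined without negatives)."""
    _, tn, fp, _ = confusion_counts(y_true, y_pred)
    return tn / (tn + fp)


def prevalence(y_true: list[int]) -> float:
    """Fraction of samples that are actually positive."""
    return sum(y_true) / len(y_true)


# Made-up labels and predictions for eight molecules.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

sens = sensitivity(y_true, y_pred)
spec = specificity(y_true, y_pred)
prev = prevalence(y_true)
```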
Model Assessors
ModelAssessor: Base class for model assessors.
Model assessors are used to assess the performance of models. More information can be found in the model assessment tutorial.
CrossValAssessor: CrossValAssessor
TestSetAssessor: TestSetAssessor
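Cross-validation, the idea behind a cross-validation assessor, repeatedly holds out one fold of the training data for evaluation. The following sketch generates contiguous, unshuffled folds in plain Python; `k_fold_indices` is a hypothetical helper for illustration and not part of the library.

```python
from collections.abc import Iterator


def k_fold_indices(
    n_samples: int, n_folds: int = 5
) -> Iterator[tuple[list[int], list[int]]]:
    """Yield (train, test) index lists so that each sample is tested exactly once."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [
        n_samples // n_folds + (1 if fold < n_samples % n_folds else 0)
        for fold in range(n_folds)
    ]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(10, n_folds=3))
```

A test set assessor, by contrast, would evaluate a model once on a held-out test set instead of iterating over folds.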
Hyperparameter Optimizers
HyperparameterOptimization: Base class for hyperparameter optimizers.
Hyperparameter optimizers are used to optimize the hyperparameters of models. More information can be found in the hyperparameter optimization tutorial.
GridSearchOptimization: GridSearchOptimization
OptunaOptimization: OptunaOptimization
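A grid search evaluates every combination of candidate hyperparameter values and keeps the best one. The sketch below shows this with the standard library only. The function `grid_search`, the toy objective and the parameter names `n_estimators` and `max_depth` are assumptions for illustration; in practice the objective would be a model assessment score.

```python
from itertools import product
from typing import Any, Callable


def grid_search(
    objective: Callable[..., float], param_grid: dict[str, list[Any]]
) -> tuple[dict[str, Any], float]:
    """Exhaustively evaluate all parameter combinations and return the best."""
    names = list(param_grid)
    best_params: dict[str, Any] = {}
    best_score = float("-inf")
    for combo in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, combo))
        score = objective(**params)
        if score > best_score:  # higher is better in this sketch
            best_params, best_score = params, score
    return best_params, best_score


def toy_objective(n_estimators: int, max_depth: int) -> float:
    """Stand-in for a validation score; peaks at n_estimators=100, max_depth=5."""
    return -((n_estimators - 100) ** 2) / 1e4 - (max_depth - 5) ** 2


best_params, best_score = grid_search(
    toy_objective,
    {"n_estimators": [50, 100, 200], "max_depth": [3, 5, 7]},
)
```

The cost grows with the product of the grid sizes (nine evaluations here), which is why samplers such as those provided by Optuna are often preferred for larger search spaces.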
Model Plots
ModelPlot: Base class for model plots.
Model plots are used to visualize the performance of models. Examples can be found throughout the basic and advanced modelling tutorials.
RegressionPlot: RegressionPlot
    CorrelationPlot: CorrelationPlot
    WilliamsPlot: WilliamsPlot

ClassifierPlot: ClassifierPlot
    ROCPlot: ROCPlot
    PRCPlot: PRCPlot
    CalibrationPlot: CalibrationPlot
    MetricsPlot: MetricsPlot
    ConfusionMatrixPlot: ConfusionMatrixPlot
Monitors
FitMonitor: Base class for monitoring model fitting.
AssessorMonitor: Base class for monitoring model assessment (subclass of FitMonitor).
HyperparameterOptimizationMonitor: Base class for monitoring hyperparameter optimization (subclass of AssessorMonitor).
Monitors are used to monitor the training of models. More information can be found in the model monitoring tutorial.
NullMonitor: NullMonitor
ListMonitor: ListMonitor
BaseMonitor: BaseMonitor
FileMonitor: FileMonitor
WandBMonitor: WandBMonitor
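The general pattern behind monitors is a set of hooks that the training loop calls at defined points, with subclasses deciding what to do with each event. The sketch below illustrates that pattern only: the class names `ToyFitMonitor`, `SilentMonitor` and `RecordingMonitor`, the hook names and the `toy_fit` loop are all hypothetical and do not reflect the library's actual monitor interface.

```python
class ToyFitMonitor:
    """Hypothetical base class: hooks default to doing nothing."""

    def on_fit_start(self) -> None:
        pass

    def on_epoch_end(self, epoch: int, loss: float) -> None:
        pass

    def on_fit_end(self) -> None:
        pass


class SilentMonitor(ToyFitMonitor):
    """Ignores every event; useful when no monitoring is wanted."""


class RecordingMonitor(ToyFitMonitor):
    """Keeps the per-epoch losses in memory for later inspection."""

    def __init__(self) -> None:
        self.losses: list[tuple[int, float]] = []

    def on_epoch_end(self, epoch: int, loss: float) -> None:
        self.losses.append((epoch, loss))


def toy_fit(n_epochs: int, monitor: ToyFitMonitor) -> float:
    """Fake training loop: the loss halves each epoch and is reported to the monitor."""
    monitor.on_fit_start()
    loss = 1.0
    for epoch in range(n_epochs):
        loss /= 2
        monitor.on_epoch_end(epoch, loss)
    monitor.on_fit_end()
    return loss


monitor = RecordingMonitor()
final_loss = toy_fit(3, monitor)
```

Because the loop only talks to the base-class hooks, swapping `RecordingMonitor` for `SilentMonitor` changes what is recorded without touching the training code.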
Scaffolds
Scaffold: Base class for scaffolds.
Classes for calculating molecular scaffolds of different kinds.
Murcko: Murcko
BemisMurcko: BemisMurcko
Clustering
MoleculeClusters: Base class for clustering molecules.
Classes for clustering molecules.
RandomClusters: RandomClusters
ScaffoldClusters: ScaffoldClusters
FPSimilarityClusters: FPSimilarityClusters
FPSimilarityMaxMinClusters: FPSimilarityMaxMinClusters
FPSimilarityLeaderPickerClusters: FPSimilarityLeaderPickerClusters
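Fingerprint-similarity clustering can be illustrated with a simple leader-style algorithm: each molecule joins the first cluster whose leader is similar enough, and otherwise starts a new cluster. In this sketch fingerprints are represented as sets of on-bit indices and compared with the Tanimoto coefficient. The function names, the threshold and the toy fingerprints are assumptions, and the library's clustering classes may use different algorithms.

```python
def tanimoto(a: set[int], b: set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    if not a and not b:
        return 1.0  # convention chosen for this sketch: two empty sets are identical
    return len(a & b) / len(a | b)


def leader_clusters(
    fingerprints: list[set[int]], threshold: float = 0.5
) -> dict[int, list[int]]:
    """Map each leader index to the indices of the molecules in its cluster."""
    clusters: dict[int, list[int]] = {}
    for idx, fp in enumerate(fingerprints):
        for leader in clusters:
            if tanimoto(fp, fingerprints[leader]) >= threshold:
                clusters[leader].append(idx)
                break
        else:
            # No sufficiently similar leader found: this molecule starts a cluster.
            clusters[idx] = [idx]
    return clusters


# Made-up fingerprints for five molecules.
fingerprints = [
    {1, 2, 3, 4},
    {1, 2, 3, 5},
    {10, 11, 12},
    {10, 11, 13},
    {1, 2, 3, 4, 5},
]

clusters = leader_clusters(fingerprints, threshold=0.5)
```

The result depends on the order of the molecules, a known property of leader-style algorithms.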