drugex.utils package
Submodules
drugex.utils.download module
Utility functions to download files.
- drugex.utils.download.check_sha256sum(filename, sha256)
Check that the SHA256 checksum of a file matches the given one.
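A checksum comparison of this kind can be sketched with the standard library alone. The helper name `sha256_matches`, its `chunk_size` parameter, and the boolean return value are assumptions made for illustration; the documentation above does not say whether `check_sha256sum` returns a value or raises on mismatch.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_matches(filename, expected_sha256, chunk_size=65536):
    """Return True if the SHA256 hex digest of `filename` equals `expected_sha256`.

    Hypothetical helper, not the DrugEx implementation.
    """
    digest = hashlib.sha256()
    with open(filename, "rb") as handle:
        # Read in fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    # hexdigest() is lowercase, so normalise the expected value before comparing.
    return digest.hexdigest() == expected_sha256.lower()


# Usage: write a small file and compare it against a known and a wrong digest.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample = Path(tmp_dir) / "sample.bin"
    sample.write_bytes(b"drugex")
    expected = hashlib.sha256(b"drugex").hexdigest()
    result_ok = sha256_matches(sample, expected)
    result_bad = sha256_matches(sample, "0" * 64)
```

Hashing in chunks keeps memory use constant regardless of file size, which matters for large model or data archives.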
- drugex.utils.download.download_file(url, out_path, extract_out_path=None, byte_size=None, sha256sum=None, progress=True, callback=None) → None
Download a file and extract its contents if it is a ZIP or TAR.GZ file.
- Parameters:
url (str) – URL of the file to be downloaded.
out_path (str) – Path the file should be written to
extract_out_path (str) – Path to extract the content of the ZIP/TAR.GZ file into
byte_size (int) – Size in bytes of the file to be downloaded; ignored if None
sha256sum (str) – Expected SHA256 checksum, compared against that of the downloaded file
progress (bool) – Whether to show download progress
callback (Callable[[], Any]) – callback function to be called after each chunk of the file is downloaded
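The per-chunk `callback` behaviour described above can be illustrated with a minimal standard-library sketch. The function `fetch_in_chunks` and its `chunk_size` parameter are hypothetical; it is not the DrugEx implementation and leaves out extraction, checksum verification, and progress display. The usage part reads from a local `file://` URL so that it runs without network access.

```python
import tempfile
import urllib.request
from pathlib import Path


def fetch_in_chunks(url, out_path, chunk_size=8192, callback=None):
    """Copy `url` to `out_path` chunk by chunk; return the number of bytes written.

    Hypothetical helper for illustration only.
    """
    written = 0
    with urllib.request.urlopen(url) as response, open(out_path, "wb") as out_file:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break  # end of stream
            out_file.write(chunk)
            written += len(chunk)
            if callback is not None:
                callback()  # zero-argument callback, invoked once per chunk
    return written


# Usage: "download" a local file through a file:// URL and count callback calls.
with tempfile.TemporaryDirectory() as tmp_dir:
    source = Path(tmp_dir) / "source.bin"
    source.write_bytes(b"x" * 20000)
    target = Path(tmp_dir) / "target.bin"
    calls = []
    n_bytes = fetch_in_chunks(source.as_uri(), target, callback=lambda: calls.append(1))
    copied = target.read_bytes()
```

A zero-argument callback matches the documented `Callable[[], Any]` type; a caller would typically use it to advance a progress bar or to check for cancellation.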
drugex.utils.fingerprints module
drugex.utils.gcmol module
gcmol
Created by: Martin Sicho On: 10.06.22, 16:50
- drugex.utils.gcmol.canonicalize(smiles: str, include_stereocenters=True) → str | None
Canonicalize a SMILES string with RDKit. The algorithm is detailed at https://pubs.acs.org/doi/full/10.1021/acs.jcim.5b00543
- Parameters:
smiles (str) – SMILES string to canonicalize
include_stereocenters (bool) – whether to keep the stereochemical information in the canonical SMILES string
- Returns:
Canonicalized SMILES string, None if the molecule is invalid.
- drugex.utils.gcmol.canonicalize_list(smiles_list: Iterable[str], include_stereocenters=True) → List[str]
Canonicalize a list of SMILES strings. Duplicates are filtered out and invalid molecules are removed.
- Parameters:
smiles_list (Iterable[str]) – molecules as SMILES strings
include_stereocenters (bool) – whether to keep the stereochemical information in the canonical SMILES strings
- Returns:
The canonicalized and filtered input smiles.
- drugex.utils.gcmol.remove_duplicates(list_with_duplicates)
Removes the duplicates and keeps the ordering of the original list. For duplicates, the first occurrence is kept and the later occurrences are ignored.
- Parameters:
list_with_duplicates – list that possibly contains duplicates
- Returns:
A list with no duplicates.
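The first-occurrence-wins behaviour described above can be sketched as follows. The name `dedupe_keep_order` is hypothetical, and the sketch assumes the list items are hashable (as SMILES strings are); it is not necessarily how `remove_duplicates` is implemented.

```python
def dedupe_keep_order(items):
    """Return the items of `items` without repeats, preserving first-seen order."""
    seen = set()
    unique = []
    for item in items:
        if item not in seen:
            seen.add(item)       # remember the item so later repeats are skipped
            unique.append(item)  # keep only the first occurrence
    return unique


# Usage: the second "CCO" and the second "c1ccccc1" are dropped.
example = dedupe_keep_order(["CCO", "c1ccccc1", "CCO", "CC", "c1ccccc1"])
```

Here `example` is `["CCO", "c1ccccc1", "CC"]`. Tracking seen items in a set keeps the pass linear in the length of the list.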
drugex.utils.optim module
drugex.utils.pareto module
- drugex.utils.pareto.get_Pareto_fronts(scores)
Identify the Pareto fronts from a given set of scores.
- Parameters:
scores (numpy.ndarray) – An (n_points, n_scores) array of scores.
- Returns:
A list containing the indices of points belonging to each Pareto front.
- Return type:
list of numpy.ndarray
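As an illustration of what grouping points into Pareto fronts means, the sketch below peels off successive non-dominated sets in pure Python. It makes several assumptions that the documentation above does not state: higher scores are treated as better, the function names `dominates` and `pareto_fronts` are hypothetical, and plain lists of indices are returned instead of NumPy arrays. It is a quadratic-time sketch, not the DrugEx implementation.

```python
def dominates(a, b):
    """True if `a` is at least as good as `b` everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_fronts(scores):
    """Group row indices of `scores` into successive non-dominated fronts.

    Assumes maximisation (higher is better) for every score column.
    """
    remaining = list(range(len(scores)))
    fronts = []
    while remaining:
        # A point belongs to the current front if no other remaining point dominates it.
        front = [
            i for i in remaining
            if not any(dominates(scores[j], scores[i]) for j in remaining if j != i)
        ]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts


# Usage: five points with two scores each.
scores = [
    [1.0, 1.0],  # dominates every other point
    [0.5, 0.5],
    [1.0, 0.2],
    [0.2, 1.0],
    [0.1, 0.1],  # dominated by [0.5, 0.5]
]
fronts = pareto_fronts(scores)
```

With these inputs `fronts` is `[[0], [1, 2, 3], [4]]`: the first front holds the single best point, the second holds three mutually non-dominated trade-offs, and the last holds the remaining point.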