Mechanistically transparent models for predicting aqueous solubility of rigid, slightly flexible, and very flexible drugs (MW<2000): Accuracy near that of random forest regression

Authors

DOI:

https://doi.org/10.5599/admet.1879

Keywords:

General solubility equation (GSE), Abraham solvation equation (ABSOLV), flexible-acceptor GSE(Φ,B), Consensus model, decision-tree Exclusive Or (XOR) model, Kier molecular flexibility index (Φ), drug-like molecules, machine learning (ML), intrinsic solubility

Abstract

Yalkowsky’s General Solubility Equation (GSE), with its three fixed constants, is popular and easy to apply, but it is not very accurate for polar, zwitterionic, or flexible molecules. This review examines the findings of a series of studies in which we sought a better prediction model by comparing the performance of the GSE with that of Abraham’s Solvation Equation (ABSOLV) and of the Random Forest regression (RFR) machine-learning (ML) method. Large, well-curated databases of aqueous intrinsic solubility are available. However, drugs may be sparsely distributed in chemical space and concentrated in clusters, so even a large database might overlook some regions. Test compounds from under-represented regions may be poorly predicted, as might have been the case with the ‘loose’ set of 32 drugs in the Second Solubility Challenge (2020). There still appears to be a need for better coverage of drug space. Solubility predictions increasingly rely on calculated input descriptors, which may be an advantage for exploring the properties of molecules yet to be synthesized. The risk is that uncertainty in the calculated descriptors may accumulate in the overall prediction. ML/AI methods, which are increasingly used, can give accurate predictions, but such predictions may not readily suggest which strategies to pursue in selecting yet-to-be-synthesized compounds. Based on our latest findings, we recommend predictions based on both the ‘grouped’ ABSOLV(GRP) and the ‘Flexible Acceptor’ GSE(Φ,B) models with the provided best-fit parameters, where Φ is the Kier molecular flexibility index and B is the Abraham H-bond acceptor strength. For molecules with Φ < 11, the prudent choice is the Consensus Model, the average of the ABSOLV(GRP) and GSE(Φ,B) predictions. For more flexible molecules, GSE(Φ,B) alone is recommended.
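
The selection rule in the abstract can be illustrated with a minimal Python sketch. It makes several assumptions: the function and parameter names are hypothetical; the classic GSE is written in its widely quoted form, log S0 = 0.5 − 0.01(mp − 25) − log P, with liquids treated as having no crystal term; and the ABSOLV(GRP) and GSE(Φ,B) predictions are taken as inputs, because their best-fit parameters are not given on this page.

```python
# Illustrative sketch only; names are hypothetical, not the author's code.

PHI_THRESHOLD = 11.0  # Kier flexibility cutoff stated in the abstract


def classic_gse(log_p: float, melting_point_c: float) -> float:
    """Classic GSE with its three fixed constants (0.5, 0.01, 25).

    Returns log S0 (S0 in mol/L) from the octanol-water log P and the
    melting point in degrees Celsius.
    """
    # Assumed convention: compounds melting at or below 25 C are treated
    # as liquids, so the crystal-lattice term is zero.
    crystal_term = 0.01 * max(melting_point_c - 25.0, 0.0)
    return 0.5 - crystal_term - log_p


def recommended_log_s0(phi: float,
                       log_s0_absolv_grp: float,
                       log_s0_gse_phi_b: float) -> float:
    """Apply the recommendation from the abstract.

    Phi < 11: Consensus Model, the average of the two predictions.
    Otherwise: the GSE(Phi,B) prediction alone.
    Both predictions must be computed elsewhere and passed in.
    """
    if phi < PHI_THRESHOLD:
        return 0.5 * (log_s0_absolv_grp + log_s0_gse_phi_b)
    return log_s0_gse_phi_b


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to exercise both branches.
    print(classic_gse(log_p=2.0, melting_point_c=125.0))
    print(recommended_log_s0(5.0, -4.0, -5.0))
    print(recommended_log_s0(15.0, -4.0, -5.0))
```

Taking the two model predictions as arguments keeps the sketch independent of any particular descriptor software; only the Φ-based choice between the Consensus Model and GSE(Φ,B) is encoded.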

References

A. Llinàs, R.C. Glen, J.M. Goodman. Solubility challenge: can you predict solubilities of 32 molecules using a database of 100 reliable measurements? Journal of Chemical Information and Modeling 48 (2008) 1289-1303. https://doi.org/10.1021/ci800058v.

A.J. Hopfinger, E.X. Esposito, A. Llinàs, R.C. Glen, J.M. Goodman. Findings of the challenge to predict aqueous solubility. Journal of Chemical Information and Modeling 49 (2009) 1-5. https://doi.org/10.1021/ci800436c.

A. Llinas, A. Avdeef. Solubility challenge revisited after ten years, with multi-lab shake-flask data, using tight (SD ∼ 0.17 log) and loose (SD ∼ 0.62 log) test sets. Journal of Chemical Information and Modeling 59 (2019) 3036-3040. https://doi.org/10.1021/acs.jcim.9b00345.

A. Llinas, I. Oprisiu, A. Avdeef. Findings of the second challenge to predict aqueous solubility. J. Chem. Inf. Model. 60 (2020) 4791-4803. https://doi.org/10.1021/acs.jcim.0c00701.

M. Oja, S. Sild, G. Piir, U. Maran. Intrinsic aqueous solubility: mechanistically transparent data-driven modeling of drug substances. Pharmaceutics 14 (2022) 2248. https://doi.org/10.3390/pharmaceutics14102248.

D.S. Palmer, J.B.O. Mitchell. Is experimental data quality the limiting factor in predicting the aqueous solubility of druglike molecules? Molecular Pharmaceutics 11 (2014) 2962-2972. https://doi.org/10.1021/mp500103r.

A. Avdeef. Predicting Solubility of New Drugs - Handbook of Critically Curated Data for Pharmaceutical Research. CRC Press, Boca Raton, FL, USA, 2024. ISBN: 978-1032617671. https://www.barnesandnoble.com/w/predicting-solubility-of-new-drugs-alex-avdeef/1143832638.

A. Avdeef. Suggested improvements for measurement of equilibrium solubility-pH of ionizable drugs. ADMET & DMPK 3 (2015) 84-109. https://doi.org/10.5599/admet.3.2.193.

A. Avdeef, E. Fuguet, A. Llinàs, C. Ràfols, E. Bosch, G. Völgyi, T. Verbić, E. Boldyreva, K. Takács-Novák. Equilibrium solubility measurement of ionizable drugs – consensus recommendations for improving data quality. ADMET & DMPK 4 (2016) 117-178. https://doi.org/10.5599/admet.4.2.292.

A. Veseli, S. Žakelj, A. Kristl. A review of methods for solubility determination in biopharmaceutical drug characterization. Drug Development and Industrial Pharmacy 45 (2019) 1717-1724. https://doi.org/10.1080/03639045.2019.1665062.

A. Ono, N. Matsumura, T. Kimoto, Y. Akiyama, S. Funaki, N. Tamura, S. Hayashi, Y. Kojima, M. Fushimi, H. Sudaki, R. Aihara, Y. Haruna, M. Jiko, M. Iwasaki, T. Fujita, K. Sugano. Harmonizing solubility measurement to lower inter-laboratory variance – progress of consortium of biopharmaceutical tools (CoBiTo) in Japan. ADMET & DMPK 7 (2019) 183-195. http://dx.doi.org/10.5599/admet.704.

M. Vertzoni, J. Alsenz, P. Augustijns, A. Bauer-Brandl, C.A.S. Bergström, J. Brouwers, A. Müllerz, G. Perlovich, C. Saal, K. Sugano, C. Reppas. UNGAP best practice for improving solubility data quality of orally administered drugs. European Journal of Pharmaceutical Sciences 168 (2022) 106043. https://doi.org/10.1016/j.ejps.2021.106043.

N. Sun, A. Avdeef. Biorelevant pKa (37 °C) Predicted from the 2D Structure of the Molecule and its pKa at 25 °C. Journal of Pharmaceutical and Biomedical Analysis 56 (2011) 173-182. https://doi.org/10.1016/j.jpba.2011.05.007.

A. Avdeef. Solubility temperature dependence predicted from 2D structure. ADMET & DMPK 3 (2015) 298-344. https://doi.org/10.5599/admet.3.4.259.

A. Avdeef. Multi-lab intrinsic solubility measurement reproducibility in CheqSol and shake-flask methods. ADMET & DMPK 7 (2019) 210-219. http://dx.doi.org/10.5599/admet.698.

A. Avdeef. Prediction of aqueous intrinsic solubility of druglike molecules using Random Forest regression trained with Wiki-pS0 database. ADMET & DMPK 8 (2020) 29-77. https://dx.doi.org/10.5599/admet.766.

A. Avdeef, M. Kansy. Can small drugs predict the intrinsic aqueous solubility of ‘beyond Rule of 5’ big drugs? ADMET & DMPK 8 (2020) 180–206. https://dx.doi.org/10.5599/admet.794.

A. Avdeef, M. Kansy. ‘Flexible-Acceptor’ General Solubility Equation for beyond Rule of 5 drugs. Mol. Pharmaceutics 17 (2020) 3930-3940. https://doi.org/10.1021/acs.molpharmaceut.0c00689.

A. Avdeef, M. Kansy. Predicting solubility of newly-approved drugs (2016-2020) with a simple ABSOLV and GSE(Flexible Acceptor) Consensus Model outperforming random forest regression. Journal of Solution Chemistry 51 (2022) 1020-1055. https://doi.org/10.1007/s10953-022-01141-7.

A. Avdeef, M. Kansy. Trends in PhysChem Properties of Newly Approved Drugs over the Last Six Years, Predicting Solubility of Drugs Approved in 2021. Journal of Solution Chemistry 51 (2022) 1455-1481. https://doi.org/10.1007/s10953-022-01199-3.

S.H. Yalkowsky, S.C. Valvani. Solubility and partitioning I: Solubility of nonelectrolytes in water. Journal of Pharmaceutical Sciences 69 (1980) 912-922. https://doi.org/10.1002/jps.2600690814.

S.H. Yalkowsky, S. Banerjee. Aqueous Solubility: Methods of Estimation for Organic Compounds. Marcel Dekker, Inc.: New York. (1992) p. 142. https://api.semanticscholar.org/CorpusID:92804093.

Y. Ran, N. Jain, S.H. Yalkowsky. Prediction of aqueous solubility of organic compounds by the General Solubility Equation (GSE). J. Chem. Inf. Comput. Sci. 41 (2001) 1208-1217. https://doi.org/10.1021/ci010287z.

N. Jain, G. Yang, S.G. Machatha, S.H. Yalkowsky. Estimation of the aqueous solubility of weak electrolytes. International Journal of Pharmaceutics 319 (2006) 169-171. https://doi.org/10.1016/j.ijpharm.2006.04.022.

S.H. Yalkowsky, Y. He, P. Jain. Handbook of Aqueous Solubility Data, Second Edition. CRC Press: Boca Raton, FL, (2010). https://doi.org/10.1201/EBK1439802458.

D. Alantari, S. Yalkowsky. Comments on prediction of the aqueous solubility using the general solubility equation (GSE) versus a genetic algorithm and a support vector machine model. Pharmaceutical Development and Technology 23 (2018) 739-740. https://doi.org/10.1080/10837450.2017.1321663.

M.H. Abraham. Scales of hydrogen bonding - their construction and application to physicochemical and biochemical processes. Chemical Society Reviews 22 (1993) 73-83. https://doi.org/10.1039/CS9932200073.

M.H. Abraham, J. Le. The correlation and prediction of the solubility of compounds in water using an amended solvation energy relationship. Journal of Pharmaceutical Sciences 88 (1999) 868-880. https://doi.org/10.1021/js9901007.

J.A. Platts, D. Butina, M.H. Abraham, A. Hersey. Estimation of molecular linear free energy relation descriptors using a group contribution approach. Journal of Chemical Information and Computer Sciences 39 (1999) 835-845. https://doi.org/10.1021/ci980339t.

L. Breiman. Random forests. Machine Learning 45 (2001) 5-32. https://doi.org/10.1023/A:1010933404324.

D.S. Palmer, N.M. O’Boyle, R.C. Glen, J.B.O. Mitchell. Random Forest models to predict aqueous solubility. Journal of Chemical Information and Modeling 47 (2007) 150-158. https://doi.org/10.1021/ci060164k.

W.P. Walters. What are our models really telling us? A practical tutorial on avoiding common mistakes when building predictive models, in Chemoinformatics for Drug Discovery. J. Bajorath (Ed.). John Wiley & Sons, Hoboken, NJ, 2014, pp. 1-31. https://doi.org/10.1002/9781118742785.ch1.

A. Liaw. Random Forests What, Why, and How. https://www.youtube.com/watch?v=XJnjlpW9w5A. (YouTube lecture) https://nyhackr.blob.core.windows.net/presentations/Random-Forests-What-Why-and-How_Andy_Liaw.pdf (Accessed 23 Nov 2022).

L.B. Kier. An index of molecular flexibility from kappa shape attributes. Quant. Struct.-Act. Relat. 8 (1989) 221-224. https://doi.org/10.1002/qsar.19890080307.

A. Avdeef. Do you know your r2? ADMET & DMPK 9 (2021) 69-74. https://doi.org/10.5599/admet.888.

C.A. Lipinski, F. Lombardo, B.W. Dominy, P.J. Feeney. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Advanced Drug Delivery Reviews 23 (1997) 3-25. https://doi.org/10.1016/s0169-409x(00)00129-0.

B.C. Doak, B. Over, F. Giordanetto, J. Kihlberg. Oral druggable space beyond the Rule of 5: insights from drugs and clinical candidates. Chemistry & Biology 21 (2014) 1115-1142. https://doi.org/10.1016/j.chembiol.2014.08.013.

D.A. DeGoey, H.-J. Chen, P.B. Cox, M. D. Wendt. Beyond the Rule of 5: Lessons Learned from AbbVie’s Drugs and Compound Collection. Journal of Medicinal Chemistry 61 (2018) 2636-2651. https://doi.org/10.1021/acs.jmedchem.7b00717.

G. Ermondi, M. Vallaro, G. Goetz, M. Shalaeva, G. Caron. Experimental lipophilicity for beyond Rule of 5 compounds. Future Drug. Discov. 1 (2019) https://doi.org/10.4155/fdd-2019-0002.

G. Caron, J. Kihlberg, G. Ermondi. Intramolecular hydrogen bonding: An opportunity for improved design in medicinal chemistry. Medicinal Research Reviews 39 (2019) 1707-1729. https://doi.org/10.1002/med.21562.

G. Ermondi, M. Vallaro, G. Goetz, M. Shalaeva, G. Caron. Updating the portfolio of physicochemical descriptors related to permeability in the beyond the rule of 5 chemical space. European Journal of Pharmaceutical Sciences 146 (2020) 105274. https://doi.org/10.1016/j.ejps.2020.105274.

G. Caron, V. Digiesi, S. Solaro, G. Ermondi. Flexibility in early drug discovery: focus on the beyond-Rule-of-5 chemical space. Drug Discovery Today 25 (2020) 621-627. https://doi.org/10.1016/j.drudis.2020.01.012.

G. Ermondi, V. Poongavanam, M. Vallaro, J. Kihlberg, G. Caron. Solubility prediction in the bRo5 chemical space: where are we right now? ADMET & DMPK 8 (2020) 207-214. https://doi.org/10.5599/admet.834.

D.G. Jiménez, M.R. Sebastiano, M. Vallaro, V. Mileo, D. Pizzirani, E. Moretti, G. Ermondi, G. Caron. Designing soluble PROTACs: strategies and preliminary guidelines. Journal of Medicinal Chemistry 65 (2022) 12639-12649. https://doi.org/10.1021/acs.jmedchem.2c00201.

A. Mullard. 2021 FDA drug approvals. The FDA approved 50 novel drugs in 2021, including the first KRAS inhibitor for cancer and the first anti-amyloid antibody for Alzheimer’s disease. Nature Reviews Drug Discovery 21 (2022) 83-88. https://doi.org/10.1038/d41573-022-00001-9.

L.D. Hughes, D.S. Palmer, F. Nigsch, J.B.O. Mitchell. Why are some properties more difficult to predict than others? A study of QSPR models of solubility, melting point, and log P. Journal of Chemical Information and Modeling 48 (2008) 220-232. https://doi.org/10.1021/ci700307p.

C.A. Lipinski. Drug-like properties and the causes of poor solubility and poor permeability. Journal of Pharmacological and Toxicological Methods 44 (2000) 235-249. https://doi.org/10.1016/s1056-8719(00)00107-6.

Published

21-08-2023 — Updated on 21-08-2023

How to Cite

Avdeef, A. (2023). Mechanistically transparent models for predicting aqueous solubility of rigid, slightly flexible, and very flexible drugs (MW<2000): Accuracy near that of random forest regression. ADMET and DMPK, 11(3), 317–330. https://doi.org/10.5599/admet.1879

Section

Reviews
