publications – andreas karwath

2022

Wehr, Matthias M.; Sarang, Satinder S.; Rooseboom, Martijn; Boogaard, Peter J.; Karwath, Andreas; Escher, Sylvia E.

RespiraTox – Development of a QSAR model to predict human respiratory irritants Journal Article

In: Regulatory Toxicology and Pharmacology, vol. 128, pp. 105089, 2022.

Tags: cheminformatics, machine learning, QSAR

2020

Escher, S E; Mangelsdorf, I; Hoffmann-Doerr, S; Partosch, F; Karwath, Andreas; Schroeder, K; Zapf, A; Batke, M

Time extrapolation in regulatory risk assessment: The impact of study differences on the extrapolation factors Journal Article

In: Regulatory Toxicology and Pharmacology, vol. 112, pp. 104584, 2020, ISSN: 0273-2300.

Tags: cheminformatics, QSAR

2014

Gütlein, Martin; Karwath, Andreas; Kramer, Stefan

CheS-Mapper 2.0 for visual validation of (Q)SAR models Journal Article

In: J. Cheminformatics, vol. 6, no. 1, pp. 41, 2014.

Tags: cheminformatics, data mining, graph mining, validation, visualization

@article{gutlein2014,

title = {CheS-Mapper 2.0 for visual validation of (Q)SAR models},

author = {Martin Gütlein and Andreas Karwath and Stefan Kramer},

url = {http://dx.doi.org/10.1186/s13321-014-0041-7},

doi = {10.1186/s13321-014-0041-7},

year  = {2014},

date = {2014-09-23},

journal = {J. Cheminformatics},

volume = {6},

number = {1},

pages = {41},

abstract = {Background



Sound statistical validation is important to evaluate and compare the overall performance of (Q)SAR models. However, classical validation does not support the user in better understanding the properties of the model or the underlying data. Even though, a number of visualization tools for analyzing (Q)SAR information in small molecule datasets exist, integrated visualization methods that allow the investigation of model validation results are still lacking.



Results



We propose visual validation, as an approach for the graphical inspection of (Q)SAR model validation results. The approach applies the 3D viewer CheS-Mapper, an open-source application for the exploration of small molecules in virtual 3D space. The present work describes the new functionalities in CheS-Mapper 2.0, that facilitate the analysis of (Q)SAR information and allows the visual validation of (Q)SAR models. The tool enables the comparison of model predictions to the actual activity in feature space. The approach is generic: It is model-independent and can handle physico-chemical and structural input features as well as quantitative and qualitative endpoints.



Conclusions



Visual validation with CheS-Mapper enables analyzing (Q)SAR information in the data and indicates how this information is employed by the (Q)SAR model. It reveals, if the endpoint is modeled too specific or too generic and highlights common properties of misclassified compounds. Moreover, the researcher can use CheS-Mapper to inspect how the (Q)SAR model predicts activity cliffs. The CheS-Mapper software is freely available at http://ches-mapper.org.



Graphical abstract



Comparing actual and predicted activity values with CheS-Mapper.},

keywords = {cheminformatics, data mining, graph mining, validation, visualization},

pubstate = {published},

tppubtype = {article}

}

2013

Gütlein, Martin; Helma, Christoph; Karwath, Andreas; Kramer, Stefan

A Large-Scale Empirical Evaluation of Cross-Validation and External Test Set Validation in (Q)SAR Journal Article

In: Molecular Informatics, vol. 32, no. 5-6, pp. 516-528, 2013.

Tags: cheminformatics, crossvalidation, external validation, QSAR, validation

@article{guetlein2013,

title = {A Large-Scale Empirical Evaluation of Cross-Validation and External Test Set Validation in (Q)SAR},

author = {Martin Gütlein and Christoph Helma and Andreas Karwath and Stefan Kramer},

url = {http://onlinelibrary.wiley.com/doi/10.1002/minf.201200134/abstract},

doi = {10.1002/minf.201200134},

year  = {2013},

date = {2013-10-14},

urldate = {2013-10-14},

journal = {Molecular Informatics},

volume = {32},

number = {5-6},

pages = {516-528},

abstract = {(Q)SAR model validation is essential to ensure the quality of inferred models and to indicate future model predictivity on unseen compounds. Proper validation is also one of the requirements of regulatory authorities in order to accept the (Q)SAR model, and to approve its use in real world scenarios as alternative testing method. However, at the same time, the question of how to validate a (Q)SAR model, in particular whether to employ variants of cross-validation or external test set validation, is still under discussion. In this paper, we empirically compare a k-fold cross-validation with external test set validation. To this end we introduce a workflow allowing to realistically simulate the common problem setting of building predictive models for relatively small datasets. The workflow allows to apply the built and validated models on large amounts of unseen data, and to compare the performance of the different validation approaches. The experimental results indicate that cross-validation produces higher performant (Q)SAR models than external test set validation, reduces the variance of the results, while at the same time underestimates the performance on unseen compounds. The experimental results reported in this paper suggest that, contrary to current conception in the community, cross-validation may play a significant role in evaluating the predictivity of (Q)SAR models.},

keywords = {cheminformatics, crossvalidation, external validation, QSAR, validation},

pubstate = {published},

tppubtype = {article}

}

2012

Seeland, Madeleine; Karwath, Andreas; Kramer, Stefan

A structural cluster kernel for learning on graphs Conference

The 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2012, ACM, New York, NY, USA, 2012, ISBN: 978-1-4503-1462-6.

Tags: cheminformatics, clustering, data mining, kernels, QSAR, support vector machines

Gütlein, Martin; Karwath, Andreas; Kramer, Stefan

CheS-Mapper - Chemical Space Mapping and Visualization in 3D Journal Article

In: J. Cheminformatics, vol. 4, pp. 7, 2012.

Tags: cheminformatics, clustering, dimensionality reduction, QSAR, visualization

2009

Schulz, Hannes; Kersting, Kristian; Karwath, Andreas

ILP, the Blind, and the Elephant: Euclidean Embedding of Co-proven Queries Conference

Inductive Logic Programming, 19th International Conference, ILP 2009, Springer-Verlag, Berlin Heidelberg, Germany, 2009, ISBN: 978-3-642-13839-3.

Tags: cheminformatics, dimensionality reduction, inductive logic programming, relational learning, scientific knowledge, visualization

2006

Karwath, Andreas; De Raedt, Luc

SMIREP: Predicting Chemical Activity from SMILES Journal Article

In: Journal of Chemical Information and Modeling, vol. 46, no. 6, pp. 2432-2444, 2006.

Tags: cheminformatics, graph mining, machine learning, QSAR, relational learning, scientific knowledge

@article{karwath06c,

title = {SMIREP: Predicting Chemical Activity from SMILES},

author = {Andreas Karwath and De Raedt, Luc},

url = {http://pubs.acs.org/doi/abs/10.1021/ci060159g},

doi = {10.1021/ci060159g},

year  = {2006},

date = {2006-10-12},

journal = {Journal of Chemical Information and Modeling},

volume = {46},

number = {6},

pages = {2432-2444},

abstract = {Most approaches to structure-activity-relationship (SAR) prediction proceed in two steps. In the first step, a typically large set of fingerprints, or fragments of interest, is constructed (either by hand or by some recent data mining techniques). In the second step, machine learning techniques are applied to obtain a predictive model. The result is often not only a highly accurate but also hard to interpret model. In this paper, we demonstrate the capabilities of a novel SAR algorithm, SMIREP, which tightly integrates the fragment and model generation steps and which yields simple models in the form of a small set of IF-THEN rules. These rules contain SMILES fragments, which are easy to understand to the computational chemist. SMIREP combines ideas from the well-known IREP rule learner with a novel fragmentation algorithm for SMILES strings. SMIREP has been evaluated on three problems: the prediction of binding activities for the estrogen receptor (Environmental Protection Agency's (EPA's) Distributed Structure-Searchable Toxicity (DSSTox) National Center for Toxicological Research estrogen receptor (NCTRER) Database), the prediction of mutagenicity using the carcinogenic potency database (CPDB), and the prediction of biodegradability on a subset of the Environmental Fate Database (EFDB). In these applications, SMIREP has the advantage of producing easily interpretable rules while having predictive accuracies that are comparable to those of alternative state-of-the-art techniques.},

keywords = {cheminformatics, graph mining, machine learning, QSAR, relational learning, scientific knowledge},

pubstate = {published},

tppubtype = {article}

}

Karwath, Andreas; Kersting, Kristian

Relational Sequence Alignments Conference

Proc. The 4th International Workshop on Mining and Learning with Graphs, MLG 2006, edited by Thomas Gärtner, Gemma C. Garriga and Thorsten Meinl, September 2006, (workshop).

Tags: bioinformatics, cheminformatics, relational learning, scientific knowledge

2004

Bringmann, Björn; Karwath, Andreas

Frequent SMILES Miscellaneous

Lernen, Wissensentdeckung und Adaptivität, Workshop GI Fachgruppe Maschinelles Lernen, part of LWA, 2004, (Berlin, Germany).

Tags: cheminformatics, graph mining, machine learning

Karwath, Andreas; De Raedt, Luc

Predictive Graph Mining Conference

The International Workshop on Mining Graphs, Trees and Sequences, MGTS 2004, 2004, (workshop).

Tags: cheminformatics, graph mining, machine learning, QSAR

Karwath, Andreas; De Raedt, Luc

Predictive Graph Mining Conference

The 7th International Conference of Discovery Science, DS 2004, vol. 3245, Lecture Notes in Artificial Intelligence, Springer-Verlag, Berlin Heidelberg, Germany, 2004, ISBN: 978-3-540-23357-2.

Tags: cheminformatics, graph mining, machine learning, QSAR