Publications – Andreas Karwath

2009

Schulz, Hannes; Kersting, Kristian; Karwath, Andreas

ILP, the Blind, and the Elephant: Euclidean Embedding of Co-proven Queries Conference

Inductive Logic Programming, 19th International Conference, ILP 2009, Springer Verlag, Berlin Heidelberg, Germany, 2009, ISBN: 978-3-642-13839-3.

Tags: cheminformatics, dimensionality reduction, inductive logic programming, relational learning, scientific knowledge, visualization

2008

Karwath, Andreas; Kersting, Kristian; Landwehr, Niels

Boosting Relational Sequence Alignments Conference

The 8th IEEE International Conference on Data Mining, ICDM 2008, IEEE, 2008, ISBN: 978-0-7695-3502-9.

Tags: inductive logic programming, machine learning, relational learning, scientific knowledge

Kersting, Kristian; De Raedt, Luc; Gutmann, Bernd; Karwath, Andreas; Landwehr, Niels

Relational Sequence Learning Book Chapter

In: Probabilistic Inductive Logic Programming - Theory and Applications, vol. 4911, pp. 28-55, Springer Verlag, Berlin Heidelberg, Germany, 2008, ISBN: 978-3-540-78651-1.

Tags: inductive logic programming, machine learning, relational learning, scientific knowledge

2007

Karwath, Andreas; Kersting, Kristian

Relational Sequence Alignments and Logos Conference

Inductive Logic Programming, 16th International Conference, ILP 2006, vol. 4455, Lecture Notes in Computer Science, Springer Verlag, Berlin Heidelberg, Germany, 2007, ISBN: 978-3-540-73846-6.

Tags: bioinformatics, inductive logic programming, relational learning, scientific knowledge

King, Ross D.; Karwath, Andreas; Clare, Amanda; Dehaspe, Luc

Logic and the Automatic Acquisition of Scientific Knowledge: An Application to Functional Genomics Conference

Computational Discovery of Scientific Knowledge, Introduction, Techniques, and Applications in Environmental and Life Sciences, vol. 4660, Lecture Notes in Computer Science, Springer Verlag, Berlin Heidelberg, Germany, 2007, ISBN: 978-3-540-73919-7.

Tags: bioinformatics, data mining, inductive logic programming, machine learning, relational learning, scientific knowledge

2006

Clare, Amanda; Karwath, Andreas; Ougham, Helen; King, Ross D.

Functional bioinformatics for Arabidopsis thaliana Journal Article

In: Bioinformatics, vol. 22, no. 9, pp. 1130-1136, 2006.

Tags: bioinformatics, data mining, inductive logic programming, machine learning, relational learning, scientific knowledge

2002

Karwath, Andreas; King, Ross D.

Homology Induction: the use of machine learning to improve sequence similarity searches Journal Article

In: BMC Bioinformatics, vol. 3, no. 1, 2002.

Tags: bioinformatics, data mining, inductive logic programming, machine learning, relational learning

@article{karwath02a,

title = {Homology Induction: the use of machine learning to improve sequence similarity searches},

author = {Andreas Karwath and Ross D. King},

url = {http://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-3-11},

doi = {10.1186/1471-2105-3-11},

year  = {2002},

date = {2002-04-23},

journal = {BMC Bioinformatics},

volume = {3},

number = {1},

abstract = {Background

The inference of homology between proteins is a key problem in molecular biology. The current best approaches only identify ~50% of homologies (with a false positive rate set at 1/1000).



Results

We present Homology Induction (HI), a new approach to inferring homology. HI uses machine learning to bootstrap from standard sequence similarity search methods. First a standard method is run; then HI learns rules which are true for sequences of high similarity to the target (assumed homologues) and not true for general sequences. These rules are then used to discriminate sequences in the twilight zone. To learn the rules, HI describes the sequences in a novel way based on a bioinformatic knowledge base and the machine learning method of inductive logic programming. To evaluate HI we used the PDB40D benchmark, which lists sequences of known homology but low sequence similarity. We compared the HI methodology with PSI-BLAST alone and found HI performed significantly better. In addition, Receiver Operating Characteristic (ROC) curve analysis showed that these improvements were robust for all reasonable error costs. The predictive homology rules learnt by HI can be interpreted biologically to provide insight into conserved features of homologous protein families.



Conclusions

HI is a new technique for the detection of remote protein homology – a central bioinformatic problem. HI with PSI-BLAST is shown to outperform PSI-BLAST for all error costs. It is expected that similar improvements would be obtained using HI with any sequence similarity method.

},

keywords = {bioinformatics, data mining, inductive logic programming, machine learning, relational learning},

pubstate = {published},

tppubtype = {article}

}

Karwath, Andreas

Large Logical Databases and their Applications to Molecular Biology PhD Thesis

University of Wales, Aberystwyth, 2002.

Tags: bioinformatics, data mining, inductive logic programming, machine learning, relational learning, scientific knowledge

2001

Karwath, Andreas; King, Ross D.

An automated ILP server in the field of bioinformatics Conference

The Eleventh International Conference on Inductive Logic Programming, ILP 2001, vol. 2157, Lecture Notes in Computer Science, Springer Verlag, Berlin Heidelberg, Germany, 2001, ISBN: 978-3-540-42538-0.

Tags: bioinformatics, data mining, inductive logic programming, machine learning, relational learning

King, Ross D.; Karwath, Andreas; Clare, Amanda; Dehaspe, Luc

The utility of different representations of protein sequence for predicting functional class Journal Article

In: Bioinformatics, vol. 17, no. 5, pp. 445-454, 2001.

Tags: bioinformatics, data mining, inductive logic programming, relational learning, scientific knowledge

2000

King, Ross D.; Karwath, Andreas; Clare, Amanda; Dehaspe, Luc

Accurate prediction of protein functional class from sequence in the Mycobacterium tuberculosis and Escherichia coli genomes using data mining Journal Article

In: Yeast (Comparative and Functional Genomics), vol. 17, pp. 283-293, 2000.

Tags: bioinformatics, data mining, inductive logic programming, relational learning, scientific knowledge

King, Ross D.; Karwath, Andreas; Clare, Amanda; Dehaspe, Luc

Genome scale prediction of protein functional class from sequence using data mining Conference

The Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2000, The Association for Computing Machinery, New York, USA, 2000.

Tags: bioinformatics, data mining, inductive logic programming, relational learning