Measuring performance when positives are rare: relative advantage versus predictive accuracy - a biological case-study
Authors: Muggleton, SH; Bryant, CH; Srinivasan, A
Contributors: RL de Mántaras (Editor); E Plaza (Editor)
Abstract
This paper presents a new method of measuring performance when positives are rare and investigates whether Chomsky-like grammar representations are useful for learning accurate comprehensible predictors of members of biological sequence families. The positive-only learning framework of the Inductive Logic Programming (ILP) system CProgol is used to generate a grammar for recognising a class of proteins known as human neuropeptide precursors (NPPs). Performance is measured using both predictive accuracy and a new cost function, Relative Advantage (RA). The RA results show that searching for NPPs by using our best NPP predictor as a filter is more than 100 times more efficient than randomly selecting proteins for synthesis and testing them for biological activity. Predictive accuracy is not a good measure of performance for this domain because it does not discriminate well between NPP recognition models: despite covering varying numbers of (the rare) positives, all the models are awarded a similar (high) score by predictive accuracy because they all exclude most of the abundant negatives.
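The abstract's point about predictive accuracy can be illustrated with a small numerical sketch. Everything in it is hypothetical: the population size, the three models, and their counts are invented for illustration and are not taken from the paper. The `enrichment_over_random` ratio (hit rate among predicted positives divided by the base rate of positives) is only a simple stand-in for the idea of being "more efficient than random selection"; it is an assumption here and should not be read as the paper's definition of Relative Advantage.

```python
def predictive_accuracy(tp, fp, n_pos, n_neg):
    """Fraction of all examples that are classified correctly."""
    tn = n_neg - fp  # negatives correctly excluded by the model
    return (tp + tn) / (n_pos + n_neg)


def enrichment_over_random(tp, fp, n_pos, n_neg):
    """Hit rate among predicted positives divided by the base rate.

    Illustrative stand-in only (an assumption of this sketch),
    not the paper's definition of Relative Advantage.
    """
    precision = tp / (tp + fp)
    prevalence = n_pos / (n_pos + n_neg)
    return precision / prevalence


# Hypothetical population: 10 positives among 100,000 examples.
N_POS, N_NEG = 10, 99_990

# Hypothetical models: (positives covered, negatives wrongly admitted).
models = {"A": (2, 20), "B": (5, 20), "C": (9, 20)}

for name, (tp, fp) in models.items():
    acc = predictive_accuracy(tp, fp, N_POS, N_NEG)
    enr = enrichment_over_random(tp, fp, N_POS, N_NEG)
    print(f"model {name}: accuracy={acc:.5f}  enrichment={enr:.0f}x")
```

Under these made-up counts, the three models differ in accuracy only in the fourth decimal place, because the correctly excluded negatives dominate the score, while the enrichment ratio separates them clearly.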
Field | Value |
---|---|
Presentation Conference Type | Conference Paper (published) |
Publication Date | Jan 1, 2000 |
Deposit Date | Feb 17, 2009 |
Publicly Available Date | Feb 17, 2009 |
Publisher | Springer |
Pages | 300-312 |
Series Title | Lecture notes in computer science |
Series Number | 1810 |
Book Title | Machine learning: ECML 2000: 11th European conference on machine learning, Barcelona, Catalonia, Spain, May 31-June 2 2000 |
ISBN | 9783540676027 |
Keywords | inductive logic programming |
Publisher URL | https://doi.org/10.1007/3-540-45164-1_32 |
Additional Information | Paper originally presented at the 11th European Conference on Machine Learning, Barcelona, Catalonia, Spain, May 31 – June 2, 2000, and published in its proceedings. |
Files: bryant_ecml2k.pdf (PDF, 217 Kb)