SH Muggleton
Measuring performance when positives are rare: relative advantage versus predictive accuracy - a biological case-study
Authors
Muggleton, SH; Bryant, CH; Srinivasan, A
Contributors
RL de Mántaras (Editor)
E Plaza (Editor)
Abstract
This paper presents a new method of measuring performance when positives are rare and investigates whether Chomsky-like grammar representations are useful for learning accurate, comprehensible predictors of members of biological sequence families. The positive-only learning framework of the Inductive Logic Programming (ILP) system CProgol is used to generate a grammar for recognising a class of proteins known as human neuropeptide precursors (NPPs). Performance is measured using both predictive accuracy and a new cost function, Relative Advantage (RA). The RA results show that searching for NPPs by using our best NPP predictor as a filter is more than 100 times more efficient than randomly selecting proteins for synthesis and testing them for biological activity. Predictive accuracy is not a good measure of performance for this domain because it does not discriminate well between NPP recognition models: despite covering varying numbers of (the rare) positives, all the models are awarded a similar (high) score by predictive accuracy because they all exclude most of the abundant negatives.
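The argument about predictive accuracy can be illustrated with a small numerical sketch. The abstract does not give the formula for RA, so the code below does not implement it. As a stand-in it uses a simple ratio, the hit rate among proteins accepted by a model divided by the hit rate of random selection, which is an assumption made here for illustration only. The confusion-matrix counts and the names `Confusion`, `accuracy`, and `hit_rate_ratio` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts for one model on one test set (all values hypothetical)."""
    tp: int  # positives the model accepts
    fp: int  # negatives the model accepts
    fn: int  # positives the model rejects
    tn: int  # negatives the model rejects

    @property
    def total(self) -> int:
        return self.tp + self.fp + self.fn + self.tn


def accuracy(c: Confusion) -> float:
    """Fraction of all examples classified correctly."""
    return (c.tp + c.tn) / c.total


def hit_rate_ratio(c: Confusion) -> float:
    """Hit rate among accepted examples divided by the base rate.

    Illustrative stand-in only; NOT the paper's Relative Advantage formula.
    """
    accepted = c.tp + c.fp
    if accepted == 0:
        return 0.0  # a model that accepts nothing gives no advantage
    base_rate = (c.tp + c.fn) / c.total
    return (c.tp / accepted) / base_rate


# Hypothetical test set: 100,000 proteins, of which only 50 are positives.
model_a = Confusion(tp=10, fp=20, fn=40, tn=99_930)  # covers 10 of 50 positives
model_b = Confusion(tp=40, fp=20, fn=10, tn=99_930)  # covers 40 of 50 positives

if __name__ == "__main__":
    for name, c in (("A", model_a), ("B", model_b)):
        print(f"model {name}: accuracy={accuracy(c):.4f}, "
              f"hit-rate ratio={hit_rate_ratio(c):.1f}")
```

With these made-up counts the two models differ in accuracy only in the fourth decimal place, because both reject nearly all of the abundant negatives, while the ratio differs by a factor of two.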
Citation
Muggleton, S., Bryant, C., & Srinivasan, A. (2000). Measuring performance when positives are rare: relative advantage versus predictive accuracy - a biological case-study. In R. de Mántaras, & E. Plaza (Eds.), Machine learning: ECML 2000: 11th European conference on machine learning, Barcelona, Catalonia, Spain, May 31-June 2 2000 (300-312)
Publication Date | Jan 1, 2000 |
---|---|
Deposit Date | Feb 17, 2009 |
Publicly Available Date | Feb 17, 2009 |
Publisher | Springer |
Pages | 300-312 |
Series Title | Lecture notes in computer science |
Series Number | 1810 |
Book Title | Machine learning: ECML 2000: 11th European conference on machine learning, Barcelona, Catalonia, Spain, May 31-June 2 2000 |
ISBN | 9783540676027 |
Keywords | inductive logic programming |
Publisher URL | https://doi.org/10.1007/3-540-45164-1_32 |
Additional Information | Paper originally presented at the 11th European Conference on Machine Learning, Barcelona, Catalonia, Spain, May 31 – June 2, 2000. |
Files
bryant_ecml2k.pdf (PDF, 217 Kb)