Integrating Bayesian networks and Simpson's paradox in data mining

Freitas, AA; McGarry, K; Correa, ES

Contributors

F Russo (Editor)

J Williamson (Editor)

Abstract

This paper proposes to integrate two very different kinds of data mining methods: the construction of Bayesian networks from data and the detection of occurrences of Simpson’s paradox. The former aims at discovering potentially causal knowledge in the data, whilst the latter aims at detecting surprising patterns in the data. By integrating these two kinds of methods we can hopefully discover patterns which are more likely to be useful to the user, a challenging data mining goal which is under-explored in the literature. The proposed integration method involves two approaches. The first uses the detection of occurrences of Simpson’s paradox as a preprocessing step for a more effective construction of Bayesian networks, whilst the second uses the construction of a Bayesian network from data as a preprocessing step for the detection of occurrences of Simpson’s paradox.
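To make "detecting an occurrence of Simpson’s paradox" concrete, the following is a minimal sketch and not the authors' algorithm. It assumes the data are summarised as success counts for two groups within each stratum of a third attribute, and it reports a paradox when one group has the higher success rate in every stratum but the lower rate once the strata are pooled. The function names, the `counts` layout, and the numbers are illustrative assumptions, not taken from the paper.

```python
def rate(successes, trials):
    """Proportion of successes among trials."""
    return successes / trials


def shows_simpsons_paradox(counts, a, b):
    """Return True if the comparison of groups a and b reverses on pooling.

    counts maps (group, stratum) -> (successes, trials); every stratum
    must contain both groups. (Hypothetical layout, chosen for this sketch.)
    """
    strata = sorted({stratum for (_, stratum) in counts})

    # Compare the two groups inside each stratum.
    a_better_everywhere = all(
        rate(*counts[(a, s)]) > rate(*counts[(b, s)]) for s in strata
    )
    b_better_everywhere = all(
        rate(*counts[(a, s)]) < rate(*counts[(b, s)]) for s in strata
    )

    # Compare the two groups after pooling all strata.
    pooled = {}
    for g in (a, b):
        successes = sum(counts[(g, s)][0] for s in strata)
        trials = sum(counts[(g, s)][1] for s in strata)
        pooled[g] = rate(successes, trials)

    return (a_better_everywhere and pooled[a] < pooled[b]) or (
        b_better_everywhere and pooled[a] > pooled[b]
    )


# Illustrative counts: group A does better in both strata, worse overall.
paradox_counts = {
    ("A", "small"): (81, 87),
    ("A", "large"): (192, 263),
    ("B", "small"): (234, 270),
    ("B", "large"): (55, 80),
}

# Illustrative counts with no reversal: A does better in each stratum and overall.
consistent_counts = {
    ("A", "small"): (90, 100),
    ("A", "large"): (80, 100),
    ("B", "small"): (50, 100),
    ("B", "large"): (40, 100),
}

print(shows_simpsons_paradox(paradox_counts, "A", "B"))
print(shows_simpsons_paradox(consistent_counts, "A", "B"))
```

In the first table the reversal arises because the two groups are distributed very unevenly across the strata, which is the kind of surprising pattern the abstract refers to. Which attribute to stratify on is exactly the choice a learned network structure might help inform, though how the paper does this is not specified here.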

Citation

Freitas, A., McGarry, K., & Correa, E. (2007). Integrating Bayesian networks and Simpson's paradox in data mining. In F. Russo & J. Williamson (Eds.), Causality and Probability in the Sciences (pp. 43-62). United Kingdom: College Publications.

Publication Date: Jan 1, 2007
Deposit Date: Feb 10, 2017
Pages: 43-62
Series Title: Texts in Philosophy
Book Title: Causality and Probability in the Sciences
ISBN: 1904987354
Related Public URLs: https://kar.kent.ac.uk/2511/