Outlier detection method based on improved DPC algorithm and centrifugal factor
Xia, Hao; Zhou, Yu; Li, Jiguang; Yue, Xuezhen; Li, Jichun
Authors
Hao Xia
Yu Zhou
Jiguang Li
Xuezhen Yue
Jichun Li
Abstract
Outlier detection aims to identify data points that deviate significantly from normal patterns. However, existing outlier detection methods based on k-nearest neighbors often struggle with challenges such as increasing outlier counts and cluster formation issues. Selecting an appropriate nearest-neighbor parameter is a further challenge, as researchers commonly evaluate detection accuracy across various k values. To enhance the accuracy and robustness of outlier detection, in this paper we propose an outlier detection method based on an improved density peaks clustering (DPC) algorithm and a centrifugal factor. First, we use k-nearest neighbors, k-reciprocal nearest neighbors, and a Gaussian kernel function to determine the local density of samples, which addresses scenarios where the DPC algorithm struggles to identify cluster centers in sparse clusters. Second, to reduce the computational complexity of the DPC algorithm, we screen the samples based on their mutual nearest neighbor counts and select cluster centers from the screened samples. Non-central points are then assigned to clusters using k-nearest neighbors, k-reciprocal nearest neighbors, and reverse k-nearest neighbors. Third, we compute the centrifugal factor of each sample as the ratio of the local kernel density at its cluster center to the local kernel density of the sample; the magnitude of this factor reflects the outlier degree of the sample. Finally, we propose a method for choosing the nearest neighbor parameter, k. To comprehensively evaluate the outlier detection performance of the proposed algorithm, we conduct experiments on 12 complex synthetic datasets and 25 public real-world datasets, comparing the results with 12 state-of-the-art outlier detection methods.
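The density-ratio idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and the exact formulas of the paper are not reproduced here. The sketch assumes that local density is a sum of Gaussian kernel values over the k nearest neighbors only (the k-reciprocal neighbor term is omitted), that the kernel bandwidth is the mean k-nearest-neighbor distance, that cluster labels are supplied by the caller rather than produced by the improved DPC step, and that each cluster center is simply the densest member of its cluster. The function names `knn_gaussian_density` and `centrifugal_factor` are hypothetical.

```python
import numpy as np


def knn_gaussian_density(X, k):
    """Gaussian-kernel local density over each point's k nearest neighbours.

    Assumption: the bandwidth is the mean k-NN distance of the dataset,
    and the data contain no duplicate points.
    """
    diffs = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    # Column 0 of each sorted row is the point itself (distance 0), so skip it.
    idx = np.argsort(dist, axis=1)[:, 1:k + 1]
    knn_dist = np.take_along_axis(dist, idx, axis=1)
    sigma = knn_dist.mean()
    return np.exp(-(knn_dist ** 2) / (2.0 * sigma ** 2)).sum(axis=1)


def centrifugal_factor(X, k, labels):
    """Ratio of the density at a cluster's centre to the density of each member.

    Assumption: `labels` is a given cluster assignment, and the centre of a
    cluster is taken to be its densest member.
    """
    density = knn_gaussian_density(X, k)
    tiny = np.finfo(float).tiny  # guards against division by an underflowed density
    factor = np.empty(len(X))
    for c in np.unique(labels):
        members = np.flatnonzero(labels == c)
        center_density = density[members].max()
        factor[members] = center_density / np.maximum(density[members], tiny)
    return factor


# Toy data: two tight 3x3 grids plus one isolated point assigned to cluster 0.
grid = np.array([[i * 0.1, j * 0.1] for i in range(3) for j in range(3)])
X = np.vstack([grid, grid + 5.0, [[2.0, 2.0]]])
labels = np.array([0] * 9 + [1] * 9 + [0])

factor = centrifugal_factor(X, k=3, labels=labels)
most_outlying = int(np.argmax(factor))  # index of the sample with the largest factor
```

Under these assumptions a cluster center has a factor of 1 and the factor grows as a sample's density falls relative to its center, so on the toy data the isolated point (the last row of `X`) receives the largest value.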
Citation
Xia, H., Zhou, Y., Li, J., Yue, X., & Li, J. (2024). Outlier detection method based on improved DPC algorithm and centrifugal factor. Information Sciences, 682, 121255. https://doi.org/10.1016/j.ins.2024.121255
Journal Article Type | Article |
---|---|
Acceptance Date | Jul 24, 2024 |
Online Publication Date | Jul 27, 2024 |
Publication Date | 2024-11 |
Deposit Date | Aug 23, 2024 |
Publicly Available Date | Jul 28, 2026 |
Journal | Information Sciences |
Print ISSN | 0020-0255 |
Publisher | Elsevier |
Peer Reviewed | Peer Reviewed |
Volume | 682 |
Pages | 121255 |
Series ISSN | 0020-0255 |
DOI | https://doi.org/10.1016/j.ins.2024.121255 |
Additional Information | This article is maintained by: Elsevier; Article Title: Outlier detection method based on improved DPC algorithm and centrifugal factor; Journal Title: Information Sciences; CrossRef DOI link to publisher maintained version: https://doi.org/10.1016/j.ins.2024.121255; Content Type: article; Copyright: © 2024 Elsevier Inc. All rights are reserved, including those for text and data mining, AI training, and similar technologies. |
Files
This file is under embargo until Jul 28, 2026 due to copyright reasons.
Contact J.Li56@salford.ac.uk to request a copy for personal use.