Streamwise feature selection on big data using noise resistant rough functional dependency

Document Type: Research Article

Author

Department of Computer Science, University of Guilan, Rasht, Iran

Abstract

Online Streaming Features (OSF) is a data streaming scenario in which the number of instances is fixed while the feature space grows over time. This paper presents a rough sets-based online feature selection algorithm for OSF. The proposed method, called OSFS-NRFS, consists of two major steps: (1) an online noise-resistant relevance analysis that discards irrelevant features and (2) an online noise-resistant redundancy analysis that eliminates redundant features. To demonstrate its efficiency and accuracy, the proposed algorithm is compared with two state-of-the-art rough sets-based OSFS algorithms on eight high-dimensional data sets. The experiments show that the proposed algorithm is faster and achieves better classification results than the existing methods.

Keywords