Abstract—Mining high-dimensional business data is a popular and important problem. However, mining such data faces two challenges: 1) the curse of dimensionality and 2) the meaningfulness of similarity measures in high-dimensional spaces. This paper proposes a novel approach to overcome these problems, which builds a generalized multiple kernel machine (GMKM) on a special subspace created by kernel orthonormalized partial least squares (KOPLS). GMKM takes products of kernels, corresponding to a tensor product of feature spaces. This leads to a richer and much higher-dimensional feature representation; therefore, GMKM is powerful in identifying relevant features and their apposite kernel representations. KOPLS finds a low-dimensional representation of the data, which uncovers hidden information while respecting the intrinsic geometry of the data manifold. Our new system robustly overcomes the weaknesses of traditional multiple kernel machines and outperforms traditional classification systems.
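The product-of-kernels idea behind GMKM can be illustrated with a minimal sketch (this is not the authors' implementation): multiplying two base Gram matrices element-wise yields a valid kernel whose implicit feature space is the tensor product of the base feature spaces, and the resulting precomputed Gram matrix can be passed to a standard SVM. The kernel choices, parameters, and toy data below are illustrative assumptions only.

```python
# Minimal sketch of a product-of-kernels classifier (illustrative, not the paper's GMKM):
# the element-wise product of two Gram matrices is itself a valid kernel whose
# feature space is the tensor product of the base feature spaces.
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel, polynomial_kernel

rng = np.random.default_rng(0)
X_train = rng.standard_normal((100, 50))                     # toy high-dimensional data
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype(int)    # synthetic labels
X_test = rng.standard_normal((20, 50))

def product_kernel(A, B):
    # Element-wise product of an RBF and a polynomial Gram matrix;
    # kernel parameters are assumptions, not taken from the paper.
    return rbf_kernel(A, B, gamma=0.01) * polynomial_kernel(A, B, degree=2)

K_train = product_kernel(X_train, X_train)   # (n_train, n_train) Gram matrix
K_test = product_kernel(X_test, X_train)     # (n_test, n_train) cross-kernel

clf = SVC(kernel="precomputed", C=1.0).fit(K_train, y_train)
pred = clf.predict(K_test)
```

In the paper's pipeline, such a classifier would operate on the low-dimensional KOPLS subspace rather than on the raw inputs used here.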
Index Terms—Data mining, multiple kernel learning, dimensionality reduction, support vector machine.
Shian-Chang Huang and Tung-Kuang Wu are with the National Changhua University of Education, Changhua, Taiwan (e-mail: shhuang@cc.ncue.edu.tw, tkwu@im.ncue.edu.tw).
Nan-Yu Wang is with Ta Hwa University of Science and Technology, Hsinchu, Taiwan (e-mail: nanyu@tust.edu.tw).
Cite: Shian-Chang Huang, Nan-Yu Wang, and Tung-Kuang Wu, "High Dimensional Data Mining Systems by Kernel Orthonormalized Partial Least Square Analysis," International Journal of Future Computer and Communication, vol. 4, no. 5, pp. 364-367, 2015.