Details
Original language | English |
---|---|
Number of pages | 16 |
Journal | Neural processing letters |
Volume | 53 |
Publication status | Published - Feb. 2021 |
Published externally | Yes |
Abstract
Clustering is an essential data analysis technique and has been studied extensively over the last decades. Previous studies have shown that data representation and data structure information are two critical factors for improving clustering performance, and these have given rise to two important lines of research. The first attempts to learn representative features for clustering, in particular with deep neural networks. The second exploits the geometric structure information within the data. Although both lines have achieved promising performance on many clustering tasks, few efforts have been dedicated to combining them in a unified deep clustering framework, which is the research gap we aim to bridge in this work. In this paper, we propose a novel approach, Manifold regularized Deep Embedded Clustering (MDEC), to address this gap. It simultaneously models the data-generating distribution, cluster assignment consistency, and the geometric structure of the data in a unified framework. The proposed method can be optimized with mini-batch stochastic gradient descent and back-propagation. We evaluate MDEC on three real-world datasets (USPS, REUTERS-10K, and MNIST); the experimental results show that our model outperforms the baseline models and achieves state-of-the-art performance.
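The abstract names three modelling goals but gives no formulas, so the following is only an illustrative sketch of how such terms might be combined into a single objective. It is not the authors' formulation. It assumes a mean-squared reconstruction error for the data-generating part, the Student-t soft assignments and sharpened targets of standard Deep Embedded Clustering for assignment consistency, and a k-nearest-neighbour graph penalty for geometric structure. The function names, the weights `gamma` and `lam`, and the linear stand-in for the autoencoder are all hypothetical.

```python
import numpy as np

def soft_assign(z, centers, alpha=1.0):
    """Student-t soft cluster assignments q_ij (standard DEC form, assumed here)."""
    d2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)

def target_distribution(q):
    """Sharpened targets: square q, divide by cluster frequency, renormalise rows."""
    w = q ** 2 / q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)

def knn_affinity(x, k=3):
    """Symmetric 0/1 k-nearest-neighbour graph on the inputs (an assumed choice)."""
    d2 = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)            # a point is not its own neighbour
    idx = np.argsort(d2, axis=1)[:, :k]     # k closest points per row
    w = np.zeros_like(d2)
    rows = np.repeat(np.arange(len(x)), k)
    w[rows, idx.ravel()] = 1.0
    return np.maximum(w, w.T)               # symmetrise

def combined_loss(x, x_rec, z, centers, w, gamma=0.1, lam=0.01):
    """Reconstruction + gamma * KL(P || Q) + lam * graph smoothness (hypothetical weights)."""
    n = len(x)
    rec = np.mean((x - x_rec) ** 2)
    q = soft_assign(z, centers)
    p = target_distribution(q)
    kl = np.sum(p * np.log(p / q)) / n
    dz2 = ((z[:, None, :] - z[None, :, :]) ** 2).sum(axis=2)
    manifold = np.sum(w * dz2) / (2.0 * n)  # penalises neighbours that embed far apart
    return rec + gamma * kl + lam * manifold, (rec, kl, manifold)

# Toy data; a random linear map stands in for a trained autoencoder.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 5))
enc = rng.normal(size=(5, 2))
z = x @ enc                                 # "encoder" output
x_rec = z @ np.linalg.pinv(enc)             # "decoder" output
centers = z[:3].copy()                      # three initial cluster centres
w = knn_affinity(x, k=3)
q = soft_assign(z, centers)
total, (rec, kl, manifold) = combined_loss(x, x_rec, z, centers, w)
print(total, rec, kl, manifold)
```

In a real setting the three terms would be evaluated on mini-batches and differentiated by an automatic differentiation framework, in line with the optimization procedure the abstract mentions. NumPy is used here only to keep the arithmetic visible.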
ASJC Scopus subject areas
- Computer Science (all)
- Software
- Neuroscience (all)
- General Neuroscience
- Computer Science (all)
- Computer Networks and Communications
- Computer Science (all)
- Artificial Intelligence
Citation
In: Neural processing letters, Vol. 53, 02.2021.
Publication: Contribution to journal › Article › Research › Peer-reviewed
TY - JOUR
T1 - Exploring Implicit and Explicit Geometrical Structure of Data for Deep Embedded Clustering
AU - Zhu, Xiaofei
AU - Do, Khoi Duy
AU - Guo, Jiafeng
AU - Xu, Jun
AU - Dietze, Stefan
N1 - Funding Information: The work was partially supported by the National Natural Science Foundation of China (No. 61722211), the Federal Ministry of Education and Research (No. 01LE1806A), the Natural Science Foundation of Chongqing (No. cstc2017jcyjBX0059), and the Beijing Academy of Artificial Intelligence (No. BAAI2019ZD0306).
PY - 2021/2
Y1 - 2021/2
KW - Clustering
KW - Deep neural networks
KW - Manifold constraint
KW - Stacked autoencoder
UR - http://www.scopus.com/inward/record.url?scp=85092801329&partnerID=8YFLogxK
U2 - 10.1007/s11063-020-10375-9
DO - 10.1007/s11063-020-10375-9
M3 - Article
AN - SCOPUS:85092801329
VL - 53
JO - Neural processing letters
JF - Neural processing letters
SN - 1370-4621
ER -