Locality Sensitive Hash Aggregated Nonlinear Neighbourhood Matrix Factorization for Online Sparse Big Data Analysis

Li, Zixuan; Li, Hao; Li, Kenli; Wu, Fan; Chen, Lydia; Li, Keqin

Abstract:Matrix factorization (MF) can extract the low-rank features and integrate the information of the data manifold distribution from high-dimensional data, which can consider the nonlinear neighbourhood information. Thus, MF has drawn wide attention for low-rank analysis of sparse big data, e.g., Collaborative Filtering (CF) Recommender Systems, Social Networks, and Quality of Service. However, the following two problems exist: 1) huge computational overhead for the construction of the Graph Similarity Matrix (GSM), and 2) huge memory overhead for the intermediate GSM. Therefore, GSM-based MF, e.g., kernel MF, graph regularized MF, etc., cannot be directly applied to the low-rank analysis of sparse big data on cloud and edge platforms. To solve this intractable problem for sparse big data analysis, we propose Locality Sensitive Hashing (LSH) aggregated MF (LSH-MF), which can solve the following problems: 1) The proposed probabilistic projection strategy of LSH-MF can avoid the construction of the GSM. Furthermore, LSH-MF can satisfy the requirement for the accurate projection of sparse big data. 2) To run LSH-MF for fine-grained parallelization and online learning on GPUs, we also propose CULSH-MF, which works on CUDA parallelization. Experimental results show that CULSH-MF can not only reduce the computational time and memory overhead but also obtain higher accuracy. Compared with deep learning models, CULSH-MF can not only save training time but also achieve the same accuracy performance.

Subjects:	Distributed, Parallel, and Cluster Computing (cs.DC)
Cite as:	arXiv:2111.11682 [cs.DC]
	(or arXiv:2111.11682v1 [cs.DC] for this version)
	https://doi.org/10.48550/arXiv.2111.11682

Computer Science > Distributed, Parallel, and Cluster Computing

Title:Locality Sensitive Hash Aggregated Nonlinear Neighbourhood Matrix Factorization for Online Sparse Big Data Analysis

Submission history

Access Paper:

References & Citations

DBLP - CS Bibliography

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators