Details
Document type: Conference paper
English title: Improved DBSCAN algorithm based on relative mass of the data field
Authors: Zhu, Daoheng[1]; Li, Zhiqiang[1]; Hu, Pengpeng[1]; Su, Qianxin[1]; Liu, Run[1]
Affiliation: [1] School of Electronics and Information Engineering, Guangdong Ocean University, Zhanjiang, 524088, China
Conference proceedings: International Conference on Computer Graphics, Artificial Intelligence, and Data Processing, ICCAID 2021
Conference dates: December 24, 2021 - December 26, 2021
Conference location: Harbin, China
Language: English
Keywords: Cluster computing - Clustering algorithms - Computational efficiency - Efficiency - Image processing - Parameter estimation
Abstract: The DBSCAN algorithm can discover clusters of arbitrary shape, but appropriate clustering parameters are difficult to determine in advance. In this study, the data field is introduced into the number field space and a relative mass (RM) calculation method for the data field is proposed. The RM algorithm computes the N points of largest mass in the data set and uses them as the initial points of clustering. The optimized influence factor sigma is then used to calculate the force range radius, which optimizes the field radius parameter so that appropriate clustering parameters can be selected. In addition, the improved algorithm is implemented for parallel computation on a distributed cluster to improve efficiency on large data sets. Finally, the effectiveness of the improved algorithm is verified on three publicly available data sets, and the efficiency of parallel computation is verified on three large data sets. The results show that (1) the improved DBSCAN algorithm effectively solves the problem of selecting clustering parameters, and (2) the maximum speedup ratio of parallel computation reaches 2.12 when the size of the large data set is increased from 30,000 to 150,000 and the number of nodes involved in the computation is increased from one to five, and the average operating efficiency of the improved algorithm is 32.45% higher than that of the original algorithm. © 2022 SPIE.
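Illustrative sketch (not from the paper): the abstract describes ranking points by mass in a data field and deriving a radius from the influence factor sigma, but it does not give the RM formula. The Python sketch below therefore substitutes a plain Gaussian data-field potential with unit mass per point as a stand-in score. The function names, the Gaussian form, and the 3*sigma/sqrt(2) radius are assumptions borrowed from the general data-field literature, and may differ from what the authors actually use.

```python
import math


def potential_scores(points, sigma):
    """Gaussian data-field potential at each point.

    Assumption: every point has unit mass. The paper's relative mass (RM)
    formula is not given in the abstract, so this is only a stand-in score.
    """
    scores = []
    for p in points:
        total = 0.0
        for q in points:
            d = math.dist(p, q)  # Euclidean distance (Python 3.8+)
            total += math.exp(-((d / sigma) ** 2))
        scores.append(total)
    return scores


def top_n_seeds(points, sigma, n):
    """Indices of the n highest-scoring points, usable as initial cluster seeds."""
    scores = potential_scores(points, sigma)
    order = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return order[:n]


def influence_radius(sigma):
    """Assumed influence range of a Gaussian field: 3 * sigma / sqrt(2)."""
    return 3.0 * sigma / math.sqrt(2.0)


if __name__ == "__main__":
    # Three tightly packed points and one distant outlier (hypothetical data).
    data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
    print(top_n_seeds(data, sigma=1.0, n=2))
    print(influence_radius(1.0))
```

In this toy data set the outlier receives the lowest score, so it would not be chosen as a seed; the radius returned by `influence_radius` could then play the role of the neighborhood radius in a standard DBSCAN run. The double loop is quadratic in the number of points, which is one reason a distributed implementation is attractive for large data sets.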