Details
基于MIP和改进模糊K-Means算法的大数据聚类设计 (cited by: 4)
Design for Large Data Clustering Based on MIP and Improved Fuzzy K-Means Algorithm
Document type: Journal article
Chinese title: 基于MIP和改进模糊K-Means算法的大数据聚类设计
English title: Design for Large Data Clustering Based on MIP and Improved Fuzzy K-Means Algorithm
Author: Chen Sihui (陈思慧)[1]
Affiliation: [1] Network and Education Technology Center, Guangdong Ocean University, Zhanjiang, Guangdong 524088
Year: 2014
Volume: 22
Issue: 4
Pages: 1270
Chinese journal name: 计算机测量与控制
Foreign-language journal name: Computer Measurement & Control
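Indexed in: CSTPCD, 北大核心 2011 (Peking University Core Journals, 2011 edition), 北大核心 (Peking University Core Journals)
Language: Chinese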
Chinese keywords: 模糊K均值; 聚类; 大数据; 距离
Foreign-language keywords: fuzzy K-Means; clustering; large data; distance
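Chinese abstract: To overcome the low clustering efficiency and long running time of the classic fuzzy K-Means algorithm when clustering big data, a big data clustering method based on a hierarchical MPI parallel programming model and an improved fuzzy K-Means algorithm is proposed. First, multiple layers of MasterNode nodes are introduced to design an improved hierarchical MPI parallel programming model. Then, inter-class and intra-class distances are introduced to obtain a way of computing the optimal number of clusters, and an improved fuzzy K-means clustering algorithm is designed. SlaveNode nodes run the improved fuzzy K-means algorithm in parallel to cluster data subsets, and the MasterNode nodes at each layer then aggregate and further process the results. Simulation experiments show that the method clusters big data fairly accurately, with accuracy about 5.6% higher on average than the classic fuzzy K-means algorithm. It makes up for the low accuracy and low efficiency of the classic fuzzy K-Means method on big data and shows a clear advantage.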
Foreign-language abstract: To overcome the low clustering efficiency and long running time of the classic fuzzy K-Means algorithm, a method based on MPI (Message Passing Interface) and an improved fuzzy K-Means algorithm is proposed. First, an improved hierarchical MPI model is introduced by adding multiple layers of MasterNodes. Then, a method for computing the optimal number of clusters is obtained from the intra-cluster and inter-cluster distances, and an improved fuzzy K-Means algorithm is derived. The SlaveNodes run the improved fuzzy K-Means algorithm concurrently to cluster multiple data subsets, then transfer the clustering results to the MasterNodes for further processing. The simulation experiment shows that the method clusters large data accurately, with accuracy about 5.6% higher on average than the classic fuzzy K-Means algorithm. It therefore overcomes the low accuracy and low efficiency of the classic method and shows a clear advantage.
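Illustrative sketch (not from the article): the abstract describes SlaveNodes clustering data subsets with fuzzy K-Means and MasterNodes aggregating the results, but it does not give the improved algorithm, the optimal-cluster-number formula, or the merge rule. The Python below is therefore only a minimal sketch under stated assumptions: it uses the standard (unimproved) fuzzy K-Means updates, a plain loop as a sequential stand-in for MPI, a farthest-point initialization, and a merge step that re-clusters the collected local centers. The names `fuzzy_kmeans` and `cluster_in_chunks` are hypothetical.

```python
import math


def fuzzy_kmeans(points, k, m=2.0, iters=100, tol=1e-6):
    """Standard fuzzy K-Means (fuzzy c-means); NOT the article's improved variant."""
    dim = len(points[0])
    # Assumption: deterministic farthest-point initialization.
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(list(far))
    memberships = []
    for _ in range(iters):
        # Membership update: u_ij = 1 / sum_c (d_ij / d_ic)^(2 / (m - 1)).
        memberships = []
        for p in points:
            dists = [max(math.dist(p, c), 1e-12) for c in centers]
            memberships.append(
                [1.0 / sum((dj / dc) ** (2.0 / (m - 1.0)) for dc in dists)
                 for dj in dists]
            )
        # Center update: mean of the points weighted by u_ij^m.
        new_centers = []
        for j in range(k):
            weights = [row[j] ** m for row in memberships]
            total = sum(weights)
            new_centers.append(
                [sum(w * p[d] for w, p in zip(weights, points)) / total
                 for d in range(dim)]
            )
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    # memberships were computed from the centers before the final update.
    return centers, memberships


def cluster_in_chunks(points, k, n_workers=4):
    """Sequential stand-in for the SlaveNode/MasterNode split (an assumption)."""
    chunks = [points[i::n_workers] for i in range(n_workers)]
    # "SlaveNode" step: each data subset is clustered independently.
    local_centers = []
    for chunk in chunks:
        centers, _ = fuzzy_kmeans(chunk, k)
        local_centers.extend(tuple(c) for c in centers)
    # "MasterNode" step (assumed rule): re-cluster the collected local centers.
    merged, _ = fuzzy_kmeans(local_centers, k)
    return merged


if __name__ == "__main__":
    # Two well-separated synthetic grids of points.
    blob_a = [(x / 10, y / 10) for x in range(5) for y in range(5)]
    blob_b = [(10 + x / 10, 10 + y / 10) for x in range(5) for y in range(5)]
    print(sorted(cluster_in_chunks(blob_a + blob_b, k=2)))
```

In a real deployment the loop over `chunks` would be replaced by message passing between processes; the sketch only shows how local results can be combined at a coordinating node.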