海洋资源大数据系统中缺陷多步预测方法 (cited by: 1)
Multiple Steps for Predicting Number of Defects in Marine Resources Big Data System
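Document type: Journal article
Chinese title: 海洋资源大数据系统中缺陷多步预测方法
English title: Multiple Steps for Predicting Number of Defects in Marine Resources Big Data System
Authors: Li Zhao (李昭)[1,2]; Peng Xiaohong (彭小红)[1,2]; Xie Shiyi (谢仕义)[1,2]; Liang Chunlin (梁春林)[1,2]; Zhao Yi (赵一)[1,2]; Cai Lihua (蔡莉华)[1,2]
Affiliations: [1] South China Sea Resources Big Data Center, Southern Marine Science and Engineering Guangdong Laboratory (Zhanjiang), Zhanjiang, Guangdong 524013; [2] College of Mathematics and Computer Science, Guangdong Ocean University, Zhanjiang, Guangdong 524088
Year: 2021
Volume: 31
Issue: 6
Pages: 81
Journal (Chinese name): 计算机技术与发展
Journal (English name): Computer Technology and Development
Indexed in: CSTPCD
Funding: Guangdong Province Science and Technology Plan Project (ZJW-2019-08, ZJW-2019-06, 013S19006-007).
Language: Chinese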
Chinese keywords: 软件缺陷预测; SMOTE技术; 支持向量机; 回归建模; 多步预测
English keywords: software defect prediction; SMOTE; support vector machine; regression modeling; multi-step prediction
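Abstract (translated from the Chinese): Software defect prediction is an effective way to improve system quality and a key factor in how efficiently defects in software components are detected and repaired. The marine resources big data system is a typical software-intensive system. In its software quality assurance, two problems stand out: the training data for defect prediction are imbalanced, and a single regression technique gives insufficient support for predicting the number of defects in defective components. The paper improves component defect-count prediction in two ways. First, it uses SMOTE to build a balanced sample data set, oversampling the defective components in the imbalanced data set so that the proportions of the different classes are taken into account and prediction accuracy improves. Second, it proposes a multi-step defect-count prediction method that classifies first and regresses second: a support vector machine classifies the components, components classified as defect-free are filtered out, and a regression technique builds the defect-count prediction model, which further improves accuracy. Experimental evaluation on open-source data sets shows that the multi-step method is more accurate than using regression alone and has high overall effectiveness and practicality.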
English abstract: Software defect prediction is an effective way to improve system quality and a key factor affecting the efficiency of defect detection and repair. The marine resources big data system is a typical software-intensive system. In its software quality assurance, defect prediction faces imbalanced training data, and a single regression technique offers insufficient support for predicting the number of defects in defective components; the prediction of component defect numbers is therefore improved in two respects. First, a balanced sample data set is constructed with SMOTE, which oversamples the defective components to balance the proportions of samples in different categories and so improve prediction accuracy. Second, a multi-step defect prediction approach is built to further improve accuracy: it first uses an SVM to classify components, then filters out the components classified as non-defective, and finally applies regression techniques to construct a prediction model of the component defect number. An evaluation on open-source data sets shows that the multi-step prediction is more accurate than prediction using regression alone, with high overall effectiveness and practicality.
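Illustrative sketch (not from the paper): The two ideas in the abstract, SMOTE-style oversampling of defective components and classify-then-regress prediction, can be outlined in a few lines of standard-library Python. This is only a sketch under stated assumptions. The record does not give the paper's features, parameters, or models, so the function names `smote` and `two_stage_predict`, the toy data, and the lambda stand-ins for the SVM classifier and the regression model are all hypothetical.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create synthetic minority-class points by interpolating between a
    minority sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def two_stage_predict(components, classify, regress):
    """Classify first; only components flagged as defective reach the
    regressor. Components classified as defect-free get a count of 0."""
    return [regress(x) if classify(x) else 0.0 for x in components]


# Toy feature vectors for defective components (hypothetical data).
defective = [[10.0, 3.0], [12.0, 4.0], [11.0, 5.0]]
extra = smote(defective, n_synthetic=4, k=2)

# Hypothetical stand-ins for a trained SVM classifier and regression model.
is_defective = lambda x: x[0] > 8.0
defect_count = lambda x: 0.5 * x[1]
preds = two_stage_predict([[2.0, 1.0], [11.0, 4.0]], is_defective, defect_count)
```

In practice the stand-ins would be replaced by models trained on the balanced data set; the point of the sketch is only the order of operations: oversample, classify, filter, then regress.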