Document name: 并行关联规则挖掘算法研究.doc (Research on Parallel Association Rule Mining Algorithms)

Format: doc   Size: 3,303 KB   Pages: 238

Uploaded by: 459972402, 2018/8/17. File size: 3.23 MB

Abstract
In the field of data mining, association rule mining algorithms are used to describe
the relationships and correlations among the mined data, and they are also widely
applied in business, scientific research, logistics, and other fields. In the big-data
era, the scale of the mass data to be processed in data mining has expanded from
one dimension to multiple dimensions. In mass data processing, traditional
association rule mining algorithms are stretched thin in computing power and
parallelization efficiency. At the same time, the rapid development of high
performance computing and [...] computing brings new opportunities and challenges
for parallel association rule mining algorithms. The thesis improves the
traditional parallel association rule mining algorithm and then ports the
improved algorithm to experimental environments for high performance computing
and distributed computing. The test results show that the proposed method
can reduce the I/O and time overhead of mining frequent itemsets in the two
tested environments. The innovations of the thesis are summarized as follows:
(1) To address the problems of traditional improved FP-growth algorithms,
such as multiple accesses to multiple databases and the high cost of
inter-processor communication, an improved parallel FP-growth (IPF) algorithm is
proposed in this paper. The algorithm uses an existing, efficient tree
reconstruction mechanism to rebuild the initialized FP-tree, which decreases the
I/O overhead of mining frequent itemsets. The algorithm needs only one access to
the database, which decreases the time overhead.
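
The abstract does not spell out the reconstruction mechanism that IPF uses, so the following is only a minimal Python sketch of the general idea of building an FP-tree with a single pass over the database, not the thesis's algorithm. It assumes the pass stores each transaction in a provisional prefix tree in lexicographic item order while counting item frequencies, and that the tree is then rebuilt in frequency-descending order from the stored paths alone. All names (`Node`, `insert_path`, `iter_paths`, `build_fp_tree_single_scan`) are hypothetical.

```python
from collections import Counter


class Node:
    """A prefix-tree node: an item, a pass-through count, and child links."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def insert_path(root, items, count=1):
    """Insert an ordered item sequence, adding `count` to every node on it."""
    node = root
    for item in items:
        child = node.children.get(item)
        if child is None:
            child = Node(item, node)
            node.children[item] = child
        child.count += count
        node = child


def iter_paths(node, prefix=()):
    """Yield (path, n) where n transactions end exactly at that path."""
    for child in node.children.values():
        path = prefix + (child.item,)
        ending_here = child.count - sum(c.count for c in child.children.values())
        if ending_here > 0:
            yield path, ending_here
        yield from iter_paths(child, path)


def build_fp_tree_single_scan(transactions, min_support):
    # The only pass over the data: count items and store each transaction
    # in a provisional tree using a fixed (lexicographic) item order.
    freq = Counter()
    provisional = Node(None)
    for transaction in transactions:
        items = sorted(set(transaction))
        freq.update(items)
        insert_path(provisional, items)

    # Reconstruction step (assumed, not taken from the thesis): rebuild the
    # tree from the stored paths, dropping infrequent items and ordering the
    # rest by descending frequency. The database is not read again.
    frequent = {item for item, c in freq.items() if c >= min_support}
    root = Node(None)
    for path, count in iter_paths(provisional):
        kept = sorted((i for i in path if i in frequent),
                      key=lambda i: (-freq[i], i))
        insert_path(root, kept, count)
    return root, freq


root, freq = build_fp_tree_single_scan([["a", "b"], ["b"], ["b", "c"]], 1)
```

In this small example the provisional tree has two branches (one starting at `a`, one at `b`), while the rebuilt tree has a single branch under `b`, the most frequent item, so common prefixes are shared as in a conventional FP-tree.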
(2) Traditional parallel algorithms for mining frequent itemsets are usually
based on a single MPI (Message Passing Interface) programming model and rarely
achieve ideal performance. The thesis therefore proposes a parallel FP-growth
algorithm based on MPI/OpenMP hybrid programming (HPC FP-growth, HFP). Combined
with high performance computing clusters, the algorithm transplants the IPF
algorithm in high perfor