為頻繁單變量不確定樣式產生摘要
48
Algorithm:
Summarize FU2Ps by Clustering (SFC)
Input:
A full set of FU2Ps
Parameters
w
,
ξ
, and
δ
Output:
A summary of the full set of FU2Ps
1. Generate an initial set of representative FU2Ps
RFU2Ps
;
2. Perform the modified
k
-medoid algorithm with
RFU2Ps
3. and get a set of clusters
SC
;
4.
repeat
5. Examine
SC
for a set of unsatisfied clusters
USC
;
6.
if
USC
is null
then
7. break;
8.
else
9.
for each
cluster
C
i
in
USC
do
10. Generate an initial set of representative FU2Ps for
C
i
CRFU2Ps
;
11. Perform the modified
k
-medoid algorithm on
C
i
with
CRFU2Ps
12. and get a set of clusters
CSC
;
13. Update
SC
with
CSC
;
14.
end for
15.
end if
16.
end repeat
17. Update
SC
by merging the clusters in
SC
;
18. Return the summary generated from
SC
;
Figure 3 The SFC Algorithm
4. The Experiments
The experiments are divided into two sets. In the first set of experiments, we evaluated
the SFC algorithm under various parameter settings. Different values of
w
,
ξ
, and
δ
, comprise
a set of parameter settings. The runtime for clustering and the quality of summarization
varied with different parameter settings, which indicates a good clustering result can be
derived by searching for a suitable parameter setting. In the second set of experiments, we
compared the representative FUPs derived by the SFC algorithms with MFU2Ps. The
MFU2Ps also summarize FU2Ps; however, as we will show in Section 4.3, MFU2Ps provide
a lower quality of summarization compared to FU2Ps.
In the experiments, we used a synthetic dataset and two real datasets. All the
experiments were performed on an IBM compatible PC with an Intel Core i7 CPU (3.40
GHz) and 16GB main memory, running Windows 7 Professional (64bit). The algorithms