臺大管理論叢 NTU Management Review VOL.27 NO.2S

Generating Summaries for Frequent Univariate Uncertain Patterns

members. Therefore, we can derive the quality index of a set of MFU2Ps by using equations

(7) and (8). Tables 9, 10, and 11 list the performance comparisons between the SFC

algorithm and the MFU2Ps derived from the three datasets, respectively. In these tables,

#FU2Ps denotes the number of FU2Ps and #Clusters denotes the number of clusters (for the MFU2Ps column, this is

the number of MFU2Ps). SFC (best) lists the statistics for the setting with the best summarization quality and

SFC (worst) lists the statistics for the setting with the worst summarization quality. The total distance, number

of clusters, and quality index listed in Table 9 are obtained by setting the algorithm's three parameters, in order, to 0.9, 0.01, and 1.0 for SFC (best) and to 0.0, 0.15, and 1.0 for SFC (worst). In Table 10 and Table 11, the same three parameters are set to 1.0, 0.01, and 1.0 for SFC (best); for SFC (worst), they are set to 0.0, 0.07, and 1.0 in Table 10 and to 0.0, 0.12, and 1.0 in Table 11. In both real datasets, the

summarization quality of the SFC algorithm is much better than the summarization quality of

the MFU2Ps even in the worst case. The number of MFU2Ps in each real dataset is nearly equal to

the number of FU2Ps. In the synthetic dataset, the best-case summarization quality of SFC is much better

than that of the MFU2Ps. Although the worst case performs worse than

the MFU2Ps, most of the parameter settings still outperform the MFU2Ps. To generate a summary, the set of FU2Ps has to be retrieved first. Tables 9, 10, and 11 report two runtimes: the first is the time required to retrieve the FU2Ps (for the SFC algorithm) or the MFU2Ps, and the second is the time required to generate the summary from the FU2Ps. The runtime for retrieving the FU2Ps and then generating the summary is roughly the same as the runtime for generating the MFU2Ps.
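The runtime claim can be checked with simple arithmetic on the Table 9 figures. The sketch below is illustrative only. It assumes that 121.36 s is the FU2P retrieval time shared by both SFC settings, that 132.15 s is the MFU2P retrieval time, and that 9.81 s and 12.56 s are the summary-generation times for SFC (best) and SFC (worst). The constant and function names are hypothetical.

```python
# Runtimes in seconds, read from Table 9 (synthetic dataset).
# Assumption: 121.36 s is FU2P retrieval for SFC, 132.15 s is MFU2P retrieval,
# and 9.81 s / 12.56 s are summary generation for SFC (best) / SFC (worst).
RETRIEVE_FU2PS = 121.36
RETRIEVE_MFU2PS = 132.15
SUMMARIZE = {"SFC (best)": 9.81, "SFC (worst)": 12.56}


def total_sfc_runtime(setting: str) -> float:
    """Retrieval time plus summary-generation time for one SFC setting."""
    return RETRIEVE_FU2PS + SUMMARIZE[setting]


for setting in SUMMARIZE:
    total = total_sfc_runtime(setting)
    ratio = total / RETRIEVE_MFU2PS
    print(f"{setting}: {total:.2f} s, {ratio:.3f} x MFU2P runtime")
```

Under these assumptions the two SFC totals are 131.17 s and 133.92 s, both within about 1.5% of the 132.15 s needed for the MFU2Ps.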

In the second-to-last paragraph of the previous section, we introduced a representative FU2P in the DY2009 dataset, [temperature 26°C to 30°C, relative humidity 67% to 76%]; all 94 of its cluster members are MFU2Ps. If we presented MFU2Ps to users, they would have to check all 94 FU2Ps even though these FU2Ps look very similar. When we summarize the FU2Ps instead, the users only need to see one representative FU2P.

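Table 9 The Comparison on the Synthetic Dataset

| | SFC (best) | SFC (worst) | MFU2Ps |
|---|---|---|---|
| #FU2Ps | 12409 | 12409 | 12409 |
| Total distance | 612.91 | 1859.86 | 753.46 |
| #Clusters | 220 | 3005 | 1085 |
| Quality index | 134840.20 | 5588876.69 | 817504.10 |
| Runtime (s), retrieving FU2Ps or MFU2Ps | 121.36 | 121.36 | 132.15 |
| Runtime (s), generating summary | 9.81 | 12.56 | - |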
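Equations (7) and (8) are not reproduced in this part of the text, but the figures in Table 9 are consistent with the quality index being the product of the total distance and the number of clusters, with a lower value indicating a better summary. The following sketch checks that reading against the three columns. The product form is an assumption inferred from the reported numbers, and the names `TABLE_9`, `quality_index`, and `relative_gap` are hypothetical.

```python
# Rows of Table 9: (total distance, number of clusters, reported quality index).
TABLE_9 = {
    "SFC (best)": (612.91, 220, 134840.20),
    "SFC (worst)": (1859.86, 3005, 5588876.69),
    "MFU2Ps": (753.46, 1085, 817504.10),
}


def quality_index(total_distance: float, n_clusters: int) -> float:
    """Assumed form: total distance times number of clusters (lower is better)."""
    return total_distance * n_clusters


def relative_gap(computed: float, reported: float) -> float:
    """Relative difference between a recomputed and a reported value."""
    return abs(computed - reported) / reported


for name, (dist, k, reported) in TABLE_9.items():
    computed = quality_index(dist, k)
    print(f"{name}: computed {computed:.2f}, reported {reported:.2f}, "
          f"relative gap {relative_gap(computed, reported):.1e}")
```

The recomputed values agree with the reported ones to within the rounding of the total distance to two decimals, and they reproduce the ordering described in the text: SFC (best) is better than the MFU2Ps, which in turn are better than SFC (worst).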