NTU Management Review (臺大管理論叢), Vol. 27, No. 2S, p. 37
we do not have to compute the distances between frequent patterns at all. In addition, the
representative patterns retrieved by the SFC algorithm can be seen as the top-k most
important patterns because the representative patterns are real patterns that represent the rest
of the FU2Ps. The SFC algorithm also estimates the expected support of a U2 pattern. We
elaborate on the details in the next section.
3. The Proposed Method
In this section, we elaborate on the proposed method. First, we define some terms used
throughout this paper. Then, we discuss the distance measure used to evaluate the distance
between two FU2Ps. Next, we introduce and discuss the proposed SFC algorithm. Table 3
lists the symbols used throughout this paper.
Table 3 The Symbols

Symbol            Description
ExProb(I_AS, T)   existential probability of an interval I_AS ∈ I_A
ExSupport(Pat)    expected support of a U2 pattern Pat
D_ExSup           expected support part of distance measure
D_App             appearance part of distance measure
w                 weight used in distance measure
ξ                 multiplier for determining the number of initial clusters
δ                 multiplier for determination to merge clusters or not
Dist_Sum(i)       sum of the distances between i and the other FU2Ps
a(i)              average of the distances between i and all other FU2Ps in the same cluster
b(i)              lowest average of the distances between i and all the FU2Ps in a cluster
AvgDin            average of the distances between the medoid and the other FU2Ps in the cluster
AvgDout           average of the distances between the new medoid and the FU2Ps in the merged cluster
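To make the symbols Dist_Sum(i), a(i), and b(i) from Table 3 concrete, the following Python sketch computes them from a precomputed distance matrix and a cluster assignment. It is only an illustration under stated assumptions: the function names, the toy distance matrix, and the cluster labels are hypothetical, and b(i) is assumed to range over clusters other than the one containing i, which Table 3 does not state explicitly.

```python
from typing import Dict, List, Sequence

Matrix = Sequence[Sequence[float]]


def dist_sum(dist: Matrix, i: int) -> float:
    """Dist_Sum(i): sum of the distances between pattern i and the other patterns."""
    return sum(d for j, d in enumerate(dist[i]) if j != i)


def a_value(dist: Matrix, labels: Sequence[int], i: int) -> float:
    """a(i): average distance between i and the other patterns in its own cluster."""
    same = [j for j, c in enumerate(labels) if c == labels[i] and j != i]
    if not same:
        return 0.0  # assumption: a singleton cluster yields a(i) = 0
    return sum(dist[i][j] for j in same) / len(same)


def b_value(dist: Matrix, labels: Sequence[int], i: int) -> float:
    """b(i): lowest average distance between i and the patterns of a cluster.

    Assumption: only clusters other than the one containing i are considered.
    """
    per_cluster: Dict[int, List[float]] = {}
    for j, c in enumerate(labels):
        if c != labels[i]:
            per_cluster.setdefault(c, []).append(dist[i][j])
    if not per_cluster:
        return 0.0  # assumption: with a single cluster there is nothing to compare
    return min(sum(v) / len(v) for v in per_cluster.values())


# Hypothetical symmetric distance matrix for four patterns in two clusters.
dist = [
    [0.0, 0.2, 0.8, 0.6],
    [0.2, 0.0, 0.7, 0.9],
    [0.8, 0.7, 0.0, 0.1],
    [0.6, 0.9, 0.1, 0.0],
]
labels = [0, 0, 1, 1]

total_0 = dist_sum(dist, 0)
a_0 = a_value(dist, labels, 0)
b_0 = b_value(dist, labels, 0)
```

In this toy example, pattern 0 is close to its cluster mate and far from the other cluster, so a(0) is small relative to b(0).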
3.1 Preliminaries
We first define the terms concerning univariate uncertain data and frequent
U2 patterns. These definitions are quoted from previous studies (Liu, 2012; Liu and Wang,
2013).
Definition 1. A transaction comprises one or more non-repeated attributes. An attribute
is associated with an interval and a probability density function, which assigns a probability
to each value in the interval. (Liu, 2012)
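As a rough illustration of Definition 1, the sketch below models a transaction in Python as a mapping from attribute names to (interval, probability density function) pairs. Everything here is an assumption made for illustration: the class name UncertainAttribute, the attribute names "temperature" and "humidity", the uniform density, and the midpoint-rule integration are not taken from the cited studies.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class UncertainAttribute:
    """An attribute with an interval [low, high] and a density over that interval."""
    low: float
    high: float
    pdf: Callable[[float], float]

    def probability(self, a: float, b: float, steps: int = 1000) -> float:
        """Approximate the probability mass on [a, b] with the midpoint rule."""
        a, b = max(a, self.low), min(b, self.high)  # clip to the attribute interval
        if a >= b:
            return 0.0
        width = (b - a) / steps
        return sum(self.pdf(a + (k + 0.5) * width) for k in range(steps)) * width


def uniform_pdf(low: float, high: float) -> Callable[[float], float]:
    """Hypothetical density: uniform over [low, high]."""
    return lambda x: 1.0 / (high - low) if low <= x <= high else 0.0


# A transaction maps attribute names to uncertain values; dictionary keys are
# unique, which mirrors the "non-repeated attributes" requirement.
Transaction = Dict[str, UncertainAttribute]

t1: Transaction = {
    "temperature": UncertainAttribute(20.0, 30.0, uniform_pdf(20.0, 30.0)),
    "humidity": UncertainAttribute(40.0, 60.0, uniform_pdf(40.0, 60.0)),
}

# Probability that the temperature value lies in the sub-interval [22, 25].
p = t1["temperature"].probability(22.0, 25.0)
```

With a uniform density over [20, 30], the sub-interval [22, 25] covers three tenths of the range, so p is approximately 0.3.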
Definition 2. In a database, the base intervals are atomic sub-intervals formed by each