
NTU Management Review (臺大管理論叢), Vol. 27, No. 2S

we do not have to compute the distances between frequent patterns at all. In addition, the representative patterns retrieved by the SFC algorithm can be seen as the top-k most important patterns because the representative patterns are real patterns that represent the rest of the FU2Ps. The SFC algorithm also estimates the expected support of a U2 pattern. We elaborate the details in the next section.

3. The Proposed Method

In this section, we elaborate on the proposed method. First, we define some terms used throughout this paper. Then, we discuss the distance measure used to evaluate the distance between two FU2Ps. Next, we introduce and discuss the proposed SFC algorithm. Table 3 lists the symbols that are used throughout this paper.

Table 3 The Symbols

| Symbol | Description |
|---|---|
| ExProb(I_AS, T) | existential probability of an interval I_AS ∈ I_A |
| ExSupport(Pat) | expected support of a U2 pattern Pat |
| D_ExSup | expected support part of the distance measure |
| D_App | appearance part of the distance measure |
| w | weight used in the distance measure |
| ξ | multiplier for determining the number of initial clusters |
| δ | multiplier for determining whether to merge clusters |
| Dist_Sum(i) | sum of the distances between i and the other FU2Ps |
| a(i) | average of the distances between i and all other FU2Ps in the same cluster |
| b(i) | lowest average of the distances between i and all the FU2Ps in a cluster |
| AvgDin | average of the distances between the medoid and the other FU2Ps in the cluster |
| AvgDout | average of the distances between the new medoid and the FU2Ps in the merged cluster |
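As a rough illustration of the distance-based symbols in Table 3, the following sketch computes Dist_Sum(i), a(i), and b(i) from a precomputed distance matrix. It is not the SFC algorithm itself. The function names, the toy matrix `DIST`, and the labels `LABELS` are hypothetical, and the sketch assumes that b(i) ranges over clusters other than the one containing i, which the table does not state explicitly.

```python
from typing import Dict, List, Sequence, Tuple


def dist_sum(dist: Sequence[Sequence[float]], i: int) -> float:
    """Dist_Sum(i): sum of the distances between pattern i and all other patterns."""
    return sum(dist[i][j] for j in range(len(dist)) if j != i)


def a_and_b(dist: Sequence[Sequence[float]],
            labels: Sequence[int], i: int) -> Tuple[float, float]:
    """Return (a(i), b(i)) for pattern i.

    a(i): average distance from i to the other patterns in its own cluster.
    b(i): lowest average distance from i to the patterns of another cluster
          (assumption: the cluster containing i is excluded).
    """
    by_cluster: Dict[int, List[float]] = {}
    for j, label in enumerate(labels):
        if j != i:
            by_cluster.setdefault(label, []).append(dist[i][j])

    same = by_cluster.get(labels[i], [])
    a = sum(same) / len(same) if same else 0.0

    others = [sum(v) / len(v) for lab, v in by_cluster.items() if lab != labels[i]]
    b = min(others) if others else 0.0
    return a, b


# Toy data (made up): four patterns in two clusters.
DIST = [
    [0.0, 1.0, 4.0, 5.0],
    [1.0, 0.0, 3.0, 6.0],
    [4.0, 3.0, 0.0, 2.0],
    [5.0, 6.0, 2.0, 0.0],
]
LABELS = [0, 0, 1, 1]

print(dist_sum(DIST, 0))         # 10.0
print(a_and_b(DIST, LABELS, 0))  # (1.0, 4.5)
```

In this toy case, pattern 0 is close to its own cluster (a = 1.0) and far from the other one (b = 4.5).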

3.1 Preliminaries

We first give definitions of terms concerning the univariate uncertain data and frequent U2 patterns. These definitions are quoted from previous studies (Liu, 2012; Liu and Wang, 2013).

Definition 1. A transaction comprises one or more non-repeated attributes. An attribute is associated with an interval and a probability density function, which assigns a probability to each value in the interval. (Liu, 2012)
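Definition 1 can be pictured as a small data structure. The sketch below is only an illustration. The class name `UncertainAttribute`, the helper `uniform`, the attribute names, and the uniform density are all assumptions rather than part of the cited definitions. It also assumes that the probability of a sub-interval is obtained by integrating the density over it, approximated here with the midpoint rule.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class UncertainAttribute:
    """An attribute: an interval [low, high] plus a probability density function."""
    low: float
    high: float
    pdf: Callable[[float], float]

    def probability(self, a: float, b: float, steps: int = 1000) -> float:
        """Approximate the probability mass on [a, b] with the midpoint rule."""
        lo, hi = max(a, self.low), min(b, self.high)
        if hi <= lo:
            return 0.0
        width = (hi - lo) / steps
        return sum(self.pdf(lo + (k + 0.5) * width) for k in range(steps)) * width


# A transaction maps attribute names to attributes; dictionary keys are unique,
# which mirrors the "non-repeated attributes" requirement.
Transaction = Dict[str, UncertainAttribute]


def uniform(low: float, high: float) -> UncertainAttribute:
    """Hypothetical helper: an attribute with a uniform density on [low, high]."""
    density = 1.0 / (high - low)
    return UncertainAttribute(low, high, lambda x: density)


transaction: Transaction = {
    "temperature": uniform(20.0, 30.0),
    "humidity": uniform(40.0, 60.0),
}

# Probability that the temperature lies in the lower half of its interval.
half = transaction["temperature"].probability(20.0, 25.0)
print(round(half, 6))  # 0.5
```

Any density that integrates to one over the interval could replace the uniform one. The midpoint rule is used only to keep the sketch dependency-free.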

Definition 2. In a database, the base intervals are atomic sub-intervals formed by each