臺大管理論叢 NTU Management Review VOL.27 NO.2S

臺大管理論叢

第

卷第

期

uniform distribution.

Mining frequent patterns is a good way to discover the intrinsic properties and hidden

regularity of a univariate uncertain database. In addition, frequent patterns can be utilized to

deal with classification (Thabtah, 2007) and cluster analysis (Wang, Wang, Yang, and Yu,

2002). Hereafter, a

U2 pattern

refers to a pattern composed of univariate uncertain attributes,

and a

frequent U2 pattern

indicates a U2 pattern that frequently appears. We hereafter

abbreviate a frequent U2 pattern as

FU2P

Mining FU2Ps has been addressed in (Liu, 2012). However, the number of FU2Ps

derived from univariate uncertain data is usually large, except for cases of mining with a

very high minimum support. Returning the full set of FU2Ps to users may frustrate them

because it is impossible for the users to manually go over such a large collection of patterns.

Therefore, deriving a concise representation of the FU2Ps is required. To the best of our

knowledge, in the literature, only one proposal attempts to get a concise representation of the

FU2Ps: Liu (2014) proposed mining maximal frequent patterns from univariate uncertain

data. A maximal frequent pattern is a frequent pattern which has no frequent supersets. A

maximal frequent pattern of univariate uncertain data will be hereafter referred to as a

MFUP. The left column of Table 2 presents the first set of FU2Ps, which contains six FU2Ps.

For example, [

:[15, 30],

:[104, 178],

:[55, 66]] represents a FU2P, where the intervals

of attributes

, and

are [15, 30], [104, 178], and [55, 66], respectively. Only two

FU2Ps are MFU2Ps in the first set, namely, [

:[15, 30],

:[104, 178],

:[55, 66]] and

[

:[20, 35],

:[203, 251],

:[44, 51]]. The second and third FU2Ps are subsets of the first

FU2P; while the fifth and sixth FU2Ps are subsets of the fourth FU2P. In addition, the full set

of FU2Ps can be enumerated from the set of MFU2Ps by using the Apriori property, i.e., a

non-empty subset of a frequent pattern is also a frequent pattern. Therefore, we can return

only the set of MFU2Ps to users instead of the full set of FU2Ps. However, the set of

MFU2Ps may not concisely represent the full set of FU2Ps in some univariate uncertain

datasets. For instance, the right column of Table 2 shows the second set of FU2Ps, where

each FU2P is not a subset of the other five FU2Ps. Therefore, all FU2Ps are MFU2Ps. In

fact, the second and third FU2Ps are only slightly different from the first FU2P in the

intervals of

and

respectively. We can treat the three FU2Ps as a group. The same

observation is made for the remaining three FU2Ps, while the fifth and sixth FU2Ps are

slightly different from the fourth FU2P in the intervals of

and

respectively. The three

FU2Ps can form another group. For each group, we generate an informative summary to

represent the FU2Ps in the group. We select the first FU2P as the representative FU2P for the