Table of Contents Table of Contents
Previous Page  45 /342 Next Page
Information
Show Menu
Previous Page 45 /342 Next Page
Page Background

臺大管理論叢

27

卷第

2S

45

(6)

The range of silhouette is between -1 and 1. A positive silhouette means the FU2P

i

is

more similar to the FU2Ps in the same cluster than the FU2Ps in other clusters. In contrast, a

negative silhouette means

i

is more similar to the FU2Ps in other clusters, i.e.,

i

is

inappropriately assigned to the current cluster. If silhouette value is equal to 0,

i

is on the

border of two clusters.

For each cluster derived in Step 2, we compute the silhouette of each FU2P in the

cluster. If one or more FU2Ps in the cluster have a negative silhouette, which often happens

in a large cluster, this cluster is an

unsatisfied cluster

. A large cluster means the number of

FU2Ps in the cluster is large. This implies the medoid of the cluster cannot well represent all

of the FU2Ps in the cluster. Therefore, for each unsatisfied cluster, we apply the same

procedure described in Steps 1 and 2 to decompose the unsatisfied cluster into several

clusters. Note that the number of initial medoids is calculated by multiplying the number of

FU2Ps in the unsatisfied cluster with

ξ

.

If there exists no unsatisfied cluster, the SFC algorithm tries to merge similar clusters. It

is possible that we might decompose too much because the number of initial clusters is too

large. We discuss how to merge clusters in the next section.

3.3.4 Step 5: Merge Similar Clusters

The SFC algorithm uses a recursive process to merge clusters. At each iteration, the

distance between each pair of medoids is computed. The pair of clusters which has the

smallest distance between medoids is considered for merging. For each cluster of the pair,

we compute the average of the distances between the medoid and the other FU2Ps in the

cluster. We denote the weighted average of the two average distances derived from the two

clusters as

AvgDin

, i.e.,

, where

N

1

(or

N

2

) and

AvgD

1

(or

AvgD

2

) is the number of FUPs and the average distance in the first (or second)

cluster, respectively. Then, the medoid of the merged cluster formed by merging two clusters

is identified. The average of the distances between the new medoid and the FU2Ps in the

merged cluster is computed, and is denoted as

AvgDout

. The two clusters are merged if the

following two criteria are met: 1)

AvgDout

<

AvgDin

×

δ

; and 2) there exists no FU2P with a

negative silhouette in the merged cluster.