Grouping of Perfumes

Ten nominally similar perfumes for use in a disinfectant were assessed as follows:

Ten small measured samples of the disinfectant, differing only in respect of the perfume, were placed in ten identical containers. A panel of 30 judges were asked to smell the samples and to group them according to whether or not any difference in smell could be detected; the number of groups could be any number from one (no detectable difference) to ten (all different). Each panel member was also asked to classify each group of samples according to acceptability of perfume into ‘like’, ‘indifferent’ or ‘dislike’.

The data are given in Table S.14. Samples judged to smell alike are bracketed together; thus, the first judge allocated the 10 samples into 4 groups, the second into 8 groups, etc. The identification of any groups in which the perfumes appear to be indistinguishable is of importance, and also the acceptability of the perfume in these groups.

The Analysis

The primary object is to see whether the ten samples may be grouped into homogeneous clusters.

The first step is to decide on some measure of association, or similarity, which may be used as a basis for cluster analysis. We take simply the number of times that each pair of samples was grouped together; see Fig. S.14. This array is then analysed using BMDP 1M, a single linkage cluster program.
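The counting rule for the similarity measure can be sketched in a few lines of Python. This is only an illustration: the function name `similarity_matrix` is hypothetical, and only the first three judges of Table S.14 are transcribed, so the counts run from 0 to 3 rather than from 0 to 30.

```python
from itertools import combinations

# Groupings of the first three judges, transcribed from Table S.14.
# The acceptability columns are ignored here; only the grouping matters.
judges = [
    [(2, 5, 8), (1, 4, 6, 9, 10), (3,), (7,)],
    [(1,), (6,), (7,), (3,), (2, 4, 5), (8,), (9,), (10,)],
    [(1, 2, 4, 5, 6, 9, 10), (8,), (3,), (7,)],
]

def similarity_matrix(judges, n_samples=10):
    """Count how often each pair of samples was placed in the same group."""
    sim = [[0] * n_samples for _ in range(n_samples)]
    for groups in judges:
        for group in groups:
            for i in group:
                sim[i - 1][i - 1] += 1  # a sample is always grouped with itself
            for i, j in combinations(group, 2):
                sim[i - 1][j - 1] += 1
                sim[j - 1][i - 1] += 1  # keep the matrix symmetric
    return sim

sim = similarity_matrix(judges)
```

With all 30 judges entered, the diagonal would equal 30 and the off-diagonal entries would be the pair counts used in the cluster analysis.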

Once clusters have been identified it requires only a simple analysis to check which clusters are ‘liked’ or ‘disliked’.

An alternative procedure to the single linkage cluster analysis is metrical multidimensional scaling. A configuration in a small number of dimensions is determined by finding the largest eigenvalues of the similarity matrix, this having first been transformed into a distance matrix and ‘centred’ (Mardia et al., 1979, Chapter 14). The eigenvectors corresponding to the largest eigenvalues are taken in pairs, giving coordinates which can be plotted to provide a visual display of the configuration. Provided that the number of dimensions is small, ideally not more than two, this is an effective way to identify clusters.
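The scaling procedure can be sketched as follows, assuming NumPy is available. The text does not say which transformation from similarity to distance is intended; the sketch assumes the standard one, d²(r,s) = c(r,r) − 2c(r,s) + c(s,s), and the function name `classical_mds` is hypothetical. The matrix is the one in Fig. S.14.

```python
import numpy as np

# Similarity matrix of Fig. S.14 (row/column i corresponds to sample i + 1).
S = np.array([
    [30, 12,  0, 18,  8,  9,  9,  8, 14, 10],
    [12, 30,  3,  6, 14, 14,  8, 10, 11,  7],
    [ 0,  3, 30,  3,  2,  2,  5,  2,  1,  3],
    [18,  6,  3, 30,  6,  9,  8, 12, 15, 10],
    [ 8, 14,  2,  6, 30, 15,  6, 11,  8, 13],
    [ 9, 14,  2,  9, 15, 30,  4, 11, 10, 13],
    [ 9,  8,  5,  8,  6,  4, 30,  9,  9,  5],
    [ 8, 10,  2, 12, 11, 11,  9, 30,  8, 14],
    [14, 11,  1, 15,  8, 10,  9,  8, 30, 14],
    [10,  7,  3, 10, 13, 13,  5, 14, 14, 30],
], dtype=float)

def classical_mds(S, k=2):
    """Metric scaling of a similarity matrix into k dimensions."""
    n = S.shape[0]
    diag = np.diag(S)
    # Assumed transformation: squared distance = c_rr - 2 c_rs + c_ss.
    D2 = diag[:, None] + diag[None, :] - 2.0 * S
    H = np.eye(n) - np.ones((n, n)) / n      # centring matrix
    B = -0.5 * H @ D2 @ H                    # doubly centred matrix
    eigvals, eigvecs = np.linalg.eigh(B)     # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]        # re-sort, largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Scale each leading eigenvector by the square root of its eigenvalue.
    coords = eigvecs[:, :k] * np.sqrt(np.maximum(eigvals[:k], 0.0))
    return coords, eigvals

coords, eigvals = classical_mds(S, k=2)
```

Plotting the two columns of `coords` against each other, labelled by sample number, gives the two-dimensional display described above; the relative sizes of the entries in `eigvals` indicate whether two dimensions are adequate.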

Table S.14. Allocation by thirty judges of ten perfumes into groups

Judge Acceptability
Like Indifferent Dislike
1 (2,5,8)(1,4,6,9,10) (3)(7)
2 (1)(6)(7) (3) (2,4,5)(8)(9)(10)
3 (1,2,4,5,6,9,10) (8) (3)(7)
4 (3,7,10) (5,6,8)(1,2,4,9)
5 (5)(6)(9) (1)(2)(4)(7)(8) (3)(10)
6 (1,4,7,8)(5)(6)(9)(10) (2)(3)
7 (7) (5)(4,8,9,10) (3)(1,2,6)
8 (1)(4)(5)(6)(10) (2,3,9)(7,8)
9 (1,4,5,7,8,10)(2,6,9) (3)
10 (1,2,5,6)(7,8,9,10) (3,4)
11 (1,4,8,9,10)(2,5,6,7) (3)
12 (8)(4,6)(9,10) (5)(7)(1,2) (3)
13 (2,7,9)(5,8,10) (1)(4,6) (3)
14 (2)(5)(7)(9) (1,4,8,10)(3,6)
15 (1,4,9)(2,5,6,7,8,10) (3)
16 (1,4,10) (2)(5)(6)(7)(8)(9) (3)
17 (5,6,8) (1,2,7,9,10)(3,4)
18 (1,4,7,9) (2,5,6,8,10) (3)
19 (2,3,8) (5,6,10) (1,4,7,9)
20 (1)(2)(4)(5)(6)(7)(8) (9)(10) (3)
21 (2,4,6,9)(8,10) (5,7)(1) (3)
22 (2,6,8,10) (1,4,5,9) (3)(7)
23 (4,9)(6,10) (3,7)(1,2,5,8)
24 (1,9,10) (2,5)(4,6,8) (3,7)
25 (7)(1,2,5)(4,6,8,9,10) (3)
26 (5,10) (1,2,6,9) (3,7)(4,8)
27 (1,4,7,9) (2,5,6,8,10) (3)
28 (1,2,6) (5,9,10) (3,4,7,8)
29 (2)(3,5,6,10) (1,4,7,8,9)
30 (1,4) (3,5) (2,7)(6,8)(9,10)

We concentrate here on using the single linkage cluster program BMDP 1M.

Fig. S.14. Similarity matrix: the number of times each pair of samples is grouped together

Sample    1   2   3   4   5   6   7   8   9  10
   1     30  12   0  18   8   9   9   8  14  10
   2     12  30   3   6  14  14   8  10  11   7
   3      0   3  30   3   2   2   5   2   1   3
   4     18   6   3  30   6   9   8  12  15  10
   5      8  14   2   6  30  15   6  11   8  13
   6      9  14   2   9  15  30   4  11  10  13
   7      9   8   5   8   6   4  30   9   9   5
   8      8  10   2  12  11  11   9  30   8  14
   9     14  11   1  15   8  10   9   8  30  14
  10     10   7   3  10  13  13   5  14  14  30

The instructions for BMDP 1M are as follows:

/program     title = ‘set 14 perfumes. bmdp1m.’.
/input       variables = 10.  type = simi.        (similarity matrix input)
             format = free
/variable    names = sample1, sample2, sample3,
             sample4, sample5, sample6, sample7,
             sample8, sample9, sample10.
/end
30  12   0  18   8   9   9   8  14  10            (10 lines of data as in Fig. S.14)
12  30   3   6  14  14   8  10  11   7

Initially each sample is considered as a separate cluster. Thereafter, starting with one sample, further ones are linked, one at a time, such that at each stage the two clusters so joined have maximum similarity. Output 3.1 gives a summary table of the clusters formed.

Samples 1 and 4 are linked, then samples 1, 4 and 9, etc. Suggested clusters are: samples 1, 4, 8, 9, 10; samples 2, 5, 6; sample 7; sample 3, as can be seen in Figure 3.2.
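The linking process can be mimicked with a short, generic single linkage routine in Python. This is a sketch of the technique, not a reproduction of the BMDP 1M algorithm or its output layout; the function names `single_linkage` and `clusters_at` are hypothetical, and the matrix is that of Fig. S.14.

```python
# Similarity matrix of Fig. S.14 (row/column i corresponds to sample i + 1).
S = [
    [30, 12,  0, 18,  8,  9,  9,  8, 14, 10],
    [12, 30,  3,  6, 14, 14,  8, 10, 11,  7],
    [ 0,  3, 30,  3,  2,  2,  5,  2,  1,  3],
    [18,  6,  3, 30,  6,  9,  8, 12, 15, 10],
    [ 8, 14,  2,  6, 30, 15,  6, 11,  8, 13],
    [ 9, 14,  2,  9, 15, 30,  4, 11, 10, 13],
    [ 9,  8,  5,  8,  6,  4, 30,  9,  9,  5],
    [ 8, 10,  2, 12, 11, 11,  9, 30,  8, 14],
    [14, 11,  1, 15,  8, 10,  9,  8, 30, 14],
    [10,  7,  3, 10, 13, 13,  5, 14, 14, 30],
]

def single_linkage(sim):
    """Return the merges as (similarity, merged cluster), in order."""
    clusters = [[i + 1] for i in range(len(sim))]
    history = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: similarity of two clusters is the
                # largest similarity between any pair of their members.
                s = max(sim[i - 1][j - 1]
                        for i in clusters[a] for j in clusters[b])
                if best is None or s > best[0]:
                    best = (s, a, b)
        s, a, b = best
        merged = sorted(clusters[a] + clusters[b])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
        history.append((s, merged))
    return history

def clusters_at(sim, level):
    """Clusters obtained by applying only merges with similarity >= level."""
    current = [[i + 1] for i in range(len(sim))]
    for s, merged in single_linkage(sim):
        if s < level:
            break
        current = [c for c in current if not set(c) <= set(merged)]
        current.append(merged)
    return sorted(current)

levels = [s for s, _ in single_linkage(S)]
groups = clusters_at(S, 14)
# groups == [[1, 4, 8, 9, 10], [2, 5, 6], [3], [7]]
```

Stopping the merging at similarity 14 gives the four clusters suggested in the text; the merge levels 18, 15, 15, 14, 14, 14, 13, 9, 5 agree with the similarities listed in Output 3.1.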

Output 3.1
Summary table of clusters. BMDP 1M

Variable          Other boundary   Number of items   Distance or similarity
Name       No.    of cluster       in cluster        when cluster formed
Sample1      1          3               10                  5.00
Sample4      4          1                2                 18.00
Sample9      9          1                3                 15.00
Sample8      8         10                2                 14.00
Sample10    10          1                5                 14.00
Sample2      2          6                3                 14.00
Sample5      5          6                2                 15.00
Sample6      6          1                8                 13.00
Sample7      7          1                9                  9.00
Sample3      3          1               10                  5.00

Figure 3.2
Cluster-plot in BMDP
