Errata for First Edition

This page contains the errata for the first edition. You can contact us via email if you want to report any errors.

Chapter 1: Data Mining and Analysis

p4, Section 1.3, line 13: as linear combination should be as a linear combination
p9, Example 1.3, 3rd line from end: \((153)^{1/3}\) should be \((152)^{1/3}\)
p9, Example 1.3, last line: \((4^3 + (-1)^3)^{1/3} = (63)^{1/3} = 3.98\) should be \((4^3 + |-1|^3)^{1/3} = (65)^{1/3} = 4.02\)
p24, Section 1.4.3, last line of subsection Univariate Sample:

where \(f_\mathbf{X}\) is the probability mass or density function for \(\mathbf{X}\)

should be

where \(f_X\) is the probability mass or density function for \(X\)
p30, Section 1.7, Q1: in (1.5) should be in Eq. (1.5)
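
As a quick arithmetic check on the corrected last line of Example 1.3 (p9), the following Python sketch evaluates the \(L_3\)-norm of a vector with components 4 and -1, with and without the absolute value. The helper name `lp_norm` is hypothetical and not taken from the book; the two components are read off the corrected expression.

```python
def lp_norm(x, p):
    """Return the L_p norm (sum_i |x_i|^p)^(1/p) of a sequence x."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)


# Components read off the corrected expression in Example 1.3.
diff = (4, -1)

# With the absolute value: (4^3 + |-1|^3)^(1/3) = 65^(1/3)
corrected = lp_norm(diff, 3)

# Without the absolute value (the printed first-edition form): 63^(1/3)
uncorrected = (4 ** 3 + (-1) ** 3) ** (1.0 / 3)

print(round(corrected, 2), round(uncorrected, 2))  # prints: 4.02 3.98
```

The absolute value matters only for odd \(p\); for even \(p\) the two forms agree.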

Chapter 2: Numeric Attributes

p34, Equation (2.2): \(\hat{F}(x) \ge q\) should be \(F(x) \ge q\)
p34, Line after Equation (2.2):

That is, the inverse CDF gives the least value of \(X\), for which \(q\) fraction of the values are higher, and \(1 - q\) fraction of the values are lower.

should be

That is, the inverse CDF gives the least value of \(X\), for which \(q\) fraction of the values are '''lower''', and \(1 - q\) fraction of the values are '''higher'''.
p53, Example 2.6, line 1: ... range for \({\tt Income}\) is \(2700-300=2400\) should be ... range for \({\tt Income}\) is \(6000-300=5700\)
p55, In Eq (2.32): \(P(-k \le z \le k) = P\bigl(0 \le t \le k/\sqrt{2}\bigr)\) should be \(P(-k \le z \le k) = 2 \cdot P\bigl(0 \le t \le k/\sqrt{2}\bigr)\)
p58, Total and Generalized Variance, Line 2: ...product of its eigenvectors should be ...product of its eigenvalues
p58, two lines above Example 2.8: \(tr(\Lambda)\) should be \(tr(\mathbf{\Lambda})\)
p61, Q3: \(mu\) should be \(\mu\) so that it reads

\begin{equation*} \sum_{i=1}^n (x_i - \mu)^2 = n(\hat{\mu} - \mu)^2 + \sum_{i=1}^n (x_i - \hat{\mu})^2 \end{equation*}
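
The corrected identity in Q3 can be sanity-checked numerically. The sketch below is only an illustration, not a proof: it draws a synthetic sample (the sample size, seed, and the value chosen for \(\mu\) are arbitrary assumptions) and compares both sides of the equation, with `mu_hat` standing for the sample mean \(\hat{\mu}\).

```python
import random

random.seed(0)  # arbitrary seed, only for reproducibility

# Synthetic sample; the distribution and size are illustrative assumptions.
x = [random.gauss(5.0, 2.0) for _ in range(1000)]
n = len(x)

mu = 4.0              # any fixed constant works for the identity
mu_hat = sum(x) / n   # sample mean

# Left-hand side: sum of squared deviations from mu.
lhs = sum((xi - mu) ** 2 for xi in x)

# Right-hand side: n * (mu_hat - mu)^2 plus squared deviations from mu_hat.
rhs = n * (mu_hat - mu) ** 2 + sum((xi - mu_hat) ** 2 for xi in x)

# The two sides agree up to floating-point rounding.
assert abs(lhs - rhs) <= 1e-9 * abs(lhs)
```

The identity holds because the cross term \(2(\hat{\mu} - \mu)\sum_{i=1}^n (x_i - \hat{\mu})\) vanishes when \(\hat{\mu}\) is the sample mean.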

Chapter 3: Categorical Attributes

p81, Table 3.6, Attribute value for \(X_2\): \({\tt Short} ( a_{23})\) should be \({\tt Long} ( a_{23})\)

Chapter 4: Graph Data

p103, 2 lines above Eq (4.3): \(\gamma_{jk} = 0\) should be \(\gamma_{jk}(v_i) = 0\)
p103, Eq (4.3): \(\gamma_{jk}\) should be \(\gamma_{jk}(v_i)\)
p103, Example 4.5, last line: \(\gamma_{jk} > 0\) should be \(\gamma_{jk}(v_5) > 0\)
p104, Example 4.5:

\(c(v_5) = \gamma_{18} + \gamma_{24} + \gamma_{27} + \gamma_{28} + \gamma_{38} + \gamma_{46} + \gamma_{48} + \gamma_{67} + \gamma_{68}\)

should be

\(c(v_5) = \gamma_{18}(v_5) + \gamma_{24}(v_5) + \gamma_{27}(v_5) + \gamma_{28}(v_5) + \gamma_{38}(v_5) + \gamma_{46}(v_5) + \gamma_{48}(v_5) + \gamma_{67}(v_5) + \gamma_{68}(v_5)\)
p107: \(\mathbf{p}_1 = \frac{1}{2} \pmatrix{1\\ 1\\ 2\\ 1\\ 2}\) should be \(\mathbf{p}_1 = \frac{1}{2} \pmatrix{1\\ 2\\ 2\\ 1\\ 2}\)
p127, 4th Line after Eq (4.22): initial \(n_0\) edges should be initial \(n_0\) nodes

Chapter 5: Kernel Methods

p138, Example 5.4:

\(\mathbf{\mu}_\phi = \sum_{i=1}^5 \phi(\mathbf{x}_i) = \sum_{i=1}^5 \mathbf{x}_i\)

should be

\(\mathbf{\mu}_\phi = \frac{1}{5}\sum_{i=1}^5 \phi(\mathbf{x}_i) = \frac{1}{5} \sum_{i=1}^5 \mathbf{x}_i\)
p140, 7th Line after Eq (5.3): \(\sum_{i=1}^{m_a} \sum_{j=1}^{m_a} \alpha_i \alpha_{\!j} K(\mathbf{x}_i, \mathbf{x})\) should be \(\sum_{i=1}^{m_a} \sum_{j=1}^{m_a} \alpha_i \alpha_{\!j} K(\mathbf{x}_i, \mathbf{x}_j)\)
p141, 3rd line and 10th Line before Sec 5.1.2: There is an extra left bracket in the definition of \(\phi(\mathbf{x})\), that is,

\(\big( ( K(\mathbf{x}_1, \mathbf{x}), ...\) should be \(\big( K(\mathbf{x}_1, \mathbf{x}), ...\)
p144, 2nd line: \(\int a(\mathbf{x})^2\; d\mathbf{x} < 0\) should be \(\int a(\mathbf{x})^2\; d\mathbf{x} < \infty\)
p144, last line: \(\sum_{k=1}^q\) should be \(\sum_{k=0}^q\)
p156, Section 5.4.2: all occurrences of path/paths should be walk/walks
p160, Example 5.15: \(\mathbf{S} = -\mathbf{L} = \mathbf{A}-\mathbf{D}\) should be \(\mathbf{S} = -\mathbf{L} = \mathbf{A}-\mathbf{\Delta}\)

Chapter 6: High-dimensional Data

p164: In the definitions of the hyperball and hypersphere

\(\mathbf{x} = (x_1, x_2, \ldots, x_d)\) should be \(\mathbf{x} = (x_1, x_2, \ldots, x_d)^T\)
p171: \(\mathbf{0}_d = (0_1,0_2,\ldots,0_d)\) should be \(\mathbf{0}_d = (0_1,0_2,\ldots,0_d)^T\)
p172, Section 6.6, 1st Line after Eq. (6.11):

\(\mu\) in equation \(\mu=\mathbf{0}_d\) should be in bold.
p178, section Volume in d dimensions:

\(x_1 = r \cos\theta_1\cos\theta_2 \cos\theta_3 = r c_2 c_2 c_3\) should be \(x_1 = r \cos\theta_1\cos\theta_2 \cos\theta_3 = r c_1 c_2 c_3\)

\(x_3 = r \cos\theta_1\sin\theta_2 = r c_1 s_1\) should be \(x_3 = r \cos\theta_1\sin\theta_2 = r c_1 s_2\)
p178, Equation for \(J(\theta_1, \theta_2, \theta_3)\), Entry in first row, fourth column: \(r c_1 c_2 s_3\) should be \(-r c_1 c_2 s_3\)

Chapter 7: Dimensionality Reduction

p186, line 1: \(\mathbf{a}_r\) is vector should be \(\mathbf{a}_r\) is a vector
p207, line 3, Alg 7.2: \(\eta_1, \eta_2, ..., \eta_d\) should be \(\eta_1, \eta_2, ..., \eta_n\)

Chapter 8: Itemset Mining

p235, Example 8.13, 2nd last line: \(...,AB(3), AD(4),...\) should be \(..., AB(4), AD(3), ...\)
p236, 5th line: \(...,AD(4),...\) should be \(..., AD(3),...\)

Chapter 9: Summarizing Itemsets

p250, 2nd line under '''Generalized Itemsets''': \(k\)-tidsets should be \(k\) tidsets
p250, 4th line from bottom: \(Z = Y \setminus X\) should be \(Z = X \setminus Y\)
p252, Eq. (9.3) and Eq. (9.4): \(\bigl|X\setminus Y\bigr|\) should be \(\bigl|X\setminus W\bigr|\) on the right hand side in both equations, so that they read

\(\textbf{Upper Bounds} \bigl(\bigl|X\setminus Y\bigr| \text{ is odd} \bigr): sup(X) \leq\sum_{Y \subseteq W \subset X} -1^{\bigl(\bigl|X\setminus W\bigr|+1\bigr)} sup(W)\)

\(\textbf{Lower Bounds} \bigl(\bigl|X\setminus Y\bigr| \text{ is even}\bigr): sup(X) \geq\sum_{Y \subseteq W \subset X} -1^{\bigl(\bigl|X\setminus W\bigr|+1\bigr)} sup(W)\)
p254, Section '''Nonderivable Itemsets''', 1st Equation after line 1: \(\bigl|X\setminus Y\bigr|\) should be \(\bigl|X\setminus W\bigr|\), so that it reads

\(\mathit{IE}(Y) = \sum_{Y \subseteq W \subset X}\, -1^{\bigl(\bigl|X\setminus W\bigr|+1\bigr)} \cdot sup(W)\)

Chapter 10: Sequence Mining

p264, alg 10.2, line 9: \(\mathbf{P}\) should be \(P_a\)

Chapter 11: Graph Pattern Mining

p288, sec 11.3, 2nd paragraph, line 6: \(sup(C) = sup(t)\) should be \(sup(C') = sup(t)\)
p290, Figure 11.8: The last tuple in the DFS-code for graph \(C_{19}\) should be \(\langle 2, 0, a, a \rangle\) and not \(\langle 2, 0, a, b\rangle\)
p292, Algorithm 11.2, Line 14: \(b=\langle u_r, v, L(u_r), L(v), L(u_r, v)\rangle\) should be \(b=\langle u_r, v, L(\phi(u_r)), L(\phi(v)), L(\phi(u_r),\phi(v))\rangle\)
p293, Figure 11.9 (c): There should be one more extension for \(\phi_5\), namely \(\langle 0, 3, a, b\rangle\)
p294, Algorithm 11.3, Line 12: \(N_{G_j}\) should be \(N_{G}\)
p295, Algorithm 11.4, Line 0: \(C\) should be \(C = \{t_1, t_2, ..., t_k\}\)

Chapter 12: Pattern and Rule Assessment

p322 (Alg 12.1) and p326 (Alg 12.2): replace = with \(\gets\)

Chapter 13: Representative-based Clustering

p343, in 3rd equation: \(P(C_i)\) should be \(P(C_1)\)
p335, Algorithm 13.1, line 7: \(\mathbf{\mu}^t_i\) should be \(\mathbf{\mu}^{t-1}_i\)

Chapter 14: Hierarchical Clustering

p366, Fig 14.2: (a) \(m=1\), (b) \(m=2\), and (c) \(m=3\) should be (a) \(n=1\), (b) \(n=2\), and (c) \(n=3\), respectively.
p373, sec 14.4: EXERCISES AND PROJECTS should be EXERCISES
p373, Q1, \(SMC(X_i, X_j)\), \(JC(X_i, X_j)\), \(RC(X_i, X_j)\) should be \(SMC(\mathbf{x}_i, \mathbf{x}_j)\), \(JC(\mathbf{x}_i, \mathbf{x}_j)\), \(RC(\mathbf{x}_i, \mathbf{x}_j)\), respectively.

Chapter 15: Density-based Clustering

p385, line after Eq. (15.6): ... having two parts. A vector ... should be ... having two parts: a vector ...
p387, Alg 15.2, line 20: In the numerator \(K\left(\frac{\mathbf{x}_t - \mathbf{x}_i}{h} \right) \cdot \mathbf{x}_t\) should be \(K\left(\frac{\mathbf{x}_t - \mathbf{x}_i}{h} \right) \cdot \mathbf{x}_i\)

Chapter 16: Spectral and Graph Clustering

p411, 2nd last equation: \(\frac{1}{2}p_{rs}\) should be \(p_{rs}\) so that it reads

\(p_{rs} = \frac{d_r}{2m}\frac{d_s}{2m} = \frac{d_r d_s}{4m^2}\)
p413, Line 5: \(\sum_{j=1}^n \mathbf{d}^T \mathbf{c}_i\) should be \(\mathbf{d}^T \mathbf{c}_i\)
p413, Line 10: \((\mathbf{d}_i^T\mathbf{c}_i)^2\) should be \((\mathbf{d}^T\mathbf{c}_i)^2\)
p424, Q5: \(\mathbf{c}_n = \frac{1}{\sqrt{n}} \mathbf{1}\) should be \(\mathbf{c}_n = \frac{1}{\sqrt{\sum_{i=1}^n d_i}} \mathbf{\Delta}^{1/2}\mathbf{1}\)
p424, Q6 (b): \(\mathbf{K} = \mathbf{M}\) should be \(\mathbf{K} = \mathbf{M} + \mathbf{I}\)

Chapter 17: Clustering Validation

p428, Example 17.1, Table below 2nd para: \(n=100\) should be \(n=150\) for the total count
p463, Q10: Add the sentence Assume that the clusters are: \(C_1 = \{a,b, c,d, e\}, C_2 = \{g, i\}, C_3 = \{f,h, j \}, C_4 = \{k\}\).

Chapter 18: Probabilistic Classification

p472, Table 18.2: 13/50 should be 11/50
p472, Example 18.2, 2nd Para, lines 6 and 7: \(P(c_1|\mathbf{x})\) and \(P(c_2|\mathbf{x})\) should be \(\hat{P}(c_1|\mathbf{x})\) and \(\hat{P}(c_2|\mathbf{x})\), respectively.

Chapter 20: Linear Discriminant Analysis

p503, Example 20.2: There should be no transpose operator \(T\) on the mean vectors, i.e.,

\(\mathbf{\mu}_1 = \pmatrix{5.01\\3.42}^T \qquad \mathbf{\mu}_2 = \pmatrix{6.26\\2.87}^T \qquad \mathbf{\mu}_1 - \mathbf{\mu}_2= \pmatrix{-1.256\\0.546}^T\)

should be

\(\mathbf{\mu}_1 = \pmatrix{5.01\\3.42} \qquad \mathbf{\mu}_2 = \pmatrix{6.26\\2.87} \qquad \mathbf{\mu}_1 - \mathbf{\mu}_2 = \pmatrix{-1.256\\0.546}\)
p509, Example 20.4, line 4: ''iris-virginica'' should be \({\tt Iris\text{-}versicolor}\)
p512, Q1: In part (a) \(\mathbf{S}_B\) should be \(\mathbf{B}\), and in (b) \(\mathbf{S}_W\) should be \(\mathbf{S}\)

Chapter 21: Support Vector Machines

p526, 7th line, in \(L_{dual}\): \((C - \alpha_i + \beta_i)\) should be \((C - \alpha_i - \beta_i)\)
p536, Algorithm 21.1, line 15: \(\mathbf{\alpha}_{t+1} = \alpha\) should be \(\alpha_{t+1} \gets \alpha\)
p538, Example 21.8, line 5: homogeneous quadratic kernel \(K(\mathbf{x}_i,\mathbf{x}_j) = ( \mathbf{x}^T_i \mathbf{x}_j)^2\) should be inhomogeneous quadratic kernel \(K(\mathbf{x}_i,\mathbf{x}_j) = (1+ \mathbf{x}^T_i \mathbf{x}_j)^2\)
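
To illustrate the difference between the two kernels named in the Example 21.8 correction, the following Python sketch evaluates the inhomogeneous quadratic kernel \((1 + \mathbf{x}_i^T \mathbf{x}_j)^2\) on a pair of two-dimensional points and checks it against an explicit feature map. The function names, the sample points, and the particular feature map `phi` (one standard choice for two dimensions) are assumptions made for illustration; they are not taken from the book.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def inhomogeneous_quadratic_kernel(x, y):
    """K(x, y) = (1 + x^T y)^2."""
    return (1.0 + dot(x, y)) ** 2


def homogeneous_quadratic_kernel(x, y):
    """K(x, y) = (x^T y)^2."""
    return dot(x, y) ** 2


def phi(x):
    """Explicit feature map for the inhomogeneous kernel in two dimensions."""
    x1, x2 = x
    r2 = math.sqrt(2.0)
    return (1.0, r2 * x1, r2 * x2, x1 * x1, x2 * x2, r2 * x1 * x2)


# Hypothetical sample points, chosen only for illustration.
x = (1.0, 2.0)
y = (3.0, -1.0)

k_inhom = inhomogeneous_quadratic_kernel(x, y)  # (1 + 1)^2 = 4
k_hom = homogeneous_quadratic_kernel(x, y)      # 1^2 = 1
k_explicit = dot(phi(x), phi(y))                # matches k_inhom up to rounding

assert abs(k_inhom - k_explicit) < 1e-9
```

The constant and linear components of `phi` are what the homogeneous kernel lacks, which is why the two kernels give different values on the same pair of points.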