Errata for First Edition

This page contains the errata for the first edition. You can contact us via email if you want to report any errors.

Chapter 1: Data Mining and Analysis

  • p4, Section 1.3, line 13: as linear combination should be as a linear combination

  • p9, Example 1.3, 3rd line from end: (153)^(1/3) should be (152)^(1/3)

  • p9, Example 1.3, last line: (4^3 + (−1)^3)^(1/3) = (63)^(1/3) = 3.98 should be (4^3 + 1^3)^(1/3) = (65)^(1/3) = 4.02

  • p24, Section 1.4.3, last line of subsection Univariate Sample:

    where fX is the probability mass or density function for X

    should be

    where fX is the probability mass or density function for X

  • p30, Section 1.7, Q1: in (1.5) should be in Eq. (1.5)
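
The corrected arithmetic in the Example 1.3 entries above is easy to confirm numerically. A minimal Python check (independent of the book's own code; the 1/3 exponents are cube roots):

```python
# Book's printed value: (4^3 + (-1)^3)^(1/3) = 63^(1/3)
printed = (4**3 + (-1)**3) ** (1/3)
# Corrected value: (4^3 + 1^3)^(1/3) = 65^(1/3)
corrected = (4**3 + 1**3) ** (1/3)

print(round(printed, 2))    # 3.98
print(round(corrected, 2))  # 4.02
```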

Chapter 2: Numeric Attributes

  • p34, Equation (2.2): F̂(x) ≥ q should be F(x) ≥ q

  • p34, Line after Equation (2.2):

    That is, the inverse CDF gives the least value of X, for which q fraction of the values are higher, and 1−q fraction of the values are lower.

    should be

    That is, the inverse CDF gives the least value of X, for which q fraction of the values are lower, and 1−q fraction of the values are higher.

  • p53, Example 2.6, line 1: ... range for Income is 2700 − 300 = 2400 should be ... range for Income is 6000 − 300 = 5700

  • p55, In Eq (2.32): P(−k ≤ z ≤ k) = P(0 ≤ t ≤ k/2) should be P(−k ≤ z ≤ k) = 2·P(0 ≤ t ≤ k/2)

  • p58, Total and Generalized Variance, Line 2: ...product of its eigenvectors should be ...product of its eigenvalues

  • p58, two lines above Example 2.8: tr(Λ) should be tr(Λ)

  • p61, Q3: mu should be μ so that it reads

    Σ_{i=1}^{n} (x_i − μ)^2 = n(μ̂ − μ)^2 + Σ_{i=1}^{n} (x_i − μ̂)^2

Chapter 3: Categorical Attributes

  • p81, Table 3.6, Attribute value for X2: Short(a23) should be Long(a23)

Chapter 4: Graph Data

  • p103, 2 lines above Eq (4.3): γ_jk = 0 should be γ_jk(v_i) = 0

  • p103, Eq (4.3): γ_jk should be γ_jk(v_i)

  • p103, Example 4.5, last line: γ_jk > 0 should be γ_jk(v_5) > 0

  • p104, Example 4.5:

    c(v_5) = γ_18 + γ_24 + γ_27 + γ_28 + γ_38 + γ_46 + γ_48 + γ_67 + γ_68

    should be

    c(v_5) = γ_18(v_5) + γ_24(v_5) + γ_27(v_5) + γ_28(v_5) + γ_38(v_5) + γ_46(v_5) + γ_48(v_5) + γ_67(v_5) + γ_68(v_5)

  • p107: p1=12(11212) should be p1=12(12212)

  • p127, 4th Line after Eq (4.22): initial n0 edges should be initial n0 nodes

Chapter 5: Kernel Methods

  • p138, Example 5.4:

    μ_ϕ = Σ_{i=1}^{5} ϕ(x_i) = Σ_{i=1}^{5} x_i

    should be

    μ_ϕ = (1/5)·Σ_{i=1}^{5} ϕ(x_i) = (1/5)·Σ_{i=1}^{5} x_i

  • p141, 3rd line and 10th Line before Sec 5.1.2: There is an extra left bracket in definition of ϕ(x), that is,

    ((K(x1,x),... should be (K(x1,x),...

  • p144, 2nd line: ∫ a(x)^2 dx < 0 should be ∫ a(x)^2 dx < ∞

  • p144, last line: Σ_{k=1}^{q} should be Σ_{k=0}^{q}

  • p156, Section 5.4.2: all occurrences of path/paths should be walk/walks

  • p160, Example 5.15: S = −L = A − D should be S = −L = A − Δ

Chapter 6: High-dimensional Data

  • p164: In the definitions of the hyperball and hypersphere

    x = (x_1, x_2, …, x_d) should be x = (x_1, x_2, …, x_d)^T

  • p171: 0_d = (0_1, 0_2, …, 0_d) should be 0_d = (0_1, 0_2, …, 0_d)^T

  • p172, Section 6.6, 1st Line after Eq. (6.11):

    μ in equation μ=0d should be in bold.

  • p178, section Volume in d dimensions:

    x_1 = r cos θ_1 cos θ_2 cos θ_3 = r·c_2c_2c_3 should be x_1 = r cos θ_1 cos θ_2 cos θ_3 = r·c_1c_2c_3

    x_3 = r cos θ_1 sin θ_2 = r·c_1s_1 should be x_3 = r cos θ_1 sin θ_2 = r·c_1s_2

  • p178, Equation for J(θ_1, θ_2, θ_3), entry in first row, fourth column: r·c_1c_2s_3 should be −r·c_1c_2s_3

Chapter 7: Dimensionality Reduction

  • p186, line 1: a_r is vector should be a_r is a vector

  • p207, line 3, Alg 7.2: η1,η2,...,ηd should be η1,η2,...,ηn

Chapter 8: Itemset Mining

  • p235, Example 8.13, 2nd last line: ...,AB(3),AD(4),... should be ...,AB(4),AD(3),...

  • p236, 5th line: ...,AD(4),... should be ...,AD(3),...

Chapter 9: Summarizing Itemsets

  • p250, 2nd line under Generalized Itemsets: k-tidsets should be k tidsets

  • p250, 4th line from bottom: Z=YX should be Z=XY

  • p252, Eq. (9.3) and Eq. (9.4): |X∖Y| should be |X∖W| on the right hand side in both equations, so that they read

    Upper bounds (|X∖Y| is odd): sup(X) ≤ Σ_{Y⊆W⊂X} (−1)^(|X∖W|+1) sup(W)

    Lower bounds (|X∖Y| is even): sup(X) ≥ Σ_{Y⊆W⊂X} (−1)^(|X∖W|+1) sup(W)

  • p254, Section Nonderivable Itemsets, 1st equation after line 1: |X∖Y| should be |X∖W|, so that it reads

    IE(Y) = Σ_{Y⊆W⊂X} (−1)^(|X∖W|+1) sup(W)

Chapter 10: Sequence Mining

  • p264, alg 10.2, line 9: P should be P_a

Chapter 11: Graph Pattern Mining

  • p288, sec 11.3, 2nd paragraph, line 6: sup(C)=sup(t) should be sup(C)=sup(t)

  • p290, Figure 11.8: The last tuple in the DFS-code for graph C_19 should be ⟨2, 0, a, a⟩ and not ⟨2, 0, a, b⟩

  • p292, Algorithm 11.2, Line 14: b = ⟨u_r, v, L(u_r), L(v), L(u_r, v)⟩ should be b = ⟨u_r, v, L(ϕ(u_r)), L(ϕ(v)), L(ϕ(u_r), ϕ(v))⟩

  • p293, Figure 11.9 (c): There should be one more extension for ϕ_5, namely ⟨0, 3, a, b⟩

  • p294, Algorithm 11.3, Line 12: NGj should be NG

  • p295, Algorithm 11.4, Line 0: C should be C={t1,t2,...,tk}

Chapter 12: Pattern and Rule Assessment

  • p322 (Alg 12.1) and p326 (Alg 12.2): replace = with

Chapter 13: Representative-based Clustering

  • p343, in 3rd equation: P(Ci) should be P(C1)

  • p335, Algorithm 13.1, line 7: μ_i^t should be μ_i^(t−1)

Chapter 14: Hierarchical Clustering

  • p366, Fig 14.2: (a) m=1, (b) m=2, and (c) m=3 should be (a) n=1, (b) n=2, and (c) n=3, respectively.

  • p373, sec 14.4: EXERCISES AND PROJECTS should be EXERCISES

  • p373, Q1, SMC(Xi,Xj), JC(Xi,Xj), RC(Xi,Xj) should be SMC(xi,xj), JC(xi,xj), RC(xi,xj), respectively.

Chapter 15: Density-based Clustering

  • p385, line after Eq. (15.6): ... having two parts. A vector ... should be ... having two parts: a vector ...

  • p387, Alg 15.2, line 20: In the numerator K((x_t − x_i)/h)·x_t should be K((x_t − x_i)/h)·x_i

Chapter 16: Spectral and Graph Clustering

  • p411, 2nd last equation: (1/2)·p_rs should be p_rs so that it reads

    p_rs = (d_r/2m)·(d_s/2m) = d_r·d_s/4m^2

  • p413, Line 5: Σ_{j=1}^{n} d^T c_i should be d^T c_i

  • p413, Line 10: (d_i^T c_i)^2 should be (d^T c_i)^2

  • p424, Q5: c_n = (1/√n)·1 should be c_n = (1/√(Σ_{i=1}^{n} d_i))·Δ^(1/2)·1

  • p424, Q6 (b): K=M should be K=M+I

Chapter 17: Clustering Validation

  • p428, Example 17.1, Table below 2nd para: n=100 should be n=150 for the total count

  • p463, Q10: Add the sentence Assume that the clusters are: C1={a,b,c,d,e},C2={g,i},C3={f,h,j},C4={k}.

Chapter 18: Probabilistic Classification

  • p472, Table 18.2: 13/50 should be 11/50

  • p472, Example 18.2, 2nd Para, lines 6 and 7: P(c_1|x) and P(c_2|x) should be P̂(c_1|x) and P̂(c_2|x), respectively.

Chapter 20: Linear Discriminant Analysis

  • p503: Example 20.2: There should be no transpose operator T on the mean vectors, i.e.,

    μ_1 = (5.01, 3.42)^T, μ_2 = (6.26, 2.87)^T, μ_1 − μ_2 = (−1.256, 0.546)^T

    should be

    μ_1 = (5.01, 3.42), μ_2 = (6.26, 2.87), μ_1 − μ_2 = (−1.256, 0.546)

  • p509, Example 20.4, line 4: ''iris-virginica'' should be Iris-versicolor

  • p512, Q1: In part (a) S_B should be B, and in (b) S_W should be S

Chapter 21: Support Vector Machines

  • p526, 7th line, in L_dual: (C − α_i + β_i) should be (C − α_i − β_i)

  • p536, Algorithm 21.1, line 15: α_(t+1) = α should be α_(t+1) ← α

  • p538, Example 21.8, line 5: homogeneous quadratic kernel K(x_i, x_j) = (x_i^T x_j)^2 should be inhomogeneous quadratic kernel K(x_i, x_j) = (1 + x_i^T x_j)^2