Contingency Table Analysis

Download the data file adult.data from the UCI Machine Learning Repository. It is a selection of the 1994 Census data, with 48842 instances over 14 categorical, real, and integer attributes.

Compute the contingency matrix for the variables education and race, and compute the \(\chi^2\) statistic using your own function, i.e., write a function that takes two categorical column vectors as input and returns the \(\chi^2\) value and its p-value. At the 99% confidence level (significance level \(\alpha = 0.01\)), are education and race dependent?
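One possible shape for such a function is sketched below, using only the Python standard library. It is a minimal sketch rather than a reference solution. The names contingency_table, chi_square_test, gamma_q, and load_columns are hypothetical. The p-value is taken as the upper tail of the \(\chi^2\) distribution with \((r-1)(c-1)\) degrees of freedom, evaluated through a standard series and continued-fraction expansion of the regularized incomplete gamma function (a library routine such as scipy.stats.chi2.sf would normally replace that part). The field positions used for education and race in the commented usage lines are an assumption about the column order of adult.data and should be checked against adult.names.

```python
import csv
import math
from collections import Counter


def contingency_table(x, y):
    """Return (row_levels, col_levels, counts) for two categorical vectors."""
    x, y = list(x), list(y)
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    rows = list(dict.fromkeys(x))  # distinct levels, in order of first appearance
    cols = list(dict.fromkeys(y))
    pair_counts = Counter(zip(x, y))
    counts = [[pair_counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, counts


def gamma_q(a, x, eps=1e-12, max_iter=10000):
    """Regularized upper incomplete gamma function Q(a, x), for a > 0."""
    if x <= 0:
        return 1.0
    prefix = math.exp(-x + a * math.log(x) - math.lgamma(a))
    if x < a + 1:
        # Series expansion of P(a, x); Q = 1 - P.
        ap, term = a, 1.0 / a
        total = term
        for _ in range(max_iter):
            ap += 1
            term *= x / ap
            total += term
            if abs(term) < abs(total) * eps:
                break
        return 1.0 - total * prefix
    # Continued fraction for Q(a, x), evaluated with the modified Lentz method.
    tiny = 1e-30
    b = x + 1 - a
    c = 1 / tiny
    d = 1 / b
    h = d
    for i in range(1, max_iter):
        an = -i * (i - a)
        b += 2
        d = an * d + b
        d = tiny if abs(d) < tiny else d
        c = b + an / c
        c = tiny if abs(c) < tiny else c
        d = 1 / d
        delta = d * c
        h *= delta
        if abs(delta - 1) < eps:
            break
    return h * prefix


def chi_square_test(x, y):
    """Return (chi-square statistic, p-value) for two categorical vectors."""
    _, _, observed = contingency_table(x, y)
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count under independence: row total * column total / n.
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    # With a single row or column level the test is undefined.
    p_value = gamma_q(dof / 2, stat / 2) if dof > 0 else float("nan")
    return stat, p_value


def load_columns(path, i, j):
    """Read fields i and j (0-based) from a headerless CSV, skipping short rows."""
    xs, ys = [], []
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if len(row) > max(i, j):
                xs.append(row[i].strip())
                ys.append(row[j].strip())
    return xs, ys


# Hypothetical usage, ASSUMING education and race are fields 3 and 8 (0-based):
# education, race = load_columns("adult.data", 3, 8)
# stat, p = chi_square_test(education, race)
# print(stat, p, "reject independence" if p < 0.01 else "cannot reject independence")
```

Under this sketch, the null hypothesis that education and race are independent would be rejected at the 99% confidence level when the returned p-value is below 0.01.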