A contingency table (also referred to as cross tabulation or cross tab) is often used to record and analyze the relation between two or more categorical variables. It displays the (multivariate) frequency distribution of the variables in a matrix.
The term contingency table was first used by Karl Pearson in "On the Theory of Contingency and Its Relation to Association and Normal Correlation", part of the Drapers' Company Research Memoirs Biometric Series I, published in 1904.
A crucial problem of multivariate statistics is finding the (direct) dependence structure underlying the variables contained in high-dimensional contingency tables. If some of the conditional independences are revealed, then even the storage of the data can be done more efficiently (see Lauritzen (2002)). To do this, one can use information theory concepts, which draw only on the probability distribution; that distribution can be estimated easily from the contingency table by the relative frequencies.
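As a minimal sketch of how an information-theoretic quantity can be obtained from relative frequencies alone, the following Python snippet estimates the mutual information of two variables from a two-way table of counts. The function name `mutual_information` and both example tables are assumptions made for illustration; they are not taken from Lauritzen (2002) or from any data set.

```python
import math


def mutual_information(table):
    """Estimate mutual information (in bits) from a 2-D table of counts.

    Joint and marginal probabilities are taken to be the relative
    frequencies of the table, as described in the text.
    """
    total = sum(sum(row) for row in table)
    row_p = [sum(row) / total for row in table]        # marginal of the row variable
    col_p = [sum(col) / total for col in zip(*table)]  # marginal of the column variable

    mi = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            if count == 0:
                continue  # an empty cell contributes nothing to the sum
            p = count / total  # joint relative frequency of cell (i, j)
            mi += p * math.log2(p / (row_p[i] * col_p[j]))
    return mi


# Hypothetical tables, invented for illustration only.
independent_table = [[20, 30], [20, 30]]  # rows are proportional: no dependence
dependent_table = [[50, 0], [0, 50]]      # each row determines the column

mi_independent = mutual_information(independent_table)
mi_dependent = mutual_information(dependent_table)
print(mi_independent, mi_dependent)
```

A value near zero suggests that the table shows no dependence between the two variables, while larger values indicate stronger dependence. This is only one of several information-theoretic measures that could be used here.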
Suppose that we have two variables, sex (male or female) and handedness (right- or left-handed). Further suppose that 100 individuals are randomly sampled from a very large population as part of a study of sex differences in handedness. A contingency table can be created to display the numbers of individuals who are male and right-handed, male and left-handed, female and right-handed, and female and left-handed. Such a contingency table is shown...
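The tabulation described above can be sketched in Python using only the standard library. The counts below are hypothetical, chosen only so that they sum to 100; they are not the figures from the study mentioned in the text, and the variable names are illustrative.

```python
from collections import Counter

# Hypothetical sample of 100 (sex, handedness) observations.
# These counts are invented for illustration and are not study data.
observations = (
    [("male", "right")] * 45
    + [("male", "left")] * 7
    + [("female", "right")] * 42
    + [("female", "left")] * 6
)

sexes = ["male", "female"]
hands = ["right", "left"]

# Count each (sex, handedness) combination, then arrange the counts in a matrix:
# one row per sex, one column per handedness.
counts = Counter(observations)
table = [[counts[(sex, hand)] for hand in hands] for sex in sexes]

# Marginal totals for rows, columns, and the whole sample.
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

# Print the table with its margins.
print(f"{'':8}{'right':>8}{'left':>8}{'total':>8}")
for sex, row, row_total in zip(sexes, table, row_totals):
    print(f"{sex:8}{row[0]:>8}{row[1]:>8}{row_total:>8}")
print(f"{'total':8}{col_totals[0]:>8}{col_totals[1]:>8}{grand_total:>8}")
```

Each cell holds the number of individuals with one combination of the two variables, and the row and column totals give the marginal frequencies of sex and handedness respectively.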