||We consider the problem of simultaneously and optimally clustering
the rows and columns of a real-valued I x J data matrix X = (xij) by corresponding
row and columns partitions A = (A1;...;Am) and B = (B1;...;Bn),
with given m and n. We emphasize the need to base the clustering method on a
probabilistic model for the data and then to use standard methods from statistics
(e.g., maximum likelihood, divergence) to characterize optimum two-way
classifications. We survey some clustering criteria and algorithms proposed in
the literature for various data types. Special emphasis is given to the maximum
interaction clustering criterion proposed by the author in 1980. It can be shown
that it results as the maximum likelihood clustering method under a two-way
ANOVA model (with individual main effects, but cluster-specific interactions).
After a simple data transformation (double-centering) well-known two-way
SSQ clustering algorithms can directly be used for maximization.