2023 | Book

Analysis of Categorical Data from Historical Perspectives

Essays in Honour of Shizuhiko Nishisato

Edited by: Eric J. Beh, Rosaria Lombardo, Jose G. Clavel

Publisher: Springer Nature Singapore

Book series: Behaviormetrics: Quantitative Approaches to Human Behavior

About this book

This collection of essays honours Shizuhiko Nishisato on his 88th birthday and consists of invited contributions only. The book contains essays on the analysis of categorical data, including quantification theory, cluster analysis, and other areas of multidimensional data analysis, and covers more than half a century of research by its 41 interdisciplinary and international contributors. It thus offers the wisdom and experience of work past and present and attracts a new generation of researchers to the field. Central to this wisdom and experience is that of Prof. Nishisato, who has spent much of the past 60 years mentoring and providing leadership in the research of quantification theory, especially that of “dual scaling”. The book includes contributions by leading researchers who have worked alongside Prof. Nishisato, published with him, been mentored by him, or whose work has been influenced by the research he has undertaken over his illustrious career. This book aims to inspire researchers young and old as it highlights the significant contributions, past and present, that Prof. Nishisato has made to his field.

Table of contents

Frontmatter

Data Theory

Frontmatter
Gratitude: A Life Relived
Abstract
My life has been filled with many events: a happy childhood, World War II, a big earthquake, university education in Japan and the USA, and jobs in Canada. The biography unfolds with many photos of international conferences, mentors, and friends. This chapter reveals how blessed the author’s life has been with an incredibly wide human network, and ends with heartfelt thanks to countless mentors, friends, and family members. Published books and selected research papers are also listed.
Shizuhiko Nishisato
Nishisato’s Psychometric World
Abstract
During his long career, Shizuhiko Nishisato has touched the lives of his many friends and colleagues in data analysis. Their portraits in this paper illustrate the extent of this influence and serve as evidence and as a tribute.
Pieter M. Kroonenberg
My Recollections of People in the World of Data Science
Abstract
Episodes are recounted of people in Japan and abroad who have involved themselves in the development of multivariate methods for categorical data and related topics. Several stories about their families are also mentioned.
Shuichi Iwatsubo

On Association and Scaling Issues

Frontmatter
A Straightforward Approach to Chi-Squared Analysis of Associations in Contingency Tables
Abstract
In contrast to the conventional wisdom that Pearson’s chi-squared for a contingency table is a criterion of statistical independence rather than a measure of association, this paper establishes an operational meaning of Pearson’s chi-squared as a measure of association. Its normalised version, phi-squared, is the average change of the probability of a category of one feature when a category of the other feature becomes known. Associations between individual categories are captured with Quetelet indexes. This allows for an operational interpretation of associations between individual categories, which is illustrated with a number of examples from the literature.
Boris Mirkin
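As a companion to this abstract, here is a minimal Python sketch of the two quantities it describes: the Quetelet index between individual categories and phi-squared as a probability-weighted average of those indexes. The table values are hypothetical; only the relations \(q_{ij} = p_{ij}/(p_{i+}p_{+j}) - 1\) and \(\phi^{2} = \chi^{2}/n\) are taken as given.

```python
import numpy as np

# Hypothetical 2x3 contingency table of counts (rows: categories of one
# feature, columns: categories of the other).
F = np.array([[30.0, 10.0, 10.0],
              [10.0, 20.0, 20.0]])
n = F.sum()

P = F / n                            # joint proportions p_ij
pr = P.sum(axis=1, keepdims=True)    # row marginals p_i+
pc = P.sum(axis=0, keepdims=True)    # column marginals p_+j

# Quetelet index: relative change of the probability of one category
# when the paired category of the other feature becomes known.
Q = P / (pr * pc) - 1.0

# Phi-squared is the P-weighted average of the Quetelet indexes and equals
# Pearson's chi-squared divided by the sample size.
phi2 = (P * Q).sum()
chi2 = n * phi2
print(np.round(Q, 3), round(phi2, 4), round(chi2, 2))
```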
Contrasts for Neyman’s Modified Chi-Square Statistic in One-Way Contingency Tables
Abstract
Pearson’s chi-square goodness-of-fit statistic is very popular in the analysis of contingency tables. Loisel and Takane (Behaviormetrika 50:335–360, 2022), however, argued against its use in multiple comparisons on the ground that in this statistic, hypothesised mean and variance-covariance structures of observed frequencies (proportions) are closely linked, so that rejecting the former necessarily implies rejecting the latter as well. To avoid this undesirable situation, they advocated the use of Neyman’s modified chi-square statistic, in which mean and variance-covariance structures are separate entities. They developed a theory of contrasts specifically tailored to Neyman’s statistic, focussing on the tests of (part) interaction effects in two-way and higher-order contingency tables. In this paper, we elaborate on tests for one-way tables or for one-way marginal tables derived from higher-order tables. These cases are separately treated here, since they present a special need that does not arise in the analysis of higher-order tables.
Yoshio Takane, Sébastien Loisel
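For reference, a minimal Python sketch, on hypothetical one-way counts, contrasting the two statistics the abstract compares: Pearson’s statistic divides the squared deviations by the expected frequencies, whereas Neyman’s modified statistic divides them by the observed frequencies. The contrast machinery developed in the paper is not reproduced here.

```python
import numpy as np

observed = np.array([18.0, 30.0, 52.0])                 # hypothetical one-way counts
expected = observed.sum() * np.array([0.2, 0.3, 0.5])   # hypothesised proportions

pearson = ((observed - expected) ** 2 / expected).sum()   # Pearson's chi-square
neyman = ((observed - expected) ** 2 / observed).sum()    # Neyman's modified chi-square
print(round(pearson, 3), round(neyman, 3))
```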
From DUAL3 to dualScale: Implementing Nishisato’s Dual Scaling
Abstract
This paper presents the R package dualScale for dual scaling. The dualScale package is a substitute for the former DUAL3 Fortran programme (Nishisato and Nishisato 1994) and spreads the dual scaling methodology to new users. Data from contingency tables, multiple-choice questions, rank orders, etc. can be analysed with this R package. For each type of data, descriptions of functions, output summaries and graphical displays are presented, with options for outputs and analysis. Examples that have been studied by Nishisato are used to illustrate some of the details of the analysis and the interpretations that come from the outputs.
Jose G. Clavel, Roberto de la Banda
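The package’s own function names and arguments are not reproduced here. As a minimal, hedged sketch of the computation dual scaling performs on a contingency table (using the fact, noted elsewhere in this volume, that dual scaling and correspondence analysis are mathematically equivalent for such tables), the optimal row and column scores can be obtained from an SVD of the standardised residual matrix:

```python
import numpy as np

def dual_scale_ct(F, n_dims=1):
    """Dual scaling of a two-way contingency table F of counts (sketch only).

    For contingency tables, dual scaling is mathematically equivalent to
    correspondence analysis, so the scores come from an SVD of the
    standardised residuals in the chi-squared metric.
    """
    P = F / F.sum()
    r = P.sum(axis=1)                                   # row masses
    c = P.sum(axis=0)                                   # column masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))  # standardised residuals
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    row_scores = U[:, :n_dims] / np.sqrt(r)[:, None]    # standard coordinates
    col_scores = Vt.T[:, :n_dims] / np.sqrt(c)[:, None]
    return row_scores, col_scores, sv[:n_dims]          # sv: singular values

# Hypothetical 3x2 example table
F = np.array([[20.0, 5.0], [10.0, 25.0], [5.0, 15.0]])
rows, cols, rho = dual_scale_ct(F)
```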
Confounding, a Nuisance Addressed
Abstract
A method is presented that avoids spurious side effects of aggregation in the analysis of contingency tables, especially the side effect of confounding. Aggregation, used to isolate the effect of a predictor variable on the dependent variable, leads to confounding if the contingency table is non-orthogonal. The proposed method therefore relies on the loss of information incurred when the correlation between two variables with \(m\) and \(n\) categories is eliminated by replacing the two variables with a composite variable with \(m\times n\) categories. The effect size of the proposed new method of analysis is based on this loss of information, and it is guaranteed to be free of confounding effects.
Helmut Vorkauf
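A minimal sketch of the composite-variable construction the abstract describes, with an entirely hypothetical data frame (the column names treatment, clinic and recovered are illustrative): two predictors with \(m\) and \(n\) categories are replaced by one variable with \(m \times n\) categories, so their mutual correlation can no longer confound the aggregated analysis.

```python
import pandas as pd

# Hypothetical data: two categorical predictors and a binary outcome.
df = pd.DataFrame({
    "treatment": ["A", "A", "B", "B", "A", "B"],
    "clinic":    ["X", "Y", "X", "Y", "Y", "X"],
    "recovered": [1, 0, 1, 1, 0, 1],
})

# Replace the two predictors (m and n categories) by one composite
# variable with m*n categories before aggregating.
df["composite"] = df["treatment"] + ":" + df["clinic"]
table = pd.crosstab(df["composite"], df["recovered"])
print(table)
```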
Correcting for Context Effects in Ratings
Abstract
In Nishisato’s dual-scaling framework, the analysis of rating data involves a specific type of data transformation that leads to rank order data. This transformation is referred to as successive categories and involves the creation of items corresponding to boundaries between the values of the rating scale. The resulting dual-scaling values for these boundaries can be used to quantify differences in respondents’ scale use. In this chapter, we show how a particular method, inspired by the work of Nishisato, can be used to de-bias observed ratings that have been subject to range and frequency manipulations.
Michel van de Velden, Ulf Böckenholt
Old and New Perspectives on Optimal Scaling
Abstract
Processing qualitative variables with a very large number of modalities in machine learning offers an opportunity to revisit the theory of optimal scaling and its applications. This revisitation starts with the pioneers of scaling in statistics, psychometrics and psychology before moving on to more contemporary treatments of scaling that fall within the realm of machine learning and neural networks.
Hervé Abdi, Agostino Di Ciaccio, Gilbert Saporta
Marketing Data Analysis by the Dual Scaling Approach: An Update and a New Application
Abstract
This paper updates a 1988 review of marketing data analysis by dual scaling. Since then, the number of applications has grown considerably. However, the spread is still low compared to, for example, conjoint analysis. On the other hand, recent progress in data collection, methodology, and related dual scaling software packages creates new opportunities. The ability to analyse complex and varied data (answers to open questions, associations, cross-tabulations, discrete choices, preferences, ratings) could be a decisive advantage and is demonstrated by a new large-scale marketing application. A sample of online shop customers (\(n = 4411\)) was asked to rank-order sustainable improvement options. Dual scaling helps managers and decision-makers to focus on preferred improvements.
Daniel Baier, Wolfgang Gaul
Power Transformations and Reciprocal Averaging
Abstract
This paper is concerned with the scaling of the row and column categories of a two-way contingency table when applying a power transformation to the elements of the table’s profiles; power transformations have been widely discussed in the correspondence analysis literature. We adopt the method of reciprocal averaging to produce a one-dimensional set of row and column scores that can be used to apply a correspondence analysis to the table. We also show how this scoring procedure can be extended to obtain multi-dimensional sets of scores using the singular value decomposition. Finally, we examine the application of these methods using the 1981 asbestos data of Irving Selikoff. Furthermore, we briefly discuss this scoring approach in terms of the dual scaling approach described in detail by Shizuhiko Nishisato and show how the extended power transformation approach fits within the scoring procedures he has discussed over the years.
Eric J. Beh, Rosaria Lombardo, Ting-Wu Wang
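Since the paper builds on reciprocal averaging, a minimal sketch of the classical (untransformed) iteration may help: row and column scores are alternately set to weighted averages of each other until they converge to the first non-trivial correspondence analysis axis. The power-transformed variant studied in the paper would apply such an iteration to power-transformed profile elements; that detail is the paper’s contribution and is not reproduced here.

```python
import numpy as np

def reciprocal_averaging(F, tol=1e-10, max_iter=1000):
    """Classical one-dimensional reciprocal averaging for a two-way table F."""
    F = np.asarray(F, dtype=float)
    r, c = F.sum(axis=1), F.sum(axis=0)
    x = np.random.default_rng(0).standard_normal(F.shape[0])  # initial row scores
    for _ in range(max_iter):
        y = F.T @ x / c                  # column scores: profile-weighted averages
        x_new = F @ y / r                # row scores: profile-weighted averages
        x_new -= (r @ x_new) / r.sum()   # centre: removes the trivial constant axis
        x_new /= np.sqrt((r * x_new**2).sum() / r.sum())   # restandardise
        converged = np.max(np.abs(x_new - x)) < tol
        x = x_new
        if converged:
            break
    return x, F.T @ x / c                # row scores and matching column scores
```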
Dual Scaling of Rating Data
Abstract
When applied to contingency tables, dual scaling and correspondence analysis are mathematically equivalent methods. For the analysis of rating data, however, the methods differ. To a large extent, this is due to differences in the pre-processing of the data. In particular, in dual scaling, ratings are transformed either to rank order or to successive category data before a customised dual scaling approach is applied. In correspondence analysis, on the other hand, a so-called doubling of the original ratings is applied before applying the usual correspondence analysis formulas. In this paper, we consider these differences in detail. We propose a dual scaling variant that can be applied directly to the ratings, and we compare theoretical as well as practical properties of the different approaches.
Michel van de Velden, Patrick J. F. Groenen
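For readers unfamiliar with doubling, here is a minimal sketch of the standard coding step the abstract refers to (the scale endpoints lo and hi are parameters; the 1-to-7 example data are hypothetical): each rating column is split into a positive and a negative pole so that every row has a constant total and the ordinary correspondence analysis formulas apply.

```python
import numpy as np

def double_ratings(R, lo=1, hi=7):
    """Doubling of a ratings matrix for correspondence analysis (sketch).

    Each rating r becomes a positive pole (r - lo) and a negative pole
    (hi - r); the two poles of an item always sum to (hi - lo), so every
    respondent row has the same total.
    """
    R = np.asarray(R, dtype=float)
    pos = R - lo
    neg = hi - R
    return np.hstack([pos, neg])   # columns: item+ poles, then item- poles

# Hypothetical 3 respondents x 2 items rated on a 1-7 scale
R = np.array([[7, 2], [4, 4], [1, 6]])
D = double_ratings(R)
```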
Whence Principal Components?
Abstract
The historical roots of principal component analysis are traced from its beginnings under Harold Hotelling in the 1930s to modifications and developments that were made shortly thereafter by several prominent psychometricians, such as Truman Lee Kelley and L. L. Thurstone. The emphasis in the present historical survey is on the origins of principal components per se. Other papers in this Festschrift for Shizuhiko Nishisato are expected to concentrate on the computational engine behind principal components as used for a number of areas in categorical data analysis.
Lawrence Hubert, Susu Zhang
The Emergence of Joint Scales in the Social and Behavioural Sciences: Cumulative Guttman Scaling and Single-Peaked Coombs Scaling
Abstract
Two approaches to psychological scaling evolved in the nineteenth century. The subject-centred approach quantifies individual differences in ability by standardised person scores. It started with Francis Galton’s invention of the statistical scale, the basis of classic mental and educational testing. The stimulus-centred approach provides psychological scale values of stimuli by using just noticeable differences on a physical scale. It was the basis of psychophysics, brainchild of Gustav Fechner, who developed classic analysis methods for experimental psychology. The main part of the paper concerns a third approach, developed in the period 1941–1964. It combines person scores and stimulus scale values in a joint scale; main initiators were Louis Guttman and Clyde Coombs. Both wanted to work with minimal assumptions about measurement level and score distributions. Guttman’s least squares quantification method is compared with his scalogram method, which was more popular in the social and behavioural sciences. We critically discuss how Coombs developed his unfolding technique for scales with ordered metric measurement level. Finally, we bridge the gap between these two pioneers of one-dimensional non-metric scaling by demonstrating that Coombs scales for paired comparisons or rank orders may be obtained by least squares Guttman scaling supplemented with fitting an additive model.
Willem J. Heiser, Jacqueline J. Meulman
A Probabilistic Unfolding Distance Model with the Variability in Objects
Abstract
Multidimensional unfolding models have been applied to several data types, for example, 2-mode 2-way proximity data. Extensions of these models to multi-way data have also been proposed. When analysing preference data, a particular case of proximity data, we must either exclude errors in the data or extend the model to treat them. We propose a distance model, a type of probabilistic distance model, for treating the errors in the data. A small simulation study was conducted, and an application to an actual data set is also shown.
Tadashi Imaizumi
Analysis of Contingency Table by Two-Mode Two-Way Multidimensional Scaling with Bayesian Estimation
Abstract
Visualisation methods for contingency tables, such as correspondence analysis and dual scaling, are widely used in many research fields. These methods are particularly useful for analysing data on human behaviour, which often involve many qualitative variables. One method for visualising contingency tables is based on a log-linear model and multidimensional scaling. The advantage of this method is that the distances between categories of the row and column variables are directly modelled. However, the distances between categories within the same variable are not modelled. In this paper, we propose a visualisation method for contingency tables based on a log-linear model and multidimensional scaling with Bayesian estimation. Using Bayesian estimation, prior knowledge of the distances between categories within the same variable is incorporated. Moreover, if the distance between categories of the same variable is treated as missing, we can impute the missing value in the Bayesian estimation framework. To investigate the performance of the proposed method, we conducted a numerical experiment. The results indicated that the proposed method had the best recovery of distances. In addition, we applied the proposed method to real data and obtained reasonable coordinate vectors.
Jun Tsuchida, Hiroshi Yadohisa

On Correspondence Analysis and Related Methods

Frontmatter
What’s in a Name? Correspondence Analysis … Dual Scaling … Quantification Method III … Homogeneity Analysis …
Abstract
This is an essay about nomenclature in statistics, in particular around the theme of dual scaling, a term invented by Shizuhiko Nishisato. The “branding” of statistical methods is examined, as well as the way equivalent terms compete for the attention of “consumers”, with some being adopted more easily and becoming more popular. While some terms are invented with no clear substantive meaning per se, others do convey meaning and are useful for statistical practice and clarity of exposition.
Michael Greenacre
History of Homogeneity Analysis Based on Co-Citations
Abstract
This contribution is a slightly edited text of a 14-page section of my non-digital Ph.D. thesis at Leiden University; see van Rijckevorsel (1987, pp. 41–55). The text has been digitised and has undergone minimal changes. It describes the analysis of co-citations in early papers on correspondence analysis (CA) also known as dual scaling. The prehistory of this technique roughly spans the period between 1930 and 1970. The citations are collected in a symmetric matrix of co-citations of articles by authors known at the time in the field of CA, even if it was not called that. The cells of the co-citation matrix contain the number of citations two authors have in common. The resulting list of authors from 1987 largely corresponds to the list that Shizuhiko Nishisato mentions in his 2021 retrospective; see Nishisato et al. (2021, p. 9). This convergence of selection and the objective nature of co-citations lend credence to the resulting conclusions.
Jan L. A. van Rijckevorsel
Low Lexical Frequencies in Textual Data Analysis
Abstract
The description of lexical tables (cross-tabulating vocabulary and texts) is commonly performed through correspondence analysis (CA) and is often supplemented by clustering and/or additive trees. In many cases, however, a distance matrix, based more simply on the presence or absence of words in texts (and closely related to the \(\phi \) coefficient of Pearson-Yule) can provide more meaningful visualisations. That matrix, easily derived from the correlation matrix of binary variables (presence-absence) is involved in the popular principal components analysis (PCA). After a review of the problems entailed in textual data analysis and information retrieval when dealing with low frequencies (and high discrepancies of frequencies), we show how the use of binary coding of lexical tables enriches and supplements other descriptive approaches.
Ludovic Lebart
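A minimal sketch, with a toy presence/absence matrix, of the objects the abstract mentions: for 0/1 variables the Pearson correlation is the \(\phi \) coefficient, and a distance matrix can be derived from the correlation matrix. The choice \(d = \sqrt{2(1-r)}\) used below is one common convention and is an assumption here, not necessarily the paper’s exact construction.

```python
import numpy as np

# Hypothetical lexical table in binary coding: texts x words,
# 1 = word present in the text, 0 = absent.
X = np.array([[1, 0, 1, 1],
              [1, 1, 0, 0],
              [0, 1, 1, 0]], dtype=float)

# Pearson correlation of 0/1 variables is the phi coefficient of
# Pearson-Yule; PCA operates on this correlation matrix.
R = np.corrcoef(X, rowvar=False)

# One common correlation-based distance between words.
D = np.sqrt(2.0 * (1.0 - R))
```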
Correspondence Analysis with Pre-Specified Marginals and Goodman’s Marginal-Free Correspondence Analysis
Abstract
Goodman (1996, JASA 91, 408–428) introduced marginal-free correspondence analysis where his principal aim was to reconcile Pearson’s correlation measure with Yule’s association measure in the analysis of contingency tables. We show that marginal-free correspondence analysis is a particular case of correspondence analysis with pre-specified marginals studied in the beginning of the 1980s by Benzécri and his students. Furthermore, we review the relationship between correspondence analysis and the RC association models.
Vartan Choulakian, Smail Mahdi
Group and Time Differences in Repeatedly Measured Binary Symptom Indicators: Matched Correspondence Analysis
Abstract
Examining group and time differences in binary indicators becomes complicated when two groups are repeatedly measured with interrelated binary indicators at admission and discharge. Matched correspondence analysis (CA) is introduced to study group and time differences with minimal statistical complexity. To demonstrate its application, patients with anorexia and bulimia who are repeatedly measured at admission and discharge with interrelated binary psychiatric symptom indicators are analysed. Using matched CA, dimensions are identified from the binary indicators based on group (anorexia versus bulimia) and time (discharge versus admission) differences. The statistical stability of the dimensions is then evaluated with a permutation test at the 0.01 significance level. The results show that two dimensions of group difference and one dimension of time difference are statistically stable, and their coordinates are used to demonstrate group and time differences, respectively. All six symptom indicators show that anorexia patients had more severe symptoms than bulimia patients. The time differences are interpreted as treatment efficacy, but they are minimal: only one of the six symptoms, “Depression Not Otherwise Specified”, improves after treatment for anorexic patients. A biplot with the two statistically stable group-difference dimensions is constructed to estimate the correlation between age groups and symptom indicators in order to interpret their diagnostic relationship. Anorexic teens and young adults have stronger relationships with the symptom indicators than bulimic teens and young adults. The benefits and drawbacks of using matched CA of repeated measures to study group and time differences are discussed.
Se-Kang Kim
Trust of Nations: Represented by Hayashi’s Quantification Method III
Abstract
Hayashi’s Quantification Method III (QMIII) and Nishisato’s dual scaling, although developed independently in their respective fields, are closely related mathematically. In this paper, some applications of QMIII are presented in the context of our cross-national comparative survey, which has been conducted by the Institute of Statistical Mathematics for more than half a century. In particular, we focus on data related to people’s sense of trust. We hope that this will provide some basic information for evidence-based policymaking for the development of peace and prosperity in the world; see Note 1 in the Appendix.
Ryozo Yoshino
Deconstructing Multiple Correspondence Analysis
Abstract
This paper has two parts. In the first part we review the history of Multiple Correspondence Analysis (MCA) and Reciprocal Averaging Analysis (RAA). Specifically we comment on the 1950s exchange between Cyril Burt and Louis Guttman about MCA, and the distinction between scale analysis and factor analysis. In the second part of the paper we construct an MCA alternative, called Deconstructed Multiple Correspondence Analysis (DMCA), which is useful in the discussion of “dimensionality”, “variance explained”, and the “Guttman effect”, concepts that were important in the history covered in the first part.
Jan de Leeuw
Generalised Canonical Correlation and Multiple Correspondence Analyses Reformulated as Matrix Factorisation
Abstract
Generalised canonical correlation analysis (GCCA) can be formulated as a least squares problem that can be called a homogeneity problem. The purpose of this paper is to show that GCCA can be reformulated as two other types of least squares problems, the full matrix factorisation (FMF) and reduced matrix factorisation (RMF) problems, by proving the equivalence of the solutions of the homogeneity, FMF, and RMF problems. Here, the naming of FMF and RMF follows from the number of columns in a factored matrix in RMF being less than that in FMF. We also show how goodness-of-fit indices and their behaviours differ among the three problems in spite of their giving an identical solution. Parallel discussions are made for multiple correspondence analysis, which is closely related to GCCA.
Kohei Adachi, Henk A. L. Kiers, Takashi Murakami, Jos M. F. ten Berge
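As background for the reformulation, a minimal sketch of the homogeneity problem the paper starts from: find a common score matrix Z and weight matrices W_k minimising \(\sum_k \lVert Z - X_k W_k \rVert^2\) with Z constrained to have orthonormal columns. The plain Z'Z = I constraint, the naive alternating least squares, and the random data are assumptions of this sketch; the paper’s exact normalisations may differ.

```python
import numpy as np

def gcca_homogeneity(Xs, p=2, iters=200, seed=0):
    """Naive ALS for min_{Z, W_k} sum_k ||Z - X_k W_k||^2 with Z'Z = I (sketch)."""
    n = Xs[0].shape[0]
    rng = np.random.default_rng(seed)
    Z = np.linalg.qr(rng.standard_normal((n, p)))[0]   # orthonormal start
    for _ in range(iters):
        # W_k given Z: ordinary least squares for each data block.
        Ws = [np.linalg.lstsq(X, Z, rcond=None)[0] for X in Xs]
        # Z given W_k: closest orthonormal matrix to the average fit
        # (orthogonal Procrustes step via SVD).
        M = sum(X @ W for X, W in zip(Xs, Ws)) / len(Xs)
        U, _, Vt = np.linalg.svd(M, full_matrices=False)
        Z = U @ Vt
    return Z, Ws

# Hypothetical example: two data blocks observed on the same 50 cases.
rng = np.random.default_rng(1)
Xs = [rng.standard_normal((50, 3)), rng.standard_normal((50, 4))]
Z, Ws = gcca_homogeneity(Xs)
```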
High-Dimensional Mixed-Data Regression Modelling Using the Gifi System with the Genetic Algorithm and Information Complexity
Abstract
This paper presents high-dimensional mixed-data regression modelling using the Gifi (1990) system along with the genetic algorithm (GA). Information complexity (ICOMP) and AIC-type criteria are derived and provided as the fitness functions to choose the best subset of predictors. Statistical analysis and modelling of mixed data sets have long been a challenging task for practitioners and researchers. The usual classical statistical procedures for mixed data sets often fail to produce good results, since most of these procedures assume that the underlying data are purely continuous even in the presence of categorical variables in the model. In this paper, we introduce and use the Gifi (1990) system to transform the mixed data to a continuous space and then perform regression analysis and modelling, a problem currently faced in data science and machine learning. We illustrate the flexibility and utility of our proposed approach on two real mixed data sets, fitting a multiple regression model and choosing the best subset of predictors.
Suman Katragadda, Hamparsum Bozdogan

General Topics

Frontmatter
Complex Difference System Models for Asymmetric Interaction
Abstract
Complex difference system models for asymmetric interaction addressed here were first proposed by the author at “The International Conference on Measurement and Multivariate Analysis” held in May 2000 in Banff, Canada, which was organised by Shizuhiko Nishisato. In general, asymmetric interactions among members can be observed as a set of longitudinal asymmetric similarity matrices. Traditionally, these data have been analysed by various two-mode models. Although these models enable us to extract various structures underlying the asymmetric interactions among members, we are unable to extract the dynamics underlying these interactions. Complex difference system models discussed in this paper enable us to describe curious dynamics of these interactions among members.
Naohito Chino
Introduction to the “s-concordance” and “s-discordance” of a Class with a Collection of Classes
Abstract
Our aim is to introduce a new measure which expresses the “concordance” and “discordance” between a class and a collection of classes, denoted P, of a given population. We first define two basic functions \(f_{c}\) and \(g_{x}\), where \(f_{c} \left( x \right)\) expresses the fit of the representation \(x\) with \(c\), and \(g_{x} \left( {c,P} \right)\) expresses the proportion of classes \(c^{\prime}\) of \(P\) having a fit and a representation \(x^{\prime}\) to the class \(c^{\prime}\) close to that of \(c\). We show, for example, that by using dual scaling (Nishisato in Analysis of categorical data: dual scaling and its applications. University of Toronto Press, Toronto, 1980; Nishisato in Elements of dual scaling: an introduction to practical data analysis. Lawrence Erlbaum Associates, Hillsdale, NJ, 1994; Nishisato in Multidimensional nonlinear descriptive analysis. Chapman & Hall/CRC, Boca Raton, FL, 2014), from a table describing each European country by socio-demographic variables, we can obtain a ranking of all European countries from the higher to the lower concordance or discordance. Then, we give axiomatic definitions of s-concordance and s-discordance and examples of s-concordance and s-discordance families. We show that there exist useful links between concordances and copulas, and we give a useful general formulation of the classical likelihood function for the case where the underlying classes are given. In the case where \(P\) is unknown at the beginning, we give a general formulation of mixture decomposition, by the dynamic clustering method (DCM), taking into account the concordance or discordance and allowing us to construct \(P\) and the probability density representation of each of its classes. We finally give a way to visualise (in 2D or 3D) clusters of classes of \(P\) based on concordance.
Edwin Diday
Discrete Functional Data Analysis Based on Discrete Difference
Abstract
Functional data analysis was proposed by Ramsay and Silverman, and since then, many theories and methods have been developed. We previously proposed discrete functional data analysis (Mizuta 2006); in this paper, we refine it and provide a theoretical explanation of uncorrelated discrete differences. An effective approach in functional data analysis is the use of differentiation of continuous functions, especially expressing the data set in terms of differential equations, which allows us to explore the structure of the data set. The same approach for discrete functions requires the consideration of the discrete difference of discrete functions, which corresponds to the derivative of a continuous function. Problems arise when ordinary discrete differences are used for data analysis. We exhibit this problem and propose an improvement, the uncorrelated discrete difference. We also show how to use the uncorrelated discrete difference.
Masahiro Mizuta
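As context for the proposal, a minimal sketch of the ordinary first-order discrete difference, the operator that plays the role of the derivative for discrete functions; the paper’s “uncorrelated” refinement of it is specific to the paper and is not reproduced here.

```python
import numpy as np

def diff_matrix(n):
    """Ordinary first-order discrete difference as an (n-1) x n matrix.

    Applied to a discrete function sampled at n points, it returns the
    n-1 successive differences, the discrete analogue of the derivative.
    """
    D = np.zeros((n - 1, n))
    idx = np.arange(n - 1)
    D[idx, idx] = -1.0
    D[idx, idx + 1] = 1.0
    return D

f = np.array([1.0, 4.0, 9.0, 16.0])   # a discrete function on a grid
df = diff_matrix(len(f)) @ f           # its ordinary discrete difference
```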
Probability, Surprisal, and Information
Abstract
Performance tests and diagnostic scales are essential to modern societies and to the people who provide the data. Although statistical models for testing data have been researched for decades, it remains nearly universal that test and scale scores are counts of correct answers or sums of weights assigned a priori to question choice options. For several reasons, these sum scores are inefficient and misleading. Several modifications of psychometric testing theory are proposed that demonstrate large improvements in the quality of test scores and also reveal many details on the performance of test questions. Test taker performance is defined as a position on a one-dimensional manifold. Transforming probability into surprisal, or information, imbues these manifold positions with a rigorous metric whose unit is a generalisation of the bit. The estimation algorithm permits the analysis of data from thousands of test takers in a few minutes on consumer-level computing equipment using an easy-to-use programme, TestGardener, which is introduced in this paper.
James Ramsay
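A minimal sketch of the probability-to-surprisal transformation named in the title: surprisal is the negative logarithm of probability, and taking the logarithm in base \(M\) (base 2 giving the familiar bit) is one way to read the abstract’s “generalisation of the bit”. The base-\(M\) reading is an assumption of this sketch.

```python
import numpy as np

def surprisal(p, M=2):
    """Surprisal of probability p in base M.

    With M = 2 the unit is the bit; other bases M give units that
    generalise the bit (the choice of M here is an assumption, not
    taken from the paper).
    """
    p = np.asarray(p, dtype=float)
    return -np.log(p) / np.log(M)

# A probability of 0.25 carries 2 bits of surprisal.
print(surprisal(0.25))   # -> 2.0
```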
Metadata
Title
Analysis of Categorical Data from Historical Perspectives
Edited by
Eric J. Beh
Rosaria Lombardo
Jose G. Clavel
Copyright year
2023
Publisher
Springer Nature Singapore
Electronic ISBN
978-981-99-5329-5
Print ISBN
978-981-99-5328-8
DOI
https://doi.org/10.1007/978-981-99-5329-5
