ABSTRACT
Representative sampling is considered crucial for predominately quantitative, positivist research. Researchers typically argue that a sample is representative when items are selected randomly from a population. However, random sampling is rare in empirical software engineering research because there are no credible sampling frames (population lists) for the units of analysis software engineering researchers study (e.g., software projects, code libraries, developers). This means that most software engineering research does not support statistical generalization, but rejecting any particular study for lack of random sampling is capricious.
Index Terms
- There is no random sampling in software engineering research