Contrary to a common prejudice, especially in public opinion, political polls often predict election results correctly, even though they disregard many of the most basic statistical principles. This is a real puzzle, which we try to unravel by analyzing some recent elections (the 2013 national election and the 2014 European election). We conclude that, when conducting investigations in the human and social sciences, some statistical principles (in particular inference and sampling theory) should be (at least partially) abandoned in favor of other, more appropriate and innovative methodological principles.
Every year the electoral outcomes are in many cases significantly different from those estimated by earlier surveys. The most relevant reasons are linked to three peculiar elements: the shifting social desirability connected with the political offer; the increasing electoral volatility; and the technique used to «correct» the unweighted results of the surveys. In the last two Italian elections, the predictions for the two strongest parties have been widely erroneous: the Democratic Party (PD) and the Five Star Movement (M5S) were respectively over- and under-estimated in the 2013 Parliamentary Election and, on the contrary, under- and over-estimated in the following 2014 European Election. The paper analyses the reasons for this failure, underlining the crucial role of pollsters in their difficult effort to correct the raw survey data by means of the past trend linked to each political party.
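A minimal sketch of the kind of trend-based correction mentioned in this abstract. The party shares, past-trend figures and blending weight below are purely hypothetical, and the simple blending rule is an assumption for illustration, not the pollsters' actual procedure.

```python
# Hypothetical illustration: raw survey shares blended with each party's
# past electoral trend, then renormalised. All figures are invented.

raw_shares = {"PD": 0.33, "M5S": 0.22, "Other": 0.45}   # unweighted survey shares
past_trend = {"PD": 0.27, "M5S": 0.26, "Other": 0.47}   # shares suggested by past results

def correct_with_trend(raw, trend, alpha=0.6):
    """Blend raw shares with past-trend shares and renormalise to sum to 1."""
    blended = {p: alpha * raw[p] + (1 - alpha) * trend[p] for p in raw}
    total = sum(blended.values())
    return {p: round(v / total, 3) for p, v in blended.items()}

print(correct_with_trend(raw_shares, past_trend))
```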
«Every year the election results show us a picture of citizens' choices that is completely unexpected compared with the voting forecasts. The fault lies, at least in part, with the polls, which find it harder and harder to capture the mood of the population, and which therefore give us the chance to be astonished by voters' choices and to approach election evenings and nights with renewed interest, rather than as a mere ratification of what was already known. Surprise thus becomes the stylistic hallmark of election-night programmes. Or rather: surprises. Because exit polls usually differ somewhat from the opinion polls, which are officially frozen two weeks before the vote; the projections, in turn, when they are well done (as in the case of the last European elections), give us an even more varied picture, contradicting all the forecasts and the very exit polls that preceded them. So thank goodness for the polls, which allow us to keep our attention alive without taking anything for granted…» (From the opening of the article)
Problems in sample design tend to appear more and more often in Web surveys, because these surveys are frequently sold on the basis of price. The Web is just another tool for collecting data, and the accuracy of the sample design cannot be ignored. It is common to hide these sample problems through data weighting, ignoring the fact that weighting does not correct structural biases. The case history shows how to cope with situations of this kind, using a bootstrap approach to correct sample biases and to provide more robust estimates.
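A minimal sketch of a percentile bootstrap of this general kind, with simulated 0/1 responses and a simple proportion as the estimator; it illustrates the idea, not the specific procedure used in the case history.

```python
# Percentile bootstrap for a sample proportion; the data are simulated.

import random

random.seed(42)
sample = [random.random() < 0.35 for _ in range(500)]   # simulated 0/1 responses

def bootstrap_ci(data, n_boot=2000, alpha=0.05):
    """Point estimate and percentile bootstrap confidence interval for the mean."""
    n = len(data)
    means = []
    for _ in range(n_boot):
        resample = [data[random.randrange(n)] for _ in range(n)]
        means.append(sum(resample) / n)
    means.sort()
    lo = means[int(n_boot * alpha / 2)]
    hi = means[int(n_boot * (1 - alpha / 2))]
    return sum(data) / n, (lo, hi)

print(bootstrap_ci(sample))
```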
Telephone surveys are a cheap and quick method of collecting survey data. When sampling the general population, Telephone Directories may be used as sampling frames. When these Directories are incomplete, non-coverage errors may occur and survey data may be biased. In Italy, 50% of individuals are excluded from the Telephone Directories. Using data from the 2012 Multipurpose survey, this paper explores the relationship between telephone under-coverage and data quality; in particular, it assesses the impact of telephone under-coverage on bias for a set of 32 survey items and explores the effectiveness of post-stratification strategies. We find that the exclusion of cell-only and unlisted individuals is a major source of bias, and that the post-stratification strategies currently in place are ineffective in removing it. We also find that the nature of the relationship between under-coverage and bias is complicated: it varies between and within topics.
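A minimal sketch of post-stratification weighting as discussed here: respondents are re-weighted so that the weighted distribution of a stratifying variable (age group, in this illustration) matches known population shares. The groups, shares and responses are invented; the actual adjustment applied to the Multipurpose survey data is more elaborate.

```python
# Post-stratification on a single variable; all figures are invented.

from collections import Counter

respondents = [
    {"age_group": "18-39", "y": 1}, {"age_group": "18-39", "y": 0},
    {"age_group": "40-64", "y": 1}, {"age_group": "40-64", "y": 1},
    {"age_group": "65+",   "y": 0}, {"age_group": "65+",   "y": 1},
]
population_shares = {"18-39": 0.35, "40-64": 0.40, "65+": 0.25}

# weight = population share / sample share, per stratum
counts = Counter(r["age_group"] for r in respondents)
n = len(respondents)
weights = {g: population_shares[g] / (counts[g] / n) for g in counts}

# post-stratified (weighted) estimate of the mean of y
estimate = sum(weights[r["age_group"]] * r["y"] for r in respondents) / sum(
    weights[r["age_group"]] for r in respondents
)
print(round(estimate, 3))
```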
The paper aims to discuss, through two examples, how social researchers can reduce the occurrence of the non-response bias that, if not adequately considered, can undermine the possibility of estimating population parameters even when the research design is based on probability sampling. Based on an ex-post test, the first example shows why it is relevant for sociologists, too, to adopt probability sampling. The second example discusses the pros and cons of the Gallup data-collection method applied in the last survey of the EVS (European Values Study), in which it is not allowed to substitute people who refuse to be interviewed or who are impeded for some reason. In conclusion, a proposal to improve the Gallup method in the EVS is discussed.
Most large-scale secondary data sets from international social surveys (e.g., the European Values Study) are constructed using complex survey sample designs, in which the population of interest is stratified on a number of dimensions and oversampled within certain strata. Moreover, these complex samples are collected by means other than simple random sampling: usually cluster sampling. The oversampling and the clustering must be addressed when analyzing complex sample data. This article shows, using real examples from international surveys, that ignoring the oversampling and clustering of observations may result in underestimated standard errors and biased parameter estimates, and it offers some practical tricks that researchers can use to address these problems and make the most of standard statistical software.
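A minimal sketch of why clustering matters for standard errors: the naive standard error computed as if observations were a simple random sample is compared with a simple between-PSU standard error. The data and cluster structure are simulated, and the between-PSU estimator stands in for the design-based estimators available in standard statistical software.

```python
# Naive SRS standard error vs. a simple cluster-based standard error.

import random
import statistics

random.seed(1)
clusters = []
for _ in range(20):                        # 20 primary sampling units (PSUs)
    base = random.gauss(0.5, 0.1)          # shared cluster-level effect
    clusters.append([min(max(base + random.gauss(0, 0.05), 0), 1)
                     for _ in range(25)])  # 25 respondents per PSU

all_obs = [y for cl in clusters for y in cl]
n = len(all_obs)

# naive SE, as if observations were independent draws
srs_se = statistics.stdev(all_obs) / n ** 0.5

# cluster-based SE: treat cluster means as the units of analysis
cluster_means = [statistics.mean(cl) for cl in clusters]
cluster_se = statistics.stdev(cluster_means) / len(cluster_means) ** 0.5

print(f"SRS SE: {srs_se:.4f}  cluster SE: {cluster_se:.4f}")
```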
The aim of this paper is to suggest how the social sciences may face the challenges and opportunities that arise from the spread of Big Data with regard to their applicability in social research. Even if Computational Social Sciences are more and more attractive for sociologists, there is an urgent need for wider critical reflection on the epistemological and methodological implications of their use in investigating social phenomena. The article begins by reviewing relevant contemporary debates about the nature and features of Big Data, and then critically explores four questions: the representativeness, validity, reliability and ethics of research based on Big Data analysis. In conclusion, it is suggested that even if the rise of Big Data offers new opportunities for sociology, especially in terms of the possibility of probing hidden populations and collecting data very quickly, the new methods cannot, at present, bring about the decline of traditional methods.