IZA DP No. 969: Identification, Characteristics and Impact of Faked Interviews in Surveys: An Analysis by Means of Genuine Fakes in the Raw Data of SOEP
published in: Allgemeines Statistisches Archiv, 2005, 89 (1), 7-20
To the best of our knowledge, most of the few methodological studies that analyze the impact of faked interviews on survey results are based on "artificial fakes" generated by project students in a "laboratory environment". In contrast, panel data provide a unique opportunity to identify data that were actually faked by interviewers: by comparing data from two waves, almost all fakes are easily identifiable. Because the German Socio-Economic Panel Study (SOEP) is built on several sub-samples, its raw data provide a rich source of faked interviews. However, because interviewers know that panel respondents will be interviewed again over the course of time, clever interviewers will not fake panel interviews. In fact, in the raw data of SOEP the share of fakes is only about 0.5 percent of all records. These fakes are used to analyze the potential impact of undetected fakes on survey results. The major result is that the faked records have no impact on means and proportions. In very rare, exceptional cases, however, estimates of correlations and regression coefficients may be biased if fakes are not detected. One should note that, except for some fakes in the first two waves of sample E, faked data were never disseminated within the widely used SOEP. The fakes were detected before the data were released.