Date: | Wednesday, April 2, 2003
---|---
Time: | 3:30pm
Address: | Burke Science Building
Room: | 138
In this presentation we focus on one extreme form of fraud: data fabrication. Our main objective is to find out how closely fabricated data can reproduce the correlation structure of real data. To investigate the correlation structures of fabricated data sets, the summary statistics of two real data sets were shown to faculty members at two medical schools, who were asked to make up similar data sets of their own. In the first example we considered two variables that were highly correlated: the height and weight of 65 female students aged 19-22 (r = 0.43). The correlation coefficients of the 34 made-up data sets ranged from -0.097 to 0.996, and most participants produced a correlation coefficient greater than that of the real data set. In the second example we considered two variables that were not correlated: the gestational age (GA) and weight of 637 newborn boys (r = 0.031). Each participant was asked to write down GA and weight for 40 babies within the ranges of the real data set. The correlations between GA and weight in the 34 fabricated data sets ranged from -0.36 to 0.98. In conclusion, made-up data sets yield considerably higher correlation coefficients than the corresponding real data sets.
This is joint work with Mahshid Dehghan-Kooshkghazi.
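The comparison described in the abstract, computing a correlation coefficient for each made-up data set and checking it against the real one, can be sketched as follows. This is only an illustrative sketch, not the authors' analysis: it assumes the coefficient reported is the sample Pearson correlation, the helper names `pearson_r` and `count_exceeding` are hypothetical, and the toy values are invented placeholders rather than data from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / sqrt(sxx * syy)


def count_exceeding(datasets, real_r):
    """Count the data sets whose correlation is greater than real_r."""
    return sum(1 for xs, ys in datasets if pearson_r(xs, ys) > real_r)


# Invented placeholder values for illustration only (NOT data from the study).
toy_sets = [
    ([150, 160, 170, 180], [50, 55, 60, 65]),  # perfectly linear, r = 1
    ([150, 160, 170, 180], [60, 50, 65, 55]),  # uncorrelated, r = 0
]

# 0.43 is the real height/weight correlation quoted in the abstract.
print(count_exceeding(toy_sets, real_r=0.43))  # prints 1
```

With the 34 collected data sets in place of `toy_sets`, the same count would give the number of participants whose made-up data were more strongly correlated than the real data.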