This week's revision of government estimates of the number of foreign workers in Britain has brought into sharp focus the whole business of collecting statistics. Critics assume it is a simple matter, and that the failure to get it right is either, charitably, a matter of ministerial incompetence or, conspiratorially, a sign of deliberate obfuscation. Having seen at first hand the trouble with government statistics, I know it is neither.
There are three main types of statistic collected by the government's fiercely independent statisticians (the idea that these people have any political axe to grind is laughable). The first - the one that caused all the problems this week - is the one based on a sample. Good examples are the Labour Force Survey, which estimates activities for the entire adult population based on 59,000 households, and, in education, the Youth Cohort Study, which looked at the activities of 1.9 million 16-19 year-olds based on samples of around 9,000 in each age group. The second is a census - a collection of information on every pupil in every school or every member of the population. And the third is an estimate derived from that census based on certain assumptions, as in the revised population estimates used to determine local government funding allocations, the subject of today's LGA complaints.
Of course, the second should be the most accurate: we know how many pupils are in each school, and about their exam achievement, gender, ethnicity and so on. Indeed it is the richness of data now available in education - and in health - that makes reform easier there (contrary to the claims of the ideological opponents of targets in education, the sort of floor targets set out by Gordon Brown yesterday have already been remarkably successful, if grossly under-reported). But the Labour Force Survey, like an opinion poll, is more reliable when it deals with large numbers relating to the population as a whole, such as the numbers in jobs or unemployed, than when it tries to measure a subset that might represent 2 or 3% of the workforce, as with some migrant communities.
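For anyone who wants to see why small groups are so much harder to measure, here is a rough sketch in Python. It rests on assumptions that are mine, not the statisticians': it treats the survey as a simple random sample, which is a simplification of how any real household survey is designed, and it counts each of the 59,000 households as a single respondent. The function names and the 60% and 2.5% shares are purely illustrative.

```python
import math

# Illustrative only: assumes simple random sampling and one respondent
# per household. Real surveys use more complex designs and weighting.
SAMPLE_SIZE = 59_000


def margin_of_error(share: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for an estimated share of the sample."""
    return z * math.sqrt(share * (1 - share) / n)


def relative_margin(share: float, n: int) -> float:
    """Margin of error expressed as a fraction of the estimate itself."""
    return margin_of_error(share, n) / share


if __name__ == "__main__":
    # A large group (hypothetical 60% share) against a small one (2.5%).
    for label, share in [("large group", 0.60), ("small subset", 0.025)]:
        rel = relative_margin(share, SAMPLE_SIZE)
        print(f"{label}: share {share:.1%}, relative margin about {rel:.1%}")
```

Under these assumptions the uncertainty around the small subset, measured against the size of the estimate itself, comes out several times larger than the uncertainty around the large group, even though both come from exactly the same survey.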
Indeed, the only way to get a better sense of those labour force figures without a great extra burden on business would be through a national ID register (though the Tory critics who shout loudest about these figures want nothing to do with that). As for the estimates, the LGA are right to suggest that the richer education and NHS data should be pooled with other information to get a better sense of who is living where. But even then it will never be 100% accurate. That is because of the nature of the statistics, not because of ineptitude or distortion.