A best guess can in fact be very good indeed, if the data it is based on is good and the sample size is large enough. It is called, in technical language, an estimate; and almost the whole business of statistics is to work out how good an estimate or a best guess is – to say how great or small its uncertainty and possible or probable error.
Leaving aside that example of Parliamentary language failure, the Public Administration Committee has said the migration statistics are "not fit for purpose" and do not accurately assess how many non-UK residents are entering and leaving the country.
They say that the figures are based on too small a sample – 5000 out of the official estimate of 163,000 migrants – and that there is too great an uncertainty in the estimates. ‘For net migration … the range is currently around plus or minus 35,000, which means there is a 95% chance that the true level of net migration falls within a range of around 70,000’ says their report. Immigration minister Mark Harper on the other hand defended the statistics as "accurate" and "very robust".
So which side is right? There are two intertwined issues: the quality of the data and the quality of the interpretation of it.
On the quality of the data: statisticians can work out the degree of certainty – the confidence interval – which a sample of 5000 out of around 163,000 should give. That is fairly bog-standard statistics, and almost all statisticians should be able to agree on the answer as calculated by the statisticians of the Office for National Statistics: the 95% chance, mentioned above, that the true level of net migration falls within a range of around 70,000 – that is, lies between 128,000 and 198,000. If policy-makers need more certainty than that, then the answer is obvious: take a bigger sample. Take a sample of 10,000, or 20,000; if you feel it is really necessary, count everyone.
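To put rough numbers on "take a bigger sample", here is a minimal sketch. It assumes simple random sampling, where the margin of error shrinks in proportion to one over the square root of the sample size. The real survey design is more complicated than that, so treat the output as an illustration rather than as the ONS's own calculation. The function name `scaled_margin` is invented for this example; the figures fed into it are the ones quoted above.

```python
import math

def scaled_margin(base_margin, base_n, new_n):
    """Margin of error at sample size new_n, assuming it shrinks like 1/sqrt(n)."""
    return base_margin * math.sqrt(base_n / new_n)

estimate = 163_000     # official net migration estimate quoted above
base_margin = 35_000   # quoted 95% margin: plus or minus 35,000
base_n = 5_000         # quoted sample size

for n in (5_000, 10_000, 20_000):
    m = scaled_margin(base_margin, base_n, n)
    low, high = estimate - m, estimate + m
    print(f"sample {n:>6,}: {low:,.0f} to {high:,.0f} (plus or minus {m:,.0f})")
```

Under that assumption, doubling the sample trims the margin by only about 29%, and it takes four times the sample to halve it. More certainty is always available, but it gets expensive quickly.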
Which, on the face of it, should be simple enough. Since Britain is an island it should be fairly easy to count how many people are coming into it and going out of it. It involves only a few airports and a few ports. Everyone going through them already has to show their passport. It would be perfectly possible to ask each one of them if they were just visiting for a while or staying permanently.
That would yield a sample size of 100% – and no uncertainty at all in the figures. Well, not quite. The sample size would be nearly 100% (it would exclude those who are smuggled in inside container lorries, or on rubber dinghies over beaches, but that, at a guess, is a very small proportion of migrants). But it would almost certainly be inaccurate data – ‘dirty’ data in the jargon – because of course many of the intending migrants might lie, and say they were just visiting. Without some sort of tracking and follow-up system, which the UK Border Agency is apparently incapable of implementing, you would be back once again to an element of estimating – or best guessing. It would, though, be a smaller element when compared to the overall total.
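The difference between sampling uncertainty and dirty data can be shown with a toy calculation. Everything in it is hypothetical: the figure of 100,000 arriving migrants and the misreporting rates are made up for illustration, not measured, and `counted_migrants` is a name invented for this sketch. The point is only that a full count removes sampling error but leaves the error from untruthful answers untouched.

```python
def counted_migrants(true_migrants, misreport_rate):
    """Migrants recorded by a full count when a fraction say they are only visiting."""
    return true_migrants * (1 - misreport_rate)

true_migrants = 100_000  # hypothetical number of genuine intending migrants

for rate in (0.0, 0.05, 0.20):  # hypothetical misreporting rates
    counted = counted_migrants(true_migrants, rate)
    shortfall = true_migrants - counted
    print(f"misreporting {rate:.0%}: counted {counted:,.0f}, undercount {shortfall:,.0f}")
```

However many people are asked, the undercount stays proportional to the share who misreport. That is why some estimating would remain even with a 100% count.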
The pressure group Migration Watch has long produced its own statistics on immigration. There is no obvious reason why its data should be any more accurate than the official figures, and there is plenty of reason why its analyses might be worse than the official ones, since it is a pressure group with an agenda which it is trying to prove. (That is not to imply dishonesty; in statistical analyses many biases are completely unconscious. It is just that one’s instinct is to trust statistics from pressure groups less than those from completely independent sources with no axe to grind.) On the other hand, Migration Watch has pointed out many flaws in the official methods of data-gathering.
The Office for National Statistics has, we trust, no motive to make the estimates seem either higher or lower than they should be (and independence of data-gathering and analysis is written into its code of conduct). Governments do have motives, but the civil servants of the ONS are independent of government and (again, we trust) cannot be sacked or have their careers blighted for producing figures that Ministers do not like. (That is not the case in Argentina, though: see previous articles "Argentina – the lies" and "A small victory for honesty in Argentina – but the struggle must continue".) But Migration Watch has flourished because migration is an issue that arouses strong passions – and because many people disbelieve the official figures. For just how widely public perceptions differ from official figures, see the recent survey by the Royal Statistical Society.
In an ideal world, a) the official data would be clean enough, accurate enough, and of sample size big enough to give adequate confidence levels; b) the official analyses of that data would stand up to independent scrutiny; and c) the general public would believe and trust the answers. In a less than ideal world, a) and b) would be true but c) would not. In the worst of all possible worlds neither a) nor b) nor c) would be true.
We know we are not in an ideal world – the general public do not believe official statistics. We hope that the current row ends up demonstrating that we are in the less-than-ideal one. It would be comforting to learn that a) and b) are both true, even if c) is not.
It would be deeply depressing to discover that neither a) nor b) nor c) holds good, and that we are in the worst of all possible worlds. An adequate form of government needs official statistics that are true. Otherwise policies will end up based on prejudices and the influence of pressure groups instead of what is actually true.