Comparison of Iterative Proportional Fitting and Simulated Annealing as synthetic population generation techniques: Importance of the rounding method

publication date

  • March 2018

start page

  • 78

end page

  • 88

volume

  • 68

international standard serial number (ISSN)

  • 0198-9715

electronic international standard serial number (EISSN)

  • 1873-7587

abstract

  • Approaches to space-related problems that model decision-making and interactions at the level of individuals, and thus require disaggregated population data (i.e. data specifying all attributes for each individual), are increasingly being used in various research domains. Actual population data are generally unavailable due to confidentiality and cost constraints. Therefore, synthetic population generation techniques based on aggregated marginal constraints and a random sample are often used. The two most frequently used sample-based techniques are Iterative Proportional Fitting (IPF) coupled with integerization, and Simulated Annealing (SA), a special case of Combinatorial Optimization (CO). Several authors have emphasized the need for further research comparing their relative performance. Thus, a methodology encompassing statistical analysis to compare IPF and SA is presented here. Technique performance is evaluated through the percentage classification error of the generated population against the reference population. Two cases are analyzed, using the 2001 census microdata in Andalusia (Spain) and the 2000 Swiss Public Use Sample as reference populations, encompassing 6 socio-demographic attributes plus geographic location (municipalities and cantons, respectively). Aggregated marginal constraints and random samples are calculated from the reference population. A set of synthetic small area populations is generated using both techniques for various scenarios within each case, corresponding to different combinations of sample size, number of categories and number of generated populations. Results reveal the great importance of the integerization process applied to IPF's output. IPF coupled with marginal-distribution-controlled rounding outperforms SA in all scenarios, whereas SA generally outperforms IPF coupled with the commonly used Monte Carlo rounding.
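For readers unfamiliar with the two steps named in the abstract, the following is a minimal sketch of IPF on a two-dimensional table followed by a simple stochastic rounding. It is an illustration only, not the authors' implementation: the function names `ipf` and `monte_carlo_round`, the seed table and the marginal targets are all hypothetical, and the rounding shown (round each cell up with probability equal to its fractional part) is one common reading of "Monte Carlo rounding" that the abstract itself does not define.

```python
import random


def ipf(seed, row_targets, col_targets, iterations=100, tol=1e-9):
    """Rescale a 2-D seed table until its row and column sums match the targets."""
    table = [row[:] for row in seed]
    for _ in range(iterations):
        # Row step: rescale each row to its target total.
        for i, target in enumerate(row_targets):
            total = sum(table[i])
            if total > 0:
                table[i] = [cell * target / total for cell in table[i]]
        # Column step: rescale each column to its target total.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in table)
            if total > 0:
                for row in table:
                    row[j] *= target / total
        # Column sums are exact after the column step, so only rows are checked.
        if all(abs(sum(table[i]) - t) < tol for i, t in enumerate(row_targets)):
            break
    return table


def monte_carlo_round(table, rng):
    """Assumed rounding rule: round each cell up with probability equal to its fractional part."""
    return [[int(cell) + (rng.random() < cell - int(cell)) for cell in row]
            for row in table]


# Hypothetical sample cross-tabulation and marginal constraints (both sum to 100).
seed = [[1.0, 2.0], [3.0, 4.0]]
row_targets = [40.0, 60.0]
col_targets = [30.0, 70.0]

fitted = ipf(seed, row_targets, col_targets)
rounded = monte_carlo_round(fitted, random.Random(0))
```

Because each cell is rounded independently in this sketch, the integer table is not guaranteed to reproduce the marginal totals that the fractional IPF output satisfies. That is the kind of discrepancy a rounding step controlled by the marginal distributions is meant to avoid.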

keywords

  • synthetic population; iterative proportional fitting; simulated annealing; small area; IPF rounding