The maintenance and updating of Statistics Austria's business register requires regular matching of the register against other data sources; one of them is the register of tax units of the Austrian Federal Ministry of Finance. The matching process is based on string comparison via bigrams of enterprise names and addresses, and on a quality class approach that assigns pairs of register units to classes of different compliance (i.e., matching quality) based on bigram similarity values and the comparison of other matching variables, such as the NACE code or the year of foundation.

Based on methodological research on matching techniques carried out in the DIECOFIS project, an empirical comparison of the bigram method with other string matching techniques was conducted: the edit distance, the Jaro algorithm, the Jaro-Winkler algorithm, the longest common subsequence and the maximal match were selected as appropriate alternatives and evaluated in the study.

This paper briefly introduces Statistics Austria's business register and the corresponding maintenance process, and reports on the results of the empirical study.
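To illustrate the kind of bigram string comparison described above, the following is a minimal sketch. It assumes a Dice coefficient over multisets of character bigrams and a simple normalisation (lowercasing, collapsing whitespace); the exact similarity formula and preprocessing used in the register matching are not stated here, and the function names `bigrams` and `bigram_similarity` are hypothetical.

```python
from collections import Counter


def bigrams(text: str) -> Counter:
    """Return the multiset of character bigrams of a normalised string."""
    # Assumed normalisation: lowercase and collapse runs of whitespace.
    s = " ".join(text.lower().split())
    return Counter(s[i:i + 2] for i in range(len(s) - 1))


def bigram_similarity(a: str, b: str) -> float:
    """Dice coefficient on bigram multisets: 2 * |A & B| / (|A| + |B|)."""
    ba, bb = bigrams(a), bigrams(b)
    total = sum(ba.values()) + sum(bb.values())
    if total == 0:
        # Neither string has a bigram (empty or single character).
        return 0.0
    shared = sum((ba & bb).values())  # multiset intersection
    return 2.0 * shared / total
```

The value lies between 0 (no shared bigrams) and 1 (identical bigram multisets). In a quality class approach such a value would be combined with agreement on other matching variables; the thresholds for that step are not given in this abstract and are therefore not sketched.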