EU Horizon 2020 Marie Sklodowska-Curie MSCA-ITN Project LONGPOP (Methodologies and Data mining techniques for the analysis of Big Data based on Longitudinal Population and Epidemiological Registers)
Longitudinal analysis, Demography, Population Studies, Epidemiology, Statistics, Mathematics, Economics, Geography, History, GIS, Big Data, Data Analysis, Quantitative Methods, Public Health
Record linkage and data management of longitudinal population, socioeconomic and epidemiological registers

European governments are moving towards new statistical systems, abandoning costly statistical operations and making greater use of the de-identified administrative data routinely collected by government departments and other agencies. The use of these new datasets and longitudinal population registers requires new and advanced skills in data management and statistical techniques.

This work package is aimed at building new capacities among the Early-Stage Researchers to handle these databases. The results include:

  • Catalogue of longitudinal registers
  • Report on concepts and techniques for data curation, management and statistical analyses
  • Report on concepts and techniques for record linkage
  • Report and tools for data coding and harmonization
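As an illustration of what record linkage involves, the following minimal sketch pairs records from two hypothetical registers by blocking on birth year and comparing normalised names. The field names, the sample records and the 0.85 similarity threshold are assumptions made for this example and are not taken from the LONGPOP reports; only Python's standard library is used.

```python
from difflib import SequenceMatcher


def normalise(name):
    """Lower-case a name and collapse whitespace so trivial differences do not matter."""
    return " ".join(name.lower().split())


def name_similarity(a, b):
    """Similarity ratio in [0, 1] between two normalised names."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def link_records(register_a, register_b, threshold=0.85):
    """Return (id_a, id_b) pairs with equal birth years and similar names.

    Blocking on birth year keeps the number of name comparisons small.
    """
    by_year = {}
    for rec in register_b:
        by_year.setdefault(rec["birth_year"], []).append(rec)

    links = []
    for rec_a in register_a:
        for rec_b in by_year.get(rec_a["birth_year"], []):
            if name_similarity(rec_a["name"], rec_b["name"]) >= threshold:
                links.append((rec_a["id"], rec_b["id"]))
    return links


# Hypothetical sample records (not real register data)
census = [
    {"id": "A1", "name": "Maria  Rossi", "birth_year": 1880},
    {"id": "A2", "name": "Jan de Vries", "birth_year": 1875},
]
parish = [
    {"id": "B1", "name": "maria rossi", "birth_year": 1880},
    {"id": "B2", "name": "Jan de Vries", "birth_year": 1901},
]

links = link_records(census, parish)
```

In practice, linkage on real registers usually combines several fields and weighs the evidence probabilistically; the exact-year blocking and single threshold here are deliberate simplifications.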
Geographic context in analyses of longitudinal individual-level data

Geographical context variables can provide many new insights in longitudinal demographic studies. By employing event history analysis techniques, it is possible to follow individuals across time.
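One widely used event history technique is the Kaplan-Meier estimator of the survival function. The sketch below is a minimal illustration that estimates a separate survival curve for each geographical context. The record layout (a context label, a duration and an event flag) and the sample values are hypothetical and serve only to show the mechanics.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival)] at each distinct event time.

    durations: observation time for each individual
    events: True if the event occurred, False if the observation was censored
    """
    event_counts = Counter(t for t, e in zip(durations, events) if e)
    exit_counts = Counter(durations)  # everyone leaves the risk set at their duration

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exit_counts):
        d = event_counts.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past time t
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exit_counts[t]
    return curve


def survival_by_context(records):
    """Estimate one Kaplan-Meier curve per geographical context label."""
    groups = {}
    for rec in records:
        groups.setdefault(rec["context"], []).append(rec)
    return {
        context: kaplan_meier(
            [r["duration"] for r in recs], [r["event"] for r in recs]
        )
        for context, recs in groups.items()
    }


# Hypothetical individuals: years under observation and whether the event occurred
records = [
    {"context": "urban", "duration": 2, "event": True},
    {"context": "urban", "duration": 3, "event": True},
    {"context": "urban", "duration": 3, "event": False},
    {"context": "urban", "duration": 5, "event": True},
    {"context": "urban", "duration": 8, "event": False},
    {"context": "rural", "duration": 4, "event": True},
    {"context": "rural", "duration": 6, "event": False},
    {"context": "rural", "duration": 7, "event": True},
]

curves = survival_by_context(records)
```

Censored individuals contribute to the risk set until they leave observation, which is what allows incomplete life courses to be used rather than discarded.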

In this work package, the Early-Stage Researchers learn how to link GIS and longitudinal methodologies, initially through the digitization and geocoding of historical maps and their combination with modern geographic information.

The results will be:

  • GIS mobility tool
  • Compilation of different GIS layers on mortality
  • Compilation of GIS layers on Italian demographics
  • Tool to visualize indicators of environmental exposures
  • Tool to locate life courses on maps
  • Web Portal for Health and Population
Data mining, extraction techniques and extraction software

In recent years, many members of the LONGPOP project have been working on the development of a common format for databases containing information on persons, families and households, the Intermediate Data Structure (IDS), as part of the European Historical Population Samples Network (EHPS-Net).

This work package is devoted to the development of data extraction software that provides a type of structured data mining whereby data from the IDS are transformed into file formats designed for analysis. The LONGPOP members are cooperating on the construction, documentation and distribution of the extraction software.
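To make the idea of extraction concrete, the following minimal sketch pivots attribute-value rows into one rectangular row per person and serialises them as CSV, a format most statistical packages can read. It assumes a heavily simplified entity-attribute-value layout; the column names (id_i, type, value), the attribute names and the sample values are hypothetical, and the code is not the LONGPOP extraction software.

```python
import csv
import io


def pivot_to_wide(rows):
    """Turn attribute-value rows into one dict per individual."""
    people = {}
    for row in rows:
        person = people.setdefault(row["id_i"], {"id_i": row["id_i"]})
        person[row["type"]] = row["value"]
    return list(people.values())


def write_csv(records, columns):
    """Serialise wide records as CSV text; missing attributes become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Hypothetical attribute-value rows: one row per person and attribute
ids_rows = [
    {"id_i": 1, "type": "SEX", "value": "Female"},
    {"id_i": 1, "type": "BIRTH_YEAR", "value": "1880"},
    {"id_i": 2, "type": "SEX", "value": "Male"},
]

wide = pivot_to_wide(ids_rows)
csv_text = write_csv(wide, ["id_i", "SEX", "BIRTH_YEAR"])
```

The long layout lets a database store any attribute without changing its schema, while the wide layout is what analysis tools typically expect, so a pivot of this kind is a natural core step for an extraction tool.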

The results will be:

  • Report on the IDS and extraction software
  • Report on the coordination of the building of extraction software
  • Report on different algorithms
  • Data mining extraction software