LONGPOP | Evaluating and Documenting IDS Databases and Extraction Software
The Intermediate Data Structure (IDS) establishes an integrated and joint interface between European and non-European historical longitudinal databases.

Francisco Anguita – ESR2: Developing and documenting IDS extraction software.

HOST

International Institute of Social History (IISH), Koninklijke Nederlandse Akademie van Wetenschappen (KNAW), Amsterdam.

SUPERVISOR

K. Mandemakers (IISH) kma@iisg.nl

OBJECTIVES

The Intermediate Data Structure (IDS) format establishes an integrated, shared interface between European and non-European databases containing historical longitudinal micro-data. The format has been adopted by the European Historical Population Samples Network (EHPS-Net), in which all databases in this field cooperate: not only European databases but also databases from other continents, such as the datasets held at ICPSR and the Canadian databases (http://www.ehps-net.eu/content/making-datasets-comparable).

The IDS forms the basis for extraction software that makes the data suitable for comparative and systematic statistical analysis. The experience gained within this network needs to be transferred not only through the diffusion of results, but also by training ESRs in different tasks, such as data management and curation, and the diffusion of results by improving the documentation of several databases and software on the network's website and in the e-journal Historical Life Course Studies.

Part of the job will be the DDI (Data Documentation Initiative) tagging of the content of the databases at the variable level to make that content open to the Web. The ESR will work according to the outcomes of the standing working groups of the EHPS network coordinated within the KNAW-IISH, and in close collaboration with the rest of the LONGPOP ITN-ETN. The ESR will also have a key role in evaluating software, especially tools developed by the other ESRs within the LONGPOP project. On the basis of this work on the documentation of the databases, the ESR will develop a special volume of the e-journal on the history and content of these databases.

EXPECTED RESULTS

This IRP will be devoted to coordinating the construction, documentation and distribution of extraction software related to the LONGPOP ITN-ETN and to the existing projects that are coordinated from KNAW-IISH under the European Historical Population Samples Network (EHPS-Net). The existing documentation on databases on the EHPS website will be improved and expanded. A special volume of Historical Life Course Studies, concerning the development and content of the existing databases, will be edited and organized. A report will be made on the state of the art of the IDS and extraction software; it will be presented at the LONGPOP workshops in the last year of the project. A report on the coordination of the building of extraction software, especially the software created by the participants in the Marie Curie network, will be delivered at the end of the project.

LONGPOP EXPECTED RESULTS:

2.1 Systematic improvement of the documentation on databases on the EHPS website.
2.2 Editing a special volume of Historical Life Course Studies concerning the development and content of existing databases.
2.3 Report on the IDS and extraction software in the LONGPOP workshops.
2.4 Report on the coordination of the building of extraction software, especially the software that is created by the participants in the Marie Curie network.
2.5 A personal development plan (course followed at NW Posthumus Institute).

LONGPOP WORKING PAPERS AND PRESENTATIONS:

Paiva, D. and Anguita, F. Linking the Historical Sample of the Netherlands into the American censuses, 1850-1940. International Workshop on the Systematic Linking of Historical Records 2017.

ESR BIOGRAPHY

Francisco Anguita is a researcher at the International Institute of Social History. His current activities at the Institute involve data linkage and analysis between large historical databases and population registers; the development and enhancement of the Intermediate Data Structure (IDS) and extraction software for longitudinal demographic analysis; and the documentation of historical databases. He is also part of the Historical Sample of the Netherlands department and of the European Historical Population Samples Network.

Francisco obtained his MSc in Research Methodology in Behavioral and Health Sciences in 2014, which allowed him to develop numerous projects on data analysis and research design for teams at different Spanish universities (Autonomous and Complutense Universities of Madrid, Granada, Alcalá de Henares, UNED, etc.). With similar duties, he was also in charge of the Center for Applied Psychology at the Autonomous University of Madrid (2013), assisting scholars and researchers in the fields of Psychology, Education and Health Sciences with data analysis.

He also worked for several years as a developer, including database work, and is a graduate in Social Anthropology and Physics.

Contact: francisco.anguita@iisg.nl 

Profiles: ResearchGate, Xing and LinkedIn.