Development of the Social-Spatial Science Research Data Infrastructure SoRa: FAIR, smart, inclusive

The rapid rise of environmental problems requires a better understanding of the interactions between human behaviour and the environment. In the social sciences, the term “environmental justice” is receiving increasing attention; it often deals with aspects of health and general well-being, studied by gathering citizen perceptions. It is still challenging to answer questions regarding the role of individual perceptions of environmental pollution (especially in the living or working environment) and related factors. Linking research data from social science surveys with data from spatial science could contribute to a solution. Such linked data are scarce, both because the required infrastructure is interdisciplinary and because the legal framework is very complex, particularly with regard to data protection. Finally, there are methodological challenges: the required geographic information system (GIS) approaches mostly lie outside the core competence of social science research practice and the academic curricula of social science domains.

The project aims to implement an operational, decentralised IT infrastructure, based on state-of-the-art technical components, which links survey data with spatial data. Standardised interfaces will make it possible for interested researchers to link large social science survey panel studies, such as the Socio-Economic Panel (SOEP) and/or GESIS, with the spatial data infrastructure of the IOER. The new SoRa data linkage service infrastructure is intended to ensure a simplified workflow for dealing with cross-domain research data from the social and spatial sciences, while complying with the related technical and legal requirements (such as data protection or the FAIR principles). Researchers can easily use the data linkage service via suitable user interfaces and also via advanced options such as packages for R or Stata. Further datasets will be made available to researchers, for example so-called “structured datasets” derived from selected panel surveys in order to simulate the spatial distribution of the people who were surveyed. Researchers will be able to test and plan the data linkage workflow before physically going to the restricted area of a research data centre (FDZ), and can thus be well prepared for the actual implementation in a secure room. The infrastructure will be kept expandable and can be supplemented over time by additional social or spatial science research data centres.
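To make the idea of linking survey data with spatial data more concrete, the following is a minimal sketch in Python of one possible form of geolinking: attaching the value of a gridded spatial indicator to each survey respondent. It is not the SoRa API or the project's actual implementation. All names (`Respondent`, `cell_index`, `link_to_grid`, `green_space_share`), the square grid anchored at the coordinate origin, and the numbers are assumptions made up purely for illustration.

```python
from dataclasses import dataclass
from math import floor
from typing import Optional


@dataclass(frozen=True)
class Respondent:
    """Hypothetical survey record with a projected coordinate in metres."""
    respondent_id: str
    x: float
    y: float


def cell_index(x: float, y: float, cell_size: float) -> tuple[int, int]:
    """Return the (column, row) of the square grid cell containing the point.

    Assumes a grid whose cells are aligned to the coordinate origin.
    """
    return (floor(x / cell_size), floor(y / cell_size))


def link_to_grid(
    respondents: list[Respondent],
    indicator_grid: dict[tuple[int, int], float],
    cell_size: float,
) -> dict[str, Optional[float]]:
    """Attach the indicator value of the enclosing cell to each respondent.

    Coordinates are dropped from the output: only the identifier and the
    linked value are returned. Respondents outside the grid get None.
    """
    linked: dict[str, Optional[float]] = {}
    for r in respondents:
        linked[r.respondent_id] = indicator_grid.get(
            cell_index(r.x, r.y, cell_size)
        )
    return linked


# Made-up illustration data: indicator values for two 1 km grid cells.
green_space_share = {(0, 0): 0.12, (1, 0): 0.40}
sample = [
    Respondent("a", 250.0, 900.0),    # falls into cell (0, 0)
    Respondent("b", 1500.0, 10.0),    # falls into cell (1, 0)
    Respondent("c", 5000.0, 5000.0),  # outside the covered cells
]
result = link_to_grid(sample, green_space_share, cell_size=1000.0)
```

Returning only identifiers and linked values, without coordinates, is one simple way such a step could respect the data protection concerns mentioned above; a real service would need considerably more safeguards.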

Research questions
Interdisciplinary research has to be supported by well-coordinated research infrastructures that ensure a higher degree of interoperability. This project assumes that new knowledge can ultimately be gained only through the combined analysis of research data from different sources and specialist domains. Example case studies focusing on environmental justice are analysed in this project by asking: Which current research questions are significant in this subject area, and which other questions arise at the interface between the social and spatial sciences? What infrastructures need to be established to address these questions? How can these be implemented with high performance? Can the newly developed concepts and models also be transferred to other questions?

The specific requirements for the IT infrastructure are described after analysing the needs of the social and spatial sciences and drafting the resulting user stories (to define the user's perspective). The concept includes a component and function layout as well as core services, e.g. the SoRa API and the geolinking service, and their communication with each other. The components are then implemented, tested and optimised. A co-creative workshop approach has been adopted to address project-relevant issues with interested researchers. The performance of the SoRa data linking service is demonstrated through scientific use cases on various research questions. A dedicated scientific advisory board has been established to provide guidance throughout the project implementation.

