Workshop 3: Data Management through e-Social Science
Paul Lambert; Simon Jones (DAMES research Node, University of Stirling)
Background
This workshop will explore approaches to, and resources for, improving standards of 'data management' in the social science research process. Here, 'data management' is taken to refer to operations which are typically conducted by social researchers themselves, and which involve handling and manipulating micro-datasets. These sorts of operations, often referred to as 'data handling', 'data cleaning', or 'matching' datasets together, prove in many instances to be a major component of a social research project, yet they are often underestimated by social researchers themselves, and they are rarely the subject of extended methodological attention.
This workshop is driven by work on the 'DAMES' NCeSS research Node, based at the Universities of Stirling and Glasgow, which is embarking on a series of case studies, provisions and support services in the field of data management (2008-2011). Much of the work of DAMES is concerned with specialist types of social science data (examples cover data on occupations; on educational qualifications; on ethnic groups; on social care and lifestyle monitoring; and on e-Health databases). These specialist areas (and some more generic topics in data management) are characterised by having relatively high potential for social science uptake and exploitation, though they also often require relatively focused, bespoke application-driven facilities.
A first aim of this workshop is to introduce and demonstrate, to other relevant specialists in the e-social science domain, the areas and research questions that the DAMES Node will be engaging with, and to discuss more widely the scope of e-science for dealing with data management operations. A second aim is to present the approaches and strategies which are currently being adopted in DAMES (often in a preliminary form), and to comment on their relation to alternative provisions in the field. A third aim is to include contributions from speakers outside the DAMES Node whose research also has relevance to the range of data management operations significant to social scientists.
The majority of topics covered in this workshop will be concerned with data management operations relevant to the quantitative analysis of social survey and administrative datasets (though some discussions will be relevant to other forms of social science data). Relevant topic areas in e-social science include sharing and linking social science data; social science metadata standards; data access and security principles; and social science workflows.
Format of the workshop
This full-day workshop will feature a group of short papers from participants in the DAMES Node. A presentation by Paul Lambert will introduce the topic area of 'data management' in the social sciences, and compare alternative approaches and practices in the field (including comments on e-science approaches in this field; comments on approaches relevant to the specialist data resources studied within the DAMES Node; and a description of other long-standing social science traditions in micro-computing for data management). A presentation by Alison Dawson (tbc) will give an in-depth description of the social science issues raised by one theme within the DAMES Node, concerned with accessing and linking together data relevant to the analysis of social care in the home. A presentation by Simon Jones will outline and discuss preliminary computer science approaches being applied on the Node, covering topics such as data abstraction and data fusion. A presentation by John Watt will discuss approaches to access and security for distributed and heterogeneous datasets, using examples from previous applications undertaken by the National e-Science Centre.
The workshop also seeks to include a selection of short papers and presentations from other specialists in the topics of e-social science and social science data management. The workshop organisers warmly invite offers of short or long papers or presentations which may be of relevance to this theme (see below).
The workshop will conclude with a panel discussion session intended to encourage critical feedback on current activities and plans towards 'Data Management through e-Social Science'.
Topics of interest include (but are not limited to):
. Manipulation and analysis of social survey datasets
. Software applications for managing social science data
. Linking datasets
. Missing data in the social sciences
. Metadata standards for social science data
. Social science workflows

