At the end of your project you’ll need to think about how you will share and archive your data for future use.


Many funders and publishers now require researchers to share their data as a condition of funding and publication. The best way to do this is to deposit your data in a recognised data repository, which makes the data findable and improves its accessibility and re-usability for the long term.

Below are some guidelines. If you need assistance, email the Research Data Management Service: researchdata@sgul.ac.uk.

 



What data to share

It is important that you do not share any data from potentially commercialisable research. Seek relevant guidance on communicating intellectual property (IP) before sharing any data from such research.

Your data must be anonymised prior to sharing. You should never share data containing identifiable information.

If your data contains information classed as ‘sensitive’, you can still make it available under strict access conditions. This data must always be anonymised prior to sharing. Access to controlled-access research data at St George’s, University of London is governed by an independent, transparent process. For more information on sharing and providing access to sensitive data, email researchdata@sgul.ac.uk.




Preparing your data for sharing

If you are submitting your data to a repository, always check the repository’s requirements before you start preparing your data for sharing. You may need to organise or document your data according to specified standards before it can be accepted.

Always redact, anonymise or de-identify your data before making them available.

What is data anonymisation?


Anonymisation is the process of turning data into a form which does not identify individuals and where identification is not likely to take place. This allows for a much wider use of the information. Anonymising research data involves removing information which might lead to an individual being identified, either from the data itself or by combining the data with other information which a recipient of the data could be expected to have access to. Once the information is anonymised, it ceases to be personal data, and can be disseminated and published without contravening data protection legislation. However, anonymisation does not diminish research ethics considerations and requirements, which continue to apply.

What is data pseudonymisation?


The Data Protection Act (DPA) 2018 and General Data Protection Regulation (GDPR) define pseudonymisation as “the processing of personal data in such a way that the data can no longer be attributed to a specific data subject without the use of additional information.” To pseudonymise a data set, the “additional information” must be “kept separately and subject to technical and organizational measures to ensure non-attribution to an identified or identifiable person.” Note that where a researcher produces an anonymised dataset but retains the information necessary to identify an individual, the totality of the information held by the researcher is still personal data and has to be managed in accordance with GDPR. What the researcher holds does not cease to be personal data unless the researcher disposes of the identifying information and has no means of recovering it.
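As an illustration of keeping the “additional information” separate, the following is a minimal sketch rather than a prescribed method. The function name, the "name" and "participant_id" fields and the pseudonym format are all assumptions made for the example. It replaces a direct identifier with a random code and returns the re-identification key as a separate object, which would need to be stored apart from the data under appropriate safeguards.

```python
import secrets


def pseudonymise(records, id_field="name"):
    """Replace a direct identifier with a random pseudonym.

    Returns (pseudonymised_records, key). The key maps each pseudonym back
    to the original value and must be stored separately from the data.
    Field names here are hypothetical.
    """
    key = {}       # pseudonym -> original identifier (the "additional information")
    assigned = {}  # original identifier -> pseudonym, so repeats share one code
    output = []
    for record in records:
        original = record[id_field]
        if original not in assigned:
            pseudonym = "P" + secrets.token_hex(4)
            while pseudonym in key:  # regenerate on the (very unlikely) collision
                pseudonym = "P" + secrets.token_hex(4)
            assigned[original] = pseudonym
            key[pseudonym] = original
        # Copy every field except the identifier, then add the pseudonym.
        cleaned = {k: v for k, v in record.items() if k != id_field}
        cleaned["participant_id"] = assigned[original]
        output.append(cleaned)
    return output, key
```

While the key exists, the output is pseudonymised rather than anonymised, so it remains personal data. Removing one identifier in this way does not deal with indirect identifiers elsewhere in the records.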

Quantitative Data


Anonymisation may be as simple as removing variables which directly identify a research subject, such as name and home address. However, it is often necessary to do more than that to render a dataset truly anonymous. Variables may have to be removed or the data manipulated to deal with situations where an individual could be identified through combinations of variables, or by combining the data with other publicly available information. For example, full UK postcodes typically cover only a small number of delivery addresses, and can easily lead to identification of an individual or household when combined with other information. To anonymise a dataset, it might be necessary to remove the postcode or to include only the element of the postcode which relates to a wider area (i.e. the first part of the postcode, known as the outward code).
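The two steps described above, dropping direct identifiers and coarsening the postcode, can be sketched as follows. This is an illustrative example only: the column names ("name", "home_address", "postcode") are hypothetical, and the postcode handling assumes that the second part of a full UK postcode is always its last three characters.

```python
DIRECT_IDENTIFIERS = {"name", "home_address"}  # hypothetical column names


def outward_code(postcode):
    """Return only the first part (outward code) of a UK postcode."""
    compact = postcode.replace(" ", "").upper()
    # A full postcode ends with a three-character inward code; anything of
    # four characters or fewer is treated as already being an outward code.
    return compact[:-3] if len(compact) > 4 else compact


def anonymise_row(row):
    """Drop direct identifiers and coarsen the postcode in one record."""
    return {
        column: (outward_code(value) if column == "postcode" else value)
        for column, value in row.items()
        if column not in DIRECT_IDENTIFIERS
    }
```

For instance, a row with the made-up postcode "ab12 3cd" would keep only "AB12". Even a truncated postcode can be identifying in combination with other variables, so the result still needs to be assessed for re-identification risk.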

Qualitative Data


Anonymisation may involve the use of pseudonyms and editing the data to remove identifying information. Anonymisation of qualitative data can be problematic because of the risk of individuals being identified through contextual information, and the risk of the data being distorted by the anonymisation process.
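A minimal sketch of pseudonym substitution in a transcript is given below. The function name and the replacement labels are assumptions for illustration. It replaces whole-word occurrences of listed names only, so it cannot detect identifying contextual information; a manual read-through is still needed.

```python
import re


def apply_pseudonyms(text, replacements):
    """Replace each listed identifying term with its pseudonym (whole words only)."""
    if not replacements:
        return text
    # Try longer terms first so that a full name is matched before a first name.
    terms = sorted(replacements, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b")
    return pattern.sub(lambda match: replacements[match.group(0)], text)
```

Matching is case-sensitive and limited to the terms supplied, so misspellings, nicknames and indirect references (job titles, places, events) would pass through unchanged.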

The UK Information Commissioner’s Office has published a code of practice for anonymising data. More information on how to anonymise or redact data can be found at the UK Data Service website and at the Irish Qualitative Data Archive.

 




Creating your data package

Good quality data is shared as a data package that includes both the data and its related documentation. This will make your data interoperable and re-usable. Good practice is to share any software or code supporting a project as a separate package that references (and links to, if possible) the data package. This is because source code and data are normally shared under different licenses.
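One way to check a data package before deposit is sketched below. This is an assumption-laden example, not a repository requirement: the file names README.txt and LICENSE.txt and both function names are hypothetical. It lists every file in the package folder with a SHA-256 checksum and reports any expected documentation files that are missing.

```python
import hashlib
from pathlib import Path


def build_manifest(package_dir):
    """Return a sorted list of (relative_path, sha256) for every file in the package."""
    root = Path(package_dir)
    entries = []
    for path in root.rglob("*"):
        if path.is_file():
            # Reads the whole file into memory; fine for a sketch, but large
            # files would be better hashed in chunks.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            entries.append((path.relative_to(root).as_posix(), digest))
    return sorted(entries)


def missing_documentation(package_dir, required=("README.txt", "LICENSE.txt")):
    """Return the expected documentation files (hypothetical names) that are absent."""
    root = Path(package_dir)
    return [name for name in required if not (root / name).is_file()]
```

A manifest of this kind could be saved alongside the data so that people who reuse the package can confirm that the files are complete and unaltered.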

You should always share your data under an appropriate license. A license protects your work by telling others how they can legally reuse your content. Selecting an appropriate license can be a daunting task. This license selector can help you to select the most appropriate license for your work.

You may find our Preparing your data for deposit checklist useful as you curate your data at the end of your project.




Repositories

There are a number of repository options available to share and preserve your data. This section identifies a few general and subject-specific repositories available to you.

It is always best to choose a national data centre or subject-specific repository to share your data as these are more likely to be accessed by researchers in your field. Data that you have shared elsewhere should be recorded, and where possible linked to, on the St George’s Data Repository.

If you would like support in selecting a repository that’s right for your data, please email researchdata@sgul.ac.uk.

 

Subject-specific repositories

The Wellcome Trust maintains a list of data repositories covering: nucleotide, genome, protein and macromolecular structures, microarray, proteomics, social sciences and humanities databases, as well as bacterial and virus collections. There are also data collections and repositories listed by the BBSRC for biotechnological and biological sciences research.

FAIRsharing.org also maintains a searchable list of subject-specific repositories.

 

General repositories

Zenodo is a free repository for data from all subject areas. It is maintained by CERN, makes your data discoverable to other researchers and preserves it for at least ten years. You can also set restrictions on access to the data, so it does not have to be completely open access.

Dryad and figshare are other general purpose repositories.

 

St George’s Research Data Repository

The St George's Research Data Repository is a digital archive for discovering, cataloguing, storing, sharing and preserving research content produced at St George's. You can use the archive to:

  • Share your research data, source code, conference papers, posters, presentations, images, videos, and a range of other digital research outputs
  • Register and link to data and other research material that are already in the public domain, but are difficult to discover, cite and measure for impact

Each deposit in the St George’s data repository is provided with a digital object identifier (DOI), which allows items to be uniquely cited and assessed for impact. This DOI can be used in Data Access Statements on your journal papers.

Instructions for using the repository are in the next section.

 



Using the SGUL Research Data Repository

Log in

Before you deposit your data in the repository please ensure that you have completed the preparing your data for deposit checklist.

Once you have completed the checklist:

  • Go to the St George’s Research Data Repository
  • Click ‘Log in’ on the top right of the screen
  • Select ‘St George’s, University of London’
  • Log in using your institutional credentials

You are now logged into the repository.

What would you like to do?

  • I want to upload and publish my data publicly
  • I want to link to my data that’s published on another repository or website, or place my files under an embargo
  • I want to upload a confidential file. The file will be uploaded to the repository but will remain confidential: only the metadata will be openly available, along with details on how to request access to the data.
  • I want to create a record of data that I possess without uploading any files

St George’s Data Repository is powered by Figshare. You can find answers to more frequently asked questions on Figshare's knowledge portal.

If you intend to deposit your data in the repository, it is advisable to email researchdata@sgul.ac.uk beforehand to avoid any delay in publishing your research.

 



Last Updated: Wednesday, 12 June 2019 10:24

For all enquiries about research data management, email: researchdata@sgul.ac.uk