ShadeLab Digital Data Hygiene

Digital Data Stewardship

 

ShadeLab is a microbial ecology lab that uses a variety of technologies in our research, so we generate many different kinds of digital data.  It is very important that we are good stewards of those data so that we can share, re-analyze, and reproduce results, and build on our own work and that of others.  Data stewardship involves the following key efforts:

  1. Protecting raw digital data (from accidental deletion or manipulation).

  2. Organizing raw digital data so that other group members and collaborators can easily find and use it.

  3. Organizing protocols that generate digital data so data are clearly linked to projects, experiments, protocols, and originating samples.

  4. Organizing analysis workflows, scripts/code, and computational notes so that everyone can check and reproduce results quickly.

  5. Making all digital data public, findable, and reusable (see FAIR principles). 
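
As an illustration of item 1, here is a minimal Python sketch of one common way to guard raw files against accidental manipulation: record a SHA-256 checksum for every file, remove write permission, and later re-check the files against the recorded checksums. This is an assumption-laden example and not a description of ShadeLab's actual procedure; the function names (`build_manifest`, `make_read_only`, `changed_files`) and the idea of a single raw-data directory are hypothetical.

```python
import hashlib
import stat
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(raw_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to raw_dir) to its checksum."""
    manifest = {}
    for path in sorted(raw_dir.rglob("*")):
        if path.is_file():
            manifest[path.relative_to(raw_dir).as_posix()] = sha256_of(path)
    return manifest


def make_read_only(raw_dir: Path) -> None:
    """Clear the write bits on every file under raw_dir (hypothetical layout)."""
    write_bits = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
    for path in raw_dir.rglob("*"):
        if path.is_file():
            path.chmod(path.stat().st_mode & ~write_bits)


def changed_files(raw_dir: Path, manifest: dict[str, str]) -> list[str]:
    """List manifest entries that are now missing or whose checksum differs."""
    problems = []
    for rel, expected in manifest.items():
        path = raw_dir / rel
        if not path.is_file() or sha256_of(path) != expected:
            problems.append(rel)
    return problems
```

In this sketch, the manifest would be built once when the raw data arrive and stored alongside the project notes; running `changed_files` later returns an empty list if nothing has been altered or deleted. Read-only permissions only reduce the chance of accidents, so they complement backups and do not replace them.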

Data Preservation and Posterity, plus Team Accountability

 

We preserve and share our data, protocols, and analyses in several ways. Together, we set a high bar for data integrity and quality.

  • We use the LabGuru electronic laboratory notebook and management system for experimental notes, shared protocols, freezer/culture collection management, ordering, equipment, etc.  All final experiments are digitally signed by Ashley.  LabGuru is searchable and open within ShadeLab, so that we can view or repeat each other's work, easily find protocols, and build from what we have accomplished.

  • We use the Michigan State University High Performance Computing Cluster to store and analyze digital data projects.  Ashley will request research space for each team member.  We share collaborative research space and have priority access to high memory nodes.

  • We use GitHub for maintaining version-controlled digital analysis notes, code/scripts, and datasets.  Each published paper will have an associated GitHub repo that includes the data files and code needed to reproduce figures and analyses.

  • We deposit raw sequence data into the NCBI SRA. We use GOLD/IMG for sequence data generated by JGI. 

  • We organize and collaborate within ShadeLab using Michigan State's Spartan 365 tools (Microsoft Office).  We use Teams for day-to-day communication, OneDrive for co-working on shared files and organizing important lab documents, Outlook for shared calendars, and OneNote for collaborative notetaking.  Spartan 365 tools are free to all MSU students, faculty, and staff.

Let's share!

We are happy to share a copy of ShadeLab's Digital Data Hygiene Plan.  The plan discusses the different types of digital data that we generate and gives practical steps for ensuring high-quality data preservation and stewardship. We do not post the full document here because it is heavy on minutiae that not everyone wants to read, so this page summarizes the main points. Reach out to Ashley for the full Plan if interested!