Colorado State University faculty, staff and students can learn how to manage their research data at Morgan Library this semester. Good data management makes your life easier, improves research quality and reproducibility, and helps you comply with funder and journal requirements to share data.
Two workshop series will be offered this spring semester: Data & Donuts and Coding & Cookies. All sessions take place from 10 to 11:30 a.m. in Computer Classroom 175 in Morgan Library. Sessions are free, but space is limited to 30 participants. Register for either or both series at lib.colostate.edu, under the News & Events tab.
Coding & Cookies
The Coding & Cookies series provides an introduction to literate programming with R, a popular programming language for statistical computing. Learning to automate data cleaning, analysis, and visualization will make your research more efficient, reliable, and transparent.
Coding & Cookies is offered in collaboration with the Department of Statistics. Workshops will be led by experienced statistics graduate students and facilitated by the Morgan Library Data Management Specialist Mara Sedlins.
A basic working knowledge of R and RStudio is helpful to get the most out of these sessions.
Data Wrangling using dplyr
Tuesday, March 3
The process of generating data can be messy, and what you can do with your data depends strongly on how it is formatted. This session will cover how to clean and manipulate datasets using an R package called dplyr. After this session, you will be able to identify and fix errors, and subset, reformat and summarize your data. A basic working knowledge of R and RStudio will help you get the most out of this session.
Data Visualization Using ggplot2
Tuesday, March 24
Now that you’re familiar with R, you can do more with your plots than the base graphics package allows. After the ggplot2 session, you will be able to create a variety of plot types, alter their aesthetics, and create custom themes. A working knowledge of R, RStudio and dplyr will help you get the most out of this session.
Reproducible Reports using RMarkdown
Tuesday, April 7
Documenting your analysis in a way that is understandable to a colleague (or yourself three months later) can be challenging. One way to make reports more readable, even by people who don’t code, is to alternate human-readable text with machine-readable code. In this session, we will cover creating reproducible reports of this type using knitr. After this session, you will be able to create R Markdown documents, add formatted text and executable code blocks, and render the R Markdown document into a final report.
Basic parallelization in R
Tuesday, April 21
Guest instructor Timothy Kaiser, HPC specialist, will lead this session.
Data & Donuts
Good data management practices are becoming increasingly important in the digital age, especially when sharing your research with others. Data & Donuts gives participants tools and strategies that can maximize research impact. This series will be presented by the Morgan Library Data Management Specialist Mara Sedlins.
Tuesday, March 31
Properly documenting your research practices can be a challenge, especially when everything is digital. This session will discuss best practices for conducting reproducible research. You will learn about scripting, literate computing and version control using R, Markdown and Git. (For a more in-depth introduction to R Markdown and Git, see the Coding & Cookies workshop offered on April 7 and the Data & Donuts workshop on April 14.) We will also discuss how to apply reproducible research concepts even if you don’t code.
Introduction to version control with Git
Tuesday, April 14
We’ve all intuitively used some type of version control in our work, such as saving multiple versions of a document. While easy, this approach can cause file bloat and ultimately become harder to manage. Luckily, formal version control systems have been developed to streamline this process. After this session, you’ll be able to create a Git repository, make and add changes to the repository, and use GitHub to remotely store your repository.
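For readers who want a preview, the workflow described above might look roughly like the following at the command line. This is a minimal sketch, not the workshop's actual material: the file name, commit messages, user identity and GitHub address are all placeholders chosen for illustration, and the remote step is left commented out because it needs a GitHub repository of your own.

```shell
#!/bin/sh
set -eu

# Work in a throwaway directory so nothing on your machine is overwritten.
repo_dir="$(mktemp -d)"
cd "$repo_dir"

# Create a new, empty repository.
git init --quiet

# Identity recorded with each commit (placeholder values, set for this repository only).
git config user.name "Example Researcher"
git config user.email "researcher@example.edu"

# Make a change, stage it, and record it as a commit.
printf 'sample_id,value\n1,3.2\n' > data.csv
git add data.csv
git commit --quiet -m "Add first data file"

# Make a second change and commit it, instead of saving a copy like data_v2.csv.
printf '2,4.7\n' >> data.csv
git add data.csv
git commit --quiet -m "Add second observation"

# Review the history of changes.
git log --oneline

# To store the repository on GitHub, you would connect a remote and push.
# The address below is a placeholder; substitute your own repository.
# git remote add origin https://github.com/<your-username>/<your-repo>.git
# git push -u origin HEAD
```

Each commit is a snapshot with a message explaining what changed, so one file with a recorded history replaces a folder full of numbered copies.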
Open Data: How to comply with data sharing policies
Tuesday, April 28
Many funders and journals require researchers to make their data publicly available when ethically possible. This session will cover common data sharing policies, what to consider when deciding where to share your data, and how to prepare your data to maximize its impact. Attendees will also receive an introduction to sharing data in CSU’s digital repository, Mountain Scholar.