Bring your own project (and problems) session
2024-04-19
This session
Objectives
- Take a few moments to reflect on our own current or intended practices with data
- Identify opportunities and challenges, especially with respect to reproducibility
Process
- Form discussion groups
- Take notes via the HackMD collaborative notebook (links coming)
- It is very helpful if a few people take notes while others contribute to the discussion
- Rotate this responsibility
- Please take care to listen and be respectful of everyone’s opinions
- Notes will be shared via the workshop GitHub repository
1. Data Acquisition
- Data and Metadata
- What types/forms of data do you each work with? Write as many of these down as you can in a list. (e.g., photos)
- What would be useful to know about how each of these types of data was collected? (think: camera brand, field of view)
- How can we balance the ideals of data collection with the practical challenges and constraints often faced in real-world research settings?
- List some data or data collection practices that you think you (or others) could document better and would be of high value
- What form would that documentation take?
2. Raw Data Storage and Organization
- Storage
- Where do you tend to store raw data you’ve collected? (e.g., personal computer hard disk)
- What are some of the challenges you face when storing your raw data? (e.g., limited capacity)
- Organization
- How do you tend to organize raw data as you are collecting it (or after you collect it), and where did that organizational practice come from?
- What are some issues that an organizational strategy has addressed or caused for you (or others)?
3. Data Analysis
- Data Cleaning and Transformation
- What does it mean to you to “clean” data, and where do you put this data?
- How do you incorporate and document manual steps?
- Analyses
- What are some challenges you’ve faced, or foresee facing, when it comes to documenting how you’ve analyzed data?
- To what extent does a tool, such as Git and GitHub, address these challenges?
- What are some pros/cons of sharing your analysis code?
4. Data Archival and Sharing
- Motivations
- What motivates you to share archival data with others?
- What are the challenges you might face when attempting to share archival data?
- Strategy
- How do you, or would you, go about sharing a dataset? What are some good methods you’ve used or seen used?
- What are the merits and drawbacks of institutional support for data archival and sharing?
Next up
- Grab some ☕️ and 🍩
- Research and Computational Data Management with Dr. Katherine Ireland and Dr. Camila Lívio
Please fill out the post-workshop survey!