WWZ Workshop: Clean Code in Context
Agenda
- 13:00-13:30: What is clean code and why should you use it?
- 13:30-14:15: File Structure, Data Formats and More
- 14:15-14:30: Break
- 14:30-15:15: Exercise
- 15:15-16:00: An intro to coding with AI
Recap Crash Course
“I like my code to be elegant and efficient. The logic should be straightforward to make it hard for bugs to hide, the dependencies minimal to ease maintenance, error handling complete according to an articulated strategy, and performance close to optimal so as not to tempt people to make the code messy with unprincipled optimizations. Clean code does one thing well.”
— Bjarne Stroustrup, inventor of C++
Reference: Martin, R. (2015): Clean Code. Upper Saddle River, NJ: Prentice Hall.
Why you should use clean code
- It saves time & increases efficiency
- Clean code reduces the effort of trying to understand a script later.
- It makes collaboration easier
- No need for project partners to explain their code when it’s clean.
- Well-organized scripts are easier to refer back to
- You’ll likely reuse your code a lot during your career.
- Reproducibility
- Saves time when putting together reproduction materials.
Recap
- Code for people, not machines
- Use the right names
- Adhere to standards and be consistent
- Use comments & avoid unnecessary ones
- Use a lab journal for different parameters, models, etc.
- DRY: Don’t Repeat Yourself
- YAGNI: You Ain’t Gonna Need It
- KISS: Keep It Simple, Stupid
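As a reminder of what DRY and descriptive names look like in practice, here is a minimal Python sketch. The scores and the function name `summarize` are made up for illustration; the point is only that one well-named function replaces two copy-pasted blocks.

```python
from statistics import mean

# Hypothetical survey scores for two groups (made-up numbers).
scores_treatment = [4, 5, 3, 5]
scores_control = [2, 3, 3, 4]


def summarize(scores):
    """Return the mean and the range of a list of scores."""
    return {"mean": mean(scores), "range": max(scores) - min(scores)}


# One function, called twice, instead of two copy-pasted blocks.
summary_treatment = summarize(scores_treatment)
summary_control = summarize(scores_control)

print(summary_treatment)
print(summary_control)
```

If the summary ever needs a third statistic, it is added in one place and both groups pick it up.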
Clean Code in Context: File Structure

A good start, but not ideal.
Why File Structure?
Based on the MIT Communication Lab:
Why do it? The arguments for clean code apply here as well:
- It simplifies cooperation
- It creates a more streamlined analysis workflow
- No time wasted when trying to understand project structure
- Reproducibility
- Future you will be grateful
- Copy-paste structure for future projects
- Less time needed to refamiliarize yourself with an older project & its code
General Good Practices
- One main folder per project
- Subfolders: number and type depend on the project
- Separate raw data from edited data
- Consistent naming for subfolders and files
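One way to make these practices routine is to generate the same skeleton for every new project. The following Python sketch is only an illustration: the subfolder names in `SUBFOLDERS` and the project name `wage_gap_study` are assumptions, not a prescribed layout, and the number and type of subfolders should follow the project.

```python
from pathlib import Path
import tempfile

# Hypothetical subfolder names; adjust number and type to the project.
# Raw and processed data live in separate folders.
SUBFOLDERS = ["data/raw", "data/processed", "code", "output", "docs"]


def create_project(root, name):
    """Create one main project folder with consistently named subfolders."""
    project_dir = Path(root) / name
    for subfolder in SUBFOLDERS:
        (project_dir / subfolder).mkdir(parents=True, exist_ok=True)
    return project_dir


# Demo in a temporary directory so nothing is written to real project folders.
with tempfile.TemporaryDirectory() as tmp:
    project = create_project(tmp, "wage_gap_study")
    created = sorted(
        p.relative_to(project).as_posix() for p in project.rglob("*") if p.is_dir()
    )

print(created)
```

Because `exist_ok=True` is set, running the function again on an existing project does no harm, which makes it safe to reuse as a template.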
Naming conventions
- Use underscores, hyphens, or periods as delimiters
- Avoid spaces
- CamelCase: e.g., firstNameLastName
Examples
Instead of confusing names like final_final_thisone, consider: v1, v2, etc. Use YYYYMMDD for dates.
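A small helper can enforce such a naming scheme so that nobody has to remember it. This Python sketch is one possible convention, not a standard: the function name `versioned_name` and the order stem, date, version are assumptions.

```python
from datetime import date


def versioned_name(stem, version, day, extension):
    """Build a file name such as 'survey_clean_20240131_v2.csv'."""
    # %Y%m%d gives the YYYYMMDD form; underscores act as delimiters.
    return f"{stem}_{day:%Y%m%d}_v{version}.{extension}"


name = versioned_name("survey_clean", 2, date(2024, 1, 31), "csv")
print(name)  # survey_clean_20240131_v2.csv
```

A side benefit of YYYYMMDD is that an alphabetical file listing is also a chronological one.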

Folder structure

Source: Claire Duvallet
- Separate raw data from edited data.
- I also recommend this for coding in general: rename a data frame when you edit (subset, aggregate…) it.
- If you mess up or want to do something differently, you won’t have to reload the data.
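The renaming habit can be sketched as follows. To keep the example free of extra packages, a plain list of dictionaries stands in for a data frame; the records and all variable names are made up for illustration.

```python
# Hypothetical raw records; a list of dictionaries stands in for a data frame.
survey_raw = [
    {"id": 1, "canton": "BS", "income": 5200},
    {"id": 2, "canton": "ZH", "income": 6100},
    {"id": 3, "canton": "BS", "income": 4800},
]

# Subset under a new name instead of overwriting survey_raw.
survey_bs = [row for row in survey_raw if row["canton"] == "BS"]

# Aggregate under yet another name.
mean_income_bs = sum(row["income"] for row in survey_bs) / len(survey_bs)

# survey_raw is still complete, so a different subset needs no reload.
print(len(survey_raw), len(survey_bs), mean_income_bs)
```

The same pattern applies to real data frames: each editing step gets its own name, and the raw object stays untouched.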

Source: Case studies from mitcommlab
- Strive to use universal formats across operating systems.
- Recommended: .rtf (text) and .csv (spreadsheets).
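To illustrate why .csv travels well, here is a minimal Python sketch that writes a small table to a CSV file and reads it back using only the standard library. The rows and the file name `scores_v1.csv` are made up; note that CSV stores everything as text, so the values are kept as strings here.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical table; CSV has no types, so values are stored as strings.
rows = [
    {"id": "1", "group": "treatment", "score": "4.5"},
    {"id": "2", "group": "control", "score": "3.8"},
]

with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "scores_v1.csv"

    # newline="" lets the csv module handle line endings on every OS.
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["id", "group", "score"])
        writer.writeheader()
        writer.writerows(rows)

    with csv_path.open("r", newline="", encoding="utf-8") as handle:
        rows_read = [dict(row) for row in csv.DictReader(handle)]

print(rows_read == rows)
```

Any spreadsheet program or statistics package can open the resulting file, which is the main argument for the format.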
LaTeX
- LaTeX is independent of OS.
- The learning curve is steep, but worth it.
- I recommend Overleaf for online editing.
File encoding
- Use UTF-8 encoding to solve internationalization issues.
- Avoid special characters in names, which might cause issues.
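Both points can be sketched in a few lines of Python. The helper `safe_name` is a hypothetical illustration, not an established tool: it strips accents and replaces everything except letters and digits with underscores, and characters without an ASCII decomposition are simply dropped. The file content is then written and read with the encoding stated explicitly.

```python
import re
import tempfile
import unicodedata
from pathlib import Path


def safe_name(label):
    """Reduce a label to lowercase ASCII letters, digits and underscores."""
    # NFKD splits accented letters into base letter + accent mark.
    decomposed = unicodedata.normalize("NFKD", label)
    # Dropping non-ASCII bytes removes the accent marks.
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "_", ascii_only.lower()).strip("_")


file_name = safe_name("Zürich Löhne (final)") + ".txt"

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / file_name
    # State the encoding explicitly instead of relying on the OS default.
    path.write_text("Löhne in Zürich", encoding="utf-8")
    content = path.read_text(encoding="utf-8")

print(file_name)  # zurich_lohne_final.txt
print(content)
```

Special characters stay where they belong, inside the UTF-8 encoded content, while the file name itself remains plain ASCII.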
R Markdown
- R Markdown is a tool to produce PDF or HTML documents with embedded code chunks.
- It is ideal for reports, for others or for yourself.
Integrating GitHub with RStudio
Follow the instructions here for GitHub-RStudio integration. Once it is set up, you can make commits directly from RStudio.

Examples & Resources