This lesson is being piloted (Beta version)

U-M Carpentries Curriculum: Glossary

Key Points

Introduction to the Workshop
  • We follow The Carpentries Code of Conduct.

  • Our goal is to generate a shareable and reproducible report by the end of the workshop.

  • This lesson is aimed at absolute beginners with no coding experience.

Python for Plotting
  • Python is a free, general-purpose programming language widely used for reproducible data analysis.

  • Use the read_csv() function from the pandas library to read tabular data.

  • Use the seaborn library to create and save data visualizations.

The Unix Shell
  • A shell is a program whose primary purpose is to read commands and run other programs.

  • Tab completion can help you save a lot of time and frustration.

  • The shell’s main advantages are its support for automating repetitive tasks and its capacity to access networked machines.

  • Information is stored in files, which are stored in directories (folders).

  • Directories nested in other directories form a directory tree.

  • cd [path] changes the current working directory.

  • ls [path] prints a listing of a specific file or directory.

  • ls lists the current working directory.

  • pwd prints the user’s current working directory.

  • / is the root directory of the whole file system.

  • A relative path specifies a location starting from the current location.

  • An absolute path specifies a location from the root of the file system.

  • Directory names in a path are separated with / on Unix, but \ on Windows.

  • .. means ‘the directory above the current one’; . on its own means ‘the current directory’.

  • cp [old] [new] copies a file.

  • mkdir [path] creates a new directory.

  • mv [old] [new] moves (renames) a file or directory.

  • rm [path] removes (deletes) a file.

  • * matches zero or more characters in a filename.

  • The shell does not have a trash bin: once something is deleted, it’s really gone.
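
The commands summarized above can be tried together in a throwaway directory. The following is a minimal sketch, assuming a Unix-like shell such as Bash; the directory and file names (project, data, notes.txt, and so on) are made up for illustration and are not part of the lesson data.

```shell
# Work inside a temporary directory so no real files are touched.
# All names below are hypothetical examples.
workdir=$(mktemp -d)
cd "$workdir"            # cd [path] changes the current working directory
pwd                      # print the current working directory

mkdir project            # create a new directory
mkdir project/data       # create a directory nested inside it
echo "some notes" > project/notes.txt   # create a small sample file

ls                       # list the current working directory
ls project               # list a specific directory using a relative path
ls "$workdir/project"    # the same directory, using an absolute path

cp project/notes.txt project/data/notes_copy.txt   # copy a file
mv project/notes.txt project/readme.txt            # move (rename) a file

cd project/data          # go down two levels
cd ..                    # .. is the directory above, so we are now in project
ls ./*.txt               # . is the current directory; * matches zero or more characters

rm readme.txt            # permanently deleted; there is no trash bin
ls                       # only the data directory remains here
```

Because rm cannot be undone, practicing in a temporary directory like this is a safe way to get comfortable with the commands before using them on real files.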

Intro to Git & GitHub
  • Version control is like an unlimited ‘undo’.

  • Version control also allows many people to work in parallel.

Python for Data Analysis
  • Importing libraries is an important first step in preparing a Python environment.

  • Data analysis in Python facilitates reproducible research.

  • There are many useful methods in the pandas library that can aid in data analysis.

  • Assessing the source and structure of your data is an important first step in analysis.

  • Preparing data for analysis can take significant effort and planning.

Jupyter Notebook and Markdown
  • Jupyter Notebook is an easy way to create a report that integrates text, code, and figures.

  • A Jupyter Notebook can be exported to HTML, PDF, and other formats.

Conclusion
  • When you are working out how to code something or debugging, searching the Internet is your best friend.

  • There are several resources at the University of Michigan that you can take advantage of if you need help with your code.

  • We didn’t have time to cover every important coding concept in this workshop, so keep learning once you are comfortable with the material we covered.

  • There are often packages and tools that you can leverage to perform domain-specific analyses, so search for them!

Glossary

The glossary would go here, formatted as:

{:auto_ids}
key word 1
:   explanation 1

key word 2
:   explanation 2

({:auto_ids} is needed at the start so that Jekyll will automatically generate a unique ID for each item to allow other pages to hyperlink to specific glossary entries.) This renders as:

key word 1
    explanation 1
key word 2
    explanation 2