Scientists generate more data than ever before. It can be daunting to determine how to extract insights from a mountain of data.
Data science combines traditional statistics and analytics with programming to produce novel insights, intelligently automated processes, and data-driven decisions. This hands-on workshop will cover the basics of Python for data science. We will leverage the Python programming language and its libraries, which provide powerful and versatile tools for data analysis. This workshop is ideal for scientists who want to write a data analysis pipeline, translate their existing code from R, or transition from Excel data analysis to Python.
What You Will Learn
This workshop will equip you with an understanding of the tools you need to complete a basic data science project using Python.
Students will have the opportunity to apply basic statistical and machine learning methods to datasets of their choice through open-ended independent assignments. Several biology-specific analysis tools in Python will be introduced.
This workshop will give an introduction to the following topics:
- Why Data Science with Python?
- Python Primer (Programming basics)
- Python Data Science Tools (NumPy and Pandas)
- Data Wrangling and Cleaning (Pandas)
- Data Visualization (Matplotlib and Seaborn)
- Statistics and Machine Learning in Python (scikit-learn)
- Bioinformatics in Python (Biopython)
- Where to Go for Help and Additional Resources
We will use the Anaconda distribution (Python 3.7+) for this workshop. Attendees will need a computer with Anaconda successfully installed. Attendees who do not come to the pre-workshop session will need to complete the day 0 section on Canvas before the start of day 1.
A computer and internet access are required. Simultaneous access to two screens is highly recommended for the best learning experience.
A pre-workshop session will be held one day before the workshop, from 5:30 to 6:30 PM ET. It will go over Anaconda and module installation, IDE use, and workshop logistics. This session is not required but is highly encouraged.
This workshop is fast-paced, but is also an introduction with the goal of giving you the tools to further your understanding after the workshop. The following guidelines will help you determine if this is the right workshop for you.
- Required: Strong general computer literacy (familiarity with file systems, locating and opening applications, and locating and opening a terminal, Windows Terminal, or console). We assume basic statistics knowledge.
We may suggest you consider other workshops if:
- your main goal is deep learning (see BIOF 050). BIOF 085 covers machine learning concepts (e.g. tree classification, dimensionality reduction, regression)
- your main goal involves image analysis (BIOF 085 does not cover image analysis)
- you are familiar with multiple programming languages, especially C or C++ (BIOF 085 starts with the assumption of little to no understanding of programming concepts)
Although no grades are given for workshops, each participant will receive Continuing Education Units (CEUs) based on the number of contact hours. One CEU is equal to ten contact hours. Upon completion, each participant will receive a certificate showing completion of the workshop and 2.1 CEUs.
Follow the link to review the Workshop Refund Policy.
- All cancellations must be received in writing via email to email@example.com.
- Cancellations received after 4:00 pm (ET) on business days or received on non-business days are time marked for the following business day.
- All refund payments will be processed by the start of the initial workshop.