Laboratory processes over the past two decades have produced biological data at an exponential rate. When scientists first sequenced the human genome, the global community presumed that miracle cures were around the next corner and that biology had been practically "solved". However, humanity's ability to understand biology is often not limited by our ability to produce data. The bottleneck is understanding the data.
This course is for anyone who wants to help alleviate that critical bottleneck.
The curriculum was originally developed to help life science graduate students analyze their experimental data. Anyone interested in exploring bioinformatics or strengthening their Python skills will find the course valuable. Because this is an intermediate course, students should either 1) have some previous experience with Python or 2) be comfortable enough with programming in general to pick up a new language quickly. If you're still not sure whether the course is for you, contact us!
Dr. Jessime Kirk received his BS in Biochemistry from the University of Kentucky and his PhD in Bioinformatics and Computational Biology from the University of North Carolina at Chapel Hill. He helped found UNC's H2L2C course to help life science graduate students learn how to program. The course went on to train more than 300 students in its first 5 years. Jessime now works as a Bioinformatics Engineer at Invitae, a leader in advanced medical genetics, developing NGS-based diagnostic assays. For more information on Jessime, check out his resume or website.
Specifics of each week will change slightly depending on the pacing and interests of the students in the course, but here's InPyBio-20's syllabus:
Get students up and running on the platform and review basic programming skills like looping, IO, and simple data types. An early headache for many students is setting up an environment and workflow that fits their development needs; we'll take the time to make sure everyone has what they need to make the most of the coming weeks.
Students will learn how and when to use arrays. We'll also cover interacting with the filesystem from within Python. By the end of week two, students will have the fundamentals they need for most analysis programs they'll ever write: loading data from a source, transforming it to fit their experiment, and outputting it in a format that provides insights.
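To give a flavor of that load, transform, output pattern, here is a minimal sketch using only the standard library. The gene names, counts, and function names are made up for illustration and are not taken from the course materials; real lessons may use different data and tools.

```python
import csv
import io
import math

# Hypothetical input: a tiny CSV of expression counts (invented for this sketch).
RAW = """gene,control,treated
geneA,100,400
geneB,50,25
geneC,80,80
"""


def load(handle):
    """Load rows from a CSV source into a list of dicts."""
    return list(csv.DictReader(handle))


def transform(rows):
    """Compute the log2 fold change (treated / control) for each gene."""
    return [
        {
            "gene": row["gene"],
            "log2_fc": math.log2(int(row["treated"]) / int(row["control"])),
        }
        for row in rows
    ]


def output(results, handle):
    """Write the results as tab-separated text."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["gene", "log2_fc"])
    for result in results:
        writer.writerow([result["gene"], f"{result['log2_fc']:.2f}"])


# An in-memory string stands in for a file on disk to keep the sketch self-contained.
results = transform(load(io.StringIO(RAW)))
report = io.StringIO()
output(results, report)
print(report.getvalue())
```

In practice the `io.StringIO` objects would be replaced by files opened with `open(...)`, but the three steps stay the same.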
This lesson is a first pass at processing columnar biological data and encourages students to think deeply about the data structures they employ to tackle a given problem. Virtually every life scientist eventually needs to analyze columnar data that's too large for Excel.
Our second round of columnar data processing introduces Pandas dataframes, which are a common and extremely powerful way to manipulate 2D data. Pandas is a large library with many features; we'll focus on building mental models for dataframe transformations so students can explore more on their own.
Shifting gears slightly, we'll use cell images to practice more advanced uses of arrays.
This lesson gives students an overview of how to think about object-oriented programming in Python. Classes provide students with a framework for building more extensive and capable pipelines. Students will have the opportunity to write their own classes and practice translating simple models into code.
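As an illustration of translating a simple model into code, the sketch below wraps a DNA sequence in a small class. The class name, attributes, and methods are hypothetical choices for this example, not the course's own exercise.

```python
class DNASequence:
    """A simple model of a named DNA sequence (illustrative only)."""

    # Translation table mapping each base to its complement.
    COMPLEMENT = str.maketrans("ACGT", "TGCA")

    def __init__(self, name, bases):
        self.name = name
        self.bases = bases.upper()

    def __len__(self):
        return len(self.bases)

    def gc_content(self):
        """Return the fraction of bases that are G or C."""
        if not self.bases:
            return 0.0
        gc_count = self.bases.count("G") + self.bases.count("C")
        return gc_count / len(self.bases)

    def reverse_complement(self):
        """Return a new DNASequence holding the reverse complement."""
        complemented = self.bases.translate(self.COMPLEMENT)
        return DNASequence(self.name + "_rc", complemented[::-1])


seq = DNASequence("example", "atgcgc")
print(len(seq), round(seq.gc_content(), 2), seq.reverse_complement().bases)
```

Bundling the data (`bases`) with the operations on it (`gc_content`, `reverse_complement`) is the kind of structure that makes larger pipelines easier to extend.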
Networks are another common and dynamic way to represent biological data. This lesson gets students comfortable with the many ways networks can be represented in Python, as well as some basic algorithms that can be applied to networks.
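One common representation is an adjacency dictionary, and one basic algorithm is breadth-first search. The sketch below assumes a made-up, undirected protein interaction network; the node names and function names are placeholders chosen for this example.

```python
from collections import deque

# Hypothetical undirected interactions between proteins (invented for this sketch).
EDGES = [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P1", "P5")]


def build_adjacency(edges):
    """Convert an edge list into a dict mapping each node to its set of neighbors."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def shortest_path_length(adjacency, start, goal):
    """Return the number of edges on a shortest path, or None if unreachable."""
    if start not in adjacency or goal not in adjacency:
        return None
    visited = {start}
    queue = deque([(start, 0)])
    while queue:
        node, distance = queue.popleft()
        if node == goal:
            return distance
        for neighbor in adjacency[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, distance + 1))
    return None


network = build_adjacency(EDGES)
print(shortest_path_length(network, "P5", "P4"))
```

The same network could instead be stored as an edge list or an adjacency matrix; which one fits best depends on the questions being asked of the data.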
We'll use this last week to reflect on everything we've learned and combine it into a bioinformatics pipeline. This week will give students the opportunity to explore how more complex bioinformatics projects are structured. Students will also be free to spend this time catching up on previous lessons if they prefer.
InPyBio-20 isn't just a bi-weekly lecture. There are a number of components to the course:
Additional details on all points can be found here.
CodeStories is a one-of-a-kind blend of the best that digital education has to offer. While there are plenty of resources on the web for learning to program, we developed CodeStories because nothing available fit our needs. CodeStories is tailored for professionals: individuals (such as life scientists) who already have careers in data-intensive fields and who have an immediate need to make sense of their data.
Many sites, like Codecademy, focus on teaching beginner programming. While this is useful for many people, the breadth makes much of the material out-of-scope for professionals with specific use cases. CodeStories curricula have been thoughtfully developed by experts to be immediately applicable for students. Our first course, Intermediate Python for Bioinformatics (InPyBio), was developed over five years of teaching life science graduate students.
On the other hand, sites like DataCamp are much more focused on data-intensive fields, but they aim to teach individuals who want to become full-time data scientists. Again, these sites sacrifice personalization for broader appeal. We understand that topics like machine learning are fundamentally interesting, but sometimes there isn't time to learn everything. CodeStories teaches you what you need to know so you can unblock yourself at work.
Finally, there are MOOC sites like Udemy with large collections of narrowly focused courses, including some in bioinformatics. The dirty secret of MOOCs, however, is that only 2% of students who start a course finish it. CodeStories offers live classes with other students and 1:1 time with instructors. This environment provides the personalization, accountability, and motivation we all need to keep learning. The 1:1 time allows students to get unstuck at the points where they would otherwise abandon a standard MOOC.
Our refund policy is straightforward. If you ask for your money back, we'll happily refund you the full amount.
We understand that every student has specific learning outcomes in mind for a course like this. We're doing everything we can to make the course both specific enough to be valuable to individuals and general enough to be engaging to a group of students, all while being as explicit as possible about our course goals. All of these things can be tricky; if a student feels we missed the mark for them, we encourage that individual to take advantage of our refund offer. That said, potential students can help us by reading our syllabus, gauging their availability to fully commit to the course, and not hesitating to ask us clarifying questions anywhere they feel we aren't clear.