COMP 790-124 (Fall 2011) — Machine Learning in Computational Biology
Modern techniques in machine learning and their application to computational biology problems.
Organizational
Time: Tuesdays and Thursdays, 12:30-1:45
Place: Sitterson 011
Prerequisites: Linear algebra, Probability or Statistics, Biology, some programming (Matlab/R/Python)
Instructor: Vladimir Jojic (vjojic@cs.unc.edu)
Office hours: SN 319 Tuesdays 2pm-3pm
Overview
Rapid accumulation of biological data enabled by novel measurement technologies necessitates innovation in data analysis. Machine learning is a growing field that has found numerous applications ranging from basic biology to personalized medicine. Whether discovering signatures of cancer or recommending the best treatment, the modeling and analysis paradigms of machine learning have been fruitfully applied. This course aims to introduce you to the basics of machine learning and their application to burning questions in biology and medicine.
Structure
The course aims to engage you in solving computational biology problems using machine learning. To achieve this, the course will consist of three components:
- Lectures covering ML and comp bio applications
- Student led discussion of relevant papers
- Student project or a written survey of machine learning/comp bio literature
A project that yields a novel and exciting prediction may be selected for experimental validation either commercially (AssayDepot) or by collaborators at UNC.
Grading
- 3 credits:
  - Paper presentation: 30%
  - Project and project paper: 50% (example project proposal: [Zip])
  - Discussion participation: 20%
- 1 credit:
  - Paper presentation: 60%
  - Discussion participation: 40%
Topics covered
Machine Learning
- Linear models for regression/classification (+sparse)
- Mixture and hierarchical models
- Subspace models: factor analysis, PCA (+sparse)
- Graphical models: inference and learning
- Expectation Maximization and variants (including variational approximations)
- Structured models: chains (HMM), trees (phylo- and ontogenies)
- Structure learning in Gaussian models
- Max margin approaches
- Bayesian nonparametrics
- Random projections and compressed sensing
Computational Biology applications
- Motif discovery
- Regulatory network reconstruction
- QTL
- Epitope prediction
- Modeling vaccine and drug responses
- Metagenomics
- Epigenetics
Audience
Students from Computer Science, Bioinformatics, and Biology are welcome. I encourage joint projects between students from complementary disciplines.
Textbook
There is no textbook for this course, but you may find the following helpful:
- "Pattern Recognition and Machine Learning," Chris M. Bishop
- "Probabilistic Graphical Models," Daphne Koller and Nir Friedman
- "The Elements of Statistical Learning: Data Mining, Inference, and Prediction," T. Hastie, R. Tibshirani, J. Friedman (available for download)
- "Information Theory, Inference, and Learning Algorithms," David MacKay (available for download)
- "Bioinformatics: The Machine Learning Approach," Pierre Baldi, Søren Brunak
