OptiMHC
OptiMHC is a rescoring pipeline for immunopeptidomics data, built to improve peptide identification.
It combines multiple feature sets with machine learning-based rescoring to maximize the number of confidently identified peptides from mass spectrometry experiments.
How It Works
Input files (PepXML / PIN)
→ Parsing & feature extraction
→ PsmContainer (central data structure)
→ Feature generation (Basic, Spectral, RT, MHC binding, PWM, Overlap, …)
→ Machine learning rescoring (Percolator / XGBoost / RandomForest)
→ Visualization & output
- Parse search engine results from PepXML or PIN format into a unified PsmContainer.
- Generate a configurable set of features; each one adds new scoring dimensions to the PSM data.
- Rescore PSMs with machine learning models (Percolator SVM, XGBoost, or RandomForest) trained via mokapot to separate targets from decoys at a controlled FDR.
- Visualize results with q-value curves, feature importance plots, and target/decoy distributions.
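
To make "separating targets from decoys at a controlled FDR" concrete, the sketch below estimates q-values from rescored PSMs by target-decoy counting. It is a simplified, standalone illustration and not the code OptiMHC or mokapot runs: the function name `target_decoy_qvalues`, the conservative `(decoys + 1) / targets` estimate, and the absence of tie handling are all assumptions made for this example.

```python
from typing import List, Sequence


def target_decoy_qvalues(scores: Sequence[float], is_target: Sequence[bool]) -> List[float]:
    """Return one q-value per PSM, in the same order as the input.

    Hypothetical helper for illustration only. The FDR at a score threshold
    is estimated as (#decoys + 1) / #targets among PSMs at or above that
    threshold; a PSM's q-value is the smallest such FDR over all thresholds
    that still accept it. Tied scores are not treated specially.
    """
    if len(scores) != len(is_target):
        raise ValueError("scores and is_target must have the same length")

    # Rank PSM indices from best (highest) to worst score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    # Estimated FDR when accepting every PSM down to each rank.
    fdrs = []
    targets = decoys = 0
    for i in order:
        if is_target[i]:
            targets += 1
        else:
            decoys += 1
        fdrs.append(min(1.0, (decoys + 1) / max(targets, 1)))

    # Walk from worst to best rank, carrying the running minimum FDR so that
    # q-values never increase as the score improves.
    qvalues = [0.0] * len(scores)
    running_min = 1.0
    for rank in range(len(order) - 1, -1, -1):
        running_min = min(running_min, fdrs[rank])
        qvalues[order[rank]] = running_min
    return qvalues


if __name__ == "__main__":
    # Toy input: five rescored PSMs, one of which is a decoy.
    example_scores = [10.0, 9.0, 8.0, 7.0, 6.0]
    example_labels = [True, True, True, False, True]
    for s, t, q in zip(example_scores, example_labels,
                       target_decoy_qvalues(example_scores, example_labels)):
        print(f"score={s:5.1f} target={t!s:5} q={q:.3f}")
```

Target PSMs whose q-value falls at or below the chosen threshold (for example 0.01) would be reported as confident identifications. A better rescoring model pushes decoys further down the ranking, so more targets clear the same threshold.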
Getting Started
- Installation — set up OptiMHC on your system
- Quick Start — run your first rescoring pipeline in minutes
Learn More
- Tutorial — examples, pipeline walkthrough, and feature explanations
- API Reference — detailed module and class documentation
- Development — set up a development environment