Visit our GitHub page to find out more about our current projects. Below you can find a list of our latest software.

FragPipe: a complete proteomics pipeline with the MSFragger search engine at its heart

FragPipe is a GUI for a suite of computational tools enabling comprehensive analysis of proteomics data. It is powered by MSFragger, and includes the Philosopher toolkit for post-processing of MSFragger search results, FDR filtering, label-based quantification, and multi-experiment report generation. Crystal-C and PTM-Shepherd are included to aid interpretation of open search results. Also included in FragPipe are TMT-Integrator for TMT/iTRAQ isobaric labeling-based quantification, IonQuant for label-free quantification with match-between-runs (MBR) functionality, the SpectraST and EasyPQP spectral library building modules, and the DIA-Umpire SE module for direct analysis of data-independent acquisition (DIA) data.

MSFragger: ultrafast and comprehensive peptide identification in mass spectrometry-based proteomics

MSFragger is an ultrafast database search tool that uses a fragment ion indexing method to rapidly perform spectral similarity comparisons. On a typical quad-core workstation, MSFragger can perform an open search (500 Da precursor mass tolerance window) in under 10 minutes for a single LC-MS/MS run. It is implemented in the Java programming language and is available as a standalone JAR.
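
To give a feel for the fragment ion indexing idea, here is a minimal Python sketch. It is not MSFragger's implementation: the bin width, the small residue mass table, and the function names (b_y_ions, build_index, count_matches) are assumptions made for illustration. Theoretical fragment m/z values of all candidate peptides are binned once, and each observed peak then looks up every peptide sharing that bin, so shared-fragment counts are obtained without comparing the spectrum to each peptide one by one.

```python
from collections import defaultdict

# Monoisotopic residue masses (Da) for a handful of amino acids only.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "L": 113.08406, "K": 128.09496, "E": 129.04259,
    "R": 156.10111,
}
PROTON = 1.00728
WATER = 18.01056
BIN_WIDTH = 0.02  # hypothetical fragment bin width in Da


def b_y_ions(peptide):
    """Return singly charged b- and y-ion m/z values for a peptide."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    ions = []
    for i in range(1, len(masses)):
        ions.append(sum(masses[:i]) + PROTON)           # b ion
        ions.append(sum(masses[i:]) + WATER + PROTON)   # y ion
    return ions


def build_index(peptides):
    """Map each fragment m/z bin to the set of peptide ids producing it."""
    index = defaultdict(set)
    for pid, peptide in enumerate(peptides):
        for mz in b_y_ions(peptide):
            index[int(mz / BIN_WIDTH)].add(pid)
    return index


def count_matches(index, peaks):
    """Count, per peptide id, how many observed peaks hit its fragments."""
    counts = defaultdict(int)
    for mz in peaks:
        b = int(mz / BIN_WIDTH)
        hits = set()
        for neighbour in (b - 1, b, b + 1):  # tolerate bin-boundary effects
            hits |= index.get(neighbour, set())
        for pid in hits:
            counts[pid] += 1
    return dict(counts)


peptides = ["GASPVK", "LEAPER", "VGSSLK"]
index = build_index(peptides)
spectrum = b_y_ions("LEAPER")  # idealized spectrum, no noise
counts = count_matches(index, spectrum)
best = max(counts, key=counts.get)
print(peptides[best], counts[best])
```

Because the lookup never consults the precursor mass, the same index serves a narrow-window search and an open search alike; a real search engine would additionally score intensities and handle modifications and charge states.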

Kong, A. T.; Leprevost, F. V.; Avtonomov, D. M.; Mellacheruvu, D.; Nesvizhskii, A. I. MSFragger: Ultrafast and Comprehensive Peptide Identification in Mass Spectrometry-Based Proteomics. Nat. Methods 2017, 14 (5), 513–520.

Philosopher: a complete toolkit for shotgun proteomics data analysis

Philosopher is fast, easy-to-use, scalable, and versatile data analysis software for mass spectrometry-based proteomics. Philosopher is dependency-free and can analyze both traditional database searches and open searches for post-translational modification (PTM) discovery.

Leprevost, F. V.; Haynes, S. E.; Avtonomov, D. M.; Chang, H. Y.; Shanmugam, A. K.; Mellacheruvu, D.; Kong, A. T.; Nesvizhskii, A. I. Philosopher: a versatile toolkit for shotgun proteomics data analysis. Nat. Methods 2020, 1-2.

PTM-Shepherd: a tool for summarizing open search results

PTM-Shepherd automates characterization of PTM profiles detected in open searches based on attributes such as amino acid localization, fragmentation spectra similarity, retention time shifts, and relative modification rates. PTM-Shepherd can also perform multi-experiment comparisons for studying changes in modification profiles, e.g. in data generated in different laboratories or under different conditions.

Geiszler, D. J.; Kong, A. T.; Avtonomov, D. M.; Yu, F.; Leprevost, F. V.; Nesvizhskii, A. I. PTM-Shepherd: Analysis and Summarization of Post-Translational and Chemical Modifications from Open Search Results. bioRxiv 2020.07.08.192583.

TMT-Integrator: a tool for integrating channel abundances from multiple TMT samples

TMT-Integrator extracts and combines channel abundances from multiple TMT- or iTRAQ-labeled samples. It takes PSM tables generated by Philosopher as input, and generates quantification reports at user-specified levels. TMT-Integrator currently provides four quantification levels: gene, protein, peptide, and modified site. TMT-Integrator is included in FragPipe for complete analyses of isobaric labeling experiments.
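
As a rough illustration of rolling PSM-level channel abundances up to the gene level, the Python sketch below converts each PSM to log2 ratios against a reference channel and takes the median per gene. This is a generic roll-up written for this page, not TMT-Integrator's algorithm, and the field names ("gene", "ref", "c1", "c2") are hypothetical rather than actual Philosopher PSM table columns.

```python
import math
from collections import defaultdict
from statistics import median


def gene_level_ratios(psms, channels, reference):
    """Summarize PSM intensities as per-gene median log2 ratios.

    psms: list of dicts holding a 'gene' key plus one intensity per channel.
    channels: channel names to report.
    reference: name of the reference channel used as the denominator.
    """
    per_gene = defaultdict(lambda: defaultdict(list))
    for psm in psms:
        ref = psm[reference]
        if ref <= 0:
            continue  # no usable reference signal for this PSM
        for channel in channels:
            if psm[channel] > 0:  # skip missing or zero intensities
                ratio = math.log2(psm[channel] / ref)
                per_gene[psm["gene"]][channel].append(ratio)
    return {
        gene: {channel: median(values) for channel, values in by_channel.items()}
        for gene, by_channel in per_gene.items()
    }


psms = [
    {"gene": "G1", "ref": 100.0, "c1": 200.0, "c2": 50.0},
    {"gene": "G1", "ref": 100.0, "c1": 400.0, "c2": 50.0},
    {"gene": "G1", "ref": 50.0, "c1": 400.0, "c2": 25.0},
    {"gene": "G2", "ref": 10.0, "c1": 10.0, "c2": 0.0},
]
print(gene_level_ratios(psms, ["c1", "c2"], "ref"))
```

Swapping the grouping key from the gene to a protein, peptide, or modified site identifier gives the corresponding report level in this simplified scheme.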

IonQuant: a label-free quantification tool

IonQuant is a label-free quantification tool for both ion mobility (timsTOF) and non-ion mobility (e.g. Orbitrap) data. It supports match-between-runs (MBR) as well as isotopic chemical labeling. IonQuant is included in the FragPipe user interface.

Yu, F.; Haynes, S. E.; Teo, G. C.; Avtonomov, D. M.; Polasky, D. A.; Nesvizhskii, A. I. Fast Quantitative Analysis of timsTOF PASEF Data with MSFragger and IonQuant. Mol. Cell. Proteomics 2020, 19 (9), 1575–1585.

PD Nodes: MSFragger and Philosopher (PeptideProphet) as Proteome Discoverer nodes

While MSFragger can be used in FragPipe or as a stand-alone search engine, MSFragger can also be used as a processing node in the Thermo Scientific Proteome Discoverer (PD) environment. We also provide PeptideProphet (via Philosopher) as part of the PD processing node, enabling downstream processing of MSFragger search results in PD using either Percolator or PeptideProphet.

BatMass: a Java Software Platform for LC-MS Data Visualization in Proteomics and Metabolomics

BatMass is a mass spectrometry data visualization tool. It was created as an extensible platform that provides basic functionality such as project management, raw mass spectrometry data access, various GUI widgets, and extension points.

Avtonomov, D. M.; Raskind, A.; Nesvizhskii, A. I. BatMass: A Java Software Platform for LC-MS Data Visualization in Proteomics and Metabolomics. J. Proteome Res. 2016, 15 (8), 2500–2509.

DIA-Umpire: comprehensive computational framework for data independent acquisition proteomics

DIA-Umpire is an open source Java program for computational analysis of data independent acquisition (DIA) mass spectrometry-based proteomics data. It enables untargeted peptide and protein identification and quantitation using DIA data, and also incorporates targeted extraction to reduce the number of cases of missing quantitation.

Tsou, C. C.; Avtonomov, D.; Larsen, B.; Tucholska, M.; Choi, H.; Gingras, A. C.; Nesvizhskii, A. I. DIA-Umpire: Comprehensive Computational Framework for Data-Independent Acquisition Proteomics. Nat. Methods 2015, 12 (3), 258–264.

mapDIA: preprocessing and statistical analysis of quantitative proteomics data from data independent acquisition mass spectrometry

mapDIA is software for statistical analysis of differential expression using MS/MS fragment-level quantitative data from data independent acquisition (DIA) proteomics experiments. It offers a series of tools for essential data preprocessing, including a novel retention time-based normalization method and multiple peptide/fragment selection steps.

Teo, G.; Kim, S.; Tsou, C. C.; Collins, B.; Gingras, A. C.; Nesvizhskii, A. I.; Choi, H. MapDIA: Preprocessing and Statistical Analysis of Quantitative Proteomics Data from Data Independent Acquisition Mass Spectrometry. J Proteomics 2015, 129, 108–120.

The CRAPome: a contaminant repository for affinity purification - mass spectrometry data

The Contaminant Repository for Affinity Purification (CRAPome) is a database of annotated negative controls contributed by the proteomics research community. It addresses the common problem of distinguishing real interactions from the non-specific background (also known as 'contaminants'). The database and associated computational tools to score protein interactions are available online. The intuitive web interface can be used to explore the database and to analyze user-uploaded data.

Mellacheruvu, D.; Wright, Z.; Couzens, A. L.; Lambert, J. P.; St-Denis, N. A.; Li, T.; Miteva, Y. V.; Hauri, S.; Sardiu, M. E.; Low, T. Y.; et al. The CRAPome: A Contaminant Repository for Affinity Purification-Mass Spectrometry Data. Nat. Methods 2013, 10 (8), 730–736.

SAINT: probabilistic scoring of affinity purification–mass spectrometry data

SAINT comprises computational models and software for assigning confidence scores to protein-protein interactions in label-free quantitative AP-MS datasets. For each observed interaction with associated label-free quantification, SAINT calculates the probability of a true interaction. The modeling incorporates various data normalization steps and can also use the quantitative information from negative control purifications to improve specificity in small-to-intermediate scale experiments (SAINT v. 2). The method was initially developed for label-free spectral count data, but was later extended to MS1 intensity-based quantitative data (SAINT-MS1). SAINTexpress is a recently developed fast version of the algorithm.

Choi, H.; Liu, G.; Mellacheruvu, D.; Tyers, M.; Gingras, A. C.; Nesvizhskii, A. I. Analyzing Protein-Protein Interactions from Affinity Purification-Mass Spectrometry Data with SAINT. Curr Protoc Bioinformatics 2012, Chapter 8, Unit8.15.

SAINT Website

ProHits: integrated software for mass spectrometry-based interaction proteomics

ProHits is a Laboratory Information Management System (LIMS) for interaction proteomics developed primarily by the Anne-Claude Gingras and Mike Tyers laboratories in collaboration with the Nesvizhskii lab. It is a comprehensive system that integrates the TPP/iProphet for peptide/protein identification and the SAINT suite of tools for interaction scoring.

Nature Biotechnology, 2010

PROHITS Website

LuciPHOr2: site localization of generic post-translational modifications from tandem mass spectrometry data

Luciphor2 re-implements the original Luciphor algorithm (see above) in Java and expands it to work on any post-translational modification. Luciphor2 has several advantages over the previous version: it can run on any computer that runs Java, it can score any PTM, and it can score results from any search tool. Like the original Luciphor, this release can process PeptideProphet XML files (pepXML). It can also read tab-delimited files with scores from any protein search tool.

Bioinformatics, 2014

LuciPHOR2 Website

Abacus: a computational tool for extracting and pre-processing spectral count data for label-free quantitative proteomic analysis

Abacus is a computational tool for extracting label-free quantitative information (spectral counts) from MS/MS data sets. It aggregates data from multiple experiments, adjusts spectral counts to accurately account for peptides shared across multiple proteins, and performs common normalization steps. It can also output the spectral count data at the gene level, thus simplifying the integration and comparison between gene and protein expression data. Abacus is compatible with the widely used Trans-Proteomic Pipeline suite of tools and comes with a graphical user interface that makes it easy to interact with the program. The main aim of Abacus is to streamline the analysis of spectral count data by providing an automated, easy-to-use solution for extracting this information from proteomic data sets for subsequent, more sophisticated statistical analysis.
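
The Python sketch below shows one common way to adjust spectral counts for shared peptides: the count of a shared peptide is split among its proteins in proportion to the counts each protein earns from its unique peptides. This is an illustrative scheme under that assumption and is not claimed to be the exact adjustment Abacus performs; the function name and input layout are hypothetical.

```python
from collections import defaultdict


def adjusted_spectral_counts(peptide_counts, peptide_to_proteins):
    """Distribute shared-peptide spectral counts by unique evidence.

    peptide_counts: dict mapping peptide -> spectral count.
    peptide_to_proteins: dict mapping peptide -> list of protein ids.
    """
    # First pass: counts from peptides that map to exactly one protein.
    unique = defaultdict(float)
    for peptide, n in peptide_counts.items():
        proteins = peptide_to_proteins[peptide]
        if len(proteins) == 1:
            unique[proteins[0]] += n

    # Second pass: split each shared peptide among its proteins.
    adjusted = defaultdict(float, unique)
    for peptide, n in peptide_counts.items():
        proteins = peptide_to_proteins[peptide]
        if len(proteins) > 1:
            total = sum(unique[p] for p in proteins)
            for p in proteins:
                # Equal split when no protein has any unique evidence.
                weight = unique[p] / total if total > 0 else 1.0 / len(proteins)
                adjusted[p] += n * weight
    return dict(adjusted)


counts = {"pepA": 6, "pepB": 2, "pepS": 8}
mapping = {"pepA": ["P1"], "pepB": ["P2"], "pepS": ["P1", "P2"]}
print(adjusted_spectral_counts(counts, mapping))
```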

Proteomics, 2011

Abacus Website

QPROT: statistical method for testing differential expression using protein-level intensity data in label-free quantitative proteomics

QPROT is software for differential protein expression analysis using MS1 and MS/MS-level continuous quantitative data. It features a hierarchical model with a predictive recursive algorithm, and includes percentile normalization and multithreading for fast computation.

Proteomics, 2015

QPROT Website

QSPEC: significance analysis of spectral count data in label-free shotgun proteomics

QSPEC is software for the analysis of differential protein expression using label-free spectral count data. Its hierarchical model pools statistical information for mean and variance estimates across all proteins when only a limited number of replicates is available. In a typical quantitative proteomics experiment, there are rarely enough replicates to make conventional statistical tests such as the t-test applicable. QSPEC addresses this problem and calculates the ratio of likelihoods (Bayes factor) for differential expression for each protein based on certain model assumptions (Poisson-family distributions for count data and Gaussian distribution for intensity data).
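
To make the Bayes factor idea concrete, the Python sketch below compares two models for the spectral counts of a single protein: one Poisson rate per condition versus one rate shared by both conditions. It deliberately uses a simple conjugate Gamma prior with fixed, arbitrarily chosen hyperparameters (shape and rate), so it is a simplification for illustration and not the hierarchical QSPEC model, which pools information across proteins.

```python
import math


def log_marginal(counts, shape=1.0, rate=1.0):
    """Log marginal likelihood of Poisson counts under a Gamma(shape, rate) prior.

    Integrating the Poisson rate out analytically gives a closed form
    that depends only on the number of observations and their sum.
    """
    n, total = len(counts), sum(counts)
    return (
        shape * math.log(rate)
        - math.lgamma(shape)
        + math.lgamma(shape + total)
        - (shape + total) * math.log(rate + n)
        - sum(math.lgamma(x + 1) for x in counts)
    )


def log_bayes_factor(group_a, group_b, shape=1.0, rate=1.0):
    """Log Bayes factor for 'different rates' versus 'one shared rate'."""
    separate = log_marginal(group_a, shape, rate) + log_marginal(group_b, shape, rate)
    shared = log_marginal(list(group_a) + list(group_b), shape, rate)
    return separate - shared


# Positive values favour differential expression, negative values the null.
print(log_bayes_factor([10, 12, 11], [40, 45, 42]))
print(log_bayes_factor([10, 12, 11], [10, 11, 12]))
```

With so few replicates the result is sensitive to the prior, which is one reason a hierarchical model that estimates those quantities from all proteins is attractive.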

Molecular & Cellular Proteomics, 2008

Please use QProt instead

NestedCluster: analysis of Protein Complexes via Model-based Biclustering of Label-free Quantitative AP-MS Data

NestedCluster is a biclustering method for constructing protein complexes using (filtered) high-confidence interaction data from label-free quantitative AP-MS experiments. The method forms bait clusters based on the similarity of quantitative interaction profiles as anchors of protein complexes, and identifies submatrices of prey proteins showing consistent quantitative association within the anchor bait clusters. The statistical model determines the optimal number of bait clusters and prey clusters in the data, automatically yielding the configuration of highly probable protein complexes.

Molecular Systems Biology, 2010

Download NestedCluster

Trans-Proteomic Pipeline

We developed core components of the widely used open source data analysis pipeline (Trans-Proteomic Pipeline, TPP) for primary processing of mass spectrometry-based proteomic data. The pipeline is currently maintained by the Seattle Proteomics Center at the Institute for Systems Biology.

Download TPP