PeptideForest: Semisupervised machine learning integrating multiple search engines for peptide identification

The first step in bottom-up proteomics is the assignment of measured fragmentation mass spectra to peptide sequences, also known as peptide spectrum matches. In recent years novel algorithms have pushed the assignment to new heights; unfortunately, different algorithms come with different strengths...

Bibliographic details
Main authors: Ranff, Tristan (author), Dennison, Matthew (author), Bédorf, Jeroen (author), Schulze, Stefan (author), Zinn, Nico (author), Bantscheff, Marcus (author), van Heugten, Jasper J. R. M. (author), Fufezan, Christian (author)
Document type: Article (Journal)
Language: English
Published: January 22, 2025
In: Journal of proteome research
Year: 2025, Volume: 24, Issue: 2, Pages: 929-939
ISSN: 1535-3907
DOI: 10.1021/acs.jproteome.4c00686
Online access: Publisher, free of charge, full text: https://doi.org/10.1021/acs.jproteome.4c00686
Publisher, free of charge, full text: https://pubs.acs.org/doi/10.1021/acs.jproteome.4c00686
Statement of responsibility: Tristan Ranff, Matthew Dennison, Jeroen Bédorf, Stefan Schulze, Nico Zinn, Marcus Bantscheff, Jasper J.R.M. van Heugten, and Christian Fufezan

MARC

LEADER 00000naa a2200000 c 4500
001 193188448X
003 DE-627
005 20250728121317.0
007 cr uuu---uuuuu
008 250728s2025 xx |||||o 00| ||eng c
024 7 |a 10.1021/acs.jproteome.4c00686  |2 doi 
035 |a (DE-627)193188448X 
035 |a (DE-599)KXP193188448X 
040 |a DE-627  |b ger  |c DE-627  |e rda 
041 |a eng 
084 |a 30  |2 sdnb 
100 1 |a Ranff, Tristan  |e VerfasserIn  |0 (DE-588)1354615808  |0 (DE-627)191575030X  |4 aut 
245 1 0 |a PeptideForest: Semisupervised machine learning integrating multiple search engines for peptide identification  |c Tristan Ranff, Matthew Dennison, Jeroen Bédorf, Stefan Schulze, Nico Zinn, Marcus Bantscheff, Jasper J.R.M. van Heugten, and Christian Fufezan 
264 1 |c January 22, 2025 
300 |b Illustrationen 
300 |a 11 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Gesehen am 28.07.2025 
520 |a The first step in bottom-up proteomics is the assignment of measured fragmentation mass spectra to peptide sequences, also known as peptide spectrum matches. In recent years novel algorithms have pushed the assignment to new heights; unfortunately, different algorithms come with different strengths and weaknesses and choosing the appropriate algorithm poses a challenge for the user. Here we introduce PeptideForest, a semisupervised machine learning approach that integrates the assignments of multiple algorithms to train a random forest classifier to alleviate that issue. Additionally, PeptideForest increases the number of peptide-to-spectrum matches that exhibit a q-value lower than 1% by 25.2 ± 1.6% compared to MS-GF+ data on samples containing mixed HEK and Escherichia coli proteomes. However, an increase in quantity does not necessarily reflect an increase in quality and this is why we devised a novel approach to determine the quality of the assigned spectra through TMT quantification of samples with known ground truths. Thereby, we could show that the increase in PSMs below 1% q-value does not come with a decrease in quantification quality and as such PeptideForest offers a possibility to gain deeper insights into bottom-up proteomics. PeptideForest has been integrated into our pipeline framework Ursgal and can therefore be combined with a wide array of algorithms. 
700 1 |a Dennison, Matthew  |e VerfasserIn  |4 aut 
700 1 |a Bédorf, Jeroen  |e VerfasserIn  |4 aut 
700 1 |a Schulze, Stefan  |e VerfasserIn  |4 aut 
700 1 |a Zinn, Nico  |e VerfasserIn  |0 (DE-588)14303555X  |0 (DE-627)704370808  |0 (DE-576)334583640  |4 aut 
700 1 |a Bantscheff, Marcus  |d 1972-  |e VerfasserIn  |0 (DE-588)12355263X  |0 (DE-627)082625409  |0 (DE-576)293763410  |4 aut 
700 1 |a van Heugten, Jasper J. R. M.  |e VerfasserIn  |4 aut 
700 1 |a Fufezan, Christian  |e VerfasserIn  |0 (DE-588)1219004677  |0 (DE-627)1734651814  |4 aut 
773 0 8 |i Enthalten in  |t Journal of proteome research  |d Washington, DC : ACS Publications, 2002  |g 24(2025), 2, Seite 929-939  |h Online-Ressource  |w (DE-627)340077174  |w (DE-600)2065254-9  |w (DE-576)096704608  |x 1535-3907  |7 nnas  |a PeptideForest: Semisupervised machine learning integrating multiple search engines for peptide identification 
773 1 8 |g volume:24  |g year:2025  |g number:2  |g pages:929-939  |g extent:11  |a PeptideForest: Semisupervised machine learning integrating multiple search engines for peptide identification 
856 4 0 |u https://doi.org/10.1021/acs.jproteome.4c00686  |x Verlag  |x Resolving-System  |z kostenfrei  |3 Volltext 
856 4 0 |u https://pubs.acs.org/doi/10.1021/acs.jproteome.4c00686  |x Verlag  |z kostenfrei  |3 Volltext 
951 |a AR 
992 |a 20250728 
993 |a Article 
994 |a 2025 
998 |g 1219004677  |a Fufezan, Christian  |m 1219004677:Fufezan, Christian  |d 140000  |e 140000PF1219004677  |k 0/140000/  |p 8  |y j 
998 |g 1354615808  |a Ranff, Tristan  |m 1354615808:Ranff, Tristan  |d 120000  |e 120000PR1354615808  |k 0/120000/  |p 1  |x j 
999 |a KXP-PPN193188448X  |e 4750083704 
BIB |a Y 
SER |a journal 
JSO |a {"physDesc":[{"noteIll":"Illustrationen","extent":"11 S."}],"relHost":[{"physDesc":[{"extent":"Online-Ressource"}],"origin":[{"dateIssuedDisp":"2002-","dateIssuedKey":"2002","publisher":"ACS Publications","publisherPlace":"Washington, DC"}],"id":{"issn":["1535-3907"],"zdb":["2065254-9"],"eki":["340077174"]},"name":{"displayForm":["American Chemical Society"]},"pubHistory":["2002 -"],"part":{"year":"2025","issue":"2","pages":"929-939","text":"24(2025), 2, Seite 929-939","volume":"24","extent":"11"},"type":{"media":"Online-Ressource","bibl":"periodical"},"note":["Gesehen am 25.08.2020"],"disp":"PeptideForest: Semisupervised machine learning integrating multiple search engines for peptide identificationJournal of proteome research","recId":"340077174","language":["eng"],"title":[{"title_sort":"Journal of proteome research","title":"Journal of proteome research"}]}],"name":{"displayForm":["Tristan Ranff, Matthew Dennison, Jeroen Bédorf, Stefan Schulze, Nico Zinn, Marcus Bantscheff, Jasper J.R.M. 
van Heugten, and Christian Fufezan"]},"origin":[{"dateIssuedKey":"2025","dateIssuedDisp":"January 22, 2025"}],"id":{"eki":["193188448X"],"doi":["10.1021/acs.jproteome.4c00686"]},"note":["Gesehen am 28.07.2025"],"type":{"media":"Online-Ressource","bibl":"article-journal"},"language":["eng"],"recId":"193188448X","person":[{"family":"Ranff","given":"Tristan","display":"Ranff, Tristan","roleDisplay":"VerfasserIn","role":"aut"},{"family":"Dennison","given":"Matthew","display":"Dennison, Matthew","roleDisplay":"VerfasserIn","role":"aut"},{"given":"Jeroen","family":"Bédorf","role":"aut","roleDisplay":"VerfasserIn","display":"Bédorf, Jeroen"},{"family":"Schulze","given":"Stefan","roleDisplay":"VerfasserIn","display":"Schulze, Stefan","role":"aut"},{"family":"Zinn","given":"Nico","display":"Zinn, Nico","roleDisplay":"VerfasserIn","role":"aut"},{"given":"Marcus","family":"Bantscheff","role":"aut","roleDisplay":"VerfasserIn","display":"Bantscheff, Marcus"},{"roleDisplay":"VerfasserIn","display":"van Heugten, Jasper J. R. M.","role":"aut","family":"van Heugten","given":"Jasper J. R. M."},{"role":"aut","display":"Fufezan, Christian","roleDisplay":"VerfasserIn","given":"Christian","family":"Fufezan"}],"title":[{"title":"PeptideForest: Semisupervised machine learning integrating multiple search engines for peptide identification","title_sort":"PeptideForest: Semisupervised machine learning integrating multiple search engines for peptide identification"}]} 
SRT |a RANFFTRISTPEPTIDEFOR2220