PeptideForest: Semisupervised machine learning integrating multiple search engines for peptide identification
The first step in bottom-up proteomics is the assignment of measured fragmentation mass spectra to peptide sequences, also known as peptide spectrum matches. In recent years novel algorithms have pushed the assignment to new heights; unfortunately, different algorithms come with different strengths...
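| Main authors: | Tristan Ranff, Matthew Dennison, Jeroen Bédorf, Stefan Schulze, Nico Zinn, Marcus Bantscheff, Jasper J.R.M. van Heugten, Christian Fufezan |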
|---|---|
| Document type: | Article (Journal) |
| Language: | English |
| Published: | January 22, 2025 |
| In: | Journal of proteome research, Year: 2025, Volume: 24, Issue: 2, Pages: 929-939 |
| ISSN: | 1535-3907 |
| DOI: | 10.1021/acs.jproteome.4c00686 |
| Online access: | Publisher, free of charge, full text: https://doi.org/10.1021/acs.jproteome.4c00686; Publisher, free of charge, full text: https://pubs.acs.org/doi/10.1021/acs.jproteome.4c00686 |
| Statement of responsibility: | Tristan Ranff, Matthew Dennison, Jeroen Bédorf, Stefan Schulze, Nico Zinn, Marcus Bantscheff, Jasper J.R.M. van Heugten, and Christian Fufezan |
MARC
| LEADER | 00000naa a2200000 c 4500 | ||
|---|---|---|---|
| 001 | 193188448X | ||
| 003 | DE-627 | ||
| 005 | 20250728121317.0 | ||
| 007 | cr uuu---uuuuu | ||
| 008 | 250728s2025 xx |||||o 00| ||eng c | ||
| 024 | 7 | |a 10.1021/acs.jproteome.4c00686 |2 doi | |
| 035 | |a (DE-627)193188448X | ||
| 035 | |a (DE-599)KXP193188448X | ||
| 040 | |a DE-627 |b ger |c DE-627 |e rda | ||
| 041 | |a eng | ||
| 084 | |a 30 |2 sdnb | ||
| 100 | 1 | |a Ranff, Tristan |e VerfasserIn |0 (DE-588)1354615808 |0 (DE-627)191575030X |4 aut | |
| 245 | 1 | 0 | |a PeptideForest: Semisupervised machine learning integrating multiple search engines for peptide identification |c Tristan Ranff, Matthew Dennison, Jeroen Bédorf, Stefan Schulze, Nico Zinn, Marcus Bantscheff, Jasper J.R.M. van Heugten, and Christian Fufezan |
| 264 | 1 | |c January 22, 2025 | |
| 300 | |b Illustrationen | ||
| 300 | |a 11 | ||
| 336 | |a Text |b txt |2 rdacontent | ||
| 337 | |a Computermedien |b c |2 rdamedia | ||
| 338 | |a Online-Ressource |b cr |2 rdacarrier | ||
| 500 | |a Gesehen am 28.07.2025 | ||
| 520 | |a The first step in bottom-up proteomics is the assignment of measured fragmentation mass spectra to peptide sequences, also known as peptide spectrum matches. In recent years novel algorithms have pushed the assignment to new heights; unfortunately, different algorithms come with different strengths and weaknesses and choosing the appropriate algorithm poses a challenge for the user. Here we introduce PeptideForest, a semisupervised machine learning approach that integrates the assignments of multiple algorithms to train a random forest classifier to alleviate that issue. Additionally, PeptideForest increases the number of peptide-to-spectrum matches that exhibit a q-value lower than 1% by 25.2 ± 1.6% compared to MS-GF+ data on samples containing mixed HEK and Escherichia coli proteomes. However, an increase in quantity does not necessarily reflect an increase in quality and this is why we devised a novel approach to determine the quality of the assigned spectra through TMT quantification of samples with known ground truths. Thereby, we could show that the increase in PSMs below 1% q-value does not come with a decrease in quantification quality and as such PeptideForest offers a possibility to gain deeper insights into bottom-up proteomics. PeptideForest has been integrated into our pipeline framework Ursgal and can therefore be combined with a wide array of algorithms. | ||
| 700 | 1 | |a Dennison, Matthew |e VerfasserIn |4 aut | |
| 700 | 1 | |a Bédorf, Jeroen |e VerfasserIn |4 aut | |
| 700 | 1 | |a Schulze, Stefan |e VerfasserIn |4 aut | |
| 700 | 1 | |a Zinn, Nico |e VerfasserIn |0 (DE-588)14303555X |0 (DE-627)704370808 |0 (DE-576)334583640 |4 aut | |
| 700 | 1 | |a Bantscheff, Marcus |d 1972- |e VerfasserIn |0 (DE-588)12355263X |0 (DE-627)082625409 |0 (DE-576)293763410 |4 aut | |
| 700 | 1 | |a van Heugten, Jasper J. R. M. |e VerfasserIn |4 aut | |
| 700 | 1 | |a Fufezan, Christian |e VerfasserIn |0 (DE-588)1219004677 |0 (DE-627)1734651814 |4 aut | |
| 773 | 0 | 8 | |i Enthalten in |t Journal of proteome research |d Washington, DC : ACS Publications, 2002 |g 24(2025), 2, Seite 929-939 |h Online-Ressource |w (DE-627)340077174 |w (DE-600)2065254-9 |w (DE-576)096704608 |x 1535-3907 |7 nnas |a PeptideForest: Semisupervised machine learning integrating multiple search engines for peptide identification |
| 773 | 1 | 8 | |g volume:24 |g year:2025 |g number:2 |g pages:929-939 |g extent:11 |a PeptideForest: Semisupervised machine learning integrating multiple search engines for peptide identification |
| 856 | 4 | 0 | |u https://doi.org/10.1021/acs.jproteome.4c00686 |x Verlag |x Resolving-System |z kostenfrei |3 Volltext |
| 856 | 4 | 0 | |u https://pubs.acs.org/doi/10.1021/acs.jproteome.4c00686 |x Verlag |z kostenfrei |3 Volltext |
| 951 | |a AR | ||
| 992 | |a 20250728 | ||
| 993 | |a Article | ||
| 994 | |a 2025 | ||
| 998 | |g 1219004677 |a Fufezan, Christian |m 1219004677:Fufezan, Christian |d 140000 |e 140000PF1219004677 |k 0/140000/ |p 8 |y j | ||
| 998 | |g 1354615808 |a Ranff, Tristan |m 1354615808:Ranff, Tristan |d 120000 |e 120000PR1354615808 |k 0/120000/ |p 1 |x j | ||
| 999 | |a KXP-PPN193188448X |e 4750083704 | ||
| BIB | |a Y | ||
| SER | |a journal | ||
| JSO | |a {"physDesc":[{"noteIll":"Illustrationen","extent":"11 S."}],"relHost":[{"physDesc":[{"extent":"Online-Ressource"}],"origin":[{"dateIssuedDisp":"2002-","dateIssuedKey":"2002","publisher":"ACS Publications","publisherPlace":"Washington, DC"}],"id":{"issn":["1535-3907"],"zdb":["2065254-9"],"eki":["340077174"]},"name":{"displayForm":["American Chemical Society"]},"pubHistory":["2002 -"],"part":{"year":"2025","issue":"2","pages":"929-939","text":"24(2025), 2, Seite 929-939","volume":"24","extent":"11"},"type":{"media":"Online-Ressource","bibl":"periodical"},"note":["Gesehen am 25.08.2020"],"disp":"PeptideForest: Semisupervised machine learning integrating multiple search engines for peptide identificationJournal of proteome research","recId":"340077174","language":["eng"],"title":[{"title_sort":"Journal of proteome research","title":"Journal of proteome research"}]}],"name":{"displayForm":["Tristan Ranff, Matthew Dennison, Jeroen Bédorf, Stefan Schulze, Nico Zinn, Marcus Bantscheff, Jasper J.R.M. van Heugten, and Christian Fufezan"]},"origin":[{"dateIssuedKey":"2025","dateIssuedDisp":"January 22, 2025"}],"id":{"eki":["193188448X"],"doi":["10.1021/acs.jproteome.4c00686"]},"note":["Gesehen am 28.07.2025"],"type":{"media":"Online-Ressource","bibl":"article-journal"},"language":["eng"],"recId":"193188448X","person":[{"family":"Ranff","given":"Tristan","display":"Ranff, Tristan","roleDisplay":"VerfasserIn","role":"aut"},{"family":"Dennison","given":"Matthew","display":"Dennison, Matthew","roleDisplay":"VerfasserIn","role":"aut"},{"given":"Jeroen","family":"Bédorf","role":"aut","roleDisplay":"VerfasserIn","display":"Bédorf, Jeroen"},{"family":"Schulze","given":"Stefan","roleDisplay":"VerfasserIn","display":"Schulze, Stefan","role":"aut"},{"family":"Zinn","given":"Nico","display":"Zinn, Nico","roleDisplay":"VerfasserIn","role":"aut"},{"given":"Marcus","family":"Bantscheff","role":"aut","roleDisplay":"VerfasserIn","display":"Bantscheff, Marcus"},{"roleDisplay":"VerfasserIn","display":"van Heugten, Jasper J. R. M.","role":"aut","family":"van Heugten","given":"Jasper J. R. M."},{"role":"aut","display":"Fufezan, Christian","roleDisplay":"VerfasserIn","given":"Christian","family":"Fufezan"}],"title":[{"title":"PeptideForest: Semisupervised machine learning integrating multiple search engines for peptide identification","title_sort":"PeptideForest: Semisupervised machine learning integrating multiple search engines for peptide identification"}]} | ||
| SRT | |a RANFFTRISTPEPTIDEFOR2220 | ||