Automatic and transparent resource contention mitigation for improving large-scale parallel file system performance
| Main authors: | , |
|---|---|
| Document type: | Chapter/Article, Conference publication |
| Language: | English |
| Published: | 31 May 2018 |
| In: | 2017 IEEE 23rd International Conference on Parallel and Distributed Systems (ICPADS 2017), Year: 2018, Pages: 604-613 |
| DOI: | 10.1109/ICPADS.2017.00084 |
| Online access: | Publisher, full text: http://dx.doi.org/10.1109/ICPADS.2017.00084; Publisher, full text: https://ieeexplore.ieee.org/document/8368413/ |
| Statement of responsibility: | Sarah Neuwirth, Feiyi Wang, Sarp Oral and Ulrich Bruening |
| Abstract: | Proportional to the scale increases in HPC systems, many scientific applications are becoming increasingly data intensive, and parallel I/O has become one of the dominant factors impacting the large-scale HPC application performance. On a typical large-scale HPC system, we have observed that the lack of a global workload coordination coupled with the shared nature of storage systems cause load imbalance and resource contention over the end-to-end I/O paths resulting in severe performance degradation. I/O load imbalance on HPC systems is generally a self-inflicted wound and mostly occurs between the I/O paths and resources consumed by each individual job. |
|---|---|
| Description: | Online Resource |
| ISBN: | 9781538621295 |