An implementation of tensor product patch smoothers on GPUs
In this paper, a task-scheduling approach to efficiently calculating sparse symmetric matrix-vector products and designed to run on graphics processing units (GPUs) is presented. The main premise is that, for many sparse symmetric matrices occurring in common applications, it is possible to obtain s...
| Main authors: | Cui, Cu; Große-Bley, Paul; Kanschat, Guido; Strzodka, Robert |
|---|---|
| Document type: | Article (Journal) |
| Language: | English |
| Published: | 2025 |
| In: | SIAM journal on scientific computing, Year: 2025, Volume: 47, Issue: 2, Pages: B280-B307 |
| ISSN: | 1095-7197 |
| DOI: | 10.1137/24M1642706 |
| Online access: | Publisher, license required, full text: https://doi.org/10.1137/24M1642706; Publisher, license required, full text: https://epubs.siam.org/doi/10.1137/24M1642706 |
| Statement of responsibility: | Cu Cui, Paul Grosse-Bley, Guido Kanschat, and Robert Strzodka |
MARC
| LEADER | 00000naa a2200000 c 4500 | ||
|---|---|---|---|
| 001 | 1932956034 | ||
| 003 | DE-627 | ||
| 005 | 20250811150500.0 | ||
| 007 | cr uuu---uuuuu | ||
| 008 | 250811s2025 xx |||||o 00| ||eng c | ||
| 024 | 7 | |a 10.1137/24M1642706 |2 doi | |
| 035 | |a (DE-627)1932956034 | ||
| 035 | |a (DE-599)KXP1932956034 | ||
| 040 | |a DE-627 |b ger |c DE-627 |e rda | ||
| 041 | |a eng | ||
| 084 | |a 27 |2 sdnb | ||
| 100 | 1 | |a Cui, Cu |e VerfasserIn |0 (DE-588)1373554339 |0 (DE-627)1932956395 |4 aut | |
| 245 | 1 | 3 | |a An implementation of tensor product patch smoothers on GPUs |c Cu Cui, Paul Grosse-Bley, Guido Kanschat, and Robert Strzodka |
| 264 | 1 | |c 2025 | |
| 300 | |b Illustrationen | ||
| 300 | |a 28 | ||
| 336 | |a Text |b txt |2 rdacontent | ||
| 337 | |a Computermedien |b c |2 rdamedia | ||
| 338 | |a Online-Ressource |b cr |2 rdacarrier | ||
| 500 | |a Gesehen am 11.08.2025 | ||
| 520 | |a In this paper, a task-scheduling approach to efficiently calculating sparse symmetric matrix-vector products and designed to run on graphics processing units (GPUs) is presented. The main premise is that, for many sparse symmetric matrices occurring in common applications, it is possible to obtain significant reductions in memory usage and improvements in performance when the matrix is prepared in certain ways prior to computation. The preprocessing proposed in this paper employs task scheduling to overcome the difficulties that have suppressed the development of methods taking advantage of the symmetry of sparse matrices. The performance of the proposed task-scheduling method is verified using a Kepler (Tesla K40c) graphics accelerator, and is compared to the performance of cuSPARSE library functions on a GPU and to functions from the Intel MKL on central processing units (CPUs) executed in the parallel mode. The obtained results indicate that the proposed approach for sparse symmetric matrix-vector products results in up to a 40% reduction in memory usage, as compared to nonsymmetric matrix storage formats, while retaining good throughput. Compared to cuSPARSE and Intel MKL functions for sparse symmetric matrices, the proposed TSMV approach allowed us to achieve a significant speedup (of over one order of magnitude). | ||
| 700 | 1 | |a Große-Bley, Paul |e VerfasserIn |0 (DE-588)1373554649 |0 (DE-627)1932956832 |4 aut | |
| 700 | 1 | |a Kanschat, Guido |e VerfasserIn |0 (DE-588)102535334X |0 (DE-627)72215612X |0 (DE-576)175755949 |4 aut | |
| 700 | 1 | |a Strzodka, Robert |d 1973- |e VerfasserIn |0 (DE-588)122745264 |0 (DE-627)487567145 |0 (DE-576)293403473 |4 aut | |
| 773 | 0 | 8 | |i Enthalten in |a Society for Industrial and Applied Mathematics |t SIAM journal on scientific computing |d Philadelphia, Pa. : SIAM, 1993 |g 47(2025), 2, Seite B280-B307 |h Online-Ressource |w (DE-627)266885292 |w (DE-600)1468391-X |w (DE-576)078589967 |x 1095-7197 |7 nnas |
| 773 | 1 | 8 | |g volume:47 |g year:2025 |g number:2 |g pages:B280-B307 |g extent:28 |a An implementation of tensor product patch smoothers on GPUs |
| 856 | 4 | 0 | |u https://doi.org/10.1137/24M1642706 |x Verlag |x Resolving-System |z lizenzpflichtig |3 Volltext |
| 856 | 4 | 0 | |u https://epubs.siam.org/doi/10.1137/24M1642706 |x Verlag |z lizenzpflichtig |3 Volltext |
| 951 | |a AR | ||
| 992 | |a 20250811 | ||
| 993 | |a Article | ||
| 994 | |a 2025 | ||
| 998 | |g 122745264 |a Strzodka, Robert |m 122745264:Strzodka, Robert |d 700000 |d 720000 |e 700000PS122745264 |e 720000PS122745264 |k 0/700000/ |k 1/700000/720000/ |p 4 |y j | ||
| 998 | |g 102535334X |a Kanschat, Guido |m 102535334X:Kanschat, Guido |d 700000 |d 708000 |e 700000PK102535334X |e 708000PK102535334X |k 0/700000/ |k 1/700000/708000/ |p 3 | ||
| 998 | |g 1373554649 |a Große-Bley, Paul |m 1373554649:Große-Bley, Paul |d 700000 |d 720000 |e 700000PG1373554649 |e 720000PG1373554649 |k 0/700000/ |k 1/700000/720000/ |p 2 | ||
| 998 | |g 1373554339 |a Cui, Cu |m 1373554339:Cui, Cu |d 700000 |d 708000 |e 700000PC1373554339 |e 708000PC1373554339 |k 0/700000/ |k 1/700000/708000/ |p 1 |x j | ||
| 999 | |a KXP-PPN1932956034 |e 4756356362 | ||
| BIB | |a Y | ||
| SER | |a journal | ||
| JSO | |a {"physDesc":[{"extent":"28 S.","noteIll":"Illustrationen"}],"person":[{"given":"Cu","family":"Cui","role":"aut","display":"Cui, Cu"},{"family":"Große-Bley","given":"Paul","display":"Große-Bley, Paul","role":"aut"},{"role":"aut","display":"Kanschat, Guido","family":"Kanschat","given":"Guido"},{"display":"Strzodka, Robert","role":"aut","given":"Robert","family":"Strzodka"}],"title":[{"title_sort":"implementation of tensor product patch smoothers on GPUs","title":"An implementation of tensor product patch smoothers on GPUs"}],"relHost":[{"disp":"Society for Industrial and Applied MathematicsSIAM journal on scientific computing","language":["eng"],"type":{"media":"Online-Ressource","bibl":"periodical"},"titleAlt":[{"title":"Journal on scientific and statistical computing"},{"title":"Journal on scientific computing"}],"part":{"volume":"47","year":"2025","extent":"28","text":"47(2025), 2, Seite B280-B307","pages":"B280-B307","issue":"2"},"corporate":[{"role":"aut","display":"Society for Industrial and Applied Mathematics"}],"title":[{"title":"SIAM journal on scientific computing","title_sort":"SIAM journal on scientific computing"}],"pubHistory":["14.1993 -"],"id":{"zdb":["1468391-X"],"issn":["1095-7197"],"eki":["266885292"]},"recId":"266885292","name":{"displayForm":["Society for Industrial and Applied Mathematics"]},"physDesc":[{"extent":"Online-Ressource"}],"origin":[{"publisher":"SIAM","dateIssuedDisp":"1993-","dateIssuedKey":"1993","publisherPlace":"Philadelphia, Pa."}],"note":["Gesehen am 02.07.2021"]}],"origin":[{"dateIssuedKey":"2025","dateIssuedDisp":"2025"}],"note":["Gesehen am 11.08.2025"],"id":{"doi":["10.1137/24M1642706"],"eki":["1932956034"]},"language":["eng"],"name":{"displayForm":["Cu Cui, Paul Grosse-Bley, Guido Kanschat, and Robert Strzodka"]},"type":{"media":"Online-Ressource","bibl":"article-journal"},"recId":"1932956034"} | ||
| SRT | |a CUICUGROSSIMPLEMENTA2025 | ||