Multilevel interior penalty methods on GPUs
We present a matrix-free multigrid method for high-order Discontinuous Galerkin (DG) finite element methods with GPU acceleration. A performance analysis is conducted, comparing various data and compute layouts. Smoother implementations are optimized through localization and fast diagonalization tec...
| Main authors: | Cu Cui, Guido Kanschat |
|---|---|
| Document type: | Article (Journal) |
| Language: | English |
| Published: | September 2025 |
| In: | ACM transactions on mathematical software, Year: 2025, Volume: 51, Issue: 3, Pages: 1-27 |
| ISSN: | 1557-7295 |
| DOI: | 10.1145/3765616 |
| Online access: | Publisher, free of charge, full text: https://doi.org/10.1145/3765616; Publisher, free of charge, full text: https://dl.acm.org/doi/10.1145/3765616 |
| Statement of responsibility: | Cu Cui, Guido Kanschat |
MARC
| LEADER | 00000caa a2200000 c 4500 | ||
|---|---|---|---|
| 001 | 1950218376 | ||
| 003 | DE-627 | ||
| 005 | 20260127105827.0 | ||
| 007 | cr uuu---uuuuu | ||
| 008 | 260126s2025 xx |||||o 00| ||eng c | ||
| 024 | 7 | |a 10.1145/3765616 |2 doi | |
| 035 | |a (DE-627)1950218376 | ||
| 035 | |a (DE-599)KXP1950218376 | ||
| 040 | |a DE-627 |b ger |c DE-627 |e rda | ||
| 041 | |a eng | ||
| 084 | |a 28 |2 sdnb | ||
| 100 | 1 | |a Cui, Cu |e VerfasserIn |0 (DE-588)1373554339 |0 (DE-627)1932956395 |4 aut | |
| 245 | 1 | 0 | |a Multilevel interior penalty methods on GPUs |c Cu Cui, Guido Kanschat |
| 264 | 1 | |c September 2025 | |
| 300 | |b Diagramme | ||
| 300 | |a 27 | ||
| 336 | |a Text |b txt |2 rdacontent | ||
| 337 | |a Computermedien |b c |2 rdamedia | ||
| 338 | |a Online-Ressource |b cr |2 rdacarrier | ||
| 500 | |a Viewed on 26.01.2026 | ||
| 520 | |a We present a matrix-free multigrid method for high-order Discontinuous Galerkin (DG) finite element methods with GPU acceleration. A performance analysis is conducted, comparing various data and compute layouts. Smoother implementations are optimized through localization and fast diagonalization techniques. Leveraging conflict-free access patterns in shared memory, arithmetic throughput of up to 40% of the peak performance on NVIDIA A100 GPUs is achieved. Experimental results affirm the effectiveness of mixed-precision approaches and Message Passing Interface (MPI) parallelization in accelerating algorithms. Furthermore, an assessment of solver efficiency and robustness is provided in both two and three dimensions, with applications to Poisson problems. | ||
| 700 | 1 | |a Kanschat, Guido |e VerfasserIn |0 (DE-588)102535334X |0 (DE-627)72215612X |0 (DE-576)175755949 |4 aut | |
| 773 | 0 | 8 | |i Enthalten in |a Association for Computing Machinery |t ACM transactions on mathematical software |d New York, NY : ACM, 1975 |g 51(2025), 3 vom: Sept., Artikel-ID 19, Seite 1-27 |h Online-Ressource |w (DE-627)320454134 |w (DE-600)2006421-4 |w (DE-576)09088986X |x 1557-7295 |7 nnas |
| 773 | 1 | 8 | |g volume:51 |g year:2025 |g number:3 |g month:09 |g elocationid:19 |g pages:1-27 |g extent:27 |a Multilevel interior penalty methods on GPUs |
| 856 | 4 | 0 | |u https://doi.org/10.1145/3765616 |x Verlag |x Resolving-System |z kostenfrei |3 Volltext |7 0 |
| 856 | 4 | 0 | |u https://dl.acm.org/doi/10.1145/3765616 |x Verlag |z kostenfrei |3 Volltext |7 0 |
| 951 | |a AR | ||
| 992 | |a 20260126 | ||
| 993 | |a Article | ||
| 994 | |a 2025 | ||
| 998 | |g 102535334X |a Kanschat, Guido |m 102535334X:Kanschat, Guido |d 700000 |d 708000 |e 700000PK102535334X |e 708000PK102535334X |k 0/700000/ |k 1/700000/708000/ |p 2 |y j | ||
| 998 | |g 1373554339 |a Cui, Cu |m 1373554339:Cui, Cu |d 700000 |d 708000 |e 700000PC1373554339 |e 708000PC1373554339 |k 0/700000/ |k 1/700000/708000/ |p 1 |x j | ||
| 999 | |a KXP-PPN1950218376 |e 4860819632 | ||
| BIB | |a Y | ||
| SER | |a journal | ||
| JSO | |a {"relHost":[{"note":["Gesehen am 16.06.20"],"origin":[{"dateIssuedKey":"1975","publisherPlace":"New York, NY","dateIssuedDisp":"1975-","publisher":"ACM"}],"physDesc":[{"extent":"Online-Ressource"}],"recId":"320454134","pubHistory":["1.1975 -"],"id":{"zdb":["2006421-4"],"eki":["320454134"],"issn":["1557-7295"]},"title":[{"title_sort":"ACM transactions on mathematical software","title":"ACM transactions on mathematical software","subtitle":"a publication of the Association for Computing Machinery"}],"part":{"year":"2025","volume":"51","extent":"27","text":"51(2025), 3 vom: Sept., Artikel-ID 19, Seite 1-27","pages":"1-27","issue":"3"},"corporate":[{"display":"Association for Computing Machinery","role":"aut"}],"disp":"Association for Computing MachineryACM transactions on mathematical software","type":{"bibl":"periodical","media":"Online-Ressource"},"language":["eng"],"titleAlt":[{"title":"TOMS"},{"title":"ACM TOMS"},{"title":"Transactions on mathematical software"}]}],"person":[{"family":"Cui","given":"Cu","display":"Cui, Cu","role":"aut"},{"family":"Kanschat","given":"Guido","role":"aut","display":"Kanschat, Guido"}],"title":[{"title":"Multilevel interior penalty methods on GPUs","title_sort":"Multilevel interior penalty methods on GPUs"}],"physDesc":[{"noteIll":"Diagramme","extent":"27 S."}],"note":["Gesehen am 26.01.2026"],"origin":[{"dateIssuedDisp":"September 2025","dateIssuedKey":"2025"}],"id":{"eki":["1950218376"],"doi":["10.1145/3765616"]},"recId":"1950218376","name":{"displayForm":["Cu Cui, Guido Kanschat"]},"type":{"media":"Online-Ressource","bibl":"article-journal"},"language":["eng"]} | ||
| SRT | |a CUICUKANSCMULTILEVEL2025 | ||