Poster - 3D Tixels: A highly efficient algorithm for GPU/CPU-acceleration of molecular dynamics on heterogeneous parallel architectures
KTH, School of Engineering Sciences (SCI), Theoretical Physics, Computational Biophysics. ORCID iD: 0000-0003-0603-5514
KTH, School of Engineering Sciences (SCI), Theoretical Physics, Computational Biophysics. ORCID iD: 0000-0002-7498-7763
KTH, School of Engineering Sciences (SCI), Theoretical Physics, Computational Biophysics. ORCID iD: 0000-0002-2734-2794
2011 (English). In: SC - Proc. High Perform. Comput. Networking, Storage Anal. Companion, Co-located SC, 2011, pp. 71-72. Conference paper, published paper (refereed)
Abstract [en]

Several GPU-based algorithms have been developed to accelerate biomolecular simulations, but although they provide benefits over single-core implementations, they have not been able to surpass the performance of state-of-the-art SIMD CPU implementations (e.g. GROMACS), not to mention scale efficiently. Here, we present a heterogeneous parallelization that utilizes both CPU and GPU resources efficiently. A novel fixed-particle-number sub-cell algorithm for non-bonded force calculation was developed. The algorithm uses the SIMD width as its algorithmic work unit; it is intrinsically future-proof since it can be adapted to future hardware. The CUDA non-bonded kernel implementation achieves up to 60% work-efficiency, 1.5 IPC, and 95% L1 cache utilization. On the CPU, OpenMP-parallelized, SSE-accelerated code runs concurrently with GPU execution. Fully automated dynamic inter-process as well as CPU-GPU load balancing is employed. We achieve a threefold speedup compared to equivalent GROMACS CPU code and show good strong and weak scaling. To the best of our knowledge, this is the fastest GPU molecular dynamics implementation presented to date.

Place, publisher, year, edition, pages
2011. pp. 71-72
Keywords [en]
GPGPU, GPU, Heterogeneous architectures, Molecular dynamics, Multi-level parallelization, Biomolecular Simulation, Cache utilization, Efficient algorithm, Force calculation, GPU-based algorithms, Multi-level, Parallelizations, State of the art, Algorithms, Application programming interfaces (API), Core levels, Parallel architectures, Program processors
National subject category
Biochemistry and Molecular Biology
Identifiers
URN: urn:nbn:se:kth:diva-149917
DOI: 10.1145/2148600.2148637
Scopus ID: 2-s2.0-84859097973
ISBN: 9781450310307 (print)
OAI: oai:DiVA.org:kth-149917
DiVA, id: diva2:742361
Conference
2011 High Performance Computing Networking, Storage and Analysis, SC'11, Co-located with SC'11, 12 November 2011 through 18 November 2011, Seattle, WA
Note

QC 20140901

Available from: 2014-09-01. Created: 2014-08-28. Last updated: 2014-09-01. Bibliographically reviewed

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text; Scopus

Authority records

Páll, Szilárd; Hess, Berk; Lindahl, Erik
