Poster - 3D Tixels: A highly efficient algorithm for GPU/CPU-acceleration of molecular dynamics on heterogeneous parallel architectures
2011 (English). In: SC - Proc. High Perform. Comput. Networking, Storage Anal. Companion, Co-located SC, 2011, 71-72 p. Conference paper (Refereed)
Several GPU-based algorithms have been developed to accelerate biomolecular simulations, but although they provide benefits over single-core implementations, they have not been able to surpass the performance of state-of-the-art SIMD CPU implementations (e.g. GROMACS), let alone scale efficiently. Here, we present a heterogeneous parallelization that utilizes both CPU and GPU resources efficiently. A novel fixed-particle-number sub-cell algorithm for non-bonded force calculation was developed. The algorithm uses the SIMD width as its algorithmic work unit, which makes it intrinsically future-proof, since it can be adapted to future hardware. The CUDA non-bonded kernel implementation achieves up to 60% work-efficiency, 1.5 IPC, and 95% L1 cache utilization. On the CPU, OpenMP-parallelized, SSE-accelerated code runs concurrently with GPU execution. Fully automated dynamic inter-process as well as CPU-GPU load balancing is employed. We achieve a threefold speedup compared to equivalent GROMACS CPU code and show good strong and weak scaling. To the best of our knowledge, this is the fastest GPU molecular dynamics implementation presented to date.
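The abstract does not describe the sub-cell algorithm in detail, so the following is only a minimal Python sketch of the general idea behind a fixed-particle-number cluster pair scheme, not the authors' CUDA or SSE implementation. Everything named here is an assumption made for illustration: `CLUSTER_SIZE` stands in for a SIMD width, the coarse grid sort and bounding-box test are one simple way to build clusters and a cluster pair list, the Lennard-Jones parameters are set to 1, and there are no periodic boundaries. "Work efficiency" is taken here to mean the fraction of evaluated particle pairs that fall within the cutoff, which may differ from the definition used in the poster.

```python
import random

CLUSTER_SIZE = 4  # hypothetical work-unit size, standing in for a SIMD width


def lattice_positions(n_side, spacing, jitter, rng):
    """Jittered cubic lattice, used only to get well-separated test particles."""
    return [tuple(spacing * k + rng.uniform(-jitter, jitter) for k in (x, y, z))
            for x in range(n_side) for y in range(n_side) for z in range(n_side)]


def make_clusters(positions, size=CLUSTER_SIZE, cell=2.0):
    """Sort particles by coarse grid cell, then cut into fixed-size index groups."""
    order = sorted(range(len(positions)),
                   key=lambda i: tuple(int(c // cell) for c in positions[i]))
    return [order[k:k + size] for k in range(0, len(order), size)]


def bounding_box(positions, cluster):
    lo = tuple(min(positions[i][d] for i in cluster) for d in range(3))
    hi = tuple(max(positions[i][d] for i in cluster) for d in range(3))
    return lo, hi


def box_distance_sq(a, b):
    """Squared minimum distance between two axis-aligned boxes."""
    (alo, ahi), (blo, bhi) = a, b
    return sum(max(blo[d] - ahi[d], alo[d] - bhi[d], 0.0) ** 2 for d in range(3))


def cluster_pair_list(positions, clusters, cutoff):
    """Keep cluster pairs whose bounding boxes are within the cutoff."""
    boxes = [bounding_box(positions, c) for c in clusters]
    return [(a, b) for a in range(len(clusters)) for b in range(a, len(clusters))
            if box_distance_sq(boxes[a], boxes[b]) <= cutoff * cutoff]


def lj_force(ri, rj):
    """Lennard-Jones force on i due to j (epsilon = sigma = 1), plus r squared."""
    dx = [ri[d] - rj[d] for d in range(3)]
    r2 = sum(c * c for c in dx)
    inv_r6 = 1.0 / r2 ** 3
    scale = 24.0 * inv_r6 * (2.0 * inv_r6 - 1.0) / r2
    return [scale * c for c in dx], r2


def cluster_forces(positions, clusters, pairs, cutoff):
    """Evaluate all particle pairs of each cluster pair, masking by the cutoff."""
    forces = [[0.0] * 3 for _ in positions]
    evaluated = useful = 0
    for a, b in pairs:
        for ii, i in enumerate(clusters[a]):
            for jj, j in enumerate(clusters[b]):
                if a == b and jj <= ii:
                    continue  # skip self-interaction and double counting
                evaluated += 1
                f, r2 = lj_force(positions[i], positions[j])
                if r2 > cutoff * cutoff:
                    continue  # computed but discarded, as a SIMD mask would
                useful += 1
                for d in range(3):
                    forces[i][d] += f[d]
                    forces[j][d] -= f[d]
    return forces, useful / evaluated


def brute_forces(positions, cutoff):
    """Reference O(N^2) loop over all particle pairs."""
    forces = [[0.0] * 3 for _ in positions]
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            f, r2 = lj_force(positions[i], positions[j])
            if r2 <= cutoff * cutoff:
                for d in range(3):
                    forces[i][d] += f[d]
                    forces[j][d] -= f[d]
    return forces


pos = lattice_positions(4, 1.1, 0.1, random.Random(0))
clusters = make_clusters(pos)
pairs = cluster_pair_list(pos, clusters, cutoff=2.5)
forces, efficiency = cluster_forces(pos, clusters, pairs, cutoff=2.5)
print(f"{len(pairs)} cluster pairs, work efficiency {efficiency:.2f}")
```

Because every cluster holds the same number of particles, each cluster pair costs the same amount of work, which is what makes such a layout map naturally onto fixed-width SIMD or GPU execution units. The price is that some evaluated pairs lie outside the cutoff and are discarded, which the efficiency figure in this sketch measures.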
Place, publisher, year, edition, pages
2011. 71-72 p.
GPGPU, GPU, Heterogeneous architectures, Molecular dynamics, Multi-level parallelization, Biomolecular Simulation, Cache utilization, Efficient algorithm, Force calculation, GPU-based algorithms, Multi-level, Parallelizations, State of the art, Algorithms, Application programming interfaces (API), Core levels, Parallel architectures, Program processors
Biochemistry and Molecular Biology
Identifiers: URN: urn:nbn:se:kth:diva-149917; DOI: 10.1145/2148600.2148637; ScopusID: 2-s2.0-84859097973; ISBN: 9781450310307; OAI: oai:DiVA.org:kth-149917; DiVA: diva2:742361
2011 High Performance Computing Networking, Storage and Analysis, SC'11, Co-located with SC'11, 12 November 2011 through 18 November 2011, Seattle, WA
QC 20140901. 2014-09-01, 2014-08-28, 2014-09-01. Bibliographically approved