A flexible algorithm for calculating pair interactions on SIMD architectures
KTH, School of Engineering Sciences (SCI), Theoretical Physics, Theoretical & Computational Biophysics. KTH, Centres, SeRC - Swedish e-Science Research Centre. KTH, Centres, Science for Life Laboratory, SciLifeLab. ORCID iD: 0000-0003-0603-5514
KTH, School of Engineering Sciences (SCI), Theoretical Physics, Theoretical & Computational Biophysics. KTH, Centres, SeRC - Swedish e-Science Research Centre. KTH, Centres, Science for Life Laboratory, SciLifeLab. ORCID iD: 0000-0002-7498-7763
2013 (English) In: Computer Physics Communications, ISSN 0010-4655, E-ISSN 1879-2944, Vol. 184, no 12, 2641-2650 p. Article in journal (Refereed) Published
Abstract [en]

Calculating interactions or correlations between pairs of particles is typically the most time-consuming task in particle simulation or correlation analysis. Straightforward implementations using a double loop over particle pairs have traditionally worked well, especially since compilers usually do a good job of unrolling the inner loop. In order to reach high performance on modern CPU and accelerator architectures, single-instruction multiple-data (SIMD) parallelization has become essential. Avoiding memory bottlenecks is also increasingly important and requires reducing the ratio of memory to arithmetic operations. Moreover, when pairs only interact within a certain cut-off distance, good SIMD utilization can only be achieved by reordering input and output data, which quickly becomes a limiting factor. Here we present an algorithm for SIMD parallelization based on grouping a fixed number of particles, e.g. 2, 4, or 8, into spatial clusters. Calculating all interactions between particles in a pair of such clusters improves data reuse compared to the traditional scheme and results in a more efficient SIMD parallelization. Adjusting the cluster size allows the algorithm to map to SIMD units of various widths. This flexibility not only enables fast and efficient implementation on current CPUs and accelerator architectures like GPUs or Intel MIC, but it also makes the algorithm future-proof. We present the algorithm with an application to molecular dynamics simulations, where we can also make use of the effective buffering the method introduces.
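
The cluster-pair idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a plain Python toy that only counts pairs within the cut-off, and every name in it (`build_clusters`, `cluster_pair_list`, `count_pairs_clustered`, `CLUSTER_SIZE`) is hypothetical. Sorting particles along x stands in for real spatial gridding, and the inner double loop over two clusters is the part a SIMD implementation would presumably execute as one vectorized block with a cut-off mask.

```python
import itertools
import random

CLUSTER_SIZE = 4  # assumed value; the abstract mentions 2, 4, or 8


def dist2(a, b):
    """Squared Euclidean distance between two 3D points."""
    return sum((a[d] - b[d]) ** 2 for d in range(3))


def build_clusters(positions, size=CLUSTER_SIZE):
    """Group particle indices into fixed-size clusters.

    Sorting along x is a crude stand-in for spatial gridding (assumption).
    The last cluster is padded with -1, which marks a dummy particle.
    """
    order = sorted(range(len(positions)), key=lambda i: positions[i])
    order += [-1] * (-len(order) % size)
    return [order[k:k + size] for k in range(0, len(order), size)]


def bounding_box(cluster, positions):
    """Axis-aligned bounding box of the real particles in a cluster."""
    pts = [positions[i] for i in cluster if i >= 0]
    lo = tuple(min(p[d] for p in pts) for d in range(3))
    hi = tuple(max(p[d] for p in pts) for d in range(3))
    return lo, hi


def box_distance2(a, b):
    """Squared distance between two boxes (0 if they overlap)."""
    (alo, ahi), (blo, bhi) = a, b
    total = 0.0
    for d in range(3):
        gap = max(alo[d] - bhi[d], blo[d] - ahi[d], 0.0)
        total += gap * gap
    return total


def cluster_pair_list(clusters, positions, cutoff):
    """List cluster pairs whose bounding boxes lie within the cut-off."""
    boxes = [bounding_box(c, positions) for c in clusters]
    rc2 = cutoff * cutoff
    return [(ci, cj)
            for ci in range(len(clusters))
            for cj in range(ci, len(clusters))
            if box_distance2(boxes[ci], boxes[cj]) <= rc2]


def count_pairs_clustered(positions, cutoff, size=CLUSTER_SIZE):
    """Count particle pairs within the cut-off via cluster pairs."""
    clusters = build_clusters(positions, size)
    rc2 = cutoff * cutoff
    count = 0
    for ci, cj in cluster_pair_list(clusters, positions, cutoff):
        # All size x size particle pairs of the two clusters are visited;
        # dummies and out-of-range pairs are skipped (masked out).
        for i in clusters[ci]:
            for j in clusters[cj]:
                if i < 0 or j < 0:
                    continue  # padding particle
                if ci == cj and i >= j:
                    continue  # avoid self and double counting
                if dist2(positions[i], positions[j]) <= rc2:
                    count += 1
    return count


def count_pairs_brute(positions, cutoff):
    """Reference: plain double loop over all particle pairs."""
    rc2 = cutoff * cutoff
    return sum(1 for i, j in itertools.combinations(range(len(positions)), 2)
               if dist2(positions[i], positions[j]) <= rc2)


if __name__ == "__main__":
    random.seed(1)
    pts = [tuple(random.uniform(0.0, 3.0) for _ in range(3)) for _ in range(40)]
    print(count_pairs_clustered(pts, 1.0), count_pairs_brute(pts, 1.0))
```

Both counts should agree, because two particles within the cut-off always belong to clusters whose bounding boxes are within the cut-off. The sketch does not model the buffering mentioned in the abstract or any actual SIMD instructions.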

Place, publisher, year, edition, pages
2013. Vol. 184, no 12, 2641-2650 p.
Keyword [en]
GPU, Molecular dynamics, Pair interactions, SIMD, Verlet list, Accelerator architectures, Efficient implementation, Molecular dynamics simulations, Single-instruction multiple-data, Program processors, Clustering algorithms
National Category
Other Computer and Information Science
Identifiers
URN: urn:nbn:se:kth:diva-139874
DOI: 10.1016/j.cpc.2013.06.003
ISI: 000328725200003
Scopus ID: 2-s2.0-84888294449
OAI: oai:DiVA.org:kth-139874
DiVA: diva2:688795
Note

QC 20140107

Available from: 2014-01-17 Created: 2014-01-15 Last updated: 2017-12-06 Bibliographically approved


Authority records BETA

Páll, Szilard; Hess, Berk
