  • 51. Lapenta, Giovanni
    et al.
    Goldman, Martin
    Newman, David
    Markidis, Stefano
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Divin, Andrey
    Electromagnetic energy conversion in downstream fronts from three dimensional kinetic reconnection. 2014. In: Physics of Plasmas, ISSN 1070-664X, E-ISSN 1089-7674, Vol. 21, no. 5, p. 055702. Journal article (Refereed)
    Abstract [en]

    The electromagnetic energy equation is analyzed term by term in a 3D simulation of kinetic reconnection previously reported by Vapirev et al. [J. Geophys. Res.: Space Phys. 118, 1435 (2013)]. The evolution presents the usual 2D-like topological structures caused by an initial perturbation independent of the third dimension. However, downstream of the reconnection site, where the jetting plasma encounters the yet unperturbed pre-existing plasma, a downstream front is formed and made unstable by the strong density gradient and the unfavorable local acceleration field. The energy exchange between plasma and fields is most intense at the instability, reaching several pW/m^3, alternating between load (energy going from fields to particles) and generator (energy going from particles to fields) regions. Energy exchange is instead purely that of a load at the reconnection site itself in a region focused around the x-line and elongated along the separatrix surfaces. Poynting fluxes are generated at all energy exchange regions and travel away from the reconnection site transporting an energy signal of the order of about S ≈ 10^-3 W/m^2. © 2014 AIP Publishing LLC.
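
    For orientation, the electromagnetic energy equation analyzed in such studies is usually Poynting's theorem. The form below is the standard SI statement and is given here as an assumption; the paper's own notation and normalization may differ.

```latex
\frac{\partial}{\partial t}\left(\frac{\varepsilon_0 |\mathbf{E}|^2}{2} + \frac{|\mathbf{B}|^2}{2\mu_0}\right)
+ \nabla \cdot \mathbf{S} = -\,\mathbf{J}\cdot\mathbf{E},
\qquad
\mathbf{S} = \frac{\mathbf{E}\times\mathbf{B}}{\mu_0}
```

    In this form a "load" region has J·E > 0 (fields give energy to particles) and a "generator" region has J·E < 0 (particles give energy to fields), which matches the definitions in the abstract.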

  • 52. Lapenta, Giovanni
    et al.
    Goldman, Martin V.
    Newman, David L.
    Markidis, Stefano
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Energy exchanges in reconnection outflows. 2017. In: Plasma Physics and Controlled Fusion, ISSN 0741-3335, E-ISSN 1361-6587, Vol. 59, no. 1, article ID 014019. Journal article (Refereed)
    Abstract [en]

    Reconnection outflows are highly energetic directed flows that interact with the ambient plasma or with flows from other reconnection regions. Under these conditions the flow becomes highly unstable and chaotic, as does any flow jet interacting with a medium. We report here massively parallel simulations of two cases: the interaction between outflow jets, and the interaction of a single outflow with an ambient plasma. We find in both cases the development of a chaotic magnetic field, subject to secondary reconnection events that further complicate the topology of the field lines. The focus of the present analysis is on the energy balance. We compute each energy channel (electromagnetic, bulk, thermal, for each species) and find where the most energy is exchanged and in what form. The main finding is that the largest energy exchange is not at the reconnection site proper but in the regions where the outflowing jets are destabilized.

  • 53. Lapenta, Giovanni
    et al.
    Markidis, Stefano
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Divin, Andrey
    Newman, David
    Goldman, Martin
    Separatrices: The crux of reconnection. 2015. In: Journal of Plasma Physics, ISSN 0022-3778, E-ISSN 1469-7807, Vol. 81, no. 1, article ID 325810109. Journal article (Refereed)
    Abstract [en]

    Magnetic reconnection is one of the key processes in astrophysical and laboratory plasmas: it is the opposite of a dynamo. Looking at energy, a dynamo transforms kinetic energy into magnetic energy while reconnection takes magnetic energy and returns it to its kinetic form. Most plasma processes at their core involve first storing magnetic energy accumulated over time and then releasing it suddenly. We focus here on this release. A key concept in analysing reconnection is that of the separatrix, a surface (line in 2D) that separates the fresh unperturbed plasma embedded in magnetic field lines not yet reconnected from the hotter exhaust embedded in reconnected field lines. In kinetic physics, the separatrices become a layer where many key processes develop. We present here new results on the processes at the separatrices that regulate the plasma flow, the energization of the species, the electromagnetic fields and the instabilities developing at the separatrices.

  • 54. Lapenta, Giovanni
    et al.
    Markidis, Stefano
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Goldman, Martin V.
    Newman, David L.
    Secondary reconnection sites in reconnection-generated flux ropes and reconnection fronts. 2015. In: Nature Physics, ISSN 1745-2473, E-ISSN 1745-2481, Vol. 11, no. 8, pp. 690-+. Journal article (Refereed)
    Abstract [en]

    The primary target of the Magnetospheric MultiScale (MMS) mission is the electron-scale diffusion layer around reconnection sites. Here we study where these regions are found in full three-dimensional simulations. In two dimensions the sites of electron diffusion, defined as the regions where magnetic topology changes and electrons move with respect to the magnetic field lines, are located near the reconnection site. But in three dimensions we find that the reconnection exhaust far from the primary reconnection site also becomes host to secondary reconnection sites. Four diagnostics are used to demonstrate the point: the direct observation of topology impossible without secondary reconnection, the direct measurement of topological field line breakage, the measurement of electron jets emerging from secondary reconnection regions, and the violation of the frozen-in condition. We conclude that secondary reconnection occurs in a large part of the exhaust, providing many more chances for MMS to find itself in the right region to hit its target.

  • 55. Lapenta, Giovanni
    et al.
    Markidis, Stefano
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Poedts, Stefaan
    Vucinic, Dean
    Space Weather Prediction and Exascale Computing. 2013. In: Computing in science & engineering (Print), ISSN 1521-9615, E-ISSN 1558-366X, Vol. 15, no. 5, pp. 68-76. Journal article (Refereed)
    Abstract [en]

    Space weather can have a great effect on Earth's climate. Predicting the impact of space environment disturbances on Earth presents a challenge to scientists. Here, the ExaScience Lab's efforts are presented, which use exascale computing and new visualization tools to predict the arrival and impact of space events on Earth.

  • 56. Lapenta, Giovanni
    et al.
    Pierrard, Viviane
    Keppens, Rony
    Markidis, Stefano
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz). KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Parallelldatorcentrum, PDC.
    Poedts, Stefaan
    Sebek, Ondrej
    Travnicek, Pavel M.
    Henri, Pierre
    Califano, Francesco
    Pegoraro, Francesco
    Faganello, Matteo
    Olshevsky, Vyacheslav
    Restante, Anna Lisa
    Nordlund, Åke
    Frederiksen, Jacob Trier
    Mackay, Duncan H.
    Parnell, Clare E.
    Bemporad, Alessandro
    Susino, Roberto
    Borremans, Kris
    SWIFF: Space weather integrated forecasting framework. 2013. In: Journal of Space Weather and Space Climate, ISSN 2115-7251, E-ISSN 2115-7251, Vol. 3, p. A05. Journal article (Refereed)
    Abstract [en]

    SWIFF is a project funded by the Seventh Framework Programme of the European Commission to study the mathematical-physics models that form the basis for space weather forecasting. The phenomena of space weather span a tremendous range of densities and temperatures, with scales ranging over 10 orders of magnitude in space and time. Additionally, even in local regions there are concurrent processes developing at the electron, ion and global scales that interact strongly with each other. The fundamental challenge in modelling space weather is the need to address multiple physics and multiple scales. Here we present our approach of taking existing expertise in fluid and kinetic models to produce an integrated mathematical approach and software infrastructure that allows fluid and kinetic processes to be modelled together. SWIFF also aims to use this new infrastructure to model specific coupled processes at the Solar Corona, in interplanetary space and in the interaction at the Earth's magnetosphere.

  • 57.
    Ma, Yingjuan
    et al.
    Univ Calif Los Angeles, Dept Earth Planetary & Space Sci, Los Angeles, CA 90095 USA..
    Russell, Christopher T.
    Univ Calif Los Angeles, Dept Earth Planetary & Space Sci, Los Angeles, CA 90095 USA..
    Toth, Gabor
    Univ Michigan, Dept Climate & Space Sci & Engn, Ann Arbor, MI 48109 USA..
    Chen, Yuxi
    Univ Michigan, Dept Climate & Space Sci & Engn, Ann Arbor, MI 48109 USA..
    Nagy, Andrew F.
    Univ Michigan, Dept Climate & Space Sci & Engn, Ann Arbor, MI 48109 USA..
    Harada, Yuki
    Univ Iowa, Dept Phys & Astron, Iowa City, IA 52242 USA..
    McFadden, James
    Univ Calif Berkeley, Space Sci Lab, Berkeley, CA 94720 USA..
    Halekas, Jasper S.
    Univ Iowa, Dept Phys & Astron, Iowa City, IA 52242 USA..
    Lillis, Rob
    Univ Calif Berkeley, Space Sci Lab, Berkeley, CA 94720 USA..
    Connerney, John E. P.
    NASA, Goddard Space Flight Ctr, Greenbelt, MD USA..
    Espley, Jared
    NASA, Goddard Space Flight Ctr, Greenbelt, MD USA..
    DiBraccio, Gina A.
    NASA, Goddard Space Flight Ctr, Greenbelt, MD USA..
    Markidis, Stefano
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST). KTH, Skolan för elektroteknik och datavetenskap (EECS), Centra, Parallelldatorcentrum, PDC.
    Peng, Ivy Bo
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Centra, Parallelldatorcentrum, PDC.
    Fang, Xiaohua
    Univ Colorado, Lab Atmospher & Space Phys, Boulder, CO 80309 USA..
    Jakosky, Bruce M.
    Univ Colorado, Lab Atmospher & Space Phys, Boulder, CO 80309 USA..
    Reconnection in the Martian Magnetotail: Hall-MHD With Embedded Particle-in-Cell Simulations. 2018. In: Journal of Geophysical Research - Space Physics, ISSN 2169-9380, E-ISSN 2169-9402, Vol. 123, no. 5, pp. 3742-3763. Journal article (Refereed)
    Abstract [en]

    Mars Atmosphere and Volatile EvolutioN (MAVEN) mission observations show clear evidence of the occurrence of the magnetic reconnection process in the Martian plasma tail. In this study, we use sophisticated numerical models to help us understand the effects of magnetic reconnection in the plasma tail. The numerical models used in this study are (a) a multispecies global Hall-magnetohydrodynamic (HMHD) model and (b) a global HMHD model two-way coupled to an embedded fully kinetic particle-in-cell code. Comparison with MAVEN observations clearly shows that the general interaction pattern is well reproduced by the global HMHD model. The coupled model takes advantage of both the efficiency of the MHD model and the ability to incorporate kinetic processes of the particle-in-cell model, making it feasible to conduct kinetic simulations for Mars under realistic solar wind conditions for the first time. Results from the coupled model show that the Martian magnetotail is highly dynamic due to magnetic reconnection, and the resulting Mars-ward plasma flow velocities are significantly higher for the lighter ion fluid, which are quantitatively consistent with MAVEN observations. The HMHD with Embedded Particle-in-Cell model predicts that the ion loss rates are more variable but with similar mean values as compared with HMHD model results.

  • 58. Manzini, G.
    et al.
    Delzanno, G. L.
    Vencels, J.
    Markidis, Stefano
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    A Legendre-Fourier spectral method with exact conservation laws for the Vlasov-Poisson system. 2016. In: Journal of Computational Physics, ISSN 0021-9991, E-ISSN 1090-2716, Vol. 317, pp. 82-107. Journal article (Refereed)
    Abstract [en]

    We present the design and implementation of an L2-stable spectral method for the discretization of the Vlasov-Poisson model of a collisionless plasma in one space and velocity dimension. The velocity and space dependence of the Vlasov equation are resolved through a truncated spectral expansion based on Legendre and Fourier basis functions, respectively. The Poisson equation, which is coupled to the Vlasov equation, is also resolved through a Fourier expansion. The resulting system of ordinary differential equations is discretized by the implicit second-order accurate Crank-Nicolson time discretization. The non-linear dependence between the Vlasov and Poisson equations is iteratively solved at any time cycle by a Jacobian-Free Newton-Krylov method. In this work we analyze the structure of the main conservation laws of the resulting Legendre-Fourier model, e.g., mass, momentum, and energy, and prove that they are exactly satisfied in the semi-discrete and discrete setting. The L2-stability of the method is ensured by discretizing the boundary conditions of the distribution function at the boundaries of the velocity domain by a suitable penalty term. The impact of the penalty term on the conservation properties is investigated theoretically and numerically. An implementation of the penalty term that does not affect the conservation of mass, momentum and energy is also proposed and studied. A collisional term is introduced in the discrete model to control the filamentation effect, but does not affect the conservation properties of the system. Numerical results on a set of standard test problems illustrate the performance of the method.
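
    As a rough illustration of why a Crank-Nicolson discretization pairs well with exact conservation laws, the sketch below applies it to a toy linear system du/dt = A u with a skew-symmetric A, for which the quadratic invariant u·u is preserved up to round-off. The 2x2 matrix, the step size and the function names are illustrative assumptions; this is not the Legendre-Fourier Vlasov-Poisson scheme of the paper.

```python
import numpy as np


def crank_nicolson_step(u, A, dt):
    """Advance du/dt = A u by one step: (I - dt/2 A) u_new = (I + dt/2 A) u."""
    identity = np.eye(len(u))
    rhs = (identity + 0.5 * dt * A) @ u
    return np.linalg.solve(identity - 0.5 * dt * A, rhs)


def integrate(u0, A, dt, steps):
    """Apply `steps` Crank-Nicolson steps starting from u0."""
    u = np.asarray(u0, dtype=float)
    for _ in range(steps):
        u = crank_nicolson_step(u, A, dt)
    return u


# Toy skew-symmetric operator (assumption for illustration only).
A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])
u0 = np.array([1.0, 0.0])

u = integrate(u0, A, dt=0.1, steps=1000)

# The quadratic invariant u.u plays the role of a conserved "energy" here.
energy_drift = abs(u @ u - u0 @ u0)
print(f"drift in u.u after 1000 steps: {energy_drift:.2e}")
```

    For a skew-symmetric A the update matrix is a Cayley transform and therefore orthogonal, so the drift stays at round-off level regardless of the step size, although the phase of the solution is only second-order accurate.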

  • 59. Marchand, R.
    et al.
    Miyake, Y.
    Usui, H.
    Deca, J.
    Lapenta, G.
    Mateo-Velez, J. C.
    Ergun, R. E.
    Sturner, A.
    Genot, V.
    Hilgers, A.
    Markidis, Stefano
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Cross-comparison of spacecraft-environment interaction model predictions applied to Solar Probe Plus near perihelion. 2014. In: Physics of Plasmas, ISSN 1070-664X, E-ISSN 1089-7674, Vol. 21, no. 6, p. 062901. Journal article (Refereed)
    Abstract [en]

    Five spacecraft-plasma models are used to simulate the interaction of a simplified geometry Solar Probe Plus (SPP) satellite with the space environment under representative solar wind conditions near perihelion. By considering similarities and differences between results obtained with different numerical approaches under well defined conditions, the consistency and validity of our models can be assessed. The impact on model predictions of physical effects of importance in the SPP mission is also considered by comparing results obtained with and without these effects. Simulation results are presented and compared with increasing levels of complexity in the physics of interaction between solar environment and the SPP spacecraft. The comparisons focus particularly on spacecraft floating potentials, contributions to the currents collected and emitted by the spacecraft, and on the potential and density spatial profiles near the satellite. The physical effects considered include spacecraft charging, photoelectron and secondary electron emission, and the presence of a background magnetic field. Model predictions obtained with our different computational approaches are found to be in agreement within 2% when the same physical processes are taken into account and treated similarly. The comparisons thus indicate that, with the correct description of important physical effects, our simulation models should have the required skill to predict details of satellite-plasma interaction physics under relevant conditions, with a good level of confidence. Our models concur in predicting a negative floating potential V_fl ≈ -10 V for SPP at perihelion. They also predict a "saturated emission regime" whereby most emitted photo- and secondary electrons will be reflected by a potential barrier near the surface, back to the spacecraft where they will be recollected.

  • 60.
    Markidis, Stefano
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Lapenta, G.
    Delzanno, G. L.
    Henri, P.
    Goldman, M. V.
    Newman, D. L.
    Intrator, T.
    Laure, Erwin
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Signatures of secondary collisionless magnetic reconnection driven by kink instability of a flux rope. 2014. In: Plasma Physics and Controlled Fusion, ISSN 0741-3335, E-ISSN 1361-6587, Vol. 56, no. 6, p. 064010. Journal article (Refereed)
    Abstract [en]

    The kinetic features of secondary magnetic reconnection in a single flux rope undergoing internal kink instability are studied by means of three-dimensional particle-in-cell simulations. Several signatures of secondary magnetic reconnection are identified in the plane perpendicular to the flux rope: a quadrupolar electron and ion density structure and a bipolar Hall magnetic field develop in proximity of the reconnection region. The most intense electric fields form perpendicularly to the local magnetic field, and a reconnection electric field is identified in the plane perpendicular to the flux rope. An electron current develops along the reconnection line, in the opposite direction of the electron current supporting the flux rope magnetic field structure. Along the reconnection line, several bipolar structures of the electric field parallel to the magnetic field occur, making the magnetic reconnection region turbulent. The reported signatures of secondary magnetic reconnection can help to localize magnetic reconnection events in space, astrophysical and fusion plasmas.

  • 61.
    Markidis, Stefano
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST).
    Chien, Steven Wei Der
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST).
    Laure, Erwin
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST).
    Peng, I. B.
    Vetter, J. S.
    NVIDIA tensor core programmability, performance & precision. 2018. In: Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018, Institute of Electrical and Electronics Engineers (IEEE), 2018, pp. 522-531, article ID 8425458. Conference paper (Refereed)
    Abstract [en]

    The NVIDIA Volta GPU microarchitecture introduces a specialized unit, called Tensor Core, that performs one matrix-multiply-and-accumulate on 4x4 matrices per clock cycle. The NVIDIA Tesla V100 accelerator, featuring the Volta microarchitecture, provides 640 Tensor Cores with a theoretical peak performance of 125 Tflops/s in mixed precision. In this paper, we investigate current approaches to programming NVIDIA Tensor Cores, their performance, and the precision loss due to computation in mixed precision. Currently, NVIDIA provides three different ways of programming matrix-multiply-and-accumulate on Tensor Cores: the CUDA Warp Matrix Multiply Accumulate (WMMA) API, CUTLASS, a templated library based on WMMA, and cuBLAS GEMM. After experimenting with different approaches, we found that NVIDIA Tensor Cores can deliver up to 83 Tflops/s in mixed precision on a Tesla V100 GPU, seven and three times the performance in single and half precision respectively. A WMMA implementation of batched GEMM reaches a performance of 4 Tflops/s. While precision loss due to matrix multiplication with half precision input might be critical in many HPC applications, it can be considerably reduced at the cost of increased computation. Our results indicate that HPC applications using matrix multiplications can strongly benefit from using NVIDIA Tensor Cores.
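
    The trade-off between precision loss and extra computation can be illustrated on a CPU with NumPy. The sketch below only emulates mixed precision (inputs rounded to float16, products accumulated in float32) and does not use Tensor Cores, WMMA, CUTLASS or cuBLAS. The refinement shown, which keeps the float16 rounding residuals and adds two correction products, is one common way to trade computation for accuracy; whether it matches the paper's exact procedure is an assumption, and all function names are hypothetical.

```python
import numpy as np


def to_f32(x):
    """Promote an array to float32 so products are accumulated in float32."""
    return x.astype(np.float32)


def mixed_precision_matmul(a, b):
    """Round inputs to float16, then multiply and accumulate in float32."""
    a16 = a.astype(np.float16)
    b16 = b.astype(np.float16)
    return to_f32(a16) @ to_f32(b16)


def refined_matmul(a, b):
    """Same as above, plus two correction products built from the rounding
    residuals of the inputs (three matrix products instead of one)."""
    a16 = a.astype(np.float16)
    b16 = b.astype(np.float16)
    # Residuals lost when rounding the float32 inputs to float16.
    ra = (a - to_f32(a16)).astype(np.float16)
    rb = (b - to_f32(b16)).astype(np.float16)
    return (to_f32(a16) @ to_f32(b16)
            + to_f32(ra) @ to_f32(b16)
            + to_f32(a16) @ to_f32(rb))


def compare_errors(n=64, seed=0):
    """Return the max absolute error of both variants vs. a float64 reference."""
    rng = np.random.default_rng(seed)
    a = rng.uniform(-1.0, 1.0, (n, n)).astype(np.float32)
    b = rng.uniform(-1.0, 1.0, (n, n)).astype(np.float32)
    reference = a.astype(np.float64) @ b.astype(np.float64)
    plain = np.abs(mixed_precision_matmul(a, b) - reference).max()
    refined = np.abs(refined_matmul(a, b) - reference).max()
    return plain, refined


plain_err, refined_err = compare_errors()
print(f"half-precision inputs: max error {plain_err:.2e}")
print(f"with residual terms:   max error {refined_err:.2e}")
```

    The refined variant performs three matrix products instead of one, which is the "increased computation" side of the trade-off; the neglected residual-times-residual term is second order in the rounding error.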

  • 62.
    Markidis, Stefano
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz). KTH, Centra, SeRC - Swedish e-Science Research Centre.
    Gong, Jing
    KTH, Centra, SeRC - Swedish e-Science Research Centre. KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Parallelldatorcentrum, PDC. KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Schliephake, Michael
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz). KTH, Centra, SeRC - Swedish e-Science Research Centre.
    Laure, Erwin
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz). KTH, Centra, SeRC - Swedish e-Science Research Centre.
    Hart, Alistair
    Henty, David
    Heisey, Katherine
    Fischer, Paul
    OpenACC acceleration of the Nek5000 spectral element code. 2015. In: The international journal of high performance computing applications, ISSN 1094-3420, E-ISSN 1741-2846, Vol. 29, no. 3, pp. 311-319. Journal article (Refereed)
    Abstract [en]

    We present a case study of porting NekBone, a skeleton version of the Nek5000 code, to a parallel GPU-accelerated system. Nek5000 is a computational fluid dynamics code based on the spectral element method used for the simulation of incompressible flow. The original NekBone Fortran source code has been used as the base and enhanced by OpenACC directives. The profiling of NekBone provided an assessment of the suitability of the code for GPU systems, and indicated possible kernel optimizations. Porting NekBone to GPU systems required little effort and a small number of additional lines of code (approximately one OpenACC directive per 1000 lines of code). The naïve implementation using OpenACC leads to little performance improvement: on a single node, from 16 Gflops obtained with the version without OpenACC, we reached 20 Gflops with the naïve OpenACC implementation. An optimized NekBone version leads to a 43 Gflop performance on a single node. In addition, we ported and optimized NekBone to parallel GPU systems, reaching a parallel efficiency of 79.9% on 1024 GPUs of the Titan XK7 supercomputer at the Oak Ridge National Laboratory.

  • 63.
    Markidis, Stefano
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Henri, P.
    Lapenta, G.
    Divin, A.
    Goldman, M.
    Newman, D.
    Laure, Erwin
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz). KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Parallelldatorcentrum, PDC.
    Kinetic simulations of plasmoid chain dynamics. 2013. In: Physics of Plasmas, ISSN 1070-664X, E-ISSN 1089-7674, Vol. 20, no. 8, p. 082105. Journal article (Refereed)
    Abstract [en]

    The dynamics of a plasmoid chain is studied with three dimensional Particle-in-Cell simulations. The evolution of the system with and without a uniform guide field, whose strength is 1/3 the asymptotic magnetic field, is investigated. The plasmoid chain forms by spontaneous magnetic reconnection: the tearing instability rapidly disrupts the initial current sheet, generating several small-scale plasmoids that rapidly grow in size, coalescing and kinking. The plasmoid kink is mainly driven by the coalescence process. It is found that the presence of a guide field strongly influences the evolution of the plasmoid chain. Without a guide field, a main reconnection site dominates and smaller reconnection regions are included in larger ones, leading to a hierarchical structure of the plasmoid-dominated current sheet. In contrast, in the presence of a guide field, plasmoids have approximately the same size and the hierarchical structure does not emerge, a strong core magnetic field develops in the center of the plasmoid in the direction of the existing guide field, and bump-on-tail instability, leading to the formation of electron holes, is detected in proximity of the plasmoids.

  • 64.
    Markidis, Stefano
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Parallelldatorcentrum, PDC.
    Henri, P.
    Lapenta, G.
    Divin, A.
    Goldman, M. V.
    Newman, D.
    Eriksson, S.
    Collisionless magnetic reconnection in a plasmoid chain. 2012. In: Nonlinear processes in geophysics, ISSN 1023-5809, E-ISSN 1607-7946, Vol. 19, no. 1, pp. 145-153. Journal article (Refereed)
    Abstract [en]

    The kinetic features of plasmoid chain formation and evolution are investigated by two dimensional Particle-in-Cell simulations. Magnetic reconnection is initiated in multiple X points by the tearing instability. Plasmoids form and grow in size by continuously coalescing. Each chain plasmoid exhibits a strong out-of-plane core magnetic field and an out-of-plane electron current that drives the coalescing process. The disappearance of the X points in the coalescence process is due to anti-reconnection, a magnetic reconnection where the plasma inflow and outflow are reversed with respect to the original reconnection flow pattern. Anti-reconnection is characterized by the Hall magnetic field quadrupole signature. Two new kinetic features, not reported by previous studies of plasmoid chain evolution, are here revealed. First, intense electric fields develop in-plane normally to the separatrices and drive the ion dynamics in the plasmoids. Second, several bipolar electric field structures are localized in proximity of the plasmoid chain. The analysis of the electron distribution function and phase space reveals the presence of counter-streaming electron beams, unstable to the two stream instability, and phase space electron holes along the reconnection separatrices.

  • 65.
    Markidis, Stefano
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Henri, Pierre
    Lapenta, Giovanni
    Ronnmark, Kjell
    Hamrin, Maria
    Meliani, Zakaria
    Laure, Erwin
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    The Fluid-Kinetic Particle-in-Cell method for plasma simulations. 2014. In: Journal of Computational Physics, ISSN 0021-9991, E-ISSN 1090-2716, Vol. 271, pp. 415-429. Journal article (Refereed)
    Abstract [en]

    A method that concurrently solves the multi-fluid and Maxwell's equations has been developed for plasma simulations. By calculating the stress tensor in the multi-fluid momentum equation by means of computational particles moving in a self-consistent electromagnetic field, the kinetic effects are retained while solving the multi-fluid equations. Maxwell's equations and the multi-fluid equations are discretized implicitly in time, enabling kinetic simulations over time scales typical of fluid simulations. The Fluid-Kinetic Particle-in-Cell method has been implemented in a three-dimensional electromagnetic code, and tested against the two-stream instability, the Weibel instability, the ion cyclotron resonance and magnetic reconnection problems. The method is a promising approach for coupling fluid and kinetic methods in a unified framework.

  • 66.
    Markidis, Stefano
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Parallelldatorcentrum, PDC.
    Lapenta, G.
    Divin, A.
    Goldman, M.
    Newman, D.
    Andersson, L.
    Three dimensional density cavities in guide field collisionless magnetic reconnection. 2012. In: Physics of Plasmas, ISSN 1070-664X, E-ISSN 1089-7674, Vol. 19, no. 3, p. 032119. Journal article (Refereed)
    Abstract [en]

    Particle-in-cell simulations of collisionless magnetic reconnection with a guide field reveal for the first time the three dimensional features of the low density regions along the magnetic reconnection separatrices, the so-called cavities. It is found that structures with even lower density develop within the cavities. Because their appearance resembles the shape of ribs, these formations are here called low density ribs. Their location remains approximately fixed in time and their density progressively decreases, as electron currents along the cavities evacuate them. They develop along the magnetic field lines and are supported by a strong perpendicular electric field that oscillates in space. In addition, bipolar parallel electric field structures form as isolated spheres between the cavities and the outflow plasma, along the direction of the low density ribs and of magnetic field lines.

  • 67.
    Markidis, Stefano
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Laure, Erwin
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Solving software challenges for exascale: International Conference on Exascale Applications and Software, EASC 2014 Stockholm, Sweden, April 2–3, 2014 revised selected papers. 2015. In: International Conference on Exascale Applications and Software, EASC 2014, Elsevier, 2015, Vol. 8759. Conference paper (Refereed)
  • 68.
    Markidis, Stefano
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST).
    Olshevsky, Vyacheslav
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST).
    Sishtla, C. P.
    Chien, Steven W. D.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST).
    Laure, Erwin
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST).
    Lapenta, G.
    PolyPIC: The polymorphic-particle-in-cell method for fluid-kinetic coupling. 2018. In: Frontiers in Physics, E-ISSN 2296-424X, Vol. 6, no. OCT, article ID 100. Journal article (Refereed)
    Abstract [en]

    Particle-in-Cell (PIC) methods are widely used computational tools for fluid and kinetic plasma modeling. While both the fluid and kinetic PIC approaches have been successfully used to target either kinetic or fluid simulations, little was done to combine fluid and kinetic particles under the same PIC framework. This work addresses this issue by proposing a new PIC method, PolyPIC, that uses polymorphic computational particles. In this numerical scheme, particles can be either kinetic or fluid, and fluid particles can become kinetic when necessary, e.g., particles undergoing a strong acceleration. We design and implement the PolyPIC method, and test it against the Landau damping of Langmuir and ion acoustic waves, two stream instability and sheath formation. We unify the fluid and kinetic PIC methods under one common framework comprising both fluid and kinetic particles, providing a tool for adaptive fluid-kinetic coupling in plasma simulations.

  • 69.
    Markidis, Stefano
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Peng, Ivy Bo
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Larsson Träff, Jesper
    Rougier, Antoine
    Bartsch, Valeria
    Machado, Rui
    Rahn, Mirko
    Hart, Alistair
    Holmes, Daniel
    Bull, Mark
    Laure, Erwin
    KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Parallelldatorcentrum, PDC.
    The EPiGRAM Project: Preparing Parallel Programming Models for Exascale. 2016. In: HIGH PERFORMANCE COMPUTING, ISC HIGH PERFORMANCE 2016 INTERNATIONAL WORKSHOPS, Springer, 2016, pp. 56-68. Conference paper (Refereed)
    Abstract [en]

    EPiGRAM is a European Commission funded project to improve existing parallel programming models so that large-scale applications run efficiently on exascale supercomputers. The EPiGRAM project focuses on the two current dominant petascale programming models, message-passing and PGAS, and on the improvement of two of their associated programming systems, MPI and GASPI. In EPiGRAM, we work on two major aspects of programming systems. First, we improve the performance of communication operations by decreasing the memory consumption, improving collective operations and introducing emerging computing models. Second, we enhance the interoperability of message-passing and PGAS by integrating them in one PGAS-based MPI implementation, called EMPI4Re, implementing MPI endpoints and improving GASPI interoperability with MPI. The new EPiGRAM concepts are tested in two large-scale applications, iPIC3D, a Particle-in-Cell code for space physics simulations, and Nek5000, a Computational Fluid Dynamics code.

  • 70.
    Markidis, Stefano
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Peng, Ivybo
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Iakymchuk, Roman
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Laure, Erwin
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Kestor, G.
    Gioiosa, R.
    A performance characterization of streaming computing on supercomputers. 2016. In: Procedia Computer Science, Elsevier, 2016, pp. 98-107. Conference paper (Refereed)
    Abstract [en]

    Streaming computing models allow for on-the-fly processing of large data sets. With the increased demand for processing large amounts of data in a reasonable period of time, streaming models are more and more used on supercomputers to solve data-intensive problems. Because supercomputers have been mainly used for compute-intensive workloads, supercomputer performance metrics focus on the number of floating point operations in time and cannot fully characterize a streaming application's performance on supercomputers. We introduce the injection and processing rates as the main metrics to characterize the performance of streaming computing on supercomputers. We analyze the dynamics of these quantities in a modified STREAM benchmark developed atop an MPI streaming library in a series of different configurations. We show that after a brief transient the injection and processing rates converge to sustained rates. We also demonstrate that streaming computing performance strongly depends on the number of connections between data producers and consumers and on the processing task granularity.

  • 71.
    Markidis, Stefano
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Schliephake, Michael
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Aguilar, Xavier
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Henty, David
    University of Edinburgh.
    Richardson, Harvey
    Cray Inc..
    Hart, Alistair
    Cray Inc..
    Gray, Alan
    University of Edinburgh.
    Lecomber, David
    Allinea Software Limited.
    Hilbrich, Tobias
    Technische Universität Dresden.
    Doleschal, Jens
    Technische Universität Dresden.
    Laure, Erwin
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Paving the path to exascale computing with CRESTA development environment. 2013. Conference paper (Other academic)
    Abstract [en]

    The development and implementation of efficient computer codes for exascale supercomputers will require combined advancement of all development environment components: compilers, automatic tuning frameworks, run-time systems, debuggers and performance monitoring and analysis tools. The exascale era poses unprecedented challenges. Because accelerators are more and more common among the fastest supercomputers and will play a role in exascale computing, compilers will need to support hybrid computer architectures and generate efficient code that hides the complexity of programming accelerators. Hand optimization of the code will be very difficult on exascale machines and will be increasingly assisted by automatic tuners. Application tuning will focus more on the parallel aspects of the computation because of the large amount of available parallelism. The application workload will be distributed over millions of processes, and implementing ad-hoc strategies directly in the application will probably be infeasible, while an adaptive run-time system will provide automatic load balancing. Debuggers and performance monitoring tools will deal with millions of processes and with huge amounts of data from application and hardware counters, but they will still be required to minimize the overhead and retain scalability. In this talk, we present how the development environment of the CRESTA exascale EC project meets all these challenges by advancing the state of the art in the field.

    An investigation of compiler support for hybrid GPU programming, the design concepts, and the main characteristics of the alpha prototype implementation of the CRESTA development environment components for exascale computing are presented. A performance study of OpenACC compiler directives has been carried out, showing very promising results and indicating OpenACC as a viable approach for programming hybrid exascale supercomputers. A new Domain-Specific Language (DSL) has been defined for the expression of parallel auto-tuning at very large scale. The focus is on the extension of the auto-tuning approach into the parallel domain to enable tuning of communication-related aspects of applications. A new adaptive run-time system has been designed to schedule processes depending on the resource availability, on the workload, and on the run-time analysis of the application performance. The Allinea DDT debugger and the Dresden University of Technology MUST MPI correctness checker are being extended to provide a unified interface, to improve scalability, and to include new disruptive technology based on statistical analysis of the run-time behavior of the application for anomaly detection. The new exascale prototypes of the Dresden University of Technology Vampir, VampirTrace and Score-P performance monitoring and analysis tools have been released. The new features include the possibility of applying filtering techniques before loading performance data to drastically reduce memory needs during the performance analysis. The initial evaluation study of the development environment targets the CRESTA project applications to determine how the development environment could be coupled into a production suite for exascale computing.

  • 72.
    Markidis, Stefano
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Vencels, Juris
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Peng, Ivy Bo
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Akhmetova, Dana
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Laure, Erwin
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Henri, Pierre
    Idle waves in high-performance computing. 2015. In: Physical Review E. Statistical, Nonlinear, and Soft Matter Physics, ISSN 1539-3755, E-ISSN 1550-2376, Vol. 91, no. 1, p. 013306. Journal article (Refereed)
    Abstract [en]

    The vast majority of parallel scientific applications distributes computation among processes that are in a busy state when computing and in an idle state when waiting for information from other processes. We identify the propagation of idle waves through processes in scientific applications with a local information exchange between the two processes. Idle waves are nondispersive and have a phase velocity inversely proportional to the average busy time. The physical mechanism enabling the propagation of idle waves is the local synchronization between two processes due to remote data dependency. This study provides a description of the large number of processes in parallel scientific applications as a continuous medium. This work also is a step towards an understanding of how localized idle periods can affect remote processes, leading to the degradation of global performance in parallel scientific applications.
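
    The propagation mechanism described in the abstract above can be reproduced in a toy model. The sketch below is not the model from the paper: it assumes a 1D chain of processes, a constant busy time per step, zero communication cost, and a single one-off delay injected at one rank; the names `simulate_idle_wave` and `wave_front` are hypothetical. Each rank can only start its next step once it and its direct neighbours have finished computing, which is the local synchronization the abstract refers to.

```python
def simulate_idle_wave(n_procs, n_steps, busy_time, delay, delayed_rank=0):
    """Return idle[step][rank]: how long each rank waits for its neighbours."""
    finish = [0.0] * n_procs  # time at which each rank finished the previous step
    idle = []
    for step in range(n_steps):
        # Every rank computes for busy_time; one rank is delayed once, in step 0.
        compute_done = []
        for rank in range(n_procs):
            extra = delay if (step == 0 and rank == delayed_rank) else 0.0
            compute_done.append(finish[rank] + busy_time + extra)
        # Local synchronization: wait for the slower of the direct neighbours.
        new_finish, idle_row = [], []
        for rank in range(n_procs):
            neighbours = [r for r in (rank - 1, rank + 1) if 0 <= r < n_procs]
            ready = max([compute_done[rank]] + [compute_done[r] for r in neighbours])
            idle_row.append(ready - compute_done[rank])
            new_finish.append(ready)
        finish = new_finish
        idle.append(idle_row)
    return idle


def wave_front(idle_row):
    """Rank with the largest idle time in one step (position of the idle wave)."""
    return max(range(len(idle_row)), key=lambda r: idle_row[r])
```

    In this toy model the idle period moves by one rank per step without changing its size, and since each step lasts one busy time, the front travels at 1 / busy_time ranks per unit time, which is consistent with the inverse proportionality stated in the abstract.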

  • 73. Narasimhamurthy, S.
    et al.
    Danilov, N.
    Wu, S.
    Umanesan, G.
    Chien, Steven Wei Der
    KTH.
    Rivas-Gomez, Sergio
    KTH.
    Peng, Ivy Bo
    KTH.
    Laure, Erwin
    KTH.
    De Witt, S.
    Pleiter, D.
    Markidis, Stefano
    KTH.
    The SAGE project: A storage centric approach for exascale computing. 2018. In: 2018 ACM International Conference on Computing Frontiers, CF 2018 - Proceedings, Association for Computing Machinery (ACM), 2018, pp. 287-292. Conference paper (Refereed)
    Abstract [en]

    SAGE (Percipient StorAGe for Exascale Data Centric Computing) is a European Commission funded project towards the era of Exascale computing. Its goal is to design and implement a Big Data/Extreme Computing (BDEC) capable infrastructure with an associated software stack. The SAGE system follows a storage centric approach as it is capable of storing and processing large data volumes at the Exascale regime. SAGE addresses the convergence of Big Data Analysis and HPC in an era of next-generation data centric computing. This convergence is driven by the proliferation of massive data sources, such as large, dispersed scientific instruments and sensors where data needs to be processed, analyzed and integrated into simulations to derive scientific and innovative insights. A first prototype of the SAGE system has been implemented and installed at the Jülich Supercomputing Center. The SAGE storage system consists of multiple types of storage device technologies in a multi-tier I/O hierarchy, including flash, disk, and non-volatile memory technologies. The main SAGE software component is the Seagate Mero Object Storage that is accessible via the Clovis API and higher level interfaces. The SAGE project also includes scientific applications for the validation of the SAGE concepts. The objective of this paper is to present the SAGE project concepts and the prototype of the SAGE platform, and to discuss the software architecture of the SAGE system.

  • 74.
    Narasimhamurthy, Sai
    et al.
    Seagate Syst UK, London, England..
    Danilov, Nikita
    Seagate Syst UK, London, England..
    Wu, Sining
    Seagate Syst UK, London, England..
    Umanesan, Ganesan
    Seagate Syst UK, London, England..
    Markidis, Stefano
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST).
    Rivas-Gomez, Sergio
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST).
    Peng, Ivy Bo
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST).
    Laure, Erwin
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Centra, Parallelldatorcentrum, PDC.
    Pleiter, Dirk
    Julich Supercomp Ctr, Julich, Germany..
    de Witt, Shaun
    Culham Ctr Fus Energy, Abingdon, Oxon, England..
    SAGE: Percipient Storage for Exascale Data Centric Computing. 2019. In: Parallel Computing, ISSN 0167-8191, E-ISSN 1872-7336, Vol. 83, pp. 22-33. Journal article (Refereed)
    Abstract [en]

    We aim to implement a Big Data/Extreme Computing (BDEC) capable system infrastructure as we head towards the era of Exascale computing - termed SAGE (Percipient StorAGe for Exascale Data Centric Computing). The SAGE system will be capable of storing and processing immense volumes of data at the Exascale regime, and provide the capability for Exascale class applications to use such a storage infrastructure. SAGE addresses the increasing overlaps between Big Data Analysis and HPC in an era of next-generation data centric computing that has developed due to the proliferation of massive data sources, such as large, dispersed scientific instruments and sensors, whose data needs to be processed, analysed and integrated into simulations to derive scientific and innovative insights. Indeed, Exascale I/O, as a problem that has not been sufficiently dealt with for simulation codes, is appropriately addressed by the SAGE platform. The objective of this paper is to discuss the software architecture of the SAGE system and look at early results we have obtained employing some of its key methodologies, as the system continues to evolve.

  • 75. O'Donncha, F.
    et al.
    Iakymchuk, Roman
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Datavetenskap, Beräkningsvetenskap och beräkningsteknik (CST).
    Akhriev, A.
    Gschwandtner, P.
    Thoman, P.
    Heller, T.
    Aguilar, Xavier
    KTH, Centra, SeRC - Swedish e-Science Research Centre.
    Dichev, K.
    Gillan, C.
    Markidis, Stefano
    KTH, Centra, SeRC - Swedish e-Science Research Centre. KTH, Skolan för elektroteknik och datavetenskap (EECS), Datavetenskap, Beräkningsvetenskap och beräkningsteknik (CST).
    Laure, Erwin
    KTH, Centra, SeRC - Swedish e-Science Research Centre. KTH, Skolan för elektroteknik och datavetenskap (EECS), Centra, Parallelldatorcentrum, PDC.
    Ragnoli, E.
    Vassiliadis, V.
    Johnston, M.
    Jordan, H.
    Fahringer, T.
    AllScale toolchain pilot applications: PDE based solvers using a parallel development environment. 2019. In: Computer Physics Communications, ISSN 0010-4655, E-ISSN 1879-2944, article ID 107089. Journal article (Refereed)
    Abstract [en]

    AllScale is a programming environment targeting simplified development of highly scalable parallel applications by dividing development responsibilities into silos. The front-end AllScale API provides a simple C++ development environment through a suite of parallel constructs expressing tasks that operate concurrently. This interfaces with the other components of the toolchain (core-level API, compiler and runtime), which manage tasks related to the machine and system level, hidden from the user. The paper describes the development of two large-scale parallel applications within the AllScale API, namely, an advection–diffusion model with data assimilation and a Lagrangian space-weather simulation model based on a particle-in-cell method. We present mathematical formulations and implementations and evaluate parallel constructs developed using the AllScale API. The performance of the applications is assessed from the perspective of both parallel scalability and, more importantly, productivity. We demonstrate how the AllScale API can greatly improve developer productivity while maintaining parallel performance in two applications with distinct numerical characteristics. Code complexity metrics demonstrate a reduction in application-specific implementations of up to 30%, while performance tests on three different compute systems demonstrate parallel scalability comparable to an MPI version of the code.

  • 76. Olshevsky, V.
    et al.
    Lapenta, G.
    Markidis, Stefano
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Energetics of Kinetic Reconnection in a Three-Dimensional Null-Point Cluster. 2013. In: Physical Review Letters, ISSN 0031-9007, E-ISSN 1079-7114, Vol. 111, no. 4, p. 045002. Journal article (Refereed)
    Abstract [en]

    We perform three-dimensional particle-in-cell simulations of magnetic reconnection with multiple magnetic null points. Magnetic field energy conversion into kinetic energy is about five times higher than in traditional Harris sheet configuration. More than 85% of initial magnetic field energy is transferred to particle energy during 25 reversed ion cyclofrequencies. Magnetic reconnection in the cluster of null points evolves in three phases. During the first phase, ion beams are excited, then give part of their energy back to the magnetic field in the second phase. In the third phase, magnetic reconnection occurs in many small patches around the current channels formed along the stripes of a low magnetic field. Magnetic reconnection in null points essentially presents three-dimensional features, with no two-dimensional symmetries or current sheets.

  • 77. Olshevsky, Vyacheslav
    et al.
    Deca, Jan
    Divin, Andrey
    Peng, Ivy Bo
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Markidis, Stefano
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Innocenti, Maria Elena
    Cazzola, Emanuele
    Lapenta, Giovanni
    Magnetic Null Points In Kinetic Simulations of Space Plasmas. 2016. In: Astrophysical Journal, ISSN 0004-637X, E-ISSN 1538-4357, Vol. 819, no. 1, article ID 52. Journal article (Refereed)
    Abstract [en]

    We present a systematic attempt to study magnetic null points and the associated magnetic energy conversion in kinetic particle-in-cell simulations of various plasma configurations. We address three-dimensional simulations performed with the semi-implicit kinetic electromagnetic code iPic3D in different setups: variations of a Harris current sheet, dipolar and quadrupolar magnetospheres interacting with the solar wind, and a relaxing turbulent configuration with multiple null points. Spiral nulls are more likely created in space plasmas: in all our simulations except lunar magnetic anomaly (LMA) and quadrupolar mini-magnetosphere the number of spiral nulls prevails over the number of radial nulls by a factor of 3-9. We show that often magnetic nulls do not indicate the regions of intensive energy dissipation. Energy dissipation events caused by topological bifurcations at radial nulls are rather rare and short-lived. The so-called X-lines formed by the radial nulls in the Harris current sheet and LMA simulations are rather stable and do not exhibit any energy dissipation. Energy dissipation is more powerful in the vicinity of spiral nulls enclosed by magnetic flux ropes with strong currents at their axes (their cross sections resemble 2D magnetic islands). These null lines reminiscent of Z-pinches efficiently dissipate magnetic energy due to secondary instabilities such as the two-stream or kinking instability, accompanied by changes in magnetic topology. Current enhancements accompanied by spiral nulls may signal magnetic energy conversion sites in the observational data.

  • 78. Olshevsky, Vyacheslav
    et al.
    Divin, Andrey
    Eriksson, Elin
    Markidis, Stefano
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Lapenta, Giovanni
    ENERGY DISSIPATION IN MAGNETIC NULL POINTS AT KINETIC SCALES. 2015. In: Astrophysical Journal, ISSN 0004-637X, E-ISSN 1538-4357, Vol. 807, no. 2, article ID 155. Journal article (Refereed)
    Abstract [en]

    We use kinetic particle-in-cell and MHD simulations supported by an observational data set to investigate magnetic reconnection in clusters of null points in space plasma. The magnetic configuration under investigation is driven by fast adiabatic flux rope compression that dissipates almost half of the initial magnetic field energy. In this phase powerful currents are excited producing secondary instabilities, and the system is brought into a state of "intermittent turbulence" within a few ion gyro-periods. Reconnection events are distributed all over the simulation domain and energy dissipation is rather volume-filling. Numerous spiral null points interconnected via their spines form null lines embedded into magnetic flux ropes; null point pairs demonstrate the signatures of torsional spine reconnection. However, energy dissipation mainly happens in the shear layers formed by adjacent flux ropes with oppositely directed currents. In these regions radial null pairs are spontaneously emerging and vanishing, associated with electron streams and small-scale current sheets. The number of spiral nulls in the simulation outweighs the number of radial nulls by a factor of 5-10, in accordance with Cluster observations in the Earth's magnetosheath. Twisted magnetic fields with embedded spiral null points might indicate the regions of major energy dissipation for future space missions such as the Magnetospheric Multiscale Mission.

  • 79. Olshevsky, Vyacheslav
    et al.
    Lapenta, Giovanni
    Markidis, Stefano
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Divin, Andrey
    Role of Z-pinches in magnetic reconnection in space plasmas. 2015. In: Journal of Plasma Physics, ISSN 0022-3778, E-ISSN 1469-7807, Vol. 81, no. 1, article ID 325810105. Journal article (Refereed)
    Abstract [en]

    A widely accepted scenario of magnetic reconnection in collisionless space plasmas is the breakage of magnetic field lines in X-points. In the laboratory, reconnection is commonly studied in pinches, current channels embedded into twisted magnetic fields. No model of magnetic reconnection in space plasmas considers both null-points and pinches as peers. We have performed a particle-in-cell simulation of magnetic reconnection in a three-dimensional configuration where null-points are present initially, and Z-pinches are formed during the simulation along the lines of spiral null-points. The non-spiral null-points are more stable than spiral ones, and no substantial energy dissipation is associated with them. On the contrary, turbulent magnetic reconnection in the pinches causes the magnetic energy to decay at a rate of approximately 1.5% per ion gyro period. Dissipation in similar structures is a likely scenario in space plasmas with a large fraction of spiral null-points.
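
    To give a feel for the decay rate quoted in the abstract above, the short calculation below compounds a fixed fractional loss per ion gyro period. It assumes the rate of 1.5% stays constant over time, which the abstract does not claim; the function names are hypothetical.

```python
import math


def remaining_fraction(n_periods, rate=0.015):
    """Fraction of the magnetic energy left after n_periods gyro periods,
    assuming a constant fractional loss `rate` per period."""
    return (1.0 - rate) ** n_periods


def periods_to_fraction(fraction, rate=0.015):
    """Number of gyro periods needed to decay to `fraction` of the initial energy."""
    return math.log(fraction) / math.log(1.0 - rate)


if __name__ == "__main__":
    print(f"left after 10 periods: {remaining_fraction(10):.3f}")
    print(f"periods to halve the energy: {periods_to_fraction(0.5):.1f}")
```

    Under that constant-rate assumption the magnetic energy would halve after roughly 46 ion gyro periods.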

  • 80.
    Peng, Bo
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Markidis, Stefano
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Vaivads, A.
    Vencels, Juris
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Amaya, J.
    Divin, A.
    Laure, Erwin
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Lapenta, G.
    The formation of a magnetosphere with implicit Particle-in-Cell simulations. 2015. In: Procedia Computer Science, Elsevier, 2015, no. 1, pp. 1178-1187. Conference paper (Refereed)
    Abstract [en]

    We demonstrate the improvements to an implicit Particle-in-Cell code, iPic3D, using the example of a dipolar magnetic field immersed in a plasma flow, and show the formation of a magnetosphere. We address the problem of modelling multi-scale phenomena during the formation of a magnetosphere by implementing an adaptive sub-cycling technique to resolve the motion of particles located close to the magnetic dipole centre, where the magnetic field intensity is maximum. In addition, we implemented new open boundary conditions to model the inflow and outflow of plasma. We present the results of a global three-dimensional Particle-in-Cell simulation and discuss the performance improvements from the adaptive sub-cycling technique.
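
    The adaptive sub-cycling idea mentioned in the abstract above can be sketched as follows. The abstract does not state the criterion used in iPic3D, so this is only an assumed rule: split the global time step so that a particle never rotates by more than a chosen angle around the local magnetic field per sub-step. The function names and the `max_angle` parameter are hypothetical.

```python
import math


def gyro_frequency(charge, mass, b_magnitude):
    """Cyclotron (gyro) angular frequency |q| B / m."""
    return abs(charge) * b_magnitude / mass


def n_subcycles(charge, mass, b_magnitude, dt, max_angle=0.1):
    """Number of sub-steps for one particle in one global time step dt.

    Assumed criterion: the gyration angle per sub-step, omega_c * dt / n,
    must not exceed max_angle (in radians). At least one sub-step is taken.
    """
    angle = gyro_frequency(charge, mass, b_magnitude) * dt
    return max(1, math.ceil(angle / max_angle))
```

    With such a rule, particles far from the dipole keep a single step per cycle, while particles in the strong-field region near the dipole centre take proportionally more, shorter sub-steps.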

  • 81.
    Peng, I. B.
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Gioiosa, R.
    Kestor, G.
    Cicotti, P.
    Laure, Erwin
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Markidis, Stefano
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Exploring the performance benefit of hybrid memory system on HPC environments. 2017. In: Proceedings - 2017 IEEE 31st International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2017, Institute of Electrical and Electronics Engineers (IEEE), 2017, pp. 683-692, article ID 7965110. Conference paper (Refereed)
    Abstract [en]

    Hardware accelerators have become a de-facto standard to achieve high performance on current supercomputers and there are indications that this trend will increase in the future. Modern accelerators feature high-bandwidth memory next to the computing cores. For example, the Intel Knights Landing (KNL) processor is equipped with 16 GB of high-bandwidth memory (HBM) that works together with conventional DRAM memory. Theoretically, HBM can provide ∼4× higher bandwidth than conventional DRAM. However, many factors impact the effective performance achieved by applications, including the application memory access pattern, the problem size, the threading level and the actual memory configuration. In this paper, we analyze the Intel KNL system and quantify the impact of the most important factors on the application performance by using a set of applications that are representative of scientific and data-analytics workloads. Our results show that applications with regular memory access benefit from MCDRAM, achieving up to 3× performance when compared to the performance obtained using only DRAM. On the contrary, applications with random memory access pattern are latency-bound and may suffer from performance degradation when using only MCDRAM. For those applications, the use of additional hardware threads may help hide latency and achieve higher aggregated bandwidth when using HBM.

  • 82. Peng, I. B.
    et al.
    Gioiosa, R.
    Kestor, G.
    Vetter, J. S.
    Cicotti, P.
    Laure, Erwin
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST).
    Markidis, Stefano
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST).
    Characterizing the performance benefit of hybrid memory system for HPC applications. 2018. In: Parallel Computing, ISSN 0167-8191, E-ISSN 1872-7336, Vol. 76, pp. 57-69. Journal article (Refereed)
    Abstract [en]

    Heterogeneous memory systems that consist of multiple memory technologies are becoming common in high-performance computing environments. Modern processors and accelerators, such as the Intel Knights Landing (KNL) CPU and NVIDIA Volta GPU, feature small-size high-bandwidth memory near the compute cores and large-size normal-bandwidth memory that is connected off-chip. Theoretically, HBM can provide about four times higher bandwidth than conventional DRAM. However, many factors impact the actual performance improvement that an application can achieve on such a system. In this paper, we focus on the Intel KNL system and identify the most important factors affecting application performance, including the application memory access pattern, the problem size, the threading level and the actual memory configuration. We use a set of representative applications from both scientific and data-analytics domains. Our results show that applications with regular memory access benefit from MCDRAM, achieving up to three times the performance obtained using only DRAM. On the contrary, applications with irregular memory access patterns are latency-bound and may suffer from performance degradation when using only MCDRAM. We also provide a memory-centric analysis of four applications, identify their major data objects, and correlate their characteristics with the performance improvement on the testbed.

  • 83.
    Peng, I. B.
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Markidis, Stefano
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Laure, Erwin
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    The cost of synchronizing imbalanced processes in message passing systems. 2015. In: Proceedings - IEEE International Conference on Cluster Computing, ICCC, Institute of Electrical and Electronics Engineers (IEEE), 2015, pp. 408-417. Conference paper (Refereed)
    Abstract [en]

    Synchronization in message passing systems is achieved by communication among processes. System and architectural noise and different workloads cause processes to be imbalanced and to reach synchronization points at different times. Thus, both communication and imbalance impact the synchronization performance. In this paper, we study the algorithmic properties that allow the communication in synchronization to absorb the initial imbalance among processes. We quantify the imbalance absorption properties of different barrier algorithms using a LogP Monte Carlo simulator. We found that linear and f-way tournament barriers can absorb up to 95% of random exponential imbalance with the standard deviation equal to the communication time for one message. Dissemination, butterfly and pairwise exchange barriers, on the other hand, do not absorb imbalance but can effectively bound the post-barrier imbalance. We identify that synchronization transitions from communication-dominated to imbalance-dominated when the standard deviation of the imbalance distribution is more than twice the communication time for one message. In our study, f-way tournament barriers provided the best imbalance absorption rate and convenient communication time.

  • 84.
    Peng, I. Bo
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Markidis, Stefano
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Laure, Erwin
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Johlander, A.
    Vaivads, A.
    Khotyaintsev, Y.
    Henri, P.
    Lapenta, G.
    Kinetic structures of quasi-perpendicular shocks in global particle-in-cell simulations. 2015. In: Physics of Plasmas, ISSN 1070-664X, E-ISSN 1089-7674, Vol. 22, no. 9, article ID 092109. Journal article (Refereed)
    Abstract [en]

    We carried out global Particle-in-Cell simulations of the interaction between the solar wind and a magnetosphere to study the kinetic collisionless physics in super-critical quasi-perpendicular shocks. After an initial simulation transient, a collisionless bow shock forms as a result of the interaction of the solar wind and a planet magnetic dipole. The shock ramp has a thickness of approximately one ion skin depth and is followed by a trailing wave train in the shock downstream. At the downstream edge of the bow shock, whistler waves propagate along the magnetic field lines and the presence of electron cyclotron waves has been identified. A small part of the solar wind ion population is specularly reflected by the shock while a larger part is deflected and heated by the shock. Solar wind ions and electrons are heated in the perpendicular directions. Ions are accelerated in the perpendicular direction in the trailing wave train region. This work is an initial effort to study the electron and ion kinetic effects developed near the bow shock in a realistic magnetic field configuration.

  • 85.
    Peng, Ivy B.
    et al.
    Oak Ridge Natl Lab, Oak Ridge, TN 37830 USA..
    Vetter, Jeffrey S.
    Oak Ridge Natl Lab, Oak Ridge, TN 37830 USA..
    Moore, Shirley
    Oak Ridge Natl Lab, Oak Ridge, TN 37830 USA..
    Joydeep, Rakshit
    Intel Labs, Santa Clara, CA USA..
    Markidis, Stefano
    KTH, Centra, SeRC - Swedish e-Science Research Centre. KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST).
    Analyzing the Suitability of Contemporary 3D-Stacked PIM Architectures for HPC Scientific Applications. 2019. In: CF '19 - PROCEEDINGS OF THE 16TH ACM INTERNATIONAL CONFERENCE ON COMPUTING FRONTIERS, ASSOC COMPUTING MACHINERY, 2019, pp. 256-262. Conference paper (Refereed)
    Abstract [en]

    Scaling off-chip bandwidth is challenging due to fundamental limitations, such as a fixed pin count and plateauing signaling rates. Recently, vendors have turned to 2.5D and 3D stacking to closely integrate system components. Interestingly, these technologies can integrate a logic layer under multiple memory dies, enabling computing capability inside a memory stack. This trend in stacking is making PIM architectures commercially viable. In this work, we investigate the suitability of offloading kernels in scientific applications onto 3D stacked PIM architectures. We evaluate several hardware constraints resulting from the stacked structure. We perform extensive simulation experiments and in-depth analysis to quantify the impact of application locality in TLBs, data caches, and memory stacks. Our results also identify design optimization areas in software and hardware for HPC scientific applications.

  • 86.
    Peng, Ivy Bo
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Gioiosa, Roberto
    Kestor, Gokcen
    Cicotti, Pietro
    Laure, Erwin
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Markidis, Stefano
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    RTHMS: A Tool for Data Placement on Hybrid Memory System. 2017. In: ACM SIGPLAN NOTICES, ASSOC COMPUTING MACHINERY, 2017, Vol. 52, no. 9, pp. 82-91. Conference paper (Refereed)
    Abstract [en]

    Traditional scientific and emerging data analytics applications require fast, power-efficient, large, and persistent memories. Combining all these characteristics within a single memory technology is expensive and hence future supercomputers will feature different memory technologies side-by-side. However, it is a complex task to program hybrid-memory systems and to identify the best object-to-memory mapping. We envision that programmers will probably resort to using default configurations that only require minimal interventions on the application code or system settings. In this work, we argue that intelligent, fine-grained data placement can achieve higher performance than default setups. We present an algorithm for data placement on hybrid-memory systems. Our algorithm is based on a set of single-object allocation rules and global data placement decisions. We also present RTHMS, a tool that implements our algorithm and provides recommendations about the object-to-memory mapping. Our experiments on a hybrid memory system, an Intel Knights Landing processor with DRAM and HBM, show that RTHMS is able to achieve higher performance than the default configuration. We believe that RTHMS will be a valuable tool for programmers working on complex hybrid-memory systems.

  • 87.
    Peng, Ivy Bo
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Gioiosa, Roberto
    Kestor, Gokcen
    Laure, Erwin
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Markidis, Stefano
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Preparing HPC Applications for the Exascale Era: A Decoupling Strategy. 2017. In: 2017 46th International Conference on Parallel Processing (ICPP), IEEE Computer Society, 2017, pp. 1-10, article id 8025274. Conference paper (Refereed)
    Abstract [en]

    Production-quality parallel applications are often a mixture of diverse operations, such as computation- and communication-intensive, regular and irregular, tightly coupled and loosely linked operations. In conventional construction of parallel applications, each process performs all the operations, which might be inefficient and seriously limit scalability, especially at large scale. We propose a decoupling strategy to improve the scalability of applications running on large-scale systems. Our strategy separates application operations onto groups of processes and enables a dataflow processing paradigm among the groups. This mechanism is effective in reducing the impact of load imbalance and increases the parallel efficiency by pipelining multiple operations. We provide a proof-of-concept implementation using MPI, the de-facto programming system on current supercomputers. We demonstrate the effectiveness of this strategy by decoupling the reduce, particle communication, halo exchange and I/O operations in a set of scientific and data-analytics applications. A performance evaluation on 8,192 processes of a Cray XC40 supercomputer shows that the proposed approach can achieve up to 4x performance improvement.

  • 88.
    Peng, Ivy Bo
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Markidis, Stefano
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Gioiosa, Roberto
    Pacific Northwest Natl Lab, Computat Sci & Math Div, Richland, WA 99352 USA..
    Kestor, Gokcen
    Pacific Northwest Natl Lab, Computat Sci & Math Div, Richland, WA 99352 USA..
    Laure, Erwin
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    MPI Streams for HPC Applications. 2017. In: New Frontiers in High Performance Computing and Big Data / [ed] Geoffrey Fox, Vladimir Getov, Lucio Grandinetti, Gerhard Joubert, Thomas Sterling, IOS Press, 2017, pp. 75-92. Conference paper (Refereed)
    Abstract [en]

    A data stream is a sequence of data flowing between source and destination processes. Streaming is widely used for signal, image and video processing for its efficiency in pipelining and effectiveness in reducing demand for memory. The goal of this work is to extend the use of data streams to support both conventional scientific applications and emerging data analytics applications running on HPC platforms. We introduce an extension called MPIStream to the de-facto programming standard on HPC, MPI. MPIStream supports data streams either within a single application or among multiple applications. We present three use cases using MPI streams in HPC applications together with their parallel performance. We show the convenience of using MPI streams to support the needs of both traditional HPC and emerging data analytics applications running on supercomputers.

  • 89.
    Peng, Ivy Bo
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Markidis, Stefano
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Laure, Erwin
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Holmes, D.
    Bull, M.
    A Data streaming model in MPI. 2015. In: Proceedings of the 3rd ExaMPI Workshop at the International Conference on High Performance Computing, Networking, Storage and Analysis, SC 2015, ACM Digital Library, 2015. Conference paper (Refereed)
    Abstract [en]

    The data streaming model is an effective way to tackle the challenge of data-intensive applications. As traditional HPC applications generate large volumes of data and more data-intensive applications move to HPC infrastructures, it is necessary to investigate the feasibility of combining message-passing and streaming programming models. MPI, the de facto standard for programming on HPC, cannot intuitively express the communication pattern and the functional operations required in streaming models. In this work, we designed and implemented a data streaming library MPIStream atop MPI to allocate data producers and consumers, to stream data continuously or irregularly and to process data at run-time. In the same spirit as the STREAM benchmark, we developed a parallel stream benchmark to measure data processing rate. The performance of the library largely depends on the size of the stream element, the number of data producers and consumers and the computational intensity of processing one stream element. With 2,048 data producers and 2,048 data consumers in the parallel benchmark, MPIStream achieved 200 GB/s processing rate on a Blue Gene/Q supercomputer. We illustrate that a streaming library for HPC applications can effectively enable irregular parallel I/O, application monitoring and threshold collective operations. © 2015 ACM.

  • 90.
    Peng, Ivy Bo
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Vencels, Juris
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Lapenta, Giovanni
    Divin, Andrey
    Vaivads, Andris
    Laure, Erwin
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Markidis, Stefano
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Energetic particles in magnetotail reconnection. 2015. In: Journal of Plasma Physics, ISSN 0022-3778, E-ISSN 1469-7807, Vol. 81, article id 325810202. Journal article (Refereed)
    Abstract [en]

    We carried out a 3D fully kinetic simulation of Earth's magnetotail magnetic reconnection to study the dynamics of energetic particles. We developed and implemented a new relativistic particle mover in iPIC3D, an implicit Particle-in-Cell code, to correctly model the dynamics of energetic particles. Before the onset of magnetic reconnection, energetic electrons are found localized close to the current sheet and accelerated by the lower hybrid drift instability. During magnetic reconnection, energetic particles are found in the reconnection region along the x-line and in the separatrix region. The energetic electrons are first present in localized stripes of the separatrices and finally cover all the separatrix surfaces. Along the separatrices, regions with strong electron deceleration are found. In the reconnection region, two categories of electron trajectory are identified. First, part of the electrons are trapped in the reconnection region, bouncing a few times between the outflow jets. Second, part of the electrons pass the reconnection region without being trapped. Unlike electrons, energetic ions are localized on the reconnection fronts of the outflow jets.

  • 91. Restante, A. L.
    et al.
    Markidis, Stefano
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Lapenta, G.
    Intrator, T.
    Geometrical investigation of the kinetic evolution of the magnetic field in a periodic flux rope. 2013. In: Physics of Plasmas, ISSN 1070-664X, E-ISSN 1089-7674, Vol. 20, no. 8, p. 082501-. Journal article (Refereed)
    Abstract [en]

    Flux ropes are bundles of magnetic field wrapped around an axis. Many laboratory, space, and astrophysics processes can be represented using this idealized concept. Here, a massively parallel 3D kinetic simulation of a periodic flux rope undergoing the kink instability is studied. The focus is on the topology of the magnetic field and its geometric structures. The analysis considers various techniques such as Poincaré maps and the quasi-separatrix layer (QSL). These are used to highlight regions with expansion or compression and changes in the connectivity of magnetic field lines and consequently to outline regions where heating and current may be generated due to magnetic reconnection. The present study is, to our knowledge, the first QSL analysis of a fully kinetic 3D particle-in-cell simulation and focuses the existing QSL method of analysis to periodic systems.

  • 92.
    Rivas Gomez, Sergio
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST).
    Markidis, Stefano
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST).
    Laure, Erwin
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Centra, Parallelldatorcentrum, PDC.
    Brabazon, K.
    Perks, O.
    Narasimhamurthy, S.
    Decoupled Strategy for Imbalanced Workloads in MapReduce Frameworks. 2019. In: Proceedings - 20th International Conference on High Performance Computing and Communications, 16th International Conference on Smart City and 4th International Conference on Data Science and Systems, HPCC/SmartCity/DSS 2018, Institute of Electrical and Electronics Engineers (IEEE), 2019, pp. 921-927. Conference paper (Refereed)
    Abstract [en]

    In this work, we consider the integration of MPI one-sided communication and non-blocking I/O in HPC-centric MapReduce frameworks. Using a decoupled strategy, we aim to overlap the Map and Reduce phases of the algorithm by allowing processes to communicate and synchronize using solely one-sided operations. Hence, we effectively increase the performance in situations where the workload per process becomes unexpectedly unbalanced. Using a Word-Count implementation and a large dataset from the Purdue MapReduce Benchmarks Suite (PUMA), we demonstrate that our approach can provide up to 23% performance improvement on average compared to a reference MapReduce implementation that uses state-of-the-art MPI collective communication and I/O.

  • 93.
    Rivas-Gomez, Sergio
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Datavetenskap, Beräkningsvetenskap och beräkningsteknik (CST).
    Fanfarillo, Alessandro
    National Center for Atmospheric Research, Boulder, CO, United States..
    Narasimhamurthy, Sai
    Seagate Syst UK, Havant PO9 1SA, England..
    Markidis, Stefano
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Datavetenskap, Beräkningsvetenskap och beräkningsteknik (CST).
    Persistent Coarrays: Integrating MPI Storage Windows in Coarray Fortran. 2019. In: Proceedings of the 26th European MPI Users' Group Meeting (EuroMPI 2019), ACM Digital Library, 2019, pp. 1-8, article id 3. Conference paper (Refereed)
    Abstract [en]

    The inherent integration of novel hardware and software components on HPC is expected to considerably aggravate the Mean Time Between Failures (MTBF) on scientific applications, while simultaneously increasing the programming complexity of these clusters. In this work, we present the initial steps towards the integration of transparent resilience support inside Coarray Fortran. In particular, we propose persistent coarrays, an extension of OpenCoarrays that integrates MPI storage windows to leverage its transport layer and seamlessly map coarrays to files on storage. Preliminary results indicate that our approach provides clear benefits on representative workloads, while incurring minimal source code changes.

  • 94.
    Rivas-Gomez, Sergio
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Datavetenskap, Beräkningsvetenskap och beräkningsteknik (CST).
    Fanfarillo, Alessandro
    National Center for Atmospheric Research, Boulder, CO, United States..
    Valat, Sebastien
    Atos, 1 Rue de Provence, 38130 Echirolles, France.
    Laferriere, Christophe
    Atos, 1 Rue de Provence, 38130 Echirolles, France.
    Couvee, Philippe
    Atos, 1 Rue de Provence, 38130 Echirolles, France.
    Narasimhamurthy, Sai
    Seagate Syst UK, Havant PO9 1SA, England..
    Markidis, Stefano
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Datavetenskap, Beräkningsvetenskap och beräkningsteknik (CST).
    uMMAP-IO: User-level Memory-mapped I/O for HPC. 2019. In: Proceedings of the 26th IEEE International Conference on High-Performance Computing, Data, and Analytics (HiPC'19), Institute of Electrical and Electronics Engineers (IEEE), 2019. Conference paper (Refereed)
    Abstract [en]

    The integration of local storage technologies alongside traditional parallel file systems on HPC clusters is expected to raise the programming complexity of scientific applications aiming to take advantage of the increased level of heterogeneity. In this work, we present uMMAP-IO, a user-level memory-mapped I/O implementation that simplifies data management on multi-tier storage subsystems. Compared to the memory-mapped I/O mechanism of the OS, our approach features per-allocation configurable settings (e.g., segment size) and transparently enables access to a diverse range of memory and storage technologies, such as the burst buffer I/O accelerators. Preliminary results indicate that uMMAP-IO provides at least 5-10x better performance on representative workloads in comparison with the standard memory-mapped I/O of the OS, and approximately 20-50% degradation on average compared to using conventional memory allocations without storage support up to 8192 processes.

  • 95.
    Rivas-Gomez, Sergio
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST).
    Gioiosa, Roberto
    Oak Ridge Natl Lab, Oak Ridge, TN 37830 USA..
    Peng, Ivy Bo
    Oak Ridge Natl Lab, Oak Ridge, TN 37830 USA..
    Kestor, Gokcen
    Oak Ridge Natl Lab, Oak Ridge, TN 37830 USA..
    Narasimhamurthy, Sai
    Seagate Syst UK, Havant PO9 1SA, England..
    Laure, Erwin
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST).
    Markidis, Stefano
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST).
    MPI windows on storage for HPC applications. 2018. In: Parallel Computing, ISSN 0167-8191, E-ISSN 1872-7336, Vol. 77, pp. 38-56. Journal article (Refereed)
    Abstract [en]

    Upcoming HPC clusters will feature hybrid memories and storage devices per compute node. In this work, we propose to use the MPI one-sided communication model and MPI windows as a unique interface for programming memory and storage. We describe the design and implementation of MPI storage windows, and present its benefits for out-of-core execution, parallel I/O and fault-tolerance. In addition, we explore the integration of heterogeneous window allocations, where memory and storage share a unified virtual address space. When performing large, irregular memory operations, we verify that MPI windows on local storage incur a 55% performance penalty on average. When using a Lustre parallel file system, "asymmetric" performance is observed with over 90% degradation in writing operations. Nonetheless, experimental results of a Distributed Hash Table, the HACC I/O kernel mini-application, and a novel MapReduce implementation based on the use of MPI one-sided communication, indicate that the overall penalty of MPI windows on storage can be negligible in most cases in real-world applications.

  • 96.
    Rivas-Gomez, Sergio
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC).
    Markidis, Stefano
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Peng, Ivy Bo
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Laure, E.
    Kestor, G.
    Gioiosa, R.
    Extending message passing interface windows to storage. 2017. In: Proceedings - 2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, CCGRID 2017, Institute of Electrical and Electronics Engineers Inc., 2017, pp. 728-730. Conference paper (Refereed)
    Abstract [en]

    This paper presents an extension to MPI supporting the one-sided communication model and window allocations in storage. Our design transparently integrates with the current MPI implementations, enabling applications to target MPI windows in storage, memory or both simultaneously, without major modifications. Initial performance results demonstrate that the presented MPI window extension could potentially be helpful for a wide range of use cases, with low overhead.

  • 97.
    Rivas-Gomez, Sergio
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST).
    Pena, A. J.
    Moloney, D.
    Laure, Erwin
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST).
    Markidis, Stefano
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST).
    Exploring the vision processing unit as co-processor for inference. 2018. In: Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018, Institute of Electrical and Electronics Engineers (IEEE), 2018, pp. 589-598, article id 8425465. Conference paper (Refereed)
    Abstract [en]

    The success of the exascale supercomputer is widely argued to depend on novel breakthroughs in technology that effectively reduce the power consumption and thermal dissipation requirements. In this work, we consider the integration of co-processors in high-performance computing (HPC) to enable low-power, seamless computation offloading of certain operations. In particular, we explore the so-called Vision Processing Unit (VPU), a highly-parallel vector processor with a power envelope of less than 1W. We evaluate this chip during inference using a pre-trained GoogLeNet convolutional network model and a large image dataset from the ImageNet ILSVRC challenge. Preliminary results indicate that a multi-VPU configuration provides similar performance compared to reference CPU and GPU implementations, while reducing the thermal-design power (TDP) by up to 8x in comparison.

  • 98.
    Simmendinger, Christian
    et al.
    T Syst Solut Res, Stuttgart, Germany..
    Iakymchuk, Roman
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST).
    Cebamanos, Luis
    Univ Edinburgh, EPCC, Edinburgh, Midlothian, Scotland..
    Akhmetova, Dana
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST).
    Bartsch, Valeria
    Fraunhofer ITWM, HPC Dept, Kaiserslautern, Germany..
    Rotaru, Tiberiu
    Fraunhofer ITWM, Kaiserslautern, Germany..
    Rahn, Mirko
    Fraunhofer ITWM, HPC Dept, Kaiserslautern, Germany..
    Laure, Erwin
    KTH, Centra, SeRC - Swedish e-Science Research Centre. KTH, Skolan för elektroteknik och datavetenskap (EECS), Centra, Parallelldatorcentrum, PDC. KTH Royal Inst Technol, High Performance Comp, Stockholm, Sweden.;KTH Royal Inst Technol, PDC Ctr, High Performance Comp Ctr, Stockholm, Sweden..
    Markidis, Stefano
    KTH, Centra, SeRC - Swedish e-Science Research Centre. KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST). KTH Royal Inst Technol, High Performance Comp, Stockholm, Sweden..
    Interoperability strategies for GASPI and MPI in large-scale scientific applications. 2019. In: The international journal of high performance computing applications, ISSN 1094-3420, E-ISSN 1741-2846, Vol. 33, no. 3, pp. 554-568. Journal article (Refereed)
    Abstract [en]

    One of the main hurdles of partitioned global address space (PGAS) approaches is the dominance of message passing interface (MPI), which as a de facto standard appears in the code base of many applications. To take advantage of the PGAS APIs like global address space programming interface (GASPI) without a major change in the code base, interoperability between MPI and PGAS approaches needs to be ensured. In this article, we consider an interoperable GASPI/MPI implementation for the communication/performance crucial parts of the Ludwig and iPIC3D applications. To address the discovered performance limitations, we develop a novel strategy for significantly improved performance and interoperability between both APIs by leveraging GASPI shared windows and shared notifications. First results with a corresponding implementation in the MiniGhost proxy application and the Allreduce collective operation demonstrate the viability of this approach.

  • 99.
    Sishtla, Chaitanya Prasad
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Datavetenskap, Beräkningsvetenskap och beräkningsteknik (CST).
    Divin, Andrey
    St Petersburg State Univ, Dept Phys, St Petersburg 198504, Russia..
    Deca, Jan
    Univ Colorado, LASP, Boulder, CO 80303 USA.;NASA, Inst Modeling Plasma Atmospheres & Cosm Dust, SSERVI, Moffett Field, CA 94035 USA..
    Olshevsky, Viacheslav
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Datavetenskap, Beräkningsvetenskap och beräkningsteknik (CST).
    Markidis, Stefano
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Datavetenskap, Beräkningsvetenskap och beräkningsteknik (CST).
    Electron trapping in the coma of a weakly outgassing comet. 2019. In: Physics of Plasmas, ISSN 1070-664X, E-ISSN 1089-7674, Vol. 26, no. 10, article id 102904. Journal article (Refereed)
    Abstract [en]

    Measurements from the Rosetta mission have shown a multitude of nonthermal electron distributions in the cometary environment, challenging the previously assumed plasma interaction mechanisms near a cometary nucleus. In this paper, we discuss electron trapping near a weakly outgassing comet from a fully kinetic (particle-in-cell) perspective. Using the electromagnetic fields derived from the simulation, we characterize the trajectories of trapped electrons in the potential well surrounding the cometary nucleus and identify the distinguishing features in their respective velocity and pitch angle distributions. Our analysis allows us to define a clear boundary in velocity phase space between the distributions of trapped and passing electrons. Published under license by AIP Publishing.

  • 100.
    Sishtla, Chaitanya Prasad
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Datavetenskap, Beräkningsvetenskap och beräkningsteknik (CST).
    Olshevsky, Viacheslav
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Datavetenskap, Beräkningsvetenskap och beräkningsteknik (CST).
    Chien, Wei Der
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Datavetenskap, Beräkningsvetenskap och beräkningsteknik (CST).
    Markidis, Stefano
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Datavetenskap, Beräkningsvetenskap och beräkningsteknik (CST).
    Laure, Erwin
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Centra, Parallelldatorcentrum, PDC.
    Particle-in-Cell Simulations of Plasma Dynamics in Cometary Environment. 2019. In: Journal of Physics: Conference Series, Institute of Physics Publishing (IOPP), 2019, Vol. 1225, no. 1, article id 012009. Conference paper (Refereed)
    Abstract [en]

    We perform and analyze global Particle-in-Cell (PIC) simulations of the interaction between solar wind and an outgassing comet with the goal of studying the plasma kinetic dynamics of a cometary environment. To achieve this, we design and implement a new numerical method in the iPIC3D code to model outgassing from the comet: new plasma particles are ejected from the comet "surface" at each computational cycle. Our simulations show that a bow shock is formed as a result of the interaction between solar wind and outgassed particles. The analysis of distribution functions for the PIC simulations shows that, at the bow shock, part of the incoming solar wind ions are reflected while electrons are heated. This work attempts to reveal kinetic effects in the atmosphere of an outgassing comet using a fully kinetic Particle-in-Cell model.
