Multiplication of Matrices of Arbitrary Shape on a Data Parallel Computer
1994 (English). In: Parallel Computing, ISSN 0167-8191, Vol. 20, no. 7, pp. 919-951. Article in journal (Refereed). Published.
Some level-2 and level-3 Distributed Basic Linear Algebra Subroutines (DBLAS) that have been implemented on the Connection Machine system CM-200 are described. No assumptions are made about the shape or size of the operands. For matrix-matrix multiplication, both nonsystolic and systolic algorithms are outlined. A systolic algorithm that computes the product matrix in-place is described in detail. We show that a level-3 DBLAS yields better performance than a level-2 DBLAS. On the Connection Machine system CM-200, blocking improves performance by a factor of up to three over level-2 DBLAS. For certain matrix shapes, the systolic algorithms offer both improved performance and significantly reduced temporary storage requirements compared to the nonsystolic block algorithms.
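As an illustration of what a systolic, in-place product computation can look like, the sketch below simulates Cannon's algorithm, a well-known systolic scheme in which the product matrix stays stationary while the two operands circulate. It is an assumption that this resembles the algorithm described in the article; the abstract does not give its details. The function name `cannon_matmul` is hypothetical, and each grid cell models one processor holding a single matrix element rather than a block.

```python
def cannon_matmul(a, b):
    """Multiply two q x q matrices with Cannon's systolic scheme.

    Illustrative sketch only: cell (i, j) models one processor holding one
    element each of A, B and C. C never moves; A rotates left and B rotates
    up by one cell per step.
    """
    q = len(a)
    # Initial skew: row i of A is rotated left by i, column j of B up by j,
    # so that cell (i, j) starts with a[i][k] and b[k][j] for k = (i + j) mod q.
    a_loc = [[a[i][(j + i) % q] for j in range(q)] for i in range(q)]
    b_loc = [[b[(i + j) % q][j] for j in range(q)] for i in range(q)]
    c = [[0] * q for _ in range(q)]

    for _ in range(q):
        # Local multiply-accumulate in every cell; C is updated in place.
        for i in range(q):
            for j in range(q):
                c[i][j] += a_loc[i][j] * b_loc[i][j]
        # Nearest-neighbour communication: shift A left, shift B up (cyclic).
        a_loc = [[a_loc[i][(j + 1) % q] for j in range(q)] for i in range(q)]
        b_loc = [[b_loc[(i + 1) % q][j] for j in range(q)] for i in range(q)]
    return c
```

In this sketch the only temporary storage per cell is one element of each moving operand, which hints at why a systolic formulation can need less temporary storage than a scheme that gathers whole rows or columns before multiplying.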
We show that, in order to minimize the communication time, an algorithm that leaves the largest operand matrix stationary should be chosen for matrix-matrix multiplication. Furthermore, it is shown both analytically and experimentally that the optimum shape of the processor array yields square stationary submatrices in each processor, i.e. the ratio between the lengths of the axes of the processor array must equal the ratio between the lengths of the corresponding axes of the stationary matrix. The optimum processor array shape may yield a factor of five performance enhancement for the multiplication of square matrices. For rectangular matrices, a factor of 30 improvement was observed for an optimum processor array shape compared to a poorly chosen one.
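The shape rule above can be sketched as a small search: among all factorizations of the processor count into a two-dimensional array, pick the one whose per-processor submatrix is closest to square. This is a minimal sketch under stated assumptions, not the article's procedure; the function name `best_grid_shape` and the use of the local aspect ratio as the selection criterion are hypothetical, and uneven division of the matrix across processors is ignored.

```python
def best_grid_shape(rows, cols, num_procs):
    """Return (p_rows, p_cols) with p_rows * p_cols == num_procs such that
    the local block (rows / p_rows by cols / p_cols) is as square as possible.

    Hypothetical helper: `rows` and `cols` are the dimensions of the
    stationary matrix, `num_procs` is the total number of processors.
    """
    best = None
    for p_rows in range(1, num_procs + 1):
        if num_procs % p_rows:
            continue  # only exact factorizations form a valid 2D array
        p_cols = num_procs // p_rows
        local_r = rows / p_rows
        local_c = cols / p_cols
        # Aspect ratio of the local block; 1.0 means perfectly square.
        aspect = max(local_r, local_c) / min(local_r, local_c)
        if best is None or aspect < best[0]:
            best = (aspect, p_rows, p_cols)
    return best[1], best[2]
```

For example, a 4096 x 256 stationary matrix on 64 processors gives a 32 x 2 processor array, so each processor holds a 128 x 128 block and the array's axis ratio (16:1) matches that of the matrix.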
Keywords: Linear algebra; Matrix multiplication; Nonsystolic algorithm; Systolic algorithm; Distributed BLAS; Connection Machine CM-200; Performance results
Computer and Information Science
Identifiers: URN: urn:nbn:se:kth:diva-90990; DOI: 10.1016/0167-8191(94)90011-6; OAI: oai:DiVA.org:kth-90990; DiVA: diva2:507641
NR 20140805; 2012-03-05; 2012-03-05; Bibliographically approved