Scheduling for numerical linear algebra library at scale

Jakub Kurzak*, Hatem Ltaief, Jack J. Dongarra, Rosa M. Badia

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

2 Scopus citations


State-of-the-art dense linear algebra software, such as the LAPACK and ScaLAPACK libraries, suffers performance losses on multicore processors because it cannot fully exploit thread-level parallelism. At the same time, the coarse-grain dataflow model is gaining popularity as a paradigm for programming multicore architectures. This work looks at implementing classic dense linear algebra workloads, Cholesky factorization and QR factorization, using dynamic data-driven execution. Two emerging approaches to implementing coarse-grain dataflow are examined: the model of nested parallelism, represented by the Cilk framework, and the model of parallelism expressed through an arbitrary Directed Acyclic Graph, represented by the SMP Superscalar framework. Performance and coding effort are analyzed and compared against code manually parallelized at the thread level.
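To make the idea of data-driven execution concrete, the following is a minimal Python sketch of how a task graph for a tile Cholesky factorization could be derived from the tiles each task reads and writes. It is an illustration only, not the paper's code and not the Cilk or SMP Superscalar API: the function names (`cholesky_tasks`, `build_dag`, `schedule_waves`) are hypothetical, the tasks are labeled with the standard LAPACK/BLAS kernel names (POTRF, TRSM, SYRK, GEMM) but perform no numerical work, and the "waves" are an idealized schedule assuming unlimited cores and unit task cost.

```python
from collections import defaultdict


def cholesky_tasks(nt):
    """List the tasks of a tile Cholesky on an nt x nt tile matrix.

    Each task is (name, tiles_read, tiles_updated), in sequential order.
    Tiles are (row, col) indices in the lower triangle.
    """
    tasks = []
    for k in range(nt):
        tasks.append((f"POTRF({k})", [], [(k, k)]))
        for i in range(k + 1, nt):
            tasks.append((f"TRSM({i},{k})", [(k, k)], [(i, k)]))
        for i in range(k + 1, nt):
            tasks.append((f"SYRK({i},{k})", [(i, k)], [(i, i)]))
            for j in range(k + 1, i):
                tasks.append((f"GEMM({i},{j},{k})", [(i, k), (j, k)], [(i, j)]))
    return tasks


def build_dag(tasks):
    """Infer dependencies from data accesses (illustrative, superscalar-style).

    Task t depends on the last writer of every tile it touches, and on
    earlier readers of every tile it overwrites.
    """
    last_writer = {}
    readers = defaultdict(list)
    deps = []
    for t, (_, reads, writes) in enumerate(tasks):
        d = set()
        for tile in reads + writes:
            if tile in last_writer:
                d.add(last_writer[tile])
        for tile in writes:
            d.update(readers[tile])
        deps.append(d)
        for tile in reads:
            readers[tile].append(t)
        for tile in writes:
            last_writer[tile] = t
            readers[tile] = []
    return deps


def schedule_waves(tasks, deps):
    """Group tasks into waves; tasks within one wave are mutually independent."""
    level = []
    for t in range(len(tasks)):
        level.append(1 + max((level[p] for p in deps[t]), default=-1))
    waves = defaultdict(list)
    for t, lv in enumerate(level):
        waves[lv].append(tasks[t][0])
    return [waves[lv] for lv in sorted(waves)]


if __name__ == "__main__":
    tasks = cholesky_tasks(3)
    for step, wave in enumerate(schedule_waves(tasks, build_dag(tasks))):
        print(step, wave)
```

In this sketch the programmer writes only the sequential loop nest in `cholesky_tasks`; the parallelism falls out of the declared tile accesses, which is the general flavor of the DAG-based model the abstract describes. A real runtime would dispatch ready tasks dynamically to worker threads rather than precompute waves.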

Original language: English (US)
Title of host publication: High Speed and Large Scale Scientific Computing
Publisher: IOS Press BV
Number of pages: 24
ISBN (Print): 9781607500735
State: Published - 2009
Externally published: Yes

Publication series

Name: Advances in Parallel Computing
ISSN (Print): 0927-5452


  • Cholesky
  • QR
  • linear algebra
  • matrix factorization
  • multicore
  • scheduling
  • task graph

ASJC Scopus subject areas

  • General Computer Science

