In recent years, gains in processor performance have come mainly from multicore designs, in which a processor is divided into several cores that operate independently and in parallel within a shared address space. This paper presents a new method for programming dense linear algebra algorithms that achieves better performance on these multicore architectures than the traditional approach of relying on libraries such as the Linear Algebra Package (LAPACK).