- Referenced in 59 articles
- parallel implementation of linear algebra algorithms and applications on distributed memory supercomputers such ... natural approach to encoding so-called blocked algorithms, which achieve high performance by operating ... centric approach to data distribution, sets PLAPACK apart from other parallel linear algebra libraries, allowing...
- Referenced in 40 articles
- including evolutionary algorithms (EA), local searches (LS), the most common parallel and distributed models...
- Referenced in 227 articles
- systems of equations on shared-memory and distributed-memory multiprocessors. The solver has been ... Parallel on SMPs and Cluster of SMPs. Automatic combination of iterative and direct solver algorithms...
- Referenced in 62 articles
- include the distributed reduction of single topologies on multiple processor cores. The parallel reduction ... system. Fast graph and matroid based algorithms allow for the identification of equivalent topologies...
- Referenced in 11 articles
- PUMMA: Parallel universal matrix multiplication algorithms on distributed memory concurrent computers. The paper describes Parallel ... Universal Matrix Multiplication Algorithms (PUMMA) on distributed memory concurrent computers. The PUMMA package includes ... block cyclic data distribution. The routines perform efficiently for a wide range of processor configurations ... BLAS routine xGEMM. Details of the parallel implementation of the routines are given, and results...
- Referenced in 42 articles
- Processing (DSP) algorithms, in particular fast transform algorithms such as the fast Fourier transform. SPIRAL ... platforms including SSE, multicore, Cell, GPU, distributed memory parallel processors, and FPGA, and has produced ... some of the fastest implementations of these algorithms on these platforms (SPIRAL is used...
- Referenced in 7 articles
- parallel algorithm for calculating nonequispaced fast Fourier transforms on massively parallel distributed memory architectures ... serial algorithm due to the use of oversampled FFT. This algorithm has been implemented ... Furthermore, we derive a new parallel distributed memory algorithm for the fast computation of fully ... that an appropriate adjustment of the underlying parallel nonequispaced fast Fourier transform circumvents severe load...
- Referenced in 48 articles
- modeling and simulation of entities in parallel and distributed computing (PDC) systems: users, applications, resources ... schedulers) for design and evaluation of scheduling algorithms. It provides a comprehensive facility for creating...
- Referenced in 35 articles
- massively parallel supercomputers with distributed memory. While both versions use a tree algorithm to compute ... this study, we detail the numerical algorithms employed, and show various tests of the code ... release both the serial and the massively parallel version of the code...
- Referenced in 8 articles
- High-Performance Parallel and Distributed Graph Computation. The Parallel BGL builds on the Boost Graph ... offering similar data structures, algorithms, and syntax for distributed, parallel computation that the BGL offers ... both experimentation with and comparison of parallel graph algorithms and to provide solid implementations...
- Referenced in 12 articles
- symmetric systems, real, parallel on distributed-memory clusters, combinatorial graph algorithms...
- Referenced in 17 articles
- between processors. A parallel version of a sequential importance sampling solution algorithm based on local ... continuous distribution of possible realisations. It utilises the parallel nested Benders algorithm and a parallel...
- Referenced in 8 articles
- matrices and the pathological eigenvalue distribution challenge the efficiency and robustness of the solver ... article, we present a parallel eigenvalue algorithm based on distributed spectrum slicing. We describe...
- Referenced in 60 articles
- include the null message and conditional event algorithms. The paper describes the GloMoSim library, addresses ... parallelization, and presents a set of experimental results on the IBM 9076 SP, a distributed...
- Referenced in 24 articles
- parallel applications over increasingly large sets of distributed resources. Consequently, the study of scheduling algorithms...
- Referenced in 5 articles
- closed queueing networks. A parallel distribution analysis by chain algorithm (PDAC) is presented ... class queueing networks. The PDAC algorithm uses data parallel computation of the summation indices needed...
- Referenced in 13 articles
- fast Fourier transforms (FFTs) on massively parallel, distributed memory architectures based on the message passing ... Similar to established transpose FFT algorithms, we propose a parallel FFT framework that is based ... propose an algorithm to calculate pruned FFTs more efficiently on distributed memory architectures. For example...
- Referenced in 4 articles
- Choices also include different automatic parallelization techniques, data distributions, algorithmic parameters, transformations, and blocking...
- Referenced in 5 articles
- been much research on checkpointing algorithms for parallel and distributed systems; but surprisingly few implementations ... library call. It implements three consistent checkpointing algorithms, two optimizations to reduce checkpoint time...
- Referenced in 16 articles
- efficient distributed memory algorithms makes it impractical to rewrite programs for every new parallel machine...