Parallel three-dimensional nonequispaced fast Fourier transforms and their application to particle simulation Starting from an approved serial algorithm, we develop a new parallel algorithm for calculating nonequispaced fast Fourier transforms on massively parallel distributed memory architectures. We demonstrate how to deal with the inherent load imbalance of the serial algorithm due to the use of oversampled FFT. This algorithm has been implemented in a new open source software library called PNFFT. Furthermore, we derive a new parallel distributed memory algorithm for the fast computation of fully Coulomb interactions in a charged particle system with nonperiodic boundary conditions based on a particle-mesh approximation scheme. We show that an appropriate adjustment of the underlying parallel nonequispaced fast Fourier transform circumvents severe load imbalance due to particle scaling. To prove the high scalability of our algorithms we provide performance results on a BlueGene/P system using up to 65536 cores.