To further reduce computing time, this paper presents an improved algorithm for NSFOT. By introducing simple preprocessing or post-processing operations, the Haar and Walsh transforms can be carried out conveniently on a multiprocessor. One large problem is thereby divided into several smaller sub-problems, so that the load on each processor is not only greatly reduced but also evenly balanced, which saves considerable time. Both theoretical analysis and experimental results demonstrate the effectiveness of the proposed approach.
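
As an illustration of the divide-into-sub-problems idea, the following minimal sketch shows one way a Walsh-Hadamard transform of size N can be split into two independent transforms of size N/2 by a single preprocessing butterfly. It relies only on the standard Sylvester recursion H_2n = [[H_n, H_n], [H_n, -H_n]] and is not necessarily the decomposition used in this paper; the function names `fwht` and `fwht_split` are hypothetical, the transform is unnormalized and in natural (Hadamard) order, and the input length is assumed to be a power of two.

```python
def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform in natural (Hadamard) order.

    Assumes len(x) is a power of two.
    """
    y = list(x)
    n = len(y)
    h = 1
    while h < n:
        # One butterfly stage: combine elements that are h apart.
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = y[j], y[j + h]
                y[j], y[j + h] = a + b, a - b
        h *= 2
    return y


def fwht_split(x):
    """Same transform, computed as preprocessing plus two half-size sub-problems.

    Assumes len(x) is a power of two and at least 2.
    """
    half = len(x) // 2
    a, b = x[:half], x[half:]
    # Preprocessing: one butterfly across the two halves.
    s = [u + v for u, v in zip(a, b)]
    d = [u - v for u, v in zip(a, b)]
    # The two half-size transforms share no data, so each could be
    # assigned to a different processor (run sequentially here).
    return fwht(s) + fwht(d)


if __name__ == "__main__":
    x = [1, 0, 1, 0, 0, 1, 1, 0]
    print(fwht(x) == fwht_split(x))
```

Because the two half-size transforms are independent and of equal size, the work divides evenly between two processors; applying the same split recursively would give four, eight, or more equal sub-problems.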