In this paper, we study the problem of scheduling parallel loops at compile-time for a heterogeneous network of machines. We consider heterogeneity in three aspects of parallel programming: program, processor and network. A heterogeneous program has parallel loops with different amount of work in each iteration; heterogeneous processors have different speeds; and a heterogeneous network has different cost of communication between processors.
We propose a simple yet comprehensive model for use in compiling for a network of processors, and develop compiler algorithms for generating optimal and sub-optimal schedules of loops for load balancing, communication optimizations and network contention. Experiments show that a significant improvement of performance is achieved using our techniques.
Our www address is http://www.cs.rochester.edu/u/wei