This paper presents an instruction scheduling and cluster assignment approach for clustered very long instruction word (VLIW) processors. The technique produces high-performance code by simultaneously balancing instructions among clusters and minimizing inter-cluster data communication. The scheme is evaluated on benchmarks extracted from UTDSP. Results show speed-ups of up to 44% over previously used techniques, with average speed-ups ranging from 14% (2-cluster) to 18% (4-cluster).
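In multi-cluster processors, the inter-cluster data communication that instructions incur has become a key factor limiting processor performance. To address this problem, a backtracking rescheduling optimization pass is proposed that runs after conventional scheduling. The pass reduces the amount of inter-cluster data communication, improves the compiler's utilization of the processor, and lowers the power consumed when the compiler-generated instruction sequence executes. Experimental results show that, compared with the list scheduling algorithm, the method reduces inter-cluster data communication by 44.36% on average and the execution time of the scheduled instructions by 24.93% on average. Compared with the UAS (unified assign and schedule) algorithm, it reduces inter-cluster data communication by 31.25% on average and execution time by 14.62% on average.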
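To make the two quantities discussed above concrete, the following is a minimal sketch of how inter-cluster communication and cluster load balance might be measured for a given cluster assignment, together with a naive greedy post-pass that moves instructions between clusters. It is not the algorithm of either paper: the function names, the edge-list representation of the dependence graph, and the `max_imbalance` bound are all assumptions made for illustration, and the sketch ignores instruction latencies and the schedule itself.

```python
from collections import Counter


def inter_cluster_transfers(edges, cluster_of):
    """Count dependence edges whose producer and consumer are on different clusters."""
    return sum(1 for src, dst in edges if cluster_of[src] != cluster_of[dst])


def load_imbalance(cluster_of, num_clusters):
    """Instruction-count gap between the most and least loaded cluster."""
    counts = Counter(cluster_of.values())
    loads = [counts.get(c, 0) for c in range(num_clusters)]
    return max(loads) - min(loads)


def greedy_repass(edges, cluster_of, num_clusters, max_imbalance):
    """Hypothetical post-pass (an assumption, not the papers' method).

    Repeatedly moves a single instruction to another cluster when the move
    strictly lowers the transfer count and keeps the load imbalance within
    max_imbalance. Terminates because the transfer count strictly decreases.
    """
    assignment = dict(cluster_of)
    improved = True
    while improved:
        improved = False
        for instr in sorted(assignment):
            best_cost = inter_cluster_transfers(edges, assignment)
            best_cluster = assignment[instr]
            for candidate in range(num_clusters):
                if candidate == best_cluster:
                    continue
                assignment[instr] = candidate
                cost = inter_cluster_transfers(edges, assignment)
                balanced = load_imbalance(assignment, num_clusters) <= max_imbalance
                if cost < best_cost and balanced:
                    best_cost, best_cluster = cost, candidate
                    improved = True
            # Keep the best placement found for this instruction.
            assignment[instr] = best_cluster
    return assignment


# Toy dependence graph: a and b feed c, and c feeds d.
edges = [("a", "c"), ("b", "c"), ("c", "d")]
initial = {"a": 0, "b": 1, "c": 1, "d": 0}
result = greedy_repass(edges, initial, num_clusters=2, max_imbalance=2)
print(inter_cluster_transfers(edges, initial), inter_cluster_transfers(edges, result))
```

In this toy case the post-pass moves `a` next to its consumer `c`, which removes one transfer at the cost of a less even load. Moving `d` as well would remove the last transfer but is rejected by the balance bound, which illustrates the tension between balancing instructions and minimizing communication.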