An important challenge in accelerated linear algebra compilers such as XLA is multi-pass optimization and analysis. Recent work has focused chiefly on target-dependent XLA optimization at the graph, subgraph, and kernel levels. We instead focus on target-independent optimization pass ordering for XLA HLO, that is, the problem of finding the optimal sequence of compiler optimization passes. There has been little domain-specific study of pass ordering for XLA HLO. To this end, we propose a deep Reinforcement Learning (RL) based search for the optimal XLA HLO pass ordering. We also propose enhancements to the deep RL algorithms that further improve search performance and open a research direction in domain-specific guidance for RL. We create XLA Gym, an experimentation framework that lets RL algorithms interact with the compiler by applying optimization passes and thereby train agents. In our experiments, the proposed approach improves operation count reduction over the compiler's default phase ordering by an average of 13.3% on a benchmark of GPT-2 training graphs and by 10.4% on a diverse benchmark including GPT-2, BERT, and ResNet graphs.
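
To make the pass-ordering formulation concrete, here is a minimal sketch of a Gym-style environment in which each action applies one optimization pass and the reward is the reduction in operation count. It is not the XLA Gym API: the class `PassOrderEnv`, the toy passes `dce`, `simplify`, and `cse`, and the list-of-strings "graph" are all hypothetical stand-ins, and plain random search is swapped in for the deep RL agents described above.

```python
import random
from typing import Callable, List, Tuple

# Toy stand-in for an HLO graph: a flat list of op names (assumption;
# real HLO is a dataflow graph, not a list).
Graph = List[str]


def dce(g: Graph) -> Graph:
    """Hypothetical dead-code elimination: drop ops marked 'dead'."""
    return [op for op in g if op != "dead"]


def simplify(g: Graph) -> Graph:
    """Hypothetical algebraic simplifier: 'mul1' (x * 1) becomes dead."""
    return ["dead" if op == "mul1" else op for op in g]


def cse(g: Graph) -> Graph:
    """Hypothetical CSE: collapse runs of adjacent identical ops."""
    out: Graph = []
    for op in g:
        if not out or out[-1] != op:
            out.append(op)
    return out


class PassOrderEnv:
    """Gym-style environment: action = pass index, reward = ops removed."""

    PASSES: List[Callable[[Graph], Graph]] = [dce, simplify, cse]

    def __init__(self, graph: Graph, max_steps: int = 4) -> None:
        self.initial = list(graph)
        self.max_steps = max_steps
        self.graph: Graph = list(graph)
        self.steps = 0

    def reset(self) -> int:
        """Restore the unoptimized graph; the observation is its op count."""
        self.graph = list(self.initial)
        self.steps = 0
        return len(self.graph)

    def step(self, action: int) -> Tuple[int, int, bool]:
        """Apply one pass and return (op count, reward, done)."""
        before = len(self.graph)
        self.graph = self.PASSES[action](self.graph)
        self.steps += 1
        reward = before - len(self.graph)
        return len(self.graph), reward, self.steps >= self.max_steps


def random_search(env: PassOrderEnv, episodes: int = 200,
                  seed: int = 0) -> Tuple[List[int], int]:
    """Baseline (not RL): sample pass sequences, keep the lowest op count."""
    rng = random.Random(seed)
    best_seq: List[int] = []
    best_count = env.reset()
    for _ in range(episodes):
        count, seq, done = env.reset(), [], False
        while not done:
            action = rng.randrange(len(env.PASSES))
            count, _, done = env.step(action)
            seq.append(action)
        if count < best_count:
            best_seq, best_count = seq, count
    return best_seq, best_count


env = PassOrderEnv(["add", "mul1", "add", "add", "mul1"])
sequence, op_count = random_search(env)
print("pass sequence:", sequence, "final op count:", op_count)
```

Even in this toy setting the order matters: running `simplify` before `dce` removes the `mul1` ops, while the reverse order leaves dead ops behind. A learned policy would replace `random_search` and choose each action from an observation of the current graph.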

Overall algorithm:

Results on benchmark including GPT-2, BERT, and ResNet models:


  title = {Target-independent XLA optimization using Reinforcement Learning},
  author = {Ganai, Milan and Li, Haichen and Enns, Theodore and Wang, Yida and Huang, Randy},
  maintitle = {Neural Information Processing Systems 2022},
  booktitle = {Workshop on Machine Learning for Systems},