Publications

2024

  1. AI Metropolis: Scaling Large Language Model-based Multi-Agent Simulation with Out-of-order Execution
    Xie, Zhiqiang, Kang, Hao, Sheng, Ying, Krishna, Tushar, Fatahalian, Kayvon, and Kozyrakis, Christos
    2024
  2. Cloud Atlas: Efficient Fault Localization for Cloud Systems using Language Models and Causal Insight
    Xie, Zhiqiang, Zheng, Yujia, Ottens, Lizi, Zhang, Kun, Kozyrakis, Christos, and Mace, Jonathan
    2024
  3. SGLang: Efficient Execution of Structured Language Model Programs
    Zheng, Lianmin, Yin, Liangsheng, Xie, Zhiqiang, Sun, Chuyue, Huang, Jeff, Yu, Cody Hao, Cao, Shiyi, Kozyrakis, Christos, Stoica, Ion, Gonzalez, Joseph E., Barrett, Clark, and Sheng, Ying
    2024
  4. High-throughput and Flexible Host Networking via Control and Data Path Physical Separation
    Skiadopoulos, Athinagoras, Xie, Zhiqiang, Zhao, Mark, Cai, Qizhe, Agarwal, Saksham, Adelmann, Jacob, Ahern, David, Contavalli, Carlo, Goldflam, Michael, Mayatskikh, Vitaly, Raja, Raghu, Walton, Daniel, Agarwal, Rachit, Mukherjee, Shrijeet, and Kozyrakis, Christos
    OSDI 2024

2023

  1. FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU
    Sheng, Ying, Zheng, Lianmin, Yuan, Binhang, Li, Zhuohan, Ryabinin, Max, Fu, Daniel Y., Xie, Zhiqiang, Chen, Beidi, Barrett, Clark, Gonzalez, Joseph E., Liang, Percy, Ré, Christopher, Stoica, Ion, and Zhang, Ce
    2023
  2. An Extensible, Data-Oriented Architecture for High-Performance, Many-World Simulation
    Shacklett, Brennan, Rosenzweig, Luc Guy, Xie, Zhiqiang, Sarkar, Bidipta, Szot, Andrew, Wijmans, Erik, Koltun, Vladlen, Batra, Dhruv, and Fatahalian, Kayvon
    ACM Trans. Graph. 2023
  3. The Benefit of Hindsight: Tracing Edge-Cases in Distributed Systems
    Zhang, Lei, Xie, Zhiqiang, Anand, Vaastav, Vigfusson, Ymir, and Mace, Jonathan
    NSDI 2023
  4. The Odd One Out: Energy is Not Like Other Metrics
    Anand, Vaastav, Xie, Zhiqiang, Stolet, Matheus, De Viti, Roberta, Davidson, Thomas, Karimipour, Reyhaneh, Alzayat, Safya, and Mace, Jonathan
    SIGENERGY Energy Inform. Rev. 2023

2022

  1. Efficient Flow Scheduling in Distributed Deep Learning Training with Echelon Formation
    Pan, Rui, Lei, Yiming, Li, Jialong, Xie, Zhiqiang, Yuan, Binhang, and Xia, Yiting
    HotNets 2022
  2. Graphiler: Optimizing Graph Neural Networks with Message Passing Data Flow Graph
    Xie, Zhiqiang, Wang, Minjie, Ye, Zihao, Zhang, Zheng, and Fan, Rui
    MLSys 2022

2021

  1. Dual-side Sparse Tensor Core
    Wang, Yang, Zhang, Chen, Xie, Zhiqiang, Guo, Cong, Liu, Yunxin, and Leng, Jingwen
    ISCA 2021

2020

  1. Rammer: Enabling Holistic Deep Learning Compiler Optimizations with rTasks
    Ma, Lingxiao*, Xie, Zhiqiang*, Yang, Zhi, Xue, Jilong, Miao, Youshan, Cui, Wei, Hu, Wenxiang, Yang, Fan, Zhang, Lintao, and Zhou, Lidong
    OSDI 2020, *equal contributions