Google AI’s ArchGym brings machine learning and architecture simulators together, enabling researchers to overcome major obstacles in studying architecture. Discover how this groundbreaking open source gymnasium provides a unified framework, addresses challenges, and fosters innovation in computer architecture research.
Computer architecture research has a rich history of developing simulators and tools to shape the design of computer systems. However, incorporating machine learning (ML) techniques into architecture studies has presented significant challenges. Google AI’s latest innovation, ArchGym, aims to overcome these obstacles by seamlessly integrating ML algorithms with architecture simulators, revolutionising computer architecture research.
- One of the major challenges in studying architecture with machine learning is the absence of a systematic method to determine the optimal ML algorithm and hyperparameters for specific computer architecture problems.
- Furthermore, computer architecture simulators are central to architectural progress, but they must balance accuracy, speed, and cost during exploration. Performance estimates can differ widely depending on the underlying model, for example a cycle-accurate simulator versus an ML-based proxy model.
- Additionally, commercial licensing restrictions can limit the frequency of simulator usage for data collection. These limitations impact the choice of optimisation algorithm for design exploration, considering the trade-offs between performance and sample efficiency.
- The ever-evolving landscape of ML algorithms also poses challenges. Some ML algorithms need large amounts of data to be useful, and drawing insights about the design space requires turning the exploration output into meaningful artefacts, such as datasets.
In response to these challenges, ArchGym provides a standardised and unified framework for evaluating ML-based search algorithms consistently.
ArchGym comprises two main components:
The ArchGym environment
The ArchGym environment encapsulates the architecture cost model and the desired workload(s), so that the cost of running those workloads can be computed for a given set of architectural parameters.
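To make the idea of an environment concrete, here is a minimal sketch in Python. It is not ArchGym's actual API: the names `Workload`, `toy_cost_model`, and `ToyEnvironment`, the parameters `num_cores` and `bytes_per_cycle`, and the cost formula are all hypothetical, chosen only to show a cost model and a set of workloads wrapped behind a single evaluation call.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical workload description (illustrative only)."""
    operations: int   # total arithmetic operations to execute
    bytes_moved: int  # total bytes transferred to or from memory


def toy_cost_model(params: dict, workload: Workload) -> float:
    """Made-up analytic cost model: latency in cycles.

    A real environment would call a simulator here; this formula simply
    takes the slower of the compute-bound and memory-bound estimates.
    """
    compute_cycles = workload.operations / params["num_cores"]
    memory_cycles = workload.bytes_moved / params["bytes_per_cycle"]
    return max(compute_cycles, memory_cycles)


class ToyEnvironment:
    """Bundles a cost model with the workloads it should be evaluated on."""

    def __init__(self, cost_model, workloads):
        self.cost_model = cost_model
        self.workloads = list(workloads)

    def evaluate(self, params: dict) -> float:
        # Total cost of running every workload on the proposed design.
        return sum(self.cost_model(params, w) for w in self.workloads)


env = ToyEnvironment(toy_cost_model, [Workload(operations=1000, bytes_moved=400)])
cost = env.evaluate({"num_cores": 4, "bytes_per_cycle": 8})
print(cost)
```

Swapping `toy_cost_model` for a different function changes the cost estimates without touching the code that drives the search, which is the kind of separation the environment is meant to provide.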
The ArchGym agent
The ArchGym agent incorporates the hyperparameters and policies that guide the ML algorithm during the search process. The choice of hyperparameters significantly affects the optimisation results, while the policy dictates how the agent selects parameters to optimise the objective over time.
ArchGym connects the agent and the environment through a standardised interface, which gives the two components a reliable channel of communication. The interface relies on three primary signals:
- Hardware state
- Hardware parameters
- Metrics
These signals enable the agent to observe the state of the hardware and suggest hardware parameters that maximise a user-specified reward. The reward is a function of various measures of hardware efficiency.
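The loop between agent and environment can be sketched as follows. This is an illustrative toy rather than ArchGym's real interface: it assumes the signals exchanged are a hardware state, a set of hardware parameters, and a dictionary of metrics. The class names, the `reset`/`step`/`propose` methods, the design space, and the cost formulas are all hypothetical. The agent here is a plain random search, whose only hyperparameters are the seed and the number of steps.

```python
import random


class ToySimulatorEnv:
    """Stand-in for a simulator; the formulas below are made up."""

    def reset(self):
        # Hardware state before any design has been evaluated.
        return {"latency": None, "energy": None}

    def step(self, params):
        # Metrics produced by the (toy) cost model for the proposed design.
        latency = 1000.0 / params["num_cores"] + 50.0 / params["cache_kb"]
        energy = 0.5 * params["num_cores"] + 0.01 * params["cache_kb"]
        metrics = {"latency": latency, "energy": energy}
        state = dict(metrics)  # here the state simply mirrors the last metrics
        return state, metrics


def reward_fn(metrics):
    # User-specified reward: a lower latency-energy product scores higher.
    return -(metrics["latency"] * metrics["energy"])


class RandomSearchAgent:
    """Policy: sample each parameter uniformly, ignoring the observed state."""

    def __init__(self, space, seed=0):
        self.space = space
        self.rng = random.Random(seed)  # the seed is a hyperparameter

    def propose(self, state):
        return {name: self.rng.choice(values) for name, values in self.space.items()}


def run_search(env, agent, num_steps):
    state = env.reset()
    best_params, best_reward = None, float("-inf")
    for _ in range(num_steps):
        params = agent.propose(state)      # agent -> environment: parameters
        state, metrics = env.step(params)  # environment -> agent: state, metrics
        reward = reward_fn(metrics)
        if reward > best_reward:
            best_params, best_reward = params, reward
    return best_params, best_reward


space = {"num_cores": [1, 2, 4, 8], "cache_kb": [16, 32, 64]}
best_params, best_reward = run_search(ToySimulatorEnv(), RandomSearchAgent(space), 50)
print(best_params, best_reward)
```

Because the search loop only touches `reset`, `step`, and `propose`, a different search algorithm could replace `RandomSearchAgent` without any change to the environment, which is what makes comparisons between algorithms consistent.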
Empirical studies conducted by Google researchers with ArchGym indicate that, across a wide range of optimisation targets and design space exploration scenarios, different ML algorithms can reach comparable hardware performance when their hyperparameters are well chosen. As open source software, ArchGym fosters collaboration among researchers, providing a common and extensible interface for evaluating ML techniques and establishing robust baselines for computer architecture research.