Configuring and Scaling ML with Hydra + Ray



Hydra

Hydra, from Facebook AI, is a framework for elegantly configuring complex applications. Since its initial release, Hydra has been widely adopted by researchers and practitioners. We are happy to announce that users can now scale their applications and launch jobs to the cloud through the new Hydra Ray Launcher!


Ray

Ray is a library that fits Hydra’s needs perfectly. Ray is a simple yet powerful Python library for parallel and distributed programming with a great ecosystem of ML libraries (for distributed training, reinforcement learning, and model serving), as well as community libraries and integrations (e.g., Dask on Ray, Horovod on Ray).


Hydra Ray Launcher enables you to easily configure and launch your application on Ray in three different ways. You can launch your application by:

  1. Starting or connecting to a Ray cluster on AWS EC2, for short-lived clusters.

  2. Connecting to an existing, long-running Ray cluster.

  3. Starting a new Ray cluster locally.
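In a Hydra application, the launcher is typically selected through the config's defaults list (or overridden on the command line). A minimal sketch of an application config, assuming the launcher name `aws` used later in this post and a `task` field consumed by the example application (both illustrative, and the defaults-list syntax may differ across Hydra versions):

```yaml
# config.yaml -- hypothetical application config
defaults:
  - hydra/launcher: aws  # pick the launcher; e.g. a local launcher for option 3

task: 1
```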


Walkthrough

Below, we walk you through installing the launcher and running the example applications it provides. Please check Hydra Ray Launcher’s documentation for more details.


Installation

pip install hydra-ray-launcher --pre


Launch to an AWS cluster

Launching a Ray application on AWS can be done by setting hydra/launcher=aws. This allows your application to run on a new or existing AWS EC2 Ray cluster. Launching on AWS is built on top of Ray’s cluster launcher CLI (see the Ray documentation for details). The cluster launcher CLI expects an autoscaler YAML file for cluster configuration.
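The autoscaler settings are exposed through the launcher's own config, so they can be overridden like any other Hydra config node. The key names below are assumptions for illustration and may differ across plugin versions; consult the plugin's documentation for the actual schema.

```yaml
# Illustrative fragment overriding the launcher's cluster settings.
# Field names (ray.cluster.provider.region, min/max_workers) are
# assumptions, not a complete or authoritative schema.
hydra:
  launcher:
    ray:
      cluster:
        provider:
          region: us-west-2
        min_workers: 0
        max_workers: 2
```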


You can configure your cluster just as you configure any other Hydra application: via YAML files, structured configs, and command-line overrides.

import logging
import time

import hydra
from omegaconf import DictConfig

log = logging.getLogger(__name__)


@hydra.main(config_name="config")
def my_app(cfg: DictConfig) -> None:
    log.info(f"Executing task {cfg.task}")
    time.sleep(1)


if __name__ == "__main__":
    my_app()
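Assuming the application above is saved as my_app.py next to a config.yaml that defines task, it can be run locally or swept across a cluster with --multirun. The launcher name and override values below are illustrative:

```shell
# Run a single task locally with the default launcher
python my_app.py task=1

# Launch a sweep over three tasks on an AWS Ray cluster
# (hydra/launcher=aws as used in this post)
python my_app.py --multirun task=1,2,3 hydra/launcher=aws
```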