Configuring and Scaling ML with Hydra + Ray



Hydra

Hydra, from Facebook AI, is a framework for elegantly configuring complex applications. Since its initial release, Hydra has been widely adopted by researchers and practitioners. We are happy to announce that users can now scale their applications and launch jobs to the cloud through the new Hydra Ray Launcher!


Ray

Ray is a library that fits Hydra’s needs perfectly. Ray is a simple yet powerful Python library for parallel and distributed programming with a great ecosystem of ML libraries (for distributed training, reinforcement learning, and model serving), as well as community libraries and integrations (e.g., Dask on Ray, Horovod on Ray).


Hydra Ray Launcher enables you to easily configure and launch your application on Ray in three different ways. You can launch your application by:

  1. Starting or connecting to a Ray cluster on AWS EC2, for short-lived clusters.

  2. Connecting to an existing, long-running Ray cluster.

  3. Starting a new Ray cluster locally.
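In a Hydra application, the launcher is typically selected through the config's defaults list (or overridden on the command line). A minimal sketch of an application config, assuming the launcher name `aws` used later in this post and a `task` field consumed by the example application (both illustrative, and the defaults-list syntax may differ across Hydra versions):

```yaml
# config.yaml -- hypothetical application config
defaults:
  - hydra/launcher: aws  # pick the launcher; e.g. a local launcher for option 3

task: 1
```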


Walkthrough

Below, we walk you through installing the launcher and running the example applications it provides. Please check Hydra Ray Launcher’s documentation for more details.


Installation

pip install hydra-ray-launcher --pre


Launch to an AWS cluster

Launching a Ray application on AWS can be done by setting hydra/launcher=aws. This allows your application to run on a new or existing AWS EC2 Ray cluster. Launching on AWS is built on top of Ray’s cluster launcher CLI (see the Ray documentation for details). The cluster launcher CLI expects an autoscaler YAML file for cluster configuration.
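The autoscaler settings are exposed through the launcher's own config, so they can be overridden like any other Hydra config node. The key names below are assumptions for illustration and may differ across plugin versions; consult the plugin's documentation for the actual schema.

```yaml
# Illustrative fragment overriding the launcher's cluster settings.
# Field names (ray.cluster.provider.region, min/max_workers) are
# assumptions, not a complete or authoritative schema.
hydra:
  launcher:
    ray:
      cluster:
        provider:
          region: us-west-2
        min_workers: 0
        max_workers: 2
```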


You can configure your cluster just as you configure any other Hydra application: via YAML files, structured configs, and command-line overrides.

import logging
import time

import hydra
from omegaconf import DictConfig

log = logging.getLogger(__name__)


@hydra.main(config_name="config")
def my_app(cfg: DictConfig) -> None:
    log.info(f"Executing task {cfg.task}")
    time.sleep(1)


if __name__ == "__main__":
    my_app()
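Assuming the application above is saved as my_app.py next to a config.yaml that defines task, it can be run locally or swept across a cluster with --multirun. The launcher name and override values below are illustrative:

```shell
# Run a single task locally with the default launcher
python my_app.py task=1

# Launch a sweep over three tasks on an AWS Ray cluster
# (hydra/launcher=aws as used in this post)
python my_app.py --multirun task=1,2,3 hydra/launcher=aws
```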