A unified inference, pretraining, and reinforcement learning codebase that powers our products and R&D workloads.
Purpose-Built for
Preference Tuning
Preference tuning methods are often an afterthought. When tacked onto a general-purpose distributed training codebase, they struggle to support robust workflows with acceptable latency, while method-specific codebases struggle to generalize.
TAILORED TO TUNE
Flexible
Typical LLM codebases are deeply entangled with their distributed training strategy, slowing down researchers.
In Harmony, environments, logic, and model distributions are decoupled, enabling researchers to focus on creative experiments and novel recipes.
Performant
Models in Harmony are implemented with custom kernels, careful memory management, and extensive profiling to validate throughput in both compute-bound and memory-bound regimes.
Robust
The core of Harmony is built in Rust, with the machine learning logic exposed in Python for easy iteration.
Harmony is extensively tested, including regular reproductions of full-scale PPO recipes to monitor and control for performance regressions.
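As an illustration of that split (the names and objective below are hypothetical, not Harmony's actual API), the preference-tuning math a researcher iterates on can live entirely in Python, while scheduling, sharding, and data movement stay behind the Rust-backed runtime:

```python
import torch

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """A standard DPO-style preference objective: prefer the chosen
    completion over the rejected one, regularized against a frozen
    reference model."""
    policy_margin = policy_chosen_logps - policy_rejected_logps
    ref_margin = ref_chosen_logps - ref_rejected_logps
    return -torch.nn.functional.logsigmoid(
        beta * (policy_margin - ref_margin)
    ).mean()

# In this sketch, the runtime would be responsible for producing these
# log-probabilities from distributed models; the Python side only sees
# plain tensors and the loss it wants to optimize.
loss = dpo_loss(torch.tensor([-1.0]), torch.tensor([-2.0]),
                torch.tensor([-1.2]), torch.tensor([-1.9]))
```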
EXPERIMENT FASTER
Designed for
Accelerated Iteration
Focus on experimentation, not implementation
Advanced LLM use cases require repeated interactions between instances of models and complex environments.
Often, researchers end up spending more time wrestling with distributed training idiosyncrasies than experimenting with and iterating on their ideas.
Abstract away the idiosyncrasies
In Harmony, users simply write recipes, focusing on the logic of the interactions.
Recipes get lowered just-in-time for distribution, and it's easy to blend inference and training across myriad models and environments.
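A recipe, in that spirit, might read like the following sketch (hypothetical names and in-process stubs, not Harmony's actual API): the researcher writes the interaction loop between a policy and an environment, and leaves placement and scheduling to the runtime.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for the objects a recipe would manipulate; in a
# real system these would be backed by distributed inference and training
# workers rather than in-process stubs.
@dataclass
class Completion:
    prompt: str
    text: str

class Model:
    def generate(self, prompts):
        return [Completion(p, f"<answer to {p}>") for p in prompts]

    def train_step(self, completions, rewards):
        # A runtime would lower this to a distributed optimizer step.
        return sum(rewards) / max(len(rewards), 1)

class Environment:
    def score(self, completion):
        # Task-specific reward; here, a trivial length-based stand-in.
        return float(len(completion.text))

def recipe(policy: Model, env: Environment, prompts, steps=3):
    """The researcher's side of the contract: describe how models and
    environments interact, not where or how they are executed."""
    for _ in range(steps):
        completions = policy.generate(prompts)           # inference
        rewards = [env.score(c) for c in completions]    # environment feedback
        policy.train_step(completions, rewards)          # training

recipe(Model(), Environment(), ["What is preference tuning?"])
```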