We have built a unified inference, pretraining, and reinforcement learning codebase in Rust+Python. It powers all of our products and internal research and development workloads.
Let's enable researchers to focus on research again.
Preference tuning methods are often an afterthought, either tacked onto a distributed training codebase or built as method-specific codebases that struggle to generalize.
Adaptive Harmony is fully dedicated to preference tuning, accelerating researchers' workflows.
Typical LLM codebases are deeply entangled with their distributed training strategy, slowing down researchers. In Harmony, environments, method logic, and model distribution are decoupled: researchers can focus on creative experiments and novel recipes.
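As a minimal sketch of what this decoupling can look like, the snippet below writes preference-tuning logic against abstract interfaces, with no mention of GPUs, sharding, or process groups. The names (`Model`, `Environment`, `preference_margin`) are hypothetical illustrations, not Harmony's actual API.

```python
from dataclasses import dataclass
from typing import Protocol


class Model(Protocol):
    """Whatever handle the runtime provides; device placement is its concern."""
    def logprob(self, prompt: str, completion: str) -> float: ...


@dataclass
class PreferencePair:
    prompt: str
    chosen: str
    rejected: str


class Environment(Protocol):
    """Produces preference data; knows nothing about how the model is distributed."""
    def sample(self) -> PreferencePair: ...


def preference_margin(model: Model, pair: PreferencePair) -> float:
    """Recipe logic: how strongly the model prefers `chosen` over `rejected`.

    Pure Python over abstract interfaces, so the same code can run against a
    single-GPU debug model or a sharded multi-node one.
    """
    return model.logprob(pair.prompt, pair.chosen) - model.logprob(
        pair.prompt, pair.rejected
    )
```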
Models in Harmony are implemented with custom kernels, careful memory management, and extensive profiling to validate throughput in both compute-bound and memory-bound regimes, even in resource-constrained scenarios for enterprises without clusters.
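For readers unfamiliar with the two regimes, here is a back-of-the-envelope roofline check of the kind such profiling relies on. The hardware numbers are illustrative placeholders (roughly A100-class), not measurements from Harmony.

```python
PEAK_TFLOPS = 312.0          # e.g. A100 BF16 tensor-core peak (dense)
PEAK_BANDWIDTH_TBPS = 2.0    # e.g. A100 HBM bandwidth, ~2 TB/s

# Ridge point: arithmetic intensity (FLOPs per byte moved) above which a
# kernel is compute-bound, below which it is memory-bound.
ridge = (PEAK_TFLOPS * 1e12) / (PEAK_BANDWIDTH_TBPS * 1e12)  # FLOPs / byte


def regime(flops: float, bytes_moved: float) -> str:
    """Classify a kernel from its FLOP count and memory traffic."""
    intensity = flops / bytes_moved
    return "compute-bound" if intensity >= ridge else "memory-bound"


# Large-batch prefill matmuls tend to be compute-bound; single-stream decode,
# which rereads the weights for every generated token, tends to be memory-bound.
print(regime(flops=1e12, bytes_moved=2e9))  # high intensity -> compute-bound
print(regime(flops=2e9, bytes_moved=2e9))   # intensity ~1   -> memory-bound
```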
The core of Harmony is built in Rust, with the machine learning logic exposed in Python for easy iteration by researchers. It is also extensively tested, up to and including regular reproductions of full-scale PPO recipes to guard against performance regressions.
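To give a flavor of the kind of method logic that lives on the Python side, here is a generic PPO clipped-objective loss in plain PyTorch. This is the textbook surrogate objective, shown only as an illustration; it is not Harmony's implementation.

```python
import torch


def ppo_policy_loss(
    logprobs: torch.Tensor,      # log-probs under the current policy
    old_logprobs: torch.Tensor,  # log-probs under the policy that sampled the data
    advantages: torch.Tensor,    # advantage estimates, e.g. from GAE
    clip_eps: float = 0.2,
) -> torch.Tensor:
    """Standard PPO clipped surrogate objective (negated, for minimization)."""
    ratio = torch.exp(logprobs - old_logprobs)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()
```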
Check out Adaptive Engine, our platform for testing, serving, monitoring, and iterating on large language models.