
Parallax by Gradient
Parallax, created by Gradient, is an advanced open-source inference engine that transforms model inference into a global, collaborative endeavor. By leveraging a decentralized architecture, it enables large language models to be divided, executed, and validated across a network of interconnected devices. This system facilitates seamless deployment across multiple platforms—Windows, Linux, and macOS—and accommodates diverse GPU configurations such as those from Blackwell, Ampere, and Hopper series.
Visit Website- Distributed Model ExecutionEnables model inference to be divided and executed on separate distributed nodes, allowing for the effective utilization of available computational resources.
- Multi-platform CompatibilitySupports various operating systems such as Windows, Linux, and macOS, offering flexible installation methods via source code, Docker, or native applications.
- Real-time Resource AllocationFeatures include dynamic key-value cache management and continuous batch processing for macOS, as well as intelligent request scheduling and routing to achieve optimal performance.
- Parallel Pipeline ArchitectureEnables pipeline parallel model sharding to effectively distribute model layers across various nodes within the cluster.