Parallax by Gradient

Parallax is a fully decentralized inference engine designed to enable the creation of distributed AI clusters capable of running large language models on various devices without regard to their specific configurations or geographical locations.

Visit Website

Distributed Model ExecutionEnables model inference to be divided and executed on separate distributed nodes, allowing for the effective utilization of available computational resources.
Multi-platform CompatibilitySupports various operating systems such as Windows, Linux, and macOS, offering flexible installation methods via source code, Docker, or native applications.
Real-time Resource AllocationFeatures include dynamic key-value cache management and continuous batch processing for macOS, as well as intelligent request scheduling and routing to achieve optimal performance.
Parallel Pipeline ArchitectureEnables pipeline parallel model sharding to effectively distribute model layers across various nodes within the cluster.