SODA

SODA: An Optimizing Scheduler for Large-Scale Stream-Based Distributed Computer Systems. This paper describes the SODA scheduler for System S, a highly scalable distributed stream processing system. Unlike traditional batch applications, streaming applications are open-ended, and the system typically cannot delay the processing of incoming data. The scheduler must therefore be able to shift resource allocations dynamically in response to changes in resource availability, job arrivals and departures, incoming data rates, and so on. The design assumptions of System S, in particular, pose additional scheduling challenges. SODA must solve a highly complex optimization problem in real time while maintaining scalability. To do so, it relies on a careful problem decomposition and the intelligent use of both heuristic and exact algorithms. We describe the design and functionality of SODA, outline its mathematical components, and describe experiments that demonstrate the scheduler's performance.