Architectural Support for Deep Recommendation Systems (RecSys)

RecSys Figure 1

Recommendation systems form the backbone of popular internet services like entertainment streaming, e-commerce and social media (e.g., Netflix, Amazon, Facebook). These deep learning-based systems not only present unique compute challenges compared to well-studied DNNs but also introduce significant infrastructure demands for at-scale deployment. Making recommendation more efficient will require solutions that integrate insights across the entire execution stack (as shown in Figure 1).

At the use-case level, our group works on optimizing both training and inference with datacenter-scale (e.g., workload scheduling) as well as mobile-centric (e.g., privacy preservation) solutions. Framing the problem in these specific contexts allows us to propose the algorithmic adjustments (e.g., model compression and partitioning) that are most appropriate for each use case. Finally, we quantify the implications of current heterogeneous hardware designs for recommendation workloads and use these insights to propose future architectures specialized for recommendation.