#  Publications 

 



 




##  306 results 

 


 

### 2023

Matthew Adiletta, Jesmin Tithi, Emmanouil-Ioannis Farsarakis, Gerasimos Gerogiannis, Robert Adolf, Robert Benke, Sidharth Kashyap, Samuel Hsia, Kartik Lakhotia, Fabrizio Petrini, Gu-Yeon Wei, and David Brooks. 2023. “[Characterizing the Scalability of Graph Convolutional Networks on Intel® PIUMA](/publications/characterizing-scalability-graph-convolutional-networks-intel%C2%AE-piuma)”. In IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS). Raleigh, North Carolina



 

 




 

 

 

Abstract: Large-scale Graph Convolutional Network (GCN) inference on traditional CPU/GPU systems is challenging due to a large memory footprint, sparse computational patterns, and irregular memory accesses with poor locality. Intel's Programmable Integrated...

- [ispass\_gnn\_characterizati...](/sites/g/files/omnuum11281/files/vlsiarch/files/ispass_gnn_characterization_on_piuma_final.pdf)
 
 

Samuel Hsia, Udit Gupta, Bilge Acun, Newsha Ardalani, Pan Zhong, Gu-Yeon Wei, David Brooks, and Carole-Jean Wu. 2023. “[MP-Rec: Hardware-Software Co-Design to Enable Multi-Path Recommendation](/publications/mp-rec-hardware-software-co-design-enable-multi-path-recommendation)”. In ASPLOS. Vancouver, Canada



 

 




 

 

 

Abstract: Deep learning recommendation systems serve personalized content under diverse tail-latency targets and input-query loads. In order to do so, state-of-the-art recommendation models rely on terabyte-scale embedding tables to learn user preferences over...

- [Publisher's Version](https://arxiv.org/abs/2302.10872)
 
 

 



### 2022

Maximilian Lam, Michael Mitzenmacher, Vijay Janapa Reddi, Gu-Yeon Wei, and David Brooks. 2022. “[Tabula: Efficiently Computing Nonlinear Activation Functions for Secure Neural Network Inference](/publications/efficiently-computing-nonlinear-activation-functions-secure-neural-network)”



 

 




 

 

 

Abstract: Multiparty computation approaches to private neural network inference require significant communication between server and client, incur tremendous runtime penalties, and cost massive storage overheads. The primary source of these expenses is garbled...

- [Publisher's Version](https://doi.org/10.48550/arXiv.2203.02833)
- [Tabula: Efficiently Compu...](/sites/g/files/omnuum11281/files/vlsiarch/files/2203.02833.pdf)
 
 

L. Pentecost, A. Hankin, M. Donato, M. Hempstead, G.-Y. Wei, and D. Brooks. 2022. “[NVMExplorer: A Framework for Cross-Stack Comparisons of Embedded Non-Volatile Memories](/publications/nvmexplorer-framework-cross-stack-comparisons-embedded-non-volatile-memories)”. In 2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA). Seoul, South Korea



 

 




 

 

 

Abstract: Repeated off-chip memory accesses to DRAM drive up operating power for data-intensive applications, and SRAM technology scaling and leakage power limits the efficiency of embedded memories. Future on-chip storage will need higher density and energy...

- [Publisher's Version](https://doi.org/10.48550/arXiv.2109.01188)
- [2109.01188.pdf](/sites/g/files/omnuum11281/files/vlsiarch/files/2109.01188.pdf)
 
 

Yu-Shun Hsiao, Siva Kumar Sastry Hari, Michał Filipiuk, Timothy Tsai, Michael B. Sullivan, Vijay Janapa Reddi, Vasu Singh, and Stephen W. Keckler. 2022. “[Zhuyi: Perception Processing Rate Estimation for Safety in Autonomous Vehicles](/publications/zhuyi-perception-processing-rate-estimation-safety-autonomous-vehicles)”. In ACM/IEEE Design Automation Conference (DAC). San Francisco, CA, USA



 

 




 

 

 

 

Zishen Wan, Aqeel Anwar, Abdulrahman Mahmoud, Tianyu Jia, Yu-Shun Hsiao, Vijay Janapa Reddi, and Arijit Raychowdhury. 2022. “[Frl-Fi: Transient Fault Analysis for Federated Reinforcement Learning-Based Navigation Systems](/publications/frl-fi-transient-fault-analysis-federated-reinforcement-learning-based)”. 2022 Design Automation and Test in Europe Conference (DATE)



 

 




 

 

 

Abstract: Swarm intelligence is being increasingly deployed in autonomous systems, such as drones and unmanned vehicles. Federated reinforcement learning (FRL), a key swarm intelligence paradigm where agents interact with their own environments and cooperatively...

- [Publisher's Version](https://doi.org/10.48550/arXiv.2203.07276)
- [Frl-fi: Transient fault a...](/sites/g/files/omnuum11281/files/vlsiarch/files/2203.07276.pdf)
 
 

Tianyu Jia, En-Yu Yang, Yu-Shun Hsiao, Jonathan Cruz, David Brooks, Gu-Yeon Wei, and Vijay Janapa Reddi. 2022. “[OMU: A Probabilistic 3D Occupancy Mapping Accelerator for Real-Time OctoMap at the Edge](/publications/omu-probabilistic-3d-occupancy-mapping-accelerator-real-time-octomap-edge)”. In Design, Automation, and Test in Europe (DATE)



 

 




 

 

 

 

Thierry Tambe, David Brooks, and Gu-Yeon Wei. 2022. “[Learnings from a HLS-Based High-Productivity Digital VLSI Flow](/publications/learnings-hls-based-high-productivity-digital-vlsi-flow)”. In Workshop on Languages, Tools, and Techniques for Accelerator Design (LATTE’22)



 

 




 

 

 

Abstract: The twilight of Dennard scaling has activated a global trend towards application-based hardware specialization. This trend is currently accelerating due to the surging democratization and deployment of machine learning on mobile and IoT compute platforms. At the...

- [Publisher's Version](https://capra.cs.cornell.edu/latte22/paper/6.pdf)
- [Learnings from a HLS-base...](/sites/g/files/omnuum11281/files/vlsiarch/files/6.pdf)
 
 

Bo-Yuan Huang, Steven Lyubomirsky, Yi Li, Mike He, Thierry Tambe, Gus Henry Smith, Akash Gaonkar, Vishal Canumalla, Gu-Yeon Wei, Aarti Gupta, Zachary Tatlock, and Sharad Malik. 2022. “[Specialized Accelerators and Compiler Flows: Replacing Accelerator APIs With a Formal Software Hardware Interface](/publications/specialized-accelerators-and-compiler-flows-replacing-accelerator-apis-formal)”



 

 




 

 

 

Abstract: Specialized accelerators are increasingly used to meet the power-performance goals of emerging applications such as machine learning, image processing, and graph analysis. Existing accelerator programming methodologies using APIs have several limitations...

- [Publisher's Version](https://doi.org/10.48550/arXiv.2203.00218)
- [Specialized Accelerators ...](/sites/g/files/omnuum11281/files/vlsiarch/files/2203.00218.pdf)
 
 

Abdulrahman Mahmoud, Thierry Tambe, Tarek Aloui, David Brooks, and Gu-Yeon Wei. 2022. “[GoldenEye: A Platform for Evaluating Emerging Numerical Data Formats in DNN Accelerators](/publications/goldeneye-platform-evaluating-emerging-numerical-data-formats-dnn-accelerators)”



 

 




 

 

 

Abstract: This paper presents GoldenEye, a functional simulator with fault injection capabilities for common and emerging numerical formats, implemented for the PyTorch deep learning framework. GoldenEye provides a unified framework for numerical format...

- [Publisher's Version](https://doi.org/10.1109/DSN53405.2022.00031)
 
 

Matthew Adiletta, David Brooks, and Gu-Yeon Wei. 2022. [Architectural Implications of Embedding Dimension During GCN on CPU and GPU](/publications/architectural-implications-embedding-dimension-during-gcn-cpu-and-gpu). Cambridge: Harvard University



 

 




 

 

 

Abstract: Graph Neural Networks (GNNs) are a class of neural networks designed to extract information from the graphical structure of data. Graph Convolutional Networks (GCNs) are a widely used type of GNN for transductive graph learning problems which apply...

- [Architectural Implication...](/sites/g/files/omnuum11281/files/vlsiarch/files/2022_fall_-_architectural_implications_of_embedding_dimension_during_gcn_on_cpu_and_gpu.pdf)
 
 

 



### 2021

Mark Wilkening, Udit Gupta, Samuel Hsia, Caroline Trippel, Carole-Jean Wu, David Brooks, and Gu-Yeon Wei. 2021. “[RecSSD: Near Data Processing for Solid State Drive Based Recommendation Inference](/publications/recssd-near-data-processing-solid-state-drive-based-recommendation-inference)”. ASPLOS 2021: Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Pp. 717–729



 

 




 

 

 

Abstract: Neural personalized recommendation models are used across a wide...

- [Publisher's Version](https://doi.org/10.48550/arXiv.2102.00075)
- [RecSSD: Near Data Process...](/sites/g/files/omnuum11281/files/vlsiarch/files/2102.00075.pdf)
 
 

 



 

 

 
