#  Heterogeneous System Modeling and Optimization 

 




 
  

 

   ![heterogeneous sys modeling optimizations](/sites/g/files/omnuum11281/files/styles/hwp_1_1__960x960_scale/public/vlsiarch/files/heterogeneous_sys_modeling_optimizations_image.png?itok=ofTjS12y) 

 

With Moore's law ending, there is a major push towards heterogeneity, where modern SoCs combine general-purpose CPUs, specialized hardware accelerators, GPUs, and FPGAs. This trend is especially prevalent in AI and Internet of Things (IoT) applications, such as recent computer vision chips for autonomous driving and ultra-low-power SoCs for wearable electronics. Our group has developed infrastructure for rapid design space exploration of heterogeneous SoCs targeting AI applications.

Our Aladdin tool accurately models a variety of accelerator designs at the pre-RTL level, and its integration with gem5 (gem5-Aladdin) models and simulates complex SoCs consisting of CPUs, accelerators, networks-on-chip (NoCs), and memory hierarchies. More recently, our SMAUG framework, which builds on gem5-Aladdin, supports modeling of deep neural network accelerators and can simulate a variety of commonly used DNN and RNN models, as well as accelerator hardware architectures such as SIMD and systolic arrays. We have also introduced the ParaDNN tool, which can generate thousands of parameterized multi-layer NN models that can then be used to benchmark different computing platforms such as Google TPUs and Nvidia GPUs. Finally, for fast and scalable automated design space exploration (DSE), we use Bayesian optimization in conjunction with the above modeling frameworks to enable efficient black-box optimization of heterogeneous SoCs.
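As a rough illustration of what generating parameterized multi-layer NN models can look like, the sketch below enumerates fully connected models over a small grid of hyperparameters and reports each model's parameter count and forward-pass FLOPs. It is not ParaDNN's actual interface: the `FcModel` class, the `sweep` function, and all hyperparameter values are hypothetical names and numbers chosen only for this example.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class FcModel:
    """Hypothetical description of one fully connected benchmark model."""
    layers: int      # number of hidden layers
    nodes: int       # units per hidden layer
    input_dim: int   # input feature size
    output_dim: int  # output size
    batch_size: int  # samples processed per step

    def _dims(self):
        # Layer widths from input to output, e.g. [in, h, h, out].
        return [self.input_dim] + [self.nodes] * self.layers + [self.output_dim]

    def param_count(self) -> int:
        # Weights (fan_in * fan_out) plus biases (fan_out) for every layer.
        dims = self._dims()
        return sum(i * o + o for i, o in zip(dims, dims[1:]))

    def forward_flops(self) -> int:
        # Each multiply-accumulate is counted as 2 FLOPs; biases are ignored.
        dims = self._dims()
        return 2 * self.batch_size * sum(i * o for i, o in zip(dims, dims[1:]))


def sweep(layer_opts, node_opts, batch_opts, input_dim=1024, output_dim=10):
    """Enumerate every combination of the given hyperparameter options."""
    return [
        FcModel(layers, nodes, input_dim, output_dim, batch)
        for layers, nodes, batch in product(layer_opts, node_opts, batch_opts)
    ]


# Assumed example grid: 4 depths x 3 widths x 3 batch sizes.
models = sweep(
    layer_opts=(1, 2, 4, 8),
    node_opts=(64, 256, 1024),
    batch_opts=(16, 128, 1024),
)

for model in models[:3]:
    print(model, model.param_count(), model.forward_flops())
print(f"{len(models)} models generated")
```

Widening the option tuples grows the sweep multiplicatively, which is how a grid like this can reach thousands of models. Each generated description would then be turned into a real model and timed on the platform under test.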



 

##  Select Publications 

 



 


 

### 2022

Abdulrahman Mahmoud, Thierry Tambe, Tarek Aloui, David Brooks, and Gu-Yeon Wei. 2022. “[GoldenEye: A Platform for Evaluating Emerging Numerical Data Formats in DNN Accelerators](/publications/goldeneye-platform-evaluating-emerging-numerical-data-formats-dnn-accelerators)”



 

 

- [Publisher's Version](https://doi.org/10.1109/DSN53405.2022.00031)

Abstract: This paper presents GoldenEye, a functional simulator with fault injection capabilities for common and emerging numerical formats, implemented for the PyTorch deep learning framework. GoldenEye provides a unified framework for numerical format...

 

 


Bo-Yuan Huang, Steven Lyubomirsky, Yi Li, Mike He, Thierry Tambe, Gus Henry Smith, Akash Gaonkar, Vishal Canumalla, Gu-Yeon Wei, Aarti Gupta, Zachary Tatlock, and Sharad Malik. 2022. “[Specialized Accelerators and Compiler Flows: Replacing Accelerator APIs With a Formal Software Hardware Interface](/publications/specialized-accelerators-and-compiler-flows-replacing-accelerator-apis-formal)”



 

 

- [Publisher's Version](https://doi.org/10.48550/arXiv.2203.00218)
- [PDF: Specialized Accelerators ...](/sites/g/files/omnuum11281/files/vlsiarch/files/2203.00218.pdf)

Abstract: Specialized accelerators are increasingly used to meet the power-performance goals of emerging applications such as machine learning, image processing, and graph analysis. Existing accelerator programming methodologies using APIs have several limitations...

 

 


 



### 2021

Bo-Yuan Huang, Steven Lyubomirsky, Thierry Tambe, Yi Li, Mike He, Gus Smith, Gu-Yeon Wei, Aarti Gupta, Sharad Malik, and Zachary Tatlock. 2021. “[From DSLs to Accelerator-Rich Platform Implementations: Addressing the Mapping Gap](/publications/dsls-accelerator-rich-platform-implementations-addressing-mapping-gap)”. Workshop on Languages, Tools, and Techniques for Accelerator Design (LATTE’21)



 

 

- [Publisher's Version](https://capra.cs.cornell.edu/latte21/)
- [PDF: From DSLs to Accelerator-...](/sites/g/files/omnuum11281/files/vlsiarch/files/slides.pdf)
 

 



### 2017

Brandon Reagen, Jose Hernandez-Lobato, Robert Adolf, Michael Gelbart, Paul Whatmough, Gu-Yeon Wei, and David Brooks. 2017. “[A Case for Efficient Accelerator Design Space Exploration via Bayesian Optimization](/publications/case-efficient-accelerator-design-space-exploration-bayesian-optimization)”. In International Symposium on Low Power Electronics and Design. Taipei, Taiwan



 

 

- [Publisher's Version](https://doi.org/10.1109/ISLPED.2017.8009208)
- [PDF: A Case for Efficient Acce...](/sites/g/files/omnuum11281/files/vlsiarch/files/reagen2017bayesopt.pdf)

Abstract: In this paper we propose using machine learning to improve the design of deep neural network hardware accelerators. We show how to adapt multi-objective Bayesian optimization to overcome a challenging design problem: optimizing deep neural network...

 

 


 



### 2016

Yakun Shao, Sam Xi, Vijayalakshmi Srinivasan, Gu-Yeon Wei, and David Brooks. 2016. “[Co-Designing Accelerators and SoC Interfaces Using Gem5-Aladdin](/publications/co-designing-accelerators-and-soc-interfaces-using-gem5-aladdin)”. In International Symposium on Microarchitecture (MICRO). Taipei, Taiwan



 

 

- [Publisher's Version](https://ieeexplore.ieee.org/document/7783751)
- [PDF: Co-Designing Accelerators...](/sites/g/files/omnuum11281/files/vlsiarch/files/3195638.3195697.pdf)

Abstract: Increasing demand for power-efficient, high-performance computing has spurred a growing number and diversity of hardware accelerators in mobile and server Systems on Chip (SoCs). This paper makes the case that the co-design of the accelerator...

 

 


 



### 2014

Yakun Shao, Brandon Reagen, Gu-Yeon Wei, and David Brooks. 2014. “[Aladdin: A Pre-RTL, Power-Performance Accelerator Simulator Enabling Large Design Space Exploration of Customized Architectures](/publications/aladdin-pre-rtl-power-performance-accelerator-simulator-enabling-large-design)”. In International Symposium on Computer Architecture (ISCA)



 

 

- [Publisher's Version](https://doi.org/10.1109/ISCA.2014.6853196)
- [PDF: Aladdin: A Pre-RTL, Power...](/sites/g/files/omnuum11281/files/vlsiarch/files/aladdin_a_pre-rtl_power-performance_accelerator_simulator_enabling_large_design_space_exploration_of_customized_architectures.pdf)

Abstract: Hardware specialization, in the form of accelerators that provide custom datapath and control for specific algorithms and applications, promises impressive performance and energy advantages compared to traditional architectures. Current research in...

 

 


 



 

 

 

[See all project publications](https://prod-vlsiarch.drupalsites.harvard.edu/publications?f%5B0%5D=bibcite_reference_hwp_c_project123456%3A172616)