CGRA compiler

UPM CGRA Compilation Framework

ID: R21

Licence:
GPL-3.0 licence

Owner: UPM

Contacts:
joseandres.otero@upm.es
alfonso.rodriguezm@upm.es

Short Description	The multigrain reconfigurable CGRA requires compilation support to generate the CGRA configuration bitstream from the user’s application code.
Key features	Code compilation for its execution on the CGRA [1] [2].
Require	Application code with the indication of the section of code susceptible to be offloaded on the CGRA accelerator.
Provide	Configuration binary for the multi-grain CGRA.
Input	– Application code (C/C++). – Description and parameters of the CGRA. – Optimization goals and expected metrics for the CGRA.
Output	– Configuration binary for the CGRA.
User	– Embedded System developers, with application programming skills. Hardware knowledge is not required.
Benefits for the user	– Automatic generation of accelerators with minor intervention from the users. This implies higher energy efficiency and throughput, with minor design costs.
Position in the MYRTUS DPE	Step 3 – Node Level Optimization
TRL@M0	2-3
TRL@M36	4-5

General description

Based on MLIR, this compiler will generate configuration bitstreams for the CGRA (acting as a tightly-coupled domain-specific accelerator), in an automated manner for the user, without requiring the design of custom logic. This will allow the automatic generation of hardware accelerators in the edge without additional effort for the user. Moreover, run-time SW-to-HW computing offloading mechanisms will be incorporated to automatically extract application parallelism combined with run-time profiling/tracing to enable adaptive/speculative acceleration deployment

*Figure 1. Detailed view of the memory-mapped CGRA-based accelerator*

Role in the MYRTUS DPE

This compiler will allow users to deploy computing-intensive sections of code on the CGRA, without further user’s intervention. This enables the automatic generation of the compilers, and the potential adaptation of the architecture to multiple and different operation points, differing in the use of resources versus the achieved throughput. Based on MLIR, as part of the DPE, the compiler will enable the production of adaptable WL implementations to be executed on the computing continuum infrastructure.

MYRTUS Extension/Contribution

The original compilation framework existing for this multi-grain CGRA relied on LLVM with limited automatic parallelization capabilities. LLVM is less extensible than MLIR, and presents portability limitations. Moreover, run-time SW-to-HW computing offloading mechanisms will be incorporated to automatically extract application parallelism combined with run-time profiling/tracing to enable adaptive/speculative acceleration deployment.

Plans and Expectation

Assessment Plan@M18:
First demonstration of the compilation flow on the CGRA architecture described in R2.

Expected Results@M18:
First version of the toolchain, without the feature to automatically extract application parallelism.

References

[1] Vázquez, D., Rodríguez, A., Otero, A., & de la Torre, E. (2022, November). Extending RISC-V Processor Datapaths with Multi-Grain Reconfigurable Overlays. In 2022 37th Conference on Design of Circuits and Integrated Circuits (DCIS) (pp. 01-06). IEEE.
[2] https://github.com/des-cei/cgra_gen

UPM CGRA Compilation Framework

General description

Role in the MYRTUS DPE

MYRTUS Extension/Contribution

Plans and Expectation

References

Quick Links

Insights

Follow us on