Comparision of Image Processing in Hardware Accelerated Convolution Engine and RISC-V core
Pokhrel, Narayan (2021-05-24)
This publication is subject to copyright regulations. The work may be read and printed for personal use. Commercial use is prohibited.
Closed access
The permanent address of the publication is:
https://urn.fi/URN:NBN:fi-fe2021052832099
Abstract
Edge processing in ultra-low-power IoT devices is increasing and reaching high levels of accuracy; however, there is still a gap in performance and energy-efficient computing. Low-power processors with custom instruction extensions and application-specific hardware accelerators are built to address energy consumption and performance. Energy consumption can be reduced by fine-tuning data movement, data storage, and data processing. In this thesis, a Hardware Accelerated Convolution Engine that operates on 8-bit data is implemented in the open-source RISC-V SoC PULPissimo. Processing sensor data in the IoT edge node reduces the required memory bandwidth and the amount of data to be stored. Implementing a hardware accelerator increases performance and energy efficiency because it requires less data movement and fewer computations than a general-purpose RISC-V core. The implemented Hardware Accelerated Convolution Engine achieves a higher level of performance by parallelizing a large number of operations and fully reusing the transferred data.
Further, the custom instructions implemented in the RISC-V core (RI5CY) of PULPissimo are explored. Using algorithm-specific custom instructions greatly reduces the execution time of the convolution algorithm. Even though the RI5CY core is equipped with custom vector instructions, the Hardware Accelerated Convolution Engine outperforms it on the convolution algorithm in both performance and energy efficiency.
The Questasim simulation and FPGA-based prototyping results for a 64x64 image show that the convolution algorithm runs 91x faster in the Hardware Accelerated Convolution Engine than in the RISC-V core with plain RISC-V instructions, and 26x faster than in the RISC-V core with plain and custom extended RISC-V instructions. The performance gain increases with image size. Low-power computing devices that need to compute 2D convolutions benefit greatly from the Hardware Accelerated Convolution Engine, which runs convolution algorithms very efficiently.
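For readers unfamiliar with the workload being compared, the following is a minimal software sketch of a 2D convolution over 8-bit pixels, roughly the kind of loop nest a general-purpose core would execute with plain instructions. It is not taken from the thesis: the function name `conv2d_u8`, the signed 8-bit kernel, the 32-bit accumulator, the absence of padding, and the unflipped (correlation-style) kernel indexing are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical reference routine (not the thesis implementation).
 * Computes a "valid" 2D convolution: no padding, so the output has
 * (h - k + 1) x (w - k + 1) elements. The kernel is applied without
 * flipping, as is common in image-processing code.
 */
void conv2d_u8(const uint8_t *img, size_t h, size_t w,
               const int8_t *kernel, size_t k,
               int32_t *out)
{
    /* The kernel must fit inside the image. */
    assert(k > 0 && k <= h && k <= w);

    size_t oh = h - k + 1; /* output height */
    size_t ow = w - k + 1; /* output width  */

    for (size_t y = 0; y < oh; ++y) {
        for (size_t x = 0; x < ow; ++x) {
            int32_t acc = 0; /* wide accumulator avoids 8-bit overflow */
            for (size_t i = 0; i < k; ++i) {
                for (size_t j = 0; j < k; ++j) {
                    /* One multiply-accumulate per kernel tap. */
                    acc += (int32_t)img[(y + i) * w + (x + j)]
                         * (int32_t)kernel[i * k + j];
                }
            }
            out[y * ow + x] = acc;
        }
    }
}
```

Under these assumptions, a 64x64 image with a 3x3 kernel yields a 62x62 output and 62 × 62 × 9 = 34,596 multiply-accumulate operations, each reading a pixel that neighbouring output positions also read. That repeated access to the same data is what parallel hardware with data reuse can exploit.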