Developing Topology-Aware Pruning and Mapping Algorithms for Deep Neural Network Accelerators

Tags: DNN · NoC · IoT · Optimization
Project Overview

Deep Neural Networks (DNNs) have gained wide adoption across many domains in recent years; however, their high computational requirements make them challenging to deploy on resource-constrained Internet of Things (IoT) devices. DNNs often require multi-processor environments or high-capacity GPU systems, and large data volumes and data transfers between DNN layers lead to performance loss and high energy consumption. To overcome these limitations, it is crucial to reduce the size of DNN structures without compromising their effectiveness, and to run them on application-specific architectures optimized for data communication. One promising approach to reducing model size is to prune unnecessary model elements, such as weights, neurons, and filters, while maintaining accuracy. However, models pruned with non-structural methods may perform poorly, or even worse than their unpruned counterparts, on architectures designed for regular structures, such as CPUs and GPUs.
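To make the pruning idea above concrete, here is a minimal sketch of unstructured magnitude pruning, in which the smallest-magnitude weights are zeroed until a target sparsity is reached. This is only an illustration of the general technique, not the project's actual method; the function name and the thresholding rule are assumptions.

```python
import numpy as np

def prune_weights(weights, sparsity):
    """Zero out the smallest-magnitude entries of `weights` so that
    roughly `sparsity` fraction of them become zero (unstructured
    magnitude pruning -- an illustrative sketch, not the project's
    actual pruning algorithm)."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    # Threshold = k-th smallest magnitude; entries at or below it are pruned.
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

# Example: prune half of a 4x4 weight matrix with values 1..16.
W = np.arange(1, 17, dtype=float).reshape(4, 4)
P = prune_weights(W, 0.5)  # the eight smallest weights are zeroed
```

Because the mask is irregular, a matrix pruned this way no longer maps onto the dense, regular compute patterns that CPUs and GPUs favor, which is exactly the mismatch the paragraph above describes.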

Therefore, pruned models run more efficiently on architectures that offer regular and flexible communication, such as Network-on-Chip (NoC). However, existing pruning methods for DNNs are not optimized for NoC architectures, and optimization algorithms are needed that take the NoC structure into account and map the pruned network in an energy-efficient manner. In this project, we aim to develop NoC topology-aware heuristic and metaheuristic methods that perform weight and neuron pruning, optimally group the neurons of the pruned DNN, and map them onto the NoC architecture in an energy-efficient manner.

Funding: TÜBİTAK 1001
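To illustrate what topology-aware, energy-efficient mapping can look like, the sketch below greedily places neuron groups on a 2D NoC mesh so that heavily communicating groups land on nearby tiles, minimizing hop-weighted traffic (a common proxy for communication energy). This is a generic heuristic for illustration only; the traffic matrix, mesh model, and placement rule are assumptions, not the project's algorithm.

```python
from itertools import product

def manhattan(a, b):
    # Hop count between two tiles of a 2D mesh (XY routing).
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def greedy_map(traffic, mesh_w, mesh_h):
    """Greedily place neuron groups on a mesh_w x mesh_h NoC mesh.

    `traffic[i][j]` is the communication volume between groups i and j.
    Groups are placed in order of total traffic; each new group takes
    the free tile that minimizes hop-weighted traffic to the groups
    already placed -- an illustrative heuristic for reducing
    communication energy, not the project's actual mapper.
    """
    n = len(traffic)
    order = sorted(range(n), key=lambda i: -sum(traffic[i]))
    free = set(product(range(mesh_w), range(mesh_h)))
    placement = {}
    for g in order:
        best = min(sorted(free), key=lambda t: sum(
            traffic[g][h] * manhattan(t, placement[h]) for h in placement))
        placement[g] = best
        free.remove(best)
    return placement

# Example: groups 0 and 1 exchange heavy traffic, 2 and 3 light traffic.
traffic = [[0, 10, 0, 0],
           [10, 0, 0, 0],
           [0, 0, 0, 1],
           [0, 0, 1, 0]]
placement = greedy_map(traffic, 2, 2)
```

In the example, the heavily communicating pair (0, 1) ends up on adjacent tiles, so its traffic traverses a single link; a metaheuristic (e.g., simulated annealing over placements) could refine such a greedy seed.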

  • Develop novel weight and neuron pruning methods for DNNs.
  • Create topology-aware neuron grouping and mapping algorithms for NoC architectures.
  • Optimize for energy efficiency on resource-constrained IoT devices.
  • Maintain DNN performance while significantly reducing model size and computational load.
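The neuron-grouping objective above can also be sketched: group the neurons of the pruned network into balanced clusters (one per processing element) so that most connection weight stays inside a group. The code below is a simple agglomerative heuristic for illustration; the adjacency representation and the assignment rule are assumptions, and a real system would likely use a proper graph partitioner.

```python
def group_neurons(adj, n_groups):
    """Greedily assign neurons to `n_groups` balanced groups,
    preferring the group with which a neuron already shares the most
    connection weight. `adj[i][j]` is the connection weight between
    neurons i and j. An illustrative heuristic only -- production
    flows typically use a graph partitioner such as METIS."""
    n = len(adj)
    cap = -(-n // n_groups)  # ceiling division: balanced group capacity
    groups = [[] for _ in range(n_groups)]
    # Visit neurons in order of total connectivity (heaviest first).
    for v in sorted(range(n), key=lambda i: -sum(adj[i])):
        best = max(
            (g for g in range(n_groups) if len(groups[g]) < cap),
            key=lambda g: sum(adj[v][u] for u in groups[g]))
        groups[best].append(v)
    return groups

# Example: two obvious clusters, {0, 1} and {2, 3}.
adj = [[0, 5, 0, 0],
       [5, 0, 0, 0],
       [0, 0, 0, 5],
       [0, 0, 5, 0]]
groups = group_neurons(adj, 2)
```

Keeping strongly connected neurons in the same group reduces the inter-tile traffic that the mapping step must then route across the NoC.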