Browsing by Author "Sambangi, Ramesh"

Algorithm and architecture design of random fourier features-based kernel adaptive filters
(IEEE, 2022-12) Sambangi, Ramesh
Numerous real-life systems exhibit complex nonlinear input-output relationships. Kernel adaptive filters, a popular class of nonlinear adaptive filters, can efficiently model these nonlinear input-output relationships. Their growing network structure, however, poses considerable challenges in terms of their hardware implementation, making them inefficient for real-time applications. Random Fourier features (RFF) facilitate the development of kernel adaptive filters with a fixed network structure. For the first time, this paper attempts to implement the RFF-based kernel least mean square (RFF-KLMS) algorithm on hardware. To this end, we propose several reformulations of the feature functions (FFs) that are computationally expensive in their native form so that they can be implemented in real-time VLSI. Specifically, we reformulate inner product evaluation, cosine, and exponential functions that appear in the implementation of FFs. With these reformulations, the proposed delayed RFF-KLMS (DRFF-KLMS) is then synthesized using 45-nm CMOS technology with 16-bit fixed-point representations. According to the synthesis results, pipelined DRFF-KLMS architectures require minimal hardware increase over the state-of-the-art conventional delayed LMS architecture while significantly improving estimation performance for the nonlinear model. Our results suggest that the cosine feature function-based DRFF-KLMS is appropriate for applications requiring high accuracy, whereas the exponential function-based DRFF-KLMS may be well suited for resource-constrained applications.
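As a rough illustration of why random Fourier features give a fixed network structure, the following minimal sketch runs an LMS update on a fixed-length cosine feature vector. It is a plain floating-point software model, not the DRFF-KLMS hardware architecture from the paper. The class name `RFFKLMS`, the Gaussian-kernel sampling of the frequencies, the toy target `sin(2x)`, and all parameter values are assumptions made for illustration.

```python
import math
import random


class RFFKLMS:
    """LMS on a fixed-size random Fourier feature map (Gaussian kernel assumed)."""

    def __init__(self, input_dim, num_features=64, sigma=1.0, step_size=0.5, seed=0):
        rng = random.Random(seed)
        self.num_features = num_features
        self.step_size = step_size
        # Frequencies drawn from N(0, 1/sigma^2), phases from U(0, 2*pi).
        self.omegas = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(input_dim)]
                       for _ in range(num_features)]
        self.phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
        # The weight vector never grows: its length is fixed at num_features.
        self.weights = [0.0] * num_features

    def features(self, x):
        """Cosine feature functions: sqrt(2/D) * cos(omega . x + b)."""
        scale = math.sqrt(2.0 / self.num_features)
        return [scale * math.cos(sum(w * xi for w, xi in zip(omega, x)) + b)
                for omega, b in zip(self.omegas, self.phases)]

    def predict(self, x):
        z = self.features(x)
        return sum(w * zi for w, zi in zip(self.weights, z))

    def update(self, x, d):
        """One LMS step on the feature vector; returns the a priori error."""
        z = self.features(x)
        error = d - sum(w * zi for w, zi in zip(self.weights, z))
        self.weights = [w + self.step_size * error * zi
                        for w, zi in zip(self.weights, z)]
        return error


def demo(num_samples=3000, seed=1):
    """Identify a toy nonlinear system and compare early vs. late squared error."""
    rng = random.Random(seed)
    filt = RFFKLMS(input_dim=1)
    sq_errors = []
    for _ in range(num_samples):
        x = [rng.uniform(-2.0, 2.0)]
        d = math.sin(2.0 * x[0])  # hypothetical nonlinear system
        sq_errors.append(filt.update(x, d) ** 2)
    early = sum(sq_errors[:200]) / 200
    late = sum(sq_errors[-200:]) / 200
    return early, late
```

Calling `demo()` returns the mean squared error over the first and last 200 samples. Unlike a conventional kernel adaptive filter, the model stores only `num_features` weights no matter how many samples it processes.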
Application mapping onto manycore processor architectures using active search framework
(IEEE, 2023-02) Sambangi, Ramesh
Finding an optimal application mapping solution in a manycore processor is an NP-hard problem. Heuristic search techniques have the advantage of finding near-optimal solutions faster than other methods when mapping large-scale applications. However, the majority of the heuristic-based application mapping methods easily fall into local minima. Machine learning (ML) methods can learn heuristics from training data on their own, require minimal assistance from humans, and produce better mapping solutions. Recently, a reinforcement learning-based framework (RLF) has been proposed to generate the initial population for metaheuristics, designed using genetic algorithm (GA) and particle swarm optimization (PSO). RLF does not incorporate reward information while generating mapping solutions, although the model performance can be improved further by refining the network parameters using the reward information during predictions. To overcome this limitation, we propose an active search framework (ASF). For the first time, we propose a new intellectual property (IP)-core numbering scheme, which will assist ASF in learning the mapping rules more effectively. We demonstrate that REINFORCE with multiple samples (predictions) per data point improves model accuracy and reduces variance by constructing a baseline using these samples. With these, we propose two RL models: active search (ATSR) and active search with pretraining (ATSRP). According to experimental results, both ATSRP and ATSR models produce better mapping solutions compared to RLF and other state-of-the-art methods. The results suggest that the ATSRP model is better suited for performing application mapping onto a 2-D mesh-based manycore processor. Finally, we extend this framework to other performance metrics and 3-D mesh-based manycore processors.
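The idea of refining policy parameters at prediction time with REINFORCE and a baseline built from multiple samples can be sketched on a toy problem. This is not the ASF, ATSR, or ATSRP model from the paper. The softmax policy over a handful of candidate mappings, the use of the sample mean as the baseline, and the names `active_search` and `sample_baseline_advantages` are all assumptions made for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_baseline_advantages(costs):
    """Subtract the mean cost of the sampled solutions (assumed baseline)."""
    baseline = sum(costs) / len(costs)
    return [c - baseline for c in costs]


def active_search(candidate_costs, num_samples=8, steps=100, lr=0.1, seed=0):
    """Refine a softmax policy on one problem instance using its own rewards.

    candidate_costs[i] is the (hypothetical) cost of candidate mapping i.
    Returns the best cost seen and the final policy probabilities.
    """
    rng = random.Random(seed)
    logits = [0.0] * len(candidate_costs)
    best_cost = float("inf")
    for _ in range(steps):
        probs = softmax(logits)
        # Draw several solutions for the same instance.
        actions = rng.choices(range(len(probs)), weights=probs, k=num_samples)
        costs = [candidate_costs[a] for a in actions]
        best_cost = min(best_cost, min(costs))
        advantages = sample_baseline_advantages(costs)
        # REINFORCE gradient of expected cost w.r.t. logits:
        # mean over samples of advantage * (onehot(action) - probs).
        grad = [0.0] * len(logits)
        for a, adv in zip(actions, advantages):
            for j in range(len(logits)):
                indicator = 1.0 if j == a else 0.0
                grad[j] += adv * (indicator - probs[j]) / num_samples
        # Gradient descent, since lower cost is better.
        logits = [v - lr * g for v, g in zip(logits, grad)]
    return best_cost, softmax(logits)
```

For example, `active_search([9.0, 4.0, 7.0, 6.0])` returns the lowest cost found together with the refined probabilities. Because the baseline comes from the samples themselves, no separate critic network is needed in this sketch.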
Congestion-aware vertical link placement and application mapping onto 3-D network-on-chip architectures
(IEEE, 2024-02) Sambangi, Ramesh
3-D Network-on-Chip (NoC) technology has emerged as a compelling solution in modern System-on-Chip (SoC) designs. This NoC technology effectively addresses the escalating need for high-performance and energy-efficient on-chip communication in various applications, including high-performance computing (HPC), graphics processing units (GPUs), and multiprocessor SoCs (MPSoCs). However, the efficient mapping of applications onto 3-D NoCs remains a complex challenge, necessitating the development of improved algorithms. In this context, we present a novel neural mapping model with a reinforcement learning (RL) approach (NeurMap3D) to design application-specific 3-D NoC-based ICs. Additionally, we propose the neural congestion-aware through-silicon via (TSV) placement and application mapping (NCTPAM) approach, which not only addresses application mapping but also incorporates TSV placement and load balancing across the TSVs for the specific application. To reduce the CPU execution time of the NCTPAM algorithm, we propose incorporating a partial model parameter (θ) update mechanism. Experimental results indicate improved performance in terms of minimizing communication cost, load balancing across TSVs, and energy consumption, highlighting the potential of our approach to enhance the efficiency of these synthesized network architectures.
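The abstract does not define communication cost. A common formulation in the mapping literature, assumed here, is the sum over communicating core pairs of bandwidth times hop count. The sketch below evaluates that quantity on a 3-D mesh and assumes every tile has a vertical link, so it ignores the TSV placement and load balancing that NCTPAM addresses. The function names, the row-major tile numbering, and the example traffic values are hypothetical.

```python
def mesh_coords(tile, dims):
    """Convert a tile index to (x, y, z), assuming row-major numbering per layer."""
    x_dim, y_dim, _ = dims
    z, rem = divmod(tile, x_dim * y_dim)
    y, x = divmod(rem, x_dim)
    return x, y, z


def hop_count(tile_a, tile_b, dims):
    """Manhattan distance between two tiles of a fully connected 3-D mesh."""
    a = mesh_coords(tile_a, dims)
    b = mesh_coords(tile_b, dims)
    return sum(abs(p - q) for p, q in zip(a, b))


def communication_cost(traffic, mapping, dims):
    """Sum of bandwidth * hop count over all communicating core pairs.

    traffic: {(src_core, dst_core): bandwidth}
    mapping: {core: tile index}
    """
    return sum(bw * hop_count(mapping[src], mapping[dst], dims)
               for (src, dst), bw in traffic.items())


# Hypothetical 3-core application on a 2 x 2 x 2 mesh.
dims = (2, 2, 2)
traffic = {(0, 1): 10, (1, 2): 5}
far_mapping = {0: 0, 1: 1, 2: 7}   # core 2 sits two hops from core 1
near_mapping = {0: 0, 1: 1, 2: 5}  # core 2 sits directly above core 1
```

With these values, `communication_cost(traffic, far_mapping, dims)` evaluates to 20 and `communication_cost(traffic, near_mapping, dims)` to 15, so a mapping search would prefer the second placement under this metric.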
LPNet: a DNN based latency prediction technique for application mapping in Network-on-Chip design
(Elsevier, 2021-11) Sambangi, Ramesh
Analytical models used for latency estimation in Network-on-Chip (NoC) design do not deliver reliable accuracy, which makes them difficult to use for optimization in design space exploration. In this paper, we propose a learning-based model that uses a deep neural network (DNN) for latency prediction. Input features for the DNN model are collected from an analytical model as well as from the Booksim simulator. The DNN model is then adopted in the mapping optimization loop to predict the best mapping for a given combination of application and NoC parameters. Our simulations show that, with the proposed DNN model, the prediction error is less than 12% for both synthetic and application-specific traffic. A speedup of more than 108 times can be achieved using DPSO with the DNN model compared to DPSO with the Booksim simulator.