Self-Recovering Multi-View Edge Tracker (SMVET): A Transformer-Augmented Lightweight Architecture for Occlusion-Resilient Real-Time Surveillance at the Edge

Sungho Jeon; Hyunjae Lee; Hee-Seob Kim; Yeonjin Kim

Authors

Sungho Jeon, Department of Electrical and Computer Engineering, Seoul National University, Seoul 08826, Korea
Hyunjae Lee, Department of Electrical and Computer Engineering, Seoul National University, Seoul 08826, Korea
Hee-Seob Kim, Department of Electrical and Computer Engineering, Seoul National University, Seoul 08826, Korea
Yeonjin Kim, Department of Electrical and Computer Engineering, Seoul National University, Seoul 08826, Korea

Keywords:

Edge AI, Object Tracking, Occlusion Handling, Transformers, Real-Time Surveillance, Multi-View Vision, Lightweight Architecture, Self-Recovery, Deep Learning, Embedded Vision

Abstract

This paper proposes the Self-Recovering Multi-View Edge Tracker (SMVET), a lightweight, edge-friendly object tracking framework designed to handle occlusion and viewpoint change in real-time surveillance systems. SMVET combines a Transformer-enhanced feature fusion layer with a Siamese tracking backbone, adapting dynamically to occlusion through self-recovery modes and multi-view contextualization. Quantized attention blocks and a low-power design philosophy keep the architecture efficient to deploy on edge devices.

A thorough evaluation on the MOT17 and UAV123 datasets shows that SMVET improves re-identification accuracy by 23 percent under severe occlusion and improves latency by 31 percent relative to leading tracking models. In addition, the system maintains real-time inference at a power consumption of 1.5 W, which makes it well suited to embedded surveillance and autonomous drones. The proposed framework offers an efficient and flexible tool for intelligent edge-based tracking in dynamic, resource-constrained settings. It is also the first to integrate quantized Transformer fusion, temporal self-recovery, and multi-view alignment into an end-to-end lightweight tracker deployable at the edge. This combination enables robust, power-efficient tracking under occlusion, motion blur, and viewpoint change, providing a scalable solution for intelligent surveillance in dynamic settings.
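
The abstract does not spell out how the quantized attention fusion works, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes symmetric int8 quantization of the query and key vectors and a single-head scaled dot-product attention that weights per-view feature vectors; the function names `quantize_int8`, `softmax`, and `fuse_views` are hypothetical and chosen for illustration.

```python
import math

def quantize_int8(vec):
    """Symmetric per-vector int8 quantization (assumed scheme): returns (ints, scale)."""
    peak = max(abs(x) for x in vec)
    scale = peak / 127.0 if peak > 0 else 1.0
    return [round(x / scale) for x in vec], scale

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_views(query, view_feats):
    """Fuse per-view feature vectors with scaled dot-product attention.

    The query and each view's key are quantized to int8, so the dot
    product is accumulated in integers and rescaled afterwards. The
    values (the view features themselves) stay in floating point.
    """
    q_int, q_scale = quantize_int8(query)
    scores = []
    for feat in view_feats:
        k_int, k_scale = quantize_int8(feat)
        dot = sum(a * b for a, b in zip(q_int, k_int))  # integer accumulation
        scores.append(dot * q_scale * k_scale / math.sqrt(len(query)))
    weights = softmax(scores)
    fused = [
        sum(w * feat[i] for w, feat in zip(weights, view_feats))
        for i in range(len(query))
    ]
    return weights, fused

# Hypothetical example: the target template matches view 0 better than view 1.
weights, fused = fuse_views([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

In this toy setting the view whose features align with the query receives the larger attention weight, which is the behaviour a multi-view tracker would rely on when one camera's view of the target is occluded.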
