There is ever-growing pressure to accelerate computer vision applications on embedded processors for a wide range of equipment, including mobile phones, network cameras, and automotive safety systems. Toward this goal, we propose a software library approach that eases common computational bottlenecks by optimizing over 60 low- and mid-level vision kernels. The library is optimized for a digital signal processor that is deployed in many embedded image and video processing systems, and it was designed to meet the high-performance and low-power requirements typical of such systems. The algorithms are implemented in fixed-point arithmetic and support block-wise partitioning of video frames so that a direct memory access engine can efficiently move data between on-chip and external memory. We highlight the benefits of this library for a baseline video security application that segments moving foreground objects from a static background. Benchmarks show a ten-fold acceleration over a bit-exact yet unoptimized C language implementation, creating more computational headroom to embed other vision algorithms.
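
To make the fixed-point, block-wise processing style concrete, the following is a minimal sketch in plain C of a running-average background model with thresholded foreground segmentation. It is not the library's code or API: the function names `segment_block` and `segment_frame`, the Q8.8 background format, the 1/16 learning rate, and the threshold are all assumptions chosen for illustration. On a real DSP each block would be moved into on-chip memory by the DMA engine before processing; here ordinary pointers into the frame stand in for those transfers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BG_FRAC_BITS 8   /* background model kept in Q8.8 (assumed format) */
#define BG_RATE_DIV  16  /* learning rate of 1/16 (assumed value) */

/* Process one contiguous block of pixels (hypothetical kernel).
 * For each pixel: compare against the current background estimate,
 * write a 0/255 foreground mask, then blend the pixel into the model. */
void segment_block(const uint8_t *src, uint16_t *bg, uint8_t *mask,
                   size_t count, uint8_t threshold)
{
    for (size_t i = 0; i < count; ++i) {
        int32_t pixel = src[i];
        int32_t model = bg[i];                      /* Q8.8 value */

        /* Absolute difference against the integer part of the model. */
        int32_t diff = pixel - (model >> BG_FRAC_BITS);
        if (diff < 0) {
            diff = -diff;
        }
        mask[i] = (diff > threshold) ? 255 : 0;

        /* Fixed-point exponential running average:
         * model += (pixel_q8_8 - model) / 16.
         * Division is used instead of a shift so that negative
         * differences are handled portably. */
        int32_t target = pixel << BG_FRAC_BITS;
        bg[i] = (uint16_t)(model + (target - model) / BG_RATE_DIV);
    }
}

/* Walk a frame in blocks of `block_rows` rows (hypothetical driver).
 * In an embedded deployment each iteration would correspond to one
 * DMA transfer in and one transfer out. */
void segment_frame(const uint8_t *src, uint16_t *bg, uint8_t *mask,
                   size_t width, size_t height, size_t block_rows,
                   uint8_t threshold)
{
    assert(block_rows > 0);
    for (size_t y = 0; y < height; y += block_rows) {
        size_t remaining = height - y;
        size_t rows = (remaining < block_rows) ? remaining : block_rows;
        size_t offset = y * width;
        segment_block(src + offset, bg + offset, mask + offset,
                      rows * width, threshold);
    }
}
```

Because the kernel touches each pixel independently, the block size can be chosen purely to fit the available on-chip memory without changing the result, which is what makes a bit-exact comparison against an unpartitioned reference possible.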