Abstract
In this paper we present a scalable dataflow hardware architecture optimized for the computation of general-purpose vision algorithms, neuFlow, and a dataflow compiler, luaFlow, that transforms high-level flow-graph representations of these algorithms into machine code for neuFlow. This system was designed with the goal of providing real-time detection, categorization and localization of objects in complex scenes, while consuming 10 watts when implemented on a Xilinx Virtex 6 FPGA platform, or about ten times less than a laptop computer, and producing speedups of up to 100 times in real-world applications.
We present an application of the system to street scene analysis, segmenting 20 categories on 500 × 375 frames at 12 frames per second on our custom hardware neuFlow.