A 201.4 GOPS real-time multi-object recognition processor is presented with a three-stage pipelined architecture. A visual-perception-based multi-object recognition algorithm is applied to give multiple attentions to multiple objects in the input image. For human-like multi-object perception, a neural perception engine is proposed with biologically inspired neural networks and fuzzy logic circ
KIM et al.: A 201.4 GOPS 496 mW REAL-TIME MULTI-OBJECT RECOGNITION PROCESSOR WITH BIO-INSPIRED NEURAL PERCEPTION ENGINE
Fig. 9. SIMD processor unit and its dual-issued VLIW PE.
paths for data processing operations such as ALU, shift, multiply, and multiply-and-accumulation (MAC), and data transfer operations such as load and store. A 51-bit dual-issued VLIW instruction enables parallel execution of a data processing operation and a data transfer operation in every cycle. Using its own register file with five read ports and three write ports, the PE can execute complex image processing instructions such as two-way multiply/MAC, three-operand min/max compare, and 32-bit accumulation in a single cycle. The register files of the other PEs can be directly accessed for window-based image processing. In addition, each PE can conditionally execute the same instruction using its independently managed status register.

C. Decision Processor
The object decision stage is composed of repeated vector matching processes that search the object database for the nearest vector to each input descriptor. This repeated vector matching can be a performance bottleneck because the distance calculations between the input vector and each of the thousands of vectors in the database require a lot of processing time. In the proposed processor, the DP accelerates the vector matching so that the object decision stage operates at a frame rate of over 60 frame/sec for a database including more than 15,000 vectors. To reduce the search region of the database without accuracy loss, the DP exploits the H-VQ algorithm presented in the previous vector matching processor [12]. However, as shown in Fig. 10, the hardware is redesigned with two modifications to increase the throughput of vector matching. First, the H-VQ algorithm is performed with a dedicated three-stage pipelined datapath for vector distance calculation and comparison. Second, the bandwidth of the database vector memory is doubled, from 256-bit to 512-bit.

For the vector matching operations of the DP, descriptor vectors are first gathered from the SPUs into the feature vector memory. Then, the H-VQ algorithm is performed by a controller with the dedicated datapath. Once an input inquiry vector is set, the DP can obtain the index of the minimum-distance vector simply by reading vectors from the database memory, because the distance calculations and minimum vector updates are automatically performed in the pipelined datapath stages. Since the DP can read two 256-bit vectors from the database memory in a single cycle, the throughput of the DP is two vector distance calculations per cycle at 200 MHz. Overall, the DP matches 256 descriptor vectors with a 16384-entry database within 3M cycles.
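The throughput figures above can be checked with a short Python sketch. It is a behavioral model only, not the DP hardware: it performs an exhaustive nearest-vector search rather than H-VQ (whose search-region reduction is not detailed in this section), it assumes a sum-of-absolute-differences distance (the metric is not stated here), and it ignores pipeline fill latency and controller overhead. The names `l1_distance`, `nearest_vector`, `exhaustive_cycles`, and `frames_per_second` are hypothetical and introduced only for illustration.

```python
CLOCK_HZ = 200_000_000   # DP clock frequency given in the text
VECTORS_PER_CYCLE = 2    # two 256-bit vectors per 512-bit database read


def l1_distance(a, b):
    """Sum of absolute differences (assumed metric, not stated in the text)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest_vector(query, database):
    """Exhaustive search with a running-minimum update.

    Returns (best_index, best_distance, cycles), where one cycle is counted
    per database read of up to VECTORS_PER_CYCLE vectors.
    """
    best_index, best_distance = -1, float("inf")
    cycles = 0
    for base in range(0, len(database), VECTORS_PER_CYCLE):
        cycles += 1  # one database memory read
        last = min(base + VECTORS_PER_CYCLE, len(database))
        for index in range(base, last):
            distance = l1_distance(query, database[index])
            if distance < best_distance:  # minimum-vector update
                best_index, best_distance = index, distance
    return best_index, best_distance, cycles


def exhaustive_cycles(num_queries, database_size):
    """Read cycles needed if every query scans the whole database."""
    reads_per_query = -(-database_size // VECTORS_PER_CYCLE)  # ceiling division
    return num_queries * reads_per_query


def frames_per_second(cycles_per_frame):
    """Frame rate if one frame's matching takes cycles_per_frame cycles."""
    return CLOCK_HZ / cycles_per_frame
```

Under these assumptions, `exhaustive_cycles(256, 16384)` gives 256 × 8192 = 2,097,152 read cycles, which is inside the 3M-cycle budget quoted above. A budget of 3M cycles at 200 MHz corresponds to 15 ms per frame, or about 66.7 frame/sec, consistent with the stated rate of over 60 frame/sec.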