A 201.4 GOPS 496 mW Real-Time Multi-Object Recognition Proce


IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 45, NO. 1, JANUARY 2010

A 201.4 GOPS 496 mW Real-Time Multi-Object Recognition Processor With Bio-Inspired Neural Perception Engine

Joo-Young Kim, Student Member, IEEE, Minsu Kim, Student Member, IEEE, Seungjin Lee, Student Member, IEEE, Jinwook Oh, Student Member, IEEE, Kwanho Kim, Student Member, IEEE, and Hoi-Jun Yoo, Fellow, IEEE

Abstract—A 201.4 GOPS real-time multi-object recognition processor is presented with a three-stage pipelined architecture. A visual-perception-based multi-object recognition algorithm is applied to give multiple attentions to multiple objects in the input image. For human-like multi-object perception, a neural perception engine is proposed with biologically inspired neural networks and fuzzy logic circuits. In the proposed hardware architecture, three recognition tasks (visual perception, descriptor generation, and object decision) are directly mapped to the neural perception engine, 16 SIMD processors including 128 processing elements, and a decision processor, respectively, and executed in a pipeline to maximize the throughput of object recognition. For efficient task pipelining, the proposed task/power manager balances the execution times of the three stages based on intelligent workload estimations. In addition, a 118.4 GB/s multi-casting network-on-chip is proposed as the communication architecture, incorporating 21 IP blocks overall. For low-power object recognition, workload-aware dynamic power management is performed at the chip level. The 49 mm² chip is fabricated in a 0.13 µm 8-metal CMOS process and contains 3.7 M gates and 396 KB of on-chip SRAM. It achieves 60 frame/sec multi-object recognition of up to 10 different objects for VGA (640 × 480) video input while dissipating 496 mW at 1.2 V. The obtained 8.2 mJ/frame energy efficiency is 3.2 times higher than that of the state-of-the-art recognition processor.

Index Terms—Multi-casting network-on-chip, multimedia processor, multi-object recognition, neural perception engine, visual perception, workload-aware dynamic power management, three-stage pipelined architecture.
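As a quick sanity check, the energy-per-frame figure in the abstract can be related to the stated power and frame rate. The calculation below assumes that 496 mW is the average power while processing 60 frame/sec, which the abstract implies but does not state explicitly.

```latex
\[
E_{\text{frame}} = \frac{P}{f_{\text{frame}}}
= \frac{496\ \text{mW}}{60\ \text{frame/s}}
\approx 8.27\ \text{mJ/frame}
\]
```

Under that assumption the result is close to the quoted 8.2 mJ/frame.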

Manuscript received May 04, 2009; revised July 22, 2009 and September 01, 2009. Current version published December 23, 2009. This paper was approved by Guest Editor Kazutami Arimoto.

The authors are with the Department of Electrical Engineering and Computer Science, Korea Advanced Institute of Science and Technology, Daejeon 305-701, Korea (e-mail: trample7@eeinfo.kaist.ac.kr).

Color versions of one or more of the figures in this paper are available online at http://www.77cn.com.cn.

Digital Object Identifier 10.1109/JSSC.2009.2031768

I. INTRODUCTION

OBJECT recognition is a fundamental technology for intelligent vision applications such as autonomous cruise control, mobile robot vision, and surveillance systems [1]–[5]. Usually, it contains not only pixel-based image processing for object feature extraction but also vector database matching for the final object decision [6]. For object recognition, first, various scale spaces are generated by cascaded filtering of the input video stream. Then, key-points are extracted among neighboring scale spaces by local maxima/minima search [...]. Last, the final recognition is made by nearest neighbor matching with a pre-defined object database that generally includes over ten thousand object descriptor vectors. Since each stage of object recognition requires a huge amount of computation, real-time operation is hard to achieve with a single general-purpose CPU [3].

To achieve real-time performance over 20 frame/sec with low power consumption under 1 W, many multi-core based vision processors have been developed [1]–[5]. In massively parallel single instruction multiple data (SIMD) processors [1], [2], hundreds of processing elements (PEs) are employed to maximize data-level parallelism for per-pixel image operations such as image filtering and histogram computation. However, their identical operations are not suitable for key-point or object level operations such as descriptor vector generation and database matching. On the other hand, the multi-core processor of [3] exploits coarse-grained PEs and a memory-centric network-on-chip (NoC) for task-level parallelism over data-level parallelism; however, it cannot provide enough computing power for real-time object recognition due to its data synchronization overhead.

Unlike the previous processors, a NoC-based parallel processor [4] adopts a visual attention engine (VAE) [7] to reduce the computational complexity of object recognition. Motivated by the human visual system, the VAE selects meaningful key-points out of the extracted ones to give attention to them before the main object recognition processing described above. Although it reduces the execution time of the whole object recognition, its performance is still limited because visual attention, object feature extraction and descriptor generation, and database matching are performed in series in the time domain due to their unbalanced workloads.
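The limitation of running unbalanced stages in series can be made concrete with a small calculation. The sketch below is illustrative only: the per-stage times are hypothetical values chosen for the example, not measurements from [4] or from this work, and the function names are invented here. It compares the frame rate when three stages run back to back with the steady-state frame rate when they overlap across frames, where the slowest stage sets the pace.

```python
def serial_frame_time(stage_times_ms):
    """Time per frame when the stages run one after another."""
    return sum(stage_times_ms)


def pipelined_frame_time(stage_times_ms):
    """Steady-state time per frame when stages overlap across frames.

    In a frame-level pipeline a new frame completes every time the
    slowest stage finishes, so the longest stage is the bottleneck.
    """
    return max(stage_times_ms)


def frames_per_second(frame_time_ms):
    """Convert a per-frame time in milliseconds to a frame rate."""
    return 1000.0 / frame_time_ms


# Hypothetical stage times in ms for (attention/perception,
# feature extraction and descriptor generation, database matching).
# Both cases have the same total work of 40 ms per frame.
unbalanced = [4.0, 30.0, 6.0]
balanced = [13.0, 14.0, 13.0]

for name, stages in (("unbalanced", unbalanced), ("balanced", balanced)):
    serial = frames_per_second(serial_frame_time(stages))
    piped = frames_per_second(pipelined_frame_time(stages))
    print(f"{name}: serial {serial:.1f} frame/s, pipelined {piped:.1f} frame/s")
```

With these assumed numbers, serial execution gives 25.0 frame/s in both cases, pipelining the unbalanced stages gives 33.3 frame/s, and pipelining the balanced stages gives 71.4 frame/s. The gain from pipelining therefore depends strongly on how evenly the work is divided among the stages.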

In this work, we propose a real-time low-power multi-object recognition processor with a three-stage pipelined architecture. The previous visual attention is enhanced to visual perception to give multiple attentions to multiple objects in the input image. For human-like multi-object perception, a neural perception engine is proposed with biologically inspired neural networks and fuzzy logic circuits. The processor adopts a three-stage pipelined architecture to maximize the throughput of object recognition. The three object recognition tasks mentioned above are pipelined at the frame level, and their execution times are balanced based on intelligent workload estimations to improve

0018-9200/$26.00 © 2009 IEEE

