2012--Super-Bit Locality-Sensitive Hashing(6)

2021-01-20 18:04

than SRP-LSH, which verifies the correctness of Corollary 3 to some extent. Furthermore, Figure 2 shows that even when θ_{a,b} ∈ (π/2, π], SBLSH still has a smaller variance.

2.3 Discussion

From Corollary 1, SBLSH provides an unbiased estimate of angular similarity. From Corollary 3, when θ_{a,b} ∈ (0, π/2], with the same length of binary code, the variance of SBLSH is strictly smaller than that of SRP-LSH. In real applications, many vector representations are confined to the non-negative orthant, with all vector entries being non-negative, e.g., bag-of-words representations of documents and images, and histogram-based representations like the SIFT local descriptor [18]. Usually they are normalized to unit length, with only their orientations maintained. For this kind of data, the angle between any two different samples is limited to (0, π/2], and thus SBLSH will provide more accurate estimation than SRP-LSH on such data. In fact, our later experiments show that even when θ_{a,b} is not constrained to (0, π/2], SBLSH still gives a much more accurate estimate of angular similarity.

3 Experimental Results

We conduct two sets of experiments, angular similarity estimation and approximate nearest neighbor (ANN) retrieval, to evaluate the effectiveness of our proposed SBLSH method. In the first set of experiments we directly measure the accuracy in estimating pairwise angular similarity. The second set of experiments then tests the performance of SBLSH in real retrieval applications.

3.1 Angular Similarity Estimation

In this experiment, we evaluate the accuracy of estimating pairwise angular similarity on several datasets. Specifically, we test the effect on the estimation accuracy when the Super-Bit depth N varies and the code length K is fixed, and vice versa. For each preprocessed dataset D, we get D_LSH after performing SRP-LSH, and D_SBLSH after performing the proposed SBLSH. We compute the angles between each pair of samples in D, and the corresponding Hamming distances in D_LSH and D_SBLSH. We then compute the mean squared error between the true angle and the approximated angles from D_LSH and D_SBLSH respectively. Note that after computing the Hamming distance, we divide the result by C = K/π to get the approximated angle.
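The estimation step described above can be sketched for a single pair of vectors as follows. This is a minimal illustration using plain SRP-LSH only, not the authors' code: the function names, the toy dimension and code length, and the random test vectors are all assumptions made for the example.

```python
import math
import random


def srp_hash(vector, projections):
    """Sign-random-projection code: one bit per projection vector."""
    return [1 if sum(w * x for w, x in zip(proj, vector)) >= 0 else 0
            for proj in projections]


def hamming(code_a, code_b):
    """Number of positions at which two binary codes differ."""
    return sum(bit_a != bit_b for bit_a, bit_b in zip(code_a, code_b))


def estimate_angle(code_a, code_b):
    """Divide the Hamming distance by C = K / pi to approximate the angle."""
    k = len(code_a)
    return hamming(code_a, code_b) / (k / math.pi)


def true_angle(a, b):
    """Exact angle between two vectors, in radians."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / (norm_a * norm_b))))


rng = random.Random(0)
dim, k = 16, 2000  # toy sizes chosen for the example, not the paper's settings
projections = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(k)]
a = [rng.random() for _ in range(dim)]
b = [rng.random() for _ in range(dim)]

theta = true_angle(a, b)
theta_hat = estimate_angle(srp_hash(a, projections), srp_hash(b, projections))
squared_error = (theta_hat - theta) ** 2  # one term of the MSE
```

Averaging `squared_error` over all sampled pairs would give the MSE for one method. The same `estimate_angle` would apply to SBLSH codes, since only the projection vectors differ.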

3.1.1 Datasets and Preprocessing

We conduct the experiment on the following datasets:

1) Photo Tourism patch dataset¹ [24], Notre Dame, which contains 104,106 patches, each of which is represented by a 128D SIFT descriptor (Photo Tourism SIFT); and
2) MIR-Flickr², which contains 25,000 images, each of which is represented by a 3125D bag-of-SIFT-feature histogram.

For each dataset, we further conduct a simple preprocessing step as in [12], i.e., mean-centering each data sample, so as to obtain additional mean-centered versions of the above datasets, Photo Tourism SIFT (mean) and MIR-Flickr (mean). The experiments on these mean-centered datasets test the performance of SBLSH when the angles of data pairs are not constrained to (0, π/2].
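The following sketch illustrates why mean-centering removes the angle constraint. It assumes that "mean-centering" means subtracting the dataset mean vector from every sample, which is one reading of the step in [12]. The random non-negative "histograms" and all function names are made up for the example.

```python
import math
import random


def angle(a, b):
    """Angle between two vectors, in radians."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def mean_center(data):
    """Subtract the dataset mean vector from every sample (assumed reading)."""
    n, dim = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(dim)]
    return [[row[j] - mean[j] for j in range(dim)] for row in data]


def max_pairwise_angle(data):
    """Largest angle over all unordered pairs of samples."""
    return max(angle(data[i], data[j])
               for i in range(len(data)) for j in range(i + 1, len(data)))


rng = random.Random(0)
# Synthetic non-negative data standing in for histogram features.
histograms = [[rng.random() for _ in range(8)] for _ in range(50)]

max_raw = max_pairwise_angle(histograms)                    # at most pi/2
max_centered = max_pairwise_angle(mean_center(histograms))  # exceeds pi/2
```

Non-negative vectors always have a non-negative inner product, so `max_raw` cannot exceed π/2. The centered samples sum to the zero vector, so at least one pair must have a negative inner product and hence an angle above π/2.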

3.1.2 The Effect of Super-Bit Depth N and Code Length K

In each dataset, for each (N, K) pair, i.e., Super-Bit depth N and code length K, we randomly sample 10,000 data points, which yield about 50,000,000 data pairs, and randomly generate SRP-LSH functions, together with SBLSH functions obtained by orthogonalizing the generated SRP vectors in batches. We repeat the test 10 times and compute the mean squared error (MSE) of the estimation.
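A rough sketch of generating SBLSH projection vectors by batch-wise orthogonalization is given below. It uses Gram-Schmidt as the orthogonalization procedure and assumes that K is a multiple of N. The function names and the toy sizes are illustrative, not taken from the authors' implementation.

```python
import math
import random


def gram_schmidt(vectors):
    """Orthonormalize linearly independent vectors (modified Gram-Schmidt)."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            # Remove the component of w along each earlier basis vector.
            coeff = sum(wi * qi for wi, qi in zip(w, q))
            w = [wi - coeff * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        basis.append([wi / norm for wi in w])
    return basis


def super_bit_projections(dim, depth_n, code_length_k, rng):
    """Draw K Gaussian vectors and orthogonalize them in batches of N."""
    if depth_n > dim:
        raise ValueError("Super-Bit depth N cannot exceed the data dimension")
    if code_length_k % depth_n != 0:
        raise ValueError("this sketch assumes K is a multiple of N")
    projections = []
    for _ in range(code_length_k // depth_n):
        batch = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                 for _ in range(depth_n)]
        projections.extend(gram_schmidt(batch))
    return projections


def sign_code(vector, projections):
    """Binary code: the sign of the projection onto each vector."""
    return [1 if sum(w * x for w, x in zip(p, vector)) >= 0 else 0
            for p in projections]


rng = random.Random(0)
projections = super_bit_projections(dim=8, depth_n=4, code_length_k=12, rng=rng)
code = sign_code([0.5, 0.1, 0.0, 0.3, 0.2, 0.9, 0.4, 0.7], projections)
```

Vectors inside one batch of N are mutually orthogonal, while vectors from different batches remain independent random draws. With N = 1 the procedure reduces to ordinary SRP-LSH with unit-length projection vectors.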

To test the effect of Super-Bit depth N, we fix K = 120 for Photo Tourism SIFT and K = 3000 for MIR-Flickr respectively, and to test the effect of code length K, we fix N = 120 for Photo Tourism SIFT and N = 3000 for MIR-Flickr. We repeat the experiment on the mean-centered versions of these datasets, and denote the methods by Mean+SRP-LSH and Mean+SBLSH respectively.

¹ http://phototour.cs.washington.edu/patches/default.htm
² http://users.ecs.soton.ac.uk/jsh2/mirflickr/

