2012--Super-Bit Locality-Sensitive Hashing(7)

2021-01-20 18:04

[Figure 3 plots; legend: SRP-LSH, SBLSH, Mean+SRP-LSH, Mean+SBLSH]

Figure 3: The effect of Super-Bit depth N (1 < N ≤ min(d, K)) with fixed code length K (K = N × L), and the effect of code length K with fixed Super-Bit depth N.

Table 1: ANN retrieval results, measured by the proportion of good neighbors within the query's Hamming ball of radius 3. Note that the code length K = 30.

Data      Notre Dame        Half Dome         Trevi
E2LSH     0.4675 ± 0.0900   0.4503 ± 0.0712   0.4661 ± 0.0849
SRP-LSH   0.7500 ± 0.0525   0.7137 ± 0.0413   0.7591 ± 0.0464
SBLSH     0.7845 ± 0.0352   0.7535 ± 0.0276   0.7891 ± 0.0329

Figure 3 shows that with a fixed code length K, as the Super-Bit depth N gets larger (1 < N ≤ min(d, K)), the MSE of SBLSH gets smaller and the gap between SBLSH and SRP-LSH gets larger. In particular, when N = K, an MSE reduction of over 30% can be observed on all the datasets. This verifies Corollary 2: when applying SBLSH, the best strategy is to set the Super-Bit depth N as large as possible, i.e. min(d, K). An informal explanation of this interesting phenomenon is that as the degree of orthogonality of the random projections gets higher, the code becomes more and more informative, and thus provides a better estimate. On the other hand, it can be observed that the performance on the mean-centered datasets is similar to that on the original datasets. This shows that even when the angle between each data pair is not constrained to (0, π/2], SBLSH still gives a much more accurate estimate.
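The comparison above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' experimental code. It assumes that SBLSH orthonormalizes each batch of N Gaussian projection vectors (QR is used here in place of explicit Gram-Schmidt; the two differ only by column signs, which do not change Hamming distances) and that the angle is estimated as π times the normalized Hamming distance. All function names, the synthetic vector pair, and the sizes in the demo are hypothetical.

```python
import numpy as np


def srp_projections(d, K, rng):
    """K i.i.d. Gaussian projection vectors, stored as columns (SRP-LSH)."""
    return rng.standard_normal((d, K))


def superbit_projections(d, N, L, rng):
    """L batches of N Gaussian vectors; each batch is orthonormalized.

    Assumption: reduced QR stands in for Gram-Schmidt. Column sign flips
    affect both hashed points identically, so Hamming distances are unchanged.
    """
    if not 1 <= N <= d:
        raise ValueError("Super-Bit depth N must satisfy 1 <= N <= d")
    batches = []
    for _ in range(L):
        G = rng.standard_normal((d, N))
        Q, _ = np.linalg.qr(G)  # Q is d x N with orthonormal columns
        batches.append(Q)
    return np.hstack(batches)  # d x K, with K = N * L


def hash_codes(X, W):
    """Binary codes: one sign bit per projection vector."""
    return X @ W >= 0


def estimate_angle(code_a, code_b):
    """Angle estimate: pi * (Hamming distance / code length)."""
    K = code_a.shape[-1]
    return np.pi * np.count_nonzero(code_a != code_b, axis=-1) / K


def angle_mse(make_projections, theta, d, trials, rng):
    """Monte Carlo MSE of the angle estimate for one pair at angle theta."""
    a = np.zeros(d)
    a[0] = 1.0
    b = np.zeros(d)
    b[0], b[1] = np.cos(theta), np.sin(theta)
    X = np.stack([a, b])
    errors = np.empty(trials)
    for t in range(trials):
        codes = hash_codes(X, make_projections(rng))
        errors[t] = estimate_angle(codes[0], codes[1]) - theta
    return float(np.mean(errors ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, K, theta = 32, 32, np.pi / 3  # illustrative sizes, not the paper's
    print("SRP-LSH     MSE:", angle_mse(lambda r: srp_projections(d, K, r), theta, d, 2000, rng))
    print("SBLSH (N=K) MSE:", angle_mse(lambda r: superbit_projections(d, K, 1, r), theta, d, 2000, rng))
```

Varying N in `superbit_projections` while holding K = N × L fixed gives a rough, synthetic analogue of the first experiment in Figure 3; the numbers it prints are not comparable to the paper's results on real datasets.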

Figure 3 also shows that with a fixed Super-Bit depth N, SBLSH significantly outperforms SRP-LSH. As the code length K increases, the accuracies of SBLSH and SRP-LSH both increase. The performance on the mean-centered datasets is similar to that on the original datasets.
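A short worked calculation may help explain why accuracy improves with K, at least for SRP-LSH. It relies on the standard SRP-LSH property that each bit differs between two vectors with probability θ/π, and it assumes the K bits are independent. The symbols θ (true angle), d_H (Hamming distance between the two codes) and θ̂ (estimated angle) are introduced here for illustration.

```latex
\Pr\bigl[h_i(a) \neq h_i(b)\bigr] = \frac{\theta}{\pi},
\qquad
d_H \sim \mathrm{Binomial}\!\left(K, \frac{\theta}{\pi}\right),
\qquad
\hat{\theta} = \frac{\pi\, d_H}{K}

\mathbb{E}[\hat{\theta}] = \theta,
\qquad
\mathrm{MSE}(\hat{\theta}) = \mathrm{Var}[\hat{\theta}]
  = \frac{\pi^2}{K^2}\, K\, \frac{\theta}{\pi}\left(1 - \frac{\theta}{\pi}\right)
  = \frac{\theta(\pi - \theta)}{K}
```

Under these assumptions the error of SRP-LSH shrinks in proportion to 1/K. The bits inside one Super-Bit are not independent, so this formula does not carry over to SBLSH directly.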

3.2 Approximate Nearest Neighbor Retrieval

In this subsection, we conduct an ANN retrieval experiment that compares SBLSH with two other widely used data-independent binary LSH methods: SRP-LSH and E2LSH (we use the binary version in [23]; the original version is in [1]). We use the datasets Notre Dame, Half Dome and Trevi from the Photo Tourism patch dataset [24], which is also used in [12, 10, 13] for ANN retrieval. We use the 128-D SIFT representation and normalize the vectors to unit norm. For each dataset, we randomly pick 1,000 samples as queries and use the remaining samples (around 100,000) as the corpus for the retrieval task. We define the good neighbors of a query q as the samples within the top 5% nearest neighbors (measured in Euclidean distance) of q. We adopt the evaluation criterion used in [12, 23], i.e. the proportion of good neighbors among the returned samples that lie within the query's Hamming ball of radius r. We set r = 3. Using code length K = 30, we repeat the experiment 10 times and take the mean of the results. For SBLSH, we fix the Super-Bit depth N = K = 30. Table 1 shows that SBLSH performs best among all these data-independent hashing methods.
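The evaluation criterion described above could be computed roughly as follows. This is a sketch under stated assumptions, not the evaluation code of [12, 23]: the function name and arguments are hypothetical, binary codes are taken to be 0/1 or boolean NumPy arrays, and queries whose Hamming ball is empty are skipped when averaging, which is an assumed convention that the text does not specify.

```python
import numpy as np


def good_neighbor_proportion(query_vecs, corpus_vecs, query_codes, corpus_codes,
                             radius=3, top_fraction=0.05):
    """Mean proportion of good neighbors inside each query's Hamming ball.

    A corpus sample is a good neighbor of a query if it is among the
    top `top_fraction` nearest corpus samples in Euclidean distance.
    """
    n_corpus = len(corpus_vecs)
    n_good = max(1, int(round(top_fraction * n_corpus)))
    proportions = []
    for q_vec, q_code in zip(query_vecs, query_codes):
        # Ground truth: indices of the n_good nearest corpus samples.
        dist = np.linalg.norm(corpus_vecs - q_vec, axis=1)
        good = np.zeros(n_corpus, dtype=bool)
        good[np.argpartition(dist, n_good - 1)[:n_good]] = True
        # Returned samples: those within Hamming distance `radius` of the query.
        hamming = np.count_nonzero(corpus_codes != q_code, axis=1)
        returned = hamming <= radius
        # Assumption: a query with an empty Hamming ball is skipped.
        if returned.any():
            proportions.append(good[returned].mean())
    return float(np.mean(proportions)) if proportions else float("nan")
```

With the setup in the text, `radius=3` and `top_fraction=0.05` would be applied to 30-bit codes, and the result averaged over 10 independent draws of the hash functions.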

4 Relations to Other Hashing Methods

There exist different kinds of LSH methods, e.g. bit-sampling LSH [9, 7] for Hamming distance and ℓ1-distance, min-hash [2] for the Jaccard coefficient, and p-stable-distribution LSH [6] for ℓp-distance when p ∈ (0, 2]. These data-independent methods are simple, and thus easy to integrate as a module in more complicated algorithms involving pairwise distance or similarity computation, e.g. nearest neighbor search. New data-independent methods for improving these original LSH methods have

