2012--Super-Bit Locality-Sensitive Hashing(4)

2021-01-20 18:04

Thislemmacanbeprovenbythede nitionofisotropicdistribution,andweomitthedetailshere.Lemma3.Givenkvectorsv1,...,vk∈Rd,whicharesampledi.i.d.fromthenormaldistributionN(0,Id),andspanasubspaceSk,letPSkdenotetheorthogonalprojectionontoSk,thenPSkisarandommatrixuniformlydistributedontheGrassmannmanifoldGk,d k.

ThislemmacanbeprovenbyapplyingtheTheorem2.2.1(iii),Theorem2.2.2(iii)in

[4].Lemma4.IfPisarandommatrixuniformlydistributedontheGrassmannmanifoldGk,d k,1≤k≤d,andv～N(0,Id)isindependentofP,thentherandomvectorv =Pvfollowsanisotropicdistribution.

FromtheuniformityofPontheGrassmannmanifoldandthepropertyofthenormaldistributionN(0,Id),wecangetthisresultdirectly.Wegiveasketchofproofbelow.

Proof.WecanwriteP=UUT,wherethecolumnsofU=[u1,u2,...,uk]constitutesanorthonor-malbasisofarandomk-dimensionalsubspace.Sincethestandardnormaldistributionis2-stableT[6],v =UTv=[v i～N(0,1),andit1,v2,...,vk]isanN(0,Ik)-distributedvector,whereeachvk iui.Sinceui,...,ukiseasytoverifythatv isindependentofU.Thereforev =Pv=Uv =Σi=1v

canbeanyorthonormalbasisofanyk-dimensionalsubspacewithequalprobabilitydensity,and{v followsanisotropicdistribution.1,v2,...,vk}arei.i.d.N(0,1)randomvariables,v

Theorem1.GivenNi.i.d.randomvectorsv1,v2,...,vN∈Rdsampledfromthenormaldistri-butionN(0,Id),where1≤N≤d,performtheGram-SchmidtprocessonthemandproduceNorthogonalizedvectorsw1,w2,...,wN,thenforanytwodatavectorsa,b∈Rd,byde ningNindicatorrandomvariablesX1,X2,...,XNas

1,hwi(a)=hwi(b)Xi=0,hwi(a)=hwi(b)

wehaveE[Xi]=θa,b/π,forany1≤i≤N.

Proof.DenoteSi 1thesubspacespannedby{w1,...,wi 1},andtheorthogonalprojectionontoits⊥⊥.Thenwi=PSv.Denotew¯=wi/ wi .orthogonalcomplementasPSi 1ii 1

Forany1≤i≤N,E[Xi]=Pr[Xi=1]=Pr[hwi(a)=hwi(b)]=Pr[hw¯(a)=hw¯(b)].Fori=1,byLemma2andLemma1,wehavePr[X1=1]=θa,b/π.

Forany1<i≤N,considerthedistributionofwi.Bylemma3,PSi 1isarandommatrix⊥uniformlydistributedontheGrassmannmanifoldGi 1,d i+1,thusPS=I PSi 1isuniformlyi 1distributedonGd i+1,i 1.Sincevi～N(0,Id)isindependentofv1,v2,...,vi 1,viisindependent

⊥⊥vfollowsanisotropicdistribution.ByLemmaofPS.ByLemma4,wehavethatwi=PSi 1i 1i

2,w¯=wi/ wi isuniformlydistributedontheunitsphereinRd.ByLemma1,Pr[hw¯(a)=hw¯(b)]=θa,b/π.

Corollary1.ForanySuper-BitdepthN,1≤N≤d,assumingthatthecodelengthK=N×L,theHammingdistancedHamming(h(a),h(b))isanunbiasedestimateofθa,b,foranytwodatavectorsaandb∈Rd,uptoaconstantscalefactorC=K/π.

Proof.ApplyTheorem1andwegetE[dHamming(h(a),h(b))]=L×E[ΣNi=1Xi]=L×

Kθa,bNΣN=Cθa,b.i=1E[Xi]=L×Σi=1θa,b/π=2.2Variance

Inthissubsectionweprovethatwhentheangleθa,b∈(0,π/2],thevarianceofSBLSHisstrictlysmallerthanthatofSRP-LSH.

Lemma5.Fortherandomvariables{Xi}de nedinTheorem1,wehavethefollowingequalityPr[Xi=1|Xj=1]=Pr[Xi=1|X1=1],1≤j<i≤N≤d.

共9页:

2012--Super-Bit Locality-Sensitive Hashing(4).doc 将本文的Word文档下载到电脑下载失败或者文档不完整，请联系客服人员解决！

下载这篇word文档