Thislemmacanbeprovenbythede nitionofisotropicdistribution,andweomitthedetailshere.Lemma3.Givenkvectorsv1,...,vk∈Rd,whicharesampledi.i.d.fromthenormaldistributionN(0,Id),andspanasubspaceSk,letPSkdenotetheorthogonalprojectionontoSk,thenPSkisarandommatrixuniformlydistributedontheGrassmannmanifoldGk,d k.
ThislemmacanbeprovenbyapplyingtheTheorem2.2.1(iii),Theorem2.2.2(iii)in
[4].Lemma4.IfPisarandommatrixuniformlydistributedontheGrassmannmanifoldGk,d k,1≤k≤d,andv~N(0,Id)isindependentofP,thentherandomvectorv =Pvfollowsanisotropicdistribution.
FromtheuniformityofPontheGrassmannmanifoldandthepropertyofthenormaldistributionN(0,Id),wecangetthisresultdirectly.Wegiveasketchofproofbelow.
Proof.WecanwriteP=UUT,wherethecolumnsofU=[u1,u2,...,uk]constitutesanorthonor-malbasisofarandomk-dimensionalsubspace.Sincethestandardnormaldistributionis2-stableT[6],v =UTv=[v i~N(0,1),andit1,v2,...,vk]isanN(0,Ik)-distributedvector,whereeachvk iui.Sinceui,...,ukiseasytoverifythatv isindependentofU.Thereforev =Pv=Uv =Σi=1v
canbeanyorthonormalbasisofanyk-dimensionalsubspacewithequalprobabilitydensity,and{v followsanisotropicdistribution.1,v2,...,vk}arei.i.d.N(0,1)randomvariables,v
Theorem1.GivenNi.i.d.randomvectorsv1,v2,...,vN∈Rdsampledfromthenormaldistri-butionN(0,Id),where1≤N≤d,performtheGram-SchmidtprocessonthemandproduceNorthogonalizedvectorsw1,w2,...,wN,thenforanytwodatavectorsa,b∈Rd,byde ningNindicatorrandomvariablesX1,X2,...,XNas
1,hwi(a)=hwi(b)Xi=0,hwi(a)=hwi(b)
wehaveE[Xi]=θa,b/π,forany1≤i≤N.
Proof.DenoteSi 1thesubspacespannedby{w1,...,wi 1},andtheorthogonalprojectionontoits⊥⊥.Thenwi=PSv.Denotew¯=wi/ wi .orthogonalcomplementasPSi 1ii 1
Forany1≤i≤N,E[Xi]=Pr[Xi=1]=Pr[hwi(a)=hwi(b)]=Pr[hw¯(a)=hw¯(b)].Fori=1,byLemma2andLemma1,wehavePr[X1=1]=θa,b/π.
Forany1<i≤N,considerthedistributionofwi.Bylemma3,PSi 1isarandommatrix⊥uniformlydistributedontheGrassmannmanifoldGi 1,d i+1,thusPS=I PSi 1isuniformlyi 1distributedonGd i+1,i 1.Sincevi~N(0,Id)isindependentofv1,v2,...,vi 1,viisindependent
⊥⊥vfollowsanisotropicdistribution.ByLemmaofPS.ByLemma4,wehavethatwi=PSi 1i 1i
2,w¯=wi/ wi isuniformlydistributedontheunitsphereinRd.ByLemma1,Pr[hw¯(a)=hw¯(b)]=θa,b/π.
Corollary1.ForanySuper-BitdepthN,1≤N≤d,assumingthatthecodelengthK=N×L,theHammingdistancedHamming(h(a),h(b))isanunbiasedestimateofθa,b,foranytwodatavectorsaandb∈Rd,uptoaconstantscalefactorC=K/π.
Proof.ApplyTheorem1andwegetE[dHamming(h(a),h(b))]=L×E[ΣNi=1Xi]=L×
Kθa,bNΣN=Cθa,b.i=1E[Xi]=L×Σi=1θa,b/π=2.2Variance
Inthissubsectionweprovethatwhentheangleθa,b∈(0,π/2],thevarianceofSBLSHisstrictlysmallerthanthatofSRP-LSH.
Lemma5.Fortherandomvariables{Xi}de nedinTheorem1,wehavethefollowingequalityPr[Xi=1|Xj=1]=Pr[Xi=1|X1=1],1≤j<i≤N≤d.