An Uncertainty-Aware Approach for Exploratory Microblog Retr(2)

2021-04-06 05:37

and one researcher in media and communications (C). The experts are experienced in retrieving data from microblogs. They also had experience using a method similar to the one described above. We conducted several interviews with them, mainly focusing on probing their needs and microblog retrieval process. We then identified the following high-level requirements based on their feedback.

R1 - Examining an initial set of salient microblog data. Both experts expressed the need for a ranking list of keyword search results. Keyword-based microblog retrieval results often include millions of posts and tens of thousands of users and hashtags. Thus, these results are too massive for analysts to quickly discover relevant data. The experts usually have to examine the data carefully and design a set of rules to filter out irrelevant data. As a result, they stated the need for a toolkit that can rank extracted posts, users, and hashtags to facilitate their data retrieval tasks. This need is consistent with the findings of previous research [16, 46].

R2 - Revealing relationships within microblog data. Previous research [11, 16, 25] has also indicated that the relationships within data help users locate interesting information more easily. Furthermore, the relationships among the three dimensions of microblog data (posts, users, and hashtags) can assist them in extracting salient data. For example, posts from opinion leaders are usually more important than those from average users. The domain experts desired the ability to explore different types of relationships.

R3 - Exploring salient microblog data from different perspectives. Since the three dimensions of microblog data usually influence each other, the experts wanted to understand this influence so that they can link important data in one dimension to that in another dimension. For example, expert S said that, “Collecting relevant tweets is very important for some of our projects. After finding one important tweet, I usually check other tweets from the same author as well as the tweets marked by the same hashtag(s). This helps me find relevant tweets quickly.”

R4 - Understanding the error produced by the ranking mechanism. The microblog data ranking mechanism is not perfect and often introduces errors or uncertainty into the retrieval process. Thus, the degree of uncertainty must be analyzed and understood to facilitate informed decision-making [14, 37, 49]. The experts requested to know which ranking scores are more error-prone.

R5 - Analyzing the influence of the errors of one item on other items. The experts also expressed the need to understand error propagation among data items. They claimed that this information can help them considerably in filtering out irrelevant data. For example, expert C commented, “When I find an item with an incorrect ranking score, I also want to know which items are influenced by this so that I can adjust the ranking score quickly.”

Fig. 3. User interface: (a) MutualRanker visualization; (b) control panel; (c) information panel.

3.2 System Overview

The collected requirements have motivated us to develop a visual analytics toolkit, MutualRanker. It consists of the following components:

• An MRG model to generate the initial ranking lists of posts, users, and hashtags (R1);
• An uncertainty model to estimate uncertainty and its topological propagation on a graph (R4, R5);
• A composite visualization to present the graph-based ranking results, uncertainty, and its propagation (R2, R3).

The primary goal of MutualRanker is to extract a list of $k$ microblog posts/users/hashtags that are relevant to query $q$. Fig. 2 illustrates the main components needed to achieve this goal. Given a microblog dataset extracted by a query, the preprocessing module first extracts the post graph, the user graph, and the hashtag graph. The three graphs are then fed to the MRG model, which produces three ranking lists of posts, users, and hashtags. The uncertainty module estimates the uncertainty in the retrieval model and its topological propagation. The visualization module takes the ranking results and the uncertainty estimation as input and illustrates them in a composite visualization that includes a graph visualization, an uncertainty glyph, and a flow map. Users can interact with the generated visualization for further analysis. For example, a user can modify a ranking result. With this input, MutualRanker will incrementally update the ranking results.

Fig. 3 depicts the user interface of MutualRanker. It contains three different interaction areas: MutualRanker visualization (Fig. 3(a)), control panel (Fig. 3(b)), and information panel (Fig. 3(c)). The visualization view consists of two parts: 1) the stacked tree visualization that shows the hierarchical structure of microblog data; 2) the composite visualization that simultaneously reveals the retrieved microblog data, the uncertainty of the ranking results, and its topological propagation. The control panel consists of a set of controls that enable users to interactively update the ranking. The information panel displays the corresponding microblog data such as posts, users, and hashtags for a selected aggregate item.

10.1109/TVCG.2015.2467554, IEEE Transactions on Visualization and Computer Graphics

The main [...] of MRG [16, 45] is that it employs both the relation[...] users, or hashtags, and the relationships between [...] rankings. This feature significantly reduces the work[...] when interacting with our visual analytics system. For [...] analyst modifies the ranking score of a hashtag, MRG updates the ranking scores of the neighboring hashtags, but also those of relevant users and posts. This process allows our system to integrate user knowledge into the visual analytics process with acceptable user effort. This is also the main reason why we adopt MRG in MutualRanker.

5.1 MRG Computation with Monte Carlo Sampling Method

Duan et al. [16] proposed a matrix-based method to solve MRG, which iteratively updates the ranking scores using Eq. (1). The matrix-based method is a global one. An update to any item is achieved by running the method on the entire item set, which is very time-consuming. To address this problem, we use the Monte Carlo sampling method. The advantages of this sampling method over the matrix-based method are as follows [2]:

• Ranking scores are locally updated when the input changes locally;
• The ranking scores of important items are accurately estimated after a few iterations;
• The uncertainty of the ranking scores is modeled accurately because the Monte Carlo method calculates variance statistically.

To employ this method, we first solve $R$ as formulated by Eq. (2):

$$R = (1-d)(I - dM)^{-1} W = (1-d)\Big(\sum_{k=0}^{\infty} d^k M^k\Big) W. \tag{3}$$

We then perform a series of random walks for each item. A random walk may stop at each step with a probability of $1-d$. If the walk continues, then it proceeds to the next step according to the matrix $M$. Each element $m_{ij}$ defines the transition probability from $i$ to $j$.

Let $Z = \sum_{k=0}^{\infty} d^k M^k$. The ranking score of each item is:

$$r_j = (1-d)\sum_{i=1}^{N} w_i z_{ij}, \tag{4}$$

Fig. 4. Mutual reinforcement model.

The input of MRG includes three graphs, the post graph, the user graph, and the hashtag graph, as well as the relationships among them. The three graphs and their relationships are shown in Fig. 4. As in [16], the post graph is built based on cosine similarity. A recent study has shown that cosine similarity with a TF-IDF weighting scheme is the most appropriate measure to compute the similarity between microblog posts [35, 51]. As a result, we employ cosine similarity in our system. The user graph is constructed based on follower-followee relationships. The hashtag graph is generated according to the co-occurrence of two hashtags. The three graphs are also connected by two relationships: authorship and co-occurrence. If a user publishes a post, then we connect this user with his/her post. We also link this user with all of the hashtags in this post. Each post is also linked to all of the hashtags associated with it.

For simplicity, we uniformly denote posts, users, and hashtags as items in the following discussion.

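The MRG employs a method similar to PageRank [7] to model the mutual influence among different items in heterogeneous graphs:

$$\begin{bmatrix} R_p \\ R_u \\ R_h \end{bmatrix} = d \begin{bmatrix} \alpha_{pp}M_{pp} & \alpha_{up}M_{up} & \alpha_{hp}M_{hp} \\ \alpha_{pu}M_{pu} & \alpha_{uu}M_{uu} & \alpha_{hu}M_{hu} \\ \alpha_{ph}M_{ph} & \alpha_{uh}M_{uh} & \alpha_{hh}M_{hh} \end{bmatrix} \begin{bmatrix} R_p \\ R_u \\ R_h \end{bmatrix} + (1-d) \begin{bmatrix} W_p \\ W_u \\ W_h \end{bmatrix}. \tag{1}$$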

where the element $z_{ij}$ in $Z$ is the average number of times that a random walk starting from item $i$ visits item $j$. We estimate $z_{ij}$ by computing the empirical mean of a number of random walks.
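The sampling procedure behind Eqs. (3) and (4) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `monte_carlo_rank`, the dense row-wise layout of `M`, and the number of walks per item are all assumptions made for the example.

```python
import random


def monte_carlo_rank(M, w, d=0.85, walks_per_item=2000, seed=0):
    """Estimate r_j = (1 - d) * sum_i w_i * z_ij (Eq. 4) with random walks.

    M[i][j] is the transition probability from item i to item j (each row
    sums to 1), w[i] is the prior saliency of item i, d is the damping factor.
    Returns the ranking scores r and the visit-count estimates z.
    """
    rng = random.Random(seed)
    n = len(w)
    items = list(range(n))
    z = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for _ in range(walks_per_item):
            cur = i
            while True:
                z[i][cur] += 1.0           # count the visit (step 0 is the start item)
                if rng.random() > d:       # stop with probability 1 - d
                    break
                cur = rng.choices(items, weights=M[cur])[0]
        # Empirical mean number of visits to each item per walk started at i.
        z[i] = [count / walks_per_item for count in z[i]]
    r = [(1 - d) * sum(w[i] * z[i][j] for i in range(n)) for j in range(n)]
    return r, z


if __name__ == "__main__":
    M = [[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]
    w = [0.5, 0.3, 0.2]
    scores, _ = monte_carlo_rank(M, w)
    print(scores)
```

Because a walk visits on average $1/(1-d)$ items, the estimated scores sum to roughly $\sum_i w_i$, which gives a quick sanity check on the sampler.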

Duan et al. [16] only consider the similarity of items in computing $m_{ij}$. Thus, high ranking scores may be incorrectly assigned to users who publish many posts that do not receive attention. To address this, we consider the prior saliency of items in the sampling process. Specifically, the transition probability $m_{ij}$ is defined as $\mathrm{similarity}(i, j) \cdot w_j$.

5.2 Uncertainty Modeling

In MutualRanker, we use an approximation method to solve MRG, which may introduce uncertainty into the retrieval results. It is therefore important to model uncertainty. Since we employ the Monte Carlo sampling method, the distribution of each ranking score is known. Hence, we can employ probability theory to model uncertainty. Uncertainty is defined as a parameter for depicting the dispersion of values that can be reasonably attributed to the measured value [5]. Traditional methods model the measured value as a normally distributed random variable [14, 49]. Variance [14] and standard deviation [49] are among the most commonly used measures to represent uncertainty wherein the measured value is defined on the set of both positive and negative real numbers.

The measured value (ranking score) in our approach is defined on the set of positive real numbers. Thus, the above modeling method cannot be applied directly to our work.

According to [2], $z_{ij}$ in Eq. (4) has a Poisson distribution. The ranking score is the weighted sum of a series of $z_{ij}$. Hence, the ranking score is modeled as a Poisson mixture. For a Poisson mixture, the variance is approximately proportional to the mean. Hence, if we use variance to model uncertainty, the larger the ranking score, the more uncertain it is, but this is not always true.

Standard deviation is the square root of variance and has a similar problem. Consequently, variance and standard deviation are not good measures for depicting uncertainty in our model.

For such a distribution, a commonly used measure of dispersion is the variance-to-mean ratio (VMR) [15]. The higher the VMR, the more dispersed the distribution. For item $j$, its VMR ($u_j$) can be defined as:

$$u_j = v_j / r_j, \tag{5}$$

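$R_p$, $R_u$, and $R_h$ are the ranking score vectors of posts ($p$), users ($u$), and hashtags ($h$). $M_{xy}$ denotes the affinity matrix from $x$ to $y$, where $x$, $y$ can be posts, users, or hashtags. $\alpha_{xy}$ is a weight used to balance the mutual reinforcement strength among posts, users, and hashtags. $d$ is the damping factor in PageRank, and we set it to 0.85, as in [7]. $W_p$, $W_u$, and $W_h$ are vectors for the prior saliency of the items (e.g., the content quality of posts, the social influence of users, or the popularity of hashtags).

Let

$$R = \begin{bmatrix} R_p \\ R_u \\ R_h \end{bmatrix}, \quad W = \begin{bmatrix} W_p \\ W_u \\ W_h \end{bmatrix}, \quad M = \begin{bmatrix} \alpha_{pp}M_{pp} & \alpha_{up}M_{up} & \alpha_{hp}M_{hp} \\ \alpha_{pu}M_{pu} & \alpha_{uu}M_{uu} & \alpha_{hu}M_{hu} \\ \alpha_{ph}M_{ph} & \alpha_{uh}M_{uh} & \alpha_{hh}M_{hh} \end{bmatrix}.$$

Then, Eq. (1) can be simplified as:

$$R = dMR + (1-d)W. \tag{2}$$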
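For comparison with the sampling approach, the global matrix-based update of Eq. (2) can be sketched as a fixed-point iteration. This is an illustrative sketch only: `solve_mrg`, the tolerance, and the dense list-of-lists matrix are assumptions, and the combined matrix $M$ is taken as already assembled from the weighted blocks.

```python
def solve_mrg(M, W, d=0.85, tol=1e-10, max_iter=1000):
    """Iterate R <- d * M * R + (1 - d) * W until the update is below tol.

    M is the combined (already alpha-weighted) affinity matrix as a list of
    rows, W the prior saliency vector. Every iteration touches all items,
    which is why this method is global rather than local.
    """
    n = len(W)
    R = list(W)  # start from the prior saliency
    for _ in range(max_iter):
        new_R = [
            d * sum(M[i][j] * R[j] for j in range(n)) + (1 - d) * W[i]
            for i in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new_R, R)) < tol:
            return new_R
        R = new_R
    return R


if __name__ == "__main__":
    M = [[0.5, 0.5], [0.5, 0.5]]
    W = [1.0, 0.0]
    print(solve_mrg(M, W))
```

The iteration converges when the spectral radius of $dM$ is below 1, which holds for a stochastic $M$ and $d = 0.85$.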

5 MRG-BASED UNCERTAINTY ANALYSIS

Since exact inference of MRG is very time-consuming on a large graph, we approximate it using a more efficient Monte Carlo sampling method. We also explicitly model the uncertainty associated with each item (e.g., a post, a user, or a hashtag), as well as its propagation on the graph.

where $v_j$ is the distribution variance of the ranking score of item $j$. According to [2], $v_j$ can be calculated as follows:

$$v_j = (1-d)^2 \sum_{i=1}^{N} w_i^2 \, v_{z_{ij}}. \tag{6}$$

where $v_{z_{ij}}$ is the variance of $z_{ij}$. Each $z_{ij}$ obeys a Poisson distribution and its variance can be calculated from its expectation.
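Eqs. (4), (5), and (6) can be combined into a short computation. The sketch below assumes, as the text states, that each $z_{ij}$ is Poisson distributed, so its variance is taken equal to its estimated mean; the function name `vmr_uncertainty` and the zero-score guard are assumptions for illustration.

```python
def vmr_uncertainty(z, w, d=0.85):
    """Return ranking scores r, variances v, and VMR uncertainties u.

    z[i][j] is the estimated mean number of visits to item j by walks
    started at item i; w[i] is the prior saliency of item i.
    """
    n = len(w)
    # Eq. (4): ranking score as a weighted sum of visit counts.
    r = [(1 - d) * sum(w[i] * z[i][j] for i in range(n)) for j in range(n)]
    # Eq. (6): Poisson assumption, so var(z_ij) is replaced by its mean z_ij.
    v = [(1 - d) ** 2 * sum(w[i] ** 2 * z[i][j] for i in range(n)) for j in range(n)]
    # Eq. (5): variance-to-mean ratio; items never visited get 0 by convention here.
    u = [v[j] / r[j] if r[j] > 0 else 0.0 for j in range(n)]
    return r, v, u


if __name__ == "__main__":
    z = [[2.0, 1.0], [0.0, 3.0]]
    w = [0.5, 0.5]
    print(vmr_uncertainty(z, w, d=0.5))
```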

The massive number of items in the microblog data means we cannot place all of them on the screen. Hence, we aggregate similar items to form a cluster. The overall ranking score of a cluster, $r_c$, is defined as the sum of the ranking scores of its items [4]. The ranking scores are independent of each other, so the overall variance of the cluster, $v_c$, is the sum of the variances of the ranking scores. Thus, the uncertainty of a cluster, $u_c$, can be calculated naturally by dividing $v_c$ by $r_c$:

$$u_c = v_c / r_c = \sum_{j \in c} (r_j / r_c)(v_j / r_j) = \sum_{j \in c} k_j u_j. \tag{7}$$

Fig. 5. Topological uncertainty propagation calculation.

$M$ to $M'$. This change only affects a small part of the random walks

Eq. (7) shows that $u_c$ can be expressed by a weighted sum of the uncertainty of its items where each weight $k_j$ is the ratio of the ranking scores of item $j$ and cluster $c$. Thus, the uncertainty of a cluster is mainly determined by its important items.
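The equivalence stated in Eq. (7) can be checked numerically. The helper below is a small sketch; the name `cluster_uncertainty` and the sample values are hypothetical.

```python
def cluster_uncertainty(r, v):
    """Compute u_c for a cluster from item scores r and item variances v.

    Returns both forms of Eq. (7): the direct ratio v_c / r_c and the
    weighted sum of item uncertainties with weights k_j = r_j / r_c.
    """
    r_c = sum(r)
    v_c = sum(v)
    direct = v_c / r_c
    weighted = sum((r_j / r_c) * (v_j / r_j) for r_j, v_j in zip(r, v))
    return direct, weighted


if __name__ == "__main__":
    # Item 2 has the larger score, so its (lower) uncertainty dominates u_c.
    print(cluster_uncertainty([1.0, 3.0], [0.5, 0.3]))
```

In this example the item uncertainties are 0.5 and 0.1, and the cluster value lands closer to the uncertainty of the higher-ranked item, as the text describes.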

5.3 Topological Uncertainty Propagation

If an analyst finds an incorrectly ranked item, he can modify it based on his knowledge. He can further track how the uncertainty propagates from one cluster to another to identify other affected items. To help an analyst track uncertainty, we explicitly model its topological propagation on the graph.

In MRG, the ranking score of an item can be expressed as a linear combination of ranking scores of related items. Hence, the variance of a ranking score can also be expressed as a linear combination of the variances of related ranking scores. The uncertainty of each item can be calculated from its ranking score and its variance, and hence, the uncertainty of an item can also be expressed linearly by the uncertainty of other items. Specifically,

$$u_j = \sum_{i=1,\, i \neq j}^{N} m'_{ij} u_i, \tag{8}$$

where each $m'_{ij} = (d^2 m_{ij}^2 r_i) / ((1 - d m_{jj})^2 r_j)$. Eq. (8) shows that the uncertainty of each item is not independent and that it propagates on the graph in a linear form. Thus, for each pair of items $i$ and $j$, $m'_{ij} u_i$ can be viewed as the propagated uncertainty from item $i$ to $j$. We denote it by $u_{i \to j}$. Rewriting Eq. (8) in a matrix form, we can formulate the uncertainty propagation as a Markov chain:

used in the Monte Carlo sampling method.

For the affected random walks, existing incremental graph ranking algorithms [3] perform re-sampling and update the ranking scores by aggregating the statistics of these new random walks into the original results. One main problem with these algorithms is that re-sampling requires a considerable amount of time, which may make real-time interaction impossible. Suppose $n$ is the average number of neighbors that an item has and $l$ is the average length of a sampled random walk. At each step in a random walk, we have to sample from a multinomial distribution with $n$ possible outcomes and the time cost is $O(n)$. Thus, sampling a new random walk will take $O(nl)$ time. The time needed to compute and aggregate the statistics of these samples is $O(l)$. The total time required for a new sample is $O(nl) + O(l) = O(nl)$.

However, in our scenario, we do not delete or add edges on the graphs. As a result, we do not need to perform re-sampling. We only need to modify the statistics of a random walk based on the modified transition probability $m_{ij}$, thereby avoiding the high cost associated with re-sampling. The time cost of updating an influenced random walk is reduced to $O(l)$.

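Given a random walk: path $= \{i \to n_1 \to \dots \to n_{k-1} \to j\}$, we define a new random variable $x_{ij}^k$. In particular, $x_{ij}^k = 1$ indicates that the random walk starts from $i$ and reaches $j$ by moving $k$ steps. The original weight of each step in the random walk is 1. During an update, we re-calculate the weight of this step using $P'(x_{ij}^k = 1) / P(x_{ij}^k = 1)$. $P(x_{ij}^k = 1)$ is the probability of $x_{ij}^k = 1$ according to $M$ and $P'(x_{ij}^k = 1)$ is the probability of $x_{ij}^k = 1$ according to $M'$. Hence, $P(x_{ij}^k = 1)$ is calculated by:

$$P(x_{ij}^k = 1) = m_{i n_1} m_{n_1 n_2} \cdots m_{n_{k-1} j}. \tag{12}$$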
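The reweighting of an already sampled walk can be sketched in a single pass over the path, which matches the $O(l)$ cost mentioned above. The function name `step_weights` and the convention of returning one weight per visited item are assumptions for illustration; the sketch also assumes the walk was sampled under the old matrix, so every old transition probability along it is positive.

```python
def step_weights(M_old, M_new, path):
    """Return the new weight of each visit along a stored random walk.

    path is the list of visited items [i, n1, ..., j]. The weight of the
    visit after k steps is P'(prefix) / P(prefix), where the prefix
    probability is the product of transition probabilities (Eq. 12).
    The start item (k = 0) keeps weight 1.
    """
    weights = [1.0]
    ratio = 1.0
    for a, b in zip(path, path[1:]):
        # Extend the prefix by one step and update the ratio incrementally.
        ratio *= M_new[a][b] / M_old[a][b]
        weights.append(ratio)
    return weights


if __name__ == "__main__":
    M_old = [[0.5, 0.5], [0.5, 0.5]]
    M_new = [[0.25, 0.75], [0.5, 0.5]]
    print(step_weights(M_old, M_new, [0, 1, 0]))
```

Summing these weights per visited item, instead of adding 1 per visit, updates the visit statistics without drawing any new samples.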

Similarly, $P'(x_{ij}^k = 1)$ can also be calculated.

6 VISUALIZATION

$$U M' = U, \tag{9}$$

where $U = [u_j]_{1 \times N}$ and $M' = [m'_{ij}]_{N \times N}$.
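The item-to-item propagation of Eq. (8) can be sketched as below. The coefficient follows the form $m'_{ij} = (d^2 m_{ij}^2 r_i)/((1 - d m_{jj})^2 r_j)$, which is a reading of the formula from the surrounding derivation and should be treated as an assumption; the function names are hypothetical.

```python
def propagation_matrix(M, r, d=0.85):
    """Build M' = [m'_ij] for uncertainty propagation (assumed form of m'_ij).

    M[i][j] is the transition probability from item i to item j and r the
    ranking scores. Diagonal entries are left at 0 because Eq. (8) sums
    over i != j only.
    """
    n = len(r)
    Mp = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                Mp[i][j] = (d ** 2 * M[i][j] ** 2 * r[i]) / ((1 - d * M[j][j]) ** 2 * r[j])
    return Mp


def propagated_uncertainty(Mp, u):
    """Return the matrix of u_{i->j} = m'_ij * u_i."""
    n = len(u)
    return [[Mp[i][j] * u[i] for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    M = [[0.0, 1.0], [0.5, 0.5]]
    Mp = propagation_matrix(M, r=[1.0, 2.0], d=0.5)
    print(propagated_uncertainty(Mp, u=[0.4, 0.2]))
```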

Similar to the uncertainty propagation from item to item, we can model the uncertainty propagation from cluster to cluster using the following procedure. First, based on Eq. (8), we calculate the propagated uncertainty from each item $i$ in the source cluster $c_s$ to each item $j$ in the target cluster $c_t$ (Fig. 5(a)). Second, for each item $j$ in $c_t$, we compute the propagated uncertainty $u_{c_s \to j}$ from $c_s$ to item $j$ by aggregating the uncertainty propagated from each item in the source cluster (Fig. 5(b)).

$$u_{c_s \to j} = \sum_{i \in c_s} u_{i \to j}. \tag{10}$$

To help analysts extract microblog data of interest interactively, we have designed a composite visualization that includes a graph visualization, an uncertainty glyph, and a flow map (Fig. 1(a)).

6.1 Ranking Results as Graph Visualization

Since one post corresponds to only one user and a few hashtags, the scope of influence of a post is smaller than that of a user or a hashtag. Updating the ranking score of a post will only directly affect the ranking scores of its author, a few related hashtags, and a number of posts. In contrast, updating the ranking score of a user or hashtag will directly impact the ranking scores of hundreds or even thousands of posts as well as a number of users and hashtags. On the other hand, the number of posts is usually huge, around 10-100 times that of users or hashtags. Analysts would require more time to provide their feedback on a post graph. As a result, we regard the user and the hashtag as the primary visualization elements and the post as a secondary element mainly used to illustrate the content of the primary elements. Accordingly, users and hashtags are visually represented by a node-link graph whereas posts are represented as a list. For simplicity, we take a hashtag graph as an example to illustrate the basic idea of graph visualization.

To allow analysts to navigate large graphs efficiently, a hierarchy is built based on a Bayesian Rose Tree [26] with each non-leaf node representing a hashtag cluster. As shown in Fig. 3(a), a stacked tree is adopted to represent the hashtag hierarchy and a density-based graph visualization is employed to illustrate the relationships within the user/hashtag graphs and between them (R2, R3).

Finally, the uncertainty of a cluster is a weighted sum of the uncertainty of the items in it (Eq. (7)). Thus, the overall propagated uncertainty $u_{c_s \to c_t}$ from $c_s$ to $c_t$ can be calculated as the weighted sum of the propagated uncertainty from $c_s$ to each $j$ in $c_t$ (Fig. 5(c)).

$$u_{c_s \to c_t} = \sum_{j \in c_t} k_j \, u_{c_s \to j}. \tag{11}$$
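The two aggregation steps of Eqs. (10) and (11) can be sketched together. The sketch assumes the item-level flows $u_{i \to j}$ are already available as a matrix and uses $k_j = r_j / r_{c_t}$ as in Eq. (7); the function name `cluster_propagation` is hypothetical.

```python
def cluster_propagation(u_item, r, source, target):
    """Propagated uncertainty from cluster `source` to cluster `target`.

    u_item[i][j] holds u_{i->j}; r holds the item ranking scores; source and
    target are lists of item indices belonging to c_s and c_t.
    """
    r_ct = sum(r[j] for j in target)
    total = 0.0
    for j in target:
        # Eq. (10): aggregate what every source item sends to item j.
        u_cs_to_j = sum(u_item[i][j] for i in source)
        # Eq. (11): weight by k_j, the share of j in the target cluster score.
        total += (r[j] / r_ct) * u_cs_to_j
    return total


if __name__ == "__main__":
    u_item = [[0.0, 0.4, 0.8], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    print(cluster_propagation(u_item, r=[1.0, 1.0, 3.0], source=[0], target=[1, 2]))
```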

5.4 Incremental Ranking Update

We also allow analysts to interactively modify the item ranking result based on their knowledge. We can update the model locally because we use the Monte Carlo sampling method. After the analyst changes the ranking score(s), our approach iteratively updates the prior salience score(s) of the item(s). Accordingly, the affinity matrix is changed from

Fig. 6. Basic idea of the layout algorithm: (a) place the cluster nodes and derive the layout center of each cluster; (b) compute the Voronoi tessellation and treat the cell as the layout area of each cluster; (c) layout of representative and non-representative nodes; and (d) final layout result with context.

The density-based graph visualization combines a node-link diagram with a density map to display the nodes at the selected level of the hashtag tree. As in [25], we extract representative nodes for each of the cluster nodes at the selected tree level and assign other non-representative nodes to their closest representative nodes. As shown in Fig. 1(a), the representative nodes are displayed as a node-link diagram and the other nodes as a density map. In this visualization, the representative nodes of one cluster are placed near each other to reflect their closeness. The size of a node encodes the sum of the ranking scores of its items. The corresponding users are overlaid around the selected hashtag node to provide more analysis context (Fig. 6(d)).

Layout. The layout of the stacked tree is quite straightforward. Thus, we introduce the layout of the density-based graph, which contains the following steps.

Step 1: Derive the layout center of each cluster at the selected tree level. We build a cluster graph by checking the edge connections between the two cluster nodes. An edge is added if a sufficient number of connections between the two cluster nodes can be found. The cluster graph is then placed by a force-directed layout [21]. As shown in Fig. 6(a), the position of each cluster node is treated as the center of each hashtag cluster.

Step 2: Compute the layout area of each cluster. In this step, we compute the corresponding Voronoi tessellation based on the cluster center. The corresponding tessellation cells are treated as layout areas of the hashtag clusters (Fig. 6(b)).

Step 3: Layout of representative and non-representative nodes. In this step, the force-directed layout is adopted to place the representative nodes. To ensure the representative nodes within one cluster are placed in the corresponding cluster layout area, a repulsion force is added from the area boundary to each node within this area. The kernel density estimation [22] is utilized to represent the distribution of non-representative nodes (Fig. 6(c)).

Step 4: Layout of the context word cloud. Showing the hashtag graph and user graph simultaneously would introduce visual clutter. To solve this issue, we treat the hashtag graph as a primary element and the user information as context. In particular, when a hashtag node is selected, a word cloud that includes the users who use this hashtag is laid out to provide user context. In this word cloud, the selected hashtag is placed in the middle. A sweep-line-based word cloud layout algorithm [36] is employed to produce such a word cloud. Fig. 6(d) shows a layout result with a word cloud context.

Interaction. The following interactions are provided to assist analysts in investigating the ranking results from multiple perspectives.

Examining the ranked microblog data and their relationships (R2). The density-based graph visualization provides an easy way to explore the ranking results from the hashtag or user perspective. Utilizing the hashtag hierarchy allows the analyst to explore the ranking results from a global overview to local details. Several filters, such as the edge or the glyph filter, enable analysts to customize this view easily. Relevant posts, hashtags, and users are also provided to help analysts better understand the content of the selected cluster node.

Smoothly switching between different data dimensions (R3). Inspired by the context popup interaction in [18], we also overlay context of a selected item to provide further navigation cues. For example, if the analyst selects a hashtag, the labels of users who use that hashtag can be overlaid around the selected hashtag via a word cloud (Fig. 11(a)). If the analyst finds something of interest, the hashtag graph will be smoothly transitioned to the user graph (Fig. 11(b)).

Fig. 7. Design of the uncertainty glyph: (a) box plot; (b) transforming the box plot to the uncertainty glyph; (c) an alternative design.

6.2 Uncertainty as Glyph

After testing with the first prototype, the experts identified several incorrect ranking results. They expressed the need to be informed of such results. This requirement is closely related to the conclusion of previous work, which stated that effectively conveying uncertainty is very important to the visual analytics process [14, 49]. Since the ranking results are aggregated into clusters in the overview, the experts wanted to examine the uncertainty distribution of the aggregate node, including the minimum value (0), maximum value (1.0), lower extreme, upper extreme, lower hinge (25%), and upper hinge (75%).
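The six values listed above can be computed from the item uncertainties of a cluster as in the following sketch. It assumes the uncertainties have already been scaled to [0, 1], takes the extremes to be the smallest and largest observed values, and uses the inclusive quartile method from Python's standard library for the hinges; these choices and the name `glyph_summary` are assumptions, since the text does not specify them.

```python
import statistics


def glyph_summary(uncertainties, scale_min=0.0, scale_max=1.0):
    """Summarize a cluster's item uncertainties for a box-plot-style glyph."""
    values = sorted(uncertainties)
    if len(values) < 2:
        raise ValueError("need at least two items to compute hinges")
    # Quartiles: lower hinge (25%), median, upper hinge (75%).
    lower_hinge, _median, upper_hinge = statistics.quantiles(
        values, n=4, method="inclusive"
    )
    return {
        "min": scale_min,              # fixed lower end of the scale
        "max": scale_max,              # fixed upper end of the scale
        "lower_extreme": values[0],    # smallest observed uncertainty
        "upper_extreme": values[-1],   # largest observed uncertainty
        "lower_hinge": lower_hinge,
        "upper_hinge": upper_hinge,
    }


if __name__ == "__main__":
    print(glyph_summary([0.5, 0.1, 0.4, 0.2, 0.3]))
```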

Inspired by the box plot design (Fig. 7(a)), we have designed a glyph to meet the above requirements (Fig. 7(b)). As shown in Fig. 7(a), six values from a set of data are conventionally used in a box plot, including the minimum and maximum values, the extremes, and the upper and lower hinges (quartiles). A total of 50% of the items fall between the upper and lower hinges. To combine a box plot with a graph node, we first transform the box plot to a line-based one, and then bend it around the upper boundary of the node (Fig. 7(b)). We also attempted several alternatives in the participatory design process with experts. Fig. 7(c) is one of them. After interacting with this alternative, the experts stated that it was confusing. They thought that the item with more of a filled area inside should be the one on which they should focus. However, in reality, these nodes were only nodes with a larger area between the upper and lower hinges. A PhD student from an art school later confirmed that a larger amount of digital ink will attract more attention from users. After several interactions with the experts and the art student, we chose Fig. 7(b) as our final design.

Analysts can obtain an overview of the uncertainty distribution in a cluster by examining its uncertainty glyph. Fig. 8 illustrates several example patterns. For example, in Fig. 8(a), the majority of items in this cluster are characterized by low uncertainty. However, the cluster also contains some items with higher uncertainty. As a result, exploring the items with high uncertainty is a worthwhile endeavor.

Interaction. In addition to allowing analysts to examine the uncertainty score (R4), we also provide the interaction shown below to integrate an expert’s knowledge into the retrieval process.

Interactive ranking refinement. After an expert finds an incorrect ranking result by examining the uncertainty glyph, the expert can modify the ranking result. The ranking scores of the corresponding graph nodes will also be updated accordingly. As shown in Figs. 1(c)-(f), the

is quite short. Thus, if we consider this measure, many of these line segments will not be bundled together.

The total edge compatibility is defined by:

$$C_e(e_i, e_j) = C_\alpha(e_i, e_j) \cdot C_s(e_i, e_j) \cdot C_p(e_i, e_j).$$

(a) most items in [...] higher uncertainty are also included; (b) most items in this cluster have high uncertainty and some items with higher uncertainty also occur; (c) uncertainty distribution is uniform; (d) most items in this cluster have lower uncertainty.

Fig. 9(b) shows the matched results of the propagation paths.

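Step 3: Compute the force to bundle the propagation path. The combined force for a point $p_i$ on $e_i$ is defined as:

$$F_{p_i} = K_i \big( \| p_{i-1} - p_i \| + \| p_i - p_{i+1} \| \big) + \sum_{e_j \in E} \| p_i - p_j \| \cdot C_e(e_i, e_j),$$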

where $K_i$ is the spring constant for each segment and $E$ is the set of all the matched edges of $e_i$. In [19], the last item is the electrostatic force $F_e = 1 / \| e_i - e_j \|$. In order to bundle the matched paths that are located away from each other, we replace it with an attracting spring force. Fig. 9(c) shows the layout results of the propagation paths.

7 QUANTITATIVE EVALUATION

In this section, we quantitatively evaluate the effectiveness of our MRG computation and incremental ranking update algorithm.

Fig. 9. Layout of multiple uncertainty propagation paths: (a) initial layout based on the flow map layout; (b) the matched result of the propagation paths; (c) the layout result of the propagation paths.

7.1 MRG Computation

To evaluate the performance of our MRG computation based on the Monte Carlo sampling method, we compared it with the matrix-based method proposed in [16]. We used two Twitter datasets in the experiments: government shutdown and Ebola outbreak. The shutdown dataset contains tweets on the 2013 US government shutdown (5,132,510 tweets from Oct. 1 to Oct. 16, 2013), which were collected by using queries such as “shutdown.” The Ebola dataset contains tweets on the Ebola outbreak (1,425,017 tweets from Jan. 1 to Dec. 25, 2014), which were collected by using queries such as “ebola.” All experiments were conducted on a PC with a 3.1 GHz CPU and 16 GB RAM.

There were too many posts, users, and hashtags for us to label all of them. Thus, we did not report the recall in our evaluation. In this evaluation, we used top-n precision (n-Prec) as the evaluation measure. Top-n precision is the percentage of correctly retrieved items among the top-n ranked items. This measure is often used when the recall is hard to calculate [9]. To fully compare the two algorithms, we calculated the top 10, 50, 100, and 200 precision for posts, users, and hashtags, respectively. We invited two PhD students who majored in data mining and are familiar with the datasets to evaluate the retrieval results. They labeled the results individually and resolved the differences via discussion. The results are shown in Table 1. Overall, our algorithm performed better than the baseline on both datasets. We inspected the top 10 retrieved items with both methods. In general, the retrieved items were quite accurate. However, the baseline had one mistake in the top 10 users selected from the shutdown dataset. It overestimated the importance of a user called @governmentclosd, who posted a significant number of tweets with a number of hashtags. However, this user did not have many followers and his/her tweets were seldom retweeted. In contrast, our algorithm can avoid this mistake by taking a user’s authority into consideration. The baseline algorithm also had similar mistakes in the Ebola dataset.
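The evaluation measure itself is simple to state in code. The sketch below is illustrative; the function name `top_n_precision` and the item labels are hypothetical, and when fewer than n items are ranked it divides by the number actually available, which is a convention chosen here rather than one stated in the text.

```python
def top_n_precision(ranked_items, relevant, n):
    """Fraction of the top-n ranked items that were labeled relevant.

    ranked_items is ordered from highest to lowest ranking score and
    relevant is the set of items the annotators marked as correct.
    """
    top = ranked_items[:n]
    if not top:
        return 0.0
    hits = sum(1 for item in top if item in relevant)
    return hits / len(top)


if __name__ == "__main__":
    ranked = ["a", "b", "c", "d"]
    labeled_relevant = {"a", "c"}
    for n in (2, 3, 4):
        print(n, top_n_precision(ranked, labeled_relevant, n))
```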

ranking scores (e.g., node sizes) of several nodes changed. A glyph is designed to illustrate the change, with the dotted orange circle encoding the previous ranking score and the boundary of the filled circle (gray color) representing the changed ranking score (Figs. 1(d)-(f)).

6.3 Uncertainty Propagation as Flow Map

The flow map [33, 44] is designed to visually analyze the movement of objects from one location to multiple locations. Inspired by this design, we develop the uncertainty propagation path (Fig. 1), which is useful for quickly deriving the unknown uncertain node(s) from the known one(s) (R5).

Layout. The layout of multiple uncertainty propagation paths of different nodes is based on the flow map layout in [44] and the edge bundling in [19]. The layout contains the following steps.

Step 1: Derive the initial uncertainty propagation path based on the flow map layout. We first compute the uncertainty propagation of the selected node based on the topology by using the method in Sec. 5.3. The flow map layout via spiral trees is then utilized to generate the initial uncertainty propagation path (Fig. 9(a)).

Step 2: Employ edge compatibility measures to match the corresponding propagation paths from different nodes. In this step, we employ the three compatibility measures described in [19] to match the propagation paths from different nodes.

The first measure is angle compatibility, which aims to match the edges with a smaller angle. It is defined by:

$$C_\alpha(e_i, e_j) = |\cos(\alpha)|.$$

The second measure is scale compatibility, which tends to match the edges with similar lengths. It is measured by:

$$C_s(e_i, e_j) = \frac{2}{l_{avg} \cdot \min(|e_i|, |e_j|) + \min(|e_i|, |e_j|) / l_{avg}},$$

where $l_{avg} = (|e_i| + |e_j|)/2$.

The third measure is position compatibility, which aims to match the close edges together. It is defined by:

$$C_p(e_i, e_j) = \frac{l_{avg}}{l_{avg} + \| Q_{m1} - Q_{m2} \|},$$

where $Q_{m1}$ and $Q_{m2}$ are the midpoints of edges $e_i$ and $e_j$.
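The three measures and their product can be sketched for straight 2D segments as follows. Edges are represented as pairs of (x, y) endpoints, which is an assumption for illustration, as are the function names. The scale term is implemented exactly as the formula is printed in this text; the edge bundling reference [19] should be consulted for its original form.

```python
import math


def _vector(e):
    (x1, y1), (x2, y2) = e
    return (x2 - x1, y2 - y1)


def _length(e):
    return math.hypot(*_vector(e))


def angle_compat(ei, ej):
    """|cos(alpha)| between the two edge directions."""
    (ax, ay), (bx, by) = _vector(ei), _vector(ej)
    return abs((ax * bx + ay * by) / (_length(ei) * _length(ej)))


def scale_compat(ei, ej):
    """Scale compatibility, following the formula as printed in the text."""
    li, lj = _length(ei), _length(ej)
    l_avg = (li + lj) / 2
    shorter = min(li, lj)
    return 2 / (l_avg * shorter + shorter / l_avg)


def position_compat(ei, ej):
    """l_avg / (l_avg + distance between the edge midpoints)."""
    l_avg = (_length(ei) + _length(ej)) / 2
    mi = ((ei[0][0] + ei[1][0]) / 2, (ei[0][1] + ei[1][1]) / 2)
    mj = ((ej[0][0] + ej[1][0]) / 2, (ej[0][1] + ej[1][1]) / 2)
    return l_avg / (l_avg + math.dist(mi, mj))


def total_compat(ei, ej):
    """Product of the angle, scale, and position compatibilities."""
    return angle_compat(ei, ej) * scale_compat(ei, ej) * position_compat(ei, ej)


if __name__ == "__main__":
    e1 = ((0.0, 0.0), (1.0, 0.0))
    e2 = ((0.0, 1.0), (1.0, 1.0))  # parallel to e1, one unit above it
    print(total_compat(e1, e2))
```

Two perpendicular segments get a total compatibility of 0 through the angle term, so they are never matched regardless of distance.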

The last measure, visibility compatibility, described in [19] is not considered in our method because there are too many line segments in the propagation path generated by the flow map layout, each of which

Table 1. Comparison of the MRG computation method with the baseline using top-n precision for posts, users, and hashtags.

Fig. 10. The overview of the government shutdown dataset.

[...] sociology (S) and one researcher in media and communications (C), to understand their respective interests in the datasets. We designed a number of exploration tasks. In the second phase, we collaborated with the experts to finish the designed tasks. During this phase, we asked questions to discuss with the experts the usefulness of our tool for each task. Finally, the experts were invited to another discussion session to provide overall feedback on how our tool could help them with real-world tasks.

8.1 Case Study: Government Shutdown

In this study, we worked with expert S to: 1) evaluate how uncertainty analysis can be utilized to identify key hashtags and users with a satisfactory confidence level; 2) leverage our system to iteratively reduce the uncertainty levels; 3) extract relevant hashtags/users/tweets related to the government shutdown.

[...]. The expert quickly found interesting results after examining [...] hashtag overview (Fig. 10(a)) generated by our system. She identified seven prominent topics described by a set of hashtags: general [...] about the shutdown and Obamacare (Fig. 10A), political [...] on twitter (Fig. 10B), discussion on ending the shutdown [...] 10C), the influence of the shutdown on people’s lives (Fig. 10D), [...] the government shutdown on news media (Fig. 10E), debt-[...] discussion (Fig. 10F), and critics of the shutdown (Fig. 10G).

[...] analysis: The “#shutdown” cluster (Fig. 10(b)) attracted [...] expert’s attention because it contains items with higher uncertainty. [...] expert examined the detailed hashtags and tweets in the cluster. [...] found that in addition to common hashtags such as #govtshutdown, [...] and #shutdowngop, a number of diverse hashtags [...] also created. Such hashtags included those that criticized the [...] e.g., #shutdownharry; local news posts, e.g., #hounews, and [...] campaigns, e.g., #dontcutkids. She wanted to examine the most [...] ones, so she sorted the hashtags by the uncertainty level. [...] #lewinsky was ranked as the most uncertain hashtag [...] 10(c)). The analyst searched the related tweets and found that [...] tagged with #lewinsky concerned the shutdown of the Clinton [...] in 1995. The expert decided to lower the ranking score of [...] hashtag. During the process, she commented that the uncertainty [...] and item-filtering feature were useful, helping her filter out [...] items by lowering their ranking scores.

Uncertainty propagation: Next, the expert examined how the uncertainty of the “#shutdown” cluster would influence neighboring clusters. She clicked the “propagation” button and the corresponding uncertainty propagation was displayed (the orange flow in Fig. 1). She also selected the uncertainty propagation of the “#democrats” cluster (the blue flow in Fig. 1) and “#republicans” cluster (the green flow in Fig. 1), which were closely related to the “#shutdown” cluster. As shown in Fig. 1(b), cluster “#nationalparks” shared the uncertainty propagated from the three clusters. Given that the closing of the national parks was a result of the government shutdown and stimulated discussion on Twitter, the expert increased the ranking score of #nationalparks (from 4 to 6). In our system, the ranking score is from 1 to 10, with 10 being the highest score. After the adjustment, she noticed the scores of another two hashtag clusters were automatically increased: “#spitehouse” and “#teaparty.” In the first cluster, the ranking scores of hashtags such as “#spitehouse”

[...] switched back to the hashtag graph to check the [...] on this graph. She found a new hashtag cluster, [...] She then zoomed into this cluster. As shown in [...] primarily expresses criticism of the government, [...] Democrats or Republicans (“@PeteSessions #De[...] #shutdown #MakeDCListen #senatemustact Stand for [...]

[...] usefulness of our system, we conducted a semi-[...] with the two experts. They used MutualRanker in [...] 2 hours, so they were familiar with its basic functions. [...] was well received by them.

[...] MutualRanker as a research tool to help [...] posts, users, and hashtags quickly and conve[...] believed that MutualRanker is very useful for coding [...] According to him, coding is the most [...] in his field. Extensive training and careful atten[...] been required to produce reliable data. He commented, [...] is urgently needed in my daily work to [...] and costs. Especially when there are not [...] the linkage between items [in this system] will provide [...] to make decision. [...] This system also provides an [...] the data retrieval process.”

[...] were impressed by the uncertainty illustration and [...] function. For example, Expert C said, “Uncertainty [...] awesome feature, I can use it to find some unexpected [...] the coverage of coding.”

[...] that smoothly switching between different data graphs helped them find relevant data more quickly. Expert S commented, “This switching function enables me to easily transition between the hashtag graph and the user graph. When I modify one ranking score in one graph, I can not only verify the result in this graph, but also verify it in another graph.”

The experts also suggested several improvements. The target audience of MutualRanker is experts with domain knowledge. The experts believed that average users can also benefit from it. They suggested that more intuitive visual design be used. Expert C said, “The uncertainty glyph can be simplified for a general user. For example, maybe the glyph does not need to encode the uncertainty distribution, just simply show that this ranking score is uncertainty.” They also expressed the need to retrieve streaming data.

[...] is caused by the ranking score decrease of hashtags “#ebt” and “#obamzombies.” The expert then examined the relevant tweets to probe the reason. The EBT system had crashed at that time and many people wondered whether the crash was caused by the government shutdown: “Ahh... #ebt not working cause if a #governmentshutdown? How sad you can’t spend money taken from me against my will that I worked for...” Then, the crash was explained to be a result of a computer failure (“According to NBC, #ebt is down because of a technical issue, NOT #governmentshutdown”). Thus, the expert believed “#ebt” was irrelevant and appreciated this automatic change.

Switching between different data views. In addition to hashtags, the expert wanted to examine the users who participated in different discussion groups. For example, she wanted to identify the most active users in the “#shutdown” cluster, so she overlaid the user labels around the hashtag labels (Fig. 11(a)). The expert then switched to the user view to explore additional user information (Fig. 11(b) and Fig. 11(c)). She immediately identified the leading users in Fig. 11(b) and Fig. 11(c). She described them with two categories: 1) key government official accounts, including “@barackobama,” “@whitehouse” (Fig. 11(b)); and 2) news agencies/public media such as “@nytimes,” “@guardian,” and “@bloombergnews” (Fig. 11(c)). Considering that partisan leaders were of major interest to her, she first observed the ranking scores of select politicians, e.g., @speakerboehner (Rank 8), @whiphoyer (Rank 8), @nancypelosi (Rank 7), etc. She believed that the importance of these user accounts was underestimated because the influence and activeness of politicians on Twitter are usually much lower than in real life. She changed the rankings of the partisan leaders, “@speakerboehner,” “@whiphoyer,” and “@nancypelosi,” to 10, which is the highest. Fig. 11(d) shows the difference after this refinement.

After the change, the user clusters were regenerated and the uncertainty levels of some nodes were largely reduced. Notably, “@whiphoyer” became an important cluster with the scores of several users in the cluster automatically increased (Fig. 11(e)). For example, “@repmaloney,” from 5 to 6 and “@repteddeutch,” from 5 to 6. “These are members of Congress. The change of their ranking scores is natural here.” The expert commented, “This is cool. [...] If I want to change the ranking score of one user, others just automatically follow. This could help me find the important users whose names I am not familiar with or who are not active on Twitter.”

9 DISCUSSION AND FUTURE WORK

This paper presents a visual analytics system, MutualRanker, to help analysts interactively retrieve data of interest from microblogs. We extend the MRG model to extract a multifaceted retrieval result that includes the mutual reinforcement ranking results, the uncertainty of each rank, and the uncertainty propagation among different graph nodes. The model is tightly integrated with a composite visualization to assist analysts in retrieving salient posts, users, and hashtags effectively in an uncertainty-aware environment.
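To make the outputs named in this summary more concrete (a mutual reinforcement ranking, a per-rank uncertainty, and the way a manual change spreads to other nodes), the following is a minimal sketch under stated assumptions. It is not the MRG model or the sampling procedure used in MutualRanker: it swaps in a generic HITS-style iteration over a hypothetical user-by-hashtag weight matrix, and it approximates uncertainty by re-ranking under random weight perturbation. The names `mutual_rank` and `rank_uncertainty`, the noise model, and the toy data are all illustrative assumptions, and scores are scaled to [0, 1] rather than to the 1-10 ranks shown in the figures.

```python
import random
import statistics


def _scale(values):
    """Divide by the largest value so scores fall in [0, 1]."""
    top = max(values)
    return [v / top for v in values] if top > 0 else list(values)


def mutual_rank(weights, pinned_users=None, iters=100):
    """HITS-style mutual reinforcement between users (rows) and hashtags (columns).

    weights[i][j] >= 0 is a hypothetical affinity of user i for hashtag j.
    pinned_users maps a user index to a score fixed by the analyst.
    """
    pinned_users = pinned_users or {}
    n_users, n_tags = len(weights), len(weights[0])
    users = [1.0] * n_users
    tags = [1.0] * n_tags
    for _ in range(iters):
        # A hashtag scores high when high-scoring users use it ...
        tags = _scale([sum(weights[i][j] * users[i] for i in range(n_users))
                       for j in range(n_tags)])
        # ... and a user scores high when they use high-scoring hashtags.
        users = _scale([sum(weights[i][j] * tags[j] for j in range(n_tags))
                        for i in range(n_users)])
        # Re-impose the analyst's manual scores after every sweep.
        for i, score in pinned_users.items():
            users[i] = score
    return users, tags


def rank_uncertainty(weights, pinned_users=None, n_samples=200, noise=0.2, seed=0):
    """Per-user standard deviation of scores under random weight perturbation."""
    rng = random.Random(seed)
    runs = []
    for _ in range(n_samples):
        # Assumed noise model: each weight is rescaled by a uniform factor.
        noisy = [[w * rng.uniform(1 - noise, 1 + noise) for w in row]
                 for row in weights]
        users, _ = mutual_rank(noisy, pinned_users)
        runs.append(users)
    return [statistics.pstdev(column) for column in zip(*runs)]


# Toy data: users 0-1 only use hashtag 0, users 2-3 only use hashtag 1.
weights = [[2.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
base_users, _ = mutual_rank(weights)
pinned, _ = mutual_rank(weights, pinned_users={2: 1.0})
sigma = rank_uncertainty(weights, pinned_users={2: 1.0})
print(base_users[3], pinned[3], sigma[3])
```

In this toy graph the second cluster fades toward zero when nothing is pinned. Pinning user 2 to the top score lifts the shared hashtag and, through it, user 3 (to about 0.25), which loosely mirrors the "others just automatically follow" behavior the expert described. The pinned user has zero spread across samples, while user 3 keeps a nonzero one.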

In the future, we plan to improve system performance by implementing a parallel Monte Carlo sampling method. Another exciting avenue for future work is to retrieve streaming data in microblogs, which can be very useful in emergency management and threat analysis. We believe the system can also benefit average users interested in collecting microblog data. We will also invite more users to try our system and conduct a formal user study. Accordingly, we will improve MutualRanker based on the collected feedback.

ACKNOWLEDGMENTS

We would like to thank X. Wang, J. Yin, J. Gong, and Dr. W. Cui for helpful discussions on the visualization design, Dr. J. Zhang and Dr. Y. Song for constructive suggestions on similarity measures, as well as Dr. W. Peng and Dr. J. Su for providing domain expertise.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TVCG.2015.2467554, IEEE Transactions on Visualization and Computer Graphics

