andoneresearcherinmediaandcommunications(C).Theexpertsareexperiencedinretrievingdatafrommicroblogs.Theyalsohadexperienceusingamethodsimilartotheonedescribedabove.Weconductedseveralinterviewswiththem,mainlyfocusingonprobingtheirneedsandmicroblogretrievalprocess.Wethenidenti edthefollowinghigh-levelrequirementsbasedontheirfeedback.
R1-Examininganinitialsetofsalientmicroblogdata.Bothex-pertsexpressedtheneedforarankinglistofkeywordsearchresults.Keyword-basedmicroblogretrievalresultsoftenincludemillionsofpostsandtensofthousandsofusersandhashtags.Thus,theseresultsaretoomassiveforanalyststoquicklydiscoverrelevantdata.Theexpertsusuallyhavetoexaminethedatacarefullyanddesignasetofrulesto lteroutirrelevantdata.Asaresult,theystatedtheneedforatoolkitthatcanrankextractedposts,users,andhashtagstofacilitatetheirdataretrievaltasks.Thisneedisconsistentwiththe ndingsofpreviousresearch[16,46].
R2-Revealingrelationshipswithinmicroblogdata.Previousre-search[11,16,25]hasalsoindicatedthattherelationshipswithindatahelpuserslocateinterestinginformationmoreeasily.Furthermore,therelationshipsamongthethreedimensionsofmicroblogdata(posts,users,andhashtags)canassisttheminextractingsalientdata.Forexample,postsfromopinionleadersareusuallymoreimportantthanthosefromaverageusers.Thedomainexpertsdesiredtheabilitytoexploredifferenttypesofrelationships.
R3-Exploringsalientmicroblogdatafromdifferentperspectives.Sincethethreedimensionsofmicroblogdatausuallyin uenceeachother,theexpertswantedtounderstandthisin uencesothattheycanlinkimportantdatainonedimensiontothatinanotherdimension.Forexample,expertSsaidthat,“Collectingrelevanttweetsisveryimportantforsomeofourprojects.After ndingoneimportanttweet,Iusuallycheckothertweetsfromthesameauthoraswellasthetweetsmarkedbythesamehashtag(s).Thishelpsme ndrelevanttweetsquickly.”
R4-Understandingtheerrorproducedbytherankingmecha-nism.Themicroblogdatarankingmechanismisnotperfectandoftenintroduceserrorsoruncertaintyintotheretrievalprocess.Thus,thedegreeofuncertaintymustbeanalyzedandunderstoodtofacilitateinformeddecision-making[14,37,49].Theexpertsrequestedtoknowwhichrankingscoresaremoreerror-prone.
R5-Analyzingthein uenceoftheerrorsofoneitemonotheritems.Theexpertsalsoexpressedtheneedtounderstanderrorpropagationamongdataitems.Theyclaimedthatthisinformationcanhelpthemconsiderablyin lteringoutirrelevantdata.Forexample,expertCcommented,“WhenI ndanitemwithanincorrectranking
erinterface:(a)MutualRankervisualization;(b)controlpanel;(c)informationpanel.
score,Ialsowanttoknowwhichitemsarein uencedbythissothatIcanadjusttherankingscorequickly.”3.2SystemOverview
Thecollectedrequirementshavemotivatedustodevelopavisualana-lyticstoolkit,MutualRanker.Itconsistsofthefollowingcomponents: AnMRGmodeltogeneratetheinitialrankinglistsofposts,users,andhashtags(R1);
Anuncertaintymodeltoestimateuncertaintyanditstopologicalpropagationonagraph(R4,R5);
Acompositevisualizationtopresentthegraph-basedrankingresults,uncertainty,anditspropagation(R2,R3).
TheprimarygoalofMutualRankeristoextractalistofkmicroblogposts/users/hashtagsthatarerelevanttoqueryq.Fig.2illustratesthemaincomponentsneededtoachievethisgoal.Givenamicroblogdatasetextractedbyaquery,thepreprocessingmodule rstextractsthepostgraph,theusergraph,andthehashtaggraph.ThethreegraphsarethenfedtotheMRGmodel,whichproducesthreerankinglistsofposts,users,andhashtags.Theuncertaintymoduleestimatestheuncertaintyintheretrievalmodelanditstopologicalpropagation.Thevisualizationmoduletakestherankingresultsandtheuncertaintyestimationasinputandillustratestheminacompositevisualizationthatincludesagraphvisualization,anuncertaintyglyph,anda erscaninteractwiththegeneratedvisualizationforfurtheranalysis.Forexample,ausercanmodifyarankingresult.Withthisinput,MutualRankerwillincrementallyupdatetherankingresults.
Fig.3depictstheuserinterfaceofMutualRanker.Itcontainsthreedifferentinteractionareas:MutualRankervisualization(Fig.3(a)),controlpanel(Fig.3(b)),andinformationpanel(Fig.3(c)).Thevisual-izationviewconsistsoftwoparts:1)thestackedtreevisualizationthatshowsthehierarchicalstructureofmicroblogdata;2)thecompositevisualizationthatsimultaneouslyrevealtheretrievedmicroblogdata,theuncertaintyoftherankingresults,anditstopologicalpropagation.Thecontrolpanelconsistsofasetofcontrolsthatenableuserstointeractivelyupdatetheranking.Theinformationpaneldisplaysthecorrespondingmicroblogdatasuchasposts,users,andhashtagsforaselectedaggregateitem.
10.1109/TVCG.2015.2467554, IEEE Transactions on Visualization and Computer Graphics#obamacare
#debtceiling
Hashtag Graph
GRAPH
mainofMRG[16,45]isthatitemploys
boththerelation-users,orhashtags,andtherelationshipsbetween
rankings.Thisfeaturesigni cantlyreducesthework-wheninteractingwithourvisualanalyticssystem.Foranalystmodi estherankingscoreofahashtag,MRGupdatestherankingscoresoftheneighboringPost Graphhashtags,butalsothoseofrelevantusersandposts.Thisprocessallowsoursystemtointegrateuserknowledgeintothevisualanalyticsprocesswithacceptableusereffort.ThisisalsothemainreasonwhyweadoptMRGinMutualRanker.
5.1MRGComputationwithMonteCarloSamplingMethodDuanetal.[16]proposedamatrix-basedmethodtosolveMRG,whichiterativelyupdatestherankingscoresusingEq.(1).Thematrix-basedmethodisaglobalone.Anupdatetoanyitemisachievedbyrunningthemethodontheentireitemset,whichisverytime-consuming.Toaddressthisproblem,weusetheMonteCarlosamplingmethod.Theadvantagesofthissamplingmethodoverthematrix-basedmethodareasfollows[2]:
Rankingscoresarelocallyupdatedwhentheinputchangeslo-cally;
Therankingscoresofimportantitemsareaccuratelyestimatedafterafewiterations;
TheuncertaintyoftherankingscoresaremodeledaccuratelybecausetheMonteCarlomethodcalculatesvariancestatistically.Toemploythismethod,we rstsolveRasformulatedbyEq.(2):
R=(1 d)(I dM) 1W=(1 d)(∑dkMk)W.
k=0∞
(3)
Wethenperformaseriesofrandomwalksforeachitem.Arandomwalkmaystopateachstepwithaprobabilityof1 d.Ifthewalkcontinues,thenitproceedstothenextstepaccordingtothematrixM.Eachelementmijde nesthetransitionprobabilityfromitoj.
∞
LetZ=∑dkMk.Therankingscoreofeachitemis:
k=0
Fig.4.Mutualreinforcementmodel.
rj=(1 d)∑wizij,
i=1
N
(4)
TheinputofMRGincludesthreegraphs,thepostgraph,theusergraph,andthehashtaggraph,aswellastherelationshipsamongthem.ThethreegraphsandtheirrelationshipsareshowninFig.4.Asin[16],thepostgraphisbuiltbasedoncosinesimilarity.ArecentstudyhasshownthatcosinesimilaritywithaTFIDFweightingschemeisthemostappropriatemeasuretocomputethesimilaritybetweenmicroblogposts[35,51].Asaresult,weemploycosinesimilarityinoursystem.Theusergraphisconstructedbasedonfollower-followeerelationships.Thehashtaggraphisgeneratedaccordingtotheco-occurrenceoftwohashtags.Thethreegraphsarealsoconnectedbytworelationships:authorshipandco-occurrence.Ifauserpublishesapost,thenweconnectthisuserwithhis/herpost.Wealsolinkthisuserwithallofthehashtagsinthispost.Eachpostisalsolinkedtoallofthehashtagsassociatedwithit.
Forsimplicity,weuniformlydenoteposts,users,andhashtagsasitemsinthefollowingdiscussion.
TheMRGemploysamethodsimilartoPageRank[7]tomodelthemutualin uenceamongdifferentitemsinheterogeneousgraphs:
RpαppMpp Ru =d αpuMpuRhαphMph
αupMup
αuuMuuαuhMuh
αhpMhpRpWpαhuMhu Ru +(1 d) Wu .αhhMhhRhWh
(1)
wheretheelementzijinZistheaveragenumberoftimesthatarandom
walkstartingfromitemivisitsitemj.Weestimatezijbycomputingtheempiricalmeanofanumberofrandomwalks.
Duanetal.[16]onlyconsiderthesimilarityofitemsincomputingmij.Thus,high-rankingscoresmaybeincorrectlyassignedtouserswhopublishmanypoststhatdonotreceiveattention.Toaddressthis,weconsiderthepriorsaliencyofitemsinthesamplingprocess.Speci -cally,thetransitionprobabilityfrommijisde nedassimilarity(i,j)·wj.5.2UncertaintyModeling
InMutualRanker,weuseanapproximationmethodtosolveMRG,whichmayintroduceuncertaintyintotheretrievalresults.Itisthereforeimportanttomodeluncertainty.SinceweemploytheMonteCarlosamplingmethod,thedistributionofeachrankingscoreisknown.Hence,wecanemploytheprobabilitytheorytomodeluncertainty.Uncertaintyisde nedasaparameterfordepictingthedispersionofvaluesthatcanbereasonablyattributedtothemeasuredvalue[5].Traditionalmethodsmodelthemeasuredvalueasanormallydistributedrandomvariable[14,49].Variance[14]andstandarddeviation[49]areamongthemostcommonlyusedmeasurestorepresentuncertaintywhereinthemeasuredvalueisde nedonthesetofbothpositiveandnegativerealnumbers.
Themeasuredvalue(rankingscore)inourapproachisde nedonthesetofpositiverealnumbers.Thus,theabovemodelingmethodcannotbeapplieddirectlytoourwork.
Accordingto[2],zijinEq.(4)hasaPoissondistribution.Therankingscoreistheweightedsumofaseriesofzij.Hence,therankingscoreismodeledasaPoissonmixture.ForaPoissonmixture,thevarianceisapproximatelyproportionaltothemean.Hence,ifweusevariancetomodeluncertainty,thelargertherankingscore,themoreuncertainitis,butthisisnotalwaystrue.
Standarddeviationisthesquarerootofvarianceandhasasimilarproblem.Consequently,varianceandstandarddeviationarenotgoodmeasuresfordepictinguncertaintyinourmodel.
Forsuchadistribution,acommonlyusedmeasureofdispersionisthevariance-mean-ratio(VMR)[15].ThehighertheVMR,themoredispersedthedistribution.Foritemj,itsVMR(uj)canbede nedas:
uj=vj/rj,
(5)
themutualreinforcementstrengthamongposts,users,andhashtags.disthedampingfactorinPageRank,andwesetitto0.85,asin[7].Wp,Wu,andWharevectorsforthepriorsaliencyoftheitems(e.g.,thecontentqualityofposts,thesocialin uenceofusers,orthepopularityofhashtags). LetR= Ru ,W= Wu ,andM= αpuMpu
Rh
Wh
Rp
Wp
αppMppαphMph
αupMup
αuuMuuαuhMuh
αhpMhpαhuMhu αhhMhh
Rp,Ru,andRharetherankingscorevectorsofposts(p),users(u),andhashtags(h).Mxydenotestheaf nitymatrixfromxtoy,wherex,ycanbeposts,users,orhashtags.αxyisaweightusedtobalance
.
Then,Eq.(1)canbesimpli edas:
R=dMR+(1 d)W.
(2)
5MRG-BASEDUNCERTAINTYANALYSIS
SinceexactinferenceofMRGisverytime-consumingonalargegraph,weapproximateitusingamoreef cientMonteCarlosamplingmethod.Wealsoexplicitlymodeltheuncertaintyassociatedwitheachitem(e.g.,apost,auser,orahashtag),aswellasitspropagationonthegraph.
wherevjisthedistributionvarianceoftherankingscoreofitemj.Accordingto[2],vjcanbecalculatedasfollows:
10.1109/TVCG.2015.2467554, IEEE Transactions on Visualization and Computer Graphics
vj=(1 d)2∑w2ivzij.
i=1N
(6)
wherevzijisthevarianceofzij.EachzijobeysaPoissondistribution
anditsvariancecanbecalculatedfromitsexpectation.
Themassivenumberofitemsinthemicroblogdatameanswecannotplaceallofthemonthescreen.Hence,weaggregatesimilaritemstoformacluster.Theoverallrankingscoreofaclusterrcisde nedasthesumoftherankingscoresofitsitems[4].Therankingscoresareindependentofeachotherandtheoverallvarianceofthecluster,vc,isthesumofthevarianceoftherankingscores.Thus,theuncertaintyofacluster,uc,canbecalculatednaturallybydividingvcandrc.
uc=vc/rc=∑(rj/rc)(vj/rj)=∑kjuj.
j∈c
j∈c
(c)
Fig.5.Topologicaluncertaintypropagationcalculation
MtoM .Thischangeonlyaffectsasmallpartoftherandomwalks
(7)
Eq.(7)showsthatuccanbeexpressedbyaweightedsumoftheuncertaintyofitsitemswhereeachweightkjistheratiooftherankingscoresofitemjandclusterc.Thus,theuncertaintyofaclusterismainlydeterminedbyitsimportantitems.
5.3TopologicalUncertaintyPropagation
Ifananalyst ndsanincorrectlyrankeditem,hecanmodifyitbasedonhisknowledge.Hecanfurthertrackhowtheuncertaintypropa-gatesfromoneclustertoanothertoidentifyotheraffecteditems.Tohelpananalysttrackuncertainty,weexplicitlymodelitstopologicalpropagationonthegraph.
InMRG,therankingscoreofanitemcanbeexpressedasalinearcombinationofrankingscoresofrelateditems.Hence,thevariancesofarankingscorecanalsobeexpressedasalinearcombinationofthevariancesofrelatedrankingscores.Theuncertaintyofeachitemcanbecalculatedfromitsrankingscoreanditsvariance,andhence,theuncertaintyofanitemcanalsobeexpressedlinearlybytheuncertaintyofotheritems.Speci cally,
uj=
i=1,i=j
∑
N
m ijui,
(8)
222whereeachm ij=(dmijri)/((1 dmjj)rj).Eq.(8)showsthattheuncer-taintyofeachitemisnotindependentanditpropagatesonthegraphin
alinearform.Thus,foreachpairofitemsiandj,m ijuicanbeviewedasthepropagateduncertaintyfromitemitoj.Wedenoteitbyui→j.RewritingEq.(8)inamatrixform,wecanformulatetheuncertaintypropagationasaMarkovchain:
usedintheMonteCarlosamplingmethod.
Fortheaffectedrandomwalks,existingincrementalgraphrankingalgorithms[3]performre-samplingandupdatetherankingscoresbyaggregatingthestatisticsofthesenewrandomwalksintotheoriginalresults.Onemainproblemwiththesealgorithmsisthatre-samplingrequiresaconsiderableamountoftime,whichmaymakereal-timeinteractionimpossible.Supposenistheaveragenumberofneighborsthatanitemhasandlistheaveragelengthofasampledrandomwalk.Ateachstepinarandomwalk,wehavetosamplefromamultinomialdistributionwithnpossibleoutcomesandthetimecostisO(n).Thus,samplinganewrandomwalkwilltakeO(nl)time.ThetimeneededtocomputeandaggregatethestatisticsofthesesamplesisO(l).ThetotaltimerequiredforanewsampleisO(nl)+O(l)=O(nl).
However,inourscenario,wedonotdeleteoraddedgesonthegraphs.Asaresult,wedonotneedtoperformre-sampling.Weonlyneedtomodifythestatisticsofarandomwalkbasedonthemodi edtransitionprobabilitymij,therebyavoidingthehighcostassociatedwithresampling.Thetimecostofupdatinganin uencedrandomwalkisreducedtoO(l).
Givenarandomwalk:path={i→n1→...→nk 1→j},wede neanewrandomvariablexikj.Inparticular,xikj=1indicatesthattherandomwalkstartsfromiandreachesjbymovingksteps.Theoriginalweightofeachstepintherandomwalkis1.Duringanupdate,were-calculatetheweightofthisstepusingP (xikj=1)/P(xikj=1).P(xikj=1)istheprobabilityofxikj=1accordingtoMandP (xikj=1)istheprobabilityofk=1accordingtoM .Hence,P(xk=1)iscalculatedby:xijij
P(xikj=1)=min1mn1n2...mnk 1j.
(12)
Similarly,P (xikj=1)canalsobecalculated.6
VISUALIZATION
UM =U,
(9)
whereU=[uj]1×NandM =[m ij]N×N.
Similartotheuncertaintypropagationfromitemtoitem,wecanmodeltheuncertaintypropagationfromclustertoclusterusingthefol-lowingprocedure.First,basedonEq.(8),wecalculatethepropagateduncertaintyfromeachitemiinthesourceclustercstoeachitemjinthetargetclusterct(Fig.5(a)).Second,foreachitemjinct,wecomputethepropagateduncertaintyucs→jfromcstoitemjbyaggregatingtheuncertaintypropagatedfromeachiteminthesourcecluster(Fig.5(b)).
ucs→j=
i∈cs
Tohelpanalystsextractmicroblogdataofinterestinteractively,wehavedesignedacompositevisualizationthatincludesagraphvisualization,anuncertaintyglyph,anda owmap(Fig1(a)).6.1RankingResultsasGraphVisualization
Sinceonepostcorrespondstoonlyoneuserandafewhashtags,thescopeofin uenceofapostissmallerthanthatofauserorahashtag.Updatingtherankingscoreofapostwillonlydirectlyaffecttherankingscoresofitsauthor,afewrelatedhashtags,andanumberofposts.Incontrast,updatingtherankingscoreofauserorhashtagwilldirectlyimpacttherankingscoresofhundredsoreventhousandsofpostsaswellasanumberofusersandhashtags.Ontheotherhand,thenumberofpostsisusuallyhuge,around10-100timesthatofusersorhashtags.Analystswouldrequiremoretimetoprovidetheirfeedbackonapostgraph.Asaresult,weregardtheuserandthehashtagastheprimaryvisualizationelementsandthepostasasecondaryelementmainlyusedtoillustratethecontentoftheprimaryelements.Accordingly,usersandhashtagsarevisuallyrepresentedbyanode-linkgraphwhereaspostsarerepresentedasalist.Forsimplicity,wetakeahashtaggraphasanexampletoillustratethebasicideaofgraphvisualization.
Toallowanalyststonavigatelargegraphsef ciently,ahierarchyisbuiltbasedonaBayesianRoseTree[26]witheachnon-leafnoderep-resentingahashtagcluster.AsshowninFig.3(a),astackedtreeisadoptedtorepresentthehashtaghierarchyandadensity-basedgraphvisualizationisemployedtoillustratetherelationshipswithintheuser/hashtaggraphsandbetweenthem(R2,R3).
∑ui→j.
(10)
Finally,theuncertaintyofaclusterisaweightedsumoftheuncertaintyoftheitemsinit(Eq.(7)).Thus,theoverallpropagateduncertaintyucs→ctfromcstoctcanbecalculatedastheweightedsumofthepropa-gateduncertaintyfromcstoeachjinct(Fig.5(c)).
ucs→ct=
j∈ct
∑kjucs→j.
(11)
5.4IncrementalRankingUpdate
Wealsoallowanalyststointeractivelymodifytheitemrankingresultbasedontheirknowledge.WecanupdatethemodellocallybecauseweusetheMonteCarlosamplingmethod.Aftertheanalystchangestherankingscore(s),ourapproachiterativelyupdatesthepriorsaliencescore(s)oftheitem(s).Accordingly,theaf nitymatrixischangedfrom
Fig.6.Basicideaofthelayoutalgorithm:(a)placetheclusternodesandderivethelayoutcenterofeachcluster;(b)computetheVoronoitessellationandtreatthecellasthelayoutareaofeachcluster;(c)layoutofrepresentativeandnon-representativenodes;and(d) nallayoutresultwithcontext.
Thedensity-basedgraphvisualizationcombinesanode-linkdiagramwithadensitymaptodisplaythenodesattheselectedlevelofthehashtagtree.Asin[25],weextractrepresentativenodesforeachoftheclusternodesattheselectedtreelevelandassignothernon-representativenodestotheirclosestrepresentativenodes.AsshowninFig.1(a),therepresentativenodesaredisplayedasanode-linkdiagramandtheothernodesasadensitymap.Inthisvisualization,therepresentativenodesofoneclusterareplacedneareachothertore ecttheircloseness.Thesizeofthenodeencodesthesumoftherankingscoreofeachitem.Thecorrespondingusersareoverlaidaroundtheselectedhashtagnodetoprovidemoreanalysiscontext(Fig.6(d)).Layout.Thelayoutofthestackedtreeisquitestraightforward.Thus,weintroducethelayoutofthedensity-basedgraph,whichcontainsthefollowingsteps.
Step1:Derivethelayoutcenterofeachclusterattheselectedtreelevel.Webuildaclustergraphbycheckingtheedgeconnectionsbetweenthetwoclusternodes.Anedgeisaddedifasuf cientnumberofconnectionsbetweenthetwoclusternodescanbefound.Theclustergraphisthenplacedbyaforce-directedlayout[21].AsshowninFig.6(a),thepositionofeachclusternodeistreatedasthecenterofeachhashtagcluster.
Step2:Computethelayoutareaofeachcluster.Inthisstep,wecomputethecorrespondingVoronoitessellationbasedontheclustercenter.Thecorrespondingtessellationcellsaretreatedaslayoutareasofthehashtagclusters(Fig.6(b)).
Step3:Layoutofrepresentativeandnon-representativenodes.Inthisstep,theforce-directedlayoutisadoptedtoplacetherepresentativenodes.Toensuretherepresentativenodeswithinoneclusterareplacedinthecorrespondingclusterlayoutarea,arepulsionforceisaddedfromtheareaboundarytoeachnodewithinthisarea.Thekerneldensityestimation[22]isutilizedtorepresentthedistributionofnon-representativenodes(Fig.6(c)).
Step4:Layoutofthecontextwordcloud.Showingthehashtaggraphandusergraphsimultaneouslywouldintroducevisualclutter.Tosolvethisissue,wetreatthehashtaggraphasaprimaryelementandtheuserinformationascontext.Inparticular,whenahashtagnodeisselected,awordcloudthatincludestheuserswhousethishashtagislaidouttoprovideusercontext.Inthiswordcloud,theselectedhashtagisplacedinthemiddle.Asweep-line-basedwordcloudlayoutalgorithm[36]isemployedtoproducesuchawordcloud.Fig.6(d)showsalayoutresultwithawordcloudcontext.
Interaction.Thefollowinginteractionsareprovidedtoassistanalystsininvestigatingtherankingresultsfrommultipleperspectives.
Examiningtherankedmicroblogdataandtheirrelationships(R2).Thedensity-basedgraphvisualizationprovidesaneasywaytoexploretherankingresultsfromthehashtagoruserperspective.Utilizingthehashtaghierarchyallowstheanalysttoexploretherankingresultsfromaglobaloverviewtolocaldetails.Several lters,suchastheedgeortheglyph lter,enableanalyststocustomizethisvieweasily.Relevantposts,hashtags,andusersarealsoprovidedtohelpanalystsbetterunderstandthecontentoftheselectedclusternode.
Smoothlyswitchingbetweendifferentdatadimensions(R3).Inspiredbythecontextpopupinteractionin[18],wealsooverlaycontextofaselecteditemtoprovidefurthernavigationcues.Forexample,ifthe
Upper Hinge (quartile)Lower Hinge (quartile)
Lower Extreme
(a)(b)(c)
Fig.7.Designoftheuncertaintyglyph:(a)boxplot;(b)transformingtheboxplottotheuncertaintyglyph;(c)analternativedesign.
analystselectsahashtag,thelabelsofuserswhousethathashtagcanbeoverlaidaroundtheselectedhashtagviaawordcloud(Fig.11(a)).Iftheanalyst ndssomethingofinterest,thehashtaggraphwillbesmoothlytransitionedtotheusergraph(Fig.11(b)).
6.2UncertaintyasGlyph
Aftertestingwiththe rstprototype,theexpertsidenti edseveralincorrectrankingresults.Theyexpressedtheneedtobeinformedofsuchresults.Thisrequirementisrelatedintimatelywiththeconclusionofpreviouswork,whichstatedthateffectivelyconveyinguncertaintyisveryimportanttothevisualanalyticsprocess[14,49].Sincetherankingresultsareaggregatedintoclustersintheoverview,theexpertswantedtoexaminetheuncertaintydistributionoftheaggregatenode,includingtheminimumvalue(0),maximumvalue(1.0),lowerextreme,upperextreme,lowerhinge(25%),andupperhinge(75%).
Inspiredbytheboxplotdesign(Fig.7(a)),wehavedesignedaglyphtomeettheaboverequirements(Fig.7(b)).AsshowninFig.7(a),sixvaluesfromasetofdataareconventionallyusedinaboxplot,includingtheminimumandmaximumvalues,theextremes,andtheupperandlowerhinges(quartiles).Atotalof50%percentofitemsfallinbetweentheupperandlowerhinges.Tocombineaboxplotwithagraphnode,we rsttransformtheboxplottoaline-basedone,andthenbenditaroundtheupperboundaryofthenode(Fig.7(b)).Wealsoattemptedseveralalternativesintheparticipatorydesignprocesswithexperts.Fig.7(c)isoneofthem.Afterinteractingwiththisalternative,theexpertsstatedthatitwasconfusing.Theythoughtthattheitemwithmoreofa lledareainsideshouldbetheoneonwhichtheyshouldfocus.However,inreality,thesenodeswereonlynodeswithalargerareabetweentheupperandlowerhinges.APhDstudentfromanartschoollatercon rmedthatalargeramountofdigitalinkwillattractmoreattentionfromusers.Afterseveralinteractionswiththeexpertsandtheartstudent,wechooseFig.7(b)asour naldesign.
Analystscanobtainanoverviewoftheuncertaintydistributioninaclusterbyexaminingitsuncertaintyglyph.Fig.8illustratesseveralexamplepatterns.Forexample,inFig.8(a),themajorityofitemsinthisclusterarecharacterizedbylowuncertainty.However,theclusteralsocontainssomeitemswithhigheruncertainty.Asaresult,exploringtheitemswithhighuncertaintyisaworthwhileendeavor.
Interaction.Inadditiontoallowinganalyststoexaminetheuncertaintyscore(R4),wealsoprovidetheinteractionshownbelowtointegrateanexpert’sknowledgeintotheretrievalprocess.
Interactiverankingre nement.Afteranexpert ndsanincorrectrank-ingresultbyexaminingtheuncertaintyglyph,theexpertcanmodifytherankingresult.Therankingscoresofthecorrespondinggraphnodeswillalsobeupdatedaccordingly.AsshowninFigs.1(c)-(f),the
isquiteshort.Thus,ifweconsiderthismeasure,manyoftheselinesegmentswillnotbebundledtogether.
Thetotaledgecompatibilityisde nedby:
Ce(ei,ej)=Cα(ei,ej)·Cs(ei,ej)·Cp(ei,ej).
(a)mostitemsinhigheruncertaintyarealsoincluded;(b)mostitemsinthisclusterhavehighuncertaintyandsomeitemswithhigheruncertaintyalsooccur;(c)uncertaintydistributionisuniform;(d)mostitemsinthisclusterhaveloweruncertainty.
Fig.9(b)showsthematchedresultsofthepropagationpaths.
Step3:Computetheforcetobundlethepropagationpath.Thecom-binedforceforapointpioneiisde nedas:
Fpi=Ki( pi 1 pi + pi pi+1 )+
ej∈E
∑
pi pj ·Ce(ei,ej)
whereKiisthespringconstantforeachsegmentandEisthesetofallthematchededgesofei.In[19],thelastitemistheelectrostaticforceFe=1/ ei ej .Inordertobundlethematchedpathsthatarelocatedawayfromeachother,wereplaceitwithanattractingspringforce.Fig.9(c)showsthelayoutresultsofthepropagationpaths.7
QUANTITATIVEEVALUATION
Inthissection,wequantitativelyevaluatetheeffectivenessofourMRGcomputationandincrementalrankingupdatealgorithm.
youtofmultipleuncertaintypropagationpaths:(a)initiallayoutbasedonthe owmaplayout;(b)thematchedresultofthepropagationpaths;(c)thelayoutresultofthepropagationpaths.
7.1MRGComputation
ToevaluatetheperformanceofourMRGcomputationbasedontheMonteCarlosamplingmethod,wecompareditwiththematrix-basedmethodproposedin[16].WeusedtwoTwitterdatasetsintheexperiments:governmentshutdownandEbolaoutbreak.Theshutdowndatasetcontainstweetsonthe2013USgovernmentshutdown(5,132,510tweetsfromOct.1toOct.16,2013),whichwerecollectedbyusingqueriessuchas“shutdown.”TheEboladatasetcontainstweetsontheEbolaoutbreak(1,425,017tweetsfromJan.1toDec.25,2014),whichwerecollectedbyusingqueriessuchas“ebola.”AllexperimentswereconductedonaPCwitha3.1GHzCPUand16GBRAM.
Thereweretoomanyposts,users,orhashtagsandwecouldnotlabelallofthem.Thus,wedidnotreporttherecallinourevaluation.Inthisevaluation,weusedtopn-precision(n-Prec)astheevaluationmeasure.Topn-precisionisthepercentageofthecorrectlyretrieveditemsamongthetop-nrankeditems.Thismeasureisoftenusedwhentherecallishardtocalculate[9].Tofullycomparethetwoalgorithms,wecalculatedthetop10,50,100,and200-precisionforposts,users,andhashtags,respectively.WeinvitedtwoPhDstudentswhomajoredindataminingandarefamiliarwiththedatasetstoevaluatethere-trievalresults.Theylabeledtheresultsindividuallyandresolvedthedifferencesviadiscussion.TheresultsareshowninTable1.Overall,ouralgorithmperformedbetterthanthebaselineonbothdatasets.Weinspectedthetop10retrieveditemswithbothmethods.Ingeneral,theretrieveditemswerequiteaccurate.However,thebaselinehadonemistakeinthetop10usersselectedfromtheshutdowndataset.Itoverestimatedtheimportanceofausercalled@governmentclosd,whopostedasigni cantnumberoftweetswithanumberofhashtags.However,thisuserdidnothavemanyfollowersandhis/hertweetswereseldomretweeted.Incontrast,ouralgorithmcanavoidthismistakebytakingauser’sauthorityintoconsideration.ThebaselinealgorithmalsohadsimilarmistakesintheEboladataset.
Dataset
n-Precrankingscores(e.g.,nodesizes)ofseveralnodeschanged.Aglyphisdesignedtoillustratethechange,withthedottedorangecircleencodingthepreviousrankingscoreandtheboundaryofthe lledcircle(graycolor)representingthechangedrankingscore(Figs.1(d)-(f)).6.3UncertaintyPropagationasFlowMap
The owmap[33,44]isdesignedtovisuallyanalyzethemovementofobjectsfromonelocationtomultiplelocations.Inspiredbythisdesign,wedeveloptheuncertaintypropagationpath(Fig.1),whichisusefulforquicklyderivingtheunknownuncertainnode(s)fromtheknownone(s)(R5).
Layout.Thelayoutofmultipleuncertaintypropagationpathsofdiffer-entnodesisbasedonthe owmaplayoutin[44]andtheedgebundlingin[19].Thelayoutcontainsthefollowingsteps.
Step1:Derivetheinitialuncertaintypropagationpathbasedonthe owmaplayout.We rstcomputetheuncertaintypropagationoftheselectednodebasedonthetopologybyusingthemethodinSec.5.3.The owmaplayoutviaspiraltreesisthenunitizedtogeneratetheinitialuncertaintypropagationpath(Fig.9(a)).
Step2:Employedgecompatibilitymeasurestomatchthecorrespond-ingpropagationpathsfromdifferentnodes.Inthisstep,weemploythethreecompatibilitymeasuresdescribedin[19]tomatchthepropagationpathsfromdifferentnodes.
The rstmeasureisanglecompatibility,whichaimstomatchtheedgeswithasmallerangle.Itisde nedby:
Cα(ei,ej)=|cos(α)|.
Thesecondmeasureisscalecompatibility,whichtendstomatchtheedgeswithsimilarlengths.Itismeasuredby:
Cs(ei,ej)=2/(lavg·min(|ei|,|ej|)+min(|ei|,|ej|)/lavg),
wherelavg=(|ei|+|ej|)/2.
Thethirdmeasureispositioncompatibility,whichaimstomatchthecloseedgestogether.Itisde nedby:
Cp(ei,ej)=lavg/(lavg+ Qm1 Qm2 ),
Shutdown
whereQm1andQm2arethemidpointsofedgeseiandej.
Thelastmeasure,visibilitycompatibility,describedin[19]isnotconsideredinourmethodbecausetherearetoomanylinesegmentsinthepropagationpathgeneratedbythe owmaplayout,eachofwhich
Ebola
parisonoftheMRGcomputationmethodwiththebaselineusingtop-nprecisionforposts,users,andhashtags.
10.Theoverviewofthegovernmentshutdowndataset.
ciology(S)andoneresearcherinmediaandcommunications(C),tounderstandtheirrespectiveinterestsinthedatasets.Wedesignedanumberofexplorationtasks.Inthesecondphase,wecollaboratedwiththeexpertsto nishthedesignedtasks.Duringthisphase,weaskedquestionstodiscusswiththeexpertstheusefulnessofourtoolforeachtask.Finally,theexpertswereinvitedtoanotherdiscussionsessiontoprovideoverallfeedbackonhowourtoolcouldhelpthemwithreal-worldtasks.
8.1CaseStudy:GovernmentShutdown
Inthisstudy,weworkedwithexpertSto:1)evaluatehowuncertaintyanalysiscanbeutilizedtoidentifykeyhashtagsanduserswithasatis-factorycon dencelevel;2)leverageoursystemtoiterativelyreducetheuncertaintylevels;3)extractrelevanthashtags/users/tweetsrelatedtothegovernmentshutdown.
.Theexpertquicklyfoundinterestingresultsafterexamininghashtagoverview(Fig.10(a))generatedbyoursystem.Sheiden-sevenprominenttopicsdescribedbyasetofhashtags:generalabouttheshutdownandObamacare(Fig.10A),politicalontwitter(Fig.10B),discussiononendingtheshutdown10C),thein uenceoftheshutdownonpeople’slives(Fig.10D)thegovernmentshutdownonnewsmedia(Fig.10E),debt-discussion(Fig.10F),andcriticsoftheshutdown(Fig.10G).analysis:The“#shutdown”cluster(Fig.10(b))attractedexpert’sattentionbecauseitcontainsitemswithhigheruncertainty.expertexaminedthedetailedhashtagsandtweetsinthecluster.foundthatinadditiontocommonhashtagssuchas#govtshutdown,and#shutdowngop,anumberofdiversehashtagsalsocreated.Suchhashtagsincludedthosethatcriticizedthee.g.,#shutdownharry;localnewsposts,e.g.,#hounews,andcampaigns,e.g.,#dontcutkids.Shewantedtoexaminethemostones,soshesortedthehashtagsbytheuncertaintylevel.#lewinskywasrankedasthemostuncertainhashtag10(c)).Theanalystsearchedtherelatedtweetsandfoundthattaggedwith#lewinskyconcernedtheshutdownoftheClintonin1995.Theexpertdecidedtolowertherankingscoreofhashtag.Duringtheprocess,shecommentedthattheuncertaintyanditem- lteringfeaturewereuseful,helpingher lteroutitemsbyloweringtheirrankingscores.
Uncertaintypropagation:Next,theexpertexaminedhowtheuncer-taintyofthe“#shutdown”clusterwouldin uenceneighboringclusters.Sheclickedthe“propagation”buttonandthecorrespondinguncertaintypropagationwasdisplayed(theorange owinFig.1).Shealsoselectedtheuncertaintypropagationofthe“#democrats”cluster(theblue owinFig.1)and“#republicans”cluster(thegreen owinFig.1),whichwerecloselyrelatedtothe“#shutdown”cluster.AsshowninFig.1(b),clus-ter“#nationalparks”sharedtheuncertaintypropagatedfromthethreeclusters.GiventhattheclosingofthenationalparkswasaresultofthegovernmentshutdownandstimulateddiscussiononTwitter,theexpertincreasedtherankingscoreof#nationalparks(from4to6).Inoursys-tem,therankingscoreisfrom1to10,with10beingthehighestscore.Aftertheadjustment,shenoticedthescoresofanothertwohashtagclusterswereautomaticallyincreased:“#spitehouse”and“#teaparty.”Inthe rstcluster,therankingscoresofhashtagssuchas“#spitehouse”
switchedbacktothehashtaggraphtocheckthe
onthisgraph.Shefoundanewhashtagcluster,Shethenzoomedintothiscluster.Asshowninprimarilyexpressescriticismofthegovernment,DemocratsorRepublicans(“@PeteSessions#De-#shutdown#MakeDCListen#senatemustactStandforusefulnessofoursystem,weconductedasemi-withthetwoexperts.TheyusedMutualRankerin2hours,sotheywerefamiliarwithitsbasicfunctions.waswellreceivedbythem.
MutualRankerasaresearchtooltohelpposts,users,andhashtagsquicklyandconve-believedthatMutualRankerisveryusefulforcodingAccordingtohim,codingisthemostinhis eld.Extensivetrainingandcarefulatten-beenrequiredtoproducereliabledata.Hecommented,isurgentlyneededinmydailyworktoandcosts.Especiallywhentherearenotthelinkagebetweenitems[inthissystem]willprovidetomakedecision.[...]Thissystemalsoprovidesanthedataretrievalprocess.”
wereimpressedbytheuncertaintyillustrationandfunction.Forexample,ExpertCsaid,“Uncertaintyawesomefeature,Icanuseitto ndsomeunexpectedthecoverageofcoding.”
thatsmoothlyswitchingbetweendifferentdatagraphshelpedthem ndrelevantdatamorequickly.ExpertScom-mented,“Thisswitchingfunctionenablesmetoeasilytransitionbe-tweenthehashtaggraphandtheusergraph.WhenImodifyonerankingscoreinonegraph,Icannotonlyverifytheresultinthisgraph,butalsoverifyitinanothergraph.”
Theexpertsalsosuggestedseveralimprovements.Thetargetaudi-enceofMutualRankerisexpertswithdomainknowledge.Theexpertsbelievedthataverageuserscanalsobene tfromit.Theysuggestedthatmoreintuitivevisualdesignbeused.ExpertCsaid,“Theuncertaintyglyphcanbesimpli edforageneraluser.Forexample,maybetheglyphdoesnotneedtoencodetheuncertaintydistribution,justsimplyshowthatthisrankingscoreisuncertainty.”Theyalsoexpressedtheneedtoretrievestreamingdata.9
DISCUSSION
AND
iscausedbytherankingscoredecreaseofhashtags“#ebt”and“#obam-zombies.”Theexpertthenexaminedtherelevanttweetstoprobethereason.TheEBTsystemwascrashedatthattimeandmanypeoplewonderedwhetherthecrashwascausedbythegovernmentshutdown:“Ahh...#ebtnotworkingcauseifa#governmentshutdown?Howsadyoucan’tspendmoneytakenfrommeagainstmywillthatIworkedfor...”Then,thecrashwasexplainedtobearesultofacomputerfailure(“AccordingtoNBC,#ebtisdownbecauseofatechnicalissue,NOT#governmentshutdown”).Thus,theexpertbelieved“#ebt”wasirrelevantandappreciatedthisautomaticchange.
Switchingbetweendifferentdataviews.Inadditiontohashtags,theexpertwantedtoexaminetheuserswhoparticipatedindifferentdiscus-siongroups.Forexample,shewantedtoidentifythemostactiveusersinthe“#shutdown”cluster,sosheoverlaidtheuserlabelsaroundthehashtaglabels(Fig.11(a)).Theexpertthenswitchedtotheuserviewtoexploreadditionaluserinformation(Fig.11(b)andFig.11(c)).Sheimmediatelyidenti edtheleadingusersinFig.11(b)andFig.11(c).Shedescribedthemwithtwocategories:1)keygovernmentof cialaccounts,including“@barackobama,”“@whitehouse”(Fig.11(b));and2)newsagencies/publicmediasuchas“@nytimes,”“@guardian,”and“@bloombergnews”(Fig.11(c)).Consideringthatpartisanleaderswereofmajorinteresttoher,she rstobservedtherankingscoresofselectpoliticians,e.g.,@speakerboehner(Rank8),@whiphoyer(Rank8),@nancypelosi(Rank7),etc.Shebelievedthattheimportanceoftheseuseraccountswasunderestimatedbecausethein uenceandac-tivenessofpoliticiansontwitterareusuallymuchlowerthanthatinreallife.Shechangedtherankingsofthepartisanleaders,“@speaker-boehner,”“@whiphoyer,”and“@nancypelosi,”to10,whichisthehighest.Fig.11(d)showsthedifferenceafterthisre nement.
Afterthechange,theuserclusterswereregeneratedandtheuncertainlylevelsofsomenodeswerelargelyreduced.Notably,“@whiphoyer”becameanimportantclusterwiththescoresofseveralusersintheclusterautomaticallyincreased(Fig.11(e)).Forexample,“@repmaloney,”from5to6and“@repteddeutch,”from5to6.“ThesearemembersofCongress.Thechangeoftheirrankingscoresisnaturalhere.”Theexpertcommented,“Thisiscool.[...]IfIwanttochangetherankingscoreofoneuser,othersjustautomaticallyfollow.Thiscouldhelpme ndtheimportantuserswhosenamesIamnotfamiliarwithorwhoarenotactiveonTwitter.”
FUTUREWORK
Thispaperpresentsavisualanalyticssystem,MutualRanker,tohelpanalystsinteractivelyretrievedataofinterestfrommicroblogs.WeextendtheMRGmodeltoextractamultifacetedretrievalresultthatincludesthemutualreinforcementrankingresults,theuncertaintyofeachrank,andtheuncertaintypropagationamongdifferentgraphnodes.Themodelistightlyintegratedwithacompositevisualizationtoassistanalystsinretrievingsalientposts,users,andhashtagseffectively,inanuncertainty-awareenvironment.
Inthefuture,weplantoimprovesystemperformancebyimplement-ingaparallelMonteCarlosamplingmethod.Anotherexcitingavenueforfutureworkistoretrievestreamingdatainmicroblogs,whichcanbeveryusefulinemergencymanagementandthreatanalysis.Webe-lievethesystemcanalsobene taverageusersinterestedincollectingmicroblogdata.Inthefuture,wewillalsoinvitemoreuserstotryoursystemandconductaformaluserstudy.Accordingly,wewillimproveMutualRankerbasedonthecollectedfeedback.ACKNOWLEDGMENTS
WewouldliketothankX.WangandJ.Yin,J.Gong,andDr.W.Cuiforhelpfuldiscussionsonthevisualizationdesign,Dr.J.ZhangandDr.Y.Songforconstructivesuggestionsonsimilaritymeasures,aswellasDr.W.PengandDr.J.Suforprovidingdomainexpertise.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/TVCG.2015.2467554, IEEE Transactions on Visualization and Computer Graphics
REFERENCES
[1]A.C.Alhadi,T.Gottron,J.Kunegis,andN.Naveed.Livetweet:Microblog
retrievalbasedoninterestingnessandanadaptationofthevectorspacemodel.InProceedingsofTREC,2011.
[2]K.Avrachenkov,N.Litvak,D.Nemirovsky,andN.Osipova.Montecarlo
methodsinpagerankcomputation:Whenoneiterationissuf cient.SIAMJ.Numer.Anal.,45(2):890–904,2007.
[3]B.Bahmani,A.Chowdhury,andA.Goel.Fastincrementalandpersonal-izedpagerank.Proc.VLDBEndow.,4(3):173–184,2010.
[4]M.Bianchini,M.Gori,andF.Scarselli.Insidepagerank.ACMTrans.
InternetTechnol.,5(1):92–128,2005.
[5]I.BIPM,I.IFcc,andI.IuPAc.Oiml,guidetotheexpressionofuncertainty
inmeasurement.InternationalOrganizationforStandardization,Geneva.ISBN,pages92–67,1995.
¨ttmann,S.Koch,R.Kru¨ger,[6]H.Bosch,D.Thom,F.Heimerl,E.Pu
¨rner,andT.Ertl.Scatterblogs2:Real-timemonitoringofmicroblogM.Wo
messagesthroughuser-guided ltering.IEEETVCG,19(12):2022–2031,2013.
[7]S.BrinandL.Page.Theanatomyofalarge-scalehypertextualwebsearch
puternetworksandISDNsystems,30(1):107–117,1998.[8]N.Cao,Y.-R.Lin,X.Sun,zer,S.Liu,andH.Qu.Whisper:Tracing
thespatiotemporalprocessofinformationdiffusioninrealtime.IEEETVCG,18(12):2649–2658,2012.
[9]A.ChandramouliandS.Gauch.Aco-operativewebservicesparadigm
forsupportingcrawlers.InLargeScaleSemanticAccesstoContent(Text,Image,Video,andSound),pages475–489,2007.
[10]H.Chen,S.Zhang,W.Chen,H.Mei,J.Zhang,A.Mercer,R.Liang,and
H.Qu.Uncertainty-awaremultidimensionalensembledatavisualizationandexploration.IEEETVCG,2015(ToAppear).
[11]J.Chen,J.Zhu,Z.Wang,X.Zheng,andB.Zhang.Scalableinferencefor
logistic-normaltopicmodels.InProceedingsofNIPS,pages2445–2453.2013.
[12]S.CherichiandR.Faiz.Relevantinformationmanagementinmicroblogs.
InformationSystemsforKnowledgeManagement,pages159–182,2013.[13]C.Collins,S.Carpendale,andG.Penn.Visualizationofuncertaintyin
latticestosupportdecision-making.InProceedingsofEUROVIS,pages51–58,2007.
[14]C.Correa,Y.-H.Chan,andK.-L.Ma.Aframeworkforuncertainty-aware
visualanalytics.InProceedingsofIEEEVAST,pages51–58,Oct2009.[15]D.R.CoxandP.A.Lewis.Thestatisticalanalysisofseriesofevents.
Wiley,1966.
[16]Y.Duan,F.Wei,Z.Chen,M.Zhou,andH.Shum.Twittertopicsumma-rizationbyrankingtweetsusingsocialin uenceandcontentquality.InProceedingsofColing,pages763–780,2012.
[17]rmationsearchandretrievalinmicroblogs.Journalof
theAmericanSocietyforInformationScienceandTechnology,62(6):996–1008,2011.
[18]S.Ghani,B.Kwon,S.Lee,J.-S.Yi,andN.Elmqvist.Visualanalyticsfor
multimodalsocialnetworkanalysis:Adesignstudywithsocialscientists.IEEETVCG,19(12):2032–2041,2013.
[19]D.HoltenandJ.J.VanWijk.Force-directededgebundlingforgraph
puterGraphicsForum,28(3):983–990,2009.
[20]W.JavedandN.Elmqvist.Exploringthedesignspaceofcomposite
visualization.InProceedingsofPaci cVis,pages1–8,2012.
[21]T.KamadaandS.Kawai.Analgorithmfordrawinggeneralundirected
rmationprocessingletters,31(1):7–15,1989.
[22]mpeandH.Hauser.Interactivevisualizationofstreamingdata
withkerneldensityestimation.InProceedingsofPaci cVis,pages171–178,2011.
[23]B.LiuandL.Zhang.Asurveyofopinionminingandsentimentanalysis.
InMiningtextdata,pages415–463.2012.
[24]S.Liu,W.Cui,Y.Wu,andM.Liu.Asurveyoninformationvisualization:
recentadvancesandchallenges.TheVisualComputer,pages1–21,2014.[25]S.Liu,X.Wang,J.Chen,J.Zhu,andB.Guo.Topicpanorama:Afull
pictureofrelevanttopics.InProceedingsofIEEEVAST,pages183–192,2014.
[26]X.Liu,Y.Song,S.Liu,andH.Wang.Automatictaxonomyconstruction
fromkeywords.InProceedingsofKDD,pages1433–1441,2012.
[27]S.Lodha,A.Pang,R.Sheehan,andC.Wittenbrink.U ow:visualizing
uncertaintyin uid ow.InProceedingsofIEEEVisualization,pages249–254,Oct1996.
[28]Y.Lu,F.Wang,andR.Maciejewski.Businessintelligencefromsocial
[29][30][31][32][33][34][35][36][37][38][39][40][41][42][43][44][45][46][47][48][49][50][51][52][53][54][55]
media:Astudyfromthevastboxof cechallenge.IEEEComputerGraphicsandApplications,34(5):58–69,2014.
Z.Luo,M.Osborne,S.Petrovic,andT.Wang.Improvingtwitterretrievalbyexploitingstructuralinformation.InProceedingsofAAAI,2012.A.Marcus,M.S.Bernstein,O.Badar,D.R.Karger,S.Madden,andR.C.Miller.Twitinfo:Aggregatingandvisualizingmicroblogsforeventexploration.InProceedingsofCHI,pages227–236,2011.
R.McCreadieandC.Macdonald.Relevanceinmicroblogs:Enhancingtweetretrievalusinghyperlinkeddocuments.InProceedingsofOAIR,pages189–196,2013.
A.T.Pang,C.M.Wittenbrink,andS.K.Lodha.Approachestouncertaintyvisualization.TheVisualComputer,13(8):370–390,1997.
D.Phan,L.Xiao,R.Yeh,andP.Hanrahan.Flowmaplayout.InProceed-ingsofIEEEInfoVis,pages219–224,2005.
E.J.Ruiz,V.Hristidis,C.Castillo,A.Gionis,andA.Jaimes.Correlating nancialtimeserieswithmicro-bloggingactivity.InProceedingsofWSDM,pages513–522,2012.
S.SedhaiandA.Sun.Hashtagrecommendationforhyperlinkedtweets.InProceedingsofSIGIR,pages831–834,2014.
L.Shi,F.Wei,S.Liu,L.Tan,X.Lian,andM.Zhou.Understandingtextcorporawithmultiplefacets.InProceedingsofIEEEVAST,pages99–106,2010.
M.Skeels,B.Lee,G.Smith,rmationVisualization,9(1):70–81,2010.A.Slingsby,J.Dykes,andJ.Wood.Exploringuncertaintyingeode-mographicswithinteractivegraphics.IEEETVCG,17(12):2545–2554,2011.
G.Sun,Y.Wu,R.Liang,andS.Liu.Asurveyofvisualanalyticstech-niquesandapplications:State-of-the-artresearchandfuturechallenges.JournalofComputerScienceandTechnology,28(5):852–867,2013.G.Sun,Y.Wu,S.Liu,T.-Q.Peng,J.Zhu,andR.Liang.Evoriver:Visualanalysisoftopiccoopetitiononsocialmedia.IEEETVCG,20(12):1753–1762,2014.
J.Tang,Z.Liu,M.Sun,andJ.Liu.Portrayinguserlifestatusfrommicrobloggingposts.TsinghuaScienceandTechnology,18(2):182–195,2013.
J.Thomson,E.Hetzler,A.MacEachren,M.Gahegan,andM.Pavel.Atypologyforvisualizinguncertainty.SPIE,5669:146–157,2005.
C.Vehlow,T.Reinhardt,andD.Weiskopf.Visualizingfuzzyoverlappingcommunitiesinnetworks.IEEETVCG,19(12):2486–2495,Dec2013.K.Verbeek,K.Buchin,andB.Speckmann.Flowmaplayoutviaspiraltrees.IEEETVCG,17(12):2536–2544,2011.
F.Wei,W.Li,Q.Lu,andY.He.Query-sensitivemutualreinforcementchainanditsapplicationinquery-orientedmulti-documentsummarization.InProceedingsofSIGIR,pages283–290,2008.
J.Weng,E.-P.Lim,J.Jiang,andQ.He.Twitterrank:Findingtopic-sensitivein uentialtwitterers.InProceedingsofWSDM,pages261–270,2010.
Y.Wu,S.Liu,K.Yan,M.Liu,andF.Wu.Opinion ow:Visualanalysisofopiniondiffusiononsocialmedia.IEEETVCG,20(12):1763–1772,2014.Y.Wu,F.Wei,S.Liu,N.Au,W.Cui,H.Zhou,andH.Qu.Opinion-seer:Interactivevisualizationofhotelcustomerfeedback.IEEETVCG,16(6):1109–1118,2010.
Y.Wu,G.-X.Yuan,andK.-L.Ma.Visualizing owofuncertaintythroughanalyticalprocesses.IEEETVCG,18(12):2526–2535,Dec2012.
P.Xu,Y.Wu,E.Wei,T.-Q.Peng,S.Liu,J.J.H.Zhu,andH.Qu.Visualanalysisoftopiccompetitiononsocialmedia.IEEETVCG,19(12):2012–2021,2013.
E.Zangerle,W.Gassler,andG.Specht.Ontheimpactoftextsimilarityfunctionsonhashtagrecommendationsinmicrobloggingenvironments.SocialNetworkAnalysisandMining,3(4):889–898,2013.
J.Zhao,N.Cao,Z.Wen,Y.Song,Y.-R.Lin,andC.Collins.# ux ow:Visualanalysisofanomalousinformationspreadingonsocialmedia.IEEETVCG,20(12):1773–1782,2014.
X.W.Zhao,Y.Guo,Y.He,H.Jiang,Y.Wu,andX.Li.Weknowwhatyouwanttobuy:Ademographic-basedsystemforproductrecommendationonmicroblogs.InProceedingsofKDD,pages1935–1944,2014.
H.-J.Zimmermann.Fuzzysettheoryanditsapplications.SpringerScience&BusinessMedia,2001.
T.ZukandS.Carpendale.Visualizationofuncertaintyandreasoning.InProceedingsofSG,pages164–177,2007.