Purdue University
Higgs Boson
CERN's Large Hadron Collider generates hundreds of millions of particle collisions each second. Recording, storing, and analyzing this vast number of collisions presents a massive data challenge: the collider produces roughly 20 million gigabytes of data each year.
1,000,000,000,000,000: The number of proton-proton collisions, a thousand trillion, analyzed by the ATLAS and CMS experiments.
100,000: The number of CDs per second it would take to record all the data from the ATLAS detector, a stack growing 450 feet (137 meters) every second; at this rate, the CD stack could reach the moon and back twice each year, according to CERN.
27: The number of CDs per minute it would take to hold the amount of data ATLAS actually records, since it only records data that shows signs of something new.
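To put the figures above on a common scale, the short sketch below converts them to average data rates. It is only back-of-the-envelope arithmetic: the 700 MB CD capacity, the decimal units (1 GB = 10^9 bytes), and the function names are assumptions made for illustration, not values given by CERN.

```python
# Back-of-the-envelope conversion of the quoted figures to data rates.
# Assumptions (not from the source): decimal units and a 700 MB CD.

SECONDS_PER_YEAR = 365 * 24 * 3600
GB = 10**9  # bytes in a (decimal) gigabyte
MB = 10**6  # bytes in a (decimal) megabyte


def average_rate_mb_per_s(gigabytes_per_year: float) -> float:
    """Average rate in MB/s implied by a yearly data volume in GB."""
    return gigabytes_per_year * GB / SECONDS_PER_YEAR / MB


def cd_rate_mb_per_s(cds_per_minute: float, cd_capacity_mb: float = 700.0) -> float:
    """Rate in MB/s implied by filling a given number of CDs per minute."""
    return cds_per_minute * cd_capacity_mb / 60.0


if __name__ == "__main__":
    # Roughly 20 million gigabytes per year for the whole collider.
    print(f"collider average: {average_rate_mb_per_s(20e6):.0f} MB/s")
    # 27 CDs per minute actually recorded by ATLAS.
    print(f"ATLAS recorded:   {cd_rate_mb_per_s(27):.0f} MB/s")
```

Both results come out at a few hundred megabytes per second, sustained. The yearly figure is an average over the whole calendar year, so the rate while the collider is actually running would be higher.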
[...] happened, [...] conference. The computing power and the network that CERN uses are a very important part of the research, he added.
Current database tools are insufficient to capture, analyze, search, and visualize data at the scale encountered today.
Theory to support new directions
Large graphs
Spectral analysis
High dimensions and dimension reduction
Clustering
Collaborative filtering
Extracting signal from noise
Sparse vectors
Sparse vectors
There are a number of situations where sparse vectors are important.
Tracking the flow of ideas in scientific literature
Biological applications
Signal processing
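As a minimal illustration of what "sparse" means in practice, the sketch below stores only the nonzero entries of a vector and computes a dot product over them. The dictionary representation and the names `to_sparse` and `dot` are assumptions chosen for this example; the slides do not prescribe a representation.

```python
from typing import Dict, Sequence

# A sparse vector maps index -> nonzero value; absent indices are zero.
SparseVector = Dict[int, float]


def to_sparse(dense: Sequence[float], tol: float = 0.0) -> SparseVector:
    """Keep only the entries whose magnitude exceeds tol."""
    return {i: x for i, x in enumerate(dense) if abs(x) > tol}


def dot(u: SparseVector, v: SparseVector) -> float:
    """Dot product that touches only the indices stored in both vectors."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the shorter vector
    return sum(x * v[i] for i, x in u.items() if i in v)


if __name__ == "__main__":
    a = to_sparse([0.0, 0.0, 3.0, 0.0, 1.5])
    b = {4: 2.0, 7: 1.0}
    print(a)          # stored entries of a
    print(dot(a, b))  # only index 4 is shared
```

The cost of `dot` depends on the number of stored entries rather than on the nominal dimension, which is what makes very high-dimensional but sparse data tractable.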