Modern Massive Data Sets Reflections

The workshop was a blast! I had an incredible time getting up to speed on the latest and greatest in data analysis research. It was quite humbling to rub shoulders with some of the top folks pushing the frontiers of science. There were also many opportunities to network, where I could get a peek at the motivations behind some of the projects presented.

Each day had a theme:

  • Data Analysis and Data Applications
  • Networked Data and Algorithmic Tools
  • Statistical, Geometric, and Topological Methods
  • Machine Learning and Dimensionality Reduction

The breadth of topics was quite exhaustive. I mostly pushed my own agenda: streaming algorithms. There’s so much to write here that I won’t even attempt to.

Besides streaming algorithms, the presentations on mathematical topics were really interesting. Some of it I had seen before in my day-to-day work; some of it was new. Of particular interest to me were the following:

  • Graph Sparsification : Never seen anything like this before.
  • Massive Terrain Data : Real smart use of offline data structures.
  • Symmetries in point cloud data : I’m intimately familiar with this style of mathematics from my previous work.
  • Pathway Analysis in Protein Folding : Puts bread on the table.
  • Intersection SVMs : Didn’t know this was a well-known concept in machine learning called the kernel trick. It goes by Reproducing Kernel Hilbert Space in my neck of the woods, and is also a precursor to the above-mentioned image matching algorithm.
  • Manifold regularization : Fréchet means anyone?
  • Sufficient Dimension Reduction
  • Semi-definite programming : Some mathematical insights into a couple of engineering problems (where <1e-4 is good enough) that are making my life difficult.
  • Spectral Algorithms
  • Matrix/Tensor Factorization
  • Future of Parallel Linear Algebra

Thanks to my friend Krishna who let me sleep on the floor in his house, thereby saving me from the grips of boredom.
