Archive for the 'Computing' Category

On Git

Posted in Computing 5 months ago

Many years ago, I tried playing with distributed version controls. Darcs seemed to be all the rage back then, but somehow I found them to be way too complicated for my uses. Yesterday, I decided to give Git a whirl.

I got it.

Having used Subversion for a couple of years and CVS for more years than I care to remember, I realized that a central repository system was fundamentally broken. It requires one to keep so much in one’s head. I almost always have to execute a svn status before every commit because I can’t remember about files that have moved, changed, deleted, added etc. This also adds a lot of cruft to the metadata of files that have had substantial changes across branch merges.

On the other hand, Git only cares about changes between trees. It cares about content and not files. You diff trees, not files. This is all very beautiful.

I have only scratched the surface. I cheated and took the crash course for people coming from the Subversion world. I haven’t even explored how merges work across branches: they are about five of them in Git.

One day, I will get all of it. Until then, I’ll trust the judgement of the incredibly smart people who work on Git. This is another of Linus’ masterpieces.

PS: Here’s a fairly good explanation of the guts of Git: Git for computer scientists.

Hot Chips 20 Reflections

Posted in Computing 10 months ago

Here’s my summary of the hot chips workshop that I recently attended. It was well attended with over 600 people showing up. The organizers also provided lunch on all days and dinner on one day. I learnt a lot, not only from the tutorials and talks, but also from talking to people during lunch and dinner.

Read the rest of this entry »

Hot Chips 20

Posted in Activity, Computing 10 months, 2 weeks ago

I’ll be at Stanford for the next few days for Hot Chips 20, a symposium on high performance chips. Sessions I’m particularly interested in:

  • D.E. Shaw’s specialized ASIC for molecular dynamics which I’ve written about earlier and IBM’s PowerXCell powering Roadrunner.
  • Upcoming architectures: AMD’s 780G and Intel’s Nehalem (dot products of special interest to me.)
  • Chips tuned for network or IO (Sun’s Rock, Fujitsu’s SPARC64VII and Intel’s Tukwila.)
  • Algorithmic content: Roofline models for automatic tuning of kernels (good addition to Demmel’s talk on the future of linear algebra from MMDS.)
  • Intel’s Larrabee: response to “the can of whoop-ass” (detailed architectural paper from SIGRAPH.)
  • CUDA: useful for a class of algorithms (based on memory access.)

I’m going to be trying something new this time — live blogging. I’ll try to push constant updates to my twitter stream : gane5h.

I’ll be staying at the Sheraton in Palo Alto. Drop me a line if you want to meetup for a chat.

Fast Determinant Calculation

Posted in Computing 10 months, 3 weeks ago

Someone came to me recently for some help on computing determinants. Determinants of 2×2 and 3×3 and maybe even 4×4 matrices are trivial and you can easily hard code the expanded symbolic form. Determinants of bigger matrices are much harder and require careful thought.

First some rules:

  • Singular matrices have determinant zero. Simple, but often overlooked.
  • If you require a spectral decomposition at a later stage, the easiest is the product of the eigenvalues (the matrix cookbook is a handy reference.)
  • For triangular matrices, it’s the product of the diagonal entries.
  • The determinant of the product of two matrices is the product of their determinants.

Using the above rules, the first step is to factor your matrix into two triangular matrices. The LU decomposition achieves this. In general, a permutation matrix allows all ones on the diagonal of L. If you only care about the absolute value of the determinant, simply computing the product of the diagonal values of U gets you the determinant.

If your matrix is symmetric and positive definite, then you can use a specialization known as Cholesky decomposition which makes L the transpose of U (doing half the work with half the storage.) The determinant is then the product of the square of the diagonal elements. Diffusion tensors are covariance matrices (symmetric, positive definite) and so this decomposition always applies.

Modern Massive Data Sets Reflections

Posted in Activity, Computing 10 months, 3 weeks ago

The workshop was a blast! I had an incredible time getting up to speed on the latest and greatest in data analysis research. It was quite humbling to brush shoulders with some of top folks pushing the frontiers of science. There were also many opportunities to network where I could get a peek at the motivations behind some of the projects presented.

Each day had a theme:

  • Data Analysis and Data Applications
  • Networked Data and Algorithmic Tools
  • Statistical, Geometric, and Topological Methods
  • Machine Learning and Dimensionality Reduction

The breadth of topics was quite exhaustive. I mostly pushed my own agenda: streaming algorithms. There’s so much to write here that I won’t even attempt to.

Besides streaming algorithms, the presentations on mathematical topics were really interesting. Some of it I’ve previously seen from my day-to-day work, some of it was new. Of particular interest to me were the following:

  • Graph Sparsification : Never seen anything like this before.
  • Massive Terrain Data : Real smart use of offline datastrutures.
  • Symmetries in point cloud data : I’m intimately familiar with this style of mathematics from my previous work.
  • Pathway Analysis in Protein Folding : Puts bread on the table.
  • Intersection SVMs : Didn’t know this was a well known concept in machine learning known as the kernel trick. Goes by Reproducing Kernel Hilbert Space in my neck of the woods and also a precursor to the above mentioned image matching algorithm.
  • Manifold regularization : Fréchet means anyone?
  • Sufficient Dimension Reduction
  • Semi-definite programming : Some mathematical insights to a couple of engineering problems (where <1e-4 is good enough) that’s making my life difficult.
  • Spectral Algorithms
  • Matrix/Tensor Factorization
  • Future of Parallel Linear Algebra

Thanks to my friend Krishna who let me sleep on the floor in his house, thereby saving me from the grips of boredom.