Hot Chips 20 Reflections

Here’s my summary of the Hot Chips workshop that I recently attended. It was well attended, with over 600 people showing up. The organizers also provided lunch on all days and dinner on one day. I learnt a lot, not only from the tutorials and talks, but also from talking to people during lunch and dinner.

Morning Tutorial

As processors become faster and faster, memory is not able to keep up with the demand made by applications, a problem further exacerbated by multicore technology. As a result, applications spend most of their time waiting for data to arrive from memory. This is known as the “memory wall.” This tutorial had presenters from Rambus, Intel and AMD who described their respective approaches to dramatically improve the situation. Most of the technology showcased is still in the prototype stage. Note to self: never again confuse the terms bandwidth, latency and throughput.
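
To keep those three terms straight, here is a back-of-the-envelope sketch in Python. The numbers are made up for illustration and are not from the tutorial. Latency is how long a single request takes, bandwidth is the peak rate the memory channel can sustain, and throughput is the rate actually achieved. Little's law ties them together: throughput = requests in flight × request size / latency, capped at the peak bandwidth.

```python
# Illustrative numbers only (assumptions, not figures from the tutorial).
LATENCY_S = 100e-9        # time for one memory request to complete: 100 ns
LINE_BYTES = 64           # bytes delivered per request (one cache line)
PEAK_BANDWIDTH = 10e9     # peak rate the memory channel can sustain: 10 GB/s


def achieved_throughput(requests_in_flight):
    """Bytes per second actually delivered (Little's law, capped at peak)."""
    demand = requests_in_flight * LINE_BYTES / LATENCY_S
    return min(demand, PEAK_BANDWIDTH)


def requests_needed_to_saturate():
    """Outstanding requests required before latency stops being the limit."""
    return PEAK_BANDWIDTH * LATENCY_S / LINE_BYTES


if __name__ == "__main__":
    for n in (1, 4, 16, 64):
        print(n, "in flight:", achieved_throughput(n) / 1e9, "GB/s")
    print("needed to saturate:", requests_needed_to_saturate())
```

With these made-up numbers, a core that waits on one request at a time sees only 0.64 GB/s out of a 10 GB/s channel, and it takes roughly 16 requests in flight to reach the peak. That gap is the memory wall in miniature.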

Afternoon Tutorial

The second half of the day had various people from Nvidia showcasing the CUDA technology. They had just released version 2.0. I’ve seen most of the material a couple of times before, so I’m only going to write about things I found really interesting:

  • really simple extensions to the C language. My biggest concern with this toolkit is whether we will ever see a return on the investment of writing for CUDA. What if Nvidia were to abandon this effort a few years down the line? One of the product managers alleviated this concern for me: the language is so simple that it should be trivial to write a source-to-source translator to convert it to plain multithreaded code.

  • the nvcc compiler can spit out code for their GPUs or plain multithreaded code that can be compiled with gcc or Visual Studio.

  • the CUDA FFT library supports three-dimensional transforms of Hermitian sequences. This special optimization is twice as fast as the regular complex-to-complex transform.

  • hardware support for various types of interpolation, filtering and boundary conditions owing to its graphics roots.

  • hardware support for sin, exp, pow, rcp (reciprocal) that can be done in one clock cycle.

  • future: Fortran/C++ support, multiple GPU devices, debugger, profiler, GPU cluster support.
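
The speedup for Hermitian sequences mentioned in the FFT bullet comes from a symmetry. A Hermitian sequence satisfies X[N-k] = conj(X[k]), which is what you get when you transform real-valued data, so only about half of the values need to be stored or computed. Here is a minimal sketch in plain Python, using a naive O(N²) DFT rather than the CUDA FFT library. The function names are made up for illustration.

```python
import cmath


def dft(x):
    """Naive O(N^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def half_spectrum(x):
    """Keep only the N//2 + 1 non-redundant outputs for real input.

    This sketch still computes every output and then slices; a real
    library would skip the redundant half of the work.
    """
    return dft(x)[: len(x) // 2 + 1]


def rebuild_full_spectrum(half, n):
    """Recover all N outputs from the half spectrum via X[N-k] = conj(X[k])."""
    full = list(half)
    for k in range(len(half), n):
        full.append(half[n - k].conjugate())
    return full


if __name__ == "__main__":
    signal = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0, -2.0, 1.5]   # real-valued input
    full = dft(signal)
    rebuilt = rebuild_full_spectrum(half_spectrum(signal), len(signal))
    # The difference should be rounding error only.
    print(max(abs(a - b) for a, b in zip(full, rebuilt)))
```

For eight real samples only five complex outputs carry information; the other three are mirror images. Halving the data that has to be moved and computed is consistent with the roughly twofold speedup quoted above.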

Conference Day 1

Cars that drive themselves: One of the most interesting talks on the first day was the keynote by Sebastian Thrun. I had attended a talk of his before at SFU, when he visited just after winning the DARPA Grand Challenge in 2005. I was impressed then, and I continue to be impressed by the quality of the work that he and his team put out. This time he spent a lot of time describing the difficulties posed by the next DARPA Urban Challenge. Two interesting points from his talk:

  • the occupancy of freeways is really bad. With automated cars, we can improve the “throughput” of the freeways.
  • you can think of these automated cars as train carriages, only attaching/detaching themselves as required.
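
The throughput argument can be made concrete with a rough calculation. This is a sketch with made-up numbers, not figures from the talk. It assumes each car takes up its own length plus the gap it keeps to the car in front, and that the gap is the distance covered during a fixed headway time.

```python
def cars_per_hour(speed_m_s, car_length_m, headway_s):
    """Cars passing a point in one lane per hour.

    Each car needs its own length plus the distance covered during the
    headway time it keeps to the car in front.
    """
    space_per_car = car_length_m + speed_m_s * headway_s
    return 3600.0 * speed_m_s / space_per_car


def occupancy(speed_m_s, car_length_m, headway_s):
    """Fraction of the lane length physically covered by cars."""
    return car_length_m / (car_length_m + speed_m_s * headway_s)


if __name__ == "__main__":
    SPEED = 30.0        # m/s, about 108 km/h (assumed)
    CAR_LENGTH = 5.0    # m (assumed)
    cases = (("human, 2.0 s gap", 2.0), ("automated, 0.5 s gap", 0.5))
    for label, headway in cases:
        print(label,
              round(cars_per_hour(SPEED, CAR_LENGTH, headway)), "cars/h,",
              "occupancy", round(occupancy(SPEED, CAR_LENGTH, headway), 3))
```

Under these assumptions a human-driven lane carries about 1,660 cars per hour with cars covering under 8% of the road, while a 0.5 s gap raises that to 5,400 cars per hour. Shrinking the gap to near zero is the train-carriage picture.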

Mobile Media Processing: Three really impressive technologies.

  • a 300 mW single-chip television from Telegent. Most of the world is still on analog television, so they take technology from the 50s and apply the latest process and methodologies to it.
  • a voice enhancer from Audience, modeled on the human auditory system. Based on the Fast Cochlea Transform, it can pick out just your voice and separate it from the background. This enables really advanced voice manipulation techniques, useful for ransom negotiation. Maybe not.
  • Nvidia Tegra. Really cool HD video in the palm of your hand.

Supercomputing: The reason I was there.

  • Cell Broadband Engine: For some reason I found it difficult to stay focussed, as this was not x86.

  • Anton: Written about this earlier. They can do a nonbonded computation in one clock cycle! That’s awesome. They get massive speedups with careful organization of memory and treating their hardware as a massive stream processor. All of the messy conditionals and correction terms are treated separately. Interesting to see that they get better performance by doing more computation and reducing unpredictability.
