<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Ganesh Swami &#187; Computing</title>
	<atom:link href="http://ergodicity.iamganesh.com/category/computing/feed/" rel="self" type="application/rss+xml" />
	<link>http://ergodicity.iamganesh.com</link>
	<description>Quick brown foxes and lazy dogs.</description>
	<lastBuildDate>Sat, 31 Jan 2009 21:07:31 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>On Git</title>
		<link>http://ergodicity.iamganesh.com/2009/01/on-git/</link>
		<comments>http://ergodicity.iamganesh.com/2009/01/on-git/#comments</comments>
		<pubDate>Sat, 31 Jan 2009 11:06:44 +0000</pubDate>
		<dc:creator>ganesh</dc:creator>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[distributed]]></category>
		<category><![CDATA[git]]></category>
		<category><![CDATA[open-source]]></category>
		<category><![CDATA[version-control]]></category>

		<guid isPermaLink="false">http://ergodicity.iamganesh.com/?p=303</guid>
		<description><![CDATA[Many years ago, I tried playing with distributed version control systems. Darcs seemed to be all the rage back then, but somehow I found them to be way too complicated for my uses. Yesterday, I decided to give Git a whirl.

I got it.

Having used Subversion for a couple of years and CVS for more years than [...]]]></description>
			<content:encoded><![CDATA[<p>Many years ago, I tried playing with distributed version control systems. Darcs seemed to be all the rage back then, but somehow I found them to be way too complicated for my uses. Yesterday, I decided to give Git a whirl.</p>

<p>I got it.</p>

<p>Having used Subversion for a couple of years and CVS for more years than I care to remember, I realized that a central repository system was fundamentally broken. It requires one to keep so much in one&#8217;s head. I almost always have to execute a <code>svn status</code> before every commit because I can&#8217;t remember about files that have moved, changed, deleted, added etc. This also adds a lot of cruft to the metadata of files that have had substantial changes across branch merges.</p>

<p>On the other hand, Git only cares about changes between trees. It cares about content and not files. You <code>diff</code> trees, not files. This is all very beautiful.</p>

<p>I have only scratched the surface. I cheated and took the <a href="http://git.or.cz/course/svn.html">crash course</a> for people coming from the Subversion world. I haven&#8217;t even explored how merges work across branches: there are about <a href="http://www.kernel.org/pub/software/scm/git/docs/git-merge.html">five</a> merge strategies in Git.</p>

<p>One day, I will get all of it. Until then, I&#8217;ll trust the judgement of the incredibly smart people who work on Git. This is another of Linus&#8217; masterpieces.</p>

<p>PS: Here&#8217;s a fairly good explanation of the guts of Git: <a href="http://eagain.net/articles/git-for-computer-scientists/">Git for computer scientists</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://ergodicity.iamganesh.com/2009/01/on-git/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Hot Chips 20 Reflections</title>
		<link>http://ergodicity.iamganesh.com/2008/09/hot-chips-20-reflections/</link>
		<comments>http://ergodicity.iamganesh.com/2008/09/hot-chips-20-reflections/#comments</comments>
		<pubDate>Sun, 07 Sep 2008 09:22:17 +0000</pubDate>
		<dc:creator>ganesh</dc:creator>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[anton]]></category>
		<category><![CDATA[conference]]></category>
		<category><![CDATA[cuda]]></category>
		<category><![CDATA[hpc]]></category>
		<category><![CDATA[silicon valley]]></category>
		<category><![CDATA[supercomputing]]></category>

		<guid isPermaLink="false">http://ergodicity.iamganesh.com/?p=307</guid>
		<description><![CDATA[Here&#8217;s my summary of the hot chips workshop that I recently attended. It was well attended with over 600 people showing up. The organizers also provided lunch on all days and dinner on one day. I learnt a lot, not only from the tutorials and talks, but also from talking to people during lunch and [...]]]></description>
			<content:encoded><![CDATA[<p>Here&#8217;s my summary of the hot chips workshop that I recently attended. It was well attended with over 600 people showing up. The organizers also provided lunch on all days and dinner on one day. I learnt a lot, not only from the tutorials and talks, but also from talking to people during lunch and dinner. </p>

<p><span id="more-307"></span></p>

<h3>Morning Tutorial</h3>

<p>As processors become faster and faster, memory is not able to keep up with the demand made by applications, a problem further exacerbated by multicore technology. As a result, applications spend most of their time waiting for data to arrive from memory. This is known as the &#8220;memory wall.&#8221; This tutorial had presenters from Rambus, Intel and AMD who described their respective approaches to dramatically improve the situation. Most of the technology showcased is still in the prototype stage. I should also never get confused between the terms bandwidth, latency and throughput.</p>

<h3>Afternoon Tutorial</h3>

<p>The second half of the day had various people from Nvidia showcasing the <a href="http://www.nvidia.com/cuda">CUDA</a> technology. They had just released version 2.0. I&#8217;ve seen most of the material a couple of times before, so I&#8217;m only going to write about things I found really interesting:</p>

<ul>
<li><p>really simple extensions to the C language. My biggest concern with this toolkit is if we will ever see a return on investment in writing for CUDA. What if Nvidia were to abandon this effort a few years down the line? One of the product managers alleviated this concern for me &#8212; the language is so simple that it should be trivial to write a source-to-source translator to convert it to plain multithreaded code.</p></li>
<li><p>the <code>nvcc</code> compiler can spit out code for their GPUs or plain multithreaded code that can be compiled with gcc or visual studio.</p></li>
<li><p>the CUDA FFT library supports three-dimensional transforms of Hermitian sequences. This special optimization is twice as fast as the regular complex-to-complex transform.</p></li>
<li><p>hardware support for various types of interpolation, filtering and boundary conditions owing to its graphics roots.</p></li>
<li><p>hardware support for sin, exp, pow, rcp (reciprocal) that can be done in one clock cycle.</p></li>
<li><p>future: fortran/c++ support, multiple GPU devices, debugger, profiler, GPU cluster support.</p></li>
</ul>

<h3>Conference Day 1</h3>

<p><strong>Cars that drive themselves:</strong> One of the most interesting talks on the first day was the keynote by <a href="http://robots.stanford.edu/">Sebastian Thrun</a>. I had attended his talk before at SFU when he had come over just after winning the DARPA challenge in 2005. I was impressed then and I continue to be impressed by the quality of work that he and his team put out. This time he spent a lot of time describing the challenges in the next DARPA Urban Challenge. Two interesting points from his talk:</p>

<ul>
<li>the occupancy of freeways is really bad. With automated cars, we can improve the &#8220;throughput&#8221; of the freeways.</li>
<li>you can think of these automated cars as train carriages, only attaching/detaching themselves as required.</li>
</ul>

<p><strong>Mobile Media Processing: </strong> Three really impressive technologies.</p>

<ul>
<li>a 300 mW single chip television from <a href="http://www.telegent.com/">Telegent</a>. Most of the world is still on analog television. Take technology from the 50s and apply the latest process and methodologies to it.</li>
<li>voice enhancer based on human audio system from <a href="http://www.audience.com/">Audience</a>. Based on the Fast Cochlea Transform, they can pick out just your voice and segment it from the background. Enables really advanced voice manipulation techniques &#8212; useful for ransom negotiation. Maybe not.</li>
<li>Nvidia Tegra. Really cool HD video in the palm of your hand. </li>
</ul>

<p><strong>Supercomputing:</strong> The reason I was there.</p>

<ul>
<li><p>Cell broadband engine: For some reason I was finding it difficult to stay focussed as this was not x86. </p></li>
<li><p>Anton: Written about this <a href="http://ergodicity.iamganesh.com/2007/08/10/anton/">earlier</a>. They can do a nonbonded computation in one clock cycle! That&#8217;s awesome. They get massive speedups with careful organization of memory and treating their hardware as a massive stream processor. All of the messy conditionals and correction terms are treated separately. Interesting to see that they get better performance by doing more computation and reducing unpredictability.</p></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://ergodicity.iamganesh.com/2008/09/hot-chips-20-reflections/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Hot Chips 20</title>
		<link>http://ergodicity.iamganesh.com/2008/08/hot-chips-20/</link>
		<comments>http://ergodicity.iamganesh.com/2008/08/hot-chips-20/#comments</comments>
		<pubDate>Thu, 21 Aug 2008 07:56:57 +0000</pubDate>
		<dc:creator>ganesh</dc:creator>
				<category><![CDATA[Activity]]></category>
		<category><![CDATA[Computing]]></category>
		<category><![CDATA[conference]]></category>
		<category><![CDATA[de shaw]]></category>
		<category><![CDATA[dot products]]></category>
		<category><![CDATA[larrabee]]></category>
		<category><![CDATA[silicon valley]]></category>
		<category><![CDATA[zymeworks]]></category>

		<guid isPermaLink="false">http://ergodicity.iamganesh.com/?p=256</guid>
		<description><![CDATA[I&#8217;ll be at Stanford for the next few days for Hot Chips 20, a symposium on high performance chips. Sessions I&#8217;m particularly interested in:


D.E. Shaw&#8217;s specialized ASIC for molecular dynamics which I&#8217;ve written about earlier and IBM&#8217;s PowerXCell powering Roadrunner.
Upcoming architectures: AMD&#8217;s 780G and Intel&#8217;s Nehalem (dot products of special interest to me.)
Chips tuned for [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ll be at Stanford for the next few days for <a href="http://www.hotchips.org/hc20/">Hot Chips 20</a>, a symposium on high performance chips. Sessions I&#8217;m particularly interested in:</p>

<ul>
<li>D.E. Shaw&#8217;s specialized ASIC for molecular dynamics which I&#8217;ve written about <a href="http://ergodicity.iamganesh.com/2007/08/10/anton/">earlier</a> and IBM&#8217;s PowerXCell powering <a href="http://en.wikipedia.org/wiki/IBM_Roadrunner">Roadrunner</a>.</li>
<li>Upcoming architectures: AMD&#8217;s 780G and Intel&#8217;s Nehalem (<a href="http://ergodicity.iamganesh.com/2007/03/29/streaming-instructions/">dot products of special interest to me</a>.)</li>
<li>Chips tuned for network or IO (Sun&#8217;s Rock, Fujitsu&#8217;s SPARC64VII and Intel&#8217;s Tukwila.)</li>
<li>Algorithmic content: Roofline models for automatic tuning of kernels (good addition to Demmel&#8217;s talk on the future of linear algebra from MMDS.)</li>
<li>Intel&#8217;s Larrabee: response to &#8220;<a href="http://www.engadget.com/2008/04/10/ce-oh-no-he-didnt-part-lv-nvidia-ceo-says-were-going-to-ope/">the can of whoop-ass</a>&#8221; (detailed <a href="http://softwarecommunity.intel.com/UserFiles/en-us/File/larrabee_manycore.pdf">architectural paper</a> from SIGGRAPH.) </li>
<li>CUDA: useful for a class of algorithms (based on memory access.)</li>
</ul>

<p>I&#8217;m going to be trying something new this time &#8212; live blogging. I&#8217;ll try to push constant updates to my twitter stream : <a href="http://twitter.com/gane5h">gane5h</a>. </p>

<p>I&#8217;ll be staying at the Sheraton in Palo Alto. Drop me a line if you want to meet up for a chat.</p>
]]></content:encoded>
			<wfw:commentRss>http://ergodicity.iamganesh.com/2008/08/hot-chips-20/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Fast Determinant Calculation</title>
		<link>http://ergodicity.iamganesh.com/2008/08/fast-determinant-calculation/</link>
		<comments>http://ergodicity.iamganesh.com/2008/08/fast-determinant-calculation/#comments</comments>
		<pubDate>Thu, 14 Aug 2008 22:54:00 +0000</pubDate>
		<dc:creator>ganesh</dc:creator>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[algorithm]]></category>
		<category><![CDATA[determinant]]></category>
		<category><![CDATA[linear algebra]]></category>
		<category><![CDATA[math]]></category>
		<category><![CDATA[matrix decomposition]]></category>

		<guid isPermaLink="false">http://ergodicity.iamganesh.com/?p=310</guid>
		<description><![CDATA[Someone came to me recently for some help on computing determinants. Determinants of 2&#215;2 and 3&#215;3 and maybe even 4&#215;4 matrices are trivial and you can easily hard code the expanded symbolic form. Determinants of bigger matrices are much harder and require careful thought. 

First some rules:


Singular matrices have determinant zero. Simple, but often overlooked.
If [...]]]></description>
			<content:encoded><![CDATA[<p>Someone came to me recently for some help on computing determinants. Determinants of 2&#215;2 and 3&#215;3 and maybe even 4&#215;4 matrices are trivial and you can easily hard code the expanded symbolic form. Determinants of bigger matrices are much harder and require careful thought. </p>

<p>First some rules:</p>

<ul>
<li>Singular matrices have determinant zero. Simple, but often overlooked.</li>
<li>If you require a spectral decomposition at a later stage, the easiest is the product of the eigenvalues (the <a href="http://matrixcookbook.com/">matrix cookbook</a> is a handy reference.)</li>
<li>For triangular matrices, it&#8217;s the product of the diagonal entries.</li>
<li>The determinant of the product of two matrices is the product of their determinants.</li>
</ul>

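<p>Using the above rules, the first step is to factor your matrix into two triangular matrices. The LU decomposition achieves this. In general, row pivoting is needed, giving PA = LU with a permutation matrix P and all ones on the diagonal of L. The determinant of L is then one, and the determinant of P is +1 or -1 depending on the parity of the row swaps, so the product of the diagonal values of U gives the determinant up to sign. If you only care about the absolute value of the determinant, that product is all you need.</p>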
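<p>As a rough illustration of the LU route, here is a minimal pure-Python sketch. The function name <code>det_lu</code> and the test matrix are my own choices for the example, not something from a library; in practice you would call a LAPACK-backed routine instead.</p>

```python
def det_lu(matrix):
    """Determinant via Gaussian elimination with partial pivoting (PA = LU)."""
    a = [list(map(float, row)) for row in matrix]  # work on a copy
    n = len(a)
    sign = 1.0
    for k in range(n):
        # Pick the largest entry in column k as the pivot for stability.
        pivot = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[pivot][k] == 0.0:
            return 0.0  # singular matrix: determinant is zero
        if pivot != k:
            a[k], a[pivot] = a[pivot], a[k]
            sign = -sign  # every row swap flips the sign of the determinant
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= factor * a[k][j]
    # "a" now holds U; L has ones on its diagonal and contributes nothing.
    det = sign
    for k in range(n):
        det *= a[k][k]
    return det


print(det_lu([[0, 2, 1], [1, 1, 0], [2, 0, 3]]))
```

<p>Tracking the sign of the row swaps is the only extra bookkeeping needed when the signed determinant matters.</p>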

<p>If your matrix is symmetric and positive definite, then you can use a specialization known as Cholesky decomposition which makes L the transpose of U (doing half the work with half the storage.) The determinant is then the product of the squares of the diagonal elements of L. Diffusion tensors are covariance matrices (symmetric, positive definite) and so this decomposition always applies.</p>
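<p>The Cholesky variant can be sketched the same way. Again, <code>det_cholesky</code> and the sample matrices are hypothetical names and data picked for illustration; the code assumes the input really is symmetric and raises an error if it is not positive definite.</p>

```python
import math


def det_cholesky(matrix):
    """Determinant of a symmetric positive definite matrix via A = L L^T."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                d = matrix[i][i] - s
                if d <= 0.0:
                    raise ValueError("matrix is not positive definite")
                lower[i][j] = math.sqrt(d)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    # det(A) = det(L) * det(L^T) = product of squared diagonal entries of L.
    det = 1.0
    for i in range(n):
        det *= lower[i][i] ** 2
    return det


print(det_cholesky([[4.0, 2.0], [2.0, 3.0]]))
```

<p>Only the lower triangle is ever read, which is where the half-the-storage saving comes from.</p>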
]]></content:encoded>
			<wfw:commentRss>http://ergodicity.iamganesh.com/2008/08/fast-determinant-calculation/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Modern Massive Data Sets Reflections</title>
		<link>http://ergodicity.iamganesh.com/2008/08/modern-massive-data-sets-reflections/</link>
		<comments>http://ergodicity.iamganesh.com/2008/08/modern-massive-data-sets-reflections/#comments</comments>
		<pubDate>Wed, 13 Aug 2008 22:54:57 +0000</pubDate>
		<dc:creator>ganesh</dc:creator>
				<category><![CDATA[Activity]]></category>
		<category><![CDATA[Computing]]></category>
		<category><![CDATA[algorithm]]></category>
		<category><![CDATA[conference]]></category>
		<category><![CDATA[math]]></category>
		<category><![CDATA[mmds]]></category>
		<category><![CDATA[stanford]]></category>

		<guid isPermaLink="false">http://ergodicity.iamganesh.com/?p=311</guid>
		<description><![CDATA[The workshop was a blast! I had an incredible time getting up to speed on the latest and greatest in data analysis research. It was quite humbling to rub shoulders with some of the top folks pushing the frontiers of science. There were also many opportunities to network where I could get a peek at the [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://www.stanford.edu/group/mmds/">workshop</a> was a blast! I had an incredible time getting up to speed on the latest and greatest in data analysis research. It was quite humbling to rub shoulders with some of the top folks pushing the frontiers of science. There were also many opportunities to network where I could get a peek at the motivations behind some of the projects presented.</p>

<p>Each day had a theme:</p>

<ul>
<li>Data Analysis and Data Applications</li>
<li>Networked Data and Algorithmic Tools</li>
<li>Statistical, Geometric, and Topological Methods</li>
<li>Machine Learning and Dimensionality Reduction</li>
</ul>

<p>The breadth of topics was quite exhaustive. I mostly pushed my own agenda: streaming algorithms. There&#8217;s so much to write here that I won&#8217;t even attempt to. </p>

<p>Besides streaming algorithms, the presentations on mathematical topics were really interesting. Some of it I&#8217;ve previously seen from my day-to-day work, some of it was new. Of particular interest to me were the following:</p>

<ul>
<li>Graph Sparsification : Never seen anything like this before.</li>
<li>Massive Terrain Data : Real smart use of offline data structures.</li>
<li>Symmetries in point cloud data : I&#8217;m intimately familiar with this style of mathematics from my <a href="http://ergodicity.iamganesh.com/projects/fluid-match/">previous work</a>.</li>
<li>Pathway Analysis in Protein Folding : Puts bread on the table.</li>
<li>Intersection SVMs : Didn&#8217;t know this was a well known concept in machine learning known as the <a href="http://en.wikipedia.org/wiki/Kernel_trick">kernel trick</a>. Goes by <a href="http://en.wikipedia.org/wiki/Reproducing_kernel_Hilbert_space">Reproducing Kernel Hilbert Space</a> in my neck of the woods and also a precursor to the above mentioned image matching algorithm.</li>
<li>Manifold regularization : Fréchet means anyone?</li>
<li>Sufficient Dimension Reduction</li>
<li>Semi-definite programming : Some mathematical insights to a couple of engineering problems (where &lt;1e-4 is good enough) that are making my life difficult.</li>
<li>Spectral Algorithms</li>
<li>Matrix/Tensor Factorization</li>
<li>Future of Parallel Linear Algebra</li>
</ul>

<p>Thanks to my friend Krishna who let me sleep on the floor in his house, thereby saving me from the grips of boredom.</p>
]]></content:encoded>
			<wfw:commentRss>http://ergodicity.iamganesh.com/2008/08/modern-massive-data-sets-reflections/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Modern Massive Data Sets</title>
		<link>http://ergodicity.iamganesh.com/2008/06/modern-massive-data-sets/</link>
		<comments>http://ergodicity.iamganesh.com/2008/06/modern-massive-data-sets/#comments</comments>
		<pubDate>Fri, 20 Jun 2008 02:42:38 +0000</pubDate>
		<dc:creator>ganesh</dc:creator>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[algorithm]]></category>
		<category><![CDATA[conference]]></category>
		<category><![CDATA[data streams]]></category>
		<category><![CDATA[mmds]]></category>
		<category><![CDATA[randomization]]></category>
		<category><![CDATA[stanford]]></category>

		<guid isPermaLink="false">http://ergodicity.iamganesh.com/?p=207</guid>
		<description><![CDATA[I&#8217;m excited about the Workshop on Modern Massive Data Sets I&#8217;ll be attending next week at Stanford. 

The 2008 Workshop on Algorithms for Modern Massive Data Sets (MMDS 2008) will address algorithmic, mathematical, and statistical challenges in modern statistical data analysis. The goals of MMDS 2008 are to explore novel techniques for modeling and analyzing [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m excited about the <a href="http://www.stanford.edu/group/mmds/">Workshop on Modern Massive Data Sets</a> I&#8217;ll be attending next week at Stanford. </p>

<blockquote>The 2008 Workshop on Algorithms for Modern Massive Data Sets (MMDS 2008) will address algorithmic, mathematical, and statistical challenges in modern statistical data analysis. The goals of MMDS 2008 are to explore novel techniques for modeling and analyzing massive, high-dimensional, and nonlinearly-structured scientific and internet data sets, and to bring together computer scientists, statisticians, mathematicians, and data analysis practitioners to promote cross-fertilization of ideas.</blockquote>

<p>These are my notes (disclaimer: I&#8217;m no expert, corrections are welcome.)</p>

<p>Most algorithm engineers thus far are happy with algorithms that are linear, i.e., <img src='http://ergodicity.iamganesh.com/wp-content/latexrenderer/pictures/a6e200f05467fb069772a59842a9f43d.gif' title='\mathcal{O}(n)' alt='\mathcal{O}(n)' align='middle' /> in the number of data points in the dataset. With web-scale data, linear is not good enough: <strong>sub-linear</strong> is required. These streams of data are typically unbounded and do not fit in main memory. They cannot even be stored, as it is infeasible to go back and reload them.</p>

<p>The <em>Frequency Problem</em> is used as a model problem over data streams, for example, computing means/variances, medians, top-k frequent items, distinct elements etc. One approach is to approximate the computation over the stream via random sampling. </p>

<p>There are four topics that keep recurring when looking at proofs of streaming algorithms using randomization:</p>

<ul>
<li><a href="http://en.wikipedia.org/wiki/Markov's_inequality">Markov&#8217;s inequality</a></li>
<li><a href="http://en.wikipedia.org/wiki/Chebyshev's_inequality">Chebyshev’s inequality</a></li>
<li><a href="http://en.wikipedia.org/wiki/H%C3%B6lder's_inequality">Hölder’s inequality</a></li>
<li><a href="http://en.wikipedia.org/wiki/Chernoff_bound">Chernoff bounds</a></li>
</ul>

<p>This is really key. As an example, suppose we wanted to run some complicated algorithm (offline) over the packets flowing through a router. As this would be a huge number of packets, we won&#8217;t have the luxury of storing all of them, only a subset. By sampling the packets with probability <code>p</code>, we can estimate the amount of memory required to store the sample as the expectation value <code>np</code> of the <a href="http://en.wikipedia.org/wiki/Binomial_distribution">binomial variable</a> with parameters <code>n</code> and <code>p</code>, where <code>n</code> is the number of packets going through the router.</p>
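<p>A small simulation makes the memory estimate concrete. This is only a sketch: the stream is a plain range of integers standing in for packets, and the values of <code>n</code>, <code>p</code> and the seed are arbitrary choices for the example.</p>

```python
import random


def sample_stream(packets, p, rng):
    """Keep each packet independently with probability p (Bernoulli sampling)."""
    return [pkt for pkt in packets if rng.random() < p]


n, p = 100_000, 0.01
rng = random.Random(42)  # fixed seed so the run is repeatable
kept = sample_stream(range(n), p, rng)

expected = n * p                    # mean of Binomial(n, p)
std_dev = (n * p * (1 - p)) ** 0.5  # its standard deviation
print(len(kept), expected, std_dev)
```

<p>The number of stored packets should land within a few standard deviations of <code>n * p</code>, which is what lets you size the buffer ahead of time.</p>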

<p>The choice of one bound over another is a matter of how tight the bound is. The Markov inequality only uses the expectation value of the random variable, while the others use higher moments. They do this by applying the Markov inequality to a function of the random variable <code>X</code>: Chebyshev&#8217;s inequality uses the squared deviation from the mean, and the Chernoff bound uses the moment generating function <code>exp(tX)</code>.</p>
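<p>To get a feel for how much tighter the bounds become, here is a sketch that evaluates three of them on the upper tail of a binomial variable. The function name <code>tail_bounds</code> and the numbers are assumptions for the example. The Chernoff value uses the standard closed form obtained by optimizing over <code>t</code>, which is valid when the threshold <code>a</code> lies strictly between <code>np</code> and <code>n</code>.</p>

```python
import math


def tail_bounds(n, p, a):
    """Upper bounds on P(X >= a) for X ~ Binomial(n, p), with n*p < a < n."""
    mean = n * p
    var = n * p * (1 - p)
    markov = mean / a                  # uses only the first moment
    chebyshev = var / (a - mean) ** 2  # Markov applied to (X - mean)^2
    # Chernoff: Markov applied to exp(tX), minimized over t > 0, which gives
    # exp(-n * KL(q || p)) with q = a / n.
    q = a / n
    kl = q * math.log(q / p) + (1 - q) * math.log((1 - q) / (1 - p))
    chernoff = math.exp(-n * kl)
    return markov, chebyshev, chernoff


print(tail_bounds(1000, 0.1, 200))
```

<p>For these inputs each bound is orders of magnitude smaller than the one before it, at the cost of needing more information about the distribution.</p>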

<p>You&#8217;ll want to check the <a href="http://www.stanford.edu/group/mmds/draft.html">preliminary program</a> for the kind of topics that are going to be covered. While you are at it, check out the <a href="http://www.stanford.edu/group/mmds/mmds2006.html">workshop webpage from 2006</a>. I&#8217;ll go find something to wipe the drool off my chin&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://ergodicity.iamganesh.com/2008/06/modern-massive-data-sets/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>The SVD</title>
		<link>http://ergodicity.iamganesh.com/2008/02/the-svd/</link>
		<comments>http://ergodicity.iamganesh.com/2008/02/the-svd/#comments</comments>
		<pubDate>Mon, 04 Feb 2008 08:36:34 +0000</pubDate>
		<dc:creator>ganesh</dc:creator>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[linear algebra]]></category>
		<category><![CDATA[math]]></category>
		<category><![CDATA[strang]]></category>
		<category><![CDATA[svd]]></category>

		<guid isPermaLink="false">http://ergodicity.iamganesh.com/2008/02/04/the-svd/</guid>
		<description><![CDATA[A couple of months back, Prof. Gilbert Strang had come to SFU as part
of the distinguished speakers series (as I had written earlier.)
After the talk, I got a chance to chat with the guru. I owe a lot of
my understanding of linear algebra to his books and his kick-ass
animations (check out these eigen-analysis demos,) and [...]]]></description>
			<content:encoded><![CDATA[<p>A couple of months back, Prof. Gilbert Strang had come to SFU as part
of the distinguished speakers series (as I had written <a href="http://ergodicity.iamganesh.com/2007/01/19/the-perks/">earlier</a>.)
After the talk, I got a chance to chat with the guru. I owe a lot of
my understanding of linear algebra to his books and his kick-ass
animations (check out these <a href="http://web.mit.edu/18.06/www/">eigen-analysis demos</a>,) and so I
asked him about the single most important topic in linear
algebra. Without hesitation, he immediately responded: &#8220;<strong><em>the SVD</em></strong>!&#8221;</p>

<p>The Singular Value Decomposition is the Swiss Army knife of linear
algebra. Every matrix Y can be factored into three matrices: U,
S, and V as</p>

<p><img src='http://ergodicity.iamganesh.com/wp-content/latexrenderer/pictures/b479757b77125513998fc2f2b61c6680.gif' title='Y = U S V^t' alt='Y = U S V^t' align='middle' /></p>

<p>U and V are orthogonal matrices and S is a diagonal matrix. Some
uses of this factorization of the matrix Y: calculating 2-norms,
Frobenius norms, ranks, null spaces and ranges, (pseudo)inverses and
determinants, and by extension solving systems of equations (exact,
over- and under- determined), eigenvalues and eigenvectors, and
approximations.</p>
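<p>A few of these uses can be written down directly. The identities below are standard results, stated here as a quick reference and assuming the singular values on the diagonal of S are sorted as σ<sub>1</sub> ≥ σ<sub>2</sub> ≥ … ≥ 0 (the symbols σ<sub>i</sub>, r and S<sup>+</sup> are notation introduced for this note):</p>

```latex
\begin{aligned}
\|Y\|_2 &= \sigma_1, &
\|Y\|_F &= \sqrt{\textstyle\sum_i \sigma_i^2}, \\
\operatorname{rank}(Y) &= r = \#\{\, i : \sigma_i > 0 \,\}, &
|\det Y| &= \textstyle\prod_i \sigma_i \quad \text{(square } Y\text{)}, \\
Y^{+} &= V S^{+} U^t, &
(S^{+})_{ii} &= \begin{cases} 1/\sigma_i & \sigma_i > 0 \\ 0 & \text{otherwise,} \end{cases} \\
x^\star &= Y^{+} b & &\text{minimizes } \|Yx - b\|_2 .
\end{aligned}
```

<p>The last line is the link to least squares: the pseudoinverse gives the minimum-norm solution whether the system is exact, over- or under-determined.</p>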

<p>Almost every problem in engineering becomes an optimization problem
(as in reducing the error under some norm) and the method of least
squares makes extensive use of the SVD. </p>

<p>The stage has been set for the next few posts of mine.</p>
]]></content:encoded>
			<wfw:commentRss>http://ergodicity.iamganesh.com/2008/02/the-svd/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Fitting data with Python</title>
		<link>http://ergodicity.iamganesh.com/2008/02/fitting-data-with-python/</link>
		<comments>http://ergodicity.iamganesh.com/2008/02/fitting-data-with-python/#comments</comments>
		<pubDate>Sun, 03 Feb 2008 07:33:15 +0000</pubDate>
		<dc:creator>ganesh</dc:creator>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[Physics]]></category>
		<category><![CDATA[least-squares]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://ergodicity.iamganesh.com/2008/02/03/fitting-data-with-python/</guid>
		<description><![CDATA[I&#8217;ve recently become a heavy user of the numerical capabilities of
Python. I&#8217;ve written about my experiments before, but now I&#8217;m
writing production quality code with numpy and
matplotlib.



The above is an actual plot that I created for some Hall
measurements I was doing. I was supposed to find functional
relationships between temperature and majority charge carriers, which
in my case [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve recently become a heavy user of the numerical capabilities of
Python. I&#8217;ve written about my experiments <a href="http://ergodicity.iamganesh.com/2007/07/13/numerical-python/">before</a>, but now I&#8217;m
writing production quality code with <a href="http://numpy.scipy.org/">numpy</a> and
<a href="http://matplotlib.sourceforge.net/">matplotlib</a>.</p>

<p><img src='http://ergodicity.iamganesh.com/wp-content/uploads/2008/02/temp_dep_mobility.png' alt='Mobility Temperature Plot' /></p>

<p>The above is an actual plot that I created for some <a href="http://en.wikipedia.org/wiki/Hall_effect">Hall
measurements</a> I was doing. I was supposed to find functional
relationships between temperature and majority charge carriers, which
in my case were electrons because of the n-type doping. The simple
case was a least squares fit: <code>scipy.optimize.leastsq</code> to the
rescue. The more complicated part was solving a non-linear equation
for roots and then doing a least squares fit. The root-finding module
in scientific python provides lots of options.
At this point, I can confidently say that this environment has more
features than Octave. </p>

<p>Just today, I wanted to use the Fourier method on a differential
equation (plug: the advantages of which are <a href="http://ergodicity.iamganesh.com/2007/04/22/pseudospectral-methods/">here</a>) and numerical
python&#8217;s <code>fft</code>, <code>fftshift</code> and <code>fftfreq</code> are drop-in substitutes for
their Matlab equivalents. You can also put actual LaTeX equations on
plots, which is a major plus.</p>

<p>That is all.</p>
]]></content:encoded>
			<wfw:commentRss>http://ergodicity.iamganesh.com/2008/02/fitting-data-with-python/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Unit Tests in C</title>
		<link>http://ergodicity.iamganesh.com/2008/02/unit-tests-in-c/</link>
		<comments>http://ergodicity.iamganesh.com/2008/02/unit-tests-in-c/#comments</comments>
		<pubDate>Sat, 02 Feb 2008 06:47:01 +0000</pubDate>
		<dc:creator>ganesh</dc:creator>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[C]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[unit tests]]></category>

		<guid isPermaLink="false">http://ergodicity.iamganesh.com/2008/02/02/unit-tests-in-c/</guid>
		<description><![CDATA[I strongly think that unit tests are absolutely necessary to keep the
overall quality of one&#8217;s code high. I&#8217;ve written about Unit Tests
before, so I&#8217;ll continue on that trend.

Most high performance numerical codes use some variant of Fortran or
just raw C (which is my staple language.) In that case, as Michael
Feathers rightly points out, unit tests [...]]]></description>
			<content:encoded><![CDATA[<p>I strongly think that unit tests are absolutely necessary to keep the
overall quality of one&#8217;s code high. I&#8217;ve written about Unit Tests
<a href="http://ergodicity.iamganesh.com/2007/02/12/group-theory-and-unit-testing/">before</a>, so I&#8217;ll continue on that trend.</p>

<p>Most high performance numerical codes use some variant of Fortran or
just raw C (which is my staple language.) In that case, as <a href="http://beautifulcode.oreillynet.com/2008/01/when_c_collides_with_unit_test.php">Michael
Feathers</a> rightly points out, unit tests can &#8220;collide&#8221; with C.</p>

<p>I want to expand on his solution of <em>link seams</em>. </p>

<p>GNU/Linux (and, I think, other Unix-like operating systems) provides a
mechanism for loading shared object files at runtime. Using <code>dlopen(3)</code>
and <code>dlsym(3)</code> you can re-create much of the reflection functionality
that the Java community enjoys. So by loading your unit test
dynamically at runtime, and having a predetermined symbol that behaves
as the entry point, you can nicely wrap your module/function with a
unit test. Interface this with <code>ctest</code> from the <a href="http://www.cmake.org/HTML/Index.html">cmake</a> project,
and you are set. If you want to have a private function whose symbol isn&#8217;t visible on the outside, make it <code>static</code>. The <code>nm</code> utility will
give you a list of public symbols in an object.</p>
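<p>The lookup-by-name idea is easy to try out without writing any C, because Python&#8217;s <code>ctypes</code> module wraps the same <code>dlopen(3)</code>/<code>dlsym(3)</code> machinery. The sketch below is an illustration only, not the test harness itself: it assumes a POSIX system, and it uses the C library&#8217;s <code>abs</code> as a stand-in for a predetermined test entry point.</p>

```python
import ctypes

# CDLL(None) asks dlopen(3) for the symbols already loaded into this process
# (POSIX only). A real harness would pass the path of a test module instead.
lib = ctypes.CDLL(None)

# Looking a symbol up by its string name is the dlsym(3) step.
entry_point = getattr(lib, "abs")  # stand-in for a predetermined test symbol
entry_point.argtypes = [ctypes.c_int]
entry_point.restype = ctypes.c_int

print(entry_point(-5))
```

<p>A C harness would do the same two steps by hand and then call the returned function pointer.</p>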

<p>I&#8217;m pretty sure this is supported across the board. I remember doing
this under DOS using <a href="http://www.delorie.com/djgpp/">DJGPP</a> (remember that?) so this isn&#8217;t exactly novel.</p>
]]></content:encoded>
			<wfw:commentRss>http://ergodicity.iamganesh.com/2008/02/unit-tests-in-c/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Radial Fields</title>
		<link>http://ergodicity.iamganesh.com/2007/09/radial-fields/</link>
		<comments>http://ergodicity.iamganesh.com/2007/09/radial-fields/#comments</comments>
		<pubDate>Wed, 26 Sep 2007 19:48:11 +0000</pubDate>
		<dc:creator>ganesh</dc:creator>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[dtmri]]></category>

		<guid isPermaLink="false">http://ergodicity.iamganesh.com/2007/09/26/radial-fields/</guid>
		<description><![CDATA[For some work I was supposed to do later this term, I wanted a few
synthetic vector/tensor fields. Something more complicated than a
regular planar field.

I&#8217;ve used radial and tangential fields in electromagnetics (think
solenoids and inductors) countless times, so it should be trivial to
generate one, shouldn&#8217;t it? Unfortunately, I was getting mixed up in
the minus signs somewhere [...]]]></description>
			<content:encoded><![CDATA[<p>For some work I was supposed to do later this term, I wanted a few
synthetic vector/tensor fields. Something more complicated than a
regular planar field.</p>

<p>I&#8217;ve used radial and tangential fields in electromagnetics (think
solenoids and inductors) countless times, so it should be trivial to
generate one, shouldn&#8217;t it? Unfortunately, I was getting mixed up in
the minus signs somewhere and for the life of me couldn&#8217;t find out
where. I checked and rechecked my math. I checked that I was using
<code>atan2(3)</code> rather than <code>atan(3)</code>. Finally, I had to resort to
generating them by rotating the Cartesian basis and regenerating the
tensors from the spectral components. Hacky, kludgy, and it doesn&#8217;t follow
the <a href="http://en.wikipedia.org/wiki/Don't_repeat_yourself">DRY principle</a>, but this is just a test case. Yay tangential field:</p>

<p><img class="gallery" src='http://ergodicity.iamganesh.com/wp-content/uploads/2007/09/tensor.png' alt='tensor.png' /></p>
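
<p>For reference, here is a sketch of one way to build such a tensor from its spectral components, assuming a 2-D field centred on the origin. It is not necessarily the rotation I used: it writes the radial and tangential unit vectors directly as (x, y)/|p| and (-y, x)/|p|, so no angle (and no <code>atan2</code>) is needed. The type <code>tensor2</code> and the function <code>tangential_tensor</code> are illustrative names.</p>

```c
#include <assert.h>

/* Symmetric 2x2 tensor stored as its three unique components. */
typedef struct {
    double xx, xy, yy;
} tensor2;

/*
 * Tensor at (x, y) whose principal axis is tangential (eigenvalue l1)
 * and whose minor axis is radial (eigenvalue l2):
 *     T = l1 * t t^T + l2 * r r^T
 * with r = (x, y)/|p| and t = (-y, x)/|p|. Undefined at the origin.
 */
tensor2 tangential_tensor(double x, double y, double l1, double l2)
{
    double r2 = x * x + y * y;
    assert(r2 > 0.0);

    tensor2 T;
    T.xx = (l1 * y * y + l2 * x * x) / r2;
    T.xy = (l2 - l1) * x * y / r2;
    T.yy = (l1 * x * x + l2 * y * y) / r2;
    return T;
}
```

<p>Swapping <code>l1</code> and <code>l2</code> gives the radial field. The only sign to get right is the one in <code>t</code>, and it appears squared or as a product in every component.</p>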

<p>To check the robustness of my algorithms later on, I need to add some
noise to the fields and see how well the algorithms perform. Plagued by partial
volume effects, diffusion tensor data are inherently very noisy, so
it&#8217;ll be good to include noise as part of the algorithm development
process. Right now, modeling noise in tensors is complicated
because the tensors themselves are built through a linear
regression from diffusion-weighted images. The noise is
definitely <strong>not</strong> Gaussian. Gaussian noise is easy, though, and that&#8217;s what
I&#8217;m using for now, until I fully understand how noise is transformed
through the regression.</p>
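
<p>A rough sketch of the &#8220;easy&#8221; version, under some loud assumptions: independent noise is added to each unique component of a symmetric 2-D tensor, and the normal deviate is only approximated by summing twelve uniform samples (which has mean 0 and variance 1 after subtracting 6). That approximation is a stand-in chosen to keep the example short, not a proper Gaussian sampler, and <code>tensor2</code>, <code>approx_std_normal</code> and <code>add_noise</code> are illustrative names.</p>

```c
#include <assert.h>
#include <stdlib.h>

/* Symmetric 2x2 tensor stored as its three unique components. */
typedef struct {
    double xx, xy, yy;
} tensor2;

/*
 * Approximate standard normal deviate: the sum of 12 uniform samples
 * on [0, 1) has mean 6 and variance 1, so subtracting 6 centres it.
 * The result is always within [-6, 6]. Seed with srand() as needed.
 */
double approx_std_normal(void)
{
    double sum = 0.0;
    for (int i = 0; i < 12; i++) {
        sum += (double)rand() / ((double)RAND_MAX + 1.0);
    }
    return sum - 6.0;
}

/* Add independent noise of standard deviation `sigma` to each component.
 * Storing xy only once keeps the noisy tensor symmetric. */
tensor2 add_noise(tensor2 T, double sigma)
{
    assert(sigma >= 0.0);
    T.xx += sigma * approx_std_normal();
    T.xy += sigma * approx_std_normal();
    T.yy += sigma * approx_std_normal();
    return T;
}
```

<p>Note that noise added this way can push a tensor out of the positive definite cone when <code>sigma</code> is comparable to the eigenvalues.</p>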

<p>Another complication is that regularizing (or
denoising/smoothing) these fields is an active area of
research. Extensions of the standard anisotropic edge-preserving
filters like Perona-Malik are non-trivial (at least to me). One of the
technicalities of diffusion tensors is that they are positive
definite. Positive definiteness has a physical basis:
diffusion can only be zero at absolute zero (a great story for
sci-fi). I picked one of the many extensions just to test my workflow,
and here are the results. Pretty good, I&#8217;d say, for all the
complications.</p>

<p><img class="gallery" src='http://ergodicity.iamganesh.com/wp-content/uploads/2007/09/tensor15.png' alt='tensor15.png' /> <img class="gallery" src='http://ergodicity.iamganesh.com/wp-content/uploads/2007/09/tensor15s.png' alt='tensor15s.png' /></p>
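
<p>Whatever filter is used, it is worth checking that the output tensors are still positive definite. A minimal sketch for the symmetric 2-D case, using Sylvester&#8217;s criterion (both leading principal minors must be positive); <code>tensor2</code> and <code>is_positive_definite</code> are illustrative names, and this is a sanity check rather than part of any particular smoothing method.</p>

```c
#include <stdbool.h>

/* Symmetric 2x2 tensor stored as its three unique components. */
typedef struct {
    double xx, xy, yy;
} tensor2;

/*
 * Sylvester's criterion: a symmetric 2x2 matrix is positive definite
 * exactly when its top-left entry and its determinant are both positive.
 */
bool is_positive_definite(tensor2 T)
{
    double det = T.xx * T.yy - T.xy * T.xy;
    return T.xx > 0.0 && det > 0.0;
}
```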

<p>Lastly, open-source visualization tools seem to trip over the simplest
of tasks. <a href="http://mayavi.sourceforge.net/">MayaVi</a> seems to be the only one that can update the VTK
pipeline when the source data changes. In other programs I have to
rebuild the pipeline each time. As you can imagine, this gets tiring
really fast.</p>

<p><img class="gallery" src='http://ergodicity.iamganesh.com/wp-content/uploads/2007/09/mayavi.png' alt='MayaVi tensor visualization' /></p>
]]></content:encoded>
			<wfw:commentRss>http://ergodicity.iamganesh.com/2007/09/radial-fields/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
