Archive for December, 2006

Compressed Sensing

Posted in Computing 3 years, 8 months ago

Suppose you want to acquire data in R^n, then Nyquist’s sampling theorem places a lower bound on the number of samples required. Mathematically, with a basis set \phi_n we can represent any function as an infinite sum:

f(x) = \Sigma c_n \phi_n

where for an orthonormal basis, the coefficients are given as:

c_n =  < f(x), \phi_n >

In reality, we represent the samples in an appropriate basis and ignore values below some threshold. So, if we are going to approximate signals anyway, why not make a smaller number of measurements rather than each point as required by the sampling theorem? This is exactly the point of Compressed Sensing.

One of the pioneers of Compressed Sampling, David Donobo gives an example:

For certain classes of images with m pixels, choosing an appropriate set of linear functions, with a reasonable margin of reconstruction error lets us get away with only n non-adaptive non-pixel samples for faithful recovery. The size of n is dramatically smaller than m.

The bound on n is n = O(m^{1/4}\log^{5/2}(m)).

A news article in SIAM News further explains how this reconstruction is accomplished by a l_1 norm minimization. It also talks about how cutting down the number of samples required has profound advantages — savings in money, time or even tissue damage (as in the case of X-rays.) It also has an example of a numerical experiment where they perfectly reconstruct a 512×512 image from just 512 Fourier coefficients, that is with over 95% of the data missing.

The compressed sensing website at Rice is a great resource.

WikiSearch by Wikiasari

Posted in Web 3 years, 8 months ago

Most search engines rank results based on their “authority.” Google for example, does this by the number of incoming high quality links. Considering the fact that Wikipedia these days is a source of high quality material, you find Wikipedia as the top result of many search terms.

Google’s results these days are less than optimal. Spammers have become quite sophisticated in their methods. Results are cluttered with iframe results, shell websites, and other useless listings.

Wikiasari is a new venture by the same guys behind Wikipedia. This is exactly what I’ve been looking forward to – an engine that prefers Wikipedia over other sources.

The Times quotes Jimmy Wales, the founder of Wikipedia:

Google is very good at many types of search, but in many instances it produces nothing but spam and useless crap. Try searching for the term ‘Tampa hotels’, for example, and you will not get any useful results.

The engine is built over Lucene, which is also behind Beagle, the Free indexer for Linux desktops. They also say the index will be released under the GFDL.

Auctioneer Auctions

Posted in Economics 3 years, 8 months ago

I had previously written about conducting Vickerey-type auctions honestly. I also wrote about why Google doesn’t have to be completely honest to take advantage of the current hype around adwords, although temporary. This is the cynic in me speaking up, with absolutely no proof to back my claims.

More recently, people have been talking about Google being dishonest in a different way. “Google cheats” some say. The basic premise behind the argument is that Google favors its own product advertisements over competitors. That was then refuted by a Google employee. This was enough to satisfy most people [[I don't really care either way.]].

If you’re like me, I don’t really remember URLs anymore. I press “Ctrl+K” to get to the search box in firefox and type in what I want to get to the site. These include “google calendar,” “sfu webmail,” “slashdot” amongst others. But, if you type “yahoo calendar” into Google, you get this (click to expand):

Google search screenshot

I don’t know what to say. “Why have they become pimps?” asks Philipp Lenssen. Gold.