Pictures everywhere…
Several times a week, a bunch of us CVITians settle near the coffee shop or in the corridor beside CDE and get into lively chats about umpteen things. Often we wonder about the impact that our dear Vision is going to have in the near future.. a year back I would have written these musings off as delusions born out of our obsession with the greatness of our ‘field’.
But no longer!! With the emergence of more and more companies like Riya, Flickr, Zooomr and Pixsy, I think Vision technologies will be the next big thing on the web. It's only natural.. after all, visual information enhances human understanding and appreciation of things, enriches the experience, be it learning or fun, and makes a lasting impression. Hence the importance of images, photos and visual aids in teaching, and of colorful business presentations [and of course item numbers in movies.. :)].
Having said that, where is vision research now? I am reminded of one of our professors' comments [another 80-20 rule?]: ‘80% of vision problems are solved, but the remaining 20% need to be solved for the other solutions to be applied in practice’.. perhaps true. And sadly the remaining 20% has so far proved quite a hard nut to crack. Till recently, vision research lived only in labs, in high-end products and in the movie industry. I was under the impression that ‘Visual Information Technology’, as we call it, would take a very long time to have a big impact… say, as big as search in the text domain.
And yet, Riya rocked from its very inception, and Flickr clicked even with manual annotation and tagging.. its latest geo-tagging is rocking even more [perhaps Y! will hit back soon, riding on this new success.. but for that lucky acquisition we would have seen a G-monopoly by now]. I think what catalyzed all this was that people realized one thing: the data can't be ignored just because the technology to process it isn't mature enough. The fact that even tricks to get visual data annotated by humans, through games like ESP and Peekaboom and through tagging on Flickr, saw so much success and attention shows the importance of indexing visual data.. no matter how!! The latest Wikimapia thing is in the same genre.
An automated way to index visual data is still miles away from us: Content Based Image Retrieval systems were never really of any use in practice.. Even the great Zisserman quipped about his ‘bag of words’ model: ‘I agree it's not yet ready to be used in practice’. But the future is unbounded.. anything is possible.. So folks, that’s where you should work: get visual data indexed, whatever be the method.. research on image matching, extraction of semantic concepts or maybe even user interfaces [so that manual tagging can get faster and easier]. Same with videos.. it's not all that absurd to think of a ‘visual content management’ company which employs thousands of people to tag videos and images.
But that's not the only direction of research.. Microsoft’s Photosynth says it all. It is built on the same concept as the ICCV-05 contest [we were the only institute from India to participate]. Say you walked around the Taj Mahal taking snaps as you went. The basic challenge is to organize the snaps so that you can view the full panorama, and then perhaps to extract the 3D structure and let you walk through a model of the Taj. The speed at which M$ turned it into a product is amazing though [kudos to and UWash.. also an example of how pumping money into universities can help.. :). Learn something, Indian corporates]… now think of it… you can enjoy the beauty of the Great Wall of China or the Eiffel Tower without ever going there [it's 3D.. not a video.. mind you!!]
Then comes the old horse, document analysis [both handwritten and printed].. heavily underrated and underappreciated, it thrives nevertheless. Ironically, a foreigner realizes the potential of OCR better than we do and comes here to work!! God, do we Indians ever do anything better than worshipping the West? I was told by Vardhman about a Kai-Fu Lee talk: he always speaks of Google for China/MSR for China and not Google-China/MSR-China!! I have a very sincere request to people at our insti working on OCR for Indian languages: take pride in your work… even Paul Viola and David Stork work in this area and on similar problems.
Note that this is not an exhaustive list of prospective areas in vision.. I have only listed some domains where research that was for long restricted to labs and high-end products could now come close to people's lives and change the way they think of the web and information.
September 3, 2006 at 2:07 am
nice and comprehensive post! finally a promise fulfilled :p .. it would be really nice to see some of those things happen.