I’ve finally found the time to give some thought to the auto-tagging idea.
I posted about this a few weeks ago because I really hoped somebody knew of an application of this kind that I could use with Flickr. My plea was followed by total silence. So here’s my idea for it.
It could use some sort of bookmarklet. The bookmarklet would call a server page with the referrer URL or, ideally, directly with the content of the page. Here there’s a fork in the path: passing the referrer URL means that the server page/app will have to read the content of the page (again) and somehow parse it to extract meaningful information to search for the tags. On the other hand, passing the content in a JS variable would leave all the processing work up to the JavaScript on the client machine.
The downside of the former is that it’d probably use a lot of your valuable server bandwidth, whereas the latter would practically leave your server unoccupied but would require an awful lot of JavaScript code, which is a royal pain in the backside.
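To make the two options concrete, here is a rough sketch of what each bookmarklet might look like. The endpoint URL, the `url` and `text` query parameters, and both function names are made up for illustration; nothing like them exists yet.

```typescript
// Hypothetical endpoint; there is no real server behind this address.
const TAGGER_ENDPOINT = "https://example.com/autotag";

// Option 1: send only the page URL. The server then has to fetch and parse the page itself.
function bookmarkletWithUrl(endpoint: string): string {
  const body =
    "window.open(" + JSON.stringify(endpoint + "?url=") +
    "+encodeURIComponent(location.href))";
  return "javascript:(function(){" + body + "})()";
}

// Option 2: send the visible text of the page, so the server does not need to fetch it again.
// maxChars caps the text length because URLs cannot be arbitrarily long.
function bookmarkletWithContent(endpoint: string, maxChars: number): string {
  const body =
    "var t=document.body.innerText.slice(0," + maxChars + ");" +
    "window.open(" + JSON.stringify(endpoint + "?text=") +
    "+encodeURIComponent(t))";
  return "javascript:(function(){" + body + "})()";
}

// Print the string you would paste into a bookmark's address field.
console.log(bookmarkletWithUrl(TAGGER_ENDPOINT));
```

Passing the text in a query string only works for short pages; a real version would probably need a POST request or a script injected into the page.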
Once you’ve got the content you’re interested in out of the referring webpage, Yahoo’s term extraction APIs could turn out to be useful. With Yahoo’s APIs you can get all the terms in your referring webpage ordered by importance (according to Yahoo).
What you have to do now is a search, maybe an I’m-feeling-lucky search on Google for “wikipedia <relevant terms>”. If you’ve found a document on Wikipedia, or the site of your choice, you can go back and leverage Yahoo APIs to get the meaningful terms of that document.
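The search step could be as small as building a URL from the top terms. This is only a sketch: `luckySearchUrl` is a name I made up, and I’m assuming Google’s `btnI` query parameter still triggers the I’m-feeling-lucky behaviour, which may not hold (Google can show a redirect notice instead of jumping straight to the result).

```typescript
// Build a Google "I'm Feeling Lucky" URL for "<site> <relevant terms>".
// Assumption: the btnI parameter asks Google to jump to the first result.
function luckySearchUrl(terms: string[], site: string = "wikipedia"): string {
  const query = [site].concat(terms).join(" ");
  return "https://www.google.com/search?btnI=1&q=" + encodeURIComponent(query);
}

console.log(luckySearchUrl(["golden gate bridge", "san francisco"]));
// → https://www.google.com/search?btnI=1&q=wikipedia%20golden%20gate%20bridge%20san%20francisco
```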
A few questions still remain: how do you do that with images which don’t have a title or description yet (Flickr)? How do you extract the REALLY significant terms from a huge result?
On the technical side it’d be fantastic to be able to leave most of the processing to the client through JavaScript to save server time and bandwidth.
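Just to get a feel for how much client-side code that would take, here is a deliberately naive stand-in: it ranks words by how often they appear and skips a tiny stop-word list. This is not what Yahoo’s term extraction does; the `topTerms` name, the stop-word list and the three-letter minimum are all my own assumptions.

```typescript
// A very small stop-word list (assumption; a real one would be much longer).
const STOP_WORDS: Record<string, boolean> = Object.create(null);
"the and for with are was were this that from have has not but you your"
  .split(" ")
  .forEach(function (w) { STOP_WORDS[w] = true; });

// Return the `limit` most frequent words of three or more ASCII letters.
function topTerms(text: string, limit: number): string[] {
  // Object.create(null) avoids clashes with built-in keys such as "constructor".
  const counts: Record<string, number> = Object.create(null);
  const words: string[] = text.toLowerCase().match(/[a-z]{3,}/g) || [];
  for (const w of words) {
    if (STOP_WORDS[w]) continue;
    counts[w] = (counts[w] || 0) + 1;
  }
  return Object.keys(counts)
    // Most frequent first; ties are broken alphabetically.
    .sort(function (a, b) { return counts[b] - counts[a] || (a < b ? -1 : 1); })
    .slice(0, limit);
}

console.log(topTerms("The bridge and the bay: a bridge over the bay, a red bridge.", 2));
// → [ 'bridge', 'bay' ]
```

Raw frequency will happily promote boring words on a long page, which is exactly the "REALLY significant terms" problem, so this only shows that the plumbing is cheap, not that the ranking is good.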
Any thoughts or suggestions?

We have an auto-tagging solution for RSS feeds and blogs. See http://wizag.com
Would this be useful for what you are looking for?