The Future is Now...
Wednesday, November 23, 2011 at 8:34AM
As mentioned in a previous post, I've been listening to some 'tech' podcasts recently. It was on these podcasts that I first heard about 'Siri', the cloud-based voice recognition software that Apple acquired a few years ago and made available in its current incarnation on the iPhone 4S (Apple's official, perhaps somewhat idealized, advertisement video can be found here). Having been very disappointed with previous voice recognition software¹, I didn't really pay much attention to it.
Then I was biking home from work, listening to some music through my Bluetooth headset, and thought I'd give the system a try. My phone sits in a pocket on my biking jacket, and activating Siri requires only holding down the iPhone's 'home' button for a few seconds. I tried a bunch of different ways of saying things, and while I got some odd results at first, I began to figure out which phrasings work consistently.
For example, I can say 'shuffle playlist jogging' and it'll start playing that particular playlist. 'Next song' skips ahead, 'pause music' and 'resume music' do just that, etc. You can even say things like 'Play album Thank You' or 'Play podcast Giant Bomb' and it'll actually do it. Furthermore, if I get a notification tone while biking I can say 'Notifications' and it'll read off any new emails, text messages, tweets, etc. that I've received. Maybe I'm easily impressed, but this is pretty Star Trek to me². Oh, and is there anything better than a lab timer that you can talk to? ('Set timer, 45 minutes').
The one area where Siri isn't so great, at least so far, is in taking dictation. That's not exactly fair - it's quite accurate if you stick to common, well-defined words. However, as a scientist who uses a lot of non-standard verbiage, I don't find Siri particularly useful for this; it does incorporate some good ideas though. For one, it highlights all of the words that it detects as ambiguous - this includes both misunderstood words and homophones. You can then tap on them and select from a list of best guesses or edit them yourself. This is not particularly useful as a hands-free feature, though. I can see some utility in being able to fire off quick acknowledgement replies to emails or text messages, but I doubt that many screenplays will be written by dictation.
It's not clear from the Apple website, nor from the dreaded Wikipedia, how Siri works, but according to the Tested.com podcast, the system actually offloads data-processing to Apple's 'cloud' servers where the heavy crunching is done. The results are then sent back as text to your phone (in practice this takes a matter of seconds)³. If this is the case, it's an interesting glimpse into the future of processing, where your PC will be naught more than a box connected to the internet and the heavy lifting will all be done on servers.
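For the programmers in the audience, here's roughly how I picture that round trip. This is a toy sketch in Python based purely on the podcast's description, not on anything Apple has published: the names `cloud_recognize`, `handle_voice_command`, and `SUPPORTED_MODELS` are all made up, the 'recognition' step just decodes bytes, and the model check is only my guess at how older phones might get turned away.

```python
from dataclasses import dataclass

# Hypothetical server-side allow list; an assumption, not Apple's actual policy.
SUPPORTED_MODELS = {"iPhone 4S"}


@dataclass
class Request:
    device_model: str  # what the phone reports about itself
    audio: bytes       # the recorded voice clip


def cloud_recognize(request: Request) -> str:
    """Pretend server: reject unknown hardware, otherwise return text."""
    if request.device_model not in SUPPORTED_MODELS:
        raise PermissionError(f"unsupported device: {request.device_model}")
    # A real service would run speech recognition here; this toy just decodes bytes.
    return request.audio.decode("utf-8")


def handle_voice_command(device_model: str, audio: bytes) -> str:
    """Thin client: ship the audio off, then act on the text that comes back."""
    text = cloud_recognize(Request(device_model, audio))
    return f"phone acts on: {text!r}"


print(handle_voice_command("iPhone 4S", b"set timer, 45 minutes"))
```

The point is just that the phone itself does almost nothing: it records, sends, and waits for a string to come back.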
Stuff like this makes me think that current/future generations take technological progress for granted. In our parents' first 30 years of life, they went from what: AM to FM radio? Teletype to fax machine? Mimeograph to photocopier? Frustrating wheels to power steering? I'm exaggerating a bit, but it's difficult to overstate that we've gone from the Sears Wishbook of my youth to buying stuff on Amazon.com over 3G wireless internet on our cell phones, or from looking up stuff in a 12-volume encyclopedia to searching libraries of text on Google.
I challenge you to go back and read sci-fi books from the 70s. In much of the imagination of those authors, we're already living hundreds of years into the future⁴.
¹ Did anyone ever try Dragon NaturallySpeaking back in the day? This was a piece of software that would allegedly allow you to dictate Microsoft Word documents. I tried demos of a few versions and remember being stunned by how poorly the software worked - it's not exactly useful if I have to go back and correct the dictation by hand every 5 or 6 words.
² Incidentally, the pocket I use for my phone happens to be on the upper-left side of my chest. Thus activating Siri basically involves me 'tapping' a virtual communicator badge thingie...
³ This explanation seems a bit weird, as it means that you require an active data connection to tell the phone to play music. It may, however, only do such cloud processing for internet-related searches and not for on-device lookups. This cloud thing may also explain why the system doesn't work on previous iPhones, even when 'hacked in' - Apple can tell the phone's model when the servers are accessed and can thus reject the data.
⁴ For instance, in the classic sci-fi book The Mote in God's Eye (1974), the authors, Larry Niven and Jerry Pournelle, predict that by the year 3000 we'd have invented faster-than-light travel. Yet we'd still lack wireless networking as well as contemporary laptop computers. The sci-fi novels that impress me are the rare ones that really 'nail' it. I'd still rank Neuromancer (1984) among the finest in this tradition, as well as the odd, yet interesting, Snow Crash (1992). That being said, I'm not the most well-versed in classic sci-fi. I'm always looking for recommendations though!
Carlo



