Author Topic: Anyone gonna try this out? Amazon echo (Read 19523 times)

vulcanjedi · « **on:** November 06, 2014, 03:12:03 PM »

http://www.amazon.com/oc/echo

Haddood · « **Reply #1 on:** November 06, 2014, 03:19:14 PM »

I was just reading about it ... Sounds good but not as customizable as VC ... ( though I believe lots of functions will be added to control lights ... Etc.)

The price is too much for my taste, and like Siri it probably won't understand a thing without Internet. However, it might be the perfect mic to hack and connect to VC when the price drops to 50 ...

vulcanjedi · « **Reply #2 on:** November 06, 2014, 03:35:38 PM »

Well, I meant for the mic features; you could potentially tether via BT to pipe to VC.
I don't expect Alexa to be as customizable as Vox. But for a better open air mic this seems interesting.

squatingyeti · « **Reply #3 on:** November 06, 2014, 11:14:23 PM »

Yeah, the ability to possibly place it in the room and use it as a mic is what has me intrigued. At the same time, the likelihood Amazon will have it open at all is like a Fire tablet not being locked out of Google Play.

Phobophile · « **Reply #4 on:** November 10, 2014, 01:28:17 PM »

Yeah, I doubt it could be used as a wireless microphone/speaker "from the box". They keep talking about their cloud, so I predict some VIP paid services in the future and further monetizing.
But the construction is interesting, it's about time someone made a microphone like that. Only we should probably wait for some custom firmware.

bobsj2000 · « **Reply #5 on:** January 27, 2015, 12:56:40 AM »

Here is a post that I found with a video demo

http://blog.zfeldman.com/2014-12-28-using-amazon-echo-to-control-lights-and-temperature/

Thoughts?

keithj69 · « **Reply #6 on:** January 28, 2015, 04:08:02 PM »

I saw this on reddit. I asked about saying stop at the end. His response: "Haha yes, unfortunately - otherwise the Echo will try to interpret the command itself. It's a workaround!". Having to say stop each time is a deal breaker for me. Maybe in 6 months it will be better.

jitterjames · « **Reply #7 on:** January 28, 2015, 05:46:12 PM »

The way I see it this device is of no interest to me whatsoever "out of the box", but at the same time this is probably the only way it could be of any use to anyone. I don't have any interest in using the text that their cloud is going to return because the recognition accuracy on that is going to be very bad. It's the same reason I don't use Google's speech recognition in VoxWav. Their recognition is great for web searches but not so great for a command utility. I get much better results using VoxCommando's predetermined grammars, which is why VoxWav feeds the actual audio to VC instead of sending Google's recognized text.

If someone can figure out how to use the array microphone on the echo, and it is a better quality/price than other array microphones, then it will be of some interest to me.

Just my opinion!

yokel · « **Reply #8 on:** June 26, 2015, 03:04:47 PM »

I know a dev who recently got accepted to the Echo program. This is how third party developers have to phrase things: "Hey Alexa, tell X app to do X command to X device name." I don't have high hopes of them ever allowing this device to become anything more than a glorified egg timer.

RickyD333 · « **Reply #9 on:** July 07, 2015, 12:21:18 PM »

I just had a chance yesterday to use the Echo and I was BLOWN away! The microphone on this thing is ridiculously good. I was standing past 30ft away from the Echo and it was still taking commands. It can even hear you perfectly while music is blasting through the thing (and I mean perfectly!).

If this thing can be modified to be used simply as a mic for Voxcommando, or maybe get VoxWav to work on it, it would be I think one of the best things to have for anyone into home automation. I cannot get over how impressive that microphone is. I have I think a really great microphone, the Acoustic Magic Array I Microphone, and the Echo blows it out of the water (for about $100 less even!). Someone has got to figure out a way to incorporate it with Vox. If I had the skill-set to do so, I would do it myself.

Haddood · « **Reply #10 on:** July 07, 2015, 04:10:16 PM »

Quote from: RickyD333 on July 07, 2015, 12:21:18 PM

I was standing past 30ft away from the Echo and it was still taking commands.

That is pretty standard with good quality mics ... The Beyer Dynamics I have, can hear whispers from about 10 feet....

Quote from: RickyD333 on July 07, 2015, 12:21:18 PM

It can even hear you perfectly while music is blasting through the thing (and I mean perfectly!).

That is echo cancelling. Kinect can do that through VoxCommando; it will cancel any sound playing through the PC.

My setup is a gentner xap800 (20$ from eBay) with 4 boundary mics (one in each room - 60$ From eBay) ... The only downside is wiring through the walls

I guess what really impressed you is the quality of the recognition ... With giant computing power behind the scenes, for sure the SR engine beats the hell out of MS SR on a PC ...
I guess if we could hook up VC to one of those cloud SR engines we would all be blown away regardless of the mic quality

tobiastobindev · « **Reply #11 on:** September 04, 2015, 03:37:34 PM »

Hi all,
I looked through the posts and I don't see much interest in using an Echo as a mic for VC. Despite that, I would still like to run this by you. If I could successfully use VC with Echo, it has the potential to be very cool. The Kinect drives me crazy, and VoxWav is by far the best way to communicate, but that requires using the phone. Ideally I'd like to just be able to speak and not get the phone out.

To keep it as short as possible I'm not going to go into details on the Alexa Skills Kit (what Amazon set up to allow developers to extend what the Echo can do).

Consider this an experiment. A "will it work?" kind of thing. Skipping over how Echo works for custom development, I will say this: there is code on GitHub where a person has bypassed the whole Amazon way and is just having the Echo send back what was heard as text to an HTTP endpoint on his home network. From there he parses the text, looking for keywords to figure out the intent. Then a module designed for that particular intent is loaded to handle the speech. My thoughts right now are on using this with Emby. So here is what I am thinking.

I say "Alexa, ask Emby to open" (that is sent to Amazon AWS, a Lambda function I create does what I want, and the text is sent back to my listening web server). Alexa can do conversation, so she responds, 'ok, which room?'. I say 'game room' or 'bedroom' (I have multiple instances of Emby and VC). This then gives the code on my endpoint the info it needs to know which address and port to use when sending commands to the VC UDP listener. Next, I say "Alexa, ask Emby to launch Media Browser Theater". The text comes back to my endpoint, which now knows what is going on and where to send, evaluates the text to clean it up if needed, and sends it to the VC listener. Hopefully, VC then opens Emby Theater. Without grinding through the details, that is the basic idea: triggering a particular intent, coding on my end (the HTTP listener box) to know how to deal with the text coming back, and then trying to send the proper text to the VC listener, which will then do what it does.
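To make the flow above a bit more concrete, here is a minimal sketch in Python of the endpoint side: find the room, find the command by keyword, and push a phrase to a UDP listener. Everything specific in it is an assumption for illustration only: the room table, the IP addresses, the port number, the keyword table, and the idea that the listener accepts a plain UTF-8 phrase as its payload. Check what the VC UDP listener actually expects before relying on any of it.

```python
import socket

# Hypothetical room table: names, hosts and ports are placeholders.
ROOMS = {
    "game room": ("192.168.1.20", 33000),
    "bedroom": ("192.168.1.21", 33000),
}

# Hypothetical keyword -> phrase table. The phrase on the right is what
# would be forwarded, so it has to match a command VC already knows.
INTENT_KEYWORDS = {
    "media browser theater": "launch Media Browser Theater",
    "pause": "pause playback",
}


def pick_room(heard):
    """Return the first known room name found in the heard text, or None."""
    text = heard.lower()
    for name in ROOMS:
        if name in text:
            return name
    return None


def pick_command(heard):
    """Return the phrase for the first keyword found in the heard text, or None."""
    text = heard.lower()
    for keyword, phrase in INTENT_KEYWORDS.items():
        if keyword in text:
            return phrase
    return None


def send_to_vc(room, phrase):
    """Send the phrase as one UDP datagram to the listener for that room.

    ASSUMPTION: the listener takes the raw phrase as UTF-8 text.
    """
    host, port = ROOMS[room]
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(phrase.encode("utf-8"), (host, port))
```

In use, the endpoint would remember the room from the first exchange, then call `pick_command()` on each later request and only call `send_to_vc()` when a command was actually found, so unrecognized text never reaches the listener.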

Technically this should be possible, as far as I know. Speaking from the Amazon side of things at least. All that I said can be done. I do not know yet how well the recognition of dictation speech will work out. I also realize that what code I need to develop may be a pain but based on the examples I've reviewed, I've seen where others are doing this with (according to them) good success. I know that the text going into VC has to be exactly correct to work.

You may say, instead of doing that with an Echo, why don't you do this or this. I am open to ideas. I'm also open to 'did you think about this?' or 'it probably won't work because...'. But in addition to my Emby goal, I am helping a person in another state who has all kinds of home automation devices and wants me to do many things for him utilizing these devices. I'm doing it for fun because it is a great learning opportunity for me, and with all that he has, there is plenty of room for experimentation.

He has asked me to allow him to control any of 6 Sonos systems with Amazon Echo. I've already seen where people have done this. But I'm hoping to create a web service for the home that doesn't just do one thing based on something someone else did, but look at the big picture and come up with an extensible foundation for the home web service where the Echo can be used in home automation (beyond the built in support). Actually, I'd be extremely happy just to accomplish being able to send viable speech commands as text to VoxCommando.

I'd like to leverage VC again for his Sonos request. So this post is sort of being driven by trying to solve his request, but at the same time I'd love to work with Emby for myself. And I know that in both cases there is the issue of speaking to a device while TV or music is playing, and how hard that can/will be. If you tell the Echo to play a Pandora station and then you want to say something else to it, it will hear you. But I imagine that will be tougher once the sound is coming from somewhere other than the Echo itself. Not to ramble, but she has impressed me with her ability to hear with the TV on, or the dogs all barking, or other noise. Things that I could never manage to do with the Kinect. So the Echo seems to be good at picking out my voice in all of the noise.

Thank you all and again, thank you for VoxCommando. It is a truly awesome program.

tobias.

shanekuz · « **Reply #12 on:** September 16, 2015, 08:21:49 PM »

I thought I would share my experience with VoxCommando and the Echo, as it's the best thing I have set up.

I use the Echo as its voice recognition is the best I have ever seen. I use it with the amazon-echo-ha-bridge: https://github.com/armzilla/amazon-echo-ha-bridge This project emulates a Hue bridge and then passes an HTTP request on to turn devices on or off.

I use this gateway to call VoxCommando HTTP tasks, so I can say "Alexa, turn on the rumpus room TV" and it sends one HTTP request to VoxCommando, which then turns on my projector, turns on my Pioneer amp, selects the correct source, turns on my WeMo-controlled sub, etc.
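As a rough sketch of the idea (not the bridge's real configuration format), each spoken device name ends up paired with one URL for "on" and one for "off", and the bridge simply requests the matching URL. The record layout, the field names `on_url` / `off_url`, and the URL path below are all made up for illustration; the real bridge and the VoxCommando HTTP interface each define their own formats.

```python
from urllib.parse import quote


def device_record(name, base_url):
    """Build a hypothetical on/off URL pair for one spoken device name.

    ASSUMPTION: the path layout <base_url>/<device>/<on|off> is invented
    for this example and is not a documented VoxCommando or bridge URL.
    """
    slug = quote(name.lower())  # percent-encode, so spaces become %20
    return {
        "name": name,
        "on_url": f"{base_url}/{slug}/on",
        "off_url": f"{base_url}/{slug}/off",
    }


record = device_record("Rumpus Room TV", "http://192.168.1.10:8096/vc")
print(record["on_url"])
```

The point of the design is that the Echo side only ever knows "device on" and "device off"; everything that happens after the single HTTP request (projector, amp, source, sub) lives on the VoxCommando side.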

This allows awesome automation, as the Echo is simply the speech engine, and it's the best thing I have seen for this. Also, once VoxCommando gets to the point where its speech recognition is as good (and I'm sure one day it will get there), I won't have to redevelop everything, as only the front end will change.

tobiastobindev · « **Reply #13 on:** September 19, 2015, 02:23:50 PM »

Hi shanekuz,
Just wondering if you have any code you could share with me? Maybe not... I was kind of wondering if you have setup a node server type setup to route the voice commands to the appropriate VC destination. Also, is there an ISY device in your setup? Possibly running the beta v5 firmware? And if so, just wondering if you have setup a node server with that. I've been waiting for them to release their demo code but since you are functional I just wondered if you had anything I could look at to maybe help give me a jump start.

All in all, glad to hear it is working. Are you finding the speech recognition to be pretty accurate? I've not been able to do anything since around the time of my above post. I had to go out of town for a training course and I just got back.

If I come up with anything that would be of value to you I'll be happy to share.

Thanks,
tobias.

tobiastobindev · « **Reply #14 on:** September 20, 2015, 04:22:46 PM »

@shanekuz,
I was re-reading your post and noticed something. And, please understand I am only saying this to clarify, this is not meant as anything negative. VoxCommando does not need to improve in speech recognition. It relies on other components, both software and hardware, for speech recognition. Vox itself processes the recognized speech, and does so very nicely, but really the quality of the recognition comes down to things like microphones, MS technologies for speech recognition, or any other software for that matter (CMU Sphinx or something like that).

Dictation speech recognition is extremely difficult to do with high accuracy, confidence, and correctness using today's technologies. Without something behind the speech to say 'I think I heard this ... is this ... in the list?', recognition becomes very difficult. Google, Amazon, Siri: I would imagine they do so well because 1) the person is typically speaking directly into the mic right in front of their face (good input to begin with), and 2) they have probably developed sophisticated matching algorithms to help turn the babble into something recognizable. Point being, Vox itself does not need to improve in recognition; everything else does.
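A small illustration of the 'is this in the list?' idea, applied after the fact to dictated text: compare what came back against a fixed list of known commands and accept only a close match. This is a sketch using Python's standard `difflib`, not how VoxCommando, Microsoft's engine, or the Echo work internally; the command list and the 0.8 cutoff are arbitrary choices for the example.

```python
import difflib

# Hypothetical list of phrases the command program already understands.
KNOWN_COMMANDS = [
    "launch media browser theater",
    "turn on the tv",
    "turn off the tv",
    "pause playback",
]


def match_command(heard, cutoff=0.8):
    """Return the closest known command, or None if nothing is close enough.

    The cutoff is a similarity ratio between 0 and 1; higher is stricter.
    """
    candidates = difflib.get_close_matches(
        heard.lower().strip(), KNOWN_COMMANDS, n=1, cutoff=cutoff
    )
    return candidates[0] if candidates else None


print(match_command("launch media browser theatre"))
print(match_command("what is the weather today"))
```

A slightly wrong transcription such as "theatre" still maps onto the known phrase, while unrelated text is rejected instead of being forwarded. Text-level matching like this only patches up small errors; it cannot recover what a grammar-constrained recognizer gets from working on the audio itself.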

If you google around you can come up with your own little test to see what I am talking about. If you use MS speech recognition (not Cortana) and say 'open Word', it will probably open Word. But if you open Word and then speak a letter to someone (especially if you do not train it first, and are not sitting close to the mic), watch how horrible it is at figuring out what you are saying. And if you want to dig even deeper, read some of the articles on the net about how speech recognition on computers works. It is a very complex topic, and it is amazing that it works as well as it does, period.

All that being said, yes, I would imagine as time rolls by these technologies will get better and better to the point where some of the issues I mentioned will not be such an issue.

Your response to me suggests that you are having pretty good success with Echo and Vox. A proper Alexa Skill will have an utterances file, which, as we now know, greatly improves the chance of recognition. Vox takes care of generating massive amounts of speech recognition data into XML (TV shows, movies, music (artist, album, song)). This is wonderful. But I wonder how well Echo will work with no utterances file, purely grabbing the speech, turning it to text, parsing it and then sending it to Vox. Any mistake and the system won't work. I suppose the best thing I can do is just try. I'll be surprised if it works anywhere near as well as VoxWav though.

Thank you again for your post and sharing.

tobias.
