The Amp Hour Electronics Podcast

A weekly show about the trends in the electronic industry.


#258 – An Interview with Bertrand Irrisou and Gerald Friedland of Audeme

Welcome Bertrand Irrisou and Gerald Friedland of Audeme!

  • Audeme has a Kickstarter for the Movi platform, a speaker-independent, cloudless speech detection Arduino shield.
  • The Movi also solves a range of privacy issues, as no audio goes to the cloud. This is in contrast to the Samsung TVs sending voices to third-party companies.
  • Dave has done a video on a voice detection chip from the 80s, the VCP200. There was also a speech synthesis chip from the same era, the SP0256.
    https://www.youtube.com/watch?v=kFth9K_IvwA
  • Chris mentions that there is a big difference between a modern speech synth and a Speak and Spell.
  • ELIZA was a 1960s natural language processing program that provided a surprisingly cogent conversational engine.
  • All of these chips are based on phonemes, the basic sound units of speech, which allow a conversion from sound to actual words.
  • Audeme is using the Allwinner A13 chip, which has a single-core Cortex-A8. It also has a built-in synthesizer.
  • They are running Debian on board, which allows low level driver control and high level software (via Linux).
  • Chris has seen something similar on his mobile, the Moto X. It has a coprocessor that listens for audio cues.
  • The algorithms are open source and will go online when released.
  • Audeme will be at the Embedded Systems Conference later this month.
  • Users can expect a response time of 0.5 sec normally.
  • The audio front end has an Automatic Gain Control op amp, and it can implement echo cancellation if an external mic array is used.
  • The Movi was recently used to control the Romibo robot. It needed to “listen” through a set of fur.
  • The expected frequency response of the Movi is 50 Hz – 16 kHz (in contrast to a 300 Hz – 3 kHz response for a telephone).
  • The processing uses MFCC (mel-frequency cepstral coefficients). This is in combination with a DCT -> Mel spacing -> DCT cycle.
  • Using the Movi with push-to-talk (PTT) helps to reduce overall errors. Call signs (like "Siri") also reduce false triggering. Chris wants to talk to it like JARVIS (from Iron Man).
  • The board has a peak power requirement of 3W.
  • The interface, including to the Arduino, uses an RS232 protocol. This means it can be used with any platform, not just Arduino.
  • The code is mostly C/C++ for the low level, Python for glue, and shell scripting for OS stuff. It’s a wide variety of programming languages and platforms.
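
For anyone who hasn't met MFCC before, here is a rough sketch of the textbook version of the processing mentioned above. It is not Audeme's code. The notes describe a DCT -> Mel spacing -> DCT cycle; this sketch assumes the usual textbook first stage instead (a Fourier power spectrum), followed by mel-spaced triangular filters, a log, and a DCT. Every function name, the 32 kHz sample rate (picked only so a 16 kHz upper band edge fits), the 512-sample frame, and the filter and coefficient counts are assumptions chosen for illustration.

```python
import math

def hz_to_mel(f):
    # Common mel-scale formula (O'Shaughnessy variant).
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def tone(freq, sample_rate, n):
    """Test signal: n samples of a sine wave at freq Hz."""
    return [math.sin(2 * math.pi * freq * i / sample_rate) for i in range(n)]

def power_spectrum(frame):
    """Naive DFT power spectrum (bins 0..N/2) of a Hamming-windowed frame."""
    n = len(frame)
    win = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
           for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(win))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(win))
        spec.append((re * re + im * im) / n)
    return spec

def mel_filterbank(num_filters, n_fft, sample_rate, f_low, f_high):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    lo, hi = hz_to_mel(f_low), hz_to_mel(f_high)
    mels = [lo + (hi - lo) * i / (num_filters + 1) for i in range(num_filters + 2)]
    edges = [mel_to_hz(m) * n_fft / sample_rate for m in mels]  # fractional bins
    bank = []
    for j in range(1, num_filters + 1):
        left, centre, right = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for k in range(n_fft // 2 + 1):
            if left < k <= centre:
                row.append((k - left) / (centre - left))
            elif centre < k < right:
                row.append((right - k) / (right - centre))
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def dct2(values, num_coeffs):
    """Unnormalised DCT-II, keeping the first num_coeffs outputs."""
    n = len(values)
    return [sum(v * math.cos(math.pi * c * (i + 0.5) / n)
                for i, v in enumerate(values))
            for c in range(num_coeffs)]

def mfcc(frame, sample_rate, num_filters=20, num_coeffs=12,
         f_low=50.0, f_high=None):
    """Spectrum -> mel filterbank -> log -> DCT for a single frame."""
    if f_high is None:
        f_high = sample_rate / 2.0
    spec = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate, f_low, f_high)
    energies = [sum(w * p for w, p in zip(row, spec)) for row in bank]
    log_energies = [math.log(e + 1e-10) for e in energies]  # avoid log(0)
    return dct2(log_energies, num_coeffs)

if __name__ == "__main__":
    coeffs = mfcc(tone(1000.0, 32000, 512), 32000)
    print(len(coeffs), "coefficients for one 16 ms frame")
```

A real recogniser would run this over overlapping frames and use an FFT rather than the slow DFT loop above; the sketch only shows where the mel spacing and the final DCT sit in the chain.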

You should consider backing this project via their Kickstarter page. You can also see a variety of videos of this board in action on their Facebook page.

 

Comments

  1. James says

    July 20, 2015 at 7:24 am

    What’s happened to the three-word titles? 🙁

    • Chris Gammell says

      July 20, 2015 at 10:20 am

      Finally someone noticed! After 256 alliterative episodes, we decided to retire the practice. We were running out of words and the time to make a title kept expanding. We’ll take suggestions for new ways to title episodes.

      • dentaku says

        July 20, 2015 at 1:40 pm

        I noticed that when you interviewed FBZ.
        You could just title the show using something interesting that was said in that episode. That’s what Elicia does.
