SCHEDULE AT-A-GLANCE

Day 1: Wednesday, May 1, 15:00-17:00 (PDT)
at ONE Summit 2024 (on-site + virtual)


2-hour on-site discussion
  Agenda (ideas):
    Presentation session
    Akraino 2024 activities
    Collaboration with other communities

SAN JOSE MCENERY CONVENTION CENTER
Room 113, first floor



Day 1: Wednesday, May 1

Zoom Link: https://zoom.us/j/98538301700?pwd=RXlFdHpZRDlHTzFaVFRnakw2b0F5QT09

Recording: TBD

Time (UTC-7) | Topics
15:00-15:10

Welcome note
Yin Ding, TSC Chair
Fukano Haruhisa, TSC Co-Chair

15:10-15:30

Jeff Brower,
CEO, Signalogic

Small Language Model for Device AI Applications

Device AI applications running at the AI Edge on very small form-factor devices (for example, Pico-ITX), and without an online cloud connection, need to perform automatic speech recognition (ASR) under difficult conditions, including background noise, urgent or stressed voice input, and other talkers in the background. For robotics applications, background noise may also include servo motor and other mechanical noise. Under these conditions, efficient open-source ASRs such as Kaldi and Whisper tend to produce "sound-alike" errors, for example:

  in the early days a king rolled the stake
  
which contains two (2) sound-alike errors that must be corrected to "in the early days a king ruled the state". Sound-alike errors are particularly problematic for robotics applications in which the robot OS requires precise API commands; for example, a stalled robotaxi must be instructed to "move forward 20 feet, to the right 10 feet, raise the hood, and turn off the engine", or a first responder using a portable backpack device may give commands such as "get off the road in that turn-out up ahead and shut it down". Any sound-alike errors in voice commands make translation to machine APIs problematic.

To address this issue, Signalogic is developing a Small Language Model (SLM) to correct sound-alike errors, capable of running in a very small form factor and under 10W, for example on two (2) Atom CPU cores. The SLM must run every 1/2 second with a backward/forward context of 3-4 words. Unlike an LLM, it does not need a wide context window, domain knowledge, or extensive web page training.
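For illustration, below is a minimal, hypothetical sketch of the kind of correction step described above: greedily swap each ASR word for a phonetically close vocabulary word whenever a small language-model score improves. The vocabulary, the toy bigram "model", and the use of difflib string similarity (standing in for a real phonetic distance) are all assumptions made for the example, not Signalogic's implementation.

  # Hypothetical sketch of sound-alike correction, not Signalogic's code.
  # Greedily substitutes phonetically close vocabulary words whenever a
  # (toy) language-model score improves.
  import difflib

  # Toy domain vocabulary the corrector may substitute toward (assumed)
  VOCAB = ["in", "the", "early", "days", "a", "king",
           "ruled", "rolled", "state", "stake"]

  # Stand-in for the SLM: a tiny whitelist of plausible bigrams (assumed)
  GOOD_BIGRAMS = {("king", "ruled"), ("ruled", "the"), ("the", "state")}

  def toy_slm_score(words):
      """Count adjacent word pairs the toy 'model' considers plausible."""
      return sum((a, b) in GOOD_BIGRAMS for a, b in zip(words, words[1:]))

  def sound_alike_candidates(word, cutoff=0.7):
      """Vocabulary words close to the ASR word; a real system would
      compare phoneme sequences rather than spellings."""
      return difflib.get_close_matches(word, VOCAB, n=3, cutoff=cutoff)

  def correct(words, score):
      """Greedy one-pass correction. In the system described above this
      would run every ~1/2 second over a 3-4 word backward/forward
      window; here it sweeps the whole utterance for simplicity."""
      best = list(words)
      for i, w in enumerate(words):
          for cand in sound_alike_candidates(w):
              trial = best[:i] + [cand] + best[i + 1:]
              if score(trial) > score(best):
                  best = trial
      return best

  asr_out = "in the early days a king rolled the stake".split()
  print(" ".join(correct(asr_out, toy_slm_score)))
  # -> in the early days a king ruled the state

A production version would, as the abstract notes, score candidates with a trained SLM over a short sliding window rather than a hand-written bigram set.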


15:30-15:50

Hidetsugu Sugiyama,
Chief Technology Strategist - Global TME, Red Hat


15:50-16:20

Vijay Pal

Predictive Maintenance of Hardware

16:20-17:00

・Discussion about Akraino 2024 activities

・Collaboration with LF Edge AI Edge and EdgeLake

Closing





Call for proposals

No / Name / Company / E-mail / Presentation title / Abstract / Preferred time zone / Comments

1. Jeff Brower, Signalogic (jbrower at signalogic dot com)
   Presentation title: Small Language Model for Device AI Applications
   Abstract: see the full abstract in the Day 1 agenda above.
   Preferred time zone: PDT

2-10. (open)