W3-STT-Smart ATM Bot
This week's assignment was to create an STT-based Imperfect Robot. I partnered with Lu Lyu for this project. As a thought starter, we talked about what defines an imperfect interaction. We think there are two ways to make an interaction fall short: it can be too complicated or oversimplified. A too-complicated interaction drags out a straightforward question over repeated exchanges, for example asking separately for the street address, room number, city, state, and zip code instead of asking for the full address at once. An oversimplified interaction jumps to a conclusion without confirmation, like giving away user information without a passcode. So we landed on the idea of building an ATM bot that is imperfect in both ways. The bot trusts the user as long as they say "yes," but it confirms the cash amount only in increments of 100 and only handles withdrawals one 20-dollar bill at a time.
We first analyzed and laid out the user flow of a simple ATM.
Insert card > take card
Enter passcode
Select service
    Withdraw
        Choose amount
        Choose account
    Deposit
        Insert cash
        Choose account
Confirmation
Then we composed our imperfect ATM conversation script and user flow.
Answer for account holder name
    “What’s the name of the account?”
    “xxx”
Confirmation
    “Are you xxx?”
    “yes”
Select service
    Withdraw
        “Do you want to withdraw or deposit?”
        “Withdraw”
        “Say money to withdraw 20 dollars”
        “Money” “Money” “Money” “Money”
    Deposit
        “Say how much you are depositing and insert cash”
        “313”
        “100 dollar received” “200 dollar received” “300 dollar received” “313 dollar received”
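To make the two "too complicated" behaviors in the script concrete, here is a minimal sketch of the arithmetic behind them. The function names `depositAnnouncements` and `withdrawTotal` are hypothetical and are not taken from our project code; they only reproduce what the script describes.

```javascript
// Hypothetical helper: the running totals the bot announces for a deposit.
// It counts up in steps of 100 and finishes with the exact amount.
function depositAnnouncements(amount) {
  const lines = [];
  for (let total = 100; total < amount; total += 100) {
    lines.push(`${total} dollar received`);
  }
  if (amount > 0) {
    lines.push(`${amount} dollar received`);
  }
  return lines;
}

// Hypothetical helper: every time the user says "money", the bot pays out
// one 20-dollar bill. Anything else the user says is ignored.
function withdrawTotal(utterances) {
  const count = utterances.filter(
    (word) => word.trim().toLowerCase() === "money"
  ).length;
  return count * 20;
}

console.log(depositAnnouncements(313));
// [ '100 dollar received', '200 dollar received',
//   '300 dollar received', '313 dollar received' ]
console.log(withdrawTotal(["Money", "Money", "Money", "Money"])); // 80
```

With these rules, withdrawing 200 dollars would take ten separate "money" commands, which is exactly the kind of repetition we wanted the bot to force on the user.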
We decided to build the bot as an HTML page because the sample code is written in JavaScript. We built our code on top of the basic STT sample code and the basic TTS sample code.
We created a button for each step of the interaction and used a step counter to track where the user is in the conversation. At each step, we hide the previous button, display the new one, have the machine speak, and check the voice input to decide what happens next.
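The step-counter pattern described above can be sketched roughly as follows. This is not our actual code: the step ids, the prompts, and the `createStepper` helper are made up for illustration, and the `ui` object stands in for the real page so the sketch also runs outside a browser.

```javascript
// Hypothetical step list; the ids and prompts are illustrative only.
const steps = [
  { id: "name", prompt: "What's the name of the account?" },
  { id: "confirm", prompt: "Are you the account holder?" },
  { id: "service", prompt: "Do you want to withdraw or deposit?" },
];

// Advances a step counter. On each call it hides the previous step's
// button, shows the next one, and has the machine speak the prompt.
// `ui` must provide hide(id), show(id), and speak(text).
function createStepper(stepList, ui) {
  let index = -1; // -1 means no step is showing yet
  return {
    next() {
      if (index >= stepList.length - 1) return null; // already at the last step
      if (index >= 0) ui.hide(stepList[index].id);
      index += 1;
      ui.show(stepList[index].id);
      ui.speak(stepList[index].prompt);
      return stepList[index].id;
    },
    get index() {
      return index;
    },
  };
}

// Stand-in UI that just logs. In a browser, hide/show could toggle
// document.getElementById(id).hidden, and speak could call
// speechSynthesis.speak(new SpeechSynthesisUtterance(text)).
const consoleUi = {
  hide: (id) => console.log(`hide button: ${id}`),
  show: (id) => console.log(`show button: ${id}`),
  speak: (text) => console.log(`bot says: ${text}`),
};

const stepper = createStepper(steps, consoleUi);
stepper.next(); // shows "name" and speaks its prompt
stepper.next(); // hides "name", shows "confirm"
```

Keeping the steps in one list means adding a stage is a matter of adding an entry rather than wiring up another button by hand.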
We are not sure our way of writing it is the most efficient one. It works, but we feel we shouldn't need so many buttons. We also wish the bot could listen continuously, so that having to press a button before talking would not be a barrier to the interaction.
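One possible direction for continuous listening, assuming the STT sample code sits on top of the browser's Web Speech API (we have not confirmed this), is to set the recognizer's `continuous` flag and restart it whenever it ends. The `listenContinuously` helper below is a hypothetical sketch, not something we built; it takes the recognizer constructor as a parameter so it can be tried with a stand-in outside the browser.

```javascript
// Hypothetical helper: keeps a speech recognizer running and passes each
// finished phrase to onPhrase. Returns a function that stops listening.
function listenContinuously(Recognition, onPhrase) {
  const recognizer = new Recognition();
  recognizer.continuous = true; // keep listening after each phrase
  recognizer.interimResults = false; // only report final results
  let stopped = false;

  recognizer.onresult = (event) => {
    // Walk through the results that are new since the last event.
    for (let i = event.resultIndex; i < event.results.length; i++) {
      const result = event.results[i];
      if (result.isFinal) {
        onPhrase(result[0].transcript.trim());
      }
    }
  };

  // Recognition can end on its own (for example after a silence),
  // so start it again unless we were asked to stop.
  recognizer.onend = () => {
    if (!stopped) recognizer.start();
  };

  recognizer.start();
  return () => {
    stopped = true;
    recognizer.stop();
  };
}

// Browser usage (assumes the Web Speech API is available):
if (typeof window !== "undefined") {
  const Recognition = window.SpeechRecognition || window.webkitSpeechRecognition;
  if (Recognition) {
    listenContinuously(Recognition, (phrase) => console.log("heard:", phrase));
  }
}
```

The browser will still ask for microphone permission, and a single start button may still be needed, but after that the user could talk through every step without pressing anything.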