A Guide on Preventing Interruptions from ChatGPT's Voice Feature

I was recently waiting for my nails to dry and didn’t want to smudge the paint, when it dawned on me that this would be the perfect opportunity to test some voice-only artificial intelligence features. Silicon Valley car owners are having long conversations with ChatGPT as they drive around, and I wanted to try chatting hands-free before meeting with two OpenAI product leads later that day.

Even though chatbots can be helpful for brainstorms, speaking back-and-forth with ChatGPT was like collaborating with an over-caffeinated friend who can’t stand even a second of silence. I was valiantly fighting against the artificial intelligence tool to finish a single, complete thought before it cut me off.

Me: I wrote a newsletter called AI Unlocked last year for our readers. In that newsletter, I …

ChatGPT: Tell me more about your newsletter and what specific question you have in mind related to it.

Only a couple minutes into the experiment, I experienced synthetic-speech-induced stage fright and pleaded with the chatbot for more time, asking for it to give me a second to think between sentences. The chatbot encouraged me to slow down, though the quick cadence of its responses remained unchanged.

When I mentioned the anxiety I experienced while chatting with the AI to Joanne Jang, a model behavior lead for ChatGPT, she explained it’s an aspect of the user experience the company is trying to fix within the AI model. “In our ideal world, the model would actually be a little bit better at detecting when you’re done. So, if you’re not done with your sentence, then it wouldn’t cut you off,” Jang says. “This is something that we’re trying to figure out, and we know that it’s a pain point for our users.”

With the caveat that you shouldn’t do this while driving, she suggested a simple solution for users: Just tap on the screen. As long as you have one finger free, you can tap and hold the large circle in the center of the app during conversations with the ChatGPT. Keep your finger there as you’re speaking to avoid any bot interruptions; let it go whenever you’re actually wrapped up with your vocal prompt.

While Nick Turley, a ChatGPT product lead, said he prefers using the back-and-forth conversation feature, available in the app by touching the headphone icon, he recommends another method of audible interaction for users who need more time and want to slow things down a bit, or who just find the default rhythm of the AI conversation to be awkward.

In the mobile app, tap on the microphone icon next to the headphones. Say whatever you’d like to use in your prompt, and then hit the blue area to stop the recording when finished. ChatGPT will convert the audio to text and add it to the prompt field. After you press Send, listen to ChatGPT’s response by long-pressing on the output, then selecting Read Aloud. This slowed-down process is a pleasant way to interact vocally with the AI tool at your own pace, for those who might get stressed out by the service’s rapid verbal responses.

Despite flaws, the tool is already more engaging than any interaction I’ve had with a previous-generation voice assistant, like Siri or Alexa. Since the launch of Siri over a decade ago, voice assistants have continued to improve, but they have failed to dramatically transform how users interact with technology day-to-day. I’m still typing up this article on a laptop, not orating my thoughts to Alexa. Similarly, I use my Google Nest Mini for playing music and setting kitchen timers, and that’s about it.

OpenAI’s two product leads seem eager to usher in ChatGPT’s voice assistant era. “We hope to evolve it more and more toward an assistant,” says Turley. “So, that means giving you more natural ways to talk to it.” It’s quite likely that ChatGPT will soon be able to match my conversational cadence and quell the pesky interruptions. The company recently announced a separate Voice Engine model that can re-create anyone’s voice with just a small snippet of audio. For example, a sales professional might be able to set up an AI voice assistant that fields incoming calls using their speech style, or mourning relatives could create a synthetic imitation of a deceased loved one’s voice.

Although ChatGPT is a dominant player in the AI chatbot ecosystem, OpenAI is not the only company with a unique, AI-powered voice assistant. For example, Google Assistant got a generative AI makeover last year. Rabbit and Humane are both dabbling with the idea of AI-focused hardware that uses voice commands as a primary mode of interaction. Another startup, Hume, recently launched a preview of emotion-centered software, called the Empathic Voice Interface, that attempts to match the AI’s emotional outputs to the tone it detects in your vocal prompts; if you’re acting silly or somber, it switches moods to mirror yours.

Will advances in generative AI lead to another breakthrough moment of increased utility for voice assistants? Back in 2018, WIRED senior reporter Lauren Goode wrote about the awkwardness of Amazon’s Alexa: “When these things do become more useful, we probably won’t notice it happening. Instead, the tech will just evolve around us.” Maybe I won’t recognize the significance of voice assistants until they’re part of my everyday routine, but I’ll notice immediately whenever they stop cutting me off.

A Guide on Preventing Interruptions from ChatGPT’s Voice Feature

ChatGPT’s voice feature is an incredible tool that allows users to interact with OpenAI’s language model using spoken commands. It opens up a whole new dimension of possibilities for communication and interaction. However, like any technology, it can sometimes lead to interruptions or unexpected behavior. In this guide, we will explore some tips and strategies to prevent interruptions when using ChatGPT’s voice feature.

1. Be clear and concise:
When giving commands or asking questions, it is important to be clear and concise in your speech. Avoid using ambiguous or convoluted sentences that may confuse the model. Instead, use simple and direct language to convey your message effectively.

2. Use explicit wake words:
ChatGPT’s voice feature requires a wake word to activate the model. By default, the wake word is “Hey GPT.” However, you can customize it to something else if you find that the model is frequently getting activated unintentionally. Choosing a unique and less common wake word can help reduce false activations.

3. Pause before and after commands:
To minimize interruptions, it is helpful to pause for a brief moment before issuing a command and after receiving a response. This allows the model to process your input and respond appropriately without cutting off or interrupting its output.

4. Utilize context:
Providing context is crucial for a smooth conversation with ChatGPT. When interacting with the voice feature, try to include relevant information or refer back to previous statements to help the model understand the context of your conversation. This can reduce misunderstandings and prevent unnecessary interruptions.

5. Experiment with different phrasings:
If you find that ChatGPT frequently interrupts or misunderstands your commands, try experimenting with different phrasings or sentence structures. Sometimes, rephrasing your question or command can make it clearer for the model to understand and respond accurately.

6. Adjust the temperature and max tokens:
ChatGPT’s voice feature allows you to adjust the temperature and max tokens settings. Temperature controls the randomness of the model’s responses, while max tokens limits the length of the response. By fine-tuning these settings, you can tailor the behavior of the model to suit your preferences and reduce the likelihood of long-winded or unexpected interruptions.

7. Provide feedback to OpenAI:
OpenAI actively seeks user feedback to improve their models. If you encounter frequent interruptions or unexpected behavior while using ChatGPT’s voice feature, consider providing feedback to OpenAI. This helps them understand the challenges users face and work towards enhancing the user experience.

In conclusion, ChatGPT’s voice feature is an exciting addition to OpenAI’s language model, enabling users to interact with the model using spoken commands. By following these tips and strategies, you can prevent interruptions and enhance your experience when using ChatGPT’s voice feature. Remember to be clear and concise, use explicit wake words, provide context, experiment with phrasings, adjust settings, and provide feedback to OpenAI. Enjoy seamless and uninterrupted conversations with ChatGPT!