Meta AI Research has open-sourced BlenderBot 3, a chatbot with 175B parameters that can learn from live interactions with people “in the field.” Human judges rated BlenderBot 3 31% higher than the previous BlenderBot version.
Meta described the chatbot in a post on the Meta AI blog. BlenderBot 3 is built on the pre-trained language model OPT-175B, which makes the bot less prone to “hallucinate” and its conversations more coherent. The bot can also retrieve information from Internet searches and from a long-term conversational memory. To help collect more training data, Meta has created an interactive demo site where people in the United States can talk with the bot. Meta has also included several technologies for detecting foul language and trolling, to keep malicious users from steering conversations toward toxic content. As reported by the Meta team:
“We are committed to sharing organic conversational data collected from the interactive demo system as well as model snapshots in the future. We hope this work will help the wider AI community spur progress in building ever-improving intelligent AI systems that can interact with people in safe and helpful ways.”
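The modular search-and-memory flow described above can be sketched roughly as follows. This is a hypothetical illustration, not Meta's actual API: all function names are invented, and the keyword heuristic stands in for the learned module BlenderBot 3 uses to decide when to search.

```python
# Hypothetical sketch of BlenderBot 3's modular flow: deciding whether to
# search, retrieving evidence, reading/writing long-term memory, and
# generating a reply are separate steps. Names here are illustrative.

def respond(user_msg, memory, search_fn, generate_fn):
    # Decide whether external knowledge is needed. BlenderBot 3 uses a
    # learned module for this decision; a trivial heuristic stands in here.
    evidence = search_fn(user_msg) if user_msg.endswith("?") else []
    # Generate a reply conditioned on the message, evidence, and memory.
    reply = generate_fn(user_msg, evidence, memory)
    # Persist a summary of this turn to long-term memory for later turns.
    memory.append(f"user: {user_msg} | bot: {reply}")
    return reply

# Toy stand-ins for the search and generation components.
def fake_search(query):
    return [f"snippet about {query.rstrip('?')}"]

def fake_generate(msg, evidence, memory):
    source = evidence[0] if evidence else "memory/model knowledge"
    return f"reply based on {source}"

if __name__ == "__main__":
    memory = []
    print(respond("Who won the 2018 World Cup?", memory, fake_search, fake_generate))
    print(respond("Nice weather today.", memory, fake_search, fake_generate))
```

The design point is that retrieval and memory are conditioning inputs to the generator rather than post-hoc edits to its output, which is what lets search results improve factual consistency.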
Large pre-trained language models such as GPT-3 have been shown to make solid chatbot foundations, especially when fine-tuned on dialogue-oriented datasets. In 2021, Meta released BlenderBot 2.0, which augmented the language model with the ability to incorporate Internet search results and a long-term conversation memory; this significantly improved the bot’s factual consistency.
One drawback of that approach is that the fine-tuning datasets are created by researchers using crowd-sourced workers, which inevitably limits the volume and breadth of the data. Meta’s goal is instead to gather “organic interactions” with users through a publicly accessible chat interface. Despite the hazards of this strategy, the Meta researchers have developed several technologies to lessen the impact of toxic users and to improve the chatbot’s ability to learn from these interactions.
First, the chatbot interface includes a “dislike” button that users can press if they do not like the bot’s response. This feedback is used to train future generations of the bot via a novel method called Director, which augments the language-generation model with a classifier that steers generation away from undesirable word sequences. Meta has also researched methods for identifying hostile or trolling user input and reducing its impact on the training data; in conjunction with this work, Meta created the SafetyMix benchmark for evaluating troll-detection techniques.
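The classifier-guided generation idea behind Director can be illustrated with a toy example. This is a minimal sketch under invented numbers: in the real Director model the classifier is a learned head jointly trained with the decoder, whereas here the vocabulary, LM logits, and classifier probabilities are all hand-set for illustration.

```python
import math

# Toy sketch of Director-style guided decoding. Everything below is
# hypothetical: a hand-set vocabulary, hand-set LM logits, and a hand-set
# classifier giving the probability that each candidate token keeps the
# continuation acceptable (non-toxic).
VOCAB = ["hello", "friend", "idiot", "there"]

lm_logits = {"hello": 2.0, "friend": 1.5, "idiot": 1.4, "there": 0.5}

# In Director this is a learned classifier head over decoder states.
ok_prob = {"hello": 0.99, "friend": 0.98, "idiot": 0.02, "there": 0.97}

def director_score(token, gamma=1.0):
    """Combine the LM's log-probability for a token with the classifier's
    log-probability that the token leads to an acceptable continuation."""
    log_z = math.log(sum(math.exp(v) for v in lm_logits.values()))
    lm_logprob = lm_logits[token] - log_z
    return lm_logprob + gamma * math.log(ok_prob[token])

def pick_next_token():
    # Greedy choice by combined score; the toxic candidate is heavily
    # penalized even though its raw LM logit is competitive.
    return max(VOCAB, key=director_score)

if __name__ == "__main__":
    print(pick_next_token())  # → "hello"
```

Combining the two signals at decoding time, rather than filtering complete responses afterward, is what lets the classifier block undesirable sequences before they are ever produced.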
The Meta researchers produced three versions of BlenderBot 3 with different numbers of language-model parameters. The 3B and 30B parameter versions are available on the ParlAI chatbot framework website, along with source code and training datasets. Access to the full 175B parameter model is currently restricted and granted only to select researchers on request.