Q&A: Peter Fairfax, Head of Data Science
We spoke to Peter Fairfax, Data Science Manager at Brandwatch, to find out more about ChatGPT and the large language models (LLMs) that are driving these new developments, as well as what Brandwatch and our industry can expect in the future.
Hi Peter, these developments using ChatGPT are very exciting! Can you tell us more about ChatGPT and how it works?
“ChatGPT is a language model, and there are many others. Simply put, a language model estimates the probability of words occurring, based on existing texts that have been shown to the model. For example, suppose we have the following sentence: Will Smith likes to eat [EXAMPLE].
A language model can compare the probabilities of different words filling [EXAMPLE], like spaghetti or cats. Will Smith is not a cat-eating psychopath, and recently there has been an online fascination with raw AI-generated videos showing him eating spaghetti. A model trained on this data will likely consider spaghetti more probable than cats.
The predictive text on your phone is based on a simple language model. Bigger models may be more complicated, but the point is that language models are powered by word probabilities.”
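The word-probability idea Peter describes can be illustrated with a toy example. The sketch below is only an illustration under stated assumptions: the corpus is invented, the function name `next_word_probabilities` is hypothetical, and simple counting stands in for what real models like ChatGPT do with neural networks over far more text.

```python
from collections import Counter

# Invented toy corpus for illustration only; real models train on vastly more text.
corpus = [
    "will smith likes to eat spaghetti",
    "everyone likes to eat spaghetti",
    "my neighbour likes to eat spaghetti",
    "a coyote likes to eat cats",
]


def next_word_probabilities(sentences, context):
    """Estimate P(next word | context) by counting which words follow `context`."""
    context_words = context.lower().split()
    n = len(context_words)
    counts = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        # Slide over the sentence and record the word that follows each match.
        for i in range(len(words) - n):
            if words[i:i + n] == context_words:
                counts[words[i + n]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}  # context never seen in the corpus
    return {word: count / total for word, count in counts.items()}


probs = next_word_probabilities(corpus, "likes to eat")
print(probs)  # spaghetti gets a higher probability than cats
```

With this corpus, "spaghetti" follows "likes to eat" three times out of four, so the toy model ranks it above "cats", mirroring the example in the answer above.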
Why is ChatGPT so important if it’s just another language model?
“For several reasons. First, this is a particularly large and sophisticated model. As language models get larger and are fed more data, they tend to be able to solve larger and more complex tasks. For example, ChatGPT and GPT-4 can write code and answer complex questions. Bigger, more extensively trained models can handle more abstract and complicated tasks and help us unlock other features, like the ones we will be launching soon. To me, the newest LLMs show that we are starting to see AI outperform humans on some linguistic tasks, the linguistic equivalent of Deep Blue beating the best human chess champion in 1997.
Unfortunately for machines, this means that from now on they’ll probably spend more time answering questions about lost packages than enjoying playing board games, but that’s fine by me.”
There is a lot of hype, but what are the limitations?
“ChatGPT is incredibly good in many areas, but even the best technology has weaknesses. In the case of ChatGPT, it lacks up-to-date knowledge and sometimes struggles with calculations.
Also, the internet is sometimes a dark and strange place, where people share all kinds of opinions. This material can end up in LLM training data, so the models sometimes repeat opinions that we find objectionable or wrong.”
How will Brandwatch technology work with more sophisticated LLMs in the future?
“As I said, these models aren’t perfect, but they complement our technology.
ChatGPT has no idea what’s going on right now: its training data ends in September 2021. Brandwatch Consumer Research (BCR) receives up to 50,000 new documents per second and enriches each of them with sentiment, location, GPT-based entities, and other metadata within minutes.
ChatGPT can struggle with arithmetic, confidently giving plausible but wrong answers. In contrast, Brandwatch is designed to analyze data at scale and create insights using AI, statistics, and complex aggregations. Our new features combine the best of ChatGPT and Brandwatch, so you get reliable, real-time quantitative insights that are even easier to understand.
One example is AI-powered conversational insights. This lets you click on any data point in our dashboards, such as peaks or segments in your data, and, thanks to ChatGPT’s ability to summarize large amounts of text, get a succinct, natural-language overview of what’s driving the trend.
More broadly, I expect we’ll see a shift towards chatbot-like functionality for non-experts, allowing people to interact with all of our data without having to know all the tricks for separating signal from noise. Behind the scenes, this will require more information to be explained in text form, and I really hope this has the side effect of improving accessibility for blind and visually impaired people.”
What will be the impact of LLMs on the future of our industry?
“It is impossible to predict, but if I had to guess: more ways of interacting, a seamless combination of numerical AI and language models, and more support for human creativity.
We should be able to make social listening available to more people by leveraging technologies like AI-powered search. It’s good that you no longer need to be a Boolean expert to write queries, but the new LLMs will also make it easier and faster to synthesize information and tell stories.
Statistics, and the world of AI beyond language models, will always have a key role to play. They will continue to be the backbone of real-time analytics and monitoring, as well as long-term forecasting, and will link more closely to language models.
ChatGPT and other LLMs are capable of producing human-sounding content extremely effectively and will continue to improve. The companies with the best long-term future are the ones that listen carefully to their users and think creatively about the fundamental problems that AI can solve, as well as the parts of the job that users want to spend the least time on.
There will be risks along the way, and we shouldn’t get caught up in the hype to the point where we forget our ethical and scientific standards. As data scientists, we have ethical obligations in terms of accuracy and robustness. But there will also be a lot of serendipity, and I’m really excited to see what amazing things we’re now able to build and who we can help along the way.”