What is ChatGPT and how does it work?

What is ChatGPT?

ChatGPT is an artificial intelligence (AI) application that combines supervised, transfer and reinforcement learning, allowing it to interpret human language and answer queries with human-like responses. Since its launch by OpenAI in November 2022, ChatGPT has generated considerable buzz, both in the news and in general discussion, about its transformative potential. As with every new technology, the discussions have been a healthy mix of technological advancement, hype and scepticism. We are already seeing ChatGPT rapidly climb to the peak of inflated expectations on the Gartner hype cycle.

Gartner Hype Cycle (Credit: Gartner)

However, what comes next is the trough of disillusionment, which brings significant headwinds to adoption of the technology. We believe that a more balanced view and a realistic understanding of ChatGPT’s capabilities from the outset could help smooth out the peak of hype and create a shallower trough of disillusionment. To that end, we believe that everybody, regardless of their area of work, needs to develop at least a rudimentary understanding of how this application works, and hence of its strengths and limitations. Failure to do so will inevitably result in disappointment, misuse, or failure to leverage the potential of ChatGPT.

How does ChatGPT work?

A simple approach to understanding ChatGPT is to deconstruct its name. The last part of the name, GPT, stands for Generative Pre-trained Transformer, a family of advanced large language models built with deep learning. It belongs to the Natural Language Processing (NLP) domain within AI.

GPT uses deep learning techniques such as self-attention to generate human-like text. It does this by looking at the words already present in the conversation and predicting the words most likely to come next. Notably, the model builds its response not through reasoning but from the word and text patterns it has seen in its training data.
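To make self-attention a little more concrete, here is a minimal sketch in plain Python. It is an illustration only: the two-dimensional word vectors are invented for the example, and production transformers additionally use learned query, key and value projections, many attention heads and masking, none of which are shown here.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product self-attention without learned projections.

    Each output vector is a weighted average of all input vectors, where
    the weights reflect how similar each word is to the word in focus.
    """
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in vectors]
        weights = softmax(scores)
        mixed = [sum(w * vec[i] for w, vec in zip(weights, vectors))
                 for i in range(dim)]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical 2-dimensional embeddings for a three-word sentence.
words = ["the", "cat", "sat"]
embeddings = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.9]]
outputs, weights = self_attention(embeddings)
for word, row in zip(words, weights):
    print(word, [round(w, 2) for w in row])
```

Each printed row shows how much one word "attends" to every word in the sentence. With these made-up vectors, "cat" places more weight on "sat" than on "the" because their vectors are more alike.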

‘Chat’ simply relates to the ability of the model to provide responses in a conversational manner.

Now that we understand what ChatGPT encompasses, let us look at how it was trained.

ChatGPT is based on a large language model called GPT-3.5. GPT-3.5, like many other large language models, is built on a deep learning architecture called the transformer. Training these models involves exposing them to vast amounts of text data, such as books, articles, and websites, so that they learn the patterns and relationships between words and sentences. This pre-training procedure is a form of self-supervised learning, as the correct next word can be determined simply by looking at the dataset.
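The self-supervised idea can be sketched in a few lines of Python. The function name and the whitespace tokenisation below are assumptions made for illustration; real models split text into sub-word tokens and use far longer contexts.

```python
def next_word_examples(text, context_size=3):
    """Turn raw text into (context, next word) training pairs.

    No human labelling is needed: the "label" for each context is simply
    the word that follows it in the text.
    """
    tokens = text.lower().split()
    examples = []
    for i in range(1, len(tokens)):
        context = tokens[max(0, i - context_size):i]
        examples.append((context, tokens[i]))
    return examples

corpus = "the cat sat on the mat"
for context, target in next_word_examples(corpus):
    print(context, "->", target)
```

Every sentence in a book, article or website yields training examples in this way, which is why such models can be trained on very large volumes of unlabelled text.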

ChatGPT was created by taking GPT-3.5 and training it further on a batch of prompts (i.e., commands or questions) submitted by test users and professionally annotated with appropriate answers. For this, OpenAI’s human trainers provided conversations in which they played both the user and the AI assistant.

Next, a machine learning technique called Reinforcement Learning from Human Feedback (RLHF) was used to train ChatGPT to simulate dialogue, answer follow-up questions, admit mistakes, challenge incorrect premises, and reject inappropriate requests. The training process involved generating a large number of responses from the model and having human evaluators rate the quality of each response (Step 2 in Figure 1). The feedback from the evaluators was used to update the model's parameters and improve its performance over time (Step 3 in Figure 1).
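The feedback loop can be illustrated with a heavily simplified sketch. It is not OpenAI's implementation: the three candidate replies, their ratings and the function `improve_policy` are all hypothetical, and the "model" is reduced to one adjustable score per reply. Production systems typically train a separate reward model from human judgements and use a more elaborate optimiser; here the ratings are applied directly.

```python
import math

def softmax(logits):
    """Convert scores into a probability for each candidate reply."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical candidate replies and hypothetical human ratings (0 to 1).
responses = ["helpful answer", "vague answer", "offensive answer"]
human_rating = [0.9, 0.4, 0.0]

def improve_policy(logits, ratings, learning_rate=1.0, steps=200):
    """Gradient ascent on the expected rating under a softmax policy."""
    logits = list(logits)
    for _ in range(steps):
        probs = softmax(logits)
        expected = sum(p * r for p, r in zip(probs, ratings))
        # d(expected rating)/d(logit_j) = p_j * (r_j - expected)
        logits = [l + learning_rate * p * (r - expected)
                  for l, p, r in zip(logits, probs, ratings)]
    return logits

before = softmax([0.0, 0.0, 0.0])
after = softmax(improve_policy([0.0, 0.0, 0.0], human_rating))
for text, p0, p1 in zip(responses, before, after):
    print(f"{text}: {p0:.2f} -> {p1:.2f}")
```

Replies rated above the current average become more likely and those rated below it become less likely, which is the essence of using human feedback as a reward signal.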

One thing to note is that the model determines the best response based on what words it sees going together in the training text – it does not reason logically (more on this in the example below). Ergo, the responses from the model are very dependent on the variety of text that it has seen while training.
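A toy model makes this point tangible. The sketch below is not how GPT works internally (GPT uses a neural network over far longer contexts); it simply counts which word follows which in a small, invented training text and predicts the most frequent follower. The function names are chosen for illustration.

```python
from collections import Counter, defaultdict

def train_bigram_model(sentences):
    """Count, for every word, which words follow it in the training text."""
    following = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for current, nxt in zip(tokens, tokens[1:]):
            following[current][nxt] += 1
    return following

def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

training_text = [
    "the cat sat on the mat",
    "the cat chased the mouse",
    "the dog sat on the rug",
]
model = train_bigram_model(training_text)
print(predict_next(model, "the"))    # cat
print(predict_next(model, "sat"))    # on
print(predict_next(model, "zebra"))  # None: never seen in training
```

The prediction for "the" is "cat" only because that pairing occurs most often in the training text, and a word the model has never seen gives it nothing to work with.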

Learning on the job: equally important, if not more so, to the success of these models is how they continue to learn and improve over time. This is done using reinforcement learning, which depends on feedback to determine which answers are of a high calibre for a given context, based on a variety of criteria (such as accuracy, relevance to context, helpfulness, privacy for individuals and avoiding offensive content). Users of ChatGPT will have noticed that the application, having responded to a query, asks for feedback on its response. This user feedback is collated and curated by OpenAI and used to improve the model through Reinforcement Learning from Human Feedback (RLHF), which uses a system of reward and punishment to train a model.

Let us see some of these concepts in action.

ChatGPT Strengths & Weaknesses

In a couple of months, ChatGPT has generated enough hype for a lifetime and the internet is awash with articles about its strengths, weaknesses, possibilities, and dangers. In the interest of brevity, we will only talk about a couple here.

Possibly the greatest strength of ChatGPT is its ability to learn and improve continuously, based on the feedback received from millions of people across the world who access the application. The 175 billion parameters of GPT-3, the model family underlying ChatGPT, mean that it can derive granular insights from the training examples and feedback and improve its responses.

ChatGPT is far ahead of its competition in terms of response complexity. The application can sustain a conversation over several questions, providing credible, if not always accurate, answers.

On the other hand, the application suffers from a tendency to fabricate or “hallucinate”. This is integral to the technology: any large language model that is given a prompt and asked to create the most appropriate response based on the text it has seen will, from time to time, hallucinate and describe things which did not happen. Witness the news report below: ChatGPT has written a detailed news report about an event which never happened in the first place. This, of course, carries risks depending on the intended purpose of using the application.
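Why pattern-based generation fabricates can be shown with a deliberately tiny example. The sketch below is a toy word-transition model, not a large language model, and its two training sentences are chosen for illustration. It lists every sentence the model can produce by following word patterns it has seen, then keeps those that never appeared in its training text.

```python
from collections import defaultdict

def learn_transitions(sentences):
    """Record which word has been seen to follow which."""
    transitions = defaultdict(set)
    for sentence in sentences:
        tokens = ["<start>"] + sentence.split() + ["<end>"]
        for current, nxt in zip(tokens, tokens[1:]):
            transitions[current].add(nxt)
    return transitions

def all_generations(transitions, max_words=8):
    """Enumerate every sentence the model could produce from its patterns."""
    results = set()
    stack = [("<start>", [])]
    while stack:
        word, so_far = stack.pop()
        for nxt in transitions.get(word, ()):
            if nxt == "<end>":
                results.add(" ".join(so_far))
            elif len(so_far) < max_words:
                stack.append((nxt, so_far + [nxt]))
    return results

facts = [
    "paris is the capital of france",
    "rome is the capital of italy",
]
model = learn_transitions(facts)
fabricated = sorted(all_generations(model) - set(facts))
print(fabricated)
# ['paris is the capital of italy', 'rome is the capital of france']
```

Both fabricated sentences are fluent and built entirely from word sequences the model has seen, yet neither is true. The same mechanism, at a vastly larger scale, allows a language model to produce plausible accounts of events that never occurred.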

Also, the application is vulnerable to coordinated misinformation attacks, such as armies of bots supplying false information that then affects the content the application outputs. This risk will be magnified as more people become dependent on ChatGPT for getting information or creating content. Governments and organisations need to be vigilant and alive to the threat of malicious actors and hostile nation states using this tool for their own nefarious purposes.

Finally, we think that the free access model for ChatGPT is not sustainable because of the very high training and online inference costs. Conversely, a paid usage model would inevitably adversely impact the usage of ChatGPT. We believe that OpenAI needs to consider a combination of commercial measures to generate revenue from ChatGPT as well as re-architecting the application to reduce its operating costs, to reach a sustainable operating model for the application.

ChatGPT - Looking forward

ChatGPT is no doubt a significant technological innovation, which has vaulted NLP and large language models into general consciousness and generated high expectations about how it can improve our personal and professional lives. A brief scan of the internet will reveal a myriad of possibilities, ranging from automating students’ school homework to replacing Google to passing bachelor’s degree exams. There is no doubt that it will have many uses within the research and development arena, and Sagentia Innovation is already in conversation with a number of clients about applications of large language models. However, we need to ensure that the enormous possibilities of ChatGPT do not blind us to the shortcomings of the application and, most critically, to some of the risks that accompany it.

In the next blog, we will dive deeper into some of the possibilities as well as look closer to home to see how Sagentia Innovation’s own business may be transformed by this new technology.

Demystifying ChatGPT

What is ChatGPT? How does ChatGPT work? What are ChatGPT strengths and drawbacks?

Authors: Pradipto Biswas, Diogo Mota, and Vinod Munirajaiah
