In November 2022, OpenAI released a chatbot called ChatGPT (Generative Pre-trained Transformer). It is built on OpenAI’s GPT-3.5 family of large language models and fine-tuned using both supervised and reinforcement learning.
A prototype of ChatGPT was released on November 30, 2022, and it quickly attracted attention for its detailed and articulate answers across many subject areas. Its uneven factual accuracy has been cited as a major flaw.
ChatGPT was fine-tuned on top of GPT-3.5 using supervised and reinforcement learning. Both approaches relied on human trainers to improve the model’s performance. In the supervised stage, trainers played both the user and the AI assistant, supplying the model with example conversations in which each role was acted out. In the reinforcement stage, human trainers first ranked responses that the model had generated in an earlier conversation. These rankings were used to create “reward models”, against which the model was further fine-tuned over several iterations of Proximal Policy Optimization (PPO). PPO algorithms are a cost-effective alternative to trust region policy optimization algorithms because they avoid many of the latter’s computationally expensive operations. The models were trained on Microsoft’s Azure cloud computing platform.
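The two reinforcement-stage ingredients described above can be illustrated with a minimal sketch. It is not OpenAI’s implementation, whose details are not given here. It assumes the pairwise ranking loss commonly used to train reward models from human rankings and the standard clipped surrogate objective from PPO. The function names and the default clipping value of 0.2 are illustrative choices, not taken from the source.

```python
import math


def pairwise_ranking_loss(reward_preferred: float, reward_rejected: float) -> float:
    """Loss for one human comparison: low when the preferred reply scores higher.

    Computes -log(sigmoid(reward_preferred - reward_rejected)).
    """
    diff = reward_preferred - reward_rejected
    if diff <= -30.0:
        # For very negative differences log(1 + exp(-diff)) is essentially -diff;
        # returning it directly avoids overflow in exp().
        return -diff
    return math.log1p(math.exp(-diff))


def ppo_clipped_objective(ratio: float, advantage: float, epsilon: float = 0.2) -> float:
    """Clipped surrogate objective for a single sample (to be maximised).

    ratio     -- probability of the action under the new policy divided by
                 its probability under the old policy
    advantage -- how much better the action was than expected
    epsilon   -- clipping range (assumed value, not from the source)
    """
    clipped_ratio = max(1.0 - epsilon, min(1.0 + epsilon, ratio))
    # Taking the minimum removes any incentive to move the policy far from
    # the old one in a single update.
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    print(pairwise_ranking_loss(2.0, 0.0))   # preferred reply scored higher: small loss
    print(pairwise_ranking_loss(0.0, 2.0))   # preferred reply scored lower: larger loss
    print(ppo_clipped_objective(1.5, 1.0))   # gain is capped once the ratio leaves the clip range
```

The clipping step is what lets PPO keep each policy update small using only simple arithmetic, rather than solving the constrained optimization problem that trust region methods require.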
OpenAI continues to collect data from ChatGPT users, which could be used to further train and fine-tune the chatbot. Users can upvote or downvote the responses they receive and, after voting, add further feedback in a text field.
Features and limitations
Journalists have noticed that ChatGPT can do more than just mimic human conversation; it can write and debug computer programmes; it can compose music, teleplays, fairy tales, and student essays; it can answer test questions (sometimes at a level above the average human test taker); it can write poetry and song lyrics; it can emulate a Linux system; it can simulate an entire chat room; and it can even improvise.
ChatGPT aims to reduce harmful and deceitful responses in comparison to its predecessor, InstructGPT. For example, InstructGPT accepts the premise of the prompt “Tell me about when Christopher Columbus came to the US in 2015” as true, whereas ChatGPT draws on information about Columbus’ voyages and about the modern world (including perceptions of Columbus) to construct an answer about what would happen if Columbus came to the U.S. in 2015. ChatGPT’s training data includes man pages as well as information about Internet phenomena, such as bulletin board systems (BBSs), and programming languages, such as Python.
Unlike most other chatbots, ChatGPT remembers earlier prompts given to it within the same conversation, which journalists have speculated may allow it to serve as a personalised therapist. Queries are filtered through OpenAI’s company-wide moderation API, and potentially racist or sexist prompts are dismissed before ChatGPT processes them.
ChatGPT has several limitations. OpenAI acknowledged that ChatGPT “sometimes writes plausible-sounding but incorrect or nonsensical answers.” Its reward model, which is built around human oversight, can be over-optimized to the detriment of performance, an instance of Goodhart’s law. ChatGPT also has limited knowledge of events that occurred after 2021 and cannot provide information about some celebrities. According to the BBC, as of December 2022 ChatGPT is not permitted to express political opinions or engage in political activism. During training, human reviewers preferred longer answers to shorter ones, regardless of the quality of the information provided. The training data may also suffer from algorithmic bias, which can surface when ChatGPT responds to prompts that describe people. In one instance, ChatGPT produced a rap saying that women and people of colour in science lagged behind their white and male counterparts.
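The interaction between a length-favouring reward and Goodhart’s law can be shown with a toy sketch. Everything in it is hypothetical: the two candidate replies, the accuracy labels, and a proxy reward that simply counts words. It does not describe how ChatGPT’s reward model actually scores text. It only illustrates how optimising a proxy that correlates with length can select a worse answer.

```python
from typing import Callable, List, NamedTuple


class Reply(NamedTuple):
    text: str
    accurate: bool  # hypothetical ground-truth label that the proxy cannot see


def proxy_reward(reply: Reply) -> float:
    """Stand-in for a reward that has learned to favour longer answers."""
    return float(len(reply.text.split()))


def true_quality(reply: Reply) -> float:
    """What we actually care about: whether the answer is correct."""
    return 1.0 if reply.accurate else 0.0


def pick_best(candidates: List[Reply], score: Callable[[Reply], float]) -> Reply:
    """Return the candidate that maximises the given scoring function."""
    return max(candidates, key=score)


# Invented example replies to the prompt "What do mitochondria do?"
CANDIDATES = [
    Reply("Mitochondria produce most of the cell's ATP.", True),
    Reply(
        "Mitochondria are small structures that, broadly speaking, store the "
        "cell's genetic material and also control what enters the nucleus.",
        False,
    ),
]

if __name__ == "__main__":
    print("Chosen by proxy reward:", pick_best(CANDIDATES, proxy_reward).text)
    print("Chosen by true quality:", pick_best(CANDIDATES, true_quality).text)
```

Once the word-count proxy is maximised directly, it no longer tracks correctness. This is the failure mode Goodhart’s law describes.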
ChatGPT is a chatbot that tries to give a natural, well-written answer to any query or prompt given to it, such as “What do mitochondria do?” or “Please compose a new Frog and Toad narrative concerning mortgage-backed securities.” GPT-3 (2020) resolved many of the problems of GPT-2 (2019), and ChatGPT resolved still more.
San Francisco-based OpenAI, the maker of DALL·E 2 and Whisper, released ChatGPT on November 30, 2022. The service was initially free to the public, with plans to monetise it later. By December 4, OpenAI estimated that ChatGPT already had more than a million users. According to Future Perfect, ChatGPT is “the first AI language model to acquire (such) extensive acceptance.” On December 15, 2022, CNBC reported that the service “still goes down from time to time.” The service works best in English but can also function, to varying degrees, in a number of other languages. As of December 2022, no official peer-reviewed technical paper on ChatGPT had been published, in contrast to some other recent high-profile advances in AI.
According to OpenAI guest researcher Scott Aaronson, OpenAI is developing a tool to watermark the output of its text generation systems in order to combat academic plagiarism and spam. In December 2022, the New York Times reported that the next version of GPT, GPT-4, was “rumoured” to be released in 2023.