What changes OpenAI's GPT-3 and other models have brought us

Photo by Pablo Pacheco on Unsplash

In June last year, OpenAI released GPT-3. With 175 billion parameters and a training cost of tens of millions of dollars, it was the largest artificial intelligence language model ever produced. From answering questions to writing articles and poems, and even producing slang, it covers an extraordinary range of tasks.

GPT-3 stands for Generative Pretrained Transformer 3. It is the third in the series of generative pretrained transformers and is more than 100 times the size of GPT-2, released in 2019.

GPT-3 has 175 billion parameters; the second-largest language model at the time had 17 billion. The OpenAI team reported that GPT-3 is so good that people find it difficult to distinguish the news articles it generates from human-written ones. Large-scale language models of this kind are also a commercial pursuit: Google uses similar models to improve its search results and language translation, and technology companies such as Facebook, Microsoft, and Nvidia are developing language models of their own. The code of GPT-3 has never been released, because OpenAI chose to offer it as a commercial service.

Currently, developers working with GPT-3 are testing the model's capabilities, including summarizing legal documents, answering customer-service queries, proposing computer code, running text-based role-playing games, and more.

Although GPT-3 is versatile, it does not solve the problems that plague other text-generating programs. OpenAI Chief Executive Sam Altman said on Twitter in July last year that it still has serious weaknesses and sometimes makes very silly mistakes: GPT-3 observes the statistical relationships between the words and phrases it reads, but it does not understand their meaning. It is still an immature technology and needs continuous human guidance. Like a small chatbot allowed to speak off the cuff, GPT-3 can spew hate speech, including racist and sexist output.

Researchers have some ideas on how to address potentially harmful biases in language models, but instilling common sense, causal reasoning, or moral judgment into a model remains a huge research challenge.

From 175 billion parameters to trillions of parameters, language models keep "expanding." A neural network language model is a mathematical function inspired by the way neurons are connected in the brain.

These models are trained by guessing blanked-out or upcoming words in the text they see and then adjusting the connection strengths between their neurons to reduce the guessing error. As computing power has increased, these language models have become more complex.
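
To make the idea concrete, here is a minimal, illustrative sketch (not OpenAI's code) of that training loop in PyTorch: a tiny character-level model guesses the next character, measures its error, and adjusts its weights to guess better. The corpus, model size, and learning rate are arbitrary assumptions chosen only for illustration.

```python
# A toy next-token language model: guess the next character, measure the error,
# and adjust the connection strengths (weights) to reduce it.
import torch
import torch.nn as nn

text = "language models learn by predicting the next word "
vocab = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(vocab)}        # character -> integer id
data = torch.tensor([stoi[ch] for ch in text])

class TinyLM(nn.Module):
    def __init__(self, vocab_size, dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)  # learned "connections"
        self.out = nn.Linear(dim, vocab_size)       # scores for the next character

    def forward(self, idx):
        return self.out(self.embed(idx))

model = TinyLM(len(vocab))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

for step in range(200):
    inputs, targets = data[:-1], data[1:]           # each character predicts the next one
    loss = nn.functional.cross_entropy(model(inputs), targets)
    opt.zero_grad()
    loss.backward()                                 # how much each weight contributed to the error
    opt.step()                                      # nudge the weights to guess better
```

Models like GPT-3 follow the same guess-and-adjust recipe, only with billions of parameters, vast text corpora, and much richer architectures.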

In 2017, researchers developed a time-saving mathematical technique called the Transformer, which lets many processors train the model in parallel. The following year, Google released a large Transformer-based model called BERT, which caused a sensation and inspired a wave of other models built on this technique.
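
The core of the Transformer is an attention step in which every position in a sequence compares itself with every other position in a single matrix multiplication, which is what makes the computation easy to spread across many processors. Below is a minimal sketch of that scaled dot-product attention; the tensor sizes are illustrative assumptions, and a real Transformer adds multiple heads, projections, and feed-forward layers on top.

```python
# Scaled dot-product self-attention, the parallel-friendly heart of the Transformer.
import math
import torch

def attention(q, k, v):
    # q, k, v: (sequence_length, dim) queries, keys, and values
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))  # all-pairs similarity in one step
    weights = torch.softmax(scores, dim=-1)                    # how much each token attends to each other token
    return weights @ v                                          # weighted mix of the values

x = torch.randn(8, 16)        # stand-in for 8 token embeddings of size 16
out = attention(x, x, x)      # self-attention: the sequence attends to itself
print(out.shape)              # torch.Size([8, 16])
```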

In January this year, Google released a model containing 1.6 trillion parameters, but it is a "sparse" model, meaning that only a small fraction of its parameters do work on any given input, so the workload per parameter is small.
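
One common way to build such a sparse model is a mixture of experts: a small router sends each token to just one of many expert sub-networks, so most parameters sit idle for any given input. The toy layer below is only an illustration of that idea, not Google's implementation, and it omits the differentiable routing tricks real systems use.

```python
# A toy "sparse" mixture-of-experts layer: many experts exist, but each token
# only runs through the one expert the router picks for it.
import torch
import torch.nn as nn

class ToySparseLayer(nn.Module):
    def __init__(self, dim=16, num_experts=4):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)    # decides which expert handles each token
        self.experts = nn.ModuleList([nn.Linear(dim, dim) for _ in range(num_experts)])

    def forward(self, tokens):
        # tokens: (num_tokens, dim)
        choice = self.router(tokens).argmax(dim=-1)  # one expert per token
        out = torch.zeros_like(tokens)
        for i, expert in enumerate(self.experts):
            picked = choice == i
            if picked.any():
                out[picked] = expert(tokens[picked]) # only the chosen expert does any work
        return out

layer = ToySparseLayer()
print(layer(torch.randn(5, 16)).shape)               # torch.Size([5, 16])
```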

In order to guess words better, GPT-3 absorbs any pattern it can. This allows it to recognize grammar, article structure, and writing genre. Give it a few examples of a task or ask it a question, and it can continue the topic.
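
This example-driven use is usually called few-shot prompting. The sketch below assumes access to OpenAI's commercial API and the openai Python package as they were offered around GPT-3's release; the engine name, parameters, and API key are placeholders, and the prompt mirrors the translation example from the GPT-3 paper.

```python
# Few-shot prompting: show the model a couple of examples in plain text
# and let it continue the pattern.
import openai

openai.api_key = "YOUR_API_KEY"   # placeholder

prompt = (
    "Translate English to French:\n"
    "sea otter => loutre de mer\n"
    "cheese => fromage\n"
    "peppermint =>"
)

response = openai.Completion.create(
    engine="davinci",             # the GPT-3 base model
    prompt=prompt,
    max_tokens=8,
    temperature=0,
)
print(response["choices"][0]["text"])
```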

In this way, GPT-3 demonstrates a new paradigm of artificial intelligence.
