Autoregressive Language Modeling {Self-Sup}
Description
In autoregressive language modeling, which is used in models such as GPT, the model predicts the next word in a sentence given all the preceding words. It's trained to maximize the likelihood of a word given its previous words in the sentence.
Formula
The objective of an autoregressive language model is to maximize:
where w_i is the current word, w1,...., and wiโ1 are the previous words, and ฮธ represents the model parameters.