Language Modeling

What is Language Modeling?

  • Task of building a predictive model of language.

  • A language model is used to predict two types of quantities.

    • Probability of observing a sequence of words from a language.

      e.g., Pr(Colorless green ideas sleep furiously) = ?

    • Probability of observing a word next, given the sequence observed so far.

      e.g., Pr(furiously | Colorless green ideas) = ?

Why is Language Modeling Useful?

  • Machine translation

  • Handwriting recognition

  • Spelling correction

  • More generally

Formal Task Definition

A language model specifies the following two quantities for all words and word sequences over the vocabulary of a language.

  1. Probability of a sentence or sequence

    Pr(w_1, w_2, w_3, …, w_k)

    e.g., Pr(I, love, food) ≠ Pr(love, I, food)

  2. Probability of the next word in a sequence

    Pr(w_k | w_1, w_2, …, w_{k-1})
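
The two quantities above can be illustrated with a minimal sketch that estimates them by simple counting. This is an assumption made for illustration, not a method these notes prescribe: the toy corpus and the function names sequence_prob and next_word_prob are hypothetical.

```python
from collections import Counter

# Hypothetical toy corpus, used only for illustration.
corpus = [
    ("i", "love", "food"),
    ("i", "love", "music"),
    ("you", "love", "food"),
    ("i", "love", "food"),
]


def sequence_prob(sentence, corpus):
    """Estimate Pr(w_1, ..., w_k) as the relative frequency of the
    whole sequence among the corpus sentences."""
    counts = Counter(corpus)
    return counts[tuple(sentence)] / len(corpus)


def next_word_prob(word, history, corpus):
    """Estimate Pr(w_k | w_1, ..., w_{k-1}) as the fraction of corpus
    sentences starting with `history` whose next word is `word`."""
    history = tuple(history)
    k = len(history)
    # Sentences that begin with the history and have at least one more word.
    matches = [s for s in corpus if len(s) > k and s[:k] == history]
    if not matches:
        return 0.0  # history never observed; no estimate is available
    return sum(1 for s in matches if s[k] == word) / len(matches)


# Word order matters: the same words in a different order get a different probability.
print(sequence_prob(("i", "love", "food"), corpus))
print(sequence_prob(("love", "i", "food"), corpus))
print(next_word_prob("food", ("i", "love"), corpus))
```

Counting whole sequences assigns zero probability to anything not seen verbatim in the corpus, so this sketch only shows what the two quantities mean; it is not a usable model.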

Why is Language Modeling Hard?

Strawman Solution

Other Basic Solutions

How to Evaluate Language Models

Advanced Solutions