Part-of-Speech Tagging

Summary

  • Languages generally have a small set of closed-class words that are highly frequent, often ambiguous, and act as function words, alongside open-class words such as nouns, verbs, and adjectives. Various part-of-speech tagsets exist, ranging from about 40 to 200 tags.
  • Part-of-speech tagging is the process of assigning a part-of-speech label to each of a sequence of words.
  • Two common approaches to sequence modeling:
    • a generative approach, HMM tagging
      • The probabilities in HMM taggers are estimated by maximum likelihood estimation on tag-labeled training corpora
      • The Viterbi algorithm is used for decoding, finding the most likely tag sequence
      • Beam search is a variant of Viterbi decoding that maintains only a fraction of high scoring states rather than all states during decoding
    • a discriminative approach, MEMM (Maximum Entropy Markov Model) tagging
      • These taggers train logistic regression models to pick the best tag given an observation word, its context, and the previous tags, and then use Viterbi to choose the best sequence of tags.
  • Modern taggers are generally run bidirectionally
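
The HMM bullets above (maximum likelihood estimation from a tag-labeled corpus, then Viterbi decoding) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the three-sentence corpus is made up, the model is a bigram HMM with a hypothetical "<s>" start symbol, and there is no smoothing or unknown-word handling, so an unseen word gets probability zero under every tag. The function names train_hmm and viterbi are chosen here for illustration and do not come from the reference.

```python
import math
from collections import Counter, defaultdict

# Toy tag-labeled corpus, invented purely for illustration.
CORPUS = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
    [("a", "DT"), ("dog", "NN"), ("sleeps", "VBZ")],
]

START = "<s>"  # hypothetical start-of-sentence symbol


def train_hmm(corpus):
    """Estimate transition and emission probabilities by relative frequency (MLE)."""
    trans = defaultdict(Counter)  # trans[prev_tag][tag] = count
    emit = defaultdict(Counter)   # emit[tag][word] = count
    for sentence in corpus:
        prev = START
        for word, tag in sentence:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            prev = tag

    def normalize(table):
        # Turn each row of counts into a probability distribution.
        return {
            given: {event: c / sum(counts.values()) for event, c in counts.items()}
            for given, counts in table.items()
        }

    return normalize(trans), normalize(emit)


def log_prob(table, given, event):
    """Log probability of event given context; -inf if never observed."""
    p = table.get(given, {}).get(event, 0.0)
    return math.log(p) if p > 0 else float("-inf")


def viterbi(words, trans, emit):
    """Return the most likely tag sequence for words under a bigram HMM."""
    if not words:
        return []
    tags = list(emit)
    # best[i][tag] = (log prob of the best path ending in tag at position i, backpointer)
    best = [{
        t: (log_prob(trans, START, t) + log_prob(emit, t, words[0]), None)
        for t in tags
    }]
    for i in range(1, len(words)):
        column = {}
        for t in tags:
            # Best previous tag for reaching t at position i.
            score, prev = max(
                (best[i - 1][p][0] + log_prob(trans, p, t), p) for p in tags
            )
            column[t] = (score + log_prob(emit, t, words[i]), prev)
        best.append(column)
    # Follow backpointers from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(words) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]


trans, emit = train_hmm(CORPUS)
print(viterbi(["the", "dog", "sleeps"], trans, emit))
```

Working in log space avoids numeric underflow on longer sentences. Beam search would differ only in keeping the top few entries of each column instead of all tags.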

(Mostly) English Word Classes

Parts of speech can be divided into two broad supercategories:

  • Closed class types

    • Those with relatively fixed membership, such as prepositions - new prepositions are rarely coined

    • Generally function words

      such as of, it, and, or you, which tend to be very short, occur frequently, and often have structuring uses in grammar

    • Closed classes differ more from language to language than open classes do

    • Some of the important closed classes in English include:

      • prepositions - on, under, over, near, by, at, from, to, with

        prepositions occur before noun phrases

      • particles - up, down, on, off, in, out, at, by

        a particle resembles a preposition or an adverb and is used in combination with a verb

      • determiners - a, an, the

        a closed class that occurs with nouns, often marking the beginning of a noun phrase

      • conjunctions - and, but, or, as, if, when

        conjunctions join two phrases, clauses, or sentences

      • pronouns - she, who, I, others

        pronouns are forms that often act as a kind of shorthand for referring to some noun phrase or entity or event

        • personal pronouns - refer to persons or entities (you, she, I, it, me, etc.)
        • possessive pronouns - forms of personal pronouns that indicate either actual possession or more often just an abstract relation between the person and some object (my, your, his, its, one’s, our, their)
        • wh-pronouns - used in certain question forms, or may also act as complementizers
      • auxiliary verbs - can, may, should, are

        • English auxiliaries include the copula verb be, the two verbs do and have, along with their inflected forms, as well as a class of modal verbs
      • numerals - one, two, three, first, second, third

      • interjections - oh, hey, alas, uh, um

      • negatives - no, not

      • politeness markers - please, thank you

      • greetings - hello, goodbye

      • existential - there

  • Open class types

    • Nouns and verbs are open classes - new nouns and verbs like iPhone or to fax are continually being created or borrowed

    • Four major open classes occur in the languages of the world:

      • nouns

        Open class nouns fall into two classes:

        • Proper nouns - in written English, proper nouns are usually capitalized
        • Common nouns - divided in many languages into count nouns and mass nouns
      • verbs - refer to actions and processes

      • adjectives - include many terms for properties or qualities

      • adverbs - a rather mixed class (a hodge-podge) in both form and meaning

The Penn Treebank Part-of-Speech Tagset

Part-of-Speech Tagging

HMM Part-of-Speech Tagging

Markov Chains

The Hidden Markov Model

The components of an HMM tagger

HMM tagging as decoding

The Viterbi Algorithm

Working through an example

Extending the HMM Algorithm to Trigrams

Unknown Words

Maximum Entropy Markov Models

Features in a MEMM

Decoding and Training MEMMs

Bidirectionality

Part-of-Speech Tagging for Other Languages

Reference

https://web.stanford.edu/~jurafsky/slp3/

Vocabularies