Brill Tagger

The Brill Tagger was described by Eric Brill in his 1993 PhD thesis [1]. It can be summarised as an "error-driven transformation-based tagger". It is

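  • error-driven in the sense that it relies on supervised learning: rules are chosen by how many tagging errors they remove on an annotated text
  • transformation-based in the sense that it uses rules that change one tag into another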

Algorithm

The algorithm proceeds as follows:

  • Initialisation:
    • Known words (in vocabulary): assign the most frequent tag associated with the word form
    • Unknown words (out of vocabulary):
      • Proper noun if capitalised, common noun otherwise (1992)
      • Guessing rules learned on the same basis as the contextual rules (1994)
  • Learning Phase
    • Compute the score of each candidate rule (the number of errors before applying the rule minus the number of errors after applying it).
    • Select the rule with the highest score.
    • Add it to the rule set and apply it to the text.
    • Repeat until no rule has a score above a given threshold (that is, until applying new rules leaves the text in the same state, which is then taken to be the final state of the tagging).
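
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Brill's implementation: it uses a single hypothetical rule template ("change tag A to B when the preceding tag is C"), Penn Treebank style tag names (NN, NNP, VB, TO) chosen only for the example, and a toy corpus. The function names (`build_lexicon`, `initial_tags`, `apply_rule`, `learn`) are made up for this sketch.

```python
from collections import Counter, defaultdict

def build_lexicon(tagged_corpus):
    """Map each known word to its most frequent tag in the training data."""
    counts = defaultdict(Counter)
    for word, tag in tagged_corpus:
        counts[word][tag] += 1
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}

def initial_tags(words, lexicon):
    """Initialisation: known words get their most frequent tag; unknown words
    are tagged proper noun if capitalised, common noun otherwise (assumed tags)."""
    return [lexicon.get(w, "NNP" if w[:1].isupper() else "NN") for w in words]

def apply_rule(rule, tags):
    """Rule (old, new, prev): change old to new when the preceding tag is prev.
    Conditions are checked against the tags as they were before this pass."""
    old, new, prev = rule
    out = list(tags)
    for i in range(1, len(tags)):
        if tags[i] == old and tags[i - 1] == prev:
            out[i] = new
    return out

def errors(tags, gold):
    """Number of positions where the current tag differs from the reference."""
    return sum(t != g for t, g in zip(tags, gold))

def learn(words, gold, lexicon, threshold=1):
    """Greedy learning loop: keep adding the best-scoring rule until no
    candidate reaches the threshold."""
    tags = initial_tags(words, lexicon)
    rules = []
    while True:
        # Candidate rules are instantiated from the current errors only.
        candidates = {(tags[i], gold[i], tags[i - 1])
                      for i in range(1, len(tags)) if tags[i] != gold[i]}
        base = errors(tags, gold)
        best, best_score = None, 0
        for rule in sorted(candidates):
            # Score = errors before applying the rule minus errors after.
            score = base - errors(apply_rule(rule, tags), gold)
            if score > best_score:
                best, best_score = rule, score
        if best is None or best_score < threshold:
            break
        rules.append(best)
        tags = apply_rule(best, tags)
    return rules, tags

# Toy training text: "race" is mostly a noun, but a verb after "to".
corpus = [("the", "DT"), ("race", "NN"), ("starts", "VBZ"),
          ("we", "PRP"), ("want", "VBP"), ("to", "TO"), ("race", "VB"),
          ("the", "DT"), ("race", "NN"), ("ends", "VBZ")]
words = [w for w, _ in corpus]
gold = [t for _, t in corpus]

rules, tags = learn(words, gold, build_lexicon(corpus))
print(rules)  # [('NN', 'VB', 'TO')]
```

On this toy text the initial tagging labels every "race" as NN, which leaves one error; the single learned rule fixes it, after which no candidate scores above the threshold and the loop stops.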

Rules

Lexical rules are used for the initialisation, and contextual rules are used to correct the tags.

  • Lexical rules: word → tag IF Condition (example: identification of suffixes like "-tion")
  • Contextual rules: tag1 → tag2 IF Condition (example: "preceding/following tag is X", "preceding/following word is w")
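
As an illustration of the two rule shapes, the sketch below represents each rule as a small record holding a condition function. It is only one possible encoding, assumed for this example: the class names (`LexicalRule`, `ContextualRule`), the helper functions, the tag names, and the sample rules are all hypothetical and do not come from Brill's work.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class LexicalRule:
    """word -> tag IF condition(word)"""
    tag: str
    condition: Callable[[str], bool]

@dataclass
class ContextualRule:
    """from_tag -> to_tag IF condition(words, tags, position)"""
    from_tag: str
    to_tag: str
    condition: Callable[[List[str], List[str], int], bool]

def has_suffix(suffix):
    """Condition on the word form, e.g. the suffix '-tion'."""
    return lambda word: word.endswith(suffix)

def prev_tag_is(tag):
    """Condition: the preceding tag is `tag`."""
    return lambda words, tags, i: i > 0 and tags[i - 1] == tag

def prev_word_is(word):
    """Condition: the preceding word is `word`."""
    return lambda words, tags, i: i > 0 and words[i - 1] == word

def apply_lexical(rules, word, default="NN"):
    """Return the tag of the first lexical rule whose condition holds."""
    for rule in rules:
        if rule.condition(word):
            return rule.tag
    return default

def apply_contextual(rule, words, tags):
    """Rewrite from_tag to to_tag wherever the condition holds.
    Conditions are evaluated on a snapshot of the tags before the pass."""
    snapshot = list(tags)
    return [rule.to_tag
            if t == rule.from_tag and rule.condition(words, snapshot, i)
            else t
            for i, t in enumerate(snapshot)]

# Assumed example rules, for illustration only.
lexical = [LexicalRule("NN", has_suffix("tion")),
           LexicalRule("RB", has_suffix("ly"))]
to_verb = ContextualRule("NN", "VB", prev_tag_is("TO"))
after_the = ContextualRule("VB", "NN", prev_word_is("the"))

words = ["to", "question", "the", "question"]
print(apply_contextual(to_verb, words, ["TO", "NN", "DT", "NN"]))
# ['TO', 'VB', 'DT', 'NN']
```

Keeping the condition as a function lets the same `apply_contextual` routine handle both tag-based and word-based conditions, such as `prev_tag_is` and `prev_word_is` here.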
Last updated: 01-04-2007 01:18:57
The contents of this article are licensed from Wikipedia.org under the
GNU Free Documentation License. See original document.