Severin Perez


Reference: TF-IDF

October 11, 2020
Term frequency-inverse document frequency is a means of assigning weight to a search term when comparing individual documents within a corpus. It is an improvement on the bag-of-words model in that it considers the relative rarity of a term within a larger corpus.
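
As a rough illustration of the weighting described above, here is a minimal sketch in plain Python. The function name `tf_idf`, the toy corpus, and the choice of an unsmoothed natural-log IDF are all assumptions made for this example; libraries commonly apply smoothing or normalization that is omitted here.

```python
import math
from collections import Counter

def tf_idf(term, doc, corpus):
    """Weight of `term` in a tokenized `doc`, relative to a tokenized `corpus`."""
    if not doc:
        return 0.0
    # Term frequency: share of the document's tokens that are `term`.
    tf = Counter(doc)[term] / len(doc)
    # Document frequency: number of documents containing `term`.
    df = sum(1 for d in corpus if term in d)
    # Inverse document frequency: rarer terms get a larger weight.
    idf = math.log(len(corpus) / df) if df else 0.0
    return tf * idf

# Hypothetical three-document corpus, tokenized on whitespace.
corpus = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "the bird sang".split(),
]

for term in ("the", "cat", "mat"):
    print(term, round(tf_idf(term, corpus[0], corpus), 4))
```

In this toy corpus "the" appears in every document, so its weight in the first document is zero, while "mat" (found in one document) outweighs "cat" (found in two). A plain bag-of-words count would instead rank "the" highest.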

Reference: Stemming

August 23, 2020
In natural language processing, stemming is the process of reducing a word to its stem form. Typically, stemming is used as part of an NLP pipeline in order to reduce all words in a text to their stems so that they can be analyzed together.
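
To make the idea concrete, the sketch below strips a handful of common English suffixes so that related word forms collapse to one stem. The suffix list and the name `naive_stem` are assumptions for illustration only; this is not the Porter algorithm or any other established stemmer, and it will mishandle many real words.

```python
# Illustrative suffix list, checked from longest to shortest.
SUFFIXES = ("ing", "ed", "es", "s")

def naive_stem(word):
    """Strip the first matching suffix, leaving at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

words = ["Jumping", "jumped", "jumps", "jump"]
stems = [naive_stem(w) for w in words]
print(stems)
```

All four inputs reduce to the same stem, "jump", which is what lets a pipeline count or compare them as a single term.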

Reference: Root Word

August 23, 2020
A root word is a word with no prefixes or suffixes, meaning that it is the primary lexical unit of a set of words and represents the principal semantic meaning of the set.

Reference: Lemmatization

August 21, 2020
Lemmatization is the process of reducing a word to its lemma (canonical form). In natural language processing, a lemmatizer may be used to reduce all words in a given text to their lemmas, which makes comparative analysis possible based on canonical forms.
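
The contrast with stemming is easiest to see with irregular forms. The sketch below uses a tiny hand-written lookup table; the table `LEMMAS` and the function `lemmatize` are hypothetical and exist only for this example. Real lemmatizers typically rely on a full vocabulary and part-of-speech information rather than a short dictionary like this one.

```python
# Hypothetical lookup table mapping inflected forms to their lemmas.
LEMMAS = {
    "is": "be",
    "are": "be",
    "was": "be",
    "were": "be",
    "mice": "mouse",
    "ran": "run",
    "running": "run",
    "better": "good",
}

def lemmatize(word):
    """Return the lemma for `word`, or the lowercased word if it is unknown."""
    key = word.lower()
    return LEMMAS.get(key, key)

tokens = "Mice were running".split()
lemmas = [lemmatize(t) for t in tokens]
print(lemmas)
```

Forms such as "were" and "mice" share no suffix pattern with "be" and "mouse", so suffix stripping cannot recover them; a lookup against canonical forms can.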

Tag: linguistics (p. 1)
© Severin Perez, 2021