NLP (MOC)

Notes

NLP, or "natural language processing", is the set of methods by which we "teach" a computer to read text. Since computers don't understand words, we have to convert text into numbers. The challenge is to convert it in a way that captures not only the meaning of each word, but also its role within the sentence and its connection to the overall meaning of the text.

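One common way to do that is TF-IDF (term frequency-inverse document frequency), a method of converting text into a vector in which each word is weighted by how often it appears in a document and how rare it is across all documents. A related tool is the Naive Bayes classifier, which is not a way of vectorizing text itself but a simple model that is often trained on word counts or TF-IDF vectors.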
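To make the idea of "text into a vector" concrete, here is a minimal sketch of TF-IDF using only the Python standard library. The function names (`tokenize`, `tf_idf`) and the three sample sentences are made up for illustration, and the weighting used is the plain textbook form (term frequency times the log of inverse document frequency); libraries such as scikit-learn use smoothed variants that give slightly different numbers.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive tokenizer (an assumption for this sketch): lowercase, split on whitespace.
    return text.lower().split()


def tf_idf(docs):
    """Return (vocab, vectors) with one TF-IDF vector per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each word appear?
    df = Counter(tok for doc in tokenized for tok in set(doc))
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([
            (counts[tok] / len(doc)) * math.log(n_docs / df[tok])
            for tok in vocab
        ])
    return vocab, vectors


docs = ["the cat sat", "the dog sat", "the cat ran"]
vocab, vectors = tf_idf(docs)
for doc, vec in zip(docs, vectors):
    print(doc, [round(x, 3) for x in vec])
```

Note that "the" appears in every document, so its weight is zero everywhere, while a word like "dog" that appears in only one document gets the highest weight. That is the point of the method: common words carry little information about what makes a document distinct.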

Types of Analysis

With NLP, you can:

  1. Text Classification - To categorize text, for example as either "spam" or "not spam"
  2. Text Generation - The basis for chat AI models, which generate text based on a prompt
  3. Sentiment Analysis - To analyze whether a text (perhaps a review) is positive, negative, or neutral
  4. Topic Modeling - To group text by topic, for example news articles into politics, economics, etc.
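As an illustration of the first item, text classification, here is a small sketch of a Naive Bayes spam filter written from scratch with the standard library. The helper names (`train`, `predict`) and the four training messages are invented for this example, and the model is a basic multinomial Naive Bayes with add-one smoothing; a real project would more likely use a library implementation and far more data.

```python
import math
from collections import Counter, defaultdict


def train(samples):
    """samples: list of (text, label) pairs. Returns the counts the model needs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def predict(model, text):
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Log prior plus the sum of log likelihoods (add-one smoothing).
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train([
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting at noon tomorrow", "ham"),
    ("lunch with the team tomorrow", "ham"),
])
print(predict(model, "free money"))
print(predict(model, "team meeting tomorrow"))
```

The same structure works for sentiment analysis by swapping the labels for "positive", "negative", and "neutral".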

Techniques

Most common techniques for NLP:

  1. Named Entity Recognition - To detect named entities within the text, such as people, companies, and places
  2. Regex - To search for matches within the text based on a special pattern. Also see pattern matching
  3. Tokenization - To break down a sentence into base components (which can then be converted into a vector)
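The regex and tokenization items above can be sketched with Python's built-in `re` module. The patterns here are deliberately simple assumptions for illustration: the tokenizer splits into runs of word characters and single punctuation marks, and the email pattern is a rough approximation rather than a full validator. Named entity recognition is left out because it generally needs a trained model from an NLP library.

```python
import re


def tokenize(sentence):
    # A word is a run of letters/digits/underscores; each punctuation mark is its own token.
    return re.findall(r"\w+|[^\w\s]", sentence.lower())


# Rough email pattern (an assumption for this sketch, not a complete validator).
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

tokens = tokenize("Tokenization breaks text down, piece by piece.")
emails = EMAIL.findall("Contact Ada at ada@example.com, or Bob at bob@example.org.")

# Map each distinct token to an index, a first step toward turning tokens into a vector.
index = {tok: i for i, tok in enumerate(sorted(set(tokens)))}

print(tokens)
print(emails)
print(index)
```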

Courses

natural language processing course NLP with python
