Introduction to Text Mining

An overview of text mining tools and techniques.

Regular Expressions

Regular expressions (regex) are a pattern-matching language for describing complex patterns in strings. R can use them to parse strings based on complex criteria, which lets you perform more specific tasks when processing string data. However, regular expressions can be difficult to work with, so we recommend checking each pattern in a regular expression tester before running it in R.
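As a rough illustration of what this looks like in practice, here is a minimal sketch using base R functions (grepl, regexpr, regmatches, gsub). The sample strings and the variable names (notes, date_pattern, has_date, dates, masked) are made up for this example and do not come from any particular dataset.

```r
# Hypothetical sample data, invented for illustration only
notes <- c("Order 66 shipped on 2021-03-04",
           "No date recorded",
           "Returned 2020-12-31, refund issued")

# Pattern: four digits, a hyphen, two digits, a hyphen, two digits
date_pattern <- "[0-9]{4}-[0-9]{2}-[0-9]{2}"

# Test each string for a match (returns a logical vector)
has_date <- grepl(date_pattern, notes)

# Extract the first match from each string that has one;
# strings without a match are dropped from the result
dates <- regmatches(notes, regexpr(date_pattern, notes))

# Replace every run of digits with a placeholder character
masked <- gsub("[0-9]+", "#", notes)

print(has_date)
print(dates)
print(masked)
```

The pattern here avoids backslashes on purpose. In an R string literal a backslash must itself be escaped, so a regex such as \d would be written as "\\d", which is one of the things a regex tester will not catch for you.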

The following Cheat Sheet is a good reference for regular expression syntax.