
Introduction to Text Mining

An overview of text mining tools and techniques.

str_replace(string, pattern, replacement)

str_replace() matches a pattern within a string and replaces it with a new string specified by the user. Note: str_replace() replaces only the first match in each string. To replace every match in a string, use str_replace_all().

Arguments: 

string: the string or character object you want to run the function on. 

pattern: the pattern to search the string for, interpreted as a regular expression by default. (should be contained in quotation marks: "pattern").

replacement: the string you would like to replace the matched pattern with. (should be contained in quotation marks: "replacement"). 

# Setup: load stringr and build the 'shakespeare' data frame.
library(stringr)

play = c("Hamlet", "Romeo & Juliet", "Romeo & Juliet", "The Merchant of Venice",
         "King Henry IV", "Julius Caesar", "Macbeth", "King Lear", "Richard III",
         "Julius Caesar", "Othello", "Twelfth Night", "Hamlet", "Timon of Athens",
         "King Lear", "As You Like It", "Measure for Measure", "Twelfth Night")

quote = c(
  "What a piece of work is man! how noble in reason! how infinite in faculty! in form and moving how express and admirable! in action how like an angel! in apprehension how like a god! the beauty of the world, the paragon of animals",
  "What's in a name? That which we call a rose by any other name would smell as sweet",
  "Tempt not a desperate man",
  "The devil can cite Scripture for his purpose",
  "A man can die but once",
  "But, for my own part, it was Greek to me",
  "Double, double toil and trouble; Fire burn, and cauldron bubble",
  "Nothing will come of nothing",
  "Now is the winter of our discontent",
  "Cowards die many times before their deaths; the valiant never taste of death but once",
  "I am one who loved not wisely but too well",
  "If music be the food of love play on",
  "We know what we are, but know not what we may be",
  "We have seen better days",
  "I am a man more sinned against than sinning",
  "All the world's a stage, And all the men and women merely players: They have their exits and their entrances; And one man in his time plays many parts",
  "Some rise by sin, and some by virtue fall",
  "Some are born great, some achieve greatness, and some have greatness thrust upon them"
)

shakespeare = data.frame(play, quote)

# str_replace(string, pattern, replacement) replaces the first matched pattern in each string.
# Try replacing "Romeo & Juliet" with "R & J" in the 'play' column of the shakespeare data frame.
str_replace(shakespeare$play, "Romeo & Juliet", "R & J")

str_replace_all(string, pattern, replacement)

str_replace_all() matches every occurrence of a pattern within a string and replaces each one with a new string specified by the user.

Arguments: 

string: the string or character object you want to run the function on. 

pattern: the pattern to search the string for, interpreted as a regular expression by default. (should be contained in quotation marks: "pattern").

replacement: the string you would like to replace the matched pattern with. (should be contained in quotation marks: "replacement"). 
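The difference between the two functions is easiest to see on a single string that contains the pattern more than once. The following is a minimal sketch, assuming the stringr package is installed; the variable names (line, first_only, every_match) are chosen here for illustration only.

```r
library(stringr)

# A single string in which the pattern "ou" occurs three times.
line <- "Double, double toil and trouble"

# str_replace() changes only the first match in the string.
first_only <- str_replace(line, "ou", "OU")

# str_replace_all() changes every match in the string.
every_match <- str_replace_all(line, "ou", "OU")

print(first_only)   # value: "DOUble, double toil and trouble"
print(every_match)  # value: "DOUble, dOUble toil and trOUble"
```

When the input is a vector of several strings, both functions work on each element separately, so str_replace() still replaces the first match in every element.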

String Case Functions

The stringr package contains a number of functions that allow users to change the case type of character objects. Below you will find a list of these functions. 

str_to_lower(string): Transform all characters in a string to lower-case letters. 

str_to_upper(string): Transform all characters in a string to upper-case letters. 

str_to_title(string): Transform a string to title case, capitalizing the first letter of each word and lowering the remaining letters.
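As a quick illustration of the three functions listed above, the sketch below applies each one to the same mixed-case string. It assumes the stringr package is installed; the variable names (title, lower, upper, titled) are made up for this example.

```r
library(stringr)

# A play title typed with inconsistent case.
title <- "the merchant OF venice"

lower  <- str_to_lower(title)  # every letter lowered
upper  <- str_to_upper(title)  # every letter raised
titled <- str_to_title(title)  # first letter of each word raised, the rest lowered

print(lower)   # value: "the merchant of venice"
print(upper)   # value: "THE MERCHANT OF VENICE"
print(titled)  # value: "The Merchant Of Venice"
```

Note that str_to_title() capitalizes every word, including short words such as "of", so results may still need manual adjustment for conventional title styling.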

# Setup: load stringr and define the example string.
library(stringr)

sonnet = "ShAlL i COmparE tHEe to a SUmmEr's dAy?"

# stringr provides a number of functions for raising and lowering character case.
# Printing the variable 'sonnet' shows that it contains several case errors.
print(sonnet)

# Use the following functions on the 'sonnet' string:
# str_to_upper(), str_to_lower(), and str_to_title().
str_to_upper(sonnet)
str_to_lower(sonnet)
str_to_title(sonnet)