Extract nth word from text string
To extract the nth word in a text string (i.e. a sentence, phrase, or paragraph) you can so with a formula that combines 5 Excel functions: TRIM, MID, SUBSTITUTE, REPT, and LEN. IN the example shown, the formula in D5, copied down, is:
The result in column D is the nth word of the text in column B, where n is given in column C.
In this example, the goal is to extract the nth word from the text string given in column B. Excel does not currently have a function dedicated to this task, so this formula is a workaround. At the core, this formula takes a text string with spaces, and "floods" it with additional space by replacing each space in the original text with a large number of space characters using SUBSTITUTE and REPT.
Note: the number of spaces used in the formula is just an arbitrarily large number that will work in all cases. Using the LEN function to calculate the length of the original text string, and using this as the number of spaces, ensures that the formula will work even for very long words in the original text.
Working from the inside out, the SUBSTITUTE function is used to replace all single spaces with a larger number of spaces:
The REPT function and LEN functions are is used to provide a variable number of spaces:
LEN returns 25, so the REPT function returns a string of 25 spaces:
REPT(" ",25) // returns " "
The result from REPT is returned directly to the SUBSTITUTE function:
=SUBSTITUTE(B5," "," ")
SUBSTITUTE then replaces every single space in the text from B5 with 25 spaces. You can think of the result at this point as "islands" of words floating in a sea of space :)
="Better the devil you know"
The result from SUBSTITUTE is returned directly to the MID function. Then the formula uses the MID function to extract the nth word. The start)_num is worked out with this snippet:
(N-1)*LEN(B5)+1 // returns 51
This code is designed to calculate the correct starting point to extract text, taking into account the fact that we've already flooded the text with extra space characters. We are just repeating the logic used previously to calculate the number of spaces to use between words. The n-1 adjustment is needed because the extra space occurs between words, i.e. the 3rd word occurs after the 2nd block of space, the 4th word occur after the 3rd block of space, etc. The length of the original text string is 25 characters and n is 3, so we have:
=(3-1)*25+1 =2*25+1 =51
The num_chars argument is provided as the original length of B5, which is 25. With this information, the MID function extracts 25 characters, beginning with character 51. The result returned directly to the TRIM function:
=TRIM(" devil ")
TRIM strips the space and returns the final result: "devil".
Text to Columns
Don't forget that Excel has a built-in Text to Columns feature that can split text according to the delimiter of your choice. If you just need to get the 3rd word from a lot of text strings, this formula may be more convenient (and dynamic), but Text to Columns is still useful in many situations.