In this example, the goal is to split the text strings in column B into an array of values. In a problem like this, we need to identify the delimiter, which is the character (or characters), that separate each value we want to extract. In this case, each text string in column B contains comma-separated values, so a comma (",") is the delimiter. For example, the comma-separated text in cell B5 looks like this:
There are three basic approaches to solving this problem:
- Use the TEXTSPLIT function as shown in the worksheet (Excel 365)
- Use the FILTERXML function (Legacy Excel, Windows only)
- Use Excel's Text-to-Columns feature (any version)
This article explains the first two approaches in detail.
If you are using Excel 365, the easiest solution is to use the TEXTSPLIT function as shown in the worksheet above. TEXTSPLIT is a great way to solve this problem, because it is so simple to use. Although TEXTSPLIT can take up to six separate arguments, in this case, we only need to provide the first two arguments, text and col_delimiter. The formula in cell D5, copied down, is:
Notice we need to provide the comma as text surrounded by double quotes (","). With this configuration, TEXTSPLIT splits the text in cell B5 into five separate values and returns these values in a horizontal array:
This array is returned to cell D5, and the five values spill into the range D5:H5.
Older versions of Excel do not have a function dedicated to splitting text to an array, similar to the PHP explode function, or Python split method. As a workaround, you can use the FILTERXML function, after first adding XML markup to the text. The formula looks like this:
To use FILTERXML, we need XML, so the first task is to add XML markup to the text. We are going to arbitrarily make each field in the text a <y> element, enclosed with a parent <x> element. We start with the SUBSTITUTE function here:
The result from SUBSTITUTE is a text string like this:
To ensure well-formed XML tags and to wrap all <y> elements in a parent <x> element, we prepend and append more XML tags like this:
This yields a text string like this (line breaks added for readability)
"<x> <y>Jim</y> <y>Brown</y> <y>33</y> <y>Seattle</y> <y>WA</y> </x>"
This text is delivered directly to the FILTERXML function as the xml argument, with an Xpath expression of "//y":
Xpath is a parsing language and "//y" selects all <y> elements. The result from FILTERXML is a vertical array like this:
Because we want a horizontal array in this instance, we wrap the TRANSPOSE function around FILTERXML:
The result is a horizontal array like this:
In older versions of Excel, you can enter this formula as a multi-cell array formula in D5:H5. In Excel 365, the array will spill into the range D5:H5 automatically.
I learned the FILTERXML trick from Bill Jelen in a MrExcel video. FILTERXML is not available in Excel on the Mac, or in Excel Online. This is a nerdy workaround for difficult problems in older versions of Excel.The new TEXTSPLIT function in Excel 365 is a much easier way to solve this problem.
Extract nth item
With either option above, you may want to extract just the nth value from the text string. To do that, you can use the INDEX function. For example, to extract just the age with TEXTSPLIT, you can use:
=INDEX(TEXTSPLIT(B5,","),3) // get age
Notice we have simply nested the original formula inside the INDEX function. With FILTERXML, the formula looks like this:
If you don't want to use FILTERXML, or can't because you are using Excel on a Mac, this example shows another way to split text with a delimiter. This approach requires a bit more setup.
Formulas work great when you need a solution that is dynamic, because formulas will update automatically if data changes. However, if you only need a one-off manual process, you can also use Excel's Text-to-Columns feature.