Summary

To use a regex pattern in an XLOOKUP formula, you can enable "regex match" as the match mode and then provide a regex pattern in the lookup value. In the worksheet shown, the formula in F5 looks like this:

=XLOOKUP("[A-Z]{3}"&F4&"[A-Z]{2}",B5:B16,C5:C16,,3)

This formula matches the number entered in cell F4 against the product codes in column B using a regex pattern. With the number 56 in cell F4, the result is $46.00, the correct price for product code PQR56DE.

Regex support was added to XLOOKUP in December 2024, so this feature is only available in Excel 365 for now. For an overview of regex in Excel see see Regular Expressions in Excel.

Generic formula

=XLOOKUP(regex_pattern,lookup_array,return_array,,3)

Explanation 

In this example, the goal is to look up the correct price of the product number entered in cell F4 using the product codes in column B as the lookup values. This problem is trickier than it looks. Each product code begins with 3 uppercase letters and ends with 2 uppercase letters. In the middle of the product code is a number between 2 and 4 digits. This is the number we want to use for a lookup value. Let's look at how to solve this problem with XLOOKUP and the newly released "regex match" feature. To provide some context on why you might need to use regex, let's start by looking at some XLOOKUP formulas that don't work. All formulas below refer to the worksheet shown above. Download the workbook above and follow along.

Simple XLOOKUP formula

By default, XLOOKUP will perform an exact match. If we search for 56, XLOOKUP will simply return a #N/A error because 56 does not appear as a lookup value:

=XLOOKUP("56",B5:B16,C5:C16) // returns #N/A
  • lookup_value - "56"
  • lookup_array - B5:B16 (the range containing the product codes)
  • return_array - C5:C16 (the range containing the prices)

Note that I've enclosed the "56" in double quotes because the product codes are text values, not numbers. But whether we look for the number 56 or the text "56", the result will be #N/A because XLOOKUP is performing an exact match, and those values do not exist by themselves as product codes.

XLOOKUP with a wildcard match

If you know a little more about XLOOKUP, you might wonder if we can use a wildcard match. Indeed, you can enable a "wildcard character match" by providing the number 2 for match_mode argument like this:

=XLOOKUP("56",B5:B16,C5:C16,,2) // enable wildcard match

The wildcard match doesn't do anything by itself without wildcards, so the formula above will also return #N/A. To actually use a wildcard match, we need to provide some wildcards. A typical approach involved adding some asterisks (*) like this:

=XLOOKUP("*56*",B5:B16,C5:C16,,2) // returns 78
  • lookup_value - "*56*"
  • lookup_array - B5:B16 (product codes)
  • return_array - C5:C16 (prices)
  • if_not_found - omitted
  • match_mode - 2 (wildcard match)

The asterisks (*) are wildcards that will match "zero or more characters". What we are doing here is looking for some number of characters, then "56," then some number of characters. This seems like it should work. However, the formula returns 78, which is not correct. Why? Well, if "56" only appeared once in the product codes, it would work. However, "56" appears inside three product codes: KP563MN, QR56DE, and HJ3456TU.  What's happening here is that XLOOKUP is matching the first product code that contains 56, KP563MN. Standard XLOOKUP behavior when there are multiple matches. Can we make XLOOKUP find the right code? Well, we could take the wildcard match idea a bit further by switching from "*" to the "?" like this:

=XLOOKUP("???56??",B5:B16,C5:C16,,2) // returns 46

The "?" wildcard means "one character of any kind".  The literal meaning of this wildcard pattern is "3 characters, followed by 56, followed by 2 characters. In fact, the formula above does return the correct price for PQR56DE, which is $46.00. However, there are some problems with this approach:

  1. Although PQR56DE has 2 letters ("DE") after the 56, not all product codes follow this pattern. Some have two letters at the end, and some have three letters.
  2. We could have a product code like ABC5612XY,  which will match the pattern above because the "?" will match any character, even numbers. The "?" wildcard makes no distinction between letters and digits.

In summary,  with basic wildcards, we don't have a good way to create a pattern that will reliably match all product codes. It's time to roll out the big guns: Regular Expressions, called "regex" for short. 

What is regex? Regex, short for Regular Expressions, is a powerful tool for pattern matching in text data. Using a combination of metacharacters, literal characters, character classes, and quantifiers, you can define complex search patterns to extract, validate, or manipulate text data. Regular Expressions have been around for decades, but only recently arrived in Excel. The main benefit of regex in Excel is the ability to work with text very precisely without resorting to complicated formulas that are hard to create and maintain. If you are new to Regular Expressions, see this overview.

XLOOKUP with a regex match

To enable a regex match in XLOOKUP, provide 3 for the fifth argument, called match_mode. Going back to the original example above, once we enable regex, we have this formula:

=XLOOKUP("56",B5:B16,C5:C16,,3) // regex enabled

Its interesting to note that this formula by itself returns $78.00. This is the price for KPX563MN, the first product code that contains "56". In other words, just by enabling regex, we get a working "contains" type match. This is cool because it means we can get XLOOKUP to do a "contains" type match by providing 3 for match_mode. We don't need to use any wildcards or special symbols. However, the result above is incorrect for the same reason we saw above: Three codes contain 56, and XLOOKUP matches the first code that contains 56.

Actually, we don't even need to provide 56 as a text value like "56", because it will get evaluated as text automatically inside in the regex engine. We'll get the same result if we use the number 56 as the lookup value.

To get the formula working correctly, we need to beef up the regex pattern so that we aren't accidentally matching the wrong product code. The first step is to add a pattern to match the beginning of the code. As noted above, the product codes always start with 3 uppercase characters. We can enforce this pattern in our lookup formula by adding "[A-Z]{3}" to the start:

=XLOOKUP("[A-Z]{3}56",B5:B16,C5:C16,,3) // returns 78

The {3} is a quantifier specifying 3 uppercase letters A-Z. This is already quite a bit more robust than our best wildcard formula above because XLOOKUP will now only match the 56 when it comes directly after 3 uppercase letters. Unfortunately, this formula also returns an incorrect result because KPX563MN is the first code to pass this test. To prevent this problem, we need to add a pattern to match the uppercase letters at the end of the product code. This is tricker. We know the code will end with uppercase letters, but there might be 2 characters, and there might be 3. This is where the power of regex patterns starts to really shine. To match 2-3 uppercase letters at the end, we can add "[A-Z]{2,3}" to the end of the pattern. The formula now looks like this:

=XLOOKUP("[A-Z]{3}56[A-Z]{2,3}",B5:B16,C5:C16,,3) // returns 46
  • lookup_value - "[A-Z]{3}56[A-Z]{2,3}" (regex pattern)
  • lookup_array - B5:B16 (product codes)
  • return_array - C5:C16 (prices)
  • if_not_found - omitted
  • match_mode - 3 (regex match)

The quantifier {2,3} means a minimum of 2 and a maximum of 3. The translation of the full pattern is "3 uppercase letters A-Z, followed by 56, followed by 2-3 uppercase letters A-Z". With this adjustment, the formula finally returns the correct result of 46. Better yet, we can use the same pattern to match any of these codes using only the number part as the lookup value. We now have a working solution, but of course, we don't want the number 56 hardcoded in the formula. The final step is to adjust the formula to get the lookup number from cell F4. We do this by concatenating a reference to F4 inside the regex pattern like this:

=XLOOKUP("[A-Z]{3}"&F4&"[A-Z]{2,3}",B5:B16,C5:C16,,3)

This is the final working formula seen in the worksheet shown. When a user types a new valid number in cell F4, the formula will return a new result.

To learn more about the symbols available for regex patterns, see the cheat sheet near the bottom of this page.

Dave Bruns Profile Picture

AuthorMicrosoft Most Valuable Professional Award

Dave Bruns

Hi - I'm Dave Bruns, and I run Exceljet with my wife, Lisa. Our goal is to help you work faster in Excel. We create short videos, and clear examples of formulas, functions, pivot tables, conditional formatting, and charts.