Explanation
In this example, the goal is to check for "strong" passwords. What makes a password strong depends on the rules it must follow. In this case, a strong password must meet the following six conditions:
- At least 8 and not more than 15 characters long
- Contains at least one uppercase (A-Z) letter
- Contains at least one lowercase (a-z) letter
- Contains at least one number (0-9)
- Contains at least one punctuation character
- Does not contain whitespace
Traditionally, this has been a difficult problem in Excel because there is no simple way to implement the logic. Each rule must be checked with a combination of functions, resulting in a large and unwieldy formula. However, with the introduction of the REGEXTEST function, this problem can be solved in a straightforward fashion by implementing all rules in a single regular expression, or "regex" for short.
REGEXTEST function
Excel's REGEXTEST function tests for the existence of text defined by a given pattern. Regex patterns are very flexible and can be configured to match numbers, email addresses, dates, and other values that have an identifiable structure. The result from REGEXTEST is TRUE or FALSE. For example, the formulas below show how REGEXTEST can be used to test the text in A1 for a number or an uppercase character:
=REGEXTEST(A1,"[0-9]") // test for a number
=REGEXTEST(A1,"[A-Z]") // test for an uppercase character
You can read more about REGEXTEST here.
Strong password validation
The problem in this example is to test various passwords to determine if they are strong. Specifically, we want to check that each password passes the following six rules:
- Is at least 8 and not more than 15 characters long
- Contains at least one uppercase (A-Z) letter
- Contains at least one lowercase (a-z) letter
- Contains at least one number (0-9)
- Contains at least one punctuation character
- Does not contain any whitespace characters
To validate these rules using the REGEXTEST function, we need to supply an appropriate regex pattern. In the worksheet shown, the formula used to perform this test looks like this:
=REGEXTEST(B5,"^(?=.*[A-Z])(?=.*[a-z])(?=.*\d)(?=.*[^\w\s])[^\s]{8,15}$")
The first argument, text, comes from cell B5. The second argument, pattern, is the regular expression used to validate a strong password that follows the rules above. The REGEXTEST function is simple, but you can see that the regex pattern is not :) Here is how the pattern breaks down:
- ^: Start of string.
- (?=.*[A-Z]): At least one uppercase letter.
- (?=.*[a-z]): At least one lowercase letter.
- (?=.*\d): At least one digit.
- (?=.*[^\w\s]): At least one special character (not a word character or space).
- [^\s]{8,15}: At least 8 to 15 characters, none of which are whitespace.
- $: End of string.
The key to this pattern is the "positive lookahead" syntax: (?=...), a special way to check that a pattern exists in a string without including it in a match.
Lookahead syntax
As mentioned above, a "lookahead" in regex is a way to assert that a certain pattern exists in a string without "consuming characters" (i.e., without including them in the match). Lookaheads are useful when you want to check for specific conditions within a string without actually capturing those conditions in the match result. Regex provides both positive and negative lookahead:
- Positive Lookahead: (?=...) - The pattern inside (...) must exist ahead.
- Negative Lookahead: (?!...) - The pattern inside (...) must not exist ahead.
For example, to check for the presence of at least one uppercase letter (A-Z) somewhere ahead in a string, you can use
(?=.*[A-Z])
This lookahead ensures the condition is true, but it does not text in the match. Lookaheads do not move the cursor forward in the string or consume characters. They just check if the required pattern is present. Because lookaheads do not consume characters, they can be combined to enforce classic AND-style logic. The pattern used above includes four positive lookaheads:
- (?=.*[A-Z]): Ensures at least one uppercase letter somewhere in the string.
- (?=.*[a-z]): Ensures at least one lowercase letter somewhere in the string.
- (?=.*\d): Ensures at least one digit somewhere in the string.
- (?=.*[^\w\s]): at least one punctuation character somewhere in the string.
If any condition is false, the pattern will fail, and REGEXTEST will return FALSE. Finally, after these lookaheads, the pattern [^\s]{8,15} checks for a string length between 8 and 15 characters. This pattern allows any character except whitespace. However, because the lookaheads have already been verified, we know the password contains uppercase, lowercase, numbers, and punctuation.
Note: When you use multiple lookaheads, they check for conditions without moving the cursor or consuming characters. Each lookahead runs independently, and the regex engine evaluates them all before proceeding, so the order doesn't matter. However, when we check the length with [^\s]{8,15}, this pattern does consume characters and needs to occur at the end of the pattern.
Formula for older versions of Excel
Can you perform validation like this with a formula in an older version of Excel? Yes, but it's more work. One way to do it is to enter the letters a-z (lowercase) in a range somewhere, then name the range "letters". Then, you can use a brute-force formula like this:
=AND(
LEN(B5)>=8,
LEN(B5)<=15,
COUNT(FIND({0,1,2,3,4,5,6,7,8,9},B5))>=1,
COUNT(FIND(letters,B5))>=1,
COUNT(FIND(UPPER(letters),B5))>=1,
COUNT(FIND({"!","@","#","$","%","^","&","*","(",")","_","+"},B5))>=1,
ISERROR(FIND(" ",B5))
)
At a high level, we are using the AND function to run multiple tests to validate the password in cell B5. If all tests return TRUE, the AND function will return TRUE. Otherwise, AND will return false. In all, we run seven tests. First, we check that the length of the password is at least 8 characters with the LEN function:
LEN(B5)>=8
Next, we check that the length is less than or equal to 15 characters:
LEN(B5)<=15
If both tests return TRUE, we know the password is between 8-15 characters. Next, we use the FIND function with the COUNT function to test for at least one number:
COUNT(FIND({0,1,2,3,4,5,6,7,8,9},B5))>=1
FIND is configured to look for the numbers 1-9 inside the text in cell B5. When FIND locates a number, it returns its numeric position. When a number is not found, FIND returns a #VALUE error. The COUNT function then returns a count of numeric positions. If the count is greater than or equal to 1, the expression returns TRUE. Otherwise, it returns FALSE. For a more detailed explanation, see this page.
The test for a lowercase letter is performed in the same way, with COUNT and FIND:
COUNT(FIND(letters,B5))>=1,
Because the named range "letters" contains all letters, a-z, you can think of it like this:
COUNT(FIND({"a","b","c"..."x","y","z",B5))>=1,
Essentially, we are asking FIND to find all 26 letters in the range. Since FIND is case-sensitive, it will only match lowercase letters. As before, when FIND locates a match, it returns its numeric position. When a match is not found, FIND returns a #VALUE error. The COUNT function returns a count, which we check with >=1. The test for an uppercase letter works the same way. The only difference is that we use the UPPER function to uppercase all letters before FIND begins its search:
COUNT(FIND(UPPER(letters),B5))>=1
The punctuation test is the same, but this time, we hardcode the characters we to look for into the formula as an array constant:
COUNT(FIND({"!","@","#","$","%","^","&","*","(",")","_","+"},B5))>=1
You can customize this list as desired. Another option is to create a named range "punctuation" that contains all special characters and use that instead of the array constant. Finally, we test for whitespace with this snippet:
ISERROR(FIND(" ",B5))
If FIND locates a space, it will return its position, otherwise it will return #VALUE!. We want to see the error at this point because it confirms no space characters in the password. ISERROR converts the error into TRUE. If FIND does locate a space, it returns a numeric position and ISERROR returns FALSE.
Again, this is an array formula and must be entered with control + shift + enter in Excel 2019 and older. You can avoid this requirement if you harcode the 26 lower and 26 uppercase characters into the formula instead of using the named range "letters".