Explanation
In this example, the goal is to test if a given range contains duplicate values and return TRUE if duplicates exist and FALSE if not. This is essentially a counting problem and the solution is based on the COUNTIF function, which counts values in a range that meet supplied criteria. The formula used in E5 is:
=OR(COUNTIF(data,data)>1)
where data is the named range B5:B16.
Background study
Below are related links to help you understand how this formula works:
- What is an array formula? - 3 min video
- What is an array? - 3 min video
COUNTIF function
Working from the inside-out, the core of the formula is based on the COUNTIF function:
COUNTIF(data,data)
Here, data (B5:B16) is given for both range and criteria. Typically, criteria is supplied as a single value, but in this case data contains 12 values. The result is that COUNTIF returns 12 counts (one for each value) in a single array like this:
{1;1;2;1;1;1;1;2;1;1;1;1}
In "modern" versions of Excel that support dynamic arrays, you can enter the COUNTIF formula above as a standalone formula and you will see the results spill onto the worksheet. Most values in the array are 1 but notice that the third value and eighth value are 2, which indicate duplicate values. The value 155 occurs twice at these positions in the range, which is why the count for that number is 2.
In this particular problem, we don't care about the specific numbers returned by COUNTIF, we only care if any number is greater than 1. Therefore, we use the greater than operator (>) to check the result:
=COUNTIF(data,data)>1
={1;1;2;1;1;1;1;2;1;1;1;1}>1
The result is an array of TRUE and FALSE values.
{FALSE;FALSE;TRUE;FALSE;FALSE;FALSE;FALSE;TRUE;FALSE;FALSE;FALSE;FALSE}
This is the information we need to solve the problem. If any value in the array is TRUE, it means we have duplicates. If all values are FALSE, it means there are no duplicates. To check the array, we can use the OR function.
OR function
The array above is returned directly to the OR function:
=OR({FALSE;FALSE;TRUE;FALSE;FALSE;FALSE;FALSE;TRUE;FALSE;FALSE;FALSE;FALSE})
The OR function returns TRUE if any given argument evaluates to TRUE, and returns FALSE only if all supplied arguments evaluate to FALSE. The final result is TRUE, since at least one value in the array is TRUE.
Note: COUNTIF will automatically ignore empty cells in this configuration, which return a count of 0.
SUMPRODUCT alternative
To avoid an array formula that requires Control + Shift + Enter, you can use the SUMPRODUCT alternative below:
=SUMPRODUCT(--(COUNTIF(data,data)>1))>0
As above, the counts are checked for any numbers greater than 1, resulting in an array of TRUE and FALSE values. The double negative (--) converts the TRUE and FALSE values to 1s and 0s, and the result is delivered to SUMPRODUCT, which returns the total. Finally, the total from SUMPRODUCT is checked against zero. Since the total is greater than zero, the formula returns TRUE as a final result.
This is still an array formula but the SUMPRODUCT function handles the array operation natively, so it is not necessary to use Control + Shift + Enter.
Count duplicates
To count the number of duplicates in the range you can adapt the formula like this:
=SUMPRODUCT(--(COUNTIF(data,data)>1))
Note: this is also an array formula, but because SUMPRODUCT function can handle the array operation natively, it is not necessary to use control + shift + enter.
The configuration of COUNTIF is the same as the original formula above, so we end up with the same array of TRUE and FALSE values. The double negative (--) converts the TRUE and FALSE values to 1s and 0s, and the result is delivered to SUMPRODUCT, which returns the sum of the array:
=SUMPRODUCT({0;0;1;0;0;0;0;1;0;0;0;0}) // returns 2
Note: the SUM function will also work fine in place of SUMPRODUCT, but the formula be entered with control + shift + enter in older versions of Excel.