Explanation
In this example, the goal is to sum Amounts in column C by state using the two-letter codes in column E. Note the states are abbreviated, "CA" is California, "FL" is Florida, "TX" is Texas, and "WA" is Washington. The challenge in this case is that the state abbreviations are embedded in a text string. This means we need to construct criteria that performs a "contains" type match. To solve this problem, you can use either the SUMIF function or the SUMIFS function with a wildcard. If you need a case-sensitive formula, you can use the SUMPRODUCT function with the FIND function. All three approaches are explained below.
Note: this example pulls together a number of ideas, which makes it more advanced. If you find this example challenging, this formula is a simpler introduction to the same idea, without using text from another cell for criteria.
Data in an Excel Table
For convenience, all data is in an Excel Table named data in the range B5:C16. This allows us to use a structured reference like data[Amount] to refer to the Amount column without using an absolute reference. It also means that the formula will continue to work correctly if new data is added to the table.
Wildcards
Excel functions like SUMIF and SUMIFS support the wildcard characters "?" (any one character) and "*" (zero or more characters), which can be used in criteria. Wildcards allow you to create criteria such as "begins with", "ends with", "contains 3 characters" and so on. The table below shows some possible wildcard configurations. For the data in this problem, we want to use the "Cells that contain text in A1" pattern, which uses two asterisks (*) concatenated to a cell reference like "*"&A1&"*".
Target | Criteria |
---|---|
Cells with 3 characters | "???" |
Cells equal to "xyz", "xaz", "xbz", etc | "x?z" |
Cells that begin with "xyz" | "xyz*" |
Cells that end with "xyz" | "*xyz" |
Cells that contain "xyz" | "*xyz*" |
Cells that contain text in A1 | "*"&A1&"*" |
Note that wildcards are enclosed in double quotes ("") when they appear in criteria. Also notice that cell references are not enclosed in double quotes. Instead, they are concatenated to wildcards.
SUMIFS solution
One way to solve this problem is with the SUMIFS function. SUMIFS can handle multiple criteria, and the generic syntax for a single condition looks like this:
=SUMIFS(sum_range,range1,criteria1)
Notice that the sum range always comes first in the SUMIFS function. Following this pattern, the formula in cell F5 of the worksheet shown looks like this:
=SUMIFS(data[Amount],data[Location],"*, "&E5&" *")
The sum range is data[Amount] and the criteria range is data[Location]. The criteria itself is the tricky part of this formula:
"*, "&E5&" *" // criteria includes comma and space
Notice the text and wildcard (*) are both enclosed in double quotes ("") and we are including a comma and space (", ") on the left and a space (" ") on the right to prevent false matches. These extra characters make this criteria more specific, so that we don't accidentally match "CA" in another word, like "cat" or "FL" in "flood". This is important partly because the SUMIFS and SUMIF function are not case-sensitive. Also notice that the cell reference E5 is not enclosed in double quotes. Instead, it is concatenated to wildcards on either side. We can't use criteria like "*, E5 *" because that will match only the literal text ", E5 ".
When Excel evaluates the SUMIF formula, it will retrieve the value from E5 ("CA"), perform concatenation, and assemble the criteria into a single text string like this:
=SUMIFS(data[Amount],data[Location],"*, CA *")
When the formula is entered in cell F5, it returns $600, the sum for amounts in California ("CA"). As the formula is copied down, it returns a sum of amounts for each state in column E.
SUMIF solution
This problem can also be solved with the SUMIF function, where the equivalent formula is:
=SUMIF(data[Location],"*, "&E5&" *",data[Amount])
Note that sum_range comes last in the SUMIF function. However, the criteria is identical to what we used in SUMIFS above. The result returned by SUMIF is the same, $600 in cell F5 for the sum of amounts in California.
Case-sensitive solution
As mentioned above, the SUMIF and SUMIFS functions are not case-sensitive. If you need a case-sensitive solution, you can use a formula based on the SUMPRODUCT function and the FIND function like this:
=SUMPRODUCT(--ISNUMBER(FIND(E5,data[Location]))*data[Amount])
Inside SUMPRODUCT, the left side of the expression tests for state with ISNUMBER and FIND:
--ISNUMBER(FIND(E5,data[Location]))
The FIND function is case-sensitive by default and returns the position of find_text as a number when found, and a #VALUE! error when not found. We do not need to use a wildcard character like (*) because FIND automatically performs a "contains" type search for a substring. Because there are 12 values in data[Location], FIND returns an array of 12 results like this:
{#VALUE!;14;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;12;10;#VALUE!;#VALUE!}
Notice that FIND returns numbers for the second, ninth, and tenth rows in the data. These are the rows where the state is "CA". The ISNUMBER function converts the results from FIND into TRUE and FALSE values, and the double negative (--) converts the TRUE and FALSE values to 1s and 0s. Inside the SUMPRODUCT we now have:
=SUMPRODUCT({0;1;0;0;0;0;0;0;1;1;0;0}*data[Amount])
Note: technically, the double negative (--) is unnecessary in this formula, because multiplying the TRUE and FALSE values by the numeric values in data[Amount] will automatically convert TRUE and FALSE values to 1s and 0s. However, the double negative does no harm and it makes the formula a bit easier to understand, because it signals a Boolean operation.
When the two arrays are multiplied together, the zeros in the first array act like a filter to cancel out amounts for all states except "CA":
=SUMPRODUCT({0;225;0;0;0;0;0;0;200;175;0;0})
With just one array to process, SUMPRODUCT sums the values in the array and returns 600 as a final result. As the formula is copied down, we get a case-sensitive sum of amounts for each state abbreviation in column E.
One thing you might notice in this formula is that we are not concatenating the comma and space to the cell value E5, because it is not likely that the uppercase "CA" will match the wrong text. However, we could add this step to make the search more specific:
=SUMPRODUCT(ISNUMBER(FIND(", "&E5&" ",data[Location]))*data[Amount])
As mentioned above, there is no need to use a wildcard in this case because FIND automatically looks for find_text as a substring that might appear anywhere in the text. For a more detailed explanation of FIND + ISNUMBER see this article.
Note: In Excel 365, you can replace SUMPRODUCT with the SUM function. To read more about this, see Why SUMPRODUCT?