Explanation
In this example, the goal is to calculate a sum for any given group ("A", "B", or "C") across all three months of data in the range C5:E16. In other words, we want to perform a "sum if" with a data range that contains three columns. For convenience, data (C5:E16) and group (B5:B16) are named ranges. In the article below, we look at several approaches to this problem:
- Why the SUMIFS and SUMIF won't work.
- A solution based on SUM + FILTER
- A solution based on SUMPRODUCT and Boolean algebra
In the latest version of Excel, the FILTER option (#2) is easy and intuitive. In Legacy Excel, you can use solution #3.
SUMIFS won't work
You might be tempted to solve this problem with the SUMIFS function or the SUMIF function. After all, it seems simple enough - we need to check if group (B5:B16) is equal to "A" or "B" or "C", then sum the corresponding data in columns C, D, and E. In fact, we can easily use SUMIFS to calculate a sum for a given group on one month of data. For example, to calculate a sum for group "A" in January, we can use a formula like this:
=SUMIFS(C5:C16,group,"A") // returns 168
However, if we try to expand sum_range to include all three columns in data (C5:E16), we'll get a #VALUE! error:
=SUMIFS(data,group,"A") // returns #VALUE!
Why? The reason is that SUMIFS expects sum_range to be the same size as criteria_range. When we try to use the 1-column range group (B5:B16) with the 3-column range data (C5:E16), SUMIFS returns an error. What about the SUMIF function? If we give the older SUMIF function the entire data range and the same criteria, we don't get an error, but we do get an incorrect result:
=SUMIF(group,"A",data) // returns 168
This happens because SUMIF assumes that sum_range is the same size as range. Basically, SUMIF resizes sum_range to match the range argument. This kind of "silent failure" is dangerous, because the result seems reasonable but is in fact incorrect. You may not like formula errors, but at least they tell you something is wrong :)
SUM with FILTER
In the current version of Excel, a nice solution is the SUM function with the FILTER function. This is the approach used in the worksheet shown, where the formula in cell H5, copied down, is:
=SUM(FILTER(data,group=G5))
And data (C5:E16) and group (B5:B16) are named ranges. Inside the SUM function, the FILTER function is configured to filter the data in C5:E16 with a simple logical expression:
FILTER(data,group=G5)
Because cell G5 contains "A", and group (B5:B16) contains 12 values, the expression returns an array with 12 TRUE and FALSE values like this:
{TRUE;TRUE;TRUE;TRUE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE}
Notice the first four values in the array are TRUE, which corresponds to the first 4 rows in the data, which are in group A. This array is returned to the FILTER function as the include argument, and FILTER uses this array to select the first 4 rows of data (C5:E16).
The result from FILTER is delivered directly to the SUM function as a single array:
=SUM({58,41,48;37,46,32;38,48,38;35,59,46})
SUM returns a final result of 526, the sum of the 12 numbers in the array returned by FILTER. As the formula is copied down, it calculates a different sum for each group, using the value in column G for group.
SUMPRODUCT function
You can use the SUMPRODUCT function to solve this problem in Legacy Excel. The formula looks like this:
=SUMPRODUCT(--(group=G5)*data)
Working from the inside out, the logical expression on the left tests all values in group with the value in G5 ("A"):
--(group=G5) // test all group values
Because we are comparing one value to the 12 values in group, we get an array that contains 12 TRUE and FALSE values:
--{TRUE;TRUE;TRUE;TRUE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE}
We then use a double-negative (--) to convert the TRUE and FALSE values to 1s and 0s. At this point, the result inside of SUMPRODUCT looks like this:
=SUMPRODUCT({1;1;1;1;0;0;0;0;0;0;0;0}*data)
Note: technically, the double negative (--) is unnecessary in this formula, because multiplying the TRUE and FALSE values by the numeric values in data will automatically convert TRUE and FALSE values to 1s and 0s. However, the double negative does no harm and it makes the formula a bit easier to understand, because it signals a Boolean operation.
After multiplication, we have a single array in SUMPRODUCT like this:
=SUMPRODUCT({58,41,48;37,46,32;38,48,38;35,59,46;0,0,0;0,0,0;0,0,0;0,0,0;0,0,0;0,0,0;0,0,0;0,0,0})
Notice the Boolean array acts like a filter to "cancel out" values not associated with group "A". Only the 12 values in group A survive the operation, the other 24 values are converted to zeros. With just a single array to process, SUMPRODUCT sums the array and returns a final result of 526.
Note: although this is an array formula, the SUMPRODUCT formula does not need to be entered in a special way with control + shift + enter, since SUMPRODUCT can handle array operations natively.
Other criteria
Although the criteria needed for this example is simple (test for a specific group), you can adapt the criteria to handle more complex scenarios. For example, to perform a "contains" type search for a substring, you could use a FILTER formula like this:
=SUM(FILTER(data,ISNUMBER(SEARCH(G5,group))))
Or a SUMPRODUCT formula like this:
=SUMPRODUCT(--ISNUMBER(SEARCH(G5,group))*data)
These formulas will sum by group treating the value in G5 as a substring. For more information about this particular logic, see this example.