Get page from URL

To extract the page from a URL (i.e. the part of the URL after the domain name), you can use a formula based on the TEXTAFTER function. In the example shown, the formula in D5 is:

="/"&TEXTAFTER(B5,"/",3)

As the formula is copied down, it returns the part of the path that occurs after the domain name.

Note: TEXTAFTER is a newer function in Excel. In an older version of Excel, you can use a formula based on the MID, FIND, and LEN functions, as explained below.

Explanation

In this example, we have a list of URLs. The goal is to get the portion of each URL that appears after the domain name. In the current version of Excel, the easiest way to do this is to use the TEXTAFTER function. In an older version of Excel, you can use a formula based on the MID, FIND, and LEN functions. Both approaches are explained below.

TEXTAFTER function

The TEXTAFTER function returns the text that occurs after a given delimiter. The generic syntax for TEXTAFTER supports quite a number of options:

=TEXTAFTER(text,delimiter,[instance_num],[match_mode],[match_end],[if_not_found])

However, most of the inputs are optional and for this problem, we only need to provide the first three arguments:

=TEXTAFTER(text,delimiter,instance_num)

In the worksheet shown, the formula in cell D5 is:

="/"&TEXTAFTER(B5,"/",3)

The TEXTAFTER function is configured with the following inputs:

text - the URL in cell B5
delimiter - a forward slash "/"
instance_num - 3, for the third occurrence of "/"

With the text "https://exceljet.net/formulas" in cell B5, TEXTAFTER splits the string at the third "/" and returns "formulas". Next, a forward slash "/" is prepended to the result from TEXTAFTER with concatenation to create a final result that begins with "/". This last step is necessary because TEXTAFTER does not include the delimiter used to split the text, so it needs to be added back manually if desired.
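For readers who want to check the logic outside Excel, the steps above can be sketched in Python. This is only a rough stand-in, not Excel's implementation: the helper name text_after is hypothetical, it handles only positive instance numbers and a single-character delimiter, and the ValueError merely stands in for the error Excel would return when the delimiter is not found.

```python
def text_after(text: str, delimiter: str, instance_num: int = 1) -> str:
    """Return the text after the nth occurrence of delimiter (hypothetical helper)."""
    position = -1
    for _ in range(instance_num):
        # Search for the next occurrence, starting just past the previous one.
        position = text.find(delimiter, position + 1)
        if position == -1:
            # Stand-in for the error Excel shows when there are too few delimiters.
            raise ValueError("delimiter not found often enough")
    # Like TEXTAFTER, the delimiter itself is not included in the result.
    return text[position + len(delimiter):]


url = "https://exceljet.net/formulas"
page = "/" + text_after(url, "/", 3)  # add the "/" back, as the formula does
print(page)  # /formulas
```

As in the worksheet formula, the leading "/" has to be concatenated separately because the split discards the delimiter.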

Legacy Excel

TEXTAFTER is a new function in Excel. In an older version of Excel, you can solve this problem with a formula based on the MID, FIND, and LEN functions:

=MID(B5,FIND("/",B5,9),LEN(B5))

At the core, this formula is extracting characters with the MID function, and using the FIND function to figure out where to begin extracting. First, FIND locates the "/" character in the URL, starting at the 9th character:

FIND("/",B5,9)

This is the "clever" part of the formula. URLs begin with something called a "protocol" (e.g. "http://", "https://", "ftp://", "sftp://", etc.). Because FIND starts searching at the 9th character, it skips the two slashes in the protocol and returns the location of the third instance of "/", which is the first forward slash after the protocol. This works as long as the protocol is no longer than 8 characters. With the text "https://exceljet.net/formulas" in cell B5, the third instance of "/" is the 21st character in the URL, so FIND returns the number 21 to the MID function as the start_num argument. At this point, we have:

=MID(B5,21,LEN(B5))
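The position lookup can be checked with a small Python sketch. The helper find below is a hypothetical, simplified stand-in for FIND (1-based positions, case-sensitive), and the "http://" and "ftp://" URLs are made-up variations of the example, included only to show that starting at character 9 still lands on the slash after the domain.

```python
def find(find_text: str, within_text: str, start_num: int = 1) -> int:
    """Simplified stand-in for Excel's FIND: returns a 1-based position."""
    index = within_text.find(find_text, start_num - 1)  # Python is 0-based
    if index == -1:
        raise ValueError("text not found")  # stand-in for an Excel error
    return index + 1


urls = [
    "https://exceljet.net/formulas",  # the example from the worksheet
    "http://exceljet.net/formulas",   # hypothetical variation
    "ftp://exceljet.net/formulas",    # hypothetical variation
]
positions = {url: find("/", url, 9) for url in urls}
for url, position in positions.items():
    print(url, position)
```

In each case the search begins after the "//" of the protocol, so the first slash found is the one that starts the page.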

To provide a value for the num_chars argument, we use the LEN function, which returns a count of all the characters in B5. This is a "hack" to keep things simple. LEN will return 29 in this case, the total number of characters in the text "https://exceljet.net/formulas". Since extraction begins at character 21, only 9 characters remain (characters 21 through 29), so 29 is more than we need. However, the MID function doesn't care if the number of characters (num_chars) exceeds the remaining string length. MID will just keep extracting characters until the end of the string. In other words, using LEN to provide num_chars is an easy way to give MID a number that is always enough to get the job done. Dropping in the value returned by the LEN function, we now have a formula that looks like this:

=MID(B5,21,29) // returns "/formulas"

The MID function begins extracting at character 21 and extracts all of the remaining text. The final result is "/formulas". Unlike the TEXTAFTER version of the formula above, there is no need to concatenate a "/" to the beginning, since the MID function includes the delimiter in the result.
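The way an oversized num_chars is tolerated can be illustrated with Python slicing, which also stops quietly at the end of a string. The helper mid is a hypothetical, simplified stand-in for MID, assuming a 1-based start_num as in Excel; it is a sketch, not a full reimplementation.

```python
def mid(text: str, start_num: int, num_chars: int) -> str:
    """Simplified stand-in for Excel's MID with a 1-based start position."""
    start = start_num - 1  # convert to Python's 0-based index
    # Slicing past the end of the string is allowed and simply stops at the end.
    return text[start:start + num_chars]


url = "https://exceljet.net/formulas"
with_len = mid(url, 21, len(url))  # num_chars = 29, more than needed
exact = mid(url, 21, 9)            # num_chars = 9, exactly what remains
print(with_len)  # /formulas
print(exact)     # /formulas
```

Both calls return the same text, which is why passing the full length is a convenient shortcut when the exact number of remaining characters is not known in advance.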