Summary

To decode a string into its Unicode sequence, you can use the UNICODE function combined with the REGEXEXTRACT, BASE, and TEXTJOIN functions. For example, to convert the string "apple" into its Unicode sequence, you can use the following formula:

=TEXTJOIN(" ",,BASE(UNICODE(REGEXEXTRACT("apple",".",1)),16,4))

This will return the Unicode sequence "0061 0070 0070 006C 0065", which represents the Unicode code points for each character in "apple".

Generic formula

=TEXTJOIN(" ",,BASE(UNICODE(REGEXEXTRACT(str,".",1)),16,4))

Explanation 

In this example, the goal is to convert each character in a text string into its corresponding Unicode code point and display the results as a space-separated sequence. This problem can be solved using several Excel functions working together to extract, convert, and format the Unicode values.

The formula explained

The formula processes the input string through several steps:

=TEXTJOIN(" ",,BASE(UNICODE(REGEXEXTRACT(B6,".",1)),16,4))

Working from the inside out:

  1. REGEXEXTRACT(B6,".",1) - Extracts each character individually {"a";"p";"p";"l";"e"}
  2. UNICODE(REGEXEXTRACT(...)) - Converts each character to its Unicode code point as decimal values {97;112;112;108;101}
  3. BASE(UNICODE(...),16,4) - Converts each decimal number to hexadecimal (hex) text, padded with zeros to at least 4 digits {"0061";"0070";"0070";"006C";"0065"}
  4. TEXTJOIN(" ",, ...) - Joins all values with a space as the separator and returns "0061 0070 0070 006C 0065"
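
For readers who want to cross-check a result outside Excel, the four steps above can be mirrored in a few lines of Python. This is a minimal sketch, not part of the Excel formula. The function name unicode_sequence is a hypothetical helper, and the sketch assumes Python 3, where iterating over a string yields one code point at a time.

```python
def unicode_sequence(text: str) -> str:
    """Return the code points of text as space-separated hex values."""
    # Step 1: split the string into individual characters (code points).
    chars = list(text)
    # Step 2: convert each character to its decimal code point, like UNICODE.
    code_points = [ord(ch) for ch in chars]
    # Step 3: convert to uppercase hex, zero-padded to at least 4 digits,
    # like BASE(n,16,4).
    hex_values = [format(cp, "04X") for cp in code_points]
    # Step 4: join the values with single spaces, like TEXTJOIN(" ",,...).
    return " ".join(hex_values)


print(unicode_sequence("apple"))  # 0061 0070 0070 006C 0065
```

As with BASE, the padding is a minimum width only, so a code point above FFFF is returned with all five of its hex digits.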

The advantage of this formula over the standard UNICODE function is that it processes the entire string and returns the complete Unicode sequence for all characters. This makes it better for analyzing and understanding how entire text strings are represented internally.

For example, the standard UNICODE function returns the code point for only the first character of a string.

=BASE(UNICODE("apple"),16,4) // returns 0061

In contrast, this formula returns the complete Unicode sequence for all characters:

=TEXTJOIN(" ",,BASE(UNICODE(REGEXEXTRACT("apple",".",1)),16,4)) // returns "0061 0070 0070 006C 0065"

Example #1 - Basic examples

Here are some examples of text strings and their corresponding Unicode sequences:

[Figure: Basic examples of getting Unicode sequences from text]

Note: As of this writing in October 2025, emojis in Excel Online (i.e. the web app) appear in color, but emojis in the desktop version of Excel appear in black and white only.

Example #2 - Text with special characters

A practical use-case of this formula is to decode a string that contains special characters into its corresponding Unicode sequence.

For example, consider the mathematical formula "A = πr²" that describes the area of a circle. This string contains two characters, "π" and "²", that are hard to type on a standard keyboard. The formula decodes the string into its Unicode sequence:

=TEXTJOIN(" ",,BASE(UNICODE(REGEXEXTRACT("A = πr²",".",1)),16,4)) // returns "0041 0020 003D 0020 03C0 0072 00B2"

Breaking down the Unicode sequence the formula returns:

  • 0041 - "A" (Latin capital letter A)
  • 0020 - " " (space)
  • 003D - "=" (equals sign)
  • 0020 - " " (space)
  • 03C0 - "π" (Greek small letter pi - U+03C0)
  • 0072 - "r" (Latin small letter r)
  • 00B2 - "²" (superscript two - U+00B2)

There are two characters of interest: "π" (Greek small letter pi - U+03C0) and "²" (superscript two - U+00B2). If you don't recognize these characters and want to find out more about them, you can search online for the code point (for example, "U+03C0"). Knowing the code points is also useful for going the other way: converting a Unicode sequence back into a string with the UNICHAR function, or with a formula that reverses this one.
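
Both ideas, looking up a code point and converting a sequence back into text, can also be sketched in Python as a cross-check outside Excel. This is a minimal sketch under stated assumptions: describe_sequence and sequence_to_text are hypothetical helper names, the input is assumed to be space-separated hex values like the formula returns, and the names come from the Unicode database bundled with Python's standard unicodedata module.

```python
import unicodedata


def describe_sequence(sequence: str) -> list:
    """Map each hex code point in a space-separated sequence to its Unicode name."""
    names = []
    for hex_value in sequence.split():
        char = chr(int(hex_value, 16))
        # The second argument is a fallback for code points without a name.
        names.append(f"U+{hex_value} {unicodedata.name(char, 'UNKNOWN')}")
    return names


def sequence_to_text(sequence: str) -> str:
    """Rebuild a string from its hex code points (the reverse direction)."""
    return "".join(chr(int(hex_value, 16)) for hex_value in sequence.split())


print(describe_sequence("03C0 00B2"))
# ['U+03C0 GREEK SMALL LETTER PI', 'U+00B2 SUPERSCRIPT TWO']
print(sequence_to_text("0041 0020 003D 0020 03C0 0072 00B2"))
# A = πr²
```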

Example #3 - Text with combining characters

The Unicode standard defines a wide range of characters for expressing the written forms of many languages around the world in addition to emojis, mathematical symbols, and other special characters. Sometimes, this text can contain combining characters that modify how other characters are displayed but don't appear as separate visual elements.

For example, the heart emoji "❤️" is actually a combination of the basic heart symbol "❤" (U+2764) plus a variation selector (U+FE0F) that tells the system to display it as an emoji rather than a text symbol.

=TEXTJOIN(" ",,BASE(UNICODE(REGEXEXTRACT("❤️",".",1)),16,4)) // returns "2764 FE0F"

This is why you see two code points for what appears to be a single character. You can always do a sanity check by checking the length of the text string too.

=LEN("❤️") // returns 2
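
One caveat is worth checking in your own version of Excel: a length check agrees with the number of code points only if the length function counts code points. The Python sketch below compares two ways of counting. It assumes (this is not stated above) that a length function may count UTF-16 code units instead, in which case a single code point above U+FFFF counts as 2. The helper names are hypothetical.

```python
def code_point_count(text: str) -> int:
    """Count Unicode code points (Python strings are sequences of code points)."""
    return len(text)


def utf16_unit_count(text: str) -> int:
    """Count UTF-16 code units; code points above U+FFFF take two units each."""
    # Each UTF-16 code unit is 2 bytes; "utf-16-le" adds no byte order mark.
    return len(text.encode("utf-16-le")) // 2


heart = "\u2764\uFE0F"    # heart symbol + variation selector
bandage = "\U0001FA79"    # a single code point above U+FFFF

print(code_point_count(heart), utf16_unit_count(heart))      # 2 2
print(code_point_count(bandage), utf16_unit_count(bandage))  # 1 2
```

For the heart emoji both counts are 2, so the check is unambiguous. For code points above U+FFFF the two counts differ, so the length alone does not tell you how many code points the formula will return.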

Now let's look at a more complex example. The mending heart emoji "❤️‍🩹" is a sequence of four Unicode code points:

=TEXTJOIN(" ",,BASE(UNICODE(REGEXEXTRACT("❤️‍🩹",".",1)),16,4)) // returns "2764 FE0F 200D 1FA79"

This is because the mending heart emoji is built from two emojis: "❤️" and "🩹". The "❤️" is the heart emoji (U+2764 followed by U+FE0F) and the "🩹" is the adhesive bandage emoji (U+1FA79). These two emojis are combined into a single emoji by the "zero-width joiner" character (U+200D), which sits between them.
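
To see how the four code points group together, the sequence can be split at the zero-width joiner. The Python sketch below is one way to do that outside Excel. The helper name split_zwj_sequence is hypothetical, and the string is written with escape sequences so that the invisible characters are explicit.

```python
import unicodedata

ZWJ = "\u200D"  # zero-width joiner


def split_zwj_sequence(text: str) -> list:
    """Split text at each zero-width joiner and return one hex sequence per part."""
    return [
        " ".join(format(ord(ch), "04X") for ch in part)
        for part in text.split(ZWJ)
    ]


mending_heart = "\u2764\uFE0F\u200D\U0001FA79"

print(split_zwj_sequence(mending_heart))  # ['2764 FE0F', '1FA79']

# List every code point with its name from Python's bundled Unicode database.
for ch in mending_heart:
    print(format(ord(ch), "04X"), unicodedata.name(ch, "UNKNOWN"))
```

The first part is the heart with its variation selector, and the second part is the single code point U+1FA79.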

Author

Dave Bruns

Hi - I'm Dave Bruns, and I run Exceljet with my wife, Lisa. Our goal is to help you work faster in Excel. We create short videos and clear examples of formulas, functions, pivot tables, conditional formatting, and charts.