Understanding U+ in Emoji U+1F600

Asked 1 year ago, Updated 1 year ago, 261 views

The grinning face emoji is U+1F600 in Unicode, but does the U+ stand for Unicode?
Does it mean "Unicode 1F600"?
I'd like to convert it to binary and deepen my understanding.

character-code unicode

2022-10-10 01:00

2 Answers

"Unicode 1F600" is correct.
(Though I don't see how that connects to "I'd like to convert it to binary and deepen my understanding".)

If you are still in doubt, you can check the original specification as follows:

If you search for the keyword "unicode specification", you will find a page called "About the Unicode® Standard". It has a "Latest Version" link in the upper left corner, which leads to the latest specification:
Unicode® 15.0.0
In the left menu, "15.0.0 Chapters" lists the title of each chapter with a link to its PDF.

The first place where this is described is probably here:
2 General Structure
Section "2.4 Code Points and Characters" on page 29 ends with the following statement:

When referring to code points in the Unicode Standard, the usual practice is to refer to them by their numeric value expressed in hexadecimal, with a "U+" prefix. (See Appendix A, Notational Conventions.)
That is, a code point in the Unicode Standard is normally written as a hexadecimal number with a "U+" prefix.

The Appendix A mentioned above is the first item under "15.0.0 Appendices and Back Matter":
A Notational Conventions
The relevant text is on the third page of that PDF, at the top of the page numbered 968:

A.1 Typographic Conventions
Code Points
In running text, an individual Unicode code point is expressed as U+n, where n is four to six hexadecimal digits, using the digits 0–9 and uppercase letters A–F (for 10 through 15, respectively). Leading zeros are omitted, unless the code point would have fewer than four hexadecimal digits—for example, U+0001, U+0012, U+0123, U+1234, U+12345, U+102345.
In other words, in running text an individual Unicode code point is written as U+n, where n is four to six hexadecimal digits using the digits 0 to 9 and the uppercase letters A to F (for 10 to 15). Leading zeros are omitted unless the code point would otherwise have fewer than four hexadecimal digits (e.g. U+0001, U+0012, U+0123, U+1234, U+12345, U+102345).
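
The notation rule quoted above can be tried out with a short Python sketch. The function name `format_code_point` is a hypothetical helper introduced here for illustration, not something defined by the Unicode Standard; it assumes the input is an integer in the Unicode range 0 to 10FFFF (hexadecimal).

```python
def format_code_point(value: int) -> str:
    """Return the U+n notation for an integer code point (hypothetical helper)."""
    if not 0 <= value <= 0x10FFFF:
        raise ValueError(f"not a Unicode code point: {value}")
    # "04X" means uppercase hexadecimal, zero-padded to at least four digits.
    # Longer values are never truncated, so five- and six-digit values pass through.
    return f"U+{value:04X}"


# The sample values from Appendix A, given as plain integers.
for sample in (0x1, 0x12, 0x123, 0x1234, 0x12345, 0x102345):
    print(format_code_point(sample))
```

Under these assumptions the loop prints the same six examples that Appendix A lists, from U+0001 to U+102345.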

The Japanese Wikipedia article also covers this:
Unicode - Wikipedia
The section on the character set contains the following:

When a Unicode code position is written in running text, it is written as "U+" followed by the four to six hexadecimal digits of the code position.


2022-10-10 01:00

Does U+ mean Unicode?

That's right.
It is not part of the number, just a marker. According to "Re: Origin of the U+nnnn notation", they originally wanted to use the symbol "⊎" (U+228E MULTISET UNION).

So you can ignore the U+ part when you treat it as a number. For U+1F600, the hexadecimal number 1F600 is the code position (code point).
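
Since the question mentions converting to binary, here is a minimal Python sketch of that step. `parse_code_point` is a hypothetical helper name; it simply drops the "U+" marker and reads the rest as a hexadecimal number, and it does not validate the digits as strictly as the Unicode notation rules do.

```python
import unicodedata


def parse_code_point(notation: str) -> int:
    """Turn a string such as "U+1F600" into an integer (hypothetical helper)."""
    if not notation.upper().startswith("U+"):
        raise ValueError(f"missing U+ prefix: {notation!r}")
    # The prefix is only a marker, so everything after it is the hexadecimal value.
    return int(notation[2:], 16)


code_point = parse_code_point("U+1F600")
binary_digits = format(code_point, "b")            # binary digits without the "0b" prefix
character_name = unicodedata.name(chr(code_point))  # name from Python's Unicode database

print(code_point)      # 128512
print(binary_digits)   # 11111011000000000
print(character_name)  # GRINNING FACE
```

This is the binary form of the code point itself. The bytes actually stored in a file depend on the encoding used and are a separate matter.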


2022-10-10 01:00


