Byte size and the seek and read methods

I am currently learning about bytes and file processing.

#Hangul is 2 bytes per character (everything else is 1 byte). 5 Hangul characters are 10 bytes, and 'Studying Python Language' is 10 characters (including spaces), so fd.read(10)
#filename.seek(0) -> start of the file, filename.seek(1) -> current position in the file, filename.seek(2) -> end of the file

#1
filename = '100_text.txt'
with open(filename, encoding='utf-8-sig') as fd:
    fd.read(10)
    a = fd.tell()
a

#--------------------------------------------------------------------------------

#2
with open(filename, 'rb') as fd:
    fd.read(10)
    fd.tell()
    b = fd.tell()
b

#Results -------------------------------------------------------------------------
#1 fd.read(10): 'Studying Python Language'
#1 a: 26
#2 fd.read(10): b'\xed\x8c\x8c\xec\x9d\xb4\x8d\xac'
#2 fd.tell(): 10
#2 b: 10

As far as I know, Hangul is 2 bytes per character, while English letters, spaces, symbols, and so on are 1 byte each.

This is how I understand the read method to work when reading a file.

fd.read(10) reads 10 bytes, so shouldn't it return just 'Python'? (That is only 9 bytes, so how are the 10 bytes filled in this case?)

Also, fd.tell() returns the current position as a byte offset, but the value fd.tell() returns in #1 is 26 bytes.

If it's 2 bytes per character, shouldn't the value be 18?

In #2 I opened the file in rb mode, so it was read as bytes.

fd.read(10) is 10 bytes, so shouldn't it return only 10 characters? I wonder why the value is so long.

I'm sorry that there are so many questions and that they aren't well organized, since this is my first time learning this. I'd really appreciate it if you could answer my questions!

python byte utf-8

2022-10-18 09:00

1 Answer

You are mistaken.

In UTF-8, each Hangul character is encoded as 3 bytes, not 2.
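
To see where a value like 26 can come from, here is a minimal sketch. It assumes a hypothetical sample string (the actual contents of 100_text.txt are not shown in the question) and uses in-memory streams from the io module instead of a real file. In text mode, read(10) counts characters; in binary mode, it counts bytes.

```python
import io

# Hypothetical sample text; the real file contents are not shown in the post.
text = "파이썬 언어 공부하기"

# Text mode: read(10) returns 10 characters, regardless of their byte size.
with io.StringIO(text) as fd:
    chars = fd.read(10)

# Binary mode: read(10) returns exactly 10 bytes of the UTF-8 encoded data.
data = text.encode("utf-8")
with io.BytesIO(data) as fd:
    raw = fd.read(10)
    pos = fd.tell()

print(len(chars))                  # 10 characters: 8 Hangul + 2 spaces
print(len(chars.encode("utf-8")))  # 26 bytes: 8 * 3 + 2 * 1
print(len(raw), pos)               # 10 10
```

With this sample, the 10 bytes read in binary mode are the three Hangul characters of the first word (9 bytes) plus the space that follows it (1 byte).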

Encodings such as cp949 and euc-kr use 2 bytes per Hangul character. English letters and the like are 1 byte.
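
The byte counts above can be checked directly with str.encode. This is only an illustrative sketch: the helper name byte_sizes and the sample characters are made up for the example.

```python
# Hypothetical helper: byte length of one character in several encodings.
def byte_sizes(ch, encodings=("utf-8", "cp949", "euc-kr")):
    return {enc: len(ch.encode(enc)) for enc in encodings}

# Sample characters chosen only for illustration.
for ch in ("한", "A", " "):
    print(repr(ch), byte_sizes(ch))

# '한' {'utf-8': 3, 'cp949': 2, 'euc-kr': 2}
# 'A' {'utf-8': 1, 'cp949': 1, 'euc-kr': 1}
# ' ' {'utf-8': 1, 'cp949': 1, 'euc-kr': 1}
```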

2022-10-18 09:00
