Truncating a String to a Byte Limit in UTF-8

Asked 2 years ago, Updated 2 years ago, 44 views

I am developing in C# on .NET Framework 2.0.
I would like to get the number of bytes a string occupies when encoded as UTF-8 and, if that number exceeds a configured byte limit, remove characters from the end until the string fits within the limit.
I came up with the following method, but the number of loop iterations grows in proportion to the length of the string, so I am looking for a faster and safer way that uses built-in C# functions.
Is there a good way to do this?
Thank you in advance.

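// Requires: using System.Text;
string input = "string you want to limit the number of bytes";
// false: do not emit a BOM; true: throw on invalid input instead of substituting
UTF8Encoding utf8 = new UTF8Encoding(false, true);
int byteSize = utf8.GetByteCount(input);
int maxByteSize = 15;
string output = null;
if (byteSize > maxByteSize)
{
    int i = 1;
    while (true)
    {
        // Drop one more trailing char on each pass and count the bytes again.
        output = input.Substring(0, input.Length - i);
        // <= so that a result of exactly maxByteSize bytes is accepted,
        // consistent with the check above.
        if (utf8.GetByteCount(output) <= maxByteSize)
        {
            break;
        }
        i++;
    }
}
else
{
    output = input;
}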

c#

2022-09-30 21:17

1 Answer

How about encoding the string to a byte[] once, truncating it to the desired number of bytes, and then decoding it while discarding any incomplete byte sequence left at the end?

string input = "string you want to limit the number of bytes";
int maxByteSize = 14; // deliberately changed to an awkward value
Encoding encoding = Encoding.GetEncoding(65001, // code page 65001 = UTF-8
    EncoderFallback.ExceptionFallback, // throw if a character cannot be encoded
    new DecoderReplacementFallback("")); // replace undecodable bytes with an empty string

byte[] bytes = encoding.GetBytes(input);
string output = encoding.GetString(bytes, 0, Math.Min(maxByteSize, bytes.Length));
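
As a variation on the same idea, the cut point can be moved back to a character boundary by hand instead of relying on the decoder fallback. In UTF-8, continuation bytes always have the bit pattern 10xxxxxx, so stepping back while the first dropped byte is a continuation byte lands on the start of a character. The following is only a minimal sketch under stated assumptions, not the answerer's code: the names Utf8Truncation and TruncateToByteLimit are hypothetical, and it uses Encoding.UTF8, which by default substitutes U+FFFD for lone surrogates rather than throwing.

```csharp
using System;
using System.Text;

public static class Utf8Truncation
{
    // Hypothetical helper (not part of the original answer): returns the longest
    // prefix of input whose UTF-8 encoding fits in maxByteSize bytes.
    public static string TruncateToByteLimit(string input, int maxByteSize)
    {
        if (input == null) throw new ArgumentNullException("input");
        if (maxByteSize < 0) throw new ArgumentOutOfRangeException("maxByteSize");

        byte[] bytes = Encoding.UTF8.GetBytes(input);
        if (bytes.Length <= maxByteSize)
        {
            return input; // already within the limit
        }

        // bytes[cut] is the first byte that would be dropped. If it is a
        // continuation byte (10xxxxxx), the cut falls inside a multi-byte
        // character, so back up until bytes[cut] is that character's lead byte.
        int cut = maxByteSize;
        while (cut > 0 && (bytes[cut] & 0xC0) == 0x80)
        {
            cut--;
        }

        return Encoding.UTF8.GetString(bytes, 0, cut);
    }
}
```

For example, "日本語" is 9 bytes in UTF-8 (3 bytes per character), so Utf8Truncation.TruncateToByteLimit("日本語", 7) should return "日本". The loop steps back at most three times, because a UTF-8 character is at most four bytes long, so the cost is dominated by the single encode and decode.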


2022-09-30 21:17
