I want to split a Japanese string into individual characters in C++.

I am using C++11 to write a program that reads Japanese strings as input, splits them into individual characters, and counts each character. This is my first time handling Japanese text, so I lack the background knowledge. If possible, I would like to iterate over the string with a range-based for loop.

c++ japanese

2022-09-30 19:17

2 Answers

std::string, std::wstring, std::u16string, and std::u32string are provided, but each one only stores elements of its corresponding character type (char, wchar_t, char16_t, char32_t). For example, C runtime libraries offer case-insensitive comparisons such as stricmp, but the C++ string classes do not provide even that. Everything is left to the library user, so handling multi-byte characters is also the user's responsibility. You can manage the text as UTF-16 with std::u16string, but surrogate pairs are then still your responsibility. Moreover, a character combined with a variation selector does not fit into a single element.

In the end, it is up to the user to decide what counts as "one character" and to write the corresponding processing.

2022-09-30 19:17
		
The C/C++ language itself has no concept of a string other than string literals.

There are only arrays of types that hold character codes.

Also, text in the ANSI code page (Shift-JIS, also called MBCS) is handled differently from Unicode text.
char Nihongo_A[7] = "日本語"; // Japanese text in a char array (7 elements when encoded as Shift-JIS)
wchar_t Nihongo_W[4] = L"日本語"; // The same text in a wide-character array holding Unicode (for example, UTF-16)
Both hold the same Japanese text, but what counts as one character depends on the character encoding, so the number of array elements required differs as shown above.

Apart from exceptions (such as surrogate pairs when wchar_t holds UTF-16), one wchar_t element corresponds to one character, so you can pick out a single character by array index.
if (Nihongo_W[1] == L'本') {
  // The second character is '本'.
}
Of course, range-based for also works on built-in arrays.
for (wchar_t& jc : Nihongo_W) { // Visits every element, including the terminating null
  jc = L'あ'; // Overwrite every element with 'あ'
}
Alternatively, there are higher-level "string classes" that provide this functionality, so you should use those.

2022-09-30 19:17