File names containing kanji are garbled when downloaded with wget

Asked 1 year ago, Updated 1 year ago, 126 views

I could not solve this problem by referring to other Q&As about wget, so I would like to ask here.

As the title says, file names containing kanji are garbled when downloaded with wget, and I am looking for a solution.

A bat file runs the wget command to recursively download the files under a specified URL.
The bat file is launched from VC++ via CreateProcess.

Environment: Windows Server 2012 R2
Processing flow: VC++ -> bat -> wget

The relevant part of the bat file is as follows:

Omitted...

rem UTF-8
chcp 65001

wget -e robots=off --random-wait --timeout=10 --tries=1 ^
  --html-extension -nv -R exe,zip,css,js,jpg,jpeg,gif,png,mpg,mpg,mpg,au,mp3 -x ^
  -P "Download destination directory path" -o "Standard output log destination file path" ^
  --restrict-file-names=nocontrol "URL to download" --no-check-certificate ^
  --user-agent="Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)" -r -l 2 -np

rem SJIS
chcp 932

Omitted...

The download itself runs fine, but when I check the downloaded html files whose names contain kanji, the file names are garbled.

Example: entanglement← / 粋.html

However, the standard output log saved with the -o option printed the correct filename.

Is there any good solution?

windows command-line

2022-09-30 20:21

1 Answer

If you change --restrict-file-names=nocontrol to --restrict-file-names=nocontrol,windows,ascii, the file names should at least be saved in percent-encoded form, from which the original characters can be restored.
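If the files do end up with percent-encoded names, something like the following could turn them back into the original names afterwards. This is only a rough sketch, not part of the original setup: it assumes the escaped bytes are UTF-8 (which depends on how the server encodes its URLs), and decode_name and restore_names are hypothetical helper names made up for this example. Names that do not decode cleanly, or that would contain characters Windows does not allow in file names, are left untouched.

```python
from pathlib import Path
from urllib.parse import unquote

# Characters that Windows does not allow in file names.
WINDOWS_FORBIDDEN = set('\\/:*?"<>|')


def decode_name(name: str) -> str:
    """Turn a percent-encoded file name back into its original characters.

    Assumption: the escaped bytes are UTF-8. If they are not valid UTF-8,
    or the decoded name would be invalid on Windows, the name is returned
    unchanged.
    """
    try:
        decoded = unquote(name, encoding="utf-8", errors="strict")
    except UnicodeDecodeError:
        return name
    if any(ch in WINDOWS_FORBIDDEN for ch in decoded):
        return name
    return decoded


def restore_names(root: Path) -> None:
    """Rename every file under root whose name contains percent escapes."""
    # sorted() collects all paths first, so renaming does not disturb the walk.
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        decoded = decode_name(path.name)
        target = path.parent / decoded
        # Skip names that are unchanged or that would overwrite another file.
        if decoded != path.name and not target.exists():
            path.rename(target)
```

For example, decode_name("%E7%B2%8B.html") returns "粋.html". Calling restore_names(Path(...)) on the download destination directory would rename the files in place, so it is probably safer to try it on a copy first.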


2022-09-30 20:21


