I have read several wget-related Q&As but could not solve my problem, so I would like to ask a question.
As stated in the title,
[File names containing kanji are garbled when downloaded with wget]
and I am looking for a solution.
A bat file runs wget to recursively download the files under a specified URL.
The bat file is launched from VC++ with CreateProcess.
Environment: Windows Server 2012 R2
Processing flow:
VC++ -> bat -> wget
The commands in the bat file are as follows (the wget call is split with ^ here for readability):
Omitted...
rem UTF-8
chcp 65001
wget -e robots=off --random-wait --timeout=10 --tries=1 ^
  --html-extension -nv -R exe,zip,css,js,jpg,jpeg,gif,png,mpg,mpg,mpg,au,mp3 -x ^
  -P 'Download destination directory path' -o 'Standard output log destination file path' ^
  --restrict-file-names=nocontrol 'URL to download from' --no-check-certificate --user-agent="Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)" -r -l2 -np
rem SJIS
chcp 932
Omitted...
The download itself runs fine, but
when I check the downloaded HTML files whose names contain kanji,
the file names are garbled.
Example: entanglement← / 粋.html
However, the log file written with the -o option shows the correct file names.
Is there a good solution?
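For illustration only, garbling of this kind is what you typically see when UTF-8 bytes are interpreted as CP932 (Shift_JIS). Whether that is exactly what happens inside wget on this machine is an assumption, not something confirmed above. The file name below is hypothetical. A minimal Python sketch:

```python
# Illustration only: UTF-8 bytes misread as CP932 (Shift_JIS).
# "漢字.html" is a hypothetical file name, not one from the question.
name = "漢字.html"

# Bytes as they would appear if the name were encoded as UTF-8.
utf8_bytes = name.encode("utf-8")

# Reading those bytes with the Japanese ANSI code page gives mojibake.
# errors="replace" keeps the demo from raising on invalid byte pairs.
garbled = utf8_bytes.decode("cp932", errors="replace")

print(garbled)
```

If the garbled names on disk resemble this output, the bytes were probably correct UTF-8 and only the code page used to create the file name was wrong.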
windows command-line
If you change --restrict-file-names=nocontrol
to --restrict-file-names=nocontrol,windows,ascii
, you should at least be able to get the file names in percent-encoded form (from which the original characters can be restored).
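As a rough sketch of the "restore" step, the percent-encoded names could be decoded afterwards with a small script. This assumes the escaped bytes are UTF-8 (it depends on the encoding of the original URLs). The names decode_name and restore_names are made up for this example, and the sample file name is hypothetical.

```python
from pathlib import Path
from urllib.parse import unquote

# Characters that Windows does not allow in file names.
WINDOWS_FORBIDDEN = set('\\/:*?"<>|')


def decode_name(escaped: str, encoding: str = "utf-8") -> str:
    """Turn a percent-encoded file name back into readable text.

    Assumes the escaped bytes are in `encoding` (UTF-8 by default);
    raises UnicodeDecodeError if they are not.
    """
    return unquote(escaped, encoding=encoding, errors="strict")


def restore_names(root: str) -> None:
    """Rename percent-encoded files and directories under `root` (hypothetical helper)."""
    # Handle the deepest paths first so renaming a directory
    # does not invalidate the paths of its children.
    paths = sorted(Path(root).rglob("*"), key=lambda p: len(p.parts), reverse=True)
    for path in paths:
        decoded = decode_name(path.name)
        if decoded == path.name:
            continue  # nothing was escaped
        if WINDOWS_FORBIDDEN & set(decoded):
            continue  # decoded name would be invalid on Windows, leave it as-is
        path.rename(path.with_name(decoded))


print(decode_name("%E6%BC%A2%E5%AD%97.html"))  # prints 漢字.html
```

Calling restore_names on the download destination directory after wget finishes would rename the files in place. It is worth trying it on a copy first.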