How do I retrieve text from the C# WebBrowser control without displaying the page?

Asked 2 years ago, Updated 2 years ago, 146 views

Thank you for your help.

I am not very familiar with the WebBrowser control, but I am using it to browse pages and writing code that extracts only the links listed on each page.
If a page contains a link, I navigate to that page and extract the links from it in the same way, and so on for every page visited. That is the flow I am trying to build.

Extracting the links with regular expressions works without any problem. However, to retrieve text from DocumentText in WebBrowser, the page must first be opened and displayed with Navigate.
That is not a big deal for a single page, but when every linked page has to be retrieved, a lot of time is lost loading and rendering each one.
Is there any way to retrieve text from DocumentText in WebBrowser without displaying the page?

Thank you for your cooperation.

c# webview winforms

2022-09-30 21:18

4 Answers

The WebBrowser control lets you turn off image loading and scripts. The following thread is helpful:
https://social.msdn.microsoft.com/Forums/ja-JP/6ff03e2f-c7e4-4fc7-aa93-40d404d85908/vb2005-webbrowser-?forum=vbgeneralja
I use a slightly modified version of that code. If anyone asks for it, I can post it.


2022-09-30 21:18

If it's just text, why don't you use WebClient.DownloadString instead of the WebBrowser class?
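As a rough illustration of this suggestion, here is a minimal sketch that downloads a page with WebClient.DownloadString and pulls the link targets out with a regular expression, so nothing is ever rendered. The class name LinkExtractor, the method ExtractLinks, and the regular expression are hypothetical and not from the question; the pattern is deliberately simple and will miss unusual markup. Pages that build their links with JavaScript will not work this way.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

static class LinkExtractor
{
    // Deliberately simple pattern (an assumption, not from the thread):
    // captures the quoted href value of <a ...> tags.
    static readonly Regex HrefPattern = new Regex(
        "<a\\s[^>]*?href\\s*=\\s*[\"']([^\"']+)[\"']",
        RegexOptions.IgnoreCase);

    // Returns the http/https links found in html, resolved against baseUri.
    public static List<Uri> ExtractLinks(string html, Uri baseUri)
    {
        var links = new List<Uri>();
        foreach (Match m in HrefPattern.Matches(html))
        {
            // Attribute values may contain entities such as &amp;
            string href = WebUtility.HtmlDecode(m.Groups[1].Value);
            Uri absolute;
            if (Uri.TryCreate(baseUri, href, out absolute) &&
                (absolute.Scheme == Uri.UriSchemeHttp ||
                 absolute.Scheme == Uri.UriSchemeHttps))
            {
                links.Add(absolute);
            }
        }
        return links;
    }

    static void Main(string[] args)
    {
        if (args.Length == 0)
        {
            Console.WriteLine("usage: LinkExtractor <url>");
            return;
        }
        var start = new Uri(args[0]);
        using (var client = new WebClient())
        {
            // Downloads the HTML as a string; no window, no rendering.
            string html = client.DownloadString(start);
            foreach (Uri link in ExtractLinks(html, start))
            {
                Console.WriteLine(link);
            }
        }
    }
}
```

Relative links are resolved against the page URL, and non-HTTP schemes such as mailto: are skipped.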


2022-09-30 21:18

If you are concerned about character encoding, you can use the HttpClient in the Microsoft.AspNet.WebApi.Client package.
Also, if "a completely different site" refers to https://xn--icknowag8a8de9wpc5ff2k.gamewith.jp/, that is simply the encoded (Punycode) form of a Japanese URL.
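To see how a Japanese host name relates to its "xn--" form, the following sketch uses System.Globalization.IdnMapping to convert in both directions. The host name 日本語.jp is only an illustrative example, not the site from the question, and the wrapper names ToAscii and ToUnicode are hypothetical.

```csharp
using System;
using System.Globalization;

static class PunycodeDemo
{
    static readonly IdnMapping Idn = new IdnMapping();

    // Unicode host name -> ASCII ("xn--...") form used on the wire.
    public static string ToAscii(string host)
    {
        return Idn.GetAscii(host);
    }

    // ASCII ("xn--...") form -> Unicode host name.
    public static string ToUnicode(string host)
    {
        return Idn.GetUnicode(host);
    }

    static void Main()
    {
        string unicodeHost = "日本語.jp"; // illustrative host, an assumption
        string asciiHost = ToAscii(unicodeHost);
        Console.WriteLine(asciiHost);            // the "xn--" form
        Console.WriteLine(ToUnicode(asciiHost)); // back to the original
    }
}
```

Both strings name the same host, so a URL that shows up in the "xn--" form is not necessarily a different site.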


2022-09-30 21:18

How about retrieving it with WebClient.DownloadString?

As for the problem of a different site being downloaded when the domain name is Japanese, adding a uri section to the configuration section of App.config appears to make it work correctly. (Reference: Uri Class)

<configuration>
  <uri>
    <idn enabled="All"/>
    <iriParsing enabled="true"/>
  </uri>
</configuration>

Also, many sites change their behavior depending on the user-agent, so it is better to send the same user-agent string as the browser you are targeting. For example:

// Requires: using System; using System.Net;
var client = new WebClient();
// Send the same user-agent string as the target browser.
client.Headers.Add("user-agent", "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.2; WOW64; Trident/7.0; .NET 4.0C; .NET 4.0E; .NET CLR 2.0.50727; .NET CLR 3.0.30729; .NET CLR 3.5.30729; Tablet PC 2.0)");
var download = client.DownloadString(new Uri("https://~"));
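For the flow described in the question (visit a page, collect its links, then do the same for each linked page), a breadth-first walk with a visited set could look roughly like the sketch below. The names Crawler, Crawl, getLinks, and maxDepth are hypothetical. To keep the sketch runnable without network access, Main uses an in-memory dictionary in place of a real site; in practice getLinks would download each page with WebClient and extract the links from the returned string.

```csharp
using System;
using System.Collections.Generic;

static class Crawler
{
    // Breadth-first walk from start. getLinks returns the links found on a
    // page; maxDepth limits how many link hops away from start are followed.
    public static List<Uri> Crawl(
        Uri start, Func<Uri, IEnumerable<Uri>> getLinks, int maxDepth)
    {
        var visited = new HashSet<Uri> { start };
        var order = new List<Uri>();
        var queue = new Queue<KeyValuePair<Uri, int>>();
        queue.Enqueue(new KeyValuePair<Uri, int>(start, 0));

        while (queue.Count > 0)
        {
            KeyValuePair<Uri, int> item = queue.Dequeue();
            order.Add(item.Key);
            if (item.Value >= maxDepth) continue; // do not go deeper

            foreach (Uri link in getLinks(item.Key))
            {
                // HashSet.Add returns false for pages already queued,
                // which prevents loops between pages that link to each other.
                if (visited.Add(link))
                {
                    queue.Enqueue(
                        new KeyValuePair<Uri, int>(link, item.Value + 1));
                }
            }
        }
        return order;
    }

    static void Main()
    {
        // In-memory stand-in for a site (an assumption for the demo).
        var site = new Dictionary<string, string[]>
        {
            { "https://example.com/",
              new[] { "https://example.com/a", "https://example.com/b" } },
            { "https://example.com/a",
              new[] { "https://example.com/", "https://example.com/c" } },
        };

        Func<Uri, IEnumerable<Uri>> getLinks = page =>
        {
            var result = new List<Uri>();
            string[] targets;
            if (site.TryGetValue(page.AbsoluteUri, out targets))
            {
                foreach (string t in targets) result.Add(new Uri(t));
            }
            return result;
        };

        foreach (Uri page in Crawl(new Uri("https://example.com/"), getLinks, 2))
        {
            Console.WriteLine(page);
        }
    }
}
```

The depth limit and the visited set matter on real sites, where pages link back to each other and the number of reachable pages can be very large.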


2022-09-30 21:18


© 2024 OneMinuteCode. All rights reserved.