I'm scraping a website with Ruby, and the target page contains the following:
<script>
function tableCell(str){
document.write('<td class="cell-text">');
document.write(str);
document.write('</td>');
}
</script>
<table>
<tr>
<script>
<!--
tableCell("100");
tableCell("200");
tableCell("300");
// -->
</script>
</tr>
</table>
At first I used Selenium, but to make the process lighter I decided to read the script tag and parse it as plain text.
However, this approach breaks on line breaks and other minor changes to the page, so I thought it would be nice to run the JS function from the shell like this, but I couldn't find a way:
javascript table_cell.js page.html > output.html
I tried Rhino, a JavaScript implementation that can be run from the shell, but I couldn't figure out how to run only the script tag part of the HTML.
How can I run a function in a script tag in a local HTML file and output the result?
This is essentially not much different from Selenium, but it can also be done with Chrome's headless mode, which runs the page's scripts and prints the resulting DOM:
chrome --headless --disable-gpu --dump-dom file://foobar/page.html
<html><head><script>
function tableCell(str){
document.write('<td class="cell-text">');
document.write(str);
document.write('</td>');
}
</script>
</head><body><table>
<tbody><tr>
<script>
<!--
tableCell("100");
tableCell("200");
tableCell("300");
// -->
</script><td class="cell-text">100</td><td class="cell-text">200</td><td class="cell-text">300</td>
</tr>
</tbody></table></body></html>
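Since the scraper itself is written in Ruby, the command above could be driven from a script roughly as follows. This is only a minimal sketch: the method names dump_dom and cell_texts are made up for illustration, the binary name "chrome" is taken from the command above and may be "google-chrome" or "chromium" on your system, and the regex-based extraction is an assumption that only suits markup as simple as this sample (an HTML parser such as Nokogiri would probably be more robust).

```ruby
require "open3"

# Run headless Chrome on the given URL and return the dumped DOM as a string.
# Assumption: the binary is on PATH under the name passed as `chrome`.
def dump_dom(url, chrome: "chrome")
  out, status = Open3.capture2(chrome, "--headless", "--disable-gpu", "--dump-dom", url)
  raise "chrome exited with status #{status.exitstatus}" unless status.success?
  out
end

# Return the text of every <td class="cell-text"> cell in the dumped DOM.
# The script elements are removed first, because the document.write calls
# inside them contain the same <td ...> and </td> strings.
def cell_texts(html)
  body = html.gsub(%r{<script\b.*?</script>}mi, "")
  body.scan(%r{<td class="cell-text">(.*?)</td>}m).flatten.map(&:strip)
end

# Example usage (requires Chrome to be installed):
#   puts cell_texts(dump_dom("file://foobar/page.html"))
```

Keeping the Chrome call and the extraction in separate methods means the extraction can be checked against a saved dump without launching a browser.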