I'm trying to use GAS (Google Apps Script) to write a program that watches a blog for updates and sends a notification.
https://web.plus-idea.net/2018/04/google-apps-script-xmlservice-parse/
By following this article I was able to retrieve the contents of a page on an external site, but
the XmlService library does not parse the XML the way I expect,
and I'm stuck because I can't extract the elements.
What I retrieved has the following form:
<rdf>
<channel></channel>
<item><link></link><title></title></item>
<item><link></link><title></title></item>
:
</rdf>
I don't think it's good to access the other site over and over,
so I'm running my test against a hard-coded sample in the same format as above.
However, rootDoc.getChildren() returns an array of length 0, so I can't retrieve the child elements.
I don't really understand namespaces, but
logging the contents of rootDoc shows:
[Element:<rdf:RDF [Namespace:http://www.w3.org/1999/02/22-rdf-syntax-ns#]/>]
So, imitating the blog article, I tried specifying that namespace URL, but it didn't work.
What puzzles me is that rootDoc itself is displayed as if it were an array.
It looks like the output above, but I don't know how to get at its contents.
Here is the test code:
function myFunction() {
  const content = `
<rdf:RDF xmlns="http://purl.org/rss/1.0/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:cc="http://web.resource.org/cc/" xmlns:atom="http://www.w3.org/2005/Atom" xml:lang="ja">
<channel rdf:about="http://google.com">
...
</channel>
<item rdf:about="http://google.com">
<link>http://google.com</link>
<title>Title</title>
</item>
<item rdf:about="http://google.com">
<link>http://google.com</link>
<title>Title</title>
</item>
</rdf:RDF>
`;
  var xmlDoc = XmlService.parse(content);
  var rootDoc = xmlDoc.getRootElement();
  Logger.log(rootDoc);
  var ns = XmlService.getNamespace("rdf", "http://www.w3.org/1999/02/22-rdf-syntax-ns#");
  var items = rootDoc.getChildren('item', ns);
  Logger.log(items.length);
  for (var i = 0; i < items.length; i++) {
    console.log(items[i].getText());
    var title = items[i].getChild("title").getText();
    var url = items[i].getChild("link").getText();
    var text = title + ' ' + url;
    Logger.log(text);
  }
}
Execution results
20:51:22 Notice Execution started
20:51:23 Info [Element:<rdf:RDF [Namespace:http://www.w3.org/1999/02/22-rdf-syntax-ns#]/>]
20:51:23 Info 0.0
20:51:24 Notice Execution completed
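This XML document declares its default namespace as xmlns="http://purl.org/rss/1.0/".
That is the namespace that applies to tags without a prefix, such as item, link and title.
The rdf namespace only applies to names written with the rdf: prefix, which is why getChildren('item', ns) finds nothing.
So build the namespace you pass to getChildren from the default namespace URI instead:
var ns = XmlService.getNamespace("http://purl.org/rss/1.0/");
The other places that use getChild
should also work once you pass this ns to them as the second argument.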
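Putting that together, here is a minimal sketch of how the loop might look with the default namespace passed to both getChildren and getChild. The names RSS_NS_URI, extractItems and logFeedItems are hypothetical helpers introduced for illustration and are not part of the original code. It assumes the feed has the same shape as the sample in the question, with a title and a link inside every item.

```javascript
// Namespace URI taken from the xmlns="..." declaration in the sample feed.
var RSS_NS_URI = 'http://purl.org/rss/1.0/';

// Hypothetical helper: collects {title, url} for each <item> under root.
// root is an XmlService Element, ns is an XmlService Namespace.
function extractItems(root, ns) {
  var items = root.getChildren('item', ns);
  var result = [];
  for (var i = 0; i < items.length; i++) {
    // <title> and <link> have no prefix either, so they need the same ns.
    var title = items[i].getChild('title', ns).getText();
    var url = items[i].getChild('link', ns).getText();
    result.push({ title: title, url: url });
  }
  return result;
}

// Hypothetical Apps Script entry point: content is the feed XML as a string.
function logFeedItems(content) {
  var root = XmlService.parse(content).getRootElement();
  var ns = XmlService.getNamespace(RSS_NS_URI);
  var entries = extractItems(root, ns);
  for (var i = 0; i < entries.length; i++) {
    Logger.log(entries[i].title + ' ' + entries[i].url);
  }
}
```

Calling logFeedItems(content) with the sample string from the question should log one line per item. Keeping the extraction in its own function is only a convenience, so that it can be exercised without fetching the real site.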