Understanding Regular Expressions for Retrieving Thumbnail URLs

Asked 1 year ago, Updated 1 year ago, 39 views

I am trying to get thumbnail URLs from the Hatena Bookmark RSS feed using regular expressions.

<!DOCTYPE html>
<html>
<head>
 <meta charset="utf-8">
</head>

<body>
<div class="message"></div>
  <script src="http://d3js.org/d3.v3.js" charset="utf-8"></script>
  <script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
  <script src="https://cdnjs.cloudflare.com/ajax/libs/d3/3.4.11/d3.min.js"></script>
  <script src="https://www.google.com/jsapi"></script>

 <script>
 var imagelist = [];
 google.load("feeds", "1");
 function initialize() {
	  var feed = new google.feeds.Feed("http://feeds.feedburner.com/hatena/b/hotentry");

	  feed.setNumEntries(-1);
	  feed.load(function(result){
		  if(!result.error){
			  for (var i = 0; i < result.feed.entries.length; i++) {
				  var entry = result.feed.entries[i];
				  var first_image=entry.content.match(/(http:){1}[\S_-]+\.(?:jpg|gif|png)/);
				  first_image[0] = first_image[0].replace(/(\.[^.]+$)/, "_l$1");
				  imagelist.push(first_image);
					console.log (imagelist)
			  }
		  }
	  });
  }
			  google.setOnLoadCallback(initialize);
			  console.log (imagelist[0])
 </script>
</body>
</html>

The above is the source I am writing. When I run this program and look at the console, I see:

 0: Array [2]
0: "https://cdn-ak.b.st-hatena.com/entryimage/273819903-1450199417_l.jpg"
1: "http:"
index:350
input: "<blockquote title=" Is Mr. Kojima, a famous game developer, established a new company leaving Konami?  : Nihon Keizai Shimbun">cite>img src="https://cdn-ak.favicon.st-hatena.com/?url=http%3A%2F%2Fwww.nikkei.com%2F"alt=">a href="http://www.nikkei.com/article/DGXLZO95184030W5A211C1TI5000/">Is famous game developer Kojima established a new company leaving Konami?  : Nihon Keizai Shimbun </a>/cite>p>>a href="http://www.nikkei.com/article/DGXLZO95184030W5A211C1TI5000/">img src="https://cdn-ak.b.st-hatena.com/entryimage/273819903-1450199417.jpg" alt="Does famous game developer Kojima establish a new company leaving Konami?  : Is Mr. Kojima, a famous game developer of Nihon Keizai Shimbun title=, established a new company leaving Konami?  : Hideo Kojima, a famous game creator known for the Nihon Keizai Shimbun's Metal Gear series, left Konami Digital Entertainment on the 15th.It is expected to set up a new company to continue making games and sell them to Sony Computer Entertainment (SCE)'s PlayStation (PS), a game console. It will set up a new company with its Konami-era subordinates.A representative "Metal Gear" series…</p>p>a href="www.nikkei.com/article/DGXLZO95184030W5A211C1TI5000/://www.nikkei.com/article/DGXLZO95184030W5A211C1TI5000/">img src="http://b.hatena.ne.jp/entry/image/http://http://b.hatena.ne.jp/entry/http" alt="Hatena Bookmark - Famous Game Developer Kojima Leaves Konami Is Established New Company?  - Nihon Keizai Shimbun title= Hatena Bookmark - Is Mr. Kojima, a famous game developer, established a new company leaving Konami?    :日本経済新聞" border="0" style="border:none"></a> <a href="http://b.hatena.ne.jp/append?http://www.nikkei.com/article/DGXLZO95184030W5A211C1TI5000/"><img src="http://b.hatena.ne.jp/images/append.gif" border="0" alt="はてなブックマークに追加" title="はてなブックマークに追加"></a></p></blockquote><img src="http://feeds.feedburner.com/~r/hatena/b/hotentry/~4/y3bfYN4dQBo"height="1" width="1" alt="">"

In addition to the URL I need,

 0: "https://cdn-ak.b.st-hatena.com/entryimage/273819903-1450199417_l.jpg"

the result also contains an unnecessary element:

1: "http:"

How should I rewrite the regular expression so that only the URL at index 0 is returned?
Thank you for your help.

javascript html regular-expression d3.js

2022-09-29 20:26

1 Answer

If you change entry.content.match(/(http:){1}[\S_-]+\.(?:jpg|gif|png)/) to entry.content.match(/(?:http:){1}[\S_-]+\.(?:jpg|gif|png)/), first_image[1] disappears. In regular expression terminology, the former, (), is called a capturing group, and the latter, (?:), is called a non-capturing group. The file extension part of your pattern already uses a non-capturing group.
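
To see the difference side by side, here is a minimal sketch. The content string is a made-up, shortened stand-in for entry.content (the example.com URL is an assumption, not taken from the feed); the two patterns are the ones discussed in this answer.

```javascript
// Hypothetical sample: a shortened stand-in for entry.content.
const content = '<img src="http://example.com/entryimage/12345.jpg" alt="">';

// Capturing group: the text matched by (http:) is also stored at index 1.
const withCapture = content.match(/(http:){1}[\S_-]+\.(?:jpg|gif|png)/);

// Non-capturing group: only the full match is stored, at index 0.
const withoutCapture = content.match(/(?:http:){1}[\S_-]+\.(?:jpg|gif|png)/);

console.log(withCapture.length, withCapture[1]); // 2 "http:"
console.log(withoutCapture.length);              // 1
console.log(withoutCapture[0]);                  // the matched URL
```

In both cases index 0 holds the whole matched URL; only the extra entry at index 1 changes.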

Alternatively, you could leave the regular expression as it is and simply use first_image[0], ignoring the extra element.
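
A rough sketch of that second approach follows, assuming you only want the URL strings in imagelist. The helper name firstLargeImageUrl and the sample strings are hypothetical; the regular expression and the "_l" replacement are the ones from the question. It also guards against match() returning null when an entry contains no matching image, which the original loop does not handle.

```javascript
// Hypothetical helper (not in the original code): returns the "_l" variant of
// the first image URL found in an entry's HTML, or null if there is none.
function firstLargeImageUrl(html) {
  const match = html.match(/(http:){1}[\S_-]+\.(?:jpg|gif|png)/);
  if (match === null) {
    return null; // match() returns null when nothing matches
  }
  // Use only the full match at index 0; the captured "http:" at index 1 is ignored.
  return match[0].replace(/(\.[^.]+$)/, "_l$1");
}

// Made-up stand-ins for entry.content values.
const samples = [
  '<img src="http://example.com/entryimage/12345.jpg" alt="">',
  '<p>no image here</p>'
];

const imagelist = [];
for (const html of samples) {
  const url = firstLargeImageUrl(html);
  if (url !== null) {
    imagelist.push(url); // push the string, not the whole match array
  }
}
console.log(imagelist);
```

Pushing match[0] rather than the match array keeps imagelist a flat list of strings, so the index and input properties seen in the console output never end up in it.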


2022-09-29 20:26



© 2024 OneMinuteCode. All rights reserved.