What regular expression should I write to turn the URL above into the one below?
Everything in the URL except the domain varies.
I would appreciate it if anyone who knows could tell me. Thank you.
http://www.amazon.co.jp
I would like to remove everything from the fourth slash after the domain onward.
If you want to match the URL up to, but not including, the fourth slash after the domain:
preg_match('#http:/(?:/[^/]+){4}#', $url, $matches);
$result = $matches[0];
// To restrict the match to this domain:
// '#http://www\.amazon\.co\.jp(?:/[^/]+){3}#'
It matches http:/ followed by four consecutive blocks of the form /[^/]+ (a slash, then one or more non-slash characters); the first block is the second slash of http:// plus the domain name. If the URL may have a query string with no fourth slash before it and you want to drop that too, use [^/?] instead of [^/].
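As a rough sketch of the [^/?] variant described above, the following wraps the pattern in a small helper. The function name trimUrl and the shortened sample URLs are made up for illustration; only the pattern itself comes from the explanation above.

```php
<?php
// Hypothetical helper (the name trimUrl is not from the original answer).
// Returns the URL up to, but not including, the fourth slash after the
// domain, also stopping at a query string. Returns null when nothing matches.
function trimUrl(string $url): ?string
{
    // "http:/" followed by four "/segment" blocks; the first block is the
    // second slash of "http://" plus the domain name.
    if (preg_match('#http:/(?:/[^/?]+){4}#', $url, $matches) === 1) {
        return $matches[0];
    }
    return null;
}

// Made-up sample URLs for illustration.
var_dump(trimUrl('http://www.amazon.co.jp/name/dp/B015DTB87Q/ref=sr_1_1?ie=UTF8'));
var_dump(trimUrl('http://www.amazon.co.jp/name/dp/B015DTB87Q?ie=UTF8'));
var_dump(trimUrl('http://www.amazon.co.jp/dp'));
```

The first two calls should both give http://www.amazon.co.jp/name/dp/B015DTB87Q, and the third should give null because the URL has fewer than four blocks. Checking the return value of preg_match before reading $matches[0] avoids an undefined index when the URL is shorter than expected.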
If this is only for Amazon, you can instead recognize the /dp/xxxx part and match it specifically.
preg_match('#https?://www\.amazon\.co\.jp/(?:[^/]+/)?dp/[A-Z0-9]+#',$url,$matches);
Runnable sample: https://regex101.com/r/nY8dN2/1
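To illustrate how the Amazon-specific pattern behaves, here is a minimal sketch that runs it over a few URLs. The sample URLs are made up, and the capture group around the part after /dp/ is an addition for illustration, not part of the pattern in the answer.

```php
<?php
// Pattern from the answer above, with one added capture group around the
// part after /dp/ (the capture group is an assumption for illustration).
$pattern = '#https?://www\.amazon\.co\.jp/(?:[^/]+/)?dp/([A-Z0-9]+)#';

// Made-up sample URLs: with a product-name segment, without one,
// and one that has no /dp/ part at all.
$urls = [
    'http://www.amazon.co.jp/some-product-name/dp/B015DTB87Q/ref=sr_1_1?ie=UTF8',
    'https://www.amazon.co.jp/dp/B015DTB87Q?ie=UTF8',
    'http://www.amazon.co.jp/gp/help/index.html',
];

$found = [];
foreach ($urls as $url) {
    if (preg_match($pattern, $url, $matches) === 1) {
        // $matches[0] is the trimmed URL, $matches[1] the part after /dp/.
        $found[] = [$matches[0], $matches[1]];
        echo $matches[0], ' (', $matches[1], ")\n";
    } else {
        echo 'no match: ', $url, "\n";
    }
}
```

Because the product-name segment is optional in the pattern, the first two URLs should both match, while the third should fall through to the "no match" branch.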
Sorry, I couldn't do it in a single step, but I think something like this works.
<?php
// Capture the scheme plus four lazily matched segments, each ending in a slash.
preg_match_all('#(https?://(?:.*?/){4})#', 'http://www.amazon.co.jp/%E3%83%8E%E3%83%BC%E3%83%88%E3%83%91%E3%82%BD%E3%82%B3%E3%83%B3-EeeBook-X205TA-WHITE10-Windows10-11-6%E3%82%A4%E3%83%B3%E3%83%81%E3%83%AF%E3%82%A4%E3%83%89/dp/B015DTB87Q/ref=sr_1_1?s=computers&ie=UTF8&qid=1460353489&sr=1-1&keywords=%E3%83%91%E3%82%BD%E3%82%B3%E3%83%B3', $m);
if (isset($m[1][0])) {
    // Strip the trailing slash left by the fourth segment.
    echo rtrim($m[1][0], '/');
}
?>
Results
http://www.amazon.co.jp/%E3%83%8E%E3%83%BC%E3%83%88%E3%83%91%E3%82%BD%E3%82%B3%E3%83%B3-EeeBook-X205TA-WHITE10-Windows10-11-6%E3%82%A4%E3%83%B3%E3%83%81%E3%83%AF%E3%82%A4%E3%83%89/dp/B015DTB87Q