XPath query for class matching
Hi all,

For the Drupal module I am maintaining, I need to select the elements that have a specific class. For example, <div class="foo"></div> is selected by the following query:

$class = "foo";
$xpath->query("//*[@class = '".$class."']");

But this fails when the element has multiple classes, i.e. it does not select this:

<div class="foo bar"> </div>

I don't have much experience in writing regexes; can someone please help me out with this?

One more question: is this URL valid?

http://xyz.com/path with space.html

Looking forward.

-- Regards, Nitin Kumar Gupta http://publicmind.in/blog/
nitin gupta wrote:
> for eg <div class="foo"></div> is selected by the following query.
> $class = "foo";
> $xpath->query("//*[@class = '".$class."']");
> But, this fails in case of multiple classes, i.e. does not select this :
> <div class="foo bar"> </div>

The "contains" function would probably work, e.g. (untested):

$xpath->query("//div[contains(@class, '" . $class . "')]");

> One more question: is this url valid : http://xyz.com/path with space.html

Nope. You can escape the URL if you just HAVE to have spaces:

http://n00b.com/path%20with%20space.html

but why not use dashes?

http://pro.com/path-with-properly-indexed-spaces.html

-Dom
Thanks for your help. But this query will probably also select "abcfooxyz" when "foo" is supplied (untested), although it definitely selects "foo" in "foo bar" (tested):

$xpath->query("//div[contains(@class, '" . $class . "')]")

How can we be more specific?

@URL: Actually, I am maintaining the module feedapi imagegrabber, which downloads images from external websites. Sometimes the URL I parse has spaces, so I am unable to decide whether or not to percent-encode it: percent-encoding will make this URL valid, but it will break the following URL:

http://www.google.com/search?q=hello

by converting it to

http://www.google.com/search?q%3Dhello

Looking forward.

-- Regards, Nitin Kumar Gupta http://publicmind.in/blog/

On Sat, Oct 3, 2009 at 1:30 AM, Domenic Santangelo <domenic@workhabit.com> wrote:
> The "contains" function would probably work, eg (untested):
> $xpath->query("//div[contains(@class, '" . $class . "')]");
> [...]
> Nope. You can escape the url if you just HAVE to have spaces:
> http://n00b.com/path%20with%20space.html
nitin gupta wrote:
> Thanks for your help. but this query will probably select "abcfooxyz" as well when "foo" is supplied. (untested), although it will definitely select "foo" in "foo bar" (tested)
> $xpath->query("//div[contains(@class, '" . $class . "')]")
> How can we be more specific?

Try this example: http://westhoffswelt.de/blog/0036_xpath_to_select_html_by_class.html
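For readers who cannot reach the linked post: a widely used XPath 1.0 idiom for this problem pads the normalized class list with spaces and then searches for the class name padded the same way. The sketch below is only an illustration, not necessarily what the linked article shows. The sample markup, the id attributes, and the matched_ids() helper are made up for the demonstration, and it assumes $class contains no quote characters.

```php
<?php
// Sample markup (invented): only "a" and "b" carry the class token "foo".
$html = '<div id="a" class="foo"></div>'
      . '<div id="b" class="foo bar"></div>'
      . '<div id="c" class="abcfooxyz"></div>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$class = 'foo'; // assumed to contain no quotes; XPath 1.0 has no escaping

// Hypothetical helper: collect the id attribute of every matched element.
function matched_ids(DOMNodeList $nodes): array {
    $ids = [];
    foreach ($nodes as $node) {
        $ids[] = $node->getAttribute('id');
    }
    return $ids;
}

// Plain substring test: matches "abcfooxyz" as well.
$loose = matched_ids($xpath->query("//*[contains(@class, '$class')]"));

// Token test: " foo " must appear inside " <normalized class list> ".
$strict = matched_ids($xpath->query(
    "//*[contains(concat(' ', normalize-space(@class), ' '), ' $class ')]"
));

echo implode(',', $loose), "\n";
echo implode(',', $strict), "\n";
```

With this markup, $loose holds a, b and c, while $strict holds only a and b. normalize-space() is what keeps the test working when the class attribute contains tabs, newlines, or repeated spaces.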
> @URL: Actually I am maintaining the module feedapi imagegrabber, which downloads images from external websites. Now sometimes the url I parse has spaces, so I am unable to decide whether or not to percentage encode the URL, because percentage encoding will make this URL valid but will break the following URL: http://www.google.com/search?q=hello by converting it to http://www.google.com/search?q%3Dhello

"q=hello" is a query string, not strictly part of the path. I would split the URL into its component parts (parse_url), encode the path, and then re-append the query string.

http://us2.php.net/manual/en/function.parse-url.php
http://www.faqs.org/rfcs/rfc1738.html

HTH,
-D
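The advice above (take the URL apart, encode the path, put the query string back) could be sketched roughly as follows. encode_url_path() is a hypothetical helper name, not part of PHP or of the module. The sketch assumes absolute http-style URLs and ignores user/password components. It also decodes each path segment before encoding it, an added design choice so that a path that already contains %20 is not encoded a second time.

```php
<?php
// Hypothetical helper: percent-encode only the path of an absolute URL,
// leaving the query string and fragment exactly as they were.
function encode_url_path(string $url): string {
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return $url; // not something we can take apart safely; leave it alone
    }

    // Encode segment by segment so the "/" separators survive.
    // Decoding first keeps already-encoded input (e.g. "%20") stable.
    $segments = explode('/', $parts['path'] ?? '');
    $path = implode('/', array_map(function ($segment) {
        return rawurlencode(rawurldecode($segment));
    }, $segments));

    $result = $parts['scheme'] . '://' . $parts['host'];
    if (isset($parts['port'])) {
        $result .= ':' . $parts['port'];
    }
    $result .= $path;
    if (isset($parts['query'])) {
        $result .= '?' . $parts['query'];
    }
    if (isset($parts['fragment'])) {
        $result .= '#' . $parts['fragment'];
    }
    return $result;
}

echo encode_url_path('http://xyz.com/path with space.html'), "\n";
echo encode_url_path('http://www.google.com/search?q=hello'), "\n";
```

The first call yields http://xyz.com/path%20with%20space.html; the second URL comes back unchanged, because "=" only ever appears in the query string, which is never passed through rawurlencode().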
Hi,

Thanks, it works like a charm. When will we get a perfect search engine?

@URL: Maybe in my hurry I did not explain it properly. This is exactly what I was doing (look at the project, https://sourceforge.net/projects/absoluteurl/) until I came across this image URL on the Google Images page:

http://t3.gstatic.com/images?q=tbn:7i1D2KAZcCd8yM:http://www.flash-slideshow...

What would you consider the query string here, and what the path? Whatever we do, the URL is going to break unless we do nothing. Any ideas? (Maybe just encode the spaces.)

-- Regards, Nitin Kumar Gupta http://publicmind.in/blog/

On Sat, Oct 3, 2009 at 2:22 AM, Domenic Santangelo <domenic@workhabit.com> wrote:
> Try this example: http://westhoffswelt.de/blog/0036_xpath_to_select_html_by_class.html
> [...]
> "q=hello" is a query string, not strictly part of the path. I would strip the url to its component parts (parse_url) and encode the path, then re-append the query string.
nitin gupta wrote:
> @URL: Maybe in hurry, I did not explain it properly. This was exactly what I was doing. look at the project, https://sourceforge.net/projects/absoluteurl/
> until I came across this image url on the google images page:
> http://t3.gstatic.com/images?q=tbn:7i1D2KAZcCd8yM:http://www.flash-slideshow...
> What would you consider here the query string or the path? Anything we do, the URL is going to break unless we do nothing. Any ideas?(may be just for encode the spaces).

scheme: http
host: t3.gstatic.com
path: /images
query: q=tbn:7i1D2KAZcCd8yM:http://www.flash-slideshow-maker.com/images/help_clip_image004.jpg

That URL is some sort of thumbnail maker or something... compare to http://www.flash-slideshow-maker.com/images/help_clip_image004.jpg

-D
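The breakdown above can be reproduced with parse_url(). This is just a quick check, not code from the module: the URL is the full one given in the breakdown, and splitting the string literal across two lines is only for readability.

```php
<?php
// The thumbnail URL from the breakdown above, split across two lines.
$url = 'http://t3.gstatic.com/images?q=tbn:7i1D2KAZcCd8yM:'
     . 'http://www.flash-slideshow-maker.com/images/help_clip_image004.jpg';

$parts = parse_url($url);

// Everything after the first "?" lands in the query component,
// including the embedded "http://" and both colons.
foreach (['scheme', 'host', 'path', 'query'] as $key) {
    echo $key, ': ', $parts[$key], "\n";
}
```

Since the colons and the second "http://" all sit in the query component, a routine that encodes only the path would leave them alone.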
Yeah, I understand; it is the thumbnail maker for the Google Images page. But if I want to have a specific script for this conversion, I can never be sure. If you notice, there are two colons in the query, which, if percent-encoded, will render the URL invalid.

Thanks for all your inputs.

-- Regards, Nitin Kumar Gupta http://publicmind.in/blog/

On Sat, Oct 3, 2009 at 3:06 AM, Domenic Santangelo <domenic@workhabit.com> wrote:
> That url is some sort of thumbnail maker or something... compare to http://www.flash-slideshow-maker.com/images/help_clip_image004.jpg
You don't encode the query string, just the path.

-D

nitin gupta wrote:
> Yeah, I understand . it is the thumbnail maker for google images page. But if I want to have a specific script for this conversion, one can never be sure. If you notice, there are two colons in the query, which if percent encoded will render the URL invalid.
> Thanks for all your inputs.
That will do until we run into some other exceptions :-). Thanks, I did not think of encoding just the path.

-- Regards, Nitin Kumar Gupta http://publicmind.in/blog/

On Sat, Oct 3, 2009 at 3:33 AM, Domenic Santangelo <domenic@workhabit.com> wrote:
> You don't encode the query string, just the path.
> -D
participants (2)
- Domenic Santangelo
- nitin gupta