Page 1 of 1

Search function adds code to urls

Posted: Mon Oct 08, 2018 1:28 pm
by simpleSolutions.dk
CMSimple_XH search function violates urls containing images by adding a span tag with the class "xh_find" inside an image tag

The original url code

Code: Select all

<a href="userfiles/images/billedudlaan/k47_emil_nielsen/wtrmrk/2.jpg"  rel="prettyPhoto[imgalbum0]" title="566 - 50x60 cm<br />"><img src="userfiles/images/billedudlaan/k47_emil_nielsen/thumb/imgalbum_2.jpg" class="thumb" alt="" title="" width="135" height="113"></a>
when searched is changed to

Code: Select all

<a href="userfiles/images/billedudlaan/k47_<span class="xh_find">emil</span>_nielsen/wtrmrk/41.jpg"  rel="prettyPhoto[imgalbum0]" title="562 - 110x135<br />"><img src="userfiles/images/billedudlaan/k47_emil_nielsen/thumb/imgalbum_41.jpg" class="thumb" alt="" title="" width="135" height="105"></a>
Ie. the search argument is enclosed in class xh_find which is correct in most cases but not inside the image tag.

Please check this page for
https://sak.dk/?Billedudlaan:Kollektion ... il_Nielsen
and the same page called from search result list
https://sak.dk/?Billedudlaan:Kollektion ... earch=emil

Re: Search function adds code to urls

Posted: Mon Oct 08, 2018 2:26 pm
by cmb
simpleSolutions.dk wrote:
Mon Oct 08, 2018 1:28 pm

Code: Select all

<a href="userfiles/images/billedudlaan/k47_emil_nielsen/wtrmrk/2.jpg"  rel="prettyPhoto[imgalbum0]" title="566 - 50x60 cm<br />"><img src="userfiles/images/billedudlaan/k47_emil_nielsen/thumb/imgalbum_2.jpg" class="thumb" alt="" title="" width="135" height="113"></a>
This looks like invalid HTML, since there is a <br /> element inside the title of the <a> element. Can you please try without this <br />?

Re: Search function adds code to urls

Posted: Mon Oct 08, 2018 2:33 pm
by cmb
cmb wrote:
Mon Oct 08, 2018 2:26 pm
This looks like invalid HTML, […]
Apparently, this is allowed. :(

Re: Search function adds code to urls

Posted: Mon Oct 08, 2018 5:07 pm
by simpleSolutions.dk
Adding html tags and css classes inside html tags is still a bug.

I used javascript to highlight search results in advanced search plugin and it worked fine without adding html tags.

Re: Search function adds code to urls

Posted: Mon Oct 08, 2018 9:19 pm
by cmb
simpleSolutions.dk wrote:
Mon Oct 08, 2018 5:07 pm
Adding html tags and css classes inside html tags is still a bug.
Yes, it is. I was referring to `title="566 - 50x60 cm<br />"`, which I believe to have been invalid HTML 4, but is apparently valid HTML5.
simpleSolutions.dk wrote:
Mon Oct 08, 2018 5:07 pm
I used javascript to highlight search results in advanced search plugin and it worked fine without adding html tags.
It is not so much about the programming language, but rather about the API. We're using regular expressions, partly for historic reasons, and partly because PHP's DOM support is far from being perfect presently. It might be possible to improve the regular expression, but otherwise we'd likely have to resort to a JS solution, which wouldn't be bad (after all, it can be regarded as progressive enhancement). I'd still prefer to enclose the search words in a <span class="xh_find"> for easier CSS customization.

Re: Search function adds code to urls

Posted: Tue Oct 09, 2018 11:50 am
by simpleSolutions.dk
With small changes in tplfunctions and core.css, I can highlight search results with slightly modified searchhi.js. You can test it on simplesolutons.dk, search for citröen if you also want to test utf-8 compatibility. All content corresponding to keywords on the results pages will be highlightet, but it can be supressed with the class nosearchhi.

[CHANGED: keywords on the page with search result links are not highlighted anymore]

CSS customization is unchanged if it is what worries you.

Interested?

Re: Search function adds code to urls

Posted: Tue Oct 16, 2018 10:03 am
by cmb
Well, I agree that something like searchhi.js[1] (basically walk the DOM, and handle all text nodes[2]) is the way to move forward for XH 1.8 or maybe 2.0. For XH 1.7.3, we may consider the quick fix to replace this line with:

Code: Select all

$patterns[] = '/' . preg_quote($word, '/') . '(?![^<]*[">])/isuU';
This is more restrictive than before, and a few desired highlights might not happen, but it is less jeopardous than a completely different implementation.

Yet another option might be to do nothing for XH 1.7.3 (but only for XH 1.8); after all, the <br /> element in the title is not desirable anyway, since it is shown verbatim in the tooltip.

To not forget about this issue, I've filed https://github.com/cmsimple-xh/cmsimple-xh/issues/366.

Anyhow, what do others think?

[1] I wouldn't use searchhi.js, since it is lacking a license note.
[2] An open question is whether we should only highlight the main contents (as it's now), in which case we'd likely need to wrap the main contents in a recognizable element, which might cause some styling issues.

Re: Search function adds code to urls

Posted: Tue Oct 16, 2018 8:04 pm
by simpleSolutions.dk
Well, the br tag is actually used to separate different text fields in lightbox. I didn't find a better way to do it.

The new regular expression can't handle the following html correctly (showing aditional text but it can highlight text)

Code: Select all

<td class="thumb_container"><a href="./userfiles/images/gallery/wine//wtrmrk/citroen.jpg"  rel="prettyPhoto[imgalbum0]" title=" <span class="xh_find">Citröen</span><br>Label:  Citröen"><img src="./userfiles/images/gallery/wine//thumb/imgalbum_citroen.jpg" class="thumb" alt="" title=" Citröen" width="170" height="170"></a><div class="thumb_description"> <span class="xh_find">Citröen</span><br>Label:  <span class="xh_find">Citröen</span></div></td>


And the searchhi.js scroll content to the first highlighted search result which is fare more user friendly ;) .

Right now I keep searchhi.js as hilglight engine and try to optimize search class to be a bit more user friendly (especially when the query string consists of two or more words).