Search Smith

ColdFusion, SQL queries, and, of course, searching

Archive for March, 2012

ColdFusion: CFAJAXPROXY and the HEAD tag

Posted by David Faber on March 23, 2012

A developer had an interesting problem recently, which he related on StackOverflow. Normally, when the <cfajaxproxy> tag is used, ColdFusion inserts JavaScript code immediately after the <head> tag. However, if the <head> tag has attributes, then ColdFusion tries to insert the JavaScript code within the <head> tag (i.e., before the > sign marking the end of the tag):

<html>
<head <script type="text/javascript">/* <![CDATA[ */_cf_loadingtexthtml="<img alt=' ' src='/CFIDE/scripts/ajax/resources/cf/images/loading.gif'/>";
_cf_contextpath="";
_cf_ajaxscriptsrc="/CFIDE/scripts/ajax";
_cf_jsonprefix='//';
_cf_clientid='A0A76C2B1D80B084CEAB5AA874B0CEFE';/* ]]> */</script><script type="text/javascript" src="/CFIDE/scripts/ajax/messages/cfmessage.js"></script>
<script type="text/javascript" src="/CFIDE/scripts/ajax/package/cfajax.js"></script>

<script type="text/javascript">/* <![CDATA[ */
ColdFusion.Ajax.importTag('CFAJAXPROXY');
/* ]]> */</script>

<script type="text/javascript">/* <![CDATA[ */
var _cf_article=ColdFusion.AjaxProxy.init('/test/stuff.cfc','stuff');
/* ]]> */</script>
profile="http://gmpg.org/xfn/11">
<title>Test Page</title>
</head>

At first, I thought this was merely an inconvenience – the developer in question had stated that he was using the id attribute of the <head> tag, which is not valid HTML 4.01 syntax. However, it is valid HTML5 syntax and, what’s more, it doesn’t work properly even with valid HTML 4.01 attributes for the <head> tag, as seen in the code example above. The profile attribute is valid HTML 4.01 syntax and there is no reason that ColdFusion should not detect the attribute. What ColdFusion appears to be doing is searching for the first occurrence of <head, advancing one character (no matter what that character is), and then inserting the JavaScript code. That strikes me as pretty poor programming. The <cfajaxproxy> tag should either search for a regular expression ("(<head>|<head\s[^>]*>)") and insert the code after the match, or traverse the elements of the page’s DOM tree. I now have no qualms with stating that this is a bug. (The <cfajaxproxy> tag was introduced in ColdFusion 8; the bug exists in at least ColdFusion 8 and ColdFusion 9 – I don’t know about ColdFusion 10.)
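To make the regular-expression suggestion concrete, here is a minimal sketch in Python (chosen only for illustration; it is not how ColdFusion is implemented). The function name insert_after_head and the sample page are hypothetical. It finds the first complete opening <head> tag, with or without attributes, and inserts a snippet after it.

```python
import re

# Matches a bare <head> or a <head ...> with attributes, case-insensitively.
# The \s after "head" keeps tags such as <header> from matching.
HEAD_TAG = re.compile(r"<head>|<head\s[^>]*>", re.IGNORECASE)


def insert_after_head(html: str, snippet: str) -> str:
    """Insert snippet right after the first complete opening <head> tag."""
    match = HEAD_TAG.search(html)
    if match is None:
        return html  # no <head> tag found; leave the page untouched
    return html[:match.end()] + snippet + html[match.end():]


# Hypothetical page using the valid HTML 4.01 profile attribute.
page = (
    '<html>\n<head profile="http://gmpg.org/xfn/11">\n'
    "<title>Test Page</title>\n</head>"
)
print(insert_after_head(page, "<script>/* injected */</script>"))
```

One limitation of this sketch: an attribute value that itself contains a > character would end the match early, which is one argument for the DOM-based alternative.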

All that said, how do we work around this issue? Fixing tons of web pages certainly isn’t convenient, but there may be some options. One, you can put in an additional <head> tag (with its closing tag) above the “problematic” one. I have no idea what sort of side effects this may produce. Two, you can put a set of phony tags above the problematic one: <fake<head></fake<head>. Again, I have no idea what side effects this may produce. Your JavaScript code certainly won’t be in the <head> tag if you take this route.

Three, you can add some code to the onRequest() method of your Application.cfc file (yes, there are issues with web services when using onRequest(), but those can be addressed):

<cfsilent>
<!--- Grab the requested page. --->
<cfsavecontent variable="local.target_page">
<cfinclude template="#arguments.target_page#" />
</cfsavecontent>

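<!--- Strip the attributes from the first <head ...> tag so that the injected JavaScript lands after a plain <head>. Note that REReplace is case-sensitive and replaces only the first match by default. --->
<cfset local.target_page = REReplace(local.target_page, "(<head\s[^>]*>)", "<head>") />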
<cfajaxproxy cfc="test.stuff" jsclassname="stuff" />
</cfsilent>
<cfoutput>#local.target_page#</cfoutput>

Oddly, the above works even when the call to <cfajaxproxy> is on the target page instead of in onRequest()! My guess is that the tag’s output is not inserted until the page is actually served. I don’t know how this will affect the performance of the page, and I don’t know whether it will work with complex code (I tried it only with a rather uncomplicated page). It does appear to work at first glance, but it’s a workaround. The real issue is that there is a bug.

Update: Bug reported to Adobe.

Posted in ColdFusion

Solr: Showing faceted search stems in human-readable terms

Posted by David Faber on March 12, 2012

A fascinating question came up on StackOverflow. Suppose you have a Solr core (a collection, for you ColdFusion peeps) and you want to return the most common terms found in the index. If you facet on a field that has stemming enabled, Solr will return the stems rather than the original terms: you will see entries like associ, studi, signific, and increas, which is generally not the sort of thing you want to show to your end users. However, if you use highlighting as well as faceting, fragments or snippets from the fields that match will be returned along with the search results (and along with the facet results), and you can then examine those snippets for the matching terms in a format that is readable by humans. For example, if you do the following –

?q=keyword&facet=true&facet.field=description&hl=true&hl.fl=description&hl.fragsize=0&hl.simple.pre=[&hl.simple.post=]

– then the matching terms will be returned in the highlighting structure wrapped in square brackets. You can then examine those results using regular expressions to pull out the friendly matching terms. One caveat is that unless your index is very small, you will likely only be able to retrieve a sampling of the terms matching the stems. The reason for this is that highlighting returns only those fields relevant to the documents (or records) returned by the query, and is dependent on the number of rows specified.
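As a rough illustration of the regular-expression step, here is a minimal Python sketch that pulls the bracketed terms out of a list of highlighting snippets and counts them. The function name highlighted_terms and the sample snippets are made up for the example; they are not real Solr output.

```python
import re
from collections import Counter

# A term wrapped in the square brackets set via hl.simple.pre / hl.simple.post.
BRACKETED = re.compile(r"\[([^\[\]]+)\]")


def highlighted_terms(snippets):
    """Count the bracketed terms found in a list of highlighting snippets."""
    counts = Counter()
    for snippet in snippets:
        for term in BRACKETED.findall(snippet):
            counts[term.lower()] += 1
    return counts


# Hypothetical snippets, shaped like values from the highlighting structure.
sample = [
    "A [significant] [increase] was [associated] with the treatment.",
    "Two [studies] found no [significant] effect.",
]
print(highlighted_terms(sample).most_common())
```

If the field contents can legitimately contain square brackets, a less common pair of markers would be a safer choice, since this sketch cannot tell the two apart.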

Update: Well, it appears that I was trying to do too much here. This won’t work as written. It can’t be done in a single query. Rather, what you would need to do is to use faceting to get the top indexed terms:

?q=*:*&facet=true&facet.field=description&facet.limit=20&rows=20

This will return the top 20 (parameter facet.limit) indexed terms. You can then query Solr with highlighting to retrieve the terms that actually match the stemmed terms:

?q=stem&hl=true&hl.fl=description&hl.fragsize=0&hl.simple.pre=[&hl.simple.post=]&rows=20

Twenty rows should be a good number to find a decent sampling of matched terms.
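The two-step approach might be wired together roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, wt=json is added so the response can be read as JSON, and the sample response is invented data in the flat term/count layout that Solr's JSON writer uses for facet fields by default. No request is actually sent.

```python
from urllib.parse import urlencode


def top_terms_query(field, limit=20):
    """Query string for step one: the most common indexed terms of a field."""
    params = {
        "q": "*:*",  # Solr's match-all query
        "facet": "true",
        "facet.field": field,
        "facet.limit": limit,
        "rows": 20,
        "wt": "json",  # assumption: ask for a JSON response
    }
    return "?" + urlencode(params)


def highlight_query(stem, field, rows=20):
    """Query string for step two: highlight the terms that match one stem."""
    params = {
        "q": stem,
        "hl": "true",
        "hl.fl": field,
        "hl.fragsize": 0,
        "hl.simple.pre": "[",
        "hl.simple.post": "]",
        "rows": rows,
        "wt": "json",  # assumption: ask for a JSON response
    }
    return "?" + urlencode(params)


def stems_from_facets(response, field):
    """Pair up the flat [term, count, term, count, ...] facet list."""
    flat = response["facet_counts"]["facet_fields"][field]
    return list(zip(flat[0::2], flat[1::2]))


# Hypothetical facet response, trimmed to the part used here.
sample_response = {
    "facet_counts": {"facet_fields": {"description": ["associ", 12, "studi", 9]}}
}
for stem, count in stems_from_facets(sample_response, "description"):
    print(stem, count, highlight_query(stem, "description"))
```

Building the query strings with urlencode also takes care of escaping the square brackets and the *:* query.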

Posted in Solr

Solr: Knowing which fields match

Posted by David Faber on March 9, 2012

How do you know which fields match your query? For example, if we search our articles index for “malaria,” and we want to know whether we matched the term in title, description, and/or journal_name, how do we go about doing that?

The answer is to turn on highlighting for those fields: hl=on&hl.fl=title,description,journal_name. Solr will return a highlighting structure keyed by the unique key of each matching record, listing each field that matched along with a snippet of that field’s contents with the matching text highlighted. N.B.: Fields to be highlighted must be stored, but they do not actually need to be indexed. How cool is that?
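Reading the matched fields out of the response could look something like this minimal Python sketch. It assumes a JSON response that has already been parsed into a dictionary; the function name matched_fields, the document keys, and the snippets are all invented for the example.

```python
def matched_fields(response, fields=("title", "description", "journal_name")):
    """Map each document's unique key to the fields that produced a highlight."""
    result = {}
    for doc_id, per_field in response.get("highlighting", {}).items():
        # A field counts as matched only if it has at least one snippet.
        result[doc_id] = [name for name in fields if per_field.get(name)]
    return result


# Hypothetical response for a search on "malaria", trimmed to the
# highlighting section; <em> is Solr's default highlight marker.
sample = {
    "highlighting": {
        "article-1": {"title": ["<em>Malaria</em> in the tropics"]},
        "article-2": {
            "description": ["... treating <em>malaria</em> ..."],
            "journal_name": ["<em>Malaria</em> Journal"],
        },
        "article-3": {},
    }
}
for doc_id, names in matched_fields(sample).items():
    print(doc_id, names)
```

A document with an empty entry, like article-3 here, matched the query somewhere other than the fields listed in hl.fl.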

Posted in Solr