Converting .htaccess rules to Undertow text-based rules with CommandBox

Shout Out!
First of all, thank you, Brad Wood, for guiding me through converting my rules, and for your knowledge, expertise, and time in helping the CFML community. You have been a tremendous help to me personally and to the whole community.

I Can Do This, So Can You!
My knowledge of rewrite rules with .htaccess, Undertow, and Tuckey is at a novice level, which goes to show that if I can do it, anyone can!


Converting .htaccess Rules to Undertow text-based Rules
The beauty of a text-based set of rules is that there is no need to escape characters, as there is when converting rules to a JSON configuration, and no need to write rules in the long form that an XML configuration requires.

.htaccess File
The following is my real-life .htaccess file to use as a reference:

# 
#  Everything is UTF-8 
# 
# AddType 'text/html; charset=UTF-8' html 
# AddType 'text/css; charset=UTF-8' css 
# 
# - correct way to serve js according to http://www.ietf.org/rfc/rfc4329.txt 
# - IE doesn't like application/javascript but on the other hand it doesn't 
#   obey Content-Type headers so it's ok 
# AddType 'application/javascript; charset=UTF-8' js 
# disable directory browsing 
# Options -Indexes 
# 
#  The order of the rewrite rules is generally important 
RewriteEngine On 
RewriteBase / 
#  If we are in the admin area or we already have the language variable skip this block of rules 
RewriteCond %{REQUEST_URI} ^/(_admin|_docs|_cfc|assets|images) 
RewriteRule .* - [S=6] 
# 
#  The language must be the first thing in the URL. If it's present then set 
# and env variable, remove it from the URL and continue with other rules 
# 
#  e.g. en/dir/file.cfm; en/dir1/dir2/file.cfm; en/dir/file.php 
RewriteRule ^(en|jp)/(.+/)*(.+)\.(cfm|php)                                 $1/$2$3.$4 [QSA] 
#  e.g. en/; en/dir/; en/dir1/dir2/ 
RewriteRule ^(en|jp)/(.+/)*$                                               $1/$2      [QSA] 
#  Details for a news, etc: /dir/dir/#fuseaction#/#id#-#name# 
RewriteRule ^(en|jp)/((?:[a-zA-Z0-9\-_\\]+/)*)([a-zA-Z0-9\-_\\]+)/(\d+)-?(.*)$  $1/$2index.cfm?fuseaction=$3&id=$4&title=$5 [NC,NS,L,QSA] 
#  Download asset /dir/dir/download_asset/#name# 
RewriteRule ^(en|jp)/((?:[a-zA-Z0-9\-_\\]+/)*)(download_asset)/([^.]+\.\w{3,})-?.*$  $1/$2index.cfm?fuseaction=$3&id=$4 [NC,NS,L,QSA] 
#  Download asset /dir/dir/products/#type# 
RewriteRule ^(en|jp)/((?:[a-zA-Z0-9\-_\\]+/)*)products/(\w+)$  $1/$2index.cfm?fuseaction=product_details&id=$3 [NC,NS,L,QSA] 
#  Simple fuseaction (/directory/directory/#fuseaction#) 
RewriteRule ^(en|jp)/((?:[a-zA-Z0-9\-_\\]+/)*)([a-zA-Z0-9\-_\\]+)$           $1/$2index.cfm?fuseaction=$3 [QSA] 
RewriteCond %{REQUEST_URI} ^/(testbed|assets/script/charting_library) 
RewriteRule .* - [S=1] 
# Ignore revision numbers in JS and CSS file names 
RewriteRule ^((?:[a-z0-9\-_\\]+/)*)(\d+)\.(.+)\.(js|css)$ $1$3.$4 [NC,NS,L] 
RewriteCond %{HTTP_REFERER} !^http(s)?://(www\.)?ccpb.*$ [NC] 
RewriteRule ^_data/.+$ - [F]
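
Before converting a file like this, it can help to take stock of which flags it actually uses, since each flag needs its own Undertow equivalent. The following is a minimal Python sketch of such an inventory, not part of CommandBox. The function name count_flags and the shortened sample are my own, and the flag detection is a simple heuristic that assumes the flags are the last bracketed token on the line.

```python
import re
from collections import Counter

def count_flags(htaccess_text):
    """Count the flags used by RewriteRule/RewriteCond lines, ignoring comments."""
    counts = Counter()
    for line in htaccess_text.splitlines():
        line = line.strip()
        if not line.startswith(("RewriteRule", "RewriteCond")):
            continue
        # Heuristic: flags are a trailing [A,B,C] token preceded by whitespace.
        match = re.search(r"\s\[([^\]\s]+)\]\s*$", line)
        if match:
            for flag in match.group(1).split(","):
                # Record "S" for "S=6" so all skip flags are counted together.
                counts[flag.strip().split("=")[0].upper()] += 1
    return counts

# Shortened, hypothetical excerpt of the file above.
sample = """
RewriteCond %{REQUEST_URI} ^/(_admin|_docs|_cfc|assets|images)
RewriteRule .* - [S=6]
RewriteRule ^(en|jp)/(.+/)*$ $1/$2 [QSA]
RewriteRule ^_data/.+$ - [F]
"""
print(count_flags(sample))
```

Each flag it reports is one to look up in the conversion chart further down.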

Start CommandBox
All of the suggested CLI commands assume that we are running the CommandBox application. To do so, run the following command to start CommandBox.
$ box

Configure CommandBox to point to rules.txt
Point the CommandBox server to the rules.txt file that you will create later. In my example, it is located in the web root, but feel free to place it wherever suits your setup.
$ server set web.rulesFile=rules.txt

.htaccess Acronym to Undertow Conversion Chart

  • In a server rule, the rewrite condition is more or less on the left of -> and the rewrite rule is more or less on the right.
  • [NC] means No Case (case-insensitive matching)
    • Use case-sensitive=false with the regex method
    • e.g. regex( case-sensitive=false, pattern='^/(_admin|_docs|_cfc|assets|images)' )
  • [NS] means the rule is not applied to internal “sub requests”, but I’ll be honest, I’m not even sure what that means or whether it applies here
  • [L] means Last rule to process. That’s doable, but there are some caveats to what that means in CommandBox (since it also skips any remaining internal rules)
    • Use done; in the c
  • [QSA] means Query String Append
    • I found the short-hand Exchange Attribute “%q”, which can be used in predicates. The long-hand version is %{QUERY_STRING}. However, I wound up realizing that I did not need %q in my Undertow rules, even though QSA is applied in the .htaccess file. You may want to test your rules with and without %q to determine whether you need it. See the Exchange Attributes entry under References for the full list of attributes.
  • [S] means Skip. For example, [S=6] means skip the next 6 rewrite rules in the .htaccess file example above.
    • Use nested rules instead: put the rules that would be skipped inside a block whose parent predicate is the negated condition, so they only run when the parent matches
  • [F] means Forbidden: the request is refused with a 403 Forbidden response
    • Use the response-code( 403 ) handler
  • The .htaccess $n back-references need to be written as ${n} in Undertow.
  • In a .htaccess file, the pattern is matched against the request path without its leading forward slash, but in Undertow the path starts with “/”, so the slash must be added to the regex in the predicate (^(en|jp)/ becomes ^/(en|jp)/). When I compared the rule regexes against an ordinary online regex validator, I got hung up on the forward slashes “/” because the validator expected them to be escaped. Forward slashes in rule regexes are literal forward slashes and do not have to be escaped.
    I was able to verify this with an online tester meant for .htaccess, which you can also use to test your own .htaccess rewrite regexes: https://htaccess.madewithlove.com/
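
The last two chart items (the leading slash and $n becoming ${n}) are mechanical enough to sketch in a few lines of Python. This is only an illustration of the two text transformations, not a full converter. The helper names to_undertow_pattern and to_undertow_target are hypothetical, and the sketch assumes plain substitutions (it does not handle a "-" target, flags, or conditions).

```python
import re

def to_undertow_pattern(htaccess_pattern: str) -> str:
    """Add the leading slash that the path carries when Undertow matches it."""
    if htaccess_pattern.startswith("^/"):
        return htaccess_pattern          # already anchored at the slash
    if htaccess_pattern.startswith("^"):
        return "^/" + htaccess_pattern[1:]
    return htaccess_pattern              # unanchored patterns are left alone

def to_undertow_target(htaccess_target: str) -> str:
    """Turn $1-style back-references into ${1} and root the target at '/'."""
    converted = re.sub(r"\$(\d)", r"${\1}", htaccess_target)
    return converted if converted.startswith("/") else "/" + converted

pattern = to_undertow_pattern(r"^(en|jp)/(.+/)*$")
target = to_undertow_target("$1/$2")
print(f"regex( pattern='{pattern}' ) -> rewrite( '{target}' )")
```

The printed line is only a starting point; flags such as [NC] or [L] still have to be translated by hand using the chart above.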

rules.txt File - Undertow text-based Rewrite Rules
The following is my converted set of rewrite rules meant for Undertow to utilize with CommandBox.

not regex( case-sensitive=false, pattern='^/(_admin|_docs|_cfc|assets|images)' ) -> {
	regex( case-sensitive=false, pattern='^/(en|jp)/(.+/)*(.+)\.(cfm|php)') -> {
		rewrite( '/${1}/${2}${3}.${4}' );
	}
	regex( case-sensitive=false, pattern='^/(en|jp)/(.+/)*$') -> {
		rewrite( '/${1}/${2}' );
	}
	regex( case-sensitive=false, pattern='^/(en|jp)/((?:[a-zA-Z0-9\-_\\]+/)*)([a-zA-Z0-9\-_\\]+)/(\d+)-?(.*)$') -> {
		rewrite( '/${1}/${2}?fuseaction=${3}&id=${4}&title=${5}' );
		done;
	}
	regex( case-sensitive=false, pattern='^/(en|jp)/((?:[a-zA-Z0-9\-_\\]+/)*)(download_asset)/([^.]+\.\w{3,})-?.*$') -> {
		rewrite( '/${1}/${2}?fuseaction=${3}&id=${4}' );
		done;
	}
	regex( case-sensitive=false, pattern='^/(en|jp)/((?:[a-zA-Z0-9\-_\\]+/)*)products/(\w+)$') -> {
		rewrite( '/${1}/${2}?fuseaction=product_details&id=${3}' );
		done;
	}
	regex( case-sensitive=false, pattern='^/(en|jp)/((?:[a-zA-Z0-9\-_\\]+/)*)([a-zA-Z0-9\-_\\]+)$') -> {
		rewrite( '/${1}/${2}?fuseaction=${3}' );
	}
}
not regex( case-sensitive=false, pattern='^/(testbed|assets/script/charting_library)' ) and regex( case-sensitive=false, pattern='^(/(?:[a-z0-9\-_\\]+/)*)(\d+)\.(.+)\.(js|css)$' ) -> {
	rewrite( '/${1}${3}.${4}' );
	done;
}
regex( pattern='^/_data/.+$' ) and not regex( case-sensitive=false, pattern='^https?://(www\.)?domain.*$', value='%{i,Referer}' ) -> {
	response-code( 403 );
}
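
To sanity-check a single pattern and its rewrite target before starting the server, one rule can be approximated offline. The following Python sketch is only a rough stand-in. It assumes Python's re module behaves like Undertow's Java regex engine for these particular patterns, it ignores nesting and done, and the helper name apply_rule and the sample paths are hypothetical.

```python
import re

def apply_rule(path, pattern, target, case_sensitive=False):
    """Return the rewritten path, or None when the pattern does not match."""
    flags = 0 if case_sensitive else re.IGNORECASE
    match = re.search(pattern, path, flags)
    if match is None:
        return None
    # Replace each ${n} with capture group n; an unmatched group becomes "".
    return re.sub(
        r"\$\{(\d+)\}",
        lambda ref: match.group(int(ref.group(1))) or "",
        target,
    )

# The "products" rule from rules.txt above.
pattern = r"^/(en|jp)/((?:[a-zA-Z0-9\-_\\]+/)*)products/(\w+)$"
target = "/${1}/${2}?fuseaction=product_details&id=${3}"

print(apply_rule("/en/catalog/products/widgets", pattern, target))
print(apply_rule("/_admin/index.cfm", pattern, target))
```

The rewrites log described in the next section remains the real test, since it shows what Undertow actually did with each request.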

Testing Undertow Rewrite Rules
I searched high and low for an online Undertow rules tester but came up with nothing. However, CommandBox has an incredible built-in alternative! Run the following CLI commands to enable the rewrites log, find out where it is written, and follow every server hit in the CommandBox CLI console.
$ server set web.rewrites.logEnable=true
$ server info property=rewritesLogPath
$ server log --follow --rewrites
Or, for a named server:
$ server log myServerName --follow --rewrites

Your rewrites log will be auto-rotated every 10MB. The amount of information that appears in the rewrites log will be affected by the --debug and --trace flags when you start the server.

To trace, start the server with the --trace flag. If your server is already running, $ stop it first and then run the following.
$ start --console --trace
Brad Wood recommends tracing over the other options, but there is also a debug option that can be used:

Debugging - another option instead of tracing:
$ start --console --debug

Once your server has finished spinning up, browse the website and the CommandBox CLI will show the rewrite rules as they are applied.


(Alternative to the above)
Configuring CommandBox to use .htaccess directly instead of converting the rewrite rules as described above

According to Brad, .htaccess support is spotty in his experience when you attempt to configure it in CommandBox. It does work, but you will want to make certain that all of your rules behave as expected. That being said, if you want to try using .htaccess with CommandBox, use the following command to add the configuration to your server.json, which will most likely reside in your web root directory unless you have explicitly placed it elsewhere. Be sure to restart your server for the configuration to take effect.
$ server set web.rewrites.config=htdocs/.htaccess


References
Rules Examples - CommandBox
Server Logs - CommandBox
URL Rewrites - CommandBox
Exchange Attributes - Undertow
.htaccess Tester
