Sorry I haven't blogged in a while, but I've been very busy working on a project. Part of the project requires that I convert HTML to formatted plain text. On the surface, this may seem simple (just use a RegEx to remove the HTML), but the key word in that first sentence was "formatted."
One of the many issues I've run into is that none of the built-in ColdFusion string manipulation functions account for the "visual" length of a string. Since one of the things I needed to do was wrap text after a set number of visual characters, I needed a function that, unlike the standard len() function, would return the length of a string as it would appear on the screen. This means I have to take into account how many "spaces" a tab would occupy on the screen.
My first attempt was simply to count every tab character (chr(9)) as 8 spaces. While this number ensured I would never go past the right edge of the content, it wasn't very accurate (a tab can vary from 1 to 8 spaces in Windows, depending on where it falls in the line). I quickly started running into problems when I realized that for some functionality (like centering text), I'd really need an accurate count of the total number of visual spaces a string was occupying.
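To make the tab-stop arithmetic concrete, here is a minimal sketch in Python rather than ColdFusion. It is not the UDF from this post. The name `visual_len` and the `tab_width` parameter are made up for illustration, and it assumes tab stops fall at every 8th column and that every other character takes exactly one column.

```python
def visual_len(text, tab_width=8):
    """Return how many screen columns `text` occupies (hypothetical helper).

    Assumes tab stops at every `tab_width` columns and one column for
    every non-tab character.
    """
    column = 0
    for ch in text:
        if ch == "\t":
            # A tab jumps to the next tab stop, so it is worth
            # anywhere from 1 to tab_width columns.
            column += tab_width - (column % tab_width)
        else:
            column += 1
    return column


if __name__ == "__main__":
    sample = "a\tb"
    print(len(sample))         # raw character count
    print(visual_len(sample))  # columns occupied on screen
```

For the string "a", tab, "b", the raw length is 3 and counting the tab as a flat 8 gives 10, but the tab only advances from column 1 to column 8, so the string really occupies 9 columns.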
As I was thinking about the problem, I decided to do a quick Google search to see if I could find anything that solved it. I came across a post from a mailing list dedicated to NEdit (an X Window editor). While the solution is written for an NEdit macro, the logic was easily replicated in ColdFusion.
So, here's the code translated for ColdFusion. If you're wondering, the wrapText() UDF I wrote supports auto-indenting, smart indenting (for ordered/unordered lists), and prepending/appending data to each line. It also correctly wraps lines based upon the visual representation of the string, unlike the built-in wrap() function, which assumes a tab occupies a single space.
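As a rough illustration of why wrapping needs the visual width, here is a small Python sketch of a greedy word wrap that measures each candidate line in screen columns. It is not the wrapText() UDF and has none of its indenting features. The names `visual_len` and `wrap_visual` are hypothetical, and it assumes tab stops at every 8th column and that words are separated by single spaces.

```python
def visual_len(text, tab_width=8):
    """Screen columns occupied by `text`, expanding tabs to tab stops."""
    column = 0
    for ch in text:
        if ch == "\t":
            column += tab_width - (column % tab_width)
        else:
            column += 1
    return column


def wrap_visual(line, width, tab_width=8):
    """Greedy wrap: break before a word that would push the line past `width`.

    A single word wider than `width` is left on a line of its own.
    """
    lines = []
    current = ""
    for word in line.split(" "):
        candidate = word if not current else current + " " + word
        if current and visual_len(candidate, tab_width) > width:
            lines.append(current)
            current = word
        else:
            current = candidate
    if current:
        lines.append(current)
    return lines


if __name__ == "__main__":
    for piece in wrap_visual("\tone two three", 12):
        print(repr(piece))
```

With a width of 12, a wrap based on the raw character count would keep the leading tab, "one" and "two" together (8 characters), even though that line is 15 columns wide on screen. Measuring visually breaks after "one" instead.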
NOTE: The above UDF does not account for some hidden visual characters. You may need to modify the code to account for various other characters (e.g. carriage returns). In my project, I'm dealing with individual lines from a block of text, where each line has the carriage return and newline characters stripped out.