Using Find and Replace
In word processors, the Search feature is considered an editing tool. You can certainly use it just that way while writing in Studio, but the extended search capabilities make it a powerful tool to update site content.
You can find and replace alphanumeric strings (including regular expressions) across folders and projects and choose how tags are processed by the search engine.
Search commands
Studio provides both basic and extended levels of search and replace to help you maintain your Web pages.
To search the current document:
- Select Search > Find (Ctrl + F) to open the Find dialog box.
- Enter the text you want to locate in the Find what box.
If you highlight text in the editor, it displays in the Find what box.
- Set the Match case, Match whole words, and Direction options.
- Click the Find Next button to sequentially highlight each match.
- To resume the search from the current cursor position after the search dialog box is closed, press F3.
To replace text in the current document:
- Select Search > Replace (Ctrl + R) to open the Replace dialog box.
- Enter the Find and Replace text in their boxes.
- Set the Match case and Match whole words options.
In addition to the Up and Down Direction options, you can restrict the search to just a part of the document by highlighting a block of text in the editor and picking Selection.
- You can do a selective Replace or choose to Replace All matches.
The last 10 items are saved in the Find what and Replace what dropdown lists.
Using extended search and replace features
For more complex operations across multiple documents, use the Extended Find or Extended Replace commands. These commands offer a number of options to refine your search:
- You can run either command against the current document, all open documents, folders, or projects.
- Click the arrow button next to the Find what box to selectively save and reuse entries.
- The In folders option lets you restrict searches to files with specified extensions and to just the root folder.
- Check the Match Case option for case-sensitive searches.
- Check Regular expressions to enable parsing of regular expression entries. See "Searching with Regular Expressions" for details on Studio's implementation of RegExp syntax.
- Select the Skip Tags While Searching option to search the page content only, excluding the tags themselves. This option is not available when the Regular expressions option is enabled.
- The Extended Replace dialog box lets you back up files before making replacements at the folder or project level.
The Results pane displays a list of locations where the matched string was replaced. Double-click on a match in the list to highlight it in the document. Right-click in the Results pane to clear the pane or close it.
Note | The Extended Replace command skips read-only files. |
Replacing special characters
Use the Search > Replace Special Characters command to either replace extended characters with their HTML equivalents, or replace HTML tags with the equivalent extended characters. This command works only in the current document.
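The command itself runs inside Studio, but the kind of transformation it performs can be sketched outside the editor. The Python snippet below is only an illustration: it assumes that the "HTML equivalents" are named character entities such as &eacute;, and the function name encode_extended is hypothetical, not part of Studio.

```python
import html
from html.entities import codepoint2name


def encode_extended(text: str) -> str:
    """Replace non-ASCII characters with named HTML entities where one exists."""
    out = []
    for ch in text:
        code = ord(ch)
        if code > 127 and code in codepoint2name:
            out.append(f"&{codepoint2name[code]};")
        else:
            # ASCII characters (and characters with no named entity) pass through.
            out.append(ch)
    return "".join(out)


encoded = encode_extended("café © 2000")
decoded = html.unescape(encoded)  # the reverse direction
print(encoded)
print(decoded)
```

The round trip returns the original text, which is a quick way to check that no character was lost in either direction.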
Replacing double-spaced lines
Because of the way different operating systems treat carriage returns, text files saved on UNIX or Macintosh systems may become double-spaced when opened in Studio. Use the Search > Replace Double Spacing with Single Spacing command to collapse double-spaced lines to single-spaced lines in the current document.
Searching with Regular Expressions
Studio supports searching with regular expressions (or RegExp) to match patterns in character strings in the Extended Find and Replace commands. Regular expressions allow you to specify all the possible variants in a search and to precisely control replacements. Ordinary characters are combined with special characters to define the pattern for the search. The RegExp parser evaluates the selected files and returns each matching pattern.
In the Find command, each matching pattern is added to the find list. In the Replace operation, each match triggers insertion of the replacement string. When replacing a string, it is just as important to be sure of what the expression does not match as of what it does. Simple regular expressions can be concatenated into complex search criteria. Note that enabling the Regular expressions option in the Extended dialog boxes disables the Skip tags while searching option.
Thanks to Team Allaire member Christopher Bradford for his ongoing support of RegExp issues in the ColdFusion Support Forum at http://forums.allaire.com/DevConf/index.cfm.
Note | The rules listed in this section are for creating regular expressions in ColdFusion. The rules used by other RegExp parsers may differ. |
Studio's RegExp engine processes the entire document; it does not parse on a line-by-line basis. This affects the way characters such as the asterisk (*), caret (^), and dollar sign ($) should be used.
Special characters
Because special characters are the operators in regular expressions, in order to represent a special character as an ordinary one, you need to precede it with a backslash. To represent a backslash, for instance, use a double backslash (\\).
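The effect of escaping can be tried outside Studio. The sketch below uses Python's re module, whose syntax differs from Studio's engine in places, so treat it as an illustration of the escaping rule rather than of Studio itself. The sample path is made up for the example.

```python
import re

# A literal string containing several special characters: \ ( ) .
literal = r"C:\temp\file(1).txt"
haystack = "Saved to " + literal

# Unescaped, ( ) . are operators, so the text is not found as written.
unescaped = re.search(r"file(1).txt", haystack)

# Escaped by hand: each special character is preceded by a backslash.
escaped = re.search(r"file\(1\)\.txt", haystack)

# A double backslash in the pattern matches one literal backslash.
backslash_count = len(re.findall(r"\\", literal))

# re.escape applies the same rule automatically to a whole string.
auto = re.search(re.escape(literal), haystack)

print(unescaped, escaped.group(0), backslash_count, auto.group(0))
```

Escaping by hand is fine for a short pattern; for a long literal string, a helper such as re.escape avoids missing a character.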
Single-character regular expressions
This section describes the rules for creating regular expressions. You can use regular expressions in the Search > Extended Find and Replace commands to match complex string patterns.
The following rules govern one-character RegExp that match a single character:
- Special characters are:
+ * ? . [ ^ $ ( ) { | \
- Any character that is not a special character matches itself.
- Use the keyboard (Tab, Enter) to match whitespace characters.
- The asterisk (*) matches the specified characters throughout the entire document.
- The caret (^) matches the beginning of the document.
- The dollar sign ($) matches the end of the document.
- A backslash (\) followed by any special character matches the literal character itself, that is, the backslash escapes the special character.
- A period (.) matches any character, including newline. To match any character except a newline, use [^#chr(13)##chr(10)#], which excludes the ASCII carriage return and line feed codes.
- A set of characters enclosed in brackets ([]) is a one-character RegExp that matches any of the characters in that set. For example, "[akm]" matches an "a", "k", or "m". Note that if you want to include ] (closing square bracket) in square brackets, it must be the first character. Otherwise, it won't work even if you use \].
- Any regular expression can be followed by one of the following suffixes: {m,n} forces a match of m through n (inclusive) occurrences of the preceding regular expression. The suffix {m,} forces a match of at least m occurrences of the preceding regular expression. The syntax {,n} is not allowed.
- A range of characters can be indicated with a dash. For example, "[a-z]" matches any lowercase letter. However, if the first character of the set is the caret (^), the RegExp matches any character except those in the set. It does not match the empty string. For example: [^akm] matches any character except "a", "k", or "m". The caret loses its special meaning if it is not the first character of the set.
- All regular expressions can be made case insensitive by substituting individual characters with character sets, for example, [Nn][Ii][Cc][Kk].
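Several of the rules above can be checked with a few short searches. The sketch below uses Python's re module as a stand-in, which is an assumption: it follows the same rules for sets, ranges, negation, and escaping, but unlike Studio its period matches a newline only when the DOTALL flag is set. The sample sentence is invented for the example.

```python
import re

text = "Nick met NICK and nick at 10 a.m."

# A bracketed set matches any one of its characters.
set_hits = re.findall(r"[akm]", "make")

# A leading caret negates the set.
negated_hits = re.findall(r"[^akm]", "make")

# A dash gives a range; {2} asks for exactly two occurrences.
range_hits = re.findall(r"[0-9]{2}", text)

# A backslash turns the period into an ordinary character.
literal_dots = re.findall(r"a\.m\.", text)

# A closing bracket is literal when it comes first in the set.
bracket_hits = re.findall(r"[]x]", "a]x")

# Case-insensitive matching built from character sets.
nick_hits = re.findall(r"[Nn][Ii][Cc][Kk]", text)

# Python needs DOTALL for the period to match a newline.
dot_hits = re.findall(r"a.b", "a\nb", re.DOTALL)

print(set_hits, negated_hits, range_hits, literal_dots)
print(bracket_hits, nick_hits, dot_hits)
```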
Character classes
You can specify a character by using one of the POSIX character classes. You enclose the character class name inside two square brackets, as in this Replace example:
("Allaire's Web Site","[[:space:]]","*","ALL")
This code replaces all the spaces with *, producing this string:
Allaire's*Web*Site
The following table shows the POSIX character classes that Studio supports.
Supported Character Classes | |
---|---|
Character Class | Matches |
alpha | Matches any letter. Same as [A-Za-z]. |
upper | Matches any upper-case letter. Same as [A-Z]. |
lower | Matches any lower-case letter. Same as [a-z]. |
digit | Matches any digit. Same as [0-9]. |
alnum | Matches any alphanumeric character. Same as [A-Za-z0-9]. |
xdigit | Matches any hexadecimal digit. Same as [0-9A-Fa-f]. |
space | Matches a tab, new line, vertical tab, form feed, carriage return, or space. |
print | Matches any printable character. |
punct | Matches any punctuation character, that is, one of ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~ |
graph | Matches any of the characters defined as a printable character except those defined to be part of the space character class. |
cntrl | Matches any character not part of the character classes [:upper:], [:lower:], [:alpha:], [:digit:], [:punct:], [:graph:], [:print:], or [:xdigit:]. |
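To experiment with these classes outside Studio, the "Same as" column of the table can be used as a translation table. The sketch below is written in Python, whose re module does not accept the [:name:] syntax, so a hypothetical helper named expand_posix rewrites each class into the explicit set listed above before searching. Only the classes with an explicit equivalent in the table are covered.

```python
import re

# Explicit equivalents taken from the table above.
POSIX_CLASSES = {
    "alpha": "A-Za-z",
    "upper": "A-Z",
    "lower": "a-z",
    "digit": "0-9",
    "alnum": "A-Za-z0-9",
    "xdigit": "0-9A-Fa-f",
    "space": r" \t\n\v\f\r",
}


def expand_posix(pattern: str) -> str:
    """Rewrite [:name:] inside a bracketed set into its explicit characters."""
    for name, body in POSIX_CLASSES.items():
        pattern = pattern.replace(f"[:{name}:]", body)
    return pattern


# The Replace example from this section.
space_pattern = expand_posix("[[:space:]]")
replaced = re.sub(space_pattern, "*", "Allaire's Web Site")

# Classes can be combined inside one set.
combined = expand_posix("[[:upper:][:digit:]]+")

print(replaced)
print(combined)
```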
Multi-character regular expressions
You can use the following rules to build multi-character regular expressions:
- Parentheses group parts of regular expressions together into grouped sub-expressions that can be treated as a single unit. For example, (ha)+ matches one or more instances of "ha".
- A one-character regular expression or grouped sub-expressions followed by an asterisk (*) matches zero or more occurrences of the regular expression. For example, [a-z]* matches zero or more lower-case characters.
- A one-character regular expression or grouped sub-expressions followed by a plus (+) matches one or more occurrences of the regular expression. For example, [a-z]+ matches one or more lower-case characters.
- A one-character regular expression or grouped sub-expressions followed by a question mark (?) matches zero or one occurrences of the regular expression. For example, xy?z matches either "xyz" or "xz".
- The concatenation of regular expressions creates a regular expression that matches the corresponding concatenation of strings. For example, [A-Z][a-z]* matches any capitalized word.
- The OR character (|) allows a choice between two regular expressions. For example, jell(y|ies) matches either "jelly" or "jellies".
- Braces ({}) indicate a range of occurrences of a regular expression, in the form {m,n}, where m is an integer equal to or greater than zero that gives the start of the range and n is an integer equal to or greater than m that gives the end of the range. For example, (ba){0,3} matches up to three pairs of the expression "ba".
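The examples in the rules above can be run as written. The sketch below uses Python's re module as a stand-in for Studio's parser, which is an assumption; the helper name matches is hypothetical and simply collects the full text of every match.

```python
import re


def matches(pattern: str, text: str) -> list:
    """Return the full text of every match of pattern in text."""
    return [m.group(0) for m in re.finditer(pattern, text)]


# Grouping with +: one or more instances of "ha".
laugh = matches(r"(ha)+", "ha haha hahaha")

# ? makes the preceding character optional.
optional = matches(r"xy?z", "xyz xz xyyz")

# Concatenation: an uppercase letter followed by lowercase letters.
capitalized = matches(r"[A-Z][a-z]*", "Studio and Allaire")

# Alternation inside a group.
jelly = matches(r"jell(y|ies)", "jelly jellies jello")

# {0,3} allows up to three repetitions of the group.
up_to_three = re.fullmatch(r"(ba){0,3}", "bababa") is not None
four_is_too_many = re.fullmatch(r"(ba){0,3}", "babababa") is None

print(laugh, optional, capitalized, jelly, up_to_three, four_is_too_many)
```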
Backreferences
Studio supports backreferencing, which allows you to match text in previously matched sets of parentheses. A backslash followed by a digit n (\n) is used to refer to the nth parenthesized sub-expression.
One example of how backreferencing can be used is searching for doubled words, for example, to find instances of "the the" or "is is" in text. The following example shows the syntax you use for backreferencing in regular expressions:
("There is is coffee in the the kitchen", "([A-Za-z]+)[ ]+\1","*","ALL")
This code searches for words that are all letters ([A-Za-z]+) followed by one or more spaces [ ]+ followed by the first matched sub-expression in parentheses. The parser detects the two occurrences of is as well as the two occurrences of the and replaces them with an asterisk, resulting in the following text:
There * coffee in * kitchen
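The same search can be reproduced outside Studio. The sketch below uses Python's re module, which is an assumption about equivalence rather than a description of Studio's engine; the \b word-boundary marker in the last pattern is Python syntax and is not listed among the rules in this section.

```python
import re

text = "There is is coffee in the the kitchen"
doubled = r"([A-Za-z]+)[ ]+\1"

# Replace each doubled word with an asterisk, as in the example above.
starred = re.sub(doubled, "*", text)

# Using \1 in the replacement keeps a single copy of the word instead.
repaired = re.sub(doubled, r"\1", text)

# Caution: the pattern also matches inside words, for example "is is"
# within "this is". Word boundaries (Python syntax) prevent that.
loose = re.sub(doubled, "*", "this is fine")
strict = re.sub(r"\b([A-Za-z]+)[ ]+\1\b", "*", "this is fine")

print(starred)
print(repaired)
print(loose, "|", strict)
```

The loose result shows why it matters to check what an expression should not match before running Replace All.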
Anchoring a regular expression to a string
All or part of a regular expression can be anchored to either the beginning or end of the string being searched:
- If a caret (^) is at the beginning of a (sub)expression, the matched string must be at the beginning of the string being searched.
- If a dollar sign ($) is at the end of a (sub)expression, the matched string must be at the end of the string being searched.
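Because the anchors refer to the whole string being searched, a tag at the start of an inner line does not satisfy a caret. The sketch below illustrates this with Python's re module, which by default also anchors to the whole string rather than to each line; the sample document is invented for the example.

```python
import re

doc = "<html>\n<body>Hello</body>\n</html>"

# ^ anchors to the start of the whole string, not to each line.
starts_with_html = re.search(r"^<html>", doc) is not None
body_at_start = re.search(r"^<body>", doc) is not None

# $ anchors to the end of the whole string.
ends_with_html = re.search(r"</html>$", doc) is not None
body_at_end = re.search(r"</body>$", doc) is not None

print(starts_with_html, body_at_start, ends_with_html, body_at_end)
```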
Expression examples
The following examples show some regular expressions and describe what they match.
Regular Expression Examples | |
---|---|
Expression | Description |
[\?&]value= | A URL parameter value in a URL. |
[A-Z]:(\\[A-Z0-9_]+)+ | An uppercase DOS/Windows full path that (a) is not the root of a drive, and (b) has only letters, numbers, and underscores in its text. |
[A-Za-z][A-Za-z0-9_]* | A ColdFusion variable with no qualifier. |
([A-Za-z][A-Za-z0-9_]*)(\.[A-Za-z][A-Za-z0-9_]*)? | A ColdFusion variable with no more than one qualifier, for example, Form.VarName, but not Form.Image.VarName. |
(\+|-)?[1-9][0-9]* | An integer that does not begin with a zero and has an optional sign. |
(\+|-)?[1-9][0-9]*(\.[0-9]*)? | A real number. |
(\+|-)?[1-9]\.[0-9]*E(\+|-)?[0-9]+ | A real number in engineering notation. |
a{2,4} | Two to four occurrences of 'a': aa, aaa, aaaa. |
(ba){3,} | At least three 'ba' pairs: bababa, babababa, ... |
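One way to gain confidence in expressions like these is to test each against a string it should match and a string it should not. The sketch below does this with Python's re module as a stand-in for Studio's parser, which is an assumption; the sample strings are invented, and fullmatch is used so that the whole sample must fit the expression.

```python
import re

# (expression, sample that should match, sample that should not)
CASES = [
    (r"[\?&]value=", "?value=", "value="),
    (r"[A-Z]:(\\[A-Z0-9_]+)+", "C:\\WINNT\\SYSTEM32", "C:\\"),
    (r"[A-Za-z][A-Za-z0-9_]*", "VarName", "9lives"),
    (r"([A-Za-z][A-Za-z0-9_]*)(\.[A-Za-z][A-Za-z0-9_]*)?",
     "Form.VarName", "Form.Image.VarName"),
    (r"(\+|-)?[1-9][0-9]*", "-42", "042"),
    (r"(\+|-)?[1-9][0-9]*(\.[0-9]*)?", "+3.14", "0.5"),
    (r"(\+|-)?[1-9]\.[0-9]*E(\+|-)?[0-9]+", "-1.5E-3", "15E3"),
    (r"a{2,4}", "aaa", "a"),
    (r"(ba){3,}", "bababa", "baba"),
]

failures = []
for pattern, good, bad in CASES:
    if re.fullmatch(pattern, good) is None:
        failures.append((pattern, good, "should match"))
    if re.fullmatch(pattern, bad) is not None:
        failures.append((pattern, bad, "should not match"))

print("failures:", failures)
```

Note that the real-number expression rejects 0.5 because it requires a leading digit from 1 to 9.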
Resources
An excellent reference on regular expressions is Mastering Regular Expressions by Jeffrey E.F. Friedl, published by O'Reilly & Associates, Inc.