Saturday, December 19, 2009

Right Sed Fred, I'm too sexy to search and replace

Last week I hand-edited 22 XML files to change one attribute in each. I had only ten minutes, and I knew hand-editing would get the job done in that time.

Today I had exactly the same task but, with less time pressure, I went hunting for a better way. Here is the solution I used:
sed -i 's/old/new/g' *.xml

-i means edit the files in place, rather than printing the result to standard output. I had struggled with sed years ago and thought it was a horrible monster that made emacs look user-friendly in comparison. But this is so simple. Perhaps I was hurt before by trying to do something that couldn't be described as a simple regex?
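Since -i overwrites the files, it is worth rehearsing on copies first. Here is a minimal sketch of that, assuming a sed that accepts a backup suffix glued onto -i (GNU sed, which is what cygwin ships, does). The file names and the colour attribute are made up for illustration.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)

# Two hypothetical sample files, each holding the attribute we want to change.
printf '<item colour="old"/>\n' > "$workdir/a.xml"
printf '<item colour="old"/>\n' > "$workdir/b.xml"

# -i.bak edits each file in place and keeps the original as <name>.bak.
sed -i.bak 's/colour="old"/colour="new"/g' "$workdir"/*.xml

# Review the change; diff exits non-zero when the files differ, hence "|| true".
diff "$workdir/a.xml.bak" "$workdir/a.xml" || true
```

Once the diff looks right, the .bak files can be deleted, or the suffix dropped for the real run.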

Actually I vaguely remember my need at that time was to modify all html files in a directory tree, which sed cannot do on its own. But with find it can:
find . -iname \*.html -execdir sed -i 's/<html>/<html mytag="test">/g' {} +

That inserts an attribute in the html tag of all *.html files in the current directory and its subdirectories. (Shamelessly stolen from comments on this page here, and then I did a quick test to confirm I hadn't introduced a typo.)
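For the cautious, the same idea can be tried on a scratch tree, with a dry run before the real edit. This is only a sketch under a few assumptions: the directory layout and file names are invented, it uses -exec rather than -execdir (the sed command is the same, only the directory it runs from differs), and it adds a .bak suffix so the originals survive.

```shell
# Build a small hypothetical tree to experiment on.
tree=$(mktemp -d)
mkdir -p "$tree/sub/deeper"
printf '<html>\n<body></body>\n</html>\n' > "$tree/index.html"
printf '<html>\n</html>\n' > "$tree/sub/deeper/page.HTML"
printf '<html> in a text file\n' > "$tree/sub/notes.txt"

# Dry run: list the files that contain the pattern, without changing them.
# -iname matches the extension case-insensitively, so page.HTML is included.
find "$tree" -iname '*.html' -exec grep -l '<html>' {} +

# Real run: edit the matching files in place, keeping <name>.bak copies.
find "$tree" -iname '*.html' -exec sed -i.bak 's/<html>/<html mytag="test">/g' {} +
```

notes.txt is left alone because find never hands it to sed, even though it contains the same text.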

Cool. One step closer to unix guruness. (sed is also available in cygwin, which is where I was actually doing the edit that started this article.)

UPDATE: for an example where tr is more useful than sed, see find, grep and tr.


UPDATE: Here is an expansion of the find+sed example above. I wanted to alter files with three extensions: html, php and phtml. And I wanted to alter those files in only certain subdirectories. The -regex parameter of find seemed to do the job:

find . -regex './\(dir1\|dir2/subdir1\|dir3\|dir4\)/.+\.\(html\|php\|phtml\)' -execdir sed -i 's/ABC/XYZ/g' {} +

That is the command exactly as you run it at the bash command prompt. Notice that, in the regex, not just (), but also | need to be prefixed with a single backslash.
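Before letting sed loose, the regex can be checked by running find with no action, so that it only prints what it matches. The sketch below assumes GNU find, whose default regex syntax is the one that wants the backslashes described above; the directory and file names are invented. One thing it makes visible is that -regex has to match the whole path as find prints it, leading ./ included.

```shell
# Build a hypothetical tree: some files should match, some should not.
site=$(mktemp -d)
mkdir -p "$site/dir1" "$site/dir2/subdir1" "$site/dir2/other" "$site/dir9"
touch "$site/dir1/a.html" "$site/dir2/subdir1/b.php" \
      "$site/dir2/other/c.phtml" "$site/dir9/d.html" "$site/dir1/e.txt"

# Run from inside the tree so every path starts with "./", as the regex expects.
# With no -exec, find just prints the paths that match.
matched=$(cd "$site" && find . -regex '\./\(dir1\|dir2/subdir1\)/.+\.\(html\|php\|phtml\)' | sort)

# Should list only ./dir1/a.html and ./dir2/subdir1/b.php
printf '%s\n' "$matched"
```

If the list is what you wanted, append the -execdir sed part to the same find command.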
