Posted by Anonymous on Fri 25 Nov 2005 at 14:52
Hello nerdlings! Steve has gone jolly wild with perl this festive season. And to add to the spirit, here's a piece from me on how to do a search and replace across many files in one line of perl, for several kinds of cases:
The good - the simple case
Sometimes you want to change the same pattern in a lot of files. How can you do that?
Now, vi and emacs can both open a lot of files, eg: vi *.html. And you can then do a search and replace on the same pattern. But here's a neat perl trick that sometimes comes in useful:
Suppose you have a bunch of files (file1, file2, file3 etc) which have the same chunk of text in them. Eg, the text:
Newspapers are about news.
In a fit of disgust at the tripe that is printed in the Sunday Spurt, you decide to change the statement to
Newspapers are about selling advertising space.
Then this perl in-place edit one-liner comes in handy:
perl -pi -e 's/about news\.$/about selling advertising space\./' file*
Tada! You're done.
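If you'd rather try that out before letting it loose on real files, here's a minimal sketch in a scratch directory. The file contents and the .bak suffix are just my choices for illustration; giving -i a suffix like this asks perl to keep a backup copy of every file it edits.

```shell
# Work in a throwaway directory so nothing real gets touched.
workdir=$(mktemp -d)
cd "$workdir"

# Three sample files (made-up contents) that all contain the same sentence.
for n in 1 2 3; do
    printf 'Intro line %s\nNewspapers are about news.\n' "$n" > "file$n"
done

# Same one-liner as above, except that -i.bak keeps the originals
# as file1.bak, file2.bak and file3.bak.
perl -pi.bak -e 's/about news\.$/about selling advertising space./' file*

# Compare the edited file with its backup.
grep 'Newspapers' file1 file1.bak
```

If the result isn't what you wanted, the .bak copies are still there to move back into place.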
The flags (perldoc perlrun) work like this:
-p loop over the input one line at a time, run the code on each line, and print the result
-i edit the files in-place
-e run the code given on the command line
The ugly - the multi-line case
But what if you originally had three lines in each file? Imagine you were a zombie, and that your actions were controlled by a MegaHurtz MindNumboJumbo Zombifier Ray. This Zombifier ray is being wielded by a cackling mad scientist Frankenstein stereotype, and he has had you write the following three lines in every file, for some unfathomable mad scientist type experiment:
Newspapers are about news.
McDonalds is a fast food restaurant.
Google is a search engine.
Note that the lines are next to each other (not scattered in separate places in the file).
Now, while the mad scientist is attending to his Van de Graaff generator in the other room, Igor, the shambling, drooling, mumbling assistant to the mad scientist makes a hat to amuse himself. Then he puts this hat on your head, and has a giggle at your appearance. But what Igor doesn't realise is that the hat, made of tin foil, repels the mind control zombifier waves. Suddenly, you are free to do whatever you like! And, overwhelmed by feelings of extreme cynicism, you decide to take revenge! Yes! You decide to change the lines to:
Newspapers are about selling advertising space.
McDonalds is a real estate business.
Google is a data-acquisition corporation.
That will show them!
Ah. Well, doing this transformation turns out to be rather more abstruse perl magic. The previous incantation (perl -pi -e 's///' file*) won't work, because -p hands the code one line at a time, so a pattern that spans several lines never gets to see more than one of them. The search part stumbles over the newlines (the record separators) in each of the files it swallows.
Now, the -0 flag in perl specifies (in octal) the input record separator (a newline by default). Octal 0777 is not a valid character value, so with -0777 perl never finds a separator and swallows each file in one big gulp. We can then do the search and replace over many lines. So:
perl -p0777i -e 's/about news\.\n^McDonalds is a fast food restaurant\.$\n^Google is a search engine\.$\n/about selling advertising space.\nMcDonalds is a real estate business.\nGoogle is a data-acquisition corporation.\n/m' file*
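is the (rather long) one-liner that will do the job. The m modifier makes the anchors ^ and $ match at the start and end of every line inside the swallowed file, instead of only at the very start and end of the whole string.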
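Here's a cut-down version you can run in a scratch directory to see the slurping at work. It's a sketch under my own assumptions: a single made-up file1, and a shorter pattern that only rewrites the first two lines, so the one-liner stays readable.

```shell
# Work in a throwaway directory so nothing real gets touched.
workdir=$(mktemp -d)
cd "$workdir"

printf '%s\n' \
    'Newspapers are about news.' \
    'McDonalds is a fast food restaurant.' \
    'Google is a search engine.' > file1

# With -0777 the whole file arrives as one record, so $. ends up as 1.
perl -0777 -ne 'END { print "records read: $.\n" }' file1

# Now a pattern containing \n can match across lines, and /m lets
# ^ and $ anchor at each line inside the single big string.
perl -0777 -pi -e 's/news\.$\n^McDonalds is a fast food restaurant\.$/selling advertising space.\nMcDonalds is a real estate business./m' file1

cat file1
```

The third line is left alone here simply because the shortened pattern doesn't mention it.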
So, now you go ahead and edit all the files that contain this pattern in an instant, thereby foiling the mad scientist's evil experiment before he even comes back. Vengeance is yours!
The bad - the stupid case
One important caveat, before you go all wild and crazy and start applying this technique to every situation.
Perl in-place editing, while handy, does not usually scale up very well, especially if you have single and double quotes to handle. The bash shell has some interesting quirks (including that you cannot escape a single quote within single quotes). That can result in eye-bleeding constructions like the one in this example:
Task:
replace this text which is inside a bunch of *.php files:
if (preg_match('/bsd.example.com/i', $httpbase)) {
echo " _uacct=\"UA-12345-2\";\n" ;
} elseif (preg_match('/beta.example.com/i', $httpbase)) {
echo " _uacct=\"UA-12345-3\";\n" ;
} elseif (preg_match('/example.com/i', $httpbase)) {
echo " _uacct=\"UA-12345-1\";\n" ;
}
with:
echo " _uacct=\"$uacct\";\n";
Solution:
An appropriate in-place edit in bash is then:
perl -p01000i -e "s/if \(preg_match\('\/bsd\.example\.com\/i', \\\$httpbase\)\) \{\n echo \" _uacct=\\\\\"UA-12345-2\\\\\";\\\n\" ;\n\} elseif \(preg_match\(\'\/beta\.example\.com\/i\', \\\$httpbase\)\) \{\n echo \" _uacct=\\\\\"UA-12345-3\\\\\";\\\n\" ;\n\} elseif \(preg_match\(\'\/example\.com\/i\', \\\$httpbase\)\) \{\n echo \" _uacct=\\\\\"UA-12345-1\\\\\";\\\n\" ;\n\}\n/echo \" _uacct=\\\\\"\\\$uacct\\\\\";\\\n\";\n/" *.php
(Go on, scroll right to see the rest...)
Yes, I went wild and crazy and really did that one in real life. But do as I say, not do as I do!
Let's summarize all this now:
Executive summary:
There! Now you can go around thrilling and impressing everyone with your perl prowess by saying, "Yes, I can fix the site in one line of perl".
ps: http://www.noctilucent.org/blog/archives/2003/12/replacing_large.html covers a more sensible way of handling larger texts. In case you don't have that megalomaniac urge that all geeks secretly have.
PJ
This article can be found online at the Debian Administration website at the following bookmarkable URL:
This article is copyright 2005 Anonymous - please ask for permission to republish or translate.