Thursday, January 22, 2015

Linux: Replace a string with another string in all files

The sed command is designed for this kind of work, i.e. finding and replacing strings or words in a text file on Apple OS X, *BSD, Linux, and UNIX-like operating systems. Perl can also be used, as described below.

sed replace word / string syntax

The syntax is as follows:
sed -i 's/old-word/new-word/g' *.txt
The GNU sed command can edit files in place using the -i option (it makes a backup if an extension is supplied). If you are using an old UNIX sed version, try the following syntax:
sed 's/old/new/g' input.txt > output.txt
You can use the old sed syntax together with a bash for loop:
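#!/bin/bash
OLD="xyz"
NEW="abc"
DPATH="/home/you/foo/*.txt"
BPATH="/home/you/bakup/foo"
TFILE="/tmp/out.tmp.$$"
# Create the backup directory if it does not exist yet
[ -d "$BPATH" ] || mkdir -p "$BPATH"
# $DPATH is left unquoted on purpose so that the glob expands
for f in $DPATH
do
  if [ -f "$f" ] && [ -r "$f" ]; then
    # Keep a copy of the original before editing
    /bin/cp -f "$f" "$BPATH"
    sed "s/$OLD/$NEW/g" "$f" > "$TFILE" && mv "$TFILE" "$f"
  else
    echo "Error: Cannot read $f"
  fi
done
# -f: the temp file is normally gone already because mv renamed it
/bin/rm -f "$TFILE"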
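
The backup behaviour of the -i option mentioned earlier can be tried on a scratch file. This is only a minimal sketch: the temporary directory and the file name notes.txt are made up for illustration, and the strings xyz and abc are borrowed from the script above.

```shell
#!/bin/bash
# Sketch: in-place edit that keeps a backup copy.
# The scratch directory and notes.txt are illustrative names only.
workdir=$(mktemp -d)
printf 'xyz one\nxyz two\n' > "$workdir/notes.txt"

# -i.bak edits notes.txt in place and saves the original as notes.txt.bak
sed -i.bak 's/xyz/abc/g' "$workdir/notes.txt"

edited=$(cat "$workdir/notes.txt")
backup=$(cat "$workdir/notes.txt.bak")
printf 'edited:\n%s\nbackup:\n%s\n' "$edited" "$backup"
```

With the suffix attached directly to -i, no separate copy step or temporary file is needed.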

A Note About Bash Escape Character

A non-quoted backslash \ is the Bash escape character. It preserves the literal value of the next character that follows, with the exception of newline. If a \newline pair appears, and the backslash itself is not quoted, the \newline is treated as a line continuation (that is, it is removed from the input stream and effectively ignored). This is useful when you have to deal with UNIX paths, because every / in a path would otherwise end the sed expression early. In this example, the sed command is used to replace the placeholder "__DOMAIN_LOG_FILE__" with the UNIX path "/nfs/apache/logs/rawlogs/access.log":
#!/bin/bash
## Our path
_r1="/nfs/apache/logs/rawlogs/access.log"
 
## Escape path for sed using bash find and replace 
_r1="${_r1//\//\\/}"
 
# replace __DOMAIN_LOG_FILE__ in our sample.awstats.conf
sed -e "s/__DOMAIN_LOG_FILE__/${_r1}/" /nfs/conf/awstats/sample.awstats.conf  > /nfs/apache/logs/awstats/awstats.conf
 
# call awstats
/usr/bin/awstats -c /nfs/apache/logs/awstats/awstats.conf
 
The $_r1 is escaped using bash find and replace parameter substitution syntax to replace each occurrence of / with \/.
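
As a possible alternative to escaping, sed accepts a delimiter other than / in the s command, so a path can be used as it is. The sketch below compares both approaches on a single line; the variable names and the sample template line are hypothetical, and the | delimiter assumes the path contains no | character.

```shell
#!/bin/bash
# Hypothetical template line and path, for illustration only.
template='LogFile="__DOMAIN_LOG_FILE__"'
log_path="/nfs/apache/logs/rawlogs/access.log"

# Approach from the text: escape every / so it cannot end the s/// expression.
escaped="${log_path//\//\\/}"
with_escape=$(printf '%s\n' "$template" | sed -e "s/__DOMAIN_LOG_FILE__/${escaped}/")

# Alternative: use | as the delimiter, so slashes need no escaping.
# This assumes the path itself contains no | character.
with_pipe=$(printf '%s\n' "$template" | sed -e "s|__DOMAIN_LOG_FILE__|${log_path}|")

printf '%s\n%s\n' "$with_escape" "$with_pipe"
```

Both commands should print the same line with the path filled in.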

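perl -pe Syntax For Find and Replace

The syntax is as follows:
perl -pe 's/old-word/new-word/g' input.file > new.output.file
Do not bundle the switches as -pie: perl reads that as -i with the backup suffix "e" and never sees -e. To edit the file in place instead of redirecting, use perl -pi -e 's/old-word/new-word/g' input.file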


You can also use the following approaches:

1. Replacing all occurrences of one string with another in all files in the current directory:

These are for cases where you know that the directory contains only regular files and that you want to process all non-hidden files. If that is not the case, use the approaches in 2.
All sed solutions in this answer assume GNU sed. If using FreeBSD or OS X, replace -i with -i ''.
  • Non recursive, files in this directory only:
    sed -i -- 's/foo/bar/g' *
    perl -Ti -pe 's/foo/bar/g' ./* 
  • Recursive, regular files (including hidden ones) in this and all subdirectories
    find . -type f -exec sed -i 's/foo/bar/g' {} +
    If you are using zsh:
    sed -i -- 's/foo/bar/g' **/*(D.)
    (may fail if the list is too big, see zargs to work around).
    If you are using bash, which has no support for glob qualifiers, you cannot restrict the glob to regular files, so sed will complain about any directories that match:
    shopt -s globstar
    shopt -s dotglob
    Then:
    sed -i -- 's/foo/bar/g' **/*
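
Before running a recursive replace on real data, it may help to try it on a scratch tree. The following sketch assumes GNU sed (for -i without an argument); the directory layout and file names are invented for the example.

```shell
#!/bin/bash
# Scratch tree with a nested file and a hidden file (names are illustrative).
workdir=$(mktemp -d)
mkdir -p "$workdir/sub/deeper"
printf 'foo top\n'    > "$workdir/a.txt"
printf 'foo hidden\n' > "$workdir/sub/.hidden"
printf 'foo deep\n'   > "$workdir/sub/deeper/b.txt"

# -type f skips directories; {} + passes many files to each sed invocation.
find "$workdir" -type f -exec sed -i 's/foo/bar/g' {} +

# Count the files that still contain foo after the replace.
remaining=$(grep -rl 'foo' "$workdir" | wc -l)
echo "files still containing foo: $remaining"
```

Because find selects files itself, hidden files are included without any shell options.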

2. Replace only if the file name matches another string / has a specific extension / is of a certain type etc:

  • Non-recursive, files in this directory only:
    sed -i -- 's/foo/bar/g' *baz*    ## all files whose name contains baz
    sed -i -- 's/foo/bar/g' *.baz    ## files ending in .baz
  • Recursive, regular files in this and all subdirectories
    find . -type f -name "*baz*" -exec sed -i 's/foo/bar/g' {} +
    If you are using bash:
    shopt -s globstar
    shopt -s dotglob
    Then:
    sed -i -- 's/foo/bar/g' **/*baz*
    sed -i -- 's/foo/bar/g' **/*.baz
    If you are using zsh:
    sed -i -- 's/foo/bar/g' **/*baz*(D.)
    sed -i -- 's/foo/bar/g' **/*.baz(D.)
    The -- serves to tell sed that no more flags will be given in the command line. This is useful to protect against file names starting with -.
  • If a file is of a certain type, for example, executable (see man find for more options):
    find . -type f -executable -exec sed -i 's/foo/bar/g' {} +
    zsh:
    sed -i -- 's/foo/bar/g' **/*(D*)
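
The effect of -- can be seen with a file whose name starts with a dash. This is a small sketch assuming GNU sed; the scratch directory and the file names -n.baz and plain.baz are made up for the demonstration.

```shell
#!/bin/bash
# Scratch directory with an awkwardly named file (names are illustrative).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'foo\n' > ./-n.baz
printf 'foo\n' > ./plain.baz

# The glob expansion includes -n.baz; without -- sed would try to
# parse that name as options. With --, both names are treated as files.
sed -i -- 's/foo/bar/g' *.baz

dash_file=$(cat ./-n.baz)
plain_file=$(cat ./plain.baz)
echo "$dash_file $plain_file"
```

Writing the glob as ./*.baz is another way to keep such names from looking like options.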

3. Replace only if the string is found in a certain context

  • Replace foo with bar only if there is a baz later on the same line:
    sed -i ':1;s/foo\(.*baz\)/bar\1/;t1' file
    In sed, using \( \) saves whatever is in the parentheses and you can then access it with \1. There are many variations on this theme. We need to repeat the operation for all foo occurrences, which is done with the t conditional branch: t1 jumps back to the label :1 whenever the previous substitution succeeded.
  • Replace foo with baz only if foo is found in the 3rd column (field) of the input file (assuming whitespace-separated fields):
    gawk -i inplace '{gsub(/foo/,"baz",$3); print}' file
    (needs gawk 4.1.0 or newer).
    For a different field just use $N where N is the number of the field of interest. For a different field separator (: in this example) use:
    gawk -i inplace -F':' -v OFS=':' '{gsub(/foo/,"baz",$3); print}' file
    Another solution using perl:
    perl -i -ane '$F[2]=~s/foo/baz/g; $" = " "; print "@F\n"' foo 
    NOTE: both the awk and perl solutions will print space-separated fields even if the input file had tabs. For a different field use $F[N-1] where N is the field number you want, and for a different field separator use the following (the $"=":" sets the output field separator to :):
    perl -i -F':' -ane '$F[2]=~s/foo/baz/g; $"=":";print "@F"' foo 
  • Replace foo with bar only on the 4th line:
    sed -i '4s/foo/bar/g' file
    gawk -i inplace 'NR==4{gsub(/foo/,"bar")};1' file
    perl -i -pe 's/foo/bar/g if $.==4' file
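
The loop-based and line-addressed forms from this section can be checked on piped sample text before touching real files. In this sketch the sample input is invented, and the loop is split into separate -e expressions so that the label does not depend on GNU-specific semicolon handling.

```shell
#!/bin/bash
# Loop form: keep replacing the leftmost foo that still has a baz after it.
# The trailing foo has no baz after it, so it is left alone.
looped=$(printf 'foo foo baz foo\n' |
  sed -e ':1' -e 's/foo\(.*baz\)/bar\1/' -e 't1')
echo "$looped"    # bar bar baz foo

# Line address: only line 4 is touched.
fourth=$(printf 'foo\nfoo\nfoo\nfoo\nfoo\n' | sed '4s/foo/bar/g')
echo "$fourth"
```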

4. Multiple replace operations: replace with different strings

You can combine sed commands:
sed -i 's/foo/bar/g; s/baz/zab/g; s/Alice/Joan/g' file
or Perl commands
perl -i -pe 's/foo/bar/g; s/baz/zab/g; s/Alice/Joan/g' file
If you have a large number of patterns, it is easier to save your patterns and their replacements in a sed script file:
#! /usr/bin/sed -f
s/foo/bar/g
s/baz/zab/g
Or, if you have too many pattern pairs for the above to be feasible, you can read pattern pairs from a file (two space-separated patterns, $pattern and $replacement, per line):
while read -r pattern replacement; do   
   sed -i "s/$pattern/$replacement/" file
done < patterns.txt
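That will be quite slow for long lists of patterns and large data files, so you might want to read the patterns and create a sed script from them instead. With bash process substitution the script can be generated and applied in one step:
sed -f <(awk '{printf "s/%s/%s/g\n", $1, $2}' patterns.txt) -i -- file.txt
Or save the generated script to a file first and then run it on your input file(s):
awk '{printf "s/%s/%s/g\n", $1, $2}' patterns.txt > sedscript.txt
sed -f sedscript.txt inputfile.txt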
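
The generate-then-run workflow can be sketched end to end on scratch files. The contents of patterns.txt and inputfile.txt below are invented, and the sketch assumes that neither column of the pattern file contains /, & or other characters that are special to sed.

```shell
#!/bin/bash
# Scratch files; the pattern pairs mirror the examples in the text.
workdir=$(mktemp -d)
printf 'foo bar\nbaz zab\nAlice Joan\n' > "$workdir/patterns.txt"
printf 'foo met baz and Alice\n'       > "$workdir/inputfile.txt"

# Turn each "pattern replacement" pair into one s/// command.
# Assumes neither column contains /, & or other characters special to sed.
awk '{printf "s/%s/%s/g\n", $1, $2}' "$workdir/patterns.txt" > "$workdir/sedscript.txt"

generated=$(cat "$workdir/sedscript.txt")
result=$(sed -f "$workdir/sedscript.txt" "$workdir/inputfile.txt")
printf '%s\n---\n%s\n' "$generated" "$result"
```

The commands in the generated script run in order on each line, so a later pattern can also match text produced by an earlier replacement.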

5. Multiple replace operations: replace multiple patterns with the same string

Replace any of foo, bar or baz with foobar:
sed -Ei 's/foo|bar|baz/foobar/g' file
or
perl -i -pe 's/foo|bar|baz/foobar/g' file
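
The alternation can be tried on piped text first. The sample strings in this sketch are made up; the second command illustrates a caveat worth knowing about, namely that text which already reads foobar matches twice.

```shell
#!/bin/bash
# -E enables extended regular expressions, where | means "or".
mixed=$(printf 'foo bar baz qux\n' | sed -E 's/foo|bar|baz/foobar/g')
echo "$mixed"

# Caveat: text that already reads foobar contains both foo and bar,
# so each half is replaced separately.
doubled=$(printf 'foobar\n' | sed -E 's/foo|bar|baz/foobar/g')
echo "$doubled"
```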
