Find and replace in different rows


 
# 1  
Old 05-10-2016

Hello everyone,

I'm a beginner in shell scripting and usually try to solve my problems myself, but I've reached a point where I need your help.
Below is an excerpt from an XML file.

Code:
<Position>
        <SKU>A/370269/10432/32D</SKU>
        <Batch>00320160501</Batch>
        <Description>Carlotta</Description>
        <Amount>1</Amount>
        <ForeignPos1>172991</ForeignPos1>
        <ForeignPos2>10</ForeignPos2>
        <Positionreferences>
          <PosReference number="1"/>
          <PosReference number="2"/>
          <PosReference number="3">0</PosReference>
          <PosReference number="4">0</PosReference>
          <PosReference number="5"/>
          <PosReference number="6">370269</PosReference>
          <PosReference number="7">10432</PosReference>
          <PosReference number="8">pearl grey</PosReference>
          <PosReference number="9"/>
          <PosReference number="10"/>
          <PosReference number="11">26 INCH</PosReference>
        </Positionreferences>
</Position>
<Position>
        <SKU>A/370269/10432/32D</SKU>
        <Batch>00520160501</Batch>
        <Description>Carlotta</Description>
        <Amount>6</Amount>
        <ForeignPos1>172992</ForeignPos1>
        <ForeignPos2>10</ForeignPos2>
        <Positionreferences>
          <PosReference number="1"/>
          <PosReference number="2"/>
          <PosReference number="3">0</PosReference>
          <PosReference number="4">0</PosReference>
          <PosReference number="5"/>
          <PosReference number="6">370269</PosReference>
          <PosReference number="7">10432</PosReference>
          <PosReference number="8">pearl grey</PosReference>
          <PosReference number="9"/>
          <PosReference number="10"/>
          <PosReference number="11">26 INCH</PosReference>
        </Positionreferences>
</Position>

At the moment I use a script I wrote myself to fill in the lines <PosReference number="9"/> and <PosReference number="10"/> from a positional parameter.

Code:
#!/bin/bash

if [ $# -lt 2 ] ; then
        echo "Usage: $(basename "$0") <Filename> <Datafield8>"
else
        # initialize variable for filename
            file=$1

        # convert xml file to a readable format
           xmllint --format "$file" > "$file.tmp"

        # split the string from $2 at "/" into separate variables
            IFS="/"
            Wert=$2
            set -- $Wert
            p9=$2
            p10=$1
        # restore default word splitting for the rest of the script
            unset IFS

        # simple sed replace but output to a new file
            sed 's@NEW@UPD@' "$file.tmp" > "$file.tmp2"
            sed 's@<PosReference number="9"/>@<PosReference number="9">'"$p9"'</PosReference>@' "$file.tmp2" > "$file.tmp3"
            sed 's@<PosReference number="10"/>@<PosReference number="10">'"$p10"'</PosReference>@' "$file.tmp3" > "$file.tmp4"

        # replace the ".xml" suffix of the original file name with "_script.xml"

            new=$(echo "$file" | sed 's@\.xml$@_script.xml@')

        # check if $new is empty or not
            if [ -z "$new" ]
              then
                :
              else
                cp "$file.tmp4" "$new"
            fi

        # delete all temporary files
            rm -f "$file.tmp" "$file.tmp2" "$file.tmp3" "$file.tmp4"
fi

As long as the batch is the same throughout the xml file everything is okay, but it often happens that there are different batches, and each batch needs a different $2 value.

Now I wonder whether it's possible to modify the script so that I can search for one or more specific batches and change the associated rows, which are always 14 and 15 lines below the <Batch> line.

Maybe like this:
basename $0 <Filename> <batch#1> <parameter#1> <batch#2> <parameter#2> <batch#3> <parameter#3>

Does anyone have a solution or a hint for me?

Thanks in advance,
Jacko
# 2  
Old 05-22-2016
Since the data fields you are adding (or replacing) sometimes contain spaces, the following script takes three separate arguments for each set of changes (instead of the pair of arguments your script used). Any of these arguments can be quoted if it contains spaces or tabs, and the new string values can contain any characters except the ASCII BEL and <newline> control characters, as long as they are properly quoted when given to the shell as command-line arguments. One caveat: awk's sub() treats & and backslash specially in the replacement string, so values containing those characters would need extra escaping. The script uses a while loop around printf to turn each group of three command-line arguments into a pair of input lines, and feeds those lines to an awk script that collects them and makes all of the requested changes to the input XML file in one pass.
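
To see that grouping step in isolation, here is a minimal sketch. The function name triples_to_lines is only for illustration and does not appear in the full script further down; tr is used to make the BEL separators visible as "|".

```shell
#!/bin/bash
# Hypothetical helper (illustration only): print two BEL-separated lines
# for each batch / PosReference-9 value / PosReference-10 value triple.
triples_to_lines() {
	while [ $# -gt 2 ]
	do	printf '%s\a9\a%s\n%s\a10\a%s\n' "$1" "$2" "$1" "$3"
		shift 3
	done
}

# Show the generated lines with each BEL character replaced by "|".
triples_to_lines 00320160501 'pearl grey' 'A 1' 00520160501 x y | tr '\a' '|'
# Output:
# 00320160501|9|pearl grey
# 00320160501|10|A 1
# 00520160501|9|x
# 00520160501|10|y
```

Each output line has the batch in field 1, the PosReference number in field 2, and the new value in field 3, which is the layout the awk code expects when it reads them with BEL as the field separator.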

Your script creates four temporary files and processes the input file unconditionally. If the sed substitution on the file name changes nothing (for example, because the name does not contain ".xml"), the output name equals the input name and the cp overwrites the original file. The following script prints a diagnostic message and doesn't attempt to process the input file if its name does not end with ".xml". It uses only one temporary file (holding the reformatted XML input file) and sends the output from awk directly to the desired output file pathname.
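
The suffix check can be tried on its own. This is a small sketch of the same parameter-expansion idea; the function name output_name is made up for this example, and the full script does the check inline rather than in a function.

```shell
#!/bin/bash
# Hypothetical helper (illustration only): derive the "_script.xml" output
# name, returning non-zero when the input name does not end in ".xml".
output_name() {
	local base=${1%.xml}		# strip a trailing ".xml", if present
	[ "$base" != "$1" ] || return 1	# nothing stripped: wrong suffix
	printf '%s_script.xml\n' "$base"
}

output_name order.xml					# prints order_script.xml
output_name order.txt || echo 'not an .xml name'	# prints not an .xml name
```

Because ${1%.xml} only removes a suffix, a name such as order.xml.bak is rejected rather than silently rewritten.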

Hopefully, the script below will come close to what you're trying to do. Production code should also verify that xmllint succeeded before continuing.
Code:
#!/bin/bash
IAm=${0##*/}	# Grab basename of this script.

# Verify argument count.
if [ $# -lt 4 ] || [ $(($# % 3)) -ne 1 ]
then	af='file batch1 PRN9val1 PRN10val1 [batchN PRN9valN PRN10valN]...\n'
	printf "Usage: %s $af" "$IAm" >&2
	exit 1
fi

# Get input XML file pathname from arg1, verify name ends in ".xml", compute
# output file pathname, and reformat the input file into a temp file.
File=$1
New=${1%.xml}
if [ "$New" = "$File" ]
then	printf '%s: file (%s) does not end in ".xml", no changes made\n' \
	    "$IAm" "$File" >&2
	exit 2
fi
New=${New}_script.xml
TmpFile=$File.tmp
xmllint --format "$File" > "$TmpFile"

# Throw away the XML file's positional parameter.
shift

# Convert each group of three remaining command-line arguments into a pair of
# lines of ctl-G separated fields with the batch number as field 1, the
# PosReference number (PRN) to be fixed as field 2, and the new value to be
# assigned to that PRN as field 3; and feed those lines to awk along with the
# reformatted XML file to produce the desired "_script.xml" file.
while [ $# -gt 1 ]
do	printf '%s\a9\a%s\n%s\a10\a%s\n' "$1" "$2" "$1" "$3"
	shift 3
done | awk '
FNR == NR {
	ref2val[$1, $2] = $3
	next
}
/<Batch>/ {
	b = $0
	gsub("^.*<Batch>|</Batch>.*$", "", b)
	found = (b, 9) in ref2val
}
found && /<PosReference number=\"(9|10)\"/ {
	sub(/"9".*/, "\"9\">" ref2val[b, 9] "</PosReference>")
	sub(/"10".*/, "\"10\">" ref2val[b, 10] "</PosReference>")
}
1' FS=$'\a' - FS=' ' "$TmpFile" > "$New"

# Remove the reformatted (temporary) XML input file.
rm -f "$TmpFile"
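
As noted before the script, production code should check that xmllint worked. One possible way to do that is sketched here. The function name format_xml, the return status 3, and the use of mktemp (widely available, but not required by POSIX) are assumptions of this sketch, not part of the script above.

```shell
#!/bin/bash
# Hypothetical helper (illustration only): reformat an XML file into a new
# temporary file. On success, print the temporary file's pathname; on any
# failure, print a diagnostic, leave no temporary file behind, and return 3.
format_xml() {
	local in=$1 tmp
	if ! command -v xmllint > /dev/null 2>&1
	then	printf 'xmllint not found\n' >&2
		return 3
	fi
	tmp=$(mktemp) || return 3
	if ! xmllint --format "$in" > "$tmp"
	then	rm -f "$tmp"
		printf 'xmllint failed on %s, no changes made\n' "$in" >&2
		return 3
	fi
	printf '%s\n' "$tmp"
}
```

With a helper like this, the two lines in the script that set TmpFile and run xmllint could become something like TmpFile=$(format_xml "$File") || exit 3, so awk never runs on a missing or partially written temporary file.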
