How to put the command to remove duplicate lines in my awk script? Post 303037819 by Tim2424 on Wednesday 14th of August 2019 05:52:04 AM
Hello!

Quote:
Your scripts show a variable named test being used to filter input, but gives no indication of how it is set, what it is used to match, nor why it is there.

Please show us the code that you have hidden from us. Am I correct in guessing that you are setting the shell variable test to a value that will be identical to one of the values that will be found in field #1 in each of your input files?
You're right.

The complete code is:

Code:
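# Read the query string (for example FRAME_NAME=MIAIBYE00) from standard input.
read -r a
# Keep only the value to the right of the "=".
test=$(printf '%s\n' "$a" | cut -d'=' -f2)

echo "<p><h2>FRAME : $test</h2></p>"

echo "<table>"
# One row, so that each file becomes one column of the table.
echo "<tr>"
for fn in /var/www/cgi-bin/LPAR_MAP/*
do
    echo "<td>"
    echo "<PRE>"
    awk -F',|;' -v test="$test" '
        # First line of the file: split the file name on "-" and "." to get the date.
        NR == 1 {
            split(FILENAME, a, "[-.]")
        }
        # Keep only the lines that match the frame name.
        $0 ~ test {
            # Print the date header once per file.
            if (!header++) {
                print "DATE ========================== : " a[4]
            }
            print ""
            print "RAM : " $5
            print "CPU 1 : " $6
            print "CPU 2 : " $7
            print ""
            print ""
        }' "$fn"
    echo "</PRE>"
    echo "</td>"
done
echo "</tr>"
echo "</table>"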

echo "<table>"
echo "<td>"
echo "<PRE>"

The read command recovers the query string, and the cut pipeline that sets test trims it. The raw query string is FRAME_NAME=MIAIBYE00. It is generated by a listbox on my index page, which contains the list of my FRAMES. I use the cut command to keep only the right-hand side of the =, so my variable $test holds only the frame name. That way I keep only the lines that contain it.
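For what it's worth, the same extraction can also be sketched with plain shell parameter expansion instead of cut. This is only an illustration: the function name query_value is made up for the example, and the sample string is the one quoted above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script):
# return the value part of a NAME=VALUE query string.
query_value() {
    # ${1#*=} removes the shortest prefix ending in "=", leaving the value.
    printf '%s\n' "${1#*=}"
}

a='FRAME_NAME=MIAIBYE00'   # sample query string taken from the post
test=$(query_value "$a")
echo "$test"
```

One difference from cut -d'=' -f2: if the value itself contained an "=", the expansion would keep everything after the first "=", while cut would return only the second field.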


Quote:
From the image you supplied in post #1 in this thread I thought the output you wanted would be something like:

which are the only two lines in your output that do not have identical values in all three columns. I would have thought that it would be more useful to also show the rest of the information lines in the output related to LPARS value miaibg04. But, since the data you say you want in post #9 has three input files with the same date (201908XX) and identical values for all of the other fields (XX), I am still just guessing at what output you want to produce.
In post #1, the screenshot shows only three columns because I can't show more. The date comes from the filename, so in this example there are only three columns, but since I have 276 CSV files and each date comes from a filename, there are really 276 columns.
As with the screenshot, the lines from my CSV files are only an example. In reality I have 226442 lines, so you will understand that I can't post them all.

So, in a nutshell:

- I have many CSV files (276 files, 226442 lines in total).
- I use awk to keep only columns 1, 2, 5, 6 and 7. I would like to keep only the lines that are not the same, so I use if (!a[$0]++) to drop the duplicate lines. (By eliminating the duplicate lines, I reduce the number of columns too.)
- I would like to display this information as follows, in an HTML table:
Code:
DATE ===== XXXXXXXX    DATE ===== XXXXXXXX    DATE ===== XXXXXXXX
LPARS : XXX            LPARS : XXX            LPARS : XXX
RAM : XX               RAM : XX               RAM : XXX
CPU 1 : XX             CPU 1 : XX             CPU 1 : XX
CPU 2 : XX             CPU 2 : XX             CPU 2 : XX

LPARS : XXX            LPARS : XXX            LPARS : XXX
RAM : XX               RAM : XX               RAM : XXX
CPU 1 : XX             CPU 1 : XX             CPU 1 : XX
CPU 2 : XX             CPU 2 : XX             CPU 2 : XX
 
...

As in my first script, and as you can see in the example on the screenshot.

LPARS : the content of the column 2 kept by the awk command
RAM : the content of the column 5 kept by the awk command
CPU 1 : the content of the column 6 kept by the awk command
CPU 2 : the content of the column 7 kept by the awk command
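
To make the request concrete, here is a rough sketch of how the duplicate check could sit inside the awk block. It is not a tested solution for the real data. It assumes the frame name is in field 1 and builds the duplicate key from the kept columns only. The array name seen is chosen so that it stays apart from the array a that holds the pieces of the file name, and the sample file and its values are invented for the example.

```shell
#!/bin/sh
# Hypothetical sample data: field 1 = frame, 2 = LPAR, 5 = RAM, 6 = CPU 1, 7 = CPU 2.
# The file name and all values are invented for illustration only.
tmp=$(mktemp -d)
cat > "$tmp/sample.csv" <<'EOF'
MIAIBYE00,lpar01,x,y,16,2,4
MIAIBYE00,lpar01,x,z,16,2,4
MIAIBYE00,lpar02,x,y,32,4,8
OTHERFRAME,lpar09,x,y,8,1,2
EOF

test='MIAIBYE00'

result=$(awk -F',|;' -v test="$test" '
    # Assumption: the frame name is in field 1.
    $1 == test {
        # Build the key from the kept columns only, so rows that differ
        # in the other columns still count as duplicates.
        key = $1 SUBSEP $2 SUBSEP $5 SUBSEP $6 SUBSEP $7
        # Skip the row if this key was already printed.
        if (seen[key]++) next
        print "LPARS : " $2
        print "RAM : " $5
        print "CPU 1 : " $6
        print "CPU 2 : " $7
        print ""
    }' "$tmp/sample.csv")

printf '%s\n' "$result"
rm -rf "$tmp"
```

With this sample, the second lpar01 row differs only in a column that is not kept, so it is treated as a duplicate and only one lpar01 block is printed. Because each file is processed by its own awk call, this removes duplicates inside one file only. Comparing one file with the previous one would need a different approach.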


Quote:
It is after 1:00am here, so I am going to bed. When I get up I will see If I can manufacture some input file data that I can use to test something that might or might not be similar to three of your input files and then see if I can create an awk script that will produce output that I might find useful. Since you are making this so difficult for any of us who are trying to help you, this may take a while and will not be high on my priority list.
No problem. It's just a simple request. If you don't have time to answer me, or if you can't find a solution, never mind! I will keep looking for a solution on my side!

Have a nice day!
 
