Need advice! Removing multiple entries in a single file!


 
# 1  
Old 12-05-2009

Hello,
I have a file Test.txt with 9 columns that looks like this:

1g12 A 14 19 2OAY A 326 331 AAAASA
1l7v A 68 73 1l7v A 68 73 AALAIS
1l7v A 68 73 1XVW B 72 77 AALAIS
1l7v A 68 73 1XXU A 65 70 AALAIS
1l7v A 68 73 1XXU B 65 70 AALAIS
1l7v A 68 73 1XXU C 65 70 AALAIS
1l7v A 68 73 1XXU D 65 70 AALAIS
1j1n A 439 444 1j1n A 439 444 ADVRTY
1j1n A 439 444 1FUI B 360 365 ADVRTY

I am trying to remove repetitive entries from this file. An entry is repetitive when Col 1 = Col 5 AND Col 2 = Col 6 AND Col 3 = Col 7 AND Col 4 = Col 8. Examples above are the rows "1l7v A 68 73 1l7v A 68 73 AALAIS" and "1j1n A 439 444 1j1n A 439 444 ADVRTY".

Is there a way to remove these repetitive entries and print the rest? I have read through some threads and tried to adapt some awk scripts. I have tried it for the first condition only (Col 1 != Col 5), but I get syntax errors. The code I wrote:

awk -F" " '{if($1!=$5){print $1" "$2" "$3" "$4" "$5" "$6" "$7" "$8" "$9"} }' Test.txt

Can someone advise me how to write this properly, extend it to all the conditions I mentioned, and print the whole line if all conditions are met?

Thanks in advance!
DG
# 2  
Old 12-05-2009
Code:
$ cat Test
awk '
  ($1 == $5) && ($2 == $6) && ($3 == $7) && ($4 == $8) { next }
  1
' file1

$ ./Test
1g12 A 14 19 2OAY A 326 331 AAAASA
1l7v A 68 73 1XVW B 72 77 AALAIS
1l7v A 68 73 1XXU A 65 70 AALAIS
1l7v A 68 73 1XXU B 65 70 AALAIS
1l7v A 68 73 1XXU C 65 70 AALAIS
1l7v A 68 73 1XXU D 65 70 AALAIS
1j1n A 439 444 1FUI B 360 365 ADVRTY

# 3  
Old 12-05-2009
another way:
Code:
awk -F" " '{if($1!=$5||$2!=$6||$3!=$7||$4!=$8){print $0}}' test.txt
1g12 A 14 19 2OAY A 326 331 AAAASA
1l7v A 68 73 1XVW B 72 77 AALAIS
1l7v A 68 73 1XXU A 65 70 AALAIS
1l7v A 68 73 1XXU B 65 70 AALAIS
1l7v A 68 73 1XXU C 65 70 AALAIS
1l7v A 68 73 1XXU D 65 70 AALAIS
1j1n A 439 444 1FUI B 360 365 ADVRTY


Yours would have worked if you had removed the stray " after $9.
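
For reference, here is a minimal sketch of the original command with that stray quote removed. The here-document only recreates the sample rows from post #1 so the snippet runs on its own, and the variable name first_only is made up for illustration. Note that this version still tests only the first condition (column 1 versus column 5):

```shell
# Recreate the sample data from post #1 (only so the snippet is self-contained).
cat > Test.txt <<'EOF'
1g12 A 14 19 2OAY A 326 331 AAAASA
1l7v A 68 73 1l7v A 68 73 AALAIS
1l7v A 68 73 1XVW B 72 77 AALAIS
1l7v A 68 73 1XXU A 65 70 AALAIS
1l7v A 68 73 1XXU B 65 70 AALAIS
1l7v A 68 73 1XXU C 65 70 AALAIS
1l7v A 68 73 1XXU D 65 70 AALAIS
1j1n A 439 444 1j1n A 439 444 ADVRTY
1j1n A 439 444 1FUI B 360 365 ADVRTY
EOF

# Original command with the stray " after $9 removed.
# It checks only whether column 1 differs from column 5.
first_only=$(awk -F" " '{if($1!=$5){print $1" "$2" "$3" "$4" "$5" "$6" "$7" "$8" "$9} }' Test.txt)

# Show the rows that were kept.
printf '%s\n' "$first_only"
```

On this sample the single test happens to be enough, because every row with matching first and fifth columns also matches in the other three pairs. For data where that is not guaranteed, the four-way test in the command above is the safer choice.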

Last edited by jsmithstl; 12-05-2009 at 01:32 PM.. Reason: FYI on your original code
# 4  
Old 12-05-2009
Code:
awk '$1" "$2" "$3" "$4!=$5" "$6" "$7" "$8' infile

Code:
sed '/\(\([^ ]* \)\{4\}\)\1/d' infile

Code:
sed -r '/(([^ ]* ){4})\1/d' infile

Code:
egrep -v '(([^ ]* ){4})\1' infile

Code:
egrep -v '((\w* ){4})\1' infile
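
One caveat with the regex variants: the pattern is not anchored, so the repeated group may start partway through the first field. A made-up row such as "a1 A 68 73 1 A 68 73 XXXXXX" (not from the original data) would be deleted even though columns 1 and 5 differ. Anchoring the pattern with ^ appears to avoid this. A sketch using the sed form, assuming single-space separators and no leading whitespace; the file name sample.txt and the variable names are hypothetical:

```shell
# Two rows from post #1 plus one made-up edge case (third row),
# where column 1 is "a1" and column 5 is "1".
cat > sample.txt <<'EOF'
1l7v A 68 73 1l7v A 68 73 AALAIS
1l7v A 68 73 1XVW B 72 77 AALAIS
a1 A 68 73 1 A 68 73 XXXXXX
EOF

# Unanchored pattern from the post: the match can begin at the "1" inside "a1",
# so the made-up third row is deleted as well.
unanchored=$(sed '/\(\([^ ]* \)\{4\}\)\1/d' sample.txt)

# Anchored variant: the four-field group must start at the beginning of the line,
# so only rows whose columns 1-4 equal columns 5-8 are deleted.
anchored=$(sed '/^\(\([^ ]* \)\{4\}\)\1/d' sample.txt)

printf 'unanchored:\n%s\n\nanchored:\n%s\n' "$unanchored" "$anchored"
```

The same ^ can be placed at the front of the sed -r and egrep patterns. The awk field comparison is not affected, since it compares whole fields.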


Last edited by Scrutinizer; 12-05-2009 at 03:40 PM..
# 5  
Old 12-05-2009
Thank you all for your replies!
Works fine now! :)
 