Need advice! Removing multiple entries in a single file!
Post 302377831 by InfoSeeker on Saturday 5th of December 2009 12:03:50 PM

Hello,
I have a file Test.txt with 9 columns that looks like this:

1g12 A 14 19 2OAY A 326 331 AAAASA
1l7v A 68 73 1l7v A 68 73 AALAIS
1l7v A 68 73 1XVW B 72 77 AALAIS
1l7v A 68 73 1XXU A 65 70 AALAIS
1l7v A 68 73 1XXU B 65 70 AALAIS
1l7v A 68 73 1XXU C 65 70 AALAIS
1l7v A 68 73 1XXU D 65 70 AALAIS
1j1n A 439 444 1j1n A 439 444 ADVRTY
1j1n A 439 444 1FUI B 360 365 ADVRTY

I am trying to remove repetitive entries from this file. A repetitive entry is one where Col 1 = Col 5 AND Col 2 = Col 6 AND Col 3 = Col 7 AND Col 4 = Col 8. Examples in the sample above are the rows where 1l7v and 1j1n are paired with themselves.

Is there a way to remove these repetitive entries and print the rest? I have read through some threads and tried to adapt some awk scripts. So far I have only tried the first condition (Col 1 != Col 5), but I get syntax errors. The code I wrote:

awk -F" " '{if($1!=$5){print $1" "$2" "$3" "$4" "$5" "$6" "$7" "$8" "$9"} }' Test.txt

Can someone advise me how to write this properly, extend it to all the conditions I mentioned, and print the whole line if all conditions are met?

Thanks in advance!
DG
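
One possible way to write this, offered as a sketch rather than the definitive answer: the syntax error in the command above most likely comes from the stray double quote after $9, which opens a string that is never closed. Beyond that, awk prints the whole line by default when a pattern is true, so there is no need to list the fields one by one. The function name filter_self_matches below is made up for illustration, and the sketch assumes the file is whitespace-separated with the columns exactly as shown in the sample.

```shell
# Hypothetical helper name; reads the named files, or stdin if none are given.
filter_self_matches() {
    # Skip a line only when all four column pairs match; print everything else.
    # No -F is needed: awk splits on runs of blanks by default.
    awk '!($1 == $5 && $2 == $6 && $3 == $7 && $4 == $8)' "$@"
}

# Usage, assuming Test.txt is the file from the post:
# filter_self_matches Test.txt
```

On the nine sample rows this should drop the two self-paired rows (1l7v and 1j1n) and print the other seven unchanged. Note that awk compares fields numerically when both look like numbers, which is fine for the position columns here; if that ever matters for the ID columns, appending "" to each side (for example $1"" == $5"") forces a string comparison.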
 
