09-30-2012
Extract unique combination of rows from text files
Hi Gurus,
I have 100 tab-delimited text files, each with 21 columns. I want to extract only the 2nd and 5th columns from each text file. Both the 2nd and the 5th column contain duplicate values, but the combination of the two values in a row is never duplicated. I want to keep only the entries that are unique based on the first appearance of a value in the 5th column.
For example, file1.txt contains:
rup m45 23 67 334 56 88
ytp m65 45 52 334 67 23
asd m43 12 34 456 23 11
wer m56 34 23 334 45 56
ayd m42 12 34 456 27 17
tyu m78 12 45 678 23 56
The output should be
rup m45 23 67 334 56 88
asd m43 12 34 456 23 11
tyu m78 12 45 678 23 56
Could somebody show me a way to do this for all 100 files in one go?
Thanks a lot indeed.
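One possible approach, offered as a minimal sketch rather than a definitive answer: let awk remember which 5th-column values it has already seen and print only the first row for each. The function names `extract_unique` and `process_all` and the `.uniq` output suffix are made up for illustration, and the sketch assumes that the fields really are separated by single tabs and that duplicates should be dropped per file, not across all 100 files.

```shell
#!/bin/sh
# extract_unique FILE (hypothetical helper)
# Prints columns 2 and 5 of the first row seen for each distinct value
# in column 5. Rows with fewer than 5 tab-separated fields are skipped.
extract_unique() {
    awk -F'\t' -v OFS='\t' 'NF >= 5 && !seen[$5]++ { print $2, $5 }' "$1"
}

# process_all DIR (hypothetical helper)
# Runs extract_unique on every *.txt file in DIR (default: current
# directory) and writes each result next to its input as NAME.uniq.
process_all() {
    dir=${1:-.}
    for f in "$dir"/*.txt; do
        [ -e "$f" ] || continue          # the glob matched nothing
        extract_unique "$f" > "${f%.txt}.uniq"
    done
}
```

Calling `process_all /path/to/files` handles every file in one pass. For the sample rows above, provided they are tab-separated, file1.uniq would hold the pairs m45/334, m43/456 and m78/678. To keep the whole row instead of just two columns, as in the sample output, replace `print $2, $5` with `print`.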
column
COLUMN(1) BSD General Commands Manual COLUMN(1)
NAME
column -- columnate lists
SYNOPSIS
column [-tx] [-c columns] [-s sep] [file ...]
DESCRIPTION
The column utility formats its input into multiple columns. Rows are filled before columns. Input is taken from file operands, or, by
default, from the standard input. Empty lines are ignored.
The options are as follows:
-c columns Output is formatted for a display columns wide.
-s sep Specify a set of characters to be used to delimit columns for the -t option.
-t Determine the number of columns the input contains and create a table. Columns are delimited with whitespace, by default, or with
the characters supplied using the -s option. Useful for pretty-printing displays.
-x Fill columns before filling rows.
Column exits 0 on success, >0 if an error occurred.
ENVIRONMENT
COLUMNS The environment variable COLUMNS is used to determine the size of the screen if no other information is available.
EXAMPLES
(printf "PERM LINKS OWNER GROUP SIZE MONTH DAY HH:MM/YEAR NAME\n"; ls -l | sed 1d) | column -t
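As a further illustration that is not part of the original manual page, the -s option can be combined with -t to align input whose fields are separated by something other than whitespace. The wrapper name `align_fields` is hypothetical, and the sketch assumes a `column` binary that accepts the -t and -s options listed in the SYNOPSIS above.

```shell
# align_fields SEP (hypothetical wrapper)
# Reads delimited text on standard input and prints it as an aligned
# table, splitting fields on the characters in SEP (default: a colon).
align_fields() {
    column -t -s "${1:-:}"
}

# Example call (commented out so that sourcing this file prints nothing):
# printf 'name:uid:shell\nroot:0:/bin/sh\n' | align_fields ':'
```

Each input line becomes one table row, and the delimiter characters themselves do not appear in the output.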
SEE ALSO
colrm(1), ls(1), paste(1), sort(1)
HISTORY
The column command appeared in 4.3BSD-Reno.
BSD
June 6, 1993 BSD