Sponsored Content
Top Forums UNIX for Dummies Questions & Answers Extract unique combination of rows from text files Post 302707975 by Unilearn on Sunday 30th of September 2012 05:05:07 PM
Old 09-30-2012
Extract unique combination of rows from text files

Hi Gurus,

I have 100 tab-delimited text files each with 21 columns. I want to extract only 2nd and 5th column from each text file. However, the values in both 2bd and 5th column contain duplicate values but the combination of these values in a row are not duplicate. I want to extract only those entries which are unique based on the first appearance of value in the 5th column.

Ex. file1.txt conatins
rup m45 23 67 334 56 88
ytp m65 45 52 334 67 23
asd m43 12 34 456 23 11
wer m56 34 23 334 45 56
ayd m42 12 34 456 27 17
tyu m78 12 45 678 23 56

The output should be
rup m45 23 67 334 56 88
asd m43 12 34 456 23 11
tyu m78 12 45 678 23 56

Could somebody show me a way to deal with this for 100 files in one go!
Thanks a lot indeed.
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

comparing 2 text files to get unique values??

Hi all, I have got a problem while comparing 2 text files and the result should contains the unique values(Non repeatable). For eg: file1.txt 1 2 3 4 file2.txt 2 3 So after comaping the above 2 files I should get only 1 and 4 as the output. Pls help me out. (7 Replies)
Discussion started by: smarty86
7 Replies

2. Shell Programming and Scripting

Compare 200,000 of rows in two text files

Friends, I have two very large plain text files with pipe delimited as below. Both files are not sorted. Both files have 200,000 of rows. FName|LName|Address|HPhNumber Is perl or shell script feasible for this task? Thanks, Prashant (1 Reply)
Discussion started by: ppat7046
1 Replies

3. Shell Programming and Scripting

extract unique pattern from large text file

Hi All, I am trying to extract data from a large text file , I want to extract lines which contains a five digit number followed by a hyphen , like 12345- , i tried with egrep ,eg : egrep "+" text.txt but which returns all the lines which contains any number of digits followed by hyhen ,... (19 Replies)
Discussion started by: shijujoe
19 Replies

4. Shell Programming and Scripting

extract multiple cloumns from multiple files; skip rows and include filenames; awk

Hello, I am trying to write a bash shell script that does the following: 1.Finds all *.txt files within my directory of interest 2. reads each of the files (25 files) one by one (tab-delimited format and have the same data format) 3. skips the first 10 rows of the file 4. extracts and... (4 Replies)
Discussion started by: manishabh
4 Replies

5. Shell Programming and Scripting

head / tail combination returns multiple rows

Hi, As part of our project, we need to load historical data for a year before our system is live. We have the data feed files that we need to load. However, I need to make sure that the file structure (number of fields separated by a comma) on the field is same for all the files of the same... (1 Reply)
Discussion started by: raj.jha
1 Replies

6. Shell Programming and Scripting

Select combination unique using shell script

Hi All, bash-3.00$ gzgrep -i '\ ExecuteThread:' /******/******/******/******/stdout.log.txt.gz <Jan 7, 2012 5:54:55 PM UTC> <Error> <WebLogicServer> <BEA-000337> < ExecuteThread: '414' for queue: 'weblogic.kernel.Default (self-tuning)' has been busy for "696" seconds working on the request... (4 Replies)
Discussion started by: osmanux
4 Replies

7. Shell Programming and Scripting

Finding a text in files & replacing it with unique strings

Hallo Everyone. I have to admit I'm shell scripting illiterate . I need to find certain strings in several text files and replace each of the string by unique & corresponding text. I prepared a csv file with 3 columns: <filename>;<old_pattern>;<new_pattern> ... (5 Replies)
Discussion started by: gordom
5 Replies

8. Shell Programming and Scripting

Extract unique files

In a incoming folder i have list of files like below,i want to pick the unique files to process the job. if same file contain more than one then it should pick latest date modified file to process. drwxrwsrwx 2 n308799 infagrp 256 May 20 17:42 Final_Working drwxrwsrwx 2... (1 Reply)
Discussion started by: katakamvivek
1 Replies

9. Shell Programming and Scripting

Unique extraction of rows

I do have a tab delimited file of the following format: 431 kat1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 432 kat2 2 NA NA NA NA NA NA NA NA NA NA NA NA NA 433 KATe NA 3 NA NA 6 NA NA NA 10 11 NA NA NA NA 542 Kaed 2 NA NA NA NA NA NA NA NA NA NA NA NA NA 543 hkwuy NA NA NA NA 6 NA NA NA NA 11 NA NA... (11 Replies)
Discussion started by: Kanja
11 Replies

10. Shell Programming and Scripting

Compare 2 csv files by columns, then extract certain columns of matcing rows

Hi all, I'm pretty much a newbie to UNIX. I would appreciate any help with UNIX coding on comparing two large csv files (greater than 10 GB in size), and output a file with matching columns. I want to compare file1 and file2 by 'id' and 'chain' columns, then extract exact matching rows'... (5 Replies)
Discussion started by: bkane3
5 Replies
COLUMN(1)						    BSD General Commands Manual 						 COLUMN(1)

NAME
column -- columnate lists SYNOPSIS
column [-tx] [-c columns] [-s sep] [file ...] DESCRIPTION
The column utility formats its input into multiple columns. Rows are filled before columns. Input is taken from file operands, or, by default, from the standard input. Empty lines are ignored. The options are as follows: -c Output is formatted for a display columns wide. -s Specify a set of characters to be used to delimit columns for the -t option. -t Determine the number of columns the input contains and create a table. Columns are delimited with whitespace, by default, or with the characters supplied using the -s option. Useful for pretty-printing displays. -x Fill columns before filling rows. Column exits 0 on success, >0 if an error occurred. ENVIRONMENT
COLUMNS The environment variable COLUMNS is used to determine the size of the screen if no other information is available. EXAMPLES
(printf "PERM LINKS OWNER GROUP SIZE MONTH DAY HH:MM/YEAR NAME " ; ls -l | sed 1d) | column -t SEE ALSO
colrm(1), ls(1), paste(1), sort(1) HISTORY
The column command appeared in 4.3BSD-Reno. BSD
June 6, 1993 BSD
All times are GMT -4. The time now is 09:54 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy