File manipulation based on values in file


 
# 1  
Old 10-30-2009

Hi,
I am using the shell to do some file manipulation.

Code:
Input - input.txt
"2006/2007", "string1a","string2v","stringf3"
"2006/2007", "string12b","string30c","string10d"
"2006/2007", "string22","string22","string11"
"2007/2008", "string1a","string2v","stringf3"
"2007/2008", "string12","string30","string10"
"2007/2008", "string22","string222","string111"
"2007/2008", "string100","string111","string444"
"2007/2008", "string134","string245","string389"
"2008/2009", "string1","string2","string3"
"2008/2009", "string12","string30","string10"
"2008/2009", "string22","string222","string111"
"2008/2009", "string1000","string1111","string4444"
"2008/2009", "string1234","string2456","string3789"

I need the output split into three (or however many) files, one per unique value in the first column.
Code:
 
Output
File 2006-2007.txt
"2006/2007", "string1a","string2v","stringf3"
"2006/2007", "string12b","string30c","string10d"
"2006/2007", "string22","string22","string11"
File 2007-2008.txt
"2007/2008", "string1a","string2v","stringf3"
"2007/2008", "string12","string30","string10"
"2007/2008", "string22","string222","string111"
"2007/2008", "string100","string111","string444"
"2007/2008", "string134","string245","string389"
File 2008-2009.txt
"2008/2009", "string1","string2","string3"
"2008/2009", "string12","string30","string10"
"2008/2009", "string22","string222","string111"
"2008/2009", "string1000","string1111","string4444"
"2008/2009", "string1234","string2456","string3789"

I am trying to take the unique values from the first column using cut -d, -f1 | sort | uniq to create a list of files, and then using the code below to populate the output files.
Code:
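# One full grep pass over input.txt per distinct first-column value (slow on large input)
for id in $(cut -d, -f1 input.txt | sort | uniq | tr -d '"'); do
    # "/" cannot appear in a file name, so 2006/2007 becomes 2006-2007.txt
    grep "^\"$id\"" input.txt > "$(echo "$id" | tr / -).txt"
done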

But it is taking a very long time. cut alone takes 7 minutes because the input file is huge.
Please suggest another approach that would improve the turnaround time.
# 2  
Old 10-30-2009
Why didn't you do the sort first? That will speed things up...
If the file is huge, why not extract the values and store them in a file to be read by a for/while loop,
e.g.
Code:
while read -r IDVALUE
do
  grep "$IDVALUE" input.txt   # etc...
done < valuefile

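One way to build such a value file without sorting the whole input is to let awk remember which keys it has already seen. This is only a minimal sketch: it assumes the first field is always double-quoted as in the sample, and unique_keys is a made-up helper name, not something from the thread.

```shell
# unique_keys FILE: print each distinct first-column value once, in order of
# first appearance, with the surrounding double quotes stripped.
# Assumes every line starts with a double-quoted field, as in the sample input.
unique_keys() {
    # With '"' as the field separator, $1 is the empty text before the
    # opening quote and $2 is the key itself (e.g. 2006/2007).
    awk -F'"' '!seen[$2]++ { print $2 }' "$1"
}
```

It could be used as unique_keys input.txt > valuefile. Only one entry per distinct key is held in memory, so the size of the input should not matter much.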
# 3  
Old 10-30-2009
Code:
# Only the file name gets "/" replaced by "-"; the lines themselves are written unchanged
awk -F'"' '{file = $2; sub("/", "-", file); print > (file ".txt")}' input.txt

# 4  
Old 10-30-2009
# 5  
Old 10-30-2009
Code:
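use strict;
use warnings;

# Group the lines by their first comma-separated field.
my %hash;
while (<DATA>) {
	my @tmp = split(",", $_);
	$hash{$tmp[0]} .= $_;
}
# Write each group to a file named after its key, e.g. 2006-2007.txt.
foreach my $key (keys %hash) {
	my $file = $key;
	$file =~ s/"//g;
	$file =~ s/\//-/;
	$file .= ".txt";
	open(my $fh, '>', $file) or die "Cannot open $file: $!";
	print $fh $hash{$key};
	close $fh;
}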
__DATA__
"2006/2007", "string1a","string2v","stringf3"
"2006/2007", "string12b","string30c","string10d"
"2006/2007", "string22","string22","string11"
"2007/2008", "string1a","string2v","stringf3"
"2007/2008", "string12","string30","string10"
"2007/2008", "string22","string222","string111"
"2007/2008", "string100","string111","string444"
"2007/2008", "string134","string245","string389"
"2008/2009", "string1","string2","string3"
"2008/2009", "string12","string30","string10"
"2008/2009", "string22","string222","string111"
"2008/2009", "string1000","string1111","string4444"
"2008/2009", "string1234","string2456","string3789"

# 6  
Old 10-31-2009
Code:
awk -F'["/]' '{print >> ("File_"$2"-"$3".txt")}' inFile

Use gawk, nawk or /usr/xpg4/bin/awk on Solaris.
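
If the input has many distinct first-column values, some awk versions may hit their limit on simultaneously open files. A possible variation on the one-liner above is to close the previous output whenever the key changes, so only one file is open at a time. This is a sketch under assumptions: the first field is always double-quoted as in the sample, the output is named after the key alone (2006-2007.txt, as in the original request), and split_by_key is a hypothetical function name.

```shell
# split_by_key FILE: append each line of FILE to <key>.txt, where <key> is the
# first quoted field with "/" replaced by "-" ("2006/2007" -> 2006-2007.txt).
split_by_key() {
    awk -F'"' '
    {
        key = $2
        gsub("/", "-", key)          # "/" is not allowed in a file name
        out = key ".txt"
        if (out != prev) {           # key changed: release the previous file
            if (prev != "") close(prev)
            prev = out
        }
        print >> out                 # append, so a reopened file keeps its lines
    }' "$1"
}
```

Call it as split_by_key input.txt. Because it appends, output files left over from an earlier run should be removed first. It does not require the input to be sorted, but grouped input means fewer open/close cycles.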