Split file based on distinct value at specific position


 
# 1  
Old 07-22-2013

OS : Linux 2.6x
Shell : Korn

In a single file, how can I identify all the unique values at a specific character position and length in each record,
and at the same time split the records of the file by each of these values and write them to separate files?

Let's say:
Code:
a) I want to know the distinct values in the field marked by start character position 15 and the next three
   characters, for each record in a file
b) If there are TWO SUCH DISTINCT VALUES, how do I get the records for each of the distinct values into separate files?

Please help, it's urgent, and it's not a HOMEWORK ASSIGNMENT.

Thanks
Kumarjit.
# 2  
Old 07-22-2013
Could you provide sample input and output files?
# 3  
Old 07-22-2013
Try
Code:
awk '{fn=substr($0, 15,3); print > fn}' file

If the number of output files gets larger and larger, you may need to close (fn) in between...
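
For part (a) of the question, listing the distinct values, a minimal sketch could look like the following. The file name sample.txt and its records are made up for illustration, and the key is assumed to sit in columns 15-17, matching the substr() call above.

```shell
# Hypothetical sample data: 14 filler characters, then the 3-character key
# in columns 15-17. Run this in a scratch directory.
printf '%s\n' \
    'ID000000000001AAAfirst' \
    'ID000000000002BBBsecond' \
    'ID000000000003AAAthird' > sample.txt

# Distinct keys only.
cut -c15-17 sample.txt | sort -u

# Distinct keys with the number of records for each.
awk '{ count[substr($0, 15, 3)]++ }
     END { for (key in count) print key, count[key] }' sample.txt | sort
```

The count per key also tells you in advance how many output files the split will create.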
# 4  
Old 07-23-2013
@RudiC: Danke, Rudi.
But I didn't understand what you meant by:

Code:
you may need to close (fn) in between...

If the number of distinct values indicated by the field whose start position is the 15th character, spanning the next three characters, how to ensure that this code performs optimally? I tried the awk command over 10 million records, and it was going at a snail's pace.

Please validate whether my understanding is correct.

Thanks to all of you.

Regards
Kumarjit.

---------- Post updated at 03:25 AM ---------- Previous update was at 03:21 AM ----------

Actually , what I tried to mean is :

If the number of distinct values indicated by the field whose start position is the 15th character, spanning the next three characters, is significantly on the larger side, how do I ensure that this code performs optimally? I tried the awk command over 10 million records, and it was going at a snail's pace.



Thanks again
Kumarjit.
# 5  
Old 07-26-2013
At some point, the number of open files per process is exhausted. With three characters you can certainly reach 1000+ files, which is very close to OPEN_MAX (typically 1024 on Linux). You then need to append to the files, using >>, and close (fn) after each write.
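
A minimal sketch of the append-and-close approach, under some assumptions: sample.txt and its records are made up, and the "part_" prefix and ".txt" suffix are only illustrative names. As a variation on closing after every write, this version closes the previous file only when the key changes, so at most one output file is open at a time and runs of records with the same key do not pay for a reopen on each line.

```shell
# Hypothetical sample data, key in columns 15-17. Run in a scratch directory.
printf '%s\n' \
    'ID000000000001AAAfirst' \
    'ID000000000002AAAsecond' \
    'ID000000000003BBBthird' \
    'ID000000000004AAAfourth' > sample.txt

# ">>" appends, so clear output from any earlier run first.
rm -f part_*.txt

awk '{
    fn = "part_" substr($0, 15, 3) ".txt"   # "part_" and ".txt" are assumed names
    if (fn != prev) {                        # key changed since the last record
        if (prev != "") close(prev)          # release the previous descriptor
        prev = fn
    }
    print >> fn                              # append, so a reopened file is not truncated
}' sample.txt
```

The >> matters because a file reopened with > after close() would be truncated and lose its earlier records. Running ulimit -n shows the open-file limit that applies to the current shell. If the key can contain a slash or blanks, the file name would need sanitizing first.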