Extract data based on position


 
# 1  
Old 11-17-2010

The file has a record length of 200. I have 100 unique search strings, each ten characters long, that need to be searched for in the file. Please help me pull the records based on position (say, characters 1-10).
Test data:

Code:
 
1FAHP2DW0BG115206RASHEED                        ABDALAZYZ                                                                  SAU2011PH    TWQE      9B040 RIYADH              A12010-11-072010-11-08K0K22                                                                                                            0555211674          
1FAHP2DW0BG120728MILHEM                         AL JOBIL                           033623707                               SAU2011PH    TWQE      9B040 RIYADH              A12010-11-012010-11-07K0K22                                                                                                            0567489990          
1FAHP2DW1BG115148SAWAD                          ALAWAMI                            026211449                               SAU2011PH    TWQE      9B040 RIYADH              A12010-11-062010-11-07K0K22                                                                                                            0555518986          
1FAHP2DW1BG115165ABRAHIM                        SAQR                               014264836                               SAU2011PH    TWQE      9B040 RIYADH              A12010-11-092010-11-10K0K22                                                                                                            0505255296          
1FAHP2DW2BG110041MOHAMMAD                       AL SALAMA                          038605655                               SAU2011PH    TWQE      9B040 RIYADH              A12010-11-072010-11-08K0K22                                                                                                            0500707095          
1FAHP2DW2BG115174MOHAMMED FAHED                 ABDULLAZIZ                                                                 SAU2011PH    TWQE      9B040 RIYADH              A12010-11-092010-11-10K0K22                                                                                                            0530001551          
1FAHP2DW3BG118889TORKEE                         ABAD ALKAREEM                      0565550099                              SAU2011PH    TWQE      9B040 RIYADH              A12010-11-042010-11-07K0K22                                                                                                            0555550099          
1FAHP2DW3BG126622MOSAB                          ALGTHAMI                           027290078                               SAU2011PH    TWQE      9B040 RIYADH              A12010-11-062010-11-07K0K22                                                                                                            0505704557          
1FAHP2DW4BG110090ASEM                           EZAT                               026602012                               SAU2011PH    TWQE      9B040 RIYADH              A12010-11-022010-11-07K0K22                                                                                                            0565656323          
1FAHP2DW4BG115192AL MULHIM                      AUTO SERVICE                                                               SAU2011PH    TWQE      9B040 RIYADH              A12010-11-082010-11-09K0K22                                                                                                            0590221522          
1FAHP2DW4BG126631MOHAMMED                       HMEED                              026791887                               SAU2011PH    TWQE      9B040 RIYADH              A12010-11-082010-11-09K0K22                                                                                                            0554625854          
1FAHP2DW5BG110017AL MULHIM                      AUTO SERVICE                                                               SAU2011PH    TWQE      9B040 RIYADH              A12010-11-092010-11-10K0K22                                                                                                            0590221522          
1FAHP2DW5BG115220ABDULALAH                      ABDULAH                                                                    SAU2011PH    TWQE      9B040 RIYADH              A12010-11-042010-11-07K0K22                                                                                                            0506722882          
1FAHP2DW5BG120725MAHER                          MOHAMAD                            025482574                               SAU2011PH    TWQE      9B040 RIYADH              A12010-11-092010-11-10K0K22                                                                                                            0567511759          
1FAHP2DW6BG110057ZHER                           SOLIMAM                                                                    SAU2011PH    TWQE      9B040 RIYADH              A12010-11-092010-11-10K0K22                                                                                                            0556286929          
1FAHP2DW6BG115176NASER                          ABDALLAH                                                                   SAU2011PH    TWQE      9B040 RIYADH              A12010-11-082010-11-09K0K22                                                                                                            0502832959

# 2  
Old 11-17-2010
Code:
awk ' FNR == NR { a[$0]=1; next } a[substr($0,1,10)] } ' search_string_file input_file

# 3  
Old 11-18-2010
Thanks for your reply. When I run the above command I get the following error message. I tried adjusting the curly braces, but I still get an error.


awk: FNR == NR { a[$0]=1; next } a[substr($0,1,10)] }
awk: ^ syntax error
# 4  
Old 11-18-2010
Looks like a typo in the command; this fixes the syntax error:

Code:
awk ' FNR == NR { a[$0]=1; next } a[substr($0,1,10)] ' search_string_file input_file

Note: It wasn't clear from your description where the search should be looking. The above code checks the first 10 characters of each line of input_file for a match with the search strings. If a search string isn't exactly 10 characters long, it won't match anything.
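
To illustrate the note above, here is a minimal sketch of the awk approach on a tiny data set. The temporary files, the array name keys, and the two sample keys are assumptions made up for this illustration. It also tests membership with `in`, a small variation on the command above that avoids creating empty array entries.

```shell
# Hypothetical sample files, created only for this illustration.
workdir=$(mktemp -d)
printf '%s\n' '1FAHP2DW0B' '1FAHP2DW3B' > "$workdir/search_string_file"
printf '%s\n' \
  '1FAHP2DW0BG115206RASHEED' \
  '1FAHP2DW1BG115148SAWAD' \
  '1FAHP2DW3BG118889TORKEE' > "$workdir/input_file"

# While reading the first file (FNR == NR), store each line as a key.
# While reading the second file, print lines whose first 10 characters
# are one of the stored keys.
matches=$(awk 'FNR == NR { keys[$0] = 1; next } substr($0, 1, 10) in keys' \
  "$workdir/search_string_file" "$workdir/input_file")

printf '%s\n' "$matches"
rm -rf "$workdir"
```

With these sample files it should print the RASHEED and TORKEE records. Trailing blanks or carriage returns in the search file would make a key longer than 10 characters, so that key would no longer match.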

If all you want is a match anywhere on the line, you could use grep:

Code:
grep -f search_string_file input_file
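
As a rough sketch of what "anywhere on the line" means in practice, the following builds two throwaway files and runs the same kind of grep. The file contents are invented for this illustration, and the -F option (treat each search string as a literal, not a regular expression) is an addition that assumes the keys are plain text.

```shell
# Hypothetical sample files, created only for this illustration.
workdir=$(mktemp -d)
printf '%s\n' '0555550099' > "$workdir/search_string_file"
printf '%s\n' \
  '0555550099 key at the start' \
  'XXXXXXXXXX 0555550099 key later in the record' \
  'YYYYYYYYYY no key here' > "$workdir/input_file"

# -f reads one pattern per line from the file; -F matches them as fixed strings.
# A line is printed if any key occurs at any position in it.
anywhere=$(grep -F -f "$workdir/search_string_file" "$workdir/input_file")

printf '%s\n' "$anywhere"
rm -rf "$workdir"
```

Both of the first two lines are selected here, which is why this form is only suitable when the position of the key does not matter.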


Last edited by Chubler_XL; 11-18-2010 at 05:48 PM..
# 5  
Old 11-18-2010
@Chubler:

If the matching needed is similar, consider the following example
(which can then be adapted by analogy to our needs):

Code:
[ctsgnb@shell ~]$ cat mys
toto
bla
124
125
[ctsgnb@shell ~]$ cat myfile
134512345 toto whateva
124523043 hhhh nothing
123512344 anyword bla
125412450 hshs juju
124452344 titi bla
123451345 hhhhhhh tatat
[ctsgnb@shell ~]$ sed 's/.*/^&/' mys
^toto
^bla
^124
^125
[ctsgnb@shell ~]$ sed 's/.*/^&/' mys | grep -f - myfile
124523043 hhhh nothing
125412450 hshs juju
124452344 titi bla
[ctsgnb@shell ~]$

Another one, this time matching the patterns right after the first 10 characters:

Code:
[ctsgnb@shell ~]$ cat myfile
134512345 toto whateva
124523043 hhhh nothing
124599943 bla nothing
123512344 anyword bla
125412450 hshs juju
124452344 titi bla
123451345 hhhhhhh toto
125412450 bla juju
125412450 toto juju
[ctsgnb@shell ~]$ cat mys
toto
bla
124
125
[ctsgnb@shell ~]$ sed 's/.*/^.\\{10\\}&/' mys | grep -f - myfile
134512345 toto whateva
124599943 bla nothing
125412450 bla juju
125412450 toto juju
[ctsgnb@shell ~]$

So, in general:
Code:
sed 's/.*/^&/' patternfile | grep -f - mydatafile
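
The two examples above differ only in how many characters are skipped before the pattern, so they could be folded into one small helper. This is only a sketch: the function name match_at_offset and the sample files are made up for this illustration, it uses grep -E so the repeat count can be written as {N} without backslashes, and it assumes the patterns contain no regular-expression metacharacters and that grep accepts "-f -" to read patterns from standard input, as in the examples above.

```shell
# match_at_offset SKIP PATTERNFILE DATAFILE   (hypothetical helper)
# Prints lines of DATAFILE in which some pattern starts right after
# the first SKIP characters (SKIP=0 means "at the start of the line").
match_at_offset() {
  sed "s/.*/^.{$1}&/" "$2" | grep -E -f - "$3"
}

# Hypothetical sample files, created only for this illustration.
workdir=$(mktemp -d)
printf '%s\n' 'toto' 'bla' '124' > "$workdir/mys"
printf '%s\n' \
  '134512345 toto whateva' \
  '124523043 hhhh nothing' \
  '124599943 bla nothing' \
  '123512344 anyword bla' > "$workdir/myfile"

after10=$(match_at_offset 10 "$workdir/mys" "$workdir/myfile")
at_start=$(match_at_offset 0 "$workdir/mys" "$workdir/myfile")

printf 'after 10 characters:\n%s\n' "$after10"
printf 'at line start:\n%s\n' "$at_start"
rm -rf "$workdir"
```

For the original question (keys in positions 1-10) the skip count would be 0. Unlike the awk version, this does not require every key to have the same length.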


Last edited by ctsgnb; 11-18-2010 at 06:27 PM..
# 6  
Old 11-19-2010
Perfect... All three work as expected. Thank you all!
# 7  
Old 11-19-2010
Hi,
While doing FTP from mainframe to Unix, if the file we are sending already exists on the server, it gets replaced. When I used the sunique command, it does not replace the file but still creates a copy of it with a .1 to .99 extension. Is there any way to abort the file transfer if the file already exists on the Unix server, without creating another copy?
Can you please help me with this?
Thanks in advance.