Fetching records based on a unique key from a huge file


 
# 8  
Old 10-18-2012
Hi elixir_sinari,
I ran your command, but it is not producing any output.
Please see below.
extract.txt contains the records; as an example, I am running it for just two records from the file, numbers 87 and 88.
Code:
ckm1&cu158158:~/work/models/model/temp/shell > awk '$2+0>=06000000087' FS='SEQUENCE NUMBER: ' RS= extract.txt
ckm1&cu158158:~/work/models/model/temp/shell >

thanks,
prashant

---------- Post updated at 05:38 AM ---------- Previous update was at 05:33 AM ----------

itkamaraj, your command is working fine, but it prints only the sequence number. I want all the data for each record, as I posted in my second post.
For example:
Code:
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
EMDS/CREDIT SERVICES RunDate: 10/25/2011 REC# 1
SEQUENCE NUMBER: 06000000001 L90 KEY :060882D85800E201684C0000 CID KEY:000000000000000000
DEBT RATIO= 0% INQ W/I 6= 0, W/I 12= 0, W/I 18= 0, W/I 24= 0 ,PRM W/I 6=0; Trades RPTed W/I 24 = 0, No. of Trades = 3
*** FACT FRAUD CODE = ,FACT ALERT DATA CODE = 
XXXXXX XXXXX SSN-XXX-XX-XXXX C
XXXXXX,XXXXX. FAD: 0/00/0000 FILE SINCE: 0/00/0000
XXXXX XXXXXXXX XX XXXXXXXCA,92236
,,,,,,,,0
,,,,,,,,0
,,,,,,,,0
 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
EMDS/CREDIT SERVICES RunDate: 10/25/2011 REC# 2
SEQUENCE NUMBER: 06000000002 L90 KEY :060882D85800E201684C0000 CID KEY:000000000000000000
 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
EMDS/CREDIT SERVICES RunDate: 10/25/2011 REC# 2
SEQUENCE NUMBER: 06000000003 L90 KEY :060882D85800E201684C0000 CID KEY:000000000000000000
.
.
.
 
(and so on, up to 100k records)
 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
EMDS/CREDIT SERVICES RunDate: 10/25/2011 REC# 2
SEQUENCE NUMBER: 06000100000 L90 KEY :060882D85800E201684C0000 CID KEY:000000000000000000
DEBT RATIO= 0% INQ W/I 6= 0, W/I 12= 0, W/I 18= 0, W/I 24= 0 ,PRM W/I 6=0; Trades RPTed W/I 24 = 0, No. of Trades = 3
*** FACT FRAUD CODE = ,FACT ALERT DATA CODE = 
XXXXXX XXXXX SSN-XXX-XX-XXXX C
XXXXXX,XXXXX. FAD: 0/00/0000 FILE SINCE: 0/00/0000
XXXXX XXXXXXXX XX XXXXXXXCA,92236
,,,,,,,,0
,,,,,,,,0
,,,,,,,,0

thanks

Last edited by Scrutinizer; 10-18-2012 at 06:47 AM.. Reason: code tags
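
For the layout shown in this post, one possible approach is to buffer each record and print it only when its sequence number falls in the wanted range. This is a minimal sketch, not a command from the thread: the function name extract_blocks and its arguments are hypothetical, and it assumes every record starts with a line beginning with at least 20 X characters, as in the sample. It needs an awk that supports functions and -v (for example nawk on Solaris).

```shell
#!/bin/sh
# Hypothetical helper (not from the thread).
# usage: extract_blocks LO HI FILE
# Prints every record whose SEQUENCE NUMBER lies in [LO, HI].
extract_blocks() {
    awk -v lo="$1" -v hi="$2" '
        # Print the buffered record if it was selected, then reset the buffer.
        function emit() { if (want) printf "%s", buf; buf = ""; want = 0 }

        # Assumption: a line starting with 20 or more X marks a new record.
        /^XXXXXXXXXXXXXXXXXXXX/ { emit() }

        {
            buf = buf $0 "\n"
            if (match($0, /SEQUENCE NUMBER: [0-9]+/)) {
                # 17 is the length of "SEQUENCE NUMBER: "
                seq  = substr($0, RSTART + 17, RLENGTH - 17) + 0
                want = (seq >= lo + 0 && seq <= hi + 0)
            }
        }

        END { emit() }
    ' "$3"
}

# Example call for records 87 and 88:
# extract_blocks 06000000087 06000000088 extract.txt
```

Only one record is held in memory at a time, so the size of the input file should not matter much.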
# 9  
Old 10-18-2012
Based on the given data:

Code:
 $ awk 'BEGIN{a=1}a{print;getline;print;getline;a=0}/SEQUENCE NUMBER: 06000000001/,/SEQUENCE NUMBER: 06000100000/{
if($0~/06000100000/){for(i=0;i<=7;i++){print;getline;}}}1' a.txt

# 10  
Old 10-18-2012
Hi itkamaraj, thanks for the update.

Somehow I am getting an error when running your command:
awk: syntax error near line 1
awk: bailing out near line 1

I do not understand why this error occurs.

thanks
prashant
# 11  
Old 10-18-2012
Try with nawk.
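
The error text in the previous post is what the legacy awk on some systems (Solaris /usr/bin/awk, for example) typically prints for syntax it does not support, which is presumably why nawk is suggested. A small sketch for picking an awk at run time follows; pick_awk and its candidate list are assumptions, not part of the thread.

```shell
#!/bin/sh
# Hypothetical helper: print the first awk implementation found,
# preferring the newer ones over a possibly legacy "awk".
pick_awk() {
    for cand in nawk /usr/xpg4/bin/awk gawk awk; do
        if command -v "$cand" >/dev/null 2>&1; then
            printf '%s\n' "$cand"
            return 0
        fi
    done
    return 1
}

# Example use:
# AWK=$(pick_awk) || { echo "no awk found" >&2; exit 1; }
# "$AWK" '/SEQUENCE NUMBER:/' extract.txt
```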
# 12  
Old 10-18-2012
hi itkamaraj,

I made a small change to your command: I changed the sequence numbers, and I am now trying to fetch only the first five records from a file that currently has 10 records, using the command below. But somehow it displays all 10 records present in extract.txt.

Code:
nawk 'BEGIN{a=1}a{print;getline;print;getline;a=0}/SEQUENCE NUMBER: 56000000001/,/SEQUENCE NUMBER: 56000000005/{
if($0~/56000000005/){for(i=0;i<=7;i++){print;getline;}}}1' extract.txt

---------- Post updated at 09:59 AM ---------- Previous update was at 06:39 AM ----------

Can anybody help me sort out the problem I am facing?

I am running the command below, which displays the records for all sequence numbers in the file, but I want only two sequence numbers from the file.

Code:
nawk 'BEGIN{a=1}a{print;getline;print;getline;a=0}/SEQUENCE NUMBER: 56000000001/,/SEQUENCE NUMBER: 56000000002/{
if($0~/56000000002/){for(i=0;i<=7;i++){print;getline;}}}1' extract.txt

Input file:
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
1 xxxxxxxxxxxxx xxxxxxxx#1000215 xxxxx REC# 3 PAGE: 1
xxxxxxx xxxxxxA SEQUENCE NUMBER: 56000000001
1 xxxxxxxxxxxxx xxxxxxxx#1000215 xxxxx REC# 3 PAGE: 2
SEQUENCE NUMBER: 56000000001(CONT)
0*************************************************************************************************** *********************************
* **** xxxxxxxxxxxxxxxxxxxxxx **** *
**************************************************************************************************** ********************************
1 xxxxxxxxxxxxx xxxxxxxx#1000215 xxxxx REC# 3 PAGE: 2
xxxxxxx xxxxxxA SEQUENCE NUMBER: 56000000002
1 xxxxxxxxxxxxx xxxxxxxx#1000215 xxxxxx REC# 3 PAGE: 2
SEQUENCE NUMBER: 56000000002(CONT)
0*************************************************************************************************** *********************************
* **** xxxxxxxxxxxxxxxxxxxxxx **** *
**************************************************************************************************** ********************************
1 xxxxxxxxxxxxx xxxxxxxx#1000215 xxxxx REC# 3 PAGE: 2
xxxxxxx xxxxxxA SEQUENCE NUMBER: 56000000003
1 xxxxxxxxxxxxx xxxxxxxx#1000215 xxxxx REC# 3 PAGE: 2
SEQUENCE NUMBER: 56000000003(CONT)
0*************************************************************************************************** *********************************
* **** xxxxxxxxxxxxxxxxxxxxxx **** *
**************************************************************************************************** ********************************

The output file should be:
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
1 xxxxxxxxxxxxx xxxxxxxx#1000215 xxxxx REC# 3 PAGE: 1
xxxxxxx xxxxxxA SEQUENCE NUMBER: 56000000001
1 xxxxxxxxxxxxx xxxxxxxx#1000215 xxxxx REC# 3 PAGE: 2
SEQUENCE NUMBER: 56000000001(CONT)
0*************************************************************************************************** *********************************
* **** xxxxxxxxxxxxxxxxxxxxxx **** *
**************************************************************************************************** ********************************
1 xxxxxxxxxxxxx xxxxxxxx#1000215 xxxxx REC# 3 PAGE: 2
xxxxxxx xxxxxxA SEQUENCE NUMBER: 56000000002
1 xxxxxxxxxxxxx xxxxxxxx#1000215 xxxxxx REC# 3 PAGE: 2
SEQUENCE NUMBER: 56000000002(CONT)
0*************************************************************************************************** *********************************
* **** xxxxxxxxxxxxxxxxxxxxxx **** *
**************************************************************************************************** ********************************

That is, it should display only two records out of the three.
# 13  
Old 10-18-2012
Try this:

Code:
$ nawk 'BEGIN{a=1}a{print;getline;print;getline;a=0}/SEQUENCE NUMBER: 56000000001/,/SEQUENCE NUMBER: 56000000002/{if($0~/56000000002/){for(i=0;i<=5;i++){print;getline}exit}}1' inp.txt
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
1 xxxxxxxxxxxxx xxxxxxxx#1000215 xxxxx REC# 3 PAGE: 1
xxxxxxx xxxxxxA SEQUENCE NUMBER: 56000000001
1 xxxxxxxxxxxxx xxxxxxxx#1000215 xxxxx REC# 3 PAGE: 2
SEQUENCE NUMBER: 56000000001(CONT)
0*************************************************************************************************** *********************************
* **** xxxxxxxxxxxxxxxxxxxxxx **** *
**************************************************************************************************** ********************************
1 xxxxxxxxxxxxx xxxxxxxx#1000215 xxxxx REC# 3 PAGE: 2
xxxxxxx xxxxxxA SEQUENCE NUMBER: 56000000002
1 xxxxxxxxxxxxx xxxxxxxx#1000215 xxxxxx REC# 3 PAGE: 2
SEQUENCE NUMBER: 56000000002(CONT)
0*************************************************************************************************** *********************************
* **** xxxxxxxxxxxxxxxxxxxxxx **** *
**************************************************************************************************** ********************************

# 14  
Old 10-18-2012
Hi itkamaraj,

Your for loop relies on a fixed line count. If we don't know the line count in advance, what would the command be?

prashant
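
One way to avoid a fixed line count is to hold each line back by one and let the SEQUENCE NUMBER lines switch printing on and off. The following is a sketch under assumptions, not a command posted in the thread: extract_range is a hypothetical name, the first line of the file is taken to be a banner that is always printed (as the commands above also do), and each record is assumed to begin with the page-header line directly before its SEQUENCE NUMBER line, as in the sample input. Use nawk in place of awk where the default awk is the legacy one.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread).
# usage: extract_range LO HI FILE
# Prints the banner plus every record whose sequence number is in [LO, HI],
# however many lines each record has.
extract_range() {
    awk -v lo="$1" -v hi="$2" '
        NR == 1 { print; next }     # assumed: one banner line at the top

        {
            if (match($0, /SEQUENCE NUMBER: [0-9]+/)) {
                # 17 is the length of "SEQUENCE NUMBER: "
                seq  = substr($0, RSTART + 17, RLENGTH - 17) + 0
                keep = (seq >= lo + 0 && seq <= hi + 0)
                # The held line is the page header of this record,
                # so it follows the decision just made.
                held_keep = keep
            }
            if (have && held_keep) print held
            held = $0; held_keep = keep; have = 1
        }

        END { if (have && held_keep) print held }
    ' "$3"
}

# Example call for the sample input:
# extract_range 56000000001 56000000002 extract.txt
```

The (CONT) lines carry the same number as their record, so they need no special handling. If the sequence numbers are known to be in ascending order, exiting once a number exceeds HI would avoid reading the rest of a huge file.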