How can I get certain records from this file format?


 
# 1  
Old 06-03-2013

I have a file named learn.999 in the following format, where records are separated by ^M:


6232|12341|8144836458|20110719132528|0^M4124|5963|3022610000|20011023205510|0^M7606|10858|3022560001|20071012222359|0^M7606|10257|3022560002|20061205202926|0^M6664|7208|7247129469|20020927162412|0^M4151|10918|4237949999|20071128163443|0^M9211|1487|7574682269|19980526145925|0^M0661|5553|4346450000|20010521202205|0^M6232|12345|8144836476|20110719132528|0^M418C|10800|8149080000|20071001161928|0^M6232|8833|4438648920|20040205225704|0^M6940|9939|2762180272|20051031165254|0^M7125|11379|7179166000|20090622152947|0^M6664|11714|8144189179|20100324182655|0^M9211|5562|8042149046|20010531173604|0^M6214|7010|3022499476|20020919201646|0^M9211|3935|8564689999|19990610130025|0^M9211|4845|8144792051


I need to write the first 1,000,000 records of that file to another file named learn.top1m, keeping the same format but without a ^M at the end of the last record.

Thanks

Last edited by Scott; 06-03-2013 at 02:38 PM.. Reason: Moved thread - has nothing to do with Redhat
# 2  
Old 06-03-2013
If I understood your question correctly, you want only the first 1,000,000 records in a new file, with each record separated by a newline (line feed) instead of "^M"? If so, you can try:

Records are separated by ^M and fields are separated by "|".

awk -F"|" 'BEGIN{RS="^M"}NR<=1000000{print}' learn.999 | sed 's/^M//g' > learn.top1m

Hope this helps....
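
One thing to watch for in the command above: if "^M" is typed as a caret followed by the letter M, awk sees two ordinary characters, not a carriage return (a real one is entered with Ctrl-V Ctrl-M, or written as the escape "\r"). A minimal sketch of a variant, assuming the records really are separated by single carriage-return bytes; the four-record sample and n=2 are only for illustration:

```shell
# Build a small sample in the same layout: records separated by a carriage return (\r).
printf '6232|12341|8144836458|20110719132528|0\r4124|5963|3022610000|20011023205510|0\r7606|10858|3022560001|20071012222359|0\r7606|10257|3022560002|20061205202926|0' > learn.999

# Keep the first n records. A \r is written before every record except the
# first, so the output keeps the input format but has no \r after the last record.
awk -v n=2 'BEGIN { RS = "\r" }
NR > n { exit }
{ printf "%s%s", (NR > 1 ? "\r" : ""), $0 }' learn.999 > learn.top1m
```

For the real file, n would be 1000000. Because awk exits as soon as it passes record n, it should not need to read the rest of the input.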

Last edited by juzz4fun; 06-03-2013 at 02:46 PM..
# 3  
Old 06-03-2013
Quote:
Originally Posted by juzz4fun
If I understood your question correctly, you want only the first 1,000,000 records in a new file, with each record separated by a newline (line feed) instead of "^M"? If so, you can try:

Records are separated by ^M and fields are separated by "|".

awk -F"|" 'BEGIN{RS="^M"}NR<=1000000{print}' learn.999 > learn.top1m

Hope this helps....

Can I run your command on a zipped file, or do I have to unzip it first? The source file is huge and I do not have enough space to unzip it. Thanks
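
On the compressed-file question, one common approach is to decompress to standard output and pipe that into awk, so the uncompressed data never lands on disk. A sketch, assuming the file is gzip-compressed and that the separators are real carriage returns; the name learn.999.gz and the tiny sample are hypothetical:

```shell
# Hypothetical sample: a gzip-compressed file with \r-separated records.
printf 'a|1\rb|2\rc|3\rd|4' | gzip -c > learn.999.gz

# gzip -dc writes the decompressed bytes to stdout; awk keeps the first n
# records and exits, so only the small result file is written to disk.
gzip -dc learn.999.gz |
awk -v n=3 'BEGIN { RS = "\r" }
NR > n { exit }
{ printf "%s%s", (NR > 1 ? "\r" : ""), $0 }' > learn.top1m
```

If the archive is a .zip rather than a .gz, "unzip -p" plays the same role as "gzip -dc" here.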

---------- Post updated at 12:53 PM ---------- Previous update was at 12:41 PM ----------

Quote:
Originally Posted by juzz4fun
If I understood your question correctly, you want only the first 1,000,000 records in a new file, with each record separated by a newline (line feed) instead of "^M"? If so, you can try:

Records are separated by ^M and fields are separated by "|".

awk -F"|" 'BEGIN{RS="^M"}NR<=1000000{print}' learn.999 | sed 's/^M//g' > learn.top1m

Hope this helps....


It did not work. The output was the same as the input!
# 4  
Old 06-03-2013
I wonder if my 1st solution worked.

awk -F"|" 'BEGIN{RS="^M"}NR<=1000000{print}' learn.999 > learn.top1m

If it does, I do not know if it works on zipped files...
# 5  
Old 06-03-2013
Quote:
Originally Posted by juzz4fun
I wonder if my 1st solution worked.

awk -F"|" 'BEGIN{RS="^M"}NR<=1000000{print}' learn.999 > learn.top1m

If it does, I do not know if it works on zipped files...


I unzipped the file and ran it. It produced output identical to the input, so it looks like it did not work. When I open the file in vi, the fields are separated by | and each record ends with 0^M, after which a new record starts. Not sure why it is not working when I run your command?
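
Since vi only shows how it renders the bytes, it can help to look at the raw bytes before choosing a record separator. A small sketch using od and tr; the two-record sample is made up for illustration and assumes real carriage returns:

```shell
# Illustrative sample with real carriage returns between records.
printf '6232|12341|8144836458|20110719132528|0\r4124|5963|3022610000|20011023205510|0\r' > learn.999

# Dump the first rows of bytes. A real carriage return prints as \r, while a
# literal caret followed by M would print as the two characters ^ and M.
od -c learn.999 | head -n 5

# Count carriage returns and newlines in the whole file.
cr_count=$(tr -cd '\r' < learn.999 | wc -c)
lf_count=$(tr -cd '\n' < learn.999 | wc -c)
echo "CR: $cr_count  LF: $lf_count"
```

Many carriage returns with no (or almost no) newlines would suggest that the whole file is one long line as far as line-oriented tools are concerned.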
# 6  
Old 06-03-2013
^M usually appears when we transfer files using FTP... I usually select ASCII mode to avoid it. Or remove all ^M characters manually (from within vi) using

:%s/^M//g
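
In that vi command the ^M has to be entered as a real control character (Ctrl-V, then Ctrl-M), not as a caret and an M. A rough non-interactive equivalent is tr -d, sketched below on a made-up two-record sample. Note that when the carriage returns are the only record separators, as described in this thread, deleting them fuses the records together:

```shell
# Hypothetical two-record sample separated by a single carriage return.
printf 'a|1|0\rb|2|0' > sample.cr

# Delete every carriage return in the file.
tr -d '\r' < sample.cr > sample.stripped

# Show the result byte by byte: the record boundary is gone.
od -c sample.stripped
```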

---------- Post updated at 02:25 PM ---------- Previous update was at 02:17 PM ----------

Is it only "^M" that you want to take care of or "0^M"?

I guess DOS to Unix conversion didn't happen properly.
Can you try with dos2unix tool?

dos2unix <input>
# 7  
Old 06-03-2013
Quote:
Originally Posted by juzz4fun
^M usually appears when we transfer files using FTP... I usually select ASCII mode to avoid it. Or remove all ^M characters manually (from within vi) using

:%s/^M//g

---------- Post updated at 02:25 PM ---------- Previous update was at 02:17 PM ----------

Is it only "^M" that you want to take care of or "0^M"?

I guess DOS to Unix conversion didn't happen properly.
Can you try with dos2unix tool?

dos2unix <input>

I ran dos2unix and the file is still the same. I want to get the first 1 million records, each of which ends with ^M. The file was sftp'd from the source, a Unix box, to my Unix box. It never landed on DOS or anywhere else. Thanks
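
One possible explanation for dos2unix leaving the file unchanged, assuming the records are separated by lone carriage returns with no line feed after them: by default it targets DOS line endings (carriage return plus line feed). A sketch of a line-oriented alternative that turns each carriage return into a newline and then takes the first n lines; the sample data, n=3 and the output name learn.top1m.lf are only for illustration:

```shell
# Illustrative sample: four records, each terminated by a carriage return.
printf 'a|1|0\rb|2|0\rc|3|0\rd|4|0\r' > learn.999

# Turn each carriage return into a newline, then keep the first n lines.
n=3
tr '\r' '\n' < learn.999 | head -n "$n" > learn.top1m.lf

# Report how many records were kept.
wc -l < learn.top1m.lf
```

The result is newline-separated, which is not the original format. Piping it through tr '\n' '\r' would restore the ^M separators, but it would also leave one after the last record.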
 