

How to cut some data from big file


 
# 1  
Old 08-14-2009
How can I cut a range of lines out of a big file?

My file is around 30 GB.

I tried head -50022172 filename > newfile.txt and then tail -5454283 newfile.txt. It is slow.

After that I tried sed -n '46467831,50022172p' filename > newfile.txt, which is also slow.

Please recommend a faster command to cut some data from a big file.


Thanks.
# 2  
Old 08-15-2009
Well, a 30 GB file is a *HUGE* file and any shell command you run on it will take time for processing.

Maybe you want to split up the file and then work on the smaller components? Run the command:

Code:
man split

to see what your options are.

tyler_durden
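
As a rough illustration of the split suggestion above, here is a minimal sketch on a tiny stand-in file. The file name big_file, the chunk_ prefix and the 4-line piece size are placeholders chosen for the demo, not values from the thread. Note that split still has to read the whole file once, so this probably pays off only if several ranges will be pulled out later.

```shell
# Build a 10-line stand-in for the big file (hypothetical name and size).
workdir=$(mktemp -d)
awk 'BEGIN { for (i = 1; i <= 10; i++) print i }' > "$workdir/big_file"

# Split into pieces of at most 4 lines each.
# With the prefix chunk_, split names the pieces chunk_aa, chunk_ab, chunk_ac.
split -l 4 "$workdir/big_file" "$workdir/chunk_"

# Show the second piece, which holds lines 5 to 8 of the original.
cat "$workdir/chunk_ab"
```

On a real file the piece size would be something like several million lines, picked so that the range of interest falls inside one or two pieces.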
# 3  
Old 08-15-2009
You can try awk:
Code:
# time awk 'NR >= 46467831 && NR <= 50022172' big_file > new_big_file

real    0m46.536s
user    0m43.761s
sys     0m1.487s

# wc -l < new_big_file
 3554342
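
One possible refinement of the awk one-liner above: as written it keeps reading to the end of the file even after the last wanted line. The sketch below stops as soon as the range is finished. The function names extract_lines and extract_lines_sed and the small sample file are made up for the demo; how much time this saves depends on how much of the file lies after the range.

```shell
# extract_lines START END FILE: print lines START..END, then stop reading.
extract_lines() {
    awk -v start="$1" -v end="$2" '
        NR > end    { exit }   # past the range: quit instead of scanning to EOF
        NR >= start            # inside the range: default action prints the line
    ' "$3"
}

# Same idea with sed: print the range, then quit at the last wanted line.
extract_lines_sed() {
    sed -n -e "$1,$2p" -e "$2q" "$3"
}

# Demo on a 10-line stand-in file.
sample=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 10; i++) print "line " i }' > "$sample"
extract_lines 3 5 "$sample"
```

For the range in this thread the call would look like extract_lines 46467831 50022172 big_file > new_big_file.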

# 4  
Old 08-15-2009
Quote:
Originally Posted by danmero
You can try awk:
Code:
# time awk 'NR >= 46467831 && NR <= 50022172' big_file > new_big_file
 
real    0m46.536s
user    0m43.761s
sys     0m1.487s
 
# wc -l < new_big_file
 3554342

Thank you, danmero.

That is much faster. My file is 18 GB:
Code:
time nawk 'NR >= 77930597 && NR <= 86671221' bigfile > newfile
real    2m41.942s
user    2m10.469s
sys     0m18.560s

However, this command uses about 25% of the CPU.
# 5  
Old 08-15-2009
Any method that works on variable-length records has to scan the data to find each record boundary, and that always incurs a CPU penalty. But if your records are of a fixed size, we can calculate the byte offsets of the beginning and end of the section of interest and copy the data with more efficient tools that avoid the CPU-intensive scanning.
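
A minimal sketch of the fixed-size case described above, assuming every record is exactly RECLEN bytes including its newline. That assumption is hypothetical here, since the thread never says whether the poster's file looks like this. The function name extract_fixed and the sample data are invented for the demo. With bs set to the record length, dd can skip straight to record FIRST without examining the bytes before it.

```shell
# extract_fixed FIRST LAST RECLEN FILE
# Copy records FIRST..LAST, assuming every record is exactly RECLEN bytes
# (newline included). Verify that assumption before trusting the output.
extract_fixed() {
    first=$1 last=$2 reclen=$3 file=$4
    # skip and count are measured in blocks of bs bytes, i.e. in records.
    dd if="$file" bs="$reclen" skip=$((first - 1)) count=$((last - first + 1)) 2>/dev/null
}

# Demo: ten records of 8 bytes each (7 characters plus a newline).
sample=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 10; i++) printf "rec%04d\n", i }' > "$sample"
extract_fixed 3 5 8 "$sample"
```

With short records, a block size of one record means many small reads, so on a real file it may be worth working out the byte offsets and copying with a larger block size instead.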
