Full Discussion: Pattern Matchin Huge File
Post 302495374 by senthil.ak on Thursday 10th of February 2011 04:29:57 AM

Hi Experts,
I have an issue with a huge file.
I need to check the two characters at positions 155-156 of each line and, if they match 31 or 39, route that line to a separate new file for each value.
The main file has around 1,459,328 lines and is 2 GB in size. I tried the code below, which takes around 2 hours to execute.
Code:
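while IFS= read -r line    # IFS= and -r keep leading blanks and backslashes intact
do
    # echo and cut are started once per line, which is costly on a file this size
    record_type=`echo "$line" | cut -c 155-156`
    # compare as strings; -eq raises an error if the value is not numeric
    if [ "$record_type" = 31 ] ; then
        print -r -- "$line" >> ./31.txt
    elif [ "$record_type" = 39 ] ; then
        print -r -- "$line" >> ./39.txt
    fi
done < LOAD.txt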

I then modified this to use awk, which still takes more than 30 minutes, and the counts do not agree with a check on positions 155-156.
Code:
 
awk '/839I/ {print $0}' LOAD.txt > record_39.txt &
awk '/831I/ {print $0}' LOAD.txt > record_31.txt &
cat LOAD.txt | cut -c 155-156 > smp.log
grep -c '31' smp.log
 1182483
wc -l record_31.txt
 1182495 record_31.txt
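
One possible reason for the mismatch above (an assumption, the post does not confirm it): `/831I/` is an unanchored regular expression, so it matches a record that contains `831I` anywhere, not only at the record-type position. The sketch below reproduces the effect on three synthetic records; the file contents and the variable names `sample`, `loose` and `strict` are made up for illustration.

```shell
#!/bin/sh
# Synthetic stand-in for LOAD.txt: 153 filler columns, then the trailing field.
sample=$(mktemp)
{
    printf '%-153s%s\n' 'a'      '831I201'   # type 31 at columns 155-156
    printf '%-153s%s\n' 'b 831I' '839I201'   # type 39, but "831I" occurs earlier
    printf '%-153s%s\n' 'c'      '805I201'   # neither type
} > "$sample"

# Unanchored regex: counts every record containing "831I" anywhere.
loose=$(awk '/831I/ { n++ } END { print n + 0 }' "$sample")

# Column check: counts only records whose columns 155-156 are "31".
strict=$(awk 'substr($0, 155, 2) == "31" { n++ } END { print n + 0 }' "$sample")

echo "unanchored regex: $loose, column check: $strict"
rm -f "$sample"
```

If the two counts differ on the real file in the same way, the extra records are the ones where the pattern appears outside columns 155-156.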

I even tried this too:
Code:
 
awk '$5 ~ 39{print $0;}' LOAD.txt

but $5 does not always fall at positions 155-156.
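
A small sketch of why `$5` is unreliable here: awk splits fields on runs of whitespace by default, so a record with `000 I` has one more field than a record with `000I`, and the field number of the record type shifts. `substr($0, 155, 2)` reads fixed columns and does not depend on field splitting. The two records below are synthetic and only mimic the spacing difference seen in the sample data.

```shell
#!/bin/sh
# Two synthetic 160-column records that differ only in spacing before the type.
out=$(
    {
        printf '%-153s%s\n' 'id data 20090218 000 I' '839I201'
        printf '%-153s%s\n' 'id data 20101203 000I'  '839I201'
    } |
    # Print the field count, field 5, and the two characters at columns 155-156.
    awk '{ print NF, $5, substr($0, 155, 2) }'
)
printf '%s\n' "$out"
```

The first record reports 6 fields and the second 5, so `$5` holds different things in each, while the column-based value is the same for both.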
Sample records.
Code:
14115726     0000000000         00000000000000000000000000000000000000000000000000000000                                                      000         00I201
06485726     0000000000         00000000000000000000000000000000000000000000000000000000                                                      000        805I201
18005726ABCUS0000005726         01002080000000000000000000000000000000000000000000000000370291010381009    20090218                           000 I      839I201
18005726ABCUS0000005726         08009100000000000000000000000000000000000000000000000000370290173421008    20101203                           000I       839I201
18005726ABCUS0000005726         00000020000000000000000000000000000000000000000000000000370282295281006    20060706                           000C       831I201
18005726ABCUS0000005726         01002080000000000000000000000000000000000000000000000000370282010171003    20090216                           000 I      831I201

Is there any other way to get the correct results?

Thanks
Senthil.
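
A minimal sketch of one way to approach the requirement above, not a confirmed solution from the thread: read the file once with awk, take the two characters at columns 155-156 with `substr`, and let awk write each matching record to its own file. The output names `31.txt` and `39.txt` follow the first script in the post; `workdir`, the awk variable `rt` and the four test records are assumptions for the demo, and `LOAD.txt` here is a tiny synthetic stand-in for the real file.

```shell
#!/bin/sh
# Work in a scratch directory so the demo does not touch real data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Synthetic LOAD.txt: 154 filler columns, then the record type at 155-156.
{
    printf '%-154s%s\n' 'rec1 000 I' '39I201'
    printf '%-154s%s\n' 'rec2 000C'  '31I201'
    printf '%-154s%s\n' 'rec3 831I'  '05I201'   # "831I" elsewhere must not match
    printf '%-154s%s\n' 'rec4'       '31I201'
} > LOAD.txt

# Single pass: no per-line subprocesses, and awk keeps both outputs open.
awk '
{ rt = substr($0, 155, 2) }        # columns 155-156, independent of field splitting
rt == "31" { print > "31.txt" }
rt == "39" { print > "39.txt" }
' LOAD.txt

wc -l 31.txt 39.txt
```

Because the file is read once and no external command is started per record, this should run far faster than the `while read` loop, though the actual time on a 2 GB file depends on the system.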
 
