AWK/SED line based search


 
# 1  
Old 01-02-2012

Hi,

I have a file with values like this:
1 11
2 11
3 44
4 55
5 66
(a representative sample of what I have).
I want to split this file into smaller files based on column 1 values (values within a range).

The problem is that the file is really big, and the search takes too long to run.

Is there any way that I can do this:

1. Given an initial range, export the values and save the last line number.
2. Start the next search from where the previous search finished.

I tried playing with NR values, but that turned out to be useless.

Or maybe sed is an alternative?

HELP!!

Thanks a lot in advance.
# 2  
Old 01-02-2012
Hi,
Please elaborate on the query. Also provide more sample input and output showing exactly what you want.
# 3  
Old 01-02-2012
Hi,

Thanks for responding. To expand on the query: say I have a file with 2 columns. The second column holds numbers that are irrelevant to this search and export. The first column has monotonically increasing values, say from 1.000 to 1000000.000. I want to split this big file into smaller files, each covering one range of a thousand in column 1 (all values between 1 and 1000, then between 1000 and 2000, and so on). The values have decimals, so a file will not hold exactly 1000 lines; between 1 and 1000 there may be more than a thousand values. I used
Code:
awk -v myvarA="$a" -v myvarB="$b" '($1 >= myvarA && $1 <= myvarB) { print $0 }'

to split into smaller files.

As you can see, this is very inefficient if I loop it over all the ranges I need, because every iteration searches from the start of the file. What I want is to do the first search, save the last matching line number, and continue the next search from that line.

I hope it is clear now.
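
Because column 1 only increases, a single range query does not need to read to the end of the file: awk can stop at the first value above the upper bound. The following is a minimal sketch of that idea, not a tested solution for the real data. The file names sample.txt and range_1000_2000.txt, the sample values, and the variable names lo and hi are all made up for illustration.

```shell
# Hypothetical sample data; column 1 is sorted ascending, as in the post.
printf '%s\n' '1.5 11' '999.2 11' '1000.7 44' '1500.0 55' '2500.3 66' > sample.txt

a=1000
b=2000

# Print lines with a <= $1 <= b and stop reading once $1 exceeds b.
# The early exit is only valid because column 1 never decreases.
awk -v lo="$a" -v hi="$b" '
    $1 > hi  { exit }     # past the range: nothing further can match
    $1 >= lo { print }    # inside the range
' sample.txt > range_1000_2000.txt

cat range_1000_2000.txt
```

This only shortens the tail end of each scan. Every iteration still reads the lines before the range from the top of the file.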
# 4  
Old 01-02-2012
If you want to split the file, you can use the split command.
I still don't get what you really want! What do you mean by export?

--ahamed
# 5  
Old 01-02-2012
I don't think split will work: I am splitting one big file into smaller files, but the number of lines in each smaller file is not the same every time. By export I meant writing the output to a new text file.

I am splitting one big file into smaller ones.
Each small file has values between 'a' and 'b'.
Though the difference between a and b is fixed every time, the number of entries within this range is not.
So, when I run between 'a' and 'b' for the second iteration, awk will begin to look from the first line again. I want to stop this from happening, as I have to iterate a large number of times.
So I thought it would be more efficient to save the last matching NR and resume the search from where I left off (since column 1 increases monotonically).
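
The "save the last NR and resume" idea can be sketched with tail -n +N feeding awk, where awk reports how many lines it consumed so the shell can advance the start line. This is an illustration under assumptions: big.txt, the part_*.txt names, the width of 1000, and the upper limit of 3000 are hypothetical, and the ranges are treated as half-open (lo <= value < hi) so that a boundary value lands in only one file, unlike the >= and <= test in the original command.

```shell
# Hypothetical sample data; column 1 is sorted ascending.
printf '%s\n' '1.5 11' '999.2 11' '1000.7 44' '1500.0 55' '2500.3 66' '2999.9 77' > big.txt

width=1000   # assumed range width
start=1      # line number at which the next scan begins
lo=0

while [ "$lo" -lt 3000 ]; do
    hi=$((lo + width))
    out="part_${lo}_${hi}.txt"
    # tail -n +N starts output at line N. awk writes the range [lo, hi)
    # to $out, exits at the first value >= hi, and prints the number of
    # lines it consumed so that $start can be advanced.
    used=$(tail -n +"$start" big.txt | awk -v lo="$lo" -v hi="$hi" -v out="$out" '
        $1 >= hi { n = NR - 1; stopped = 1; exit }
        $1 >= lo { print > out }
        END      { print (stopped ? n : NR) }
    ')
    start=$((start + used))
    lo=$hi
done
```

Note that tail still has to count newlines to find line N, so the skipped part of the file is read each time, only without awk's per-line field splitting. A range with no matching lines produces no output file.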
# 6  
Old 01-02-2012
Hi,
First try
Code:
 cat -n filename

Then try a for loop in awk. If you can give me the output of cat -n, I will try, but it won't be a generic solution, only a problem-specific one.
# 7  
Old 01-02-2012
Hi,

The cat -n output is too large to post here; the file has 42273547 lines.
A small snippet of what I have is:
Code:
40556271        1319211119.897235 0.0047939
40556272        1319211119.897236 0.0000012
40557277        1319211121.492435 0.0000250
40557278        1319211121.492436 0.0000010

What I want is to create smaller files, each containing the values between two column 1 values.

As you can see, if I split into smaller files for every 1000 column 1 values, doing an exhaustive search 1000 times would be wasteful. Once I reach the last 'b' value, I want to store its position and resume the search from where I left off (which would also be in line with my search).
Thanks!
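
An alternative that avoids repeated searches altogether is a single awk pass that computes which range each line belongs to and writes the line to that range's file. The sketch below assumes the real file has the two-column layout of the snippet with the leading cat -n line number dropped. The names data.txt and chunk_*.txt and the variable width are hypothetical.

```shell
# Two-column form of the snippet above (leading cat -n line number dropped);
# this layout is an assumption about the real file.
cat > data.txt <<'EOF'
1319211119.897235 0.0047939
1319211119.897236 0.0000012
1319211121.492435 0.0000250
1319211121.492436 0.0000010
EOF

# width is the size of each column 1 range. It is 1 here only so that the
# four sample lines land in two files; the thread talks about 1000.
awk -v width=1 '
{
    bucket = int($1 / width)            # index of the range holding $1
    if (NR == 1 || bucket != prev) {
        if (out != "") close(out)       # keep only one output file open
        out = sprintf("chunk_%d.txt", bucket * width)   # hypothetical naming
        prev = bucket
    }
    print > out
}' data.txt

wc -l chunk_*.txt
```

Each file holds the half-open range from bucket * width up to, but not including, the next multiple of width. Closing the previous file relies on column 1 being sorted: if a range could reappear later, reopening with > would truncate the earlier output, and >> would be needed instead.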