manipulating data


 
# 1  
Old 03-07-2010

Hi guys

Firstly, I'd like to say hi and how great this forum is. I'm not new to UNIX but am relatively new to scripting. I have a personal project that I'm working on just to try and speed up my learning.

I am working with a text file, well, more of a logfile really. It has several columns of information, each with a few thousand lines of data. Column titles include Time, Date and even a Yes/No column, plus many others which I won't bore you with.

I have managed to assign a line number to the beginning of each line as a sort of record number.

What I am trying to do is grab the first line in the file that has a "Yes" in it. I then want to output it to another file, jump back in the file to capture the previous 50 entries, and then keep going until it reaches the first instance of "No" and capture the next 50 entries.

Just to clarify: the script needs to capture a snapshot range of 101 lines, that is, the 50 entries before the first "Yes" entry, the "Yes" entry itself, and the following 50 entries.

Once this has been done the script should then continue scanning the list until it comes across the next line with "Yes" in it. Then the capture process needs to happen again.

This process needs to continue until the end of the file. All of these snapped lines need to be output to a separate file which can be printed.

I have managed to do some of what I need using Awk but I've now hit a wall and can't figure out where to go next.

I don't want someone to write a script for me, I'd just like to be pointed in the right direction. I'm having so much fun trying to learn shell scripting I'd feel like I'd cheated if somebody gave me a complete solution.

Thank you in advance for any advice you can offer.

Cheers

abcd69
# 2  
Old 03-07-2010
Hi abcd69,

Welcome to the community.

Could you post an example of the data you are working with
and, if possible, the expected output? It'll be easier to visualise
the problem and solution.

A.
# 3  
Old 03-08-2010
Please provide an example
# 4  
Old 03-08-2010
Quote:
Originally Posted by abcd69
What I am trying to do is grab the first line in the file that has a "Yes" in it. I then want to output it to another file, jump back in the file to capture the previous 50 entries, and then keep going until it reaches the first instance of "No" and capture the next 50 entries.
I'm not sure I follow, is the "no" a typo and supposed to be a "yes"?

Quote:
Originally Posted by abcd69
Just to clarify: the script needs to capture a snapshot range of 101 lines, that is, the 50 entries before the first "Yes" entry, the "Yes" entry itself, and the following 50 entries.

Once this has been done the script should then continue scanning the list until it comes across the next line with "Yes" in it. Then the capture process needs to happen again.
You might want to look into the grep -A and -B options. It can easily output this:
Code:
50 lines
Yes
50 lines

Do the 50 lines contain any yesses? It could get messy if they overlap.

You'll learn more by using awk than grep for this though. Grep would be too easy: GNU grep accepts -A and -B together in a single command (or -C when both counts are the same).
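For reference, a minimal sketch of the grep route, assuming GNU grep (the -A, -B and -C context options are extensions, not POSIX). Overlapping context windows are merged rather than printed twice, so a run of consecutive "Yes" lines comes out once. The function name snapshot and the tiny inline sample are made up for illustration; the real run would use 50 and 50.

```shell
#!/bin/sh
# snapshot PATTERN BEFORE AFTER: print each matching line from stdin together
# with BEFORE lines of leading and AFTER lines of trailing context.
# Assumes GNU grep; "snapshot" is a hypothetical helper name.
snapshot() {
    grep -B "$2" -A "$3" -e "$1"
}

# Tiny stand-in for the logfile: 2 lines before, 1 line after each "Yes".
printf '%s\n' '1 No' '2 No' '3 No' '4 Yes' '5 Yes' '6 No' '7 No' '8 No' |
    snapshot Yes 2 1
```

Against the real file this would be something like snapshot Yes 50 50 < logfile > snapshot.txt, where both file names are placeholders. Note that it matches "Yes" anywhere on the line, and that GNU grep prints a "--" line between groups that are not adjacent.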
# 5  
Old 03-08-2010
Hi Guys

Firstly, thank you for your replies.

Secondly

Quote:
Originally Posted by fubaya
I'm not sure I follow, is the "no" a typo and supposed to be a "yes"?
The "No" was not a typo.

Here is a similar example of the file layout (the original has a lot more data columns in it):

LineNo Date Time Result Yes/No Data 1 Data 2 Data 3
1 08/03/2010 12:00:01 No Blah Blah Blah Blah
2 08/03/2010 13:00:01 No Blah Blah Blah Blah
3 08/03/2010 14:00:01 No Blah Blah Blah Blah
4 08/03/2010 15:00:01 No Blah Blah Blah Blah
5 08/03/2010 16:00:01 No Blah Blah Blah Blah
6 08/03/2010 17:00:01 No Blah Blah Blah Blah
7 08/03/2010 18:00:01 No Blah Blah Blah Blah
8 08/03/2010 19:00:01 No Blah Blah Blah Blah
9 08/03/2010 20:00:01 No Blah Blah Blah Blah
10 08/03/2010 21:00:01 No Blah Blah Blah Blah
11 09/03/2010 22:00:01 No Blah Blah Blah Blah
12 09/03/2010 23:00:01 No Blah Blah Blah Blah
13 09/03/2010 00:00:01 Yes Blah Blah Blah Blah
14 09/03/2010 01:00:01 Yes Blah Blah Blah Blah
15 09/03/2010 02:00:01 Yes Blah Blah Blah Blah
16 09/03/2010 03:00:01 Yes Blah Blah Blah Blah
17 09/03/2010 04:00:01 Yes Blah Blah Blah Blah
18 09/03/2010 05:00:01 Yes Blah Blah Blah Blah
19 09/03/2010 06:00:01 Yes Blah Blah Blah Blah
20 09/03/2010 07:00:01 Yes Blah Blah Blah Blah
21 10/03/2010 08:00:01 Yes Blah Blah Blah Blah
22 10/03/2010 09:00:01 Yes Blah Blah Blah Blah
23 10/03/2010 10:00:01 No Blah Blah Blah Blah
24 10/03/2010 11:00:01 No Blah Blah Blah Blah
25 10/03/2010 12:00:01 No Blah Blah Blah Blah
26 10/03/2010 13:00:01 No Blah Blah Blah Blah
27 10/03/2010 14:00:01 No Blah Blah Blah Blah
28 10/03/2010 15:00:01 No Blah Blah Blah Blah
29 10/03/2010 16:00:01 No Blah Blah Blah Blah
30 10/03/2010 17:00:01 No Blah Blah Blah Blah
31 11/03/2010 18:00:01 No Blah Blah Blah Blah
32 11/03/2010 19:00:01 No Blah Blah Blah Blah
33 11/03/2010 20:00:01 No Blah Blah Blah Blah
34 11/03/2010 21:00:01 No Blah Blah Blah Blah
35 11/03/2010 22:00:01 Yes Blah Blah Blah Blah
36 11/03/2010 23:00:01 Yes Blah Blah Blah Blah
37 11/03/2010 00:00:01 Yes Blah Blah Blah Blah
38 11/03/2010 01:00:01 Yes Blah Blah Blah Blah
39 11/03/2010 02:00:01 Yes Blah Blah Blah Blah
40 11/03/2010 03:00:01 Yes Blah Blah Blah Blah
41 12/03/2010 04:00:01 Yes Blah Blah Blah Blah
42 12/03/2010 05:00:01 Yes Blah Blah Blah Blah
43 12/03/2010 06:00:01 Yes Blah Blah Blah Blah
44 12/03/2010 07:00:01 Yes Blah Blah Blah Blah
45 12/03/2010 08:00:01 No Blah Blah Blah Blah
46 12/03/2010 09:00:01 No Blah Blah Blah Blah
47 12/03/2010 10:00:01 No Blah Blah Blah Blah
48 12/03/2010 11:00:01 No Blah Blah Blah Blah
49 12/03/2010 12:00:01 No Blah Blah Blah Blah
50 12/03/2010 13:00:01 No Blah Blah Blah Blah
51 13/03/2010 14:00:01 No Blah Blah Blah Blah
52 13/03/2010 15:00:01 No Blah Blah Blah Blah
53 13/03/2010 16:00:01 No Blah Blah Blah Blah
54 13/03/2010 17:00:01 Yes Blah Blah Blah Blah
55 13/03/2010 18:00:01 Yes Blah Blah Blah Blah
56 13/03/2010 19:00:01 Yes Blah Blah Blah Blah
57 13/03/2010 20:00:01 Yes Blah Blah Blah Blah
58 13/03/2010 21:00:01 No Blah Blah Blah Blah
59 13/03/2010 22:00:01 No Blah Blah Blah Blah
60 14/03/2010 23:00:01 No Blah Blah Blah Blah

Sorry, I know that the headings don't line up. They will when I complete the script and apply column formatting.


As I said the file I am working with is a great deal larger than this example.

I am hoping that the script will be able to search for the first instance of Yes, then I need to be able to output that line and then 50 lines before and after this line.

Then the script should continue and find the next instance of "Yes" and repeat the procedure.

I hope this clarifies things a little.

Cheers

abcd69
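
One possible first step, sketched in awk under the assumption that the Yes/No flag is the fourth whitespace-separated field, as in the sample above: list where each run of "Yes" lines starts and ends, using the record number in the first column. The name yes_runs and the five-record sample are made up for illustration.

```shell
#!/bin/sh
# yes_runs [COL]: read the log on stdin and print "first-last" record numbers
# for every unbroken run of lines whose COL-th field is exactly "Yes".
# Assumption: COL defaults to 4, matching the sample layout in this thread.
yes_runs() {
    awk -v col="${1:-4}" '
        $col == "Yes" {
            if (!inrun) { first = $1; inrun = 1 }   # a new run starts here
            last = $1                               # extend the current run
            next
        }
        inrun { print first "-" last; inrun = 0 }   # first non-Yes line ends the run
        END   { if (inrun) print first "-" last }   # run reaching end of file
    '
}

# Small made-up sample in the same layout (header plus five records).
printf '%s\n' 'LineNo Date Time Result' \
    '1 08/03/2010 12:00:01 No' '2 08/03/2010 13:00:01 Yes' \
    '3 08/03/2010 14:00:01 Yes' '4 08/03/2010 15:00:01 No' \
    '5 08/03/2010 16:00:01 Yes' | yes_runs
```

The 50-line windows on either side then follow from those bounds (first minus 50 to last plus 50), and windows that overlap would need merging.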
# 6  
Old 03-08-2010
Try...

Code:
nawk 'c-->0;$0~s{if(b)for(c=b+1;c>1;c--)print r[(NR-c+1)%b];print;c=a}b{r[NR%b]=$0}' b=2 a=4 s="string" file1

...where "b" and "a" are the number of lines to print before and after string "s".
# 7  
Old 03-09-2010
Hi Ygor

Thank you for your reply.


Quote:
Originally Posted by Ygor
Try...

Code:
nawk 'c-->0;$0~s{if(b)for(c=b+1;c>1;c--)print r[(NR-c+1)%b];print;c=a}b{r[NR%b]=$0}' b=2 a=4 s="string" file1


...where "b" and "a" are the number of lines to print before and after string "s".
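
Read as three rules, the one-liner appears to work like this: c-->0 prints the current line while the after-counter c is still positive, then decrements it; the $0~s rule replays the b lines held in the array r, prints the matching line and resets c to a; the final rule stores every line in r, which is used as a ring buffer of size b. Below is a longer sketch of the same idea. It is not Ygor's code: it adds a made-up emit helper so that a line is printed only once when matches sit close together, which the one-liner does not try to do. Plain awk is assumed to be a POSIX awk (nawk, gawk or mawk), and the name context is hypothetical.

```shell
#!/bin/sh
# context PATTERN BEFORE AFTER: awk-only take on "lines around a match".
# Hypothetical helper; variable names s, b, a follow the one-liner above.
context() {
    awk -v s="$1" -v b="$2" -v a="$3" '
        # Print record n only if nothing at or beyond n was printed already.
        function emit(n, text) { if (n > last) { print text; last = n } }
        {
            if ($0 ~ s) {
                # Replay up to b earlier lines from the ring buffer.
                for (i = NR - b; i < NR; i++)
                    if (i > 0) emit(i, buf[i % b])
                emit(NR, $0)
                after = a            # (re)start the trailing window
            } else if (after > 0) {
                emit(NR, $0)
                after--
            }
            if (b > 0) buf[NR % b] = $0   # remember this line for later matches
        }
    '
}

printf '%s\n' '1 No' '2 No' '3 No' '4 Yes' '5 Yes' '6 No' '7 No' '8 No' |
    context Yes 2 1
```

With the numbers from this thread the call would be along the lines of context Yes 50 50 < logfile, where logfile is a placeholder name.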

I'd be really grateful if you could break this down and explain what is going on in this command as I'm quite new to this.


Kind regards

abcd69