Printing string from last field of the nth line of file to start (or end) of each line (awk I think)


 
# 1  
Old 02-06-2018

My file (the output of an experiment) starts off looking like this,
Code:
_____________________________________________________________
Subjects incorporated to date: 001
Data file started on machine PKSHS260-05CP

**********************************************************************
Subject 1, 11/30/2017 16:07:17 on PKSHS260-05CP, DMDX 5.1.5.3, Windows 6.1.7601, refresh 16.67ms, ID ik60607
!  DMDX is running in auto mode (automatically determined raster sync)
!  Video Mode 1280,1024,32,60
!  Item File <C:\Users\XXXXXXX\Desktop\Experiment\Version2.rtf>
Item 11213, 1372.44
 1372.44,+Right Ctrl
Item 11213, 1052.90
 1052.90,+CTRL
Item 114109, -1102.03
 1102.03,+Right Ctrl
Item 11131, 721.06
 721.06,+Right Ctrl
Item 111325, 1075.30
 1075.30,+Right Ctrl


I used the following:

Code:
egrep '^(Item|Subject|!)' filename

to get it like this
Subjects incorporated to date: 001
Subject 1, 11/30/2017 16:07:17 on PKSHS260-05CP, DMDX 5.1.5.3, Windows 6.1.7601, refresh 16.67ms, ID ik60607
! DMDX is running in auto mode (automatically determined raster sync)
! Video Mode 1280,1024,32,60
! Item File <C:\Users\XXXXX\Desktop\Experiment\Version2.rtf>
Item 11213, 1372.44
Item 11213, 1052.90
Item 114109, -1102.03
Item 11131, 721.06


Now I want to use something like

Code:
awk 'NF > 5' | awk '{print $NF}'

to extract the ik60607 (which is an identifier for that participant in the experiment)

and produce something like this


ik60607,Item 11213, 1372.44
ik60607,Item 11213, 1052.90
ik60607,Item 114109, -1102.03
ik60607,Item 11131, 721.06

etc.
or ideally do something like this operation twice (using the rtf filename) to produce

Version2.rtf,ik60607,Item 11213, 1372.44
Version2.rtf,ik60607,Item 11213, 1052.90
Version2.rtf,ik60607,Item 114109, -1102.03
Version2.rtf,ik60607,Item 11131, 721.06


I think I might need to use 'paste' after making a file with the ID printed as many times as the original file has lines, and then delete the unwanted bits. I have tried using awk but got confused by record and field separators, and I do not understand what I have read in the manual pages. I am not in any way skilled with this; I was forced to do some of it twice in my life, for two-month periods 10 and 25 years ago, working with large dictionaries, which is why I even tried. Please help a very naive non-programmer (it is pre-processing data for even more hapless students).

I have to do this on 108 (currently separate) files, all with different (irregular) names, and I am not sure whether using cat first to join them up would make things even worse (there would be record and field separators that way, but it seems even more complex).

Any advice gratefully received.


___________________________________________________________

Last edited by RudiC; 02-06-2018 at 07:55 AM..
# 2  
Old 02-06-2018
Welcome to the forum!

Please make sure to enclose ALL code and data in code tags as required by the forum rules!

Your spec is not quite clear and consistent (why is Item 111325, 1075.30 not found in your desired output?), but would
Code:
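awk '
/^Item/         {print FN "," ID "," $0         # prefix each Item line with file name and ID
                }

/^Subject /     {ID = $NF                       # the participant ID is the last field
                }

/! Item/        {gsub (/^.*\\|>$/, "")          # strip the path up to the last backslash and the trailing ">"
                 FN = $0
                }

' file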
Version2.rtf,ik60607,Item 11213, 1372.44
Version2.rtf,ik60607,Item 11213, 1052.90
Version2.rtf,ik60607,Item 114109, -1102.03
Version2.rtf,ik60607,Item 11131, 721.06
Version2.rtf,ik60607,Item 111325, 1075.30

come close to what you need?
# 3  
Old 02-06-2018
It's not in my desired output because I copied it by hand. I couldn't copy and paste because the Windows-generated end-of-line characters were messy everywhere (in the Mac text editor, in the browser, on the Windows partition); sometimes the result is one continuous line, sometimes not.

Your suggestion produced this:

Code:
,Item 11213, 1372.44
,Item 11213, 1052.90
,Item 114109, -1102.03
,Item 11131, 721.06

which is close (it added a comma rather than the end of the relevant field). Thank you, though.

Moderator's Comments:
Mod Comment Seriously: Please use CODE tags as required by forum rules!

Last edited by samonl; 02-06-2018 at 07:02 AM.. Reason: That's not code . It is the text output of my file.
# 4  
Old 02-06-2018
I'm pretty sure that result is due to the "windows generated output end of line characters", making the "Item" line overwrite the file name and ID.
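
A quick way to check for this, as a minimal sketch: the temp file and its two lines are a made-up sample, written with Windows (CRLF) endings on purpose. Each line that still ends in a carriage return is counted; a terminal treats that character as "go back to column 0", which is how the later part of an output line can paint over the earlier part.

```shell
# Hypothetical sample: two lines with Windows (CRLF) line endings.
crlf=$(mktemp)
printf '%s\r\n' 'Subject 1, ID ik60607' 'Item 11213, 1372.44' > "$crlf"

# Count the lines that end in a carriage return (0 would mean Unix endings).
cr_lines=$(awk '/\r$/ { n++ } END { print n + 0 }' "$crlf")
echo "lines ending in CR: $cr_lines"

rm -f "$crlf"
```

A non-zero count on a real file would suggest that the carriage returns need to be stripped before the fields are used.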

Try adding
Code:
                {sub (/\r$/, "")
                }

in front of the /^Item/ regex line.
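
Put together, the program with that extra rule might look like the sketch below. The sample file is a shortened, made-up stand-in for the real log (CRLF endings, a single space after the "!"), not the poster's data.

```shell
# Hypothetical CRLF sample modelled on the log in post #1.
sample=$(mktemp)
printf '%s\r\n' \
    'Subject 1, 11/30/2017 16:07:17 on PKSHS260-05CP, ID ik60607' \
    '! Item File <C:\Users\XXXXXXX\Desktop\Experiment\Version2.rtf>' \
    'Item 11213, 1372.44' \
    ' 1372.44,+Right Ctrl' > "$sample"

labelled=$(awk '
                {sub (/\r$/, "")}                 # runs on every line: drop the trailing CR first
/^Item/         {print FN "," ID "," $0}
/^Subject /     {ID = $NF}
/! Item/        {gsub (/^.*\\|>$/, ""); FN = $0}
' "$sample")

echo "$labelled"
rm -f "$sample"
```

The rule without a pattern has to come first, so that every later rule sees the line with the carriage return already removed.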

Last edited by RudiC; 02-06-2018 at 06:39 AM..
# 5  
Old 02-06-2018
Thank you. I think you are right. I am using 'cat' to join the whole lot together and will try to solve the labelling afterwards, once all the rtf rubbish has been removed. I am going to say this is solved because a) your code shows me the correct gsub, and b) I have forced myself to read the array and loop pages of the awk manual to the point where I'll have fun trying until I give up and do the whole thing manually or with the VBA bit of Excel. All for some ungrateful third-year students. Thanks very much.
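
On joining the files with cat: awk accepts several file names in one call, so the concatenation step may not be needed. The sketch below assumes each log holds one subject and one item-file line; the directory, the two `.log` files and the ID `zz00001` are invented for illustration. The `FNR == 1` rule clears the labels at the start of each file so that nothing leaks over from the previous one.

```shell
# Two hypothetical CRLF logs standing in for the 108 real files.
dir=$(mktemp -d)
printf '%s\r\n' 'Subject 1, ID ik60607' \
    '! Item File <C:\Users\X\Version2.rtf>' 'Item 11213, 1372.44' > "$dir/a.log"
printf '%s\r\n' 'Subject 1, ID zz00001' \
    '! Item File <C:\Users\X\Version1.rtf>' 'Item 11131, 721.06' > "$dir/b.log"

# awk reads the files one after another; FNR restarts at 1 for each file.
combined=$(awk '
FNR == 1        {ID = FN = ""}                    # new file: forget the old labels
                {sub (/\r$/, "")}                 # drop the trailing CR
/^Item/         {print FN "," ID "," $0}
/^Subject /     {ID = $NF}
/! Item/        {gsub (/^.*\\|>$/, ""); FN = $0}
' "$dir"/*.log)

echo "$combined"
rm -rf "$dir"
```

Because the shell expands the glob, irregular file names are not a problem as long as a pattern (or an explicit list) covers them.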

---------- Post updated at 11:08 AM ---------- Previous update was at 10:39 AM ----------

Code:
,ik60607,Item 122116, 658.49
,ik60607,Item 12313, 550.71
,ik60607,Item 50, 30111.98
,ik60607,Item 51, 3807.15
,ik60607,Item 52, 2384.38


You have no idea how much pure joy you have generated. Thank you. Really.
# 6  
Old 02-06-2018
I can't understand that result. Please use "Manage Attachments" in the "Advanced" editor window to attach the original file (meaningfully truncated if need be) for analysis. If that requires a higher post count than you currently have, post the output of
Code:
od -tx1c filename

as text.
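
One possible explanation for the empty first column, offered only as a guess from the listing in post #1: there the line reads `!  Item File` with two spaces after the "!", whereas the pattern `/! Item/` expects exactly one. The sketch below uses a made-up sample with two spaces to compare the original pattern with a more tolerant one; whether the real files look like this is an assumption.

```shell
# Hypothetical CRLF sample with TWO spaces after the "!".
two_space=$(mktemp)
printf '%s\r\n' 'Subject 1, ID ik60607' \
    '!  Item File <C:\Users\X\Version2.rtf>' \
    'Item 11213, 1372.44' > "$two_space"

# Original pattern: never matches here, so FN stays empty.
strict=$(awk '
                {sub (/\r$/, "")}
/^Item/         {print FN "," ID "," $0}
/^Subject /     {ID = $NF}
/! Item/        {gsub (/^.*\\|>$/, ""); FN = $0}
' "$two_space")

# Tolerant pattern: " +" accepts one or more spaces after the "!".
tolerant=$(awk '
                {sub (/\r$/, "")}
/^Item/         {print FN "," ID "," $0}
/^Subject /     {ID = $NF}
/^! +Item File/ {gsub (/^.*\\|>$/, ""); FN = $0}
' "$two_space")

echo "strict:   $strict"
echo "tolerant: $tolerant"
rm -f "$two_space"
```

On this sample the strict version leaves the leading comma seen in post #5, while the tolerant version fills in the file name.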
# 7  
Old 02-06-2018
Hi,

Thank you for trying. I have to go and teach now, though. I had to resave both files on this Mac using TextEdit with whatever the default encoding was (UTF-8, I think). I made the awk one with ancient, dredged-up emacs muscle memory. Thanks. To be honest, the subject ID is good enough, and it's not hard to get rid of a leading comma, which is why I marked it solved. The files (not big) are attached; I had to edit the original output to remove a potential identifier (hastily, after I had already posted it).

Last edited by samonl; 02-06-2018 at 07:41 AM..