Data Processing


 
# 8  
Old 04-24-2017
You're joking, aren't you?
# 9  
Old 04-24-2017
I'm confused...

First: You requested empty lines between output sections that do not seem to be provided by RudiC's suggestion. Did you want them or not?

Second: It looks to me like RudiC's suggestion would duplicate the Directive URL as a Src URL in each output section (which you did not seem to want). But you have not said whether that is a problem. Did you want the Directive URL to be output both as a Directive URL and as a Src URL, or not?

And, third: The last Directive URL in your sample input is:
Code:
directive url is : https://coursera-eu.mokar.com/directives/dc70a6d8-6422-30e4-bc9f-680ff0911a10

but the output you say you want corresponding to that input is:
Code:
https://coursera-eu.mokar.com/directives/dc70a6d8-6422-30e4-bc9f-680ff0911a10i,...

Where did the i come from in that output?

Assuming that you did want empty lines between your output records, assuming that you did not want the Directive URL to be included in the <space>-separated list of Src URL field entries, and assuming that the extraneous i in the last line of your sample output was a typo, the following minor modification of RudiC's suggestion might be worth trying:
Code:
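awk '
BEGIN		{print "Directive Url,Src Url"	# header line
		}
# Directive line: strip the prefix and start a new output record.
sub (/^directive url is : /, "") \
		{printf "%s%s", TRS, $0
		 TRS = ORS ORS	# later records are preceded by an empty line
		 TFS = ","	# first Src URL follows a comma
		 next
		}
# Src URL line: append it to the current record.
/^https/	{printf "%s%s", TFS, $0
		 TFS = " "	# remaining Src URLs follow a space
		}
END		{printf "%s", ORS	# terminate the last record
		}
' "$1"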

If you invoke this script with the name of a file containing the sample input you provided in post #1 in this thread as its first operand, it produces the output:
Code:
Directive Url,Src Url
https://coursera-eu.mokar.com/directives/96df29ff-176a-35f7-8b1b-4ce483d15762,https://images-eu.ssl-images-mokar.com/images/G/31/img17/PCA/Watches/Watchestrack/Ingress/1041299_watches_1242x150_3._SX828_CB529193423_.jpg : 11.685547 https://images-eu.ssl-images-mokar.com/images/G/31/img17/PCA/Watches/Watchestrack/Ingress/1041299_watches_1242x150_3._SX1242_CB529193423_.jpg : 12.743164

https://coursera-eu.mokar.com/directives/05570fd8-563a-316a-9428-a60a6f404303,https://images-eu.ssl-images-mokar.com/images/G/31/img17/PCA/Watches/Watchestrack/Ingress/1041299_watches_1242x150_3._SX828_CB529193423_.jpg : 11.685547 https://images-eu.ssl-images-mokar.com/images/G/31/img17/PCA/Watches/Watchestrack/Ingress/1041299_watches_1242x150_3._SX1242_CB529193423_.jpg : 12.743164

https://coursera-eu.mokar.com/directives/dc70a6d8-6422-30e4-bc9f-680ff0911a10,https://images-eu.ssl-images-mokar.com/images/G/31/img16/app/sweeps/courses/Books-bunk-top_1242x150._SX828_CB529168623_.jpg : 11.00293 https://images-eu.ssl-images-mokar.com/images/G/31/img16/app/sweeps/courses/Books-bunk-top_1242x150._SX1242_CB529168623_.jpg : 13.24707
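For a quick local check without the sample from post #1, something along the following lines may help. This is only a sketch: the input is made up for illustration (the example.com URLs and the numbers are placeholders, not the data from this thread), and it assumes nothing beyond the two line shapes the script matches, namely lines starting with "directive url is : " and lines starting with "https".

```shell
# Hypothetical input: URLs and numbers are placeholders, not the data from post #1.
input='directive url is : https://example.com/directives/a
https://example.com/img/1.jpg : 1.5
https://example.com/img/2.jpg : 2.5
directive url is : https://example.com/directives/b
https://example.com/img/3.jpg : 3.5'

out=$(printf '%s\n' "$input" | awk '
BEGIN { print "Directive Url,Src Url" }
sub(/^directive url is : /, "") {
	printf "%s%s", TRS, $0   # TRS is empty before the first record
	TRS = ORS ORS            # later records start after an empty line
	TFS = ","                # first Src URL follows a comma
	next
}
/^https/ {
	printf "%s%s", TFS, $0
	TFS = " "                # remaining Src URLs follow a space
}
END { printf "%s", ORS }')

printf '%s\n' "$out"
```

With this input there should be one header line, then one line per directive, with an empty line between the two records.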

You could also try the following slightly different approach that produces exactly the same output as the above script:
Code:
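awk '
BEGIN {	print "Directive Url,Src Url"
}
# Src URL line: comma before the first one in a record, space before the rest.
/^https/ {
	printf "%s%s", srccnt++ ? " " : ",", $0
}
# Directive line: strip the prefix; every record but the first follows an empty line.
sub(/^directive url is : /, "") {
	printf "%s%s", directivecnt++ ? ORS ORS : "", $0
	srccnt = 0	# restart the Src URL count for this record
}
END {	printf "%s", ORS
}' "$1"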
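
The script above picks each separator with a post-increment counter inside a conditional expression. In case that idiom is unfamiliar, here is a minimal sketch of it in isolation; the input words and the variable name n are made up for the illustration.

```shell
# n++ yields the old value of n: 0 (false) for the first line, non-zero afterwards,
# so the first item gets no separator and every later item gets ", ".
out=$(printf '%s\n' alpha beta gamma | awk '
{ printf "%s%s", n++ ? ", " : "", $0 }
END { printf "%s", ORS }')

printf '%s\n' "$out"
```

Resetting the counter to 0, as the script does with srccnt on each directive line, makes the next item count as "first" again.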

As always, if someone wants to try either of these on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk.
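
If the same script has to run on several systems, one option is to choose the awk binary at run time instead of editing the script. A rough sketch, assuming the paths named above; the variable name AWK and the one-line test command are just illustrative choices.

```shell
# Default to the awk found in PATH; on Solaris/SunOS prefer the XPG4 awk
# and fall back to nawk if it is not installed.
AWK=awk
if [ "$(uname -s)" = SunOS ]; then
	if [ -x /usr/xpg4/bin/awk ]; then
		AWK=/usr/xpg4/bin/awk
	else
		AWK=nawk
	fi
fi

# Quick check that the chosen awk handles the sub() pattern used in this thread.
out=$(printf 'directive url is : x\n' | "$AWK" 'sub(/^directive url is : /, "") { print }')
printf '%s\n' "$out"
```

Either script can then be invoked as "$AWK" '...' "$1" in place of awk '...' "$1".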

Last edited by Don Cragun; 04-24-2017 at 07:14 PM.. Reason: Add alternative awk suggestion.