Data Processing


 
# 1  
Old 04-24-2017
Data Processing

I have the data below:
Code:
***************************************************
********************BEGINNING-1********************

directive url is : https://coursera-eu.mokar.com/directives/96df29ff-176a-35f7-8b1b-4ce483d15762


Src urls are :
https://images-eu.ssl-images-mokar.com/images/G/31/img17/PCA/Watches/Watchestrack/Ingress/1041299_watches_1242x150_3._SX828_CB529193423_.jpg : 11.685547
https://images-eu.ssl-images-mokar.com/images/G/31/img17/PCA/Watches/Watchestrack/Ingress/1041299_watches_1242x150_3._SX1242_CB529193423_.jpg : 12.743164

directive url is : https://coursera-eu.mokar.com/directives/05570fd8-563a-316a-9428-a60a6f404303


Src urls are :
https://images-eu.ssl-images-mokar.com/images/G/31/img17/PCA/Watches/Watchestrack/Ingress/1041299_watches_1242x150_3._SX828_CB529193423_.jpg : 11.685547
https://images-eu.ssl-images-mokar.com/images/G/31/img17/PCA/Watches/Watchestrack/Ingress/1041299_watches_1242x150_3._SX1242_CB529193423_.jpg : 12.743164

directive url is : https://coursera-eu.mokar.com/directives/dc70a6d8-6422-30e4-bc9f-680ff0911a10


Src urls are :
https://images-eu.ssl-images-mokar.com/images/G/31/img16/app/sweeps/courses/Books-bunk-top_1242x150._SX828_CB529168623_.jpg : 11.00293
https://images-eu.ssl-images-mokar.com/images/G/31/img16/app/sweeps/courses/Books-bunk-top_1242x150._SX1242_CB529168623_.jpg : 13.24707


I want it in the format below:

Code:
Directive Url,Src Url
https://coursera-eu.mokar.com/directives/96df29ff-176a-35f7-8b1b-4ce483d15762,https://images-eu.ssl-images-mokar.com/images/G/31/img17/PCA/Watches/Watchestrack/Ingress/1041299_watches_1242x150_3._SX828_CB529193423_.jpg : 11.685547 https://images-eu.ssl-images-mokar.com/images/G/31/img17/PCA/Watches/Watchestrack/Ingress/1041299_watches_1242x150_3._SX1242_CB529193423_.jpg : 12.743164

https://coursera-eu.mokar.com/directives/05570fd8-563a-316a-9428-a60a6f404303,https://images-eu.ssl-images-mokar.com/images/G/31/img17/PCA/Watches/Watchestrack/Ingress/1041299_watches_1242x150_3._SX828_CB529193423_.jpg : 11.685547 https://images-eu.ssl-images-mokar.com/images/G/31/img17/PCA/Watches/Watchestrack/Ingress/1041299_watches_1242x150_3._SX1242_CB529193423_.jpg : 12.743164

https://coursera-eu.mokar.com/directives/dc70a6d8-6422-30e4-bc9f-680ff0911a10,https://images-eu.ssl-images-mokar.com/images/G/31/img16/app/sweeps/courses/Books-bunk-top_1242x150._SX828_CB529168623_.jpg : 11.00293 https://images-eu.ssl-images-mokar.com/images/G/31/img16/app/sweeps/courses/Books-bunk-top_1242x150._SX1242_CB529168623_.jpg : 13.24707

Please let me know how I can solve this problem.

Last edited by Don Cragun; 04-24-2017 at 06:44 PM.. Reason: Get rid of nested CODE tags.
# 2  
Old 04-24-2017
With the suggestions that we have provided you on more than 50 other problems, we would hope that you have learned something from all of our previous help. What have you tried to solve this problem on your own?
This User Gave Thanks to Don Cragun For This Post:
# 3  
Old 04-24-2017
Hi Don,

I tried the code below, but it did not work, so I turned to the forum for help.

Code:
paste -d, -s norm.txt |awk -F "directive url is : " '{print $2 $3 $4}' | awk -F ",,,Src urls are :," '{print $1 "," $2}' | awk -F ",," '{print $1}'

Below is the output I'm getting, which is incorrect:

Code:
https://coursera-eu.mokar.com/directives/96df29ff-176a-35f7-8b1b-4ce483d15762,https://images-eu.ssl-images-mokar.com/images/G/31/img17/PCA/Watches/Watchestrack/Ingress/1041299_watches_1242x150_3._SX828_CB529193423_.jpg : 11.685547,https://images-eu.ssl-images-mokar.com/images/G/31/img17/PCA/Watches/Watchestrack/Ingress/1041299_watches_1242x150_3._SX1242_CB529193423_.jpg : 12.743164


Last edited by Don Cragun; 04-24-2017 at 06:41 PM.. Reason: Get rid of nested CODE tags.
# 4  
Old 04-24-2017
Try
Code:
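awk '
# Print the CSV header once.
BEGIN           {print "Directive Url,Src Url"
                }
# Directive line: strip the label and start a new output record.
# TRS is empty for the first record and ORS (a newline) afterwards.
sub (/^directive url is : /, "") \
                {printf "%s%s", TRS, $0
                 TRS = ORS
                 next   # $0 now starts with https; skip the rule below
                }
# Source URL line: append it to the current record.
/^https/        {printf ", %s", $0
                }
# Terminate the last record.
END             {printf "%s", ORS
                }
' file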

# 5  
Old 04-24-2017
Hi Rudi,

Thanks for the solution, but the output I'm getting is a bit different.

I don't want a comma (,) between the src urls; I want one only after the directive url, as there are only 2 columns.

Below is the output I wanted:

Code:
Directive Url,Src Url
https://coursera-eu.mokar.com/directives/96df29ff-176a-35f7-8b1b-4ce483d15762,https://images-eu.ssl-images-mokar.com/images/G/31/img17/PCA/Watches/Watchestrack/Ingress/1041299_watches_1242x150_3._SX828_CB529193423_.jpg : 11.685547 https://images-eu.ssl-images-mokar.com/images/G/31/img17/PCA/Watches/Watchestrack/Ingress/1041299_watches_1242x150_3._SX1242_CB529193423_.jpg : 12.743164

https://coursera-eu.mokar.com/directives/05570fd8-563a-316a-9428-a60a6f404303,https://images-eu.ssl-images-mokar.com/images/G/31/img17/PCA/Watches/Watchestrack/Ingress/1041299_watches_1242x150_3._SX828_CB529193423_.jpg : 11.685547 https://images-eu.ssl-images-mokar.com/images/G/31/img17/PCA/Watches/Watchestrack/Ingress/1041299_watches_1242x150_3._SX1242_CB529193423_.jpg : 12.743164

https://coursera-eu.mokar.com/directives/dc70a6d8-6422-30e4-bc9f-680ff0911a10,https://images-eu.ssl-images-mokar.com/images/G/31/img16/app/sweeps/courses/Books-bunk-top_1242x150._SX828_CB529168623_.jpg : 11.00293 https://images-eu.ssl-images-mokar.com/images/G/31/img16/app/sweeps/courses/Books-bunk-top_1242x150._SX1242_CB529168623_.jpg : 13.24707


Last edited by Don Cragun; 04-24-2017 at 06:44 PM.. Reason: Get rid of nested CODE tags.
# 6  
Old 04-24-2017
I'm glad I could (almost) help. For your required modifications, why don't you give it a try, with 168 posts and a six-year membership?
This User Gave Thanks to RudiC For This Post:
# 7  
Old 04-24-2017
Hi Rudi

I was able to do it with sed; I was just curious whether it was possible with the same awk script you shared. Anyway, thanks. My solution is below.

content.txt is my data file, and process.sh is the script you shared:

Code:
sh process.sh content.txt | sed 's/,\([^,]*\)$/ \1/'

Thanks again for your help.
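
On the question above of whether awk alone could produce the two-column format: it probably can. Below is a minimal sketch, assuming the input layout shown in post #1 (a "directive url is : " line followed by src url lines that start with https). The function name directives_to_csv and the variables started and sep are made up for illustration and are not from any earlier post.

```shell
# Hypothetical helper: reads the report from the named file(s) or from stdin
# and prints "directive,src1 src2 ..." with one line per directive.
directives_to_csv() {
    awk '
    BEGIN { print "Directive Url,Src Url" }

    # Directive line: finish the previous record, then start a new one.
    sub(/^directive url is : /, "") {
        if (started) printf "%s", ORS
        printf "%s", $0
        started = 1
        sep = ","      # the first src url follows a comma
        next           # the stripped line starts with https; skip the next rule
    }

    # Src url line: a comma before the first one, a space before the rest.
    started && /^https/ {
        printf "%s%s", sep, $0
        sep = " "
    }

    # Terminate the last record, if there was one.
    END { if (started) printf "%s", ORS }
    ' "$@"
}
```

Called as directives_to_csv content.txt, this should print the header followed by one line per directive. The sed expression in the post above replaces only the last comma on each line, so it relies on every directive having exactly two src urls; the sketch here does not. It does not emit the blank lines that appear between records in the desired output.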