text manipulation


 
# 1  
Old 06-04-2008
text manipulation

Hi, I have a file like the one below, and it may have any number of lines. A moderator gave me a solution with awk, but it only worked for the first 2 lines, apparently because of an awk limitation. Can anyone give me a solution? Thank you.
INPUT FILE:
1081 "WPCW 19 - CW/AM1, WPCB 40 - FAMN/CORNER, WPCB-DT1 50 - FAMN/CORNER, "
W35AW - Various Shopping Pgms
W41CF - TBN
W47CV - TBN
WLLS-LP 49 - AM1
WATCH WPXI 11 N & WPIX 11 CW

1082 "WPCW 19 - CW/AM1, WTRF-DT2 32 - F/MY, WPCB 40 - FAMN/CORNER, "
"WKBS-DT1 46 - FAMN/CORNER, WKBS 47 - FAMN/CORNER, WPCB-DT1 50 - FAMN/CORNER"
W45BT - FAMN/CORNER
W47CV - TBN
WLLS-LP 49 - AM1
WATCH WPXI 11 N & WPIX 11 CW
WATCH WPGH 53 F & WWCP 08 F

1086 "WPCW 19 - CW/AM1, WFPT-DT3 28 - V-ME, WTRF-DT2 32 - F/MY, WPCB 40 - FAMN/CORNER,"
"WKBS-DT1 46 - FAMN/CORNER, WKBS 47 - FAMN/CORNER,"
"WPCB-DT1 50 - FAMN/CORNER, WGPT-DT3 54 - V-ME"
W35AW - Various Shopping Pgms
W47CV - TBN
WATCH WPXI 11 N & WPIX 11 CW
WATCH WPGH 53 F & WWCP 08 F

OUTPUT FILE should be like this:
1081WPCW 19 - CW/AM1, WPCB 40 - FAMN/CORNER, WPCB-DT1 50 - FAMN/CORNER, "W35AW - Various Shopping PgmsW41CF - TBN W47CV - TBN
WLLS-LP 49 - AM1 WATCH WPXI 11 N & WPIX 11 CW

1082WPCW 19 - CW/AM1, WTRF-DT2 32 - F/MY, WPCB 40 - FAMN/CORNER, "
"WKBS-DT1 46 - FAMN/CORNER, WKBS 47 - FAMN/CORNER, WPCB-DT1 50 - FAMN/CORNER" W45BT - FAMN/CORNER W47CV - TBN WLLS-LP 49 - AM1 WATCH WPXI 11 N & WPIX 11 CW WATCH WPGH 53 F & WWCP 08 F

1086WPCW 19 - CW/AM1, WFPT-DT3 28 - V-ME, WTRF-DT2 32 - F/MY, WPCB 40 - FAMN/CORNER,""WKBS-DT1 46 - FAMN/CORNER, WKBS 47 - FAMN/CORNER,""WPCB-DT1 50 - FAMN/CORNER, WGPT-DT3 54 - V-ME" W35AW - Various Shopping Pgms W47CV - TBN WATCH WPXI 11 N & WPIX 11 CWWATCH WPGH 53 F & WWCP 08 F
# 2  
Old 06-04-2008
Are you just trying to eliminate <cr> and <lf> characters?

Perhaps I am missing something, but it appears that you want to remove each single <cr> <lf> between lines while keeping the consecutive <cr> <lf> pairs (the blank lines). That would turn each group of 6-7 lines or so into one long line.
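
If that reading is right, and assuming the groups are always separated by an empty line, awk's paragraph mode might be enough to do the joining. This is only a sketch: the function name join_paragraphs is made up for illustration, and any <cr> characters would have to be stripped first (for example with tr -d '\r').

```shell
#!/bin/sh
# join_paragraphs (hypothetical name): read text on stdin and print each
# blank-line-separated group as one line, with an empty line after each group.
join_paragraphs() {
    awk 'BEGIN { RS = ""; ORS = "\n\n" }   # RS="" makes every paragraph one record
         { gsub(/\n/, " "); print }'       # turn the inner newlines into spaces
}

# small demo: two groups of lines separated by an empty line
printf 'a\nb\n\nc\nd\n' | join_paragraphs
```

It does not look at the leading numbers at all, so it only fits if the blank lines are a reliable separator.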
# 3  
Old 06-04-2008
Hi, thanks for the reply. Each output line should start with a number like 1081, and all the lines of text that follow it should be joined onto it until the next number (1082).

So the output should look like this:

1081WPCW 19 - CW/AM1, WPCB 40 - FAMN/CORNER, WPCB-DT1 50 - FAMN/CORNER, "W35AW - Various Shopping PgmsW41CF - TBN W47CV - TBN
WLLS-LP 49 - AM1 WATCH WPXI 11 N & WPIX 11 CW

1082WPCW 19 - CW/AM1, WTRF-DT2 32 - F/MY, WPCB 40 - FAMN/CORNER, "
"WKBS-DT1 46 - FAMN/CORNER, WKBS 47 - FAMN/CORNER, WPCB-DT1 50 - FAMN/CORNER" W45BT - FAMN/CORNER W47CV - TBN WLLS-LP 49 - AM1 WATCH WPXI 11 N & WPIX 11 CW WATCH WPGH 53 F & WWCP 08 F
# 4  
Old 06-04-2008
Before/after clarification

Quote:
Before = 1081 "WPCW
After = 1081WPCW
You also show a drop of:
space character
first double-quote

Also, are all starting prefixes four digits?
Is there a range of numbers, e.g. >1000 and <2000?
# 5  
Old 06-04-2008
When the lines are joined, either a comma or a space can be put between them. There is no particular range of numbers, but they are always 4-digit numbers. If the double quotes can be removed as well, that would be good. Thank you.
# 6  
Old 06-04-2008
One approach in a Unix shell script

script:
Code:
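> cat conv_form
#!/bin/bash
# conv_form: join each record's continuation lines onto the line
# that starts with its number

ifile=file1
ofile=file9
rm -f "$ofile"
first=0

while read -r zf
   do
# skip the blank separator lines so they add no trailing space
   [ -z "$zf" ] && continue
   fourc=$(echo "$zf" | cut -c1-4)
# the numeric test fails (error message silenced) when fourc is not a number
   if [ "$fourc" -gt 1 ] 2>/dev/null
      then
#output prior data (if any), skipping the first pass through the file
         if [ "$first" -gt 0 ]
            then
            echo "$hold_var" >>"$ofile"
            echo " " >>"$ofile"
         fi
         first=1
#start a new record with the numbered line
         hold_var=$zf
      else
#append a continuation line, separated by one space
         hold_var="$hold_var $zf"
   fi
done <"$ifile"
#output the last record, still held in hold_var when the loop ends
echo "$hold_var" >>"$ofile"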

output:
>cat file9
1081 "WPCW 19 - CW/AM1, WPCB 40 - FAMN/CORNER, WPCB-DT1 50 - FAMN/CORNER, " W35AW - Various Shopping Pgms W41CF - TBN W47CV - TBN WLLS-LP 49 - AM1 WATCH WPXI 11 N & WPIX 11 CW

1082 "WPCW 19 - CW/AM1, WTRF-DT2 32 - F/MY, WPCB 40 - FAMN/CORNER, " "WKBS-DT1 46 - FAMN/CORNER, WKBS 47 - FAMN/CORNER, WPCB-DT1 50 - FAMN/CORNER" W45BT - FAMN/CORNER W47CV - TBN WLLS-LP 49 - AM1 WATCH WPXI 11 N & WPIX 11 CW WATCH WPGH 53 F & WWCP 08 F

1086 "WPCW 19 - CW/AM1, WFPT-DT3 28 - V-ME, WTRF-DT2 32 - F/MY, WPCB 40 - FAMN/CORNER," "WKBS-DT1 46 - FAMN/CORNER, WKBS 47 - FAMN/CORNER," "WPCB-DT1 50 - FAMN/CORNER, WGPT-DT3 54 - V-ME" W35AW - Various Shopping Pgms W47CV - TBN WATCH WPXI 11 N & WPIX 11 CW WATCH WPGH 53 F & WWCP 08 F
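
A side note on the numeric test in the script above: it accepts any line whose first four characters form a number greater than 1, so a short continuation line such as 53 would also be taken as a record start, and a prefix like 0001 would not. If the rule is simply "starts with four digits", a pattern match may express it more directly. The helper name starts_record below is hypothetical and not part of the script above; this is just a sketch.

```shell
#!/bin/bash
# starts_record (hypothetical helper): succeed only when the line
# begins with four digits
starts_record() {
    case $1 in
        [0-9][0-9][0-9][0-9]*) return 0 ;;
        *) return 1 ;;
    esac
}

# try it on a numbered line, two continuation lines and a zero-padded number
for line in '1081 "WPCW 19 - CW/AM1, "' 'W35AW - Various Shopping Pgms' '53' '0001 "WPCW"'; do
    if starts_record "$line"; then
        echo "record start: $line"
    else
        echo "continuation: $line"
    fi
done
```

Inside the loop of the script, `if starts_record "$zf"` could then stand in for the cut and -gt test.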
# 7  
Old 06-04-2008
I will try that. The output seems to be almost exactly what I wanted, but I don't need the space and double quotes, such as those between 1081 and WPCW in 1081 "WPCW.

The output should be like this:
1081WPCW 19 - CW/AM1, WPCB 40 - FAMN/CORNER, WPCB-DT1 50 - FAMN/CORNER, W35AW - Various Shopping Pgms W41CF - TBN W47CV - TBN WLLS-LP 49 - AM1 WATCH WPXI 11 N & WPIX 11 CW

1082WPCW 19 - CW/AM1, WTRF-DT2 32 - F/MY, WPCB 40 - FAMN/CORNER, WKBS-DT1 46 - FAMN/CORNER, WKBS 47 - FAMN/CORNER, WPCB-DT1 50 - FAMN/CORNER W45BT - FAMN/CORNER W47CV - TBN WLLS-LP 49 - AM1 WATCH WPXI 11 N & WPIX 11 CW WATCH WPGH 53 F & WWCP 08 F

1086WPCW 19 - CW/AM1, WFPT-DT3 28 - V-ME, WTRF-DT2 32 - F/MY, WPCB 40 - FAMN/CORNER,WKBS-DT1 46 - FAMN/CORNER, WKBS 47 - FAMN/CORNER,WPCB-DT1 50 - FAMN/CORNER, WGPT-DT3 54 - V-ME W35AW - Various Shopping Pgms W47CV - TBN WATCH WPXI 11 N & WPIX 11 CW WATCH WPGH 53 F & WWCP 08 F
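
One possible way to get close to this format is sketched below with awk inside a shell function. The function name join_records is made up for illustration. It assumes every record starts with a 4-digit number, removes all double quotes, closes the gap after the number and joins the continuation lines with a single space. The hand-made sample above is not fully consistent about spaces after commas, so the spacing may differ slightly in places.

```shell
#!/bin/sh
# join_records (hypothetical name): read the input file on stdin and
# print one line per numbered record, with an empty line between records.
join_records() {
    awk '
        { gsub(/"/, "") }                          # drop every double quote
        { sub(/^[ \t]+/, ""); sub(/[ \t]+$/, "") } # trim both ends of the line
        $0 == "" { next }                          # ignore blank separator lines
        /^[0-9][0-9][0-9][0-9]/ {                  # four digits start a new record
            if (rec != "") print rec "\n"          # flush the previous record
            rest = substr($0, 5)
            sub(/^[ \t]+/, "", rest)               # close the gap after the number
            rec = substr($0, 1, 4) rest
            next
        }
        { rec = (rec == "" ? $0 : rec " " $0) }    # append a continuation line
        END { if (rec != "") print rec }           # flush the last record
    '
}

# small demo with made-up data in the same layout as the input file
printf '1081 "AAA 1, "\nBBB 2\n\n1082 "CCC 3,"\n"DDD 4"\nEEE 5\n' | join_records
```

With the file names from the earlier script it could be run as `join_records <file1 >file9`.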