splitting a column into rows

07-24-2007

Registered User

36, 0

Join Date: May 2007

Last Activity: 25 January 2012, 1:28 PM EST

Posts: 36

Thanks Given: 0

Thanked 0 Times in 0 Posts

I have a column of data of the format:

EDITORIAL OPED
193987141
193986701
193987451
193986321
STATISTICS
193986351
COLUMN
EDITORIAL OPED
193987171
NEWS
193321171
NEWS
193321111
NEWS
193320891
NEWS
193321841
NEWS
193320821
BUSINESS
BUSINESS
193323581

I would like to take this column and separate it into several columns so that it looks like this:

EDITORIAL OPED 193987141

193986701
193987451
193986321
STATISTICS 193986351

COLUMN EDITORIAL OPED 193987171

NEWS 193321171

NEWS 200057287

Can anyone offer any suggestions?

spindoctor


07-24-2007

Registered User

1,203, 103

Join Date: Mar 2007

Last Activity: 28 January 2020, 10:33 PM EST

Location: Orlando, Florida

Posts: 1,203

Thanks Given: 1

Thanked 103 Times in 100 Posts

Spindoctor,
I cannot work out, in plain English, the logic that produces your output
from the input file you provided.

What should happen to:

Code:

NEWS
193321111
NEWS
193320891
NEWS
193321841
NEWS
193320821
BUSINESS
BUSINESS
193323581

Where did the following output come from:

Code:

NEWS 200057287

After you make all the necessary corrections, see if this can work for you:

Code:

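#!/bin/ksh
# Join each known label line with the line that follows it.
while read mLine
do
  # A listed label is printed without a newline, so the next line lands beside it.
  if [ "${mLine}" = "EDITORIAL OPED"  -o \
       "${mLine}" = "STATISTICS"      -o \
       "${mLine}" = "NEWS" ]; then
    printf '%s ' "${mLine}"
  else
    # Anything else (an ID or an unlisted label) ends the current row.
    printf '%s\n' "${mLine}"
  fi
done < input_file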
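
The loop above only joins the three labels it lists, so COLUMN and BUSINESS stay on lines of their own. A more general sketch follows, assuming that every line made up only of digits is a document ID and that every other non-empty line is a label. The function name join_labels is hypothetical, and the tab separator is a choice made here so the result imports cleanly into a spreadsheet.

```shell
#!/bin/sh
# join_labels (hypothetical name): pair each run of label lines with the
# ID that follows it. Assumes IDs are lines that contain digits only.
join_labels() {
  awk '
    # An ID closes the current row: print the collected labels, a tab, the ID.
    /^[0-9]+$/ { printf "%s\t%s\n", label, $0; label = ""; next }
    # Any other non-empty line is a label; consecutive labels are joined.
    NF { if (label == "") label = $0; else label = label " " $0 }
    # Labels left over at the end of input get a row with an empty ID.
    END { if (label != "") printf "%s\t\n", label }
  ' "$@"
}
```

Called as `join_labels input_file`, this prints one row per ID. An ID that has no label of its own gets an empty first column, which keeps the IDs aligned in the second column.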

Shell_Life


07-24-2007


Ultimately, this column comes from a series of files, each of which holds one news story.
A typical news story looks like this:

Document 4 of 6

Ours is a manufacturing province:[Final Edition]
Edmonton Journal. Edmonton, Alta.:Jan 2, 2002. p. A12

Document types: Business; Editorial

Section: Opinion

Publication title: Edmonton Journal. Edmonton, Alta.: Jan 2, 2002. pg. A.12

Source type: Newspaper

ProQuest document: 221806441

Text Word Count 559

Document URL: http://proquest.umi.com/
pqdweb?did=221806441&Fmt=3&clientId=14119&RQT=309&VName=PQD

Abstract (Document Summary)

The tax load on manufacturers in Edmonton is the lowest of all cities in North
America, according to a comparison by ICF Economic Consulting Group of San
Francisco, in a study for Economic Development Edmonton.

The EDE survey discovered that 80 per cent of advanced manufacturing companies
in Edmonton were founded here. So our city's economic growth may depend more
upon encouraging local entrepreneurs than upon attracting businesses from
elsewhere.

Allan Scott, EDE's incoming president, promises to pursue venture capital and
has suggested that small amounts of provincial or municipal government funds
might reasonably go into high-risk, high-return venture portfolios.

Full Text (559 words)

Copyright Southam Publications Inc. Jan 2, 2002

Premier Ralph Klein has accurately recognized the importance of manufacturing
to Alberta's economy.

Too often, we assume that our province depends only on energy prices, inviting
complacency when they are high, and gloom when they are low -- as they are now.

If prices stay low, "the only way we can make up the difference is if there is
a strong movement in the manufacturing sector," Klein said in a year-end
interview.

Fortunately, that sector has grown steadily over the past three decades.
Alberta manufacturing shipments have risen from $1.9 billion to $32.8 billion
from 1970 to 1998.

*************************

I'm at the stage where I'm extracting information into files to put into an Excel spreadsheet. I received help from someone else in another thread and settled on using the egrep command. I would go through each file, egrep the line that started with "ProQuest Document ID", and output that to a separate file. Then I would egrep again for every line that started with, say, "Publication title" and output that to a different file. Finally, I would import both into Excel and line the two columns up so that each ProQuest Document ID line matched the *corresponding* publication title in the next column.

That actually worked pretty well for most of the fields I'm interested in. However, some of them (Section and Document types in particular) *do not appear in each file*. Therefore, my technique of egrepping each line wouldn't work: not every file has a line to egrep, so the columns would no longer line up.
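
To see which files lack a given field before the columns are combined, a small check along these lines might help. It is only a sketch: the function name files_missing_field is hypothetical, and it assumes each field sits at the start of a line and is followed by a colon, as in the sample story above.

```shell
#!/bin/sh
# files_missing_field (hypothetical name): list the files that have no line
# starting with "<field>:". Assumes the field name has no regex metacharacters.
files_missing_field() {
  field=$1
  shift
  for f in "$@"; do
    # grep -q prints nothing and only reports through its exit status.
    grep -q "^$field:" "$f" || printf '%s\n' "$f"
  done
}
```

For example, `files_missing_field 'Section' *` would print the name of every file in the current directory that has no Section line.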

I compromised and developed an awk command:

awk ' BEGIN { FS = ":" } ; /^Document.types|^ProQuest document/ { print $2 } ' * >> ~/documents/dissertation/con/prime/newfile.txt

and that is how I got the column of data above. I was hoping to find a way to split that column into three columns where the data would line up nicely. However, I'm seeing that this might be difficult.

Any suggestions that work either with the column of data I have here or with the original news stories would be welcome. That is, I'd like to get the fields ProQuest Document ID, Document Types, and Section, and print them *in rows*, one row for each file, and not in a column.
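
One way the row-per-file output described here might be produced directly from the original story files is sketched below. It assumes the field labels appear exactly as in the sample story ("ProQuest document:", "Document types:", "Section:") and that each file holds one story. The function name extract_rows and the tab-separated layout are choices made for this sketch, not part of the original data.

```shell
#!/bin/sh
# extract_rows (hypothetical name): print one tab-separated row per input file:
#   ProQuest document <TAB> Document types <TAB> Section
# A field that is missing from a file is left empty, so columns stay aligned.
extract_rows() {
  awk '
    # Strip the "Label:" prefix and any blanks after it.
    function value(line) { sub(/^[^:]*:[ \t]*/, "", line); return line }
    # Print the row collected for the previous file, then reset the fields.
    function emit() {
      if (seen) printf "%s\t%s\t%s\n", id, types, section
      id = types = section = ""
    }
    FNR == 1              { emit(); seen = 1 }   # first line of each new file
    /^ProQuest document:/ { id = value($0) }
    /^Document types:/    { types = value($0) }
    /^Section:/           { section = value($0) }
    END                   { emit() }             # row for the last file
  ' "$@"
}
```

Run as `extract_rows * > ../rows.tsv` from the directory that holds the stories (the output path is only an example, kept outside that directory so it is not read as input). The result can then be opened in Excel as a tab-delimited file. Empty files produce no row in this sketch.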

By the way, I'm aware that some people may be frustrated with me, as I have posted a number of times on the same project. Please understand, this is my very first foray into programming and, for whatever it's worth, I have learned a shitload about unix and I'm getting much more independent at it. But I'm not ready to take the training wheels off just yet. Simple or annotated, explained scripts are welcome!

spindoctor


07-24-2007


Quote:

By the way, I'm aware that some people may be frustrated with me, as I have posted a number of times on the same project. Please understand, this is my very first foray into programming and, for whatever it's worth, I have learned a shitload about unix and I'm getting much more independent at it. But I'm not ready to take the training wheels off just yet. Simple or annotated, explained scripts are welcome!

I have also helped you a while back.

Your requests start with a small and simple specification and then
grow into a large project.

Keep in mind the following:

1) We are all volunteers here.

2) We are here to help and we are glad to help.

3) Unix is an operating system for information systems professionals.
It is not friendly to non-professional users.

4) We can help people resolve a specific issue that does not require
many lines of code, and especially not many hours or days of thinking.

Per your description, you have a large and complex project.

It will be very hard to find a volunteer to devote time and effort
to complete your project.

What I suggest to you is to hire a programmer in your city to analyze,
plan and develop your project.

Spindoctor, think how much time you have put into this project,
and it is still far from finished.

If you had hired a programmer from the beginning you would already
have a finished product.

Sometimes, it does not pay to try to fix our own cars.

Shell_Life
