Script or command: Formatted text to CSV


 
# 1  
Old 03-23-2010

Hi Everyone, I've been using this site as a great resource to aid me with simple search and replace tasks. I still consider myself a novice and now I've been pulling my hair out over this problem. Any hints or suggestions would be welcome!

I have a text file in a format like this:

name: John_Smith division: A
age: 5
favorite food: cake
favorite color: blue
about: "John likes to fish"

name: Mary_Smith division: B
age: 7
favorite food: french fries
favorite color: red
about: "Mary will be moving next year"

name: Josh_Murray division: C
age: 5
favorite food: starfish
favorite color: blue
about: "Josh likes the beach"

The situation is that I need to create a CSV file using the name, age, and about fields in the file, sorted by division in the order A, C, B. The sorting would be a bonus; I was thinking that the worst-case scenario would be importing the CSV file into Excel and sorting it there. Why does it need to be in A, C, B order? I can't figure that out.

The desired output would be:

John_Smith,5,John likes to fish
Josh_Murray,5,Josh likes the beach
Mary_Smith,7,Mary will be moving next year

I've been trying to use egrep to pull out the desired fields, since they will remain the same, but I have been unable to figure out how to create each comma-separated line in the new file. I think I may be on the wrong track.

I've also been trying to use awk and sed, but again, I cannot figure out how to pull the desired fields of a record into a single line and then start a new line when the next record begins. Records will always start with "name", and the name field will always be populated.

Any suggestions?
# 2  
Old 03-23-2010
Without sorting:
Code:
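#!/bin/bash
# Split each line into its key (L1) and the rest (L2), then emit one CSV row per record
while read -r L1 L2
do
    case $L1 in
        name:)  printf '%s,' "${L2%% *}" ;;    # keep only the name, drop " division: X"
        age:)   printf '%s,' "$L2" ;;
        about:) L2=${L2#\"}; L2=${L2%\"}       # strip the surrounding double quotes
                printf '%s\n' "$L2" ;;
    esac
done < infile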

It seems the wanted sort order is by age first, then by name. Is that right?
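
If that guess is right, the rows produced by the loop above could simply be piped through sort with two keys. This is only a minimal sketch: sort_age_name is a made-up helper name, and it assumes the rows are already in name,age,about form.

```shell
# Hypothetical helper: order "name,age,about" rows by age (numeric), then by name.
# LC_ALL=C keeps the name comparison byte-based, whatever the user's locale is.
sort_age_name() {
    LC_ALL=C sort -t, -k2,2n -k1,1
}

printf '%s\n' \
    'Mary_Smith,7,Mary will be moving next year' \
    'Josh_Murray,5,Josh likes the beach' \
    'John_Smith,5,John likes to fish' | sort_age_name
```

Only the first two fields are used as sort keys, so commas inside the about text do not affect the ordering.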
# 3  
Old 03-23-2010
Code:
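file=/home/amit/ars/amit.txt

# Pull each field into its own temporary file, one value per record
grep '^name:' "$file" | awk '{ print $2 }' > name
grep '^age:' "$file" | awk '{ print $2 }' > age
grep '^about:' "$file" | cut -d '"' -f2 > about

# Join the three columns line by line, separated by commas
paste -d, name age about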

# 4  
Old 03-23-2010
This should work. Again, the sorting thing is a little odd, so I've left division in so you can sort on it, but it should be easy to remove. There are probably much more elegant ways of doing it:

cat file | tr -s "\n" " " | sed 's/name:/\nname:/g' | awk -F"[a-zA-Z]*:" 'NF {print $2 $3 $4 $7}' | sed 's/  /,/g;s/ favorite//g;s/^ //g'

This could also potentially be broken if someone puts a : where they shouldn't!
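
One way to sidestep stray colons is to avoid splitting on ":" at all and instead strip only the known key from the start of each line. Here is a rough sketch using awk's paragraph mode (RS=""), where each blank-line-separated record is read in one go and every line becomes a field. The function name to_csv is made up for this example, and like the command above it keeps division as the second column for later sorting.

```shell
# Hypothetical helper: turn blank-line-separated records into name,division,age,about rows.
to_csv() {
    awk 'BEGIN { RS = ""; FS = "\n"; OFS = "," }
    {
        name = div = age = about = ""
        for (i = 1; i <= NF; i++) {
            line = $i
            if (line ~ /^name: /) {
                sub(/^name: /, "", line)
                # "John_Smith division: A" -> name and division
                n = index(line, " division: ")
                if (n) { name = substr(line, 1, n - 1); div = substr(line, n + 11) }
                else name = line
            } else if (line ~ /^age: /) {
                age = substr(line, 6)
            } else if (line ~ /^about: /) {
                about = substr(line, 8)
                gsub(/^"|"$/, "", about)    # drop the surrounding double quotes
            }
        }
        print name, div, age, about
    }' "$@"
}

# Sample input with a colon inside the about text
printf '%s\n' \
    'name: John_Smith division: A' \
    'age: 5' \
    'favorite food: cake' \
    'about: "John says: fish"' \
    '' \
    'name: Mary_Smith division: B' \
    'age: 7' \
    'about: "Mary will be moving next year"' | to_csv
```

Because only the leading key is removed, a colon later in the line is left alone. A comma inside the about text would still need quoting to make valid CSV; the sketch does not handle that.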

Edit:

Or, a slightly neater way:

cat text | tr -s "\n" " " | sed 's/name:/\nname:/g' | awk -F' ?(name|division|age|favorite food|favorite color|about): ' 'NF {print $2","$3","$4","$7}'

Again, this leaves division in for sorting; it can easily be removed by dropping $3 from the awk print.
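
With division kept as the second column, the A, C, B order could be applied afterwards. The following is just a sketch under that assumption: sort_by_division is a hypothetical helper, and the rank table simply hard-codes the order asked for in the first post.

```shell
# Hypothetical helper: reads "name,division,age,about" rows on stdin,
# orders them by division A, C, B, then drops the division column.
sort_by_division() {
    awk -F, '
        BEGIN { rank["A"] = 1; rank["C"] = 2; rank["B"] = 3 }
        {
            # Unknown divisions get rank 9 so they sort last
            r = ($2 in rank) ? rank[$2] : 9
            # Rebuild the row without field 2, keeping any commas in the about text
            line = $1 "," $3
            for (i = 4; i <= NF; i++) line = line "," $i
            print r "," line
        }' | LC_ALL=C sort -t, -k1,1n -k2,2 | cut -d, -f2-
}

printf '%s\n' \
    'John_Smith,A,5,John likes to fish' \
    'Mary_Smith,B,7,Mary will be moving next year' \
    'Josh_Murray,C,5,Josh likes the beach' | sort_by_division
```

The rank is added as a temporary first column, used as the primary sort key (with name as a tie-breaker), and then cut off again.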
# 5  
Old 03-23-2010
Quote:
Originally Posted by regexnub
...
The situation is that I need to create a CSV file using the name, age, and about fields in the file, sorted by division in the order A, C, B. The sorting would be a bonus; I was thinking that the worst-case scenario would be importing the CSV file into Excel and sorting it there. Why does it need to be in A, C, B order? I can't figure that out.
I'd bet the sorting is done on name.
In case of a simple ASCII character sort -

Code:
John_Smith < Josh_Murray < Mary_Smith
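
That ordering can be checked quickly from the shell. A small sketch, with byte_sort as a made-up helper name; LC_ALL=C is set so the comparison is done byte by byte regardless of the locale in use:

```shell
# Hypothetical helper: plain byte-wise sort of the lines on stdin.
byte_sort() {
    LC_ALL=C sort
}

printf '%s\n' Mary_Smith Josh_Murray John_Smith | byte_sort
```

"Joh" sorts before "Jos" because "h" precedes "s", and both come before "Mary", which happens to match the A, C, B division order of the sample data.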

Quote:
The desired output would be:

John_Smith,5,John likes to fish
Josh_Murray,5,Josh likes the beach
Mary_Smith,7,Mary will be moving next year

I've been trying to use egrep ...
... trying to use awk and sed, ...
Any suggestions?
This can be solved quite simply and elegantly in Perl.
If you store each comma-delimited line as an array element, you can use Perl's built-in sort operator on the array -

Code:
$
$ cat -n f9
     1  name: John_Smith division: A
     2  age: 5
     3  favorite food: cake
     4  favorite color: blue
     5  about: "John likes to fish"
     6
     7  name: Mary_Smith division: B
     8  age: 7
     9  favorite food: french fries
    10  favorite color: red
    11  about: "Mary will be moving next year"
    12
    13  name: Josh_Murray division: C
    14  age: 5
    15  favorite food: starfish
    16  favorite color: blue
    17  about: "Josh likes the beach"
$
$
$ ##
$ perl -lne 'if (/name: (.*?) division.*$/ || /age: (\d+)$/ || /about: "(.*?)"$/){$s.=",$1"}
>            elsif (/^$/){push @a,$s;$s=""}
>            END{push @a, $s; foreach (sort @a){print substr($_,1)}}' f9
John_Smith,5,John likes to fish
Josh_Murray,5,Josh likes the beach
Mary_Smith,7,Mary will be moving next year
$
$

HTH,
tyler_durden
# 6  
Old 03-24-2010
Perl may be your best helper here:
Code:
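use strict;
use warnings;

my %rows;                               # division => list of CSV rows
my %seq = (A => 1, C => 2, B => 3);     # requested division order
local $/ = "\n\n";                      # read one blank-line-separated record at a time
while (<DATA>) {
	if (/name:\s*(\S+)\s*division:\s*(\S+).*^age:\s*([0-9]+).*^about:\s*"([^"]+)"/sm) {
		# Keep every record, even when several share the same division
		push @{ $rows{$2} }, sprintf("%s,%s,%s", $1, $3, $4);
	}
}
foreach my $key (sort { $seq{$a} <=> $seq{$b} } keys %rows) {
	print "$_\n" for @{ $rows{$key} };
}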
__DATA__
name: John_Smith division: A
age: 5
favorite food: cake
favorite color: blue
about: "John likes to fish"

name: Mary_Smith division: B
age: 7
favorite food: french fries
favorite color: red
about: "Mary will be moving next year"

name: Josh_Murray division: C
age: 5
favorite food: starfish
favorite color: blue
about: "Josh likes the beach"

# 7  
Old 03-24-2010
Code:
awk '/^name/  {name=$2}
     /^age/   {age=$2}
     /^about/ {split($0,b,"\"");about=b[2]}
     # A blank line ends a record; skip it if no record is pending
     /^$/ && name != "" {print name,age,about;name=age=about=""}
     END {if (name != "") print name,age,about}' OFS="," urfile
