The UNIX and Linux Forums  
  #1 (permalink)  
Old 08-06-2008
Paradoxdruid Paradoxdruid is offline
Registered User
  
 

Join Date: Aug 2008
Posts: 5
Multicolumn csv output

I'm revising a quick and dirty script I wrote for my work that takes a series of data files (tab delimited) and condenses them into one. Essentially, each original file looks like these examples:
Code:
A1  data
B1 data
C1 data
Code:
A2 data
B2 data
C2 data
The script currently squashes them into the final form:
Code:
A1,data
A2,data
A3,data
B1,data
B2,data
B3,data
...
What I'm having difficulty with is changing the output format to something like this instead:
Code:
A1,data,B1,data,C1,data
A2,data,B2,data,C2,data
A3,data,B3,data,C3,data
I'm attaching my script below... I'd appreciate any feedback-- I'm very new to this!

Code:
#!/bin/bash
#ATR Kinetic Analysis Script for KDE
#author: ParadoxDruid
#created March 13, 2008
#modified August 6th, 2008

#say hello
kdialog --msgbox "Signalyze ATR Data Converter\nPress OK to select working directory"
#navigate to proper directory
DIRECTORY=$(kdialog --getexistingdirectory .)
#get necessary info from user
NAME=$(kdialog --title "Trial Name" --inputbox "Name of your trial")
echo "$NAME Trial Set" > "$DIRECTORY/$NAME.csv"
echo "Seq,Point,Integral,sd" >> "$DIRECTORY/$NAME.csv"
#filename set
SET=$(kdialog --title "Sets" --inputbox "What is the name of your results files? (i.e.  kinetics-1_  )")
#end filename set

#probe sequence detection (drop the 22 header lines of the first file)
sed '1,22d' "$DIRECTORY/${SET}1.stx" > "$DIRECTORY/$NAME.temp"
SEQNUM=$(wc -l < "$DIRECTORY/$NAME.temp")
#end probe sequence detection

#data points auto-detection
POINTS=$(ls "$DIRECTORY" | grep "$SET" | grep stx | sed "s/$SET//; s/\.stx//" | sort -n | tail -n1)
#end data points auto detection

#probe names routine
for ((c=1;c<$SEQNUM+1;c++)); do
sequences[$c]=$(sed -n "${c}p" "$DIRECTORY/$NAME.temp" | cut -f 1)
done
#end routine

#increment the number of data points, since the ATR doesn't use 0
TRIALS=$((POINTS+1))

#define a function to grab data
seqgrab() {
for ((i=1;i<$TRIALS;i++)); do
data=$(grep -w "$1" "$DIRECTORY/$SET$i.stx" | cut -f 5)
sd=$(grep -w "$1" "$DIRECTORY/$SET$i.stx" | cut -f 6)
#one quoted string, so no stray spaces end up around the commas
echo "$1,$i,$data,$sd" >> "$DIRECTORY/$NAME.csv"
done
}

#tell about our progress
dcopRef=`kdialog --progressbar "Initialising" ${#sequences[@]}`

#iterate through the data files
for ((n=1;n<${#sequences[@]}+1;n++)); do
seqgrab "${sequences[${n}]}"
dcop $dcopRef setProgress $n
dcop $dcopRef setLabel "Working..."
done

rm "$DIRECTORY/$NAME.temp"
dcop $dcopRef close
kdialog --textbox "$DIRECTORY/$NAME.csv" 440 800
exit 0
Thanks again for any help!
  #2 (permalink)  
Old 08-06-2008
bakunin bakunin is offline Forum Staff  
Bughunter Extraordinaire
  
 

Join Date: May 2005
Location: In the leftmost byte of /dev/kmem
Posts: 1,628
You might want to use the "paste" command instead - seems like it could do all the work.

I hope this helps.

bakunin
  #3 (permalink)  
Old 08-06-2008
Paradoxdruid Paradoxdruid is offline
Registered User
  
 

Join Date: Aug 2008
Posts: 5
The paste command does look promising, and I feel silly for not knowing it.

However, I may have oversimplified my explanation above in trying to keep it legible.
Here's a snippet from two actual original files:
Code:
Time 6
Probe_Name    Count    Net_Signal    Net_Signal_SD    Net_Integral    Net_Integral_SD    Proc_Control    
A1    5    0.04594    0.01175    0.81596    0.23182    OK    
B1    3    0.02464    0.00381    0.59647    0.15367    OK    
C1    5    0.13487    0.02862    2.54441    0.29700    OK
Code:
Time 7
Probe_Name    Count    Net_Signal    Net_Signal_SD    Net_Integral    Net_Integral_SD    Proc_Control    
A1    5    0.04545    0.01211    0.82307    0.24171    OK    
B1    3    0.02332    0.00557    0.56161    0.10771    OK    
C1    5    0.13672    0.02963    2.54276    0.26535    OK    
D1    5    0.14061    0.07675    2.58301    1.31850    OK
All I want to preserve is the first column (e.g. A1) and the Net_Integral and Net_Integral_SD values (fields 5 and 6, here 0.81596 and 0.23182), in a time-dependent fashion, so that my output shows that the data named A1 had value 0.81596 at time 6 and 0.82307 at time 7. A snippet of the current actual final output:
Code:
Seq,Point,Integral,sd
A1 ,1,0.81596,0.23182
A1 ,2,0.81793,0.2443
A1 ,3,0.82073,0.24254
A1 ,4,0.82307,0.24171
A1 ,5,0.81935,0.23554
B1 ,1,0.59647,0.15367
B1 ,2,0.57585,0
B1 ,3,0.55278,0.11597
B1 ,4,0.56161,0.10771
B1 ,5,0.49331,0.08419
C1 ,1,2.54441,0.297
Currently, I do this by iterating over the times/files, using grep to find the name and cut to grab the correct fields, and then saving them. But I output everything via echo, which makes it difficult to line up multiple columns.

I'll look at paste more, but I'd appreciate further ideas, too. There's probably an easy way to do this with awk or paste or something that I just haven't seen. Thanks!
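One way to get that layout with awk, as a minimal sketch rather than a tested solution for the real data: print one CSV row per input file, with the name, field 5 and field 6 of every probe line side by side. The function name wide_rows and the sample file names are made up for illustration, and the rule for spotting data lines (at least six tab-separated fields, first field not "Probe_Name") is an assumption based on the snippets above.

```shell
#!/bin/bash
# wide_rows FILE...  (hypothetical helper, not from the original script)
# Prints one CSV row per input file: name,integral,sd for each probe line.
wide_rows() {
    awk -F'\t' '
        # First line of every file after the first: flush the finished row.
        FNR == 1 && NR > 1 { print row; row = ""; sep = "" }
        # Assumed data-line test: 6+ tab-separated fields, not the column header.
        NF >= 6 && $1 != "Probe_Name" {
            row = row sep $1 "," $5 "," $6
            sep = ","
        }
        END { if (NR > 0) print row }
    ' "$@"
}

# Two small sample files modelled on the snippets above.
tmp=$(mktemp -d)
header='Probe_Name\tCount\tNet_Signal\tNet_Signal_SD\tNet_Integral\tNet_Integral_SD\tProc_Control\n'
printf "Time 6\n${header}A1\t5\t0.04594\t0.01175\t0.81596\t0.23182\tOK\nB1\t3\t0.02464\t0.00381\t0.59647\t0.15367\tOK\n" > "$tmp/t1.stx"
printf "Time 7\n${header}A1\t5\t0.04545\t0.01211\t0.82307\t0.24171\tOK\nB1\t3\t0.02332\t0.00557\t0.56161\t0.10771\tOK\n" > "$tmp/t2.stx"

wide_rows "$tmp/t1.stx" "$tmp/t2.stx"
# A1,0.81596,0.23182,B1,0.59647,0.15367
# A1,0.82307,0.24171,B1,0.56161,0.10771

rm -rf "$tmp"
```

Two caveats: a file with an extra probe (like D1 at time 7 above) simply yields a longer row, so columns only line up when every file lists the same probes in the same order; and the rows come out in the order the files are given, so a plain shell glob would put file 10 before file 2.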
  #4 (permalink)  
Old 08-06-2008
shamrock shamrock is offline Forum Advisor  
Registered User
  
 

Join Date: Oct 2007
Location: USA
Posts: 750
I agree with Bakunin. paste can do all that for you with a little help from tr

Code:
paste -s file1 file2 file3 ... | tr '\t' ','
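For the simplified two-column files from the first post, the effect of this command can be sketched as follows. paste -s joins all lines of each file into one tab-separated line (one output line per file), and tr then turns the tabs into commas. The function name serialize_csv and the temporary sample files are made up for illustration.

```shell
#!/bin/bash
# serialize_csv FILE...  (hypothetical helper name, not part of the thread)
# paste -s writes each input file as a single tab-separated line;
# tr then replaces every tab with a comma.
serialize_csv() {
    paste -s "$@" | tr '\t' ','
}

# Sample files modelled on the simplified examples in post #1.
tmp=$(mktemp -d)
printf 'A1\tdata\nB1\tdata\nC1\tdata\n' > "$tmp/file1"
printf 'A2\tdata\nB2\tdata\nC2\tdata\n' > "$tmp/file2"

serialize_csv "$tmp/file1" "$tmp/file2"
# A1,data,B1,data,C1,data
# A2,data,B2,data,C2,data

rm -rf "$tmp"
```

On the real files every line is serialized, header lines and unwanted columns included, so those would have to be stripped before paste sees them.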
  #5 (permalink)  
Old 08-06-2008
Paradoxdruid Paradoxdruid is offline
Registered User
  
 

Join Date: Aug 2008
Posts: 5
Quote:
Originally Posted by shamrock View Post
I agree with Bakunin. paste can do all that for you with a little help from tr
Code:
paste -s file1 file2 file3 ... | tr '\t' ','
I'm not sure I understand how I can use paste to make multiple columns.
When I stripped out header info from a few test files and ran a command like the above, it gave me
Code:
A1(time 1) and all numerical fields
B1(time 1) and all numerical fields
...

A1(time 2) and all numerical fields
B1(time 2) and all numerical fields
...
What I'm looking for is a final output like:
Code:
A1(time 1) and fields 5 and 6,B1(time 1) and fields 5 and 6, C1(time 1) and fields 5 and 6,...
A1(time 2) and fields 5 and 6,B1(time 2) and fields 5 and 6, C1(time 2) and fields 5 and 6,... 
...
where the columns are the data for different names (A1,B1, etc-- separate lines in the original file) and the rows are iterating through each file for the different timepoints.

Thanks again!
  #6 (permalink)  
Old 08-07-2008
summer_cherry summer_cherry is offline Forum Advisor  
Registered User
  
 

Join Date: Jun 2007
Location: Beijing China
Posts: 1,088
The one below should work, but it assumes there is no "|" in your files.

If there is, just replace "|" with another special character that never appears in your files.

Code:
paste -d"|" file1 file2 file3 | tr "|" "\n"
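A small sketch of what this pipeline does, using made-up sample files and a hypothetical function name (interleave_lines): paste -d"|" puts line N of every file on one output line, and tr then breaks that line apart again, so the result is the files interleaved line by line, still in a single column.

```shell
#!/bin/bash
# interleave_lines FILE...  (hypothetical helper name, not part of the thread)
# Line N of each file is joined with "|" and then split back onto its own
# line, which interleaves the files. Assumes "|" never occurs in the data.
interleave_lines() {
    paste -d'|' "$@" | tr '|' '\n'
}

# Sample files modelled on the simplified examples in post #1.
tmp=$(mktemp -d)
printf 'A1\tdata\nB1\tdata\n' > "$tmp/file1"
printf 'A2\tdata\nB2\tdata\n' > "$tmp/file2"

interleave_lines "$tmp/file1" "$tmp/file2"
# Prints four lines (fields stay tab-separated):
# A1<TAB>data
# A2<TAB>data
# B1<TAB>data
# B2<TAB>data

rm -rf "$tmp"
```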
  #7 (permalink)  
Old 08-08-2008
Paradoxdruid Paradoxdruid is offline
Registered User
  
 

Join Date: Aug 2008
Posts: 5
Quote:
Originally Posted by summer_cherry View Post
The one below should work, but it assumes there is no "|" in your files.
If there is, just replace "|" with another special character that never appears in your files.
Code:
paste -d"|" file1 file2 file3 | tr "|" "\n"
I added cut -f1,5,6 to the end of this statement, and it does a great job of recreating my original, long script with a single line-- neat!

However, its output is still single column, like this:
Code:
A1 value value
A2 value value
B1 value value
B2 value value
C1 value value
...
I'm still looking for a way to have multicolumn output, like:
Code:
A1 value value B1 value value C1 value value ...
A2 value value B2 value value C2 value value ...
...
Thank you all for the help along the way-- paste is a great new command to use.
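A possible way to get the multicolumn layout from the same tools, as a sketch under a few assumptions: handle each file separately, drop its header lines, keep fields 1, 5 and 6 with cut, and only then let paste -s fold the remaining lines into one row. The function names row_from_file and wide_csv are hypothetical, and the default of 22 header lines is taken from the sed '1,22d' in the original script; the sample files below use only 2.

```shell
#!/bin/bash
# row_from_file FILE [HEADER_LINES]  (hypothetical helper)
# Drops the header lines, keeps fields 1, 5 and 6, and folds what is left
# into one comma-separated row.
row_from_file() {
    sed "1,${2:-22}d" "$1" | cut -f1,5,6 | paste -s - | tr '\t' ','
}

# wide_csv HEADER_LINES FILE...  (hypothetical helper)
# One output row per file, in the order the files are given.
wide_csv() {
    local skip=$1 f
    shift
    for f in "$@"; do
        row_from_file "$f" "$skip"
    done
}

# Two small sample files modelled on the snippets in post #3 (2 header lines).
tmp=$(mktemp -d)
header='Probe_Name\tCount\tNet_Signal\tNet_Signal_SD\tNet_Integral\tNet_Integral_SD\tProc_Control\n'
printf "Time 6\n${header}A1\t5\t0.04594\t0.01175\t0.81596\t0.23182\tOK\nB1\t3\t0.02464\t0.00381\t0.59647\t0.15367\tOK\n" > "$tmp/t1.stx"
printf "Time 7\n${header}A1\t5\t0.04545\t0.01211\t0.82307\t0.24171\tOK\nB1\t3\t0.02332\t0.00557\t0.56161\t0.10771\tOK\n" > "$tmp/t2.stx"

wide_csv 2 "$tmp/t1.stx" "$tmp/t2.stx"
# A1,0.81596,0.23182,B1,0.59647,0.15367
# A1,0.82307,0.24171,B1,0.56161,0.10771

rm -rf "$tmp"
```

The columns only line up across rows if every file lists the same probes in the same order, and the files need to be passed in time order (a plain glob would sort file 10 before file 2).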