Help with gawk array, loop in tcsh script


 
# 1  
03-21-2011

Hi, I'm trying to break a large CSV file into smaller files, using the unique values in the first field as the file names. My shell script is written in tcsh, and I'm after a gawk one-liner to get the desired outcome. To keep things simple, here is an example with the desired output.

fruitlist.csv
apples 20 40
pears 22 45
grapes 24 50
plums 26 55
peaches 28 60
apples 30 65
pears 32 70
grapes 34 75
plums 36 80
peaches 38 85
apples 40 90
pears 42 95
grapes 44 100
plums 46 105
peaches 48 110

uniquefruit.csv
apples
pears
grapes
plums
peaches

What I want is for gawk to print the rows containing apples in fruitlist.csv and redirect them to a file called apples.csv. Similarly, all rows containing pears should be redirected to pears.csv, and so on. Note: each time I generate a fruitlist.csv file, the unique values will be different. For this reason I generate a uniquefruit.csv file and have tried using it within a loop as follows:

Code:
 
#!/bin/tcsh -f
set ugs = `cat uniquefruit.csv`
foreach ug ($ugs)
  gawk -v fruit=$ug '{if(fruit ~ $1); print $0 >> fruit".csv"}' fruitlist.csv
end

I also tried the code:
Code:
 
gawk -F"," '{while ((getline fruit < uniquefruit.csv) > 0); {if(fruit ~ $1) print $0 >> fruit".csv"}' fruitlist.csv

In both cases I got the same result. The desired file names were generated OK. However, all the rows within fruitlist.csv were appended 5 times to each of the breakdown files (apples.csv, pears.csv, etc.). Something appears to be going wrong when I assign the strings in uniquefruit.csv to the variable fruit.
Any help would be greatly appreciated,
Regards, theflamingmoe.

Last edited by theflamingmoe; 03-21-2011 at 01:44 PM.. Reason: Forgot to add table tags!!!
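
Note on the first attempt: in `{if(fruit ~ $1); print $0 >> fruit".csv"}` the semicolon directly after `if(...)` ends the `if` with an empty statement, so the `print` that follows runs for every row, whatever the test says. The second attempt has the same stray semicolon after `while (...)`. The sketch below isolates that effect. It assumes a POSIX shell and plain `awk` instead of tcsh and gawk, uses a shortened copy of the sample data, and the two `.txt` file names are made up for the demonstration.

```shell
# Sketch only: POSIX sh and plain awk are assumed in place of tcsh and gawk.
# Run it in an empty scratch directory; the data is a shortened sample.
printf '%s\n' 'apples 20 40' 'pears 22 45' 'apples 30 65' 'pears 32 70' > fruitlist.csv

# Original form: the ';' ends the if, so print runs for all four rows.
awk -v fruit=apples '{ if (fruit ~ $1); print $0 }' fruitlist.csv > with_semicolon.txt

# Semicolon removed: print belongs to the if and runs for the two apples rows only.
awk -v fruit=apples '{ if (fruit ~ $1) print $0 }' fruitlist.csv > without_semicolon.txt

# Show the row counts of both result files.
wc -l with_semicolon.txt without_semicolon.txt
```

The awk program itself is single-quoted, so it can be pasted unchanged into the tcsh `foreach` loop.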
# 2  
03-21-2011
How about this?
Code:
awk '{print >$1".csv"}' fruitlist.csv
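
Note on `>` versus `>>` inside awk, since both appear in this thread: unlike the shell's `>`, awk's `print > file` does not start the file afresh for every record. The file is emptied when it is first opened during a run, and later prints to the same name in that run are added to it. `>>` differs only in keeping whatever the file already held before the run. A minimal sketch, assuming a POSIX shell and plain `awk`, with a shortened sample; the parentheses around the file name expression are added because unparenthesized concatenation after `>` is not portable across awk implementations.

```shell
# Sketch only: POSIX sh and plain awk assumed; run in an empty scratch directory.
printf '%s\n' 'apples 20 40' 'pears 22 45' 'apples 30 65' > fruitlist.csv
rm -f apples.csv pears.csv

# With '>', each run starts the output files afresh, so two runs still leave 2 apples rows.
awk '{ print > ($1 ".csv") }' fruitlist.csv
awk '{ print > ($1 ".csv") }' fruitlist.csv
grep -c . apples.csv

# With '>>', a run adds to what is already there: 2 existing + 2 new = 4 apples rows.
awk '{ print >> ($1 ".csv") }' fruitlist.csv
grep -c . apples.csv
```

In practice this means a script using `>>` should remove old output files first if it may be run more than once.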

# 3  
03-22-2011
Thanks for the reply yinyuemi. I will try it at work tomorrow!!!
Regards,
Theflamingmoe

---------- Post updated 22-03-11 at 10:25 AM ---------- Previous update was 21-03-11 at 07:34 PM ----------

It still doesn't work. I ended up with 16 csv files, but I only want 5: apples.csv, pears.csv, grapes.csv, plums.csv and peaches.csv. Any rows that contain apples, i.e. (apples, 20, 40), (apples, 30, 65) and (apples, 40, 90), should be redirected to apples.csv, and likewise for each other value unique to field 1 ($1). Do I need a BEGIN pattern before gawk processes the file?

Cheers,
Theflamingmoe

---------- Post updated at 10:52 AM ---------- Previous update was at 10:25 AM ----------

For anyone interested I got it to work with the following code:

Code:
 
set ugs = `cat uniquefruit.csv`
foreach ug ($ugs)
gawk -v fruit=$ug '$1 ~ fruit {print $0 >> fruit".csv"}' fruitlist.csv 
end

Cheers,
Theflamingmoe
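
Note on the pattern `$1 ~ fruit`: `~` is a regular-expression match, so a key also selects any row whose first field merely contains it. With the sample data this makes no difference, but a key such as `apples` would also pick up a row starting with `pineapples` (a made-up row, not in the original data). `$1 == fruit` compares the whole field. A minimal sketch, assuming a POSIX shell and plain `awk`; the `.txt` file names are only for the demonstration.

```shell
# Sketch only: POSIX sh and plain awk assumed; the pineapples row is hypothetical.
printf '%s\n' 'apples 20 40' 'pineapples 21 41' 'apples 30 65' > fruitlist.csv

# Regex match: "pineapples" contains "apples", so all 3 rows are selected.
awk -v fruit=apples '$1 ~ fruit' fruitlist.csv > regex_match.txt

# String equality: only the 2 rows whose first field is exactly "apples".
awk -v fruit=apples '$1 == fruit' fruitlist.csv > exact_match.txt

# Print the number of rows in each result file.
grep -c . regex_match.txt exact_match.txt
```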
# 4  
03-22-2011
Hi Theflamingmoe,

The following are my test results. Are these what you want?

Best,

Y

Code:
awk '{print >$1".csv"}' fruitlist.csv

ls *.csv
apples.csv  fruitlist.csv  grapes.csv  peaches.csv  pears.csv  plums.csv

cat apples.csv
apples 20 40
apples 30 65
apples 40 90

cat grapes.csv
grapes 24 50
grapes 34 75
grapes 44 100

# 5  
03-22-2011
Hi, Theflamingmoe.

Your solution will pass over the file once for each unique key you have. For a short file that's likely not a problem. For a large file, it will be time-consuming.

The idea of yinyuemi worked for me. I used a slightly different filename scheme because I like suffixes to be created with the key:
Code:
gawk '{print  >> "body."$1}' input-filename

and it produced:
Code:
body.apples  body.grapes  body.peaches	body.pears  body.plums

with a sample from the grapes file being:
Code:
grapes 24 50
grapes 34 75
grapes 44 100

I used:
Code:
gawk GNU Awk 3.1.5

Best wishes ... cheers, drl
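
Note on the single-pass approach with many keys: every distinct key keeps one output file open until awk exits, and some awk implementations may run out of file descriptors when there are very many keys (gawk is documented to work around this by itself). A hedged sketch of one way to sidestep the limit is to close each file right after writing to it, at the cost of reopening files repeatedly. It assumes a POSIX shell and plain `awk`, a shortened sample, and the `.csv` naming from the earlier posts.

```shell
# Sketch only: POSIX sh and plain awk assumed; run in an empty scratch directory.
printf '%s\n' 'apples 20 40' 'pears 22 45' 'grapes 24 50' 'apples 30 65' > fruitlist.csv
rm -f apples.csv pears.csv grapes.csv   # '>>' would otherwise add to old output

# One pass over the input. Each output file is closed right after the write,
# so only one is open at a time; '>>' is needed because the file gets reopened.
awk '{ out = $1 ".csv"; print >> out; close(out) }' fruitlist.csv

cat apples.csv
```

If the input is sorted by key, closing only when the key changes would avoid most of the reopening.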
# 6  
03-22-2011
Quote:
Originally Posted by yinyuemi
Code:
awk '{print >$1".csv"}' fruitlist.csv


I think that because it is a .csv file, you need the modifications below:

Code:
awk -F"," '{print >> $1".csv"}' OFS="," fruitlist.csv

# 7  
03-22-2011
Thanks for the replies, guys. When I first tried yinyuemi's code using gawk rather than awk, I got 16 different csv files. Is this because I needed to set both the OFS variable and the field separator to a comma, as suggested by Ahmad.diab? Thanks again for the comments, it's been a great help :)
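
Note on the 16 files: one likely explanation, assuming the real file is comma-separated with no spaces (the sample in the first post shows spaces, so this is a guess), is that with awk's default whitespace separator the whole line becomes `$1`. Every distinct row then gets its own output file: 15 distinct rows plus fruitlist.csv itself would give 16 .csv files. Only the field separator matters here; `OFS` has no effect because `print` writes the unmodified record. The sketch below assumes a POSIX shell and plain `awk`, and the two directory names are made up to keep the outputs apart.

```shell
# Sketch only: POSIX sh and plain awk assumed; comma-separated rows are a guess
# at the real data, and the directory names are hypothetical.
printf '%s\n' 'apples,20,40' 'pears,22,45' 'apples,30,65' > fruitlist.csv
mkdir -p default_fs comma_fs
rm -f default_fs/* comma_fs/*

# Default separator: $1 is the whole line, so 3 distinct rows give 3 files.
awk '{ print > ("default_fs/" $1 ".csv") }' fruitlist.csv

# Comma separator: $1 is the fruit name, so 2 fruits give 2 files.
awk -F, '{ print > ("comma_fs/" $1 ".csv") }' fruitlist.csv

ls default_fs comma_fs
```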