|
Search Forums:
|
|||||||
| Forums | Register | Forum Rules | Linux and Unix Links | Man Pages | Albums | FAQ | Users | Calendar | Search | Today's Posts | Mark Forums Read |
| Shell Programming and Scripting Post questions about KSH, CSH, SH, BASH, PERL, PHP, SED, AWK and OTHER shell scripts and shell scripting languages here. |
|
|
|
Thread Tools | Search this Thread | Display Modes |
|
#1
|
|||
|
|||
|
Help with sed and replacing white spaces with commas
Dear all, I am in a bit of a quandary. I have 400 text files which I need to edit and output in a very specific way. Here is a sample text file copied from gedit ... The columns will come out a bit messed up but when I cat <file>, it gives a table with six columns (0-28, tot lob vol, vcsf, scsf, gray, white): Code:
tot lob vol Vcsf Scsf gray white 0 0.000000 0.000000 0.000000 0.000000 1 0.000000 0.000000 0.000000 0.000000 2 0.000000 0.000000 0.000000 0.000000 3 26355.514477 0.000000 7986.582205 13035.582840 5333.349432 4 61027.455763 0.000000 7617.445767 26799.098600 26610.911396 5 16966.834205 0.000000 2342.000090 9508.623764 5116.210351 6 16675.247439 0.000000 2196.206707 9037.121760 5441.918973 7 57706.261817 0.000000 16951.324271 24216.177530 16538.760016 8 91215.992021 2102.113105 7351.708891 41168.535796 40593.634229 9 56586.444555 0.000000 6690.985687 29733.578183 20161.880685 10 24258.571350 0.000000 4461.691120 13908.275147 5888.605082 11 99730.945990 2593.261027 10288.256465 52107.175509 34742.252989 12 8293.678905 967.819905 55.835764 4209.396188 3060.627049 13 13760.413773 2139.336947 21.713908 6338.393179 5260.969738 14 25516.944025 0.000000 7529.556139 10519.871485 7467.516401 15 30830.647540 451.856088 3489.735233 14415.966999 12473.089220 16 26945.925979 0.000000 7874.910677 12887.721466 6183.293835 17 65250.293894 0.000000 7736.355264 29207.274410 28306.664221 18 17500.375947 0.000000 2880.711810 9932.561970 4687.102167 19 17848.832473 3.101987 2632.552860 9684.403021 5528.774605 20 51296.522939 0.000000 14967.086667 21813.171698 14516.264575 21 92378.203103 3180.570541 8504.614013 40641.198027 40051.820521 22 58887.084820 14.475939 8116.865654 29876.269579 20879.473649 23 25229.493242 0.000000 3892.993527 14250.527699 7085.972016 24 90686.586261 1761.928544 10282.052492 48120.088380 30522.516844 25 10664.630873 1746.418610 201.629147 5669.398010 3047.185106 26 12702.636249 2146.574917 26.883886 5388.151200 5141.026246 27 30967.134963 0.000000 9628.567257 13284.775786 8053.791920 28 34047.407929 150.963361 4818.419611 15476.846510 13601.178447 If you've read that far, I thank you already. Basically, I need to do the following: 1. Delete the first four rows 2. Delete the first column (numbers 1-28) 3. Paste each row to the end of the previous row, so one file comes out as one super long line 4. Comma-delimit all numbers 5. Add new first field at the beginning of the line containing the file name 6. Output all of this to a new text file which will take each of the 400 or so text files I have as above, convert them each to their own line consisting of steps 1-5. So, an example output file from just the above file would be: Code:
file1.txt, 26355.514477, 0.000000, 7986.582205, 13035.582840, 5333.349432, ..., 34047.407929, 150.963361, 4818.419611, 15476.846510, 13601.178447 <carriage return> file2.txt, etc.... I know zippo about programming but pasting together various snippets I've found online have come up with the following which does not seem to working exactly right ... my output is joining some numbers, particularly those from the end of one line and the beginning of another. Code:
#!/bin/sh
#script file placed in the directory above that which contains all the text files
#script run by typing ../script.sh from within the directory containing the files
files=`ls -l *_stats.txt`
# all files end in _stats.txt
for file in ${files}
do
cat ${file} | sed '1,4d'| awk '{print $2, $3, $4, $5, $6}' | sed 's/[:space:]+/,/g' >> output.txt
#read file, delete lines 1-4, print all but the first column, delete all whitespace #between numbers including carriage returns and put them all on one line with #comma-delimiters; then new line with the output of the next file
doneAny help would be most appreciated. It is sad, but I have spent HOURS trying to figure this out and I am pretty sure it would be ridiculously simple for anyone with basic bash scripting knowledge. Please help! Best wishes, Anthonz Last edited by Franklin52; 03-13-2010 at 09:41 AM.. Reason: Please use code tags! |
| Sponsored Links | |
|
|
|
#2
|
||||
|
||||
|
For removing the first 4 rows you can use the sed command as follows Code:
sed '1,4d' input_file For removing the first field which contains the numbers (0-28) you can use the sed command as follows Code:
sed 's/\(^[ ]*[0-9]\{1,2\}\)//g'Quote:
Heres a sed solution for this Code:
sed '$!N;s/\n/ /' Quote:
Code:
tr -s ' ' | sed 's/ /,/g' Here why I am using tr -s means it contains repeated space characters. I am squeezing the repeated characters into single space and I am converting the space to comma character. Last edited by thillai_selvan; 03-13-2010 at 04:07 AM.. |
| Sponsored Links | ||
|
|
|
#3
|
||||
|
||||
|
Use this , Code:
files=`ls *_stats.txt`
for file in ${files}
do
content=`cat ${file} | sed '1,4d'| sed -r "s/^[0-9]+\s(.+$)/\1/g" | sed -r "s/\s+/,/g" | tr '\n' '\0'`
echo "$file,$content" >> output.txt
done |
|
#4
|
|||
|
|||
|
Just modifying your code a little bit... Code:
files=$(ls -1 *_stats.txt)
for file in ${files}
do
cat ${file} | sed '1,4d' | awk '{printf ("'${file}'"==pfile?","$2","$3","$4","$5","$6:"\n""'${file}'"","$2","$3","$4","$5","$6)}{pfile="'${file}'"}' >> output.txt
doneLast edited by malcomex999; 03-13-2010 at 05:32 AM.. |
| Sponsored Links | |
|
|
#5
|
|||
|
|||
|
Many thanks for your thoughtful replies. I will give them a try after some sleep and post the results.
cheers |
| Sponsored Links | |
|
|
#6
|
|||
|
|||
|
Hi, antonz: Welcome to the forums. The following generates output identical to your sample desired output. Code:
for f in *_stats.txt; do
printf '%s, ' "$f";
sed '1,4d' "$f" | cut -d' ' -f2- | paste -sd' ' - | sed 's/ /, /g'
done > output.txtor Code:
for f in *_stats.txt; do
printf '%s, ' "$f";
sed -n '1,4d;s/[^ ]* //;H;5h;${x;s/[ \n]/, /gp;}' "$f"
done > output.txtSafely prepending the filename to each line using sed is problematic, hence the lingering printf. Cheers, Alister Last edited by alister; 03-13-2010 at 11:12 AM.. |
| Sponsored Links | |
|
|
#7
|
|||
|
|||
|
Another way: Code:
awk 'FNR < 5{next}
FNR==5{printf("%s%s", f?"\n":"", FILENAME); f=1}
{$1="";printf $0}
END{printf "\n"}
' ORS=", " OFS=", " *_stats.txt > output.txtRegards |
| Sponsored Links | ||
|
|
![]() |
| Tags |
| awk, regex, scripting, sed |
| Thread Tools | Search this Thread |
| Display Modes | |
More UNIX and Linux Forum Topics You Might Find Helpful
|
||||
| Thread | Thread Starter | Forum | Replies | Last Post |
| parsing log files, removing spaces and replace with commas | caddyjoe77 | Shell Programming and Scripting | 6 | 06-09-2009 10:02 AM |
| Inserting commas and replacing backslashes with commas | kangaroo | UNIX for Dummies Questions & Answers | 3 | 11-11-2008 08:32 PM |
| How do i remove commas(,) & spaces | anushree.a | Linux | 5 | 05-26-2008 02:39 AM |
| replacing commas with tilde in csv file. | femig | Shell Programming and Scripting | 4 | 11-13-2006 11:41 AM |
| Removing empty spaces and adding commas | jazz | Programming | 4 | 11-13-2006 08:54 AM |
|
|