Read multiple text files and copy data to csv


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Read multiple text files and copy data to csv
# 22  
Old 04-01-2016
i ran this
Code:
awk -F= -vOFS="," '
NR == 1         {HD = "Filename=IP address=PC name=destination=UTC begin=UTC end=destination IP address=duration (ms)=local begin=local end=" \
                        "source IP address=actual extension=original called party=original calling party"
                 for (N=i=split (HD, T); i>0; i--) SRCH[T[i]]
                 for (i=1; i<N; i++) printf "%s%s", T[i], OFS
                 print T[N]
                 FN = FILENAME
                }
FNR==1 && L     {printf "%s%s", FN, OFS
                 for (i=1; i<N; i++) printf "%s%s", PR[T[i]], OFS
                 print PR[T[N]]
                 FN = FILENAME
                 delete PR
                }

$1 in SRCH      {PR[$1]=$NF
                 L=1
                }
END             {printf "%s%s", FN, OFS
                 for (i=1; i<N; i++) printf "%s%s", PR[T[i]], OFS
                 print PR[T[N]]
                }
' *.ipfx_audio

and got this

Code:
Filename,IP address,PC name,destination,UTC begin,UTC end,destination IP address,duration (ms),local begin,local end,source IP address,actual extension,original called party,original calling party
20090130T165309.474091.ipfx_audio,,10.11.40.61,D01309,,20090130T165309.474090,20090130T170003.303469,10.11.41.151,413829,20090130T165309.474091,20090130T170003.303469,10.11.0.241,,,
20090130T165558.886637.ipfx_audio,,10.11.40.67,D01332,,20090130T165558.886637,20090130T165839.247671,10.11.41.136,160361,20090130T165558.886637,20090130T165839.247671,10.11.0.240,,,
20090130T170010.074146.ipfx_audio,,10.11.40.61,D01309,,20090130T170010.074146,20090130T170019.325071,10.11.41.151,9250,20090130T170010.074146,20090130T170019.325071,10.11.41.97,,,

i would expect this (only one , between audio and 10
Code:
Filename,IP address,PC name,destination,UTC begin,UTC  end,destination IP address,duration (ms),local begin,local end,source IP  address,actual extension,original called party,original calling party
20090130T165309.474091.ipfx_audio,10.11.40.61,D01309,,20090130T165309.474090,20090130T170003.303469,10.11.41.151,413829,20090130T165309.474091,20090130T170003.303469,10.11.0.241,,,
20090130T165558.886637.ipfx_audio,10.11.40.67,D01332,,20090130T165558.886637,20090130T165839.247671,10.11.41.136,160361,20090130T165558.886637,20090130T165839.247671,10.11.0.240,,,
20090130T170010.074146.ipfx_audio,10.11.40.61,D01309,,20090130T170010.074146,20090130T170019.325071,10.11.41.151,9250,20090130T170010.074146,20090130T170019.325071,10.11.41.97,,,

---------- Post updated at 03:41 PM ---------- Previous update was at 08:21 AM ----------

hi

i think i managed to remove this extra comma but if you could check it ?

i removed one
Code:
%s

Code:
END             {printf "%s", FN, OFS

i ran this

Code:
awk -F= -vOFS="," '
NR == 1         {HD = "Filename=IP address=PC name=destination=UTC begin=UTC end=destination IP address=duration (ms)=local begin=local end=" \
                        "source IP address=actual extension=original called party=original calling party"
                 for (N=i=split (HD, T); i>0; i--) SRCH[T[i]]
                 for (i=1; i<N; i++) printf "%s%s", T[i], OFS
                 print T[N]
                 FN = FILENAME
                }
FNR==1 && L     {printf "%s%s", FN, OFS
                 for (i=1; i<N; i++) printf "%s%s", PR[T[i]], OFS
                 print PR[T[N]]
                 delete PR
                }

$1 in SRCH      {PR[$1]=$NF
                 L=1
                }
END             {printf "%s", FN, OFS
                 for (i=1; i<N; i++) printf "%s%s", PR[T[i]], OFS
                 print PR[T[N]]
                }
' *.ipfx_audio

and it gives me and it looks pretty in a csv
Code:
Filename,IP address,PC name,destination,UTC begin,UTC end,destination IP address,duration (ms),local begin,local end,source IP address,actual extension,original called party,original calling party
test.ipfx_audio,10.15.12.23,d06613,test.ipfx,160544,160555,15.25.45.26,5555,51456,4582,15.25.45.63,8127766,0779585752,78225822

did i break anything?

thanks and have a great WE
# 23  
Old 04-02-2016
I think you want
Code:
END             {printf "%s", FN

One %s for the FN variable.
And omit the OFS variable (that is assigned the comma at the very beginning).
Each %s should have a corresponding argument in the printf function.
This User Gave Thanks to MadeInGermany For This Post:
# 24  
Old 04-02-2016
hi,

i still need
Code:
-vOFS=","

as i want some comma :-)

but tidy up like this

Code:
awk -F= -vOFS="," '
NR == 1         {HD = "Filename=IP address=PC name=destination=UTC begin=UTC end=destination IP address=duration (ms)=local begin=local end=" \
                        "source IP address=actual extension=original called party=original calling party"
                 for (N=i=split (HD, T); i>0; i--) SRCH[T[i]]
                 for (i=1; i<N; i++) printf "%s%s", T[i], OFS
                 print T[N]
                 FN = FILENAME
                }
FNR==1 && L     {printf "%s", FN
                 for (i=1; i<N; i++) printf "%s%s", PR[T[i]], OFS
                 print PR[T[N]]
                 delete PR
                }

$1 in SRCH      {PR[$1]=$NF
                 L=1
                }
END             {printf "%s", FN
                 for (i=1; i<N; i++) printf "%s%s", PR[T[i]], OFS
                 print PR[T[N]]
                }
' *.ipfx_audio

is it all good?

thanks
# 25  
Old 04-04-2016
Yes looks good now.
This User Gave Thanks to MadeInGermany For This Post:
# 26  
Old 04-04-2016
nearly there. now i m doing this

Code:
find * -type f -name "*.ipfx_audio" -exec awk -F= -vOFS="," '
NR == 1         {HD = "Filename=IP address=PC name=destination=UTC begin=UTC end=destination IP address=duration (ms)=local begin=local end=" \
                        "source IP address=actual extension=original called party=original calling party"
                 for (N=i=split (HD, T); i>0; i--) SRCH[T[i]]
                 for (i=1; i<N; i++) printf "%s%s", T[i], OFS
                 print T[N]
                 FN = FILENAME
                }
FNR==1 && L     {printf "%s", FN
                 for (i=1; i<N; i++) printf "%s%s", PR[T[i]], OFS
                 print PR[T[N]]
                 delete PR
                }

$1 in SRCH      {PR[$1]=$NF
                 L=1
                }
END             {printf "%s", FN
                 for (i=1; i<N; i++) printf "%s%s", PR[T[i]], OFS
                 print PR[T[N]]
                }
' {} \;

but the result is

Code:
Filename,IP address,PC name,destination,UTC begin,UTC end,destination IP address,duration (ms),local begin,local end,source IP address,actual extension,original called party,original calling party
1/20090130T165309.474091.ipfx_audio,10.11.40.61,D01309,,20090130T165309.474090,20090130T170003.303469,10.11.41.151,413829,20090130T165309.474091,20090130T170003.303469,10.11.0.241,,,
Filename,IP address,PC name,destination,UTC begin,UTC end,destination IP address,duration (ms),local begin,local end,source IP address,actual extension,original called party,original calling party
1/20090130T165558.886637.ipfx_audio,10.11.40.67,D01332,,20090130T165558.886637,20090130T165839.247671,10.11.41.136,160361,20090130T165558.886637,20090130T165839.247671,10.11.0.240,,,
Filename,IP address,PC name,destination,UTC begin,UTC end,destination IP address,duration (ms),local begin,local end,source IP address,actual extension,original called party,original calling party
2/20090130T170010.074146.ipfx_audio,10.11.40.61,D01309,,20090130T170010.074146,20090130T170019.325071,10.11.41.151,9250,20090130T170010.074146,20090130T170019.325071,10.11.41.97,,,

how can I just have one
Code:
Filename,IP address,PC name,destination,UTC begin,UTC end,destination IP  address,duration (ms),local begin,local end,source IP address,actual  extension,original called party,original calling party

thanks
# 27  
Old 04-04-2016
For a relatively small number of files (as in your example here), just change the last line of the script from:
Code:
' {} \;

to:
Code:
' {} +

If there are a lot of files in the directory and/or there are a lot of files to be processed, you could try something more like (untested):
Code:
echo "Filename,IP address,PC name,destination,UTC begin,UTC end,destination IP address,duration (ms),local begin,local end,source IP address,actual extension,original called party,original calling party"
find . -type f -name "*.ipfx_audio" -exec awk -F= -vOFS="," '
NR == 1         {HD = "Filename=IP address=PC name=destination=UTC begin=UTC end=destination IP address=duration (ms)=local begin=local end=" \
                        "source IP address=actual extension=original called party=original calling party"
                 for (N=i=split (HD, T); i>0; i--) SRCH[T[i]]
                 # Remove two lines here that printed the header.
                 FN = substr(FILENAME, 3)	# Strip leading "./".
                }
FNR==1 && L     {printf "%s", FN
                 for (i=1; i<N; i++) printf "%s%s", PR[T[i]], OFS
                 print PR[T[N]]
                 delete PR
                }

$1 in SRCH      {PR[$1]=$NF
                 L=1
                }
END             {printf "%s", FN
                 for (i=1; i<N; i++) printf "%s%s", PR[T[i]], OFS
                 print PR[T[N]]
                }
' {} +

Either of the above suggestions should work and, if there is more than one file to process, run faster than your current script.
# 28  
Old 04-04-2016
i tried like this

Code:
find * -type f -name "*.ipfx_audio" -exec awk -F= -vOFS="," '
NR == 1         {HD = "Filename=IP address=PC name=destination=UTC begin=UTC end=destination IP address=duration (ms)=local begin=local end=" \
                        "source IP address=actual extension=original called party=original calling party"
                 for (N=i=split (HD, T); i>0; i--) SRCH[T[i]]
                 for (i=1; i<N; i++) printf "%s%s", T[i], OFS
                 print T[N]
                 FN = FILENAME
                }
FNR==1 && L     {printf "%s", FN
                 for (i=1; i<N; i++) printf "%s%s", PR[T[i]], OFS
                 print PR[T[N]]
                 delete PR
                }

$1 in SRCH      {PR[$1]=$NF
                 L=1
                }
END             {printf "%s", FN
                 for (i=1; i<N; i++) printf "%s%s", PR[T[i]], OFS
                 print PR[T[N]]
                }
' {} +

but i get this

Code:
Filename,IP address,PC name,destination,UTC begin,UTC end,destination IP address,duration (ms),local begin,local end,source IP address,actual extension,original called party,original calling party
1/20090130T165309.474091.ipfx_audio,10.11.40.61,D01309,,20090130T165309.474090,20090130T170003.303469,10.11.41.151,413829,20090130T165309.474091,20090130T170003.303469,10.11.0.241,,,
1/20090130T165309.474091.ipfx_audio,10.11.40.67,D01332,,20090130T165558.886637,20090130T165839.247671,10.11.41.136,160361,20090130T165558.886637,20090130T165839.247671,10.11.0.240,,,
1/20090130T165309.474091.ipfx_audio,10.11.40.61,D01309,,20090130T170010.074146,20090130T170019.325071,10.11.41.151,9250,20090130T170010.074146,20090130T170019.325071,10.11.41.97,,,

my structure is

Code:
$ ls -la
total 12
drwxrwx---+ 1 Administrators Domain Users     0 Apr  4 20:17 .
drwxrwx---+ 1 Administrators Schema Admins    0 Apr  1 11:05 ..
drwxrwx---+ 1 Administrators Domain Users     0 Apr  4 17:11 1
drwxrwx---+ 1 Administrators Domain Users     0 Apr  4 17:11 2
drwxrwx---+ 1 Administrators Domain Users     0 Apr  4 17:11 3

so i would expect some result for folder 2 and 3 like this

Code:
$ find * -type f -name "*.ipfx_audio"
1/20090130T165309.474091.ipfx_audio
1/20090130T165558.886637.ipfx_audio
2/20090130T170010.074146.ipfx_audio
3/20090130T170013.413480.ipfx_audio

Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. UNIX for Beginners Questions & Answers

Generate files and use csv data to replace multiple variables in a template

I have a source csv file consists of first field as variable name, and the rest are site-specific information (converted from excel file, where site -specific values in columns). I am trying to create a file for every site using a template and replace the multiple variables with values from the... (3 Replies)
Discussion started by: apalex
3 Replies

2. UNIX for Beginners Questions & Answers

Copy data at specified location from multiple files

Hello everyone, Im super new to coding but increasingly in need of it at work. Im have task stacked because of this problems, that I cannot figure out how to solve looking on the internet after trying many many things that looked similar to me. I have multiple data files of the form (see below).... (2 Replies)
Discussion started by: Xfiles_fan
2 Replies

3. Shell Programming and Scripting

In PErl script: need to read the data one file and generate multiple files based on the data

We have the data looks like below in a log file. I want to generat files based on the string between two hash(#) symbol like below Source: #ext1#test1.tale2 drop #ext1#test11.tale21 drop #ext1#test123.tale21 drop #ext2#test1.tale21 drop #ext2#test12.tale21 drop #ext3#test11.tale21 drop... (5 Replies)
Discussion started by: Sanjeev G
5 Replies

4. Shell Programming and Scripting

Read csv file, convert the data and make one text file in UNIX shell scripting

I have input data looks like this which is a part of a csv file 7,1265,76548,"0102:04" 8,1266,76545,"0112:04" I need to make the output data should look like this and the output data will be part of text file: 7|1265000 |7654899 |A| 8|12660000 |76545999 |B| The logic behind the... (6 Replies)
Discussion started by: RJG
6 Replies

5. UNIX for Beginners Questions & Answers

awk GSUB read field values from multiple text files

My program run without error. The problem I am having. The program isn't outputting field values with the column headers to file.txt. Each of the column headers in file.txt has no data. MEMSIZE SECOND SASFoundation Filename The output results in file.txt should show: ... (1 Reply)
Discussion started by: dellanicholson
1 Replies

6. Shell Programming and Scripting

Extracting data from specific rows and columns from multiple csv files

I have a series of csv files in the following format eg file1 Experiment Name,XYZ_07/28/15, Specimen Name,Specimen_001, Tube Name, Control, Record Date,7/28/2015 14:50, $OP,XYZYZ, GUID,abc, Population,#Events,%Parent All Events,10500, P1,10071,95.9 Early Apoptosis,1113,11.1 Late... (6 Replies)
Discussion started by: pawannoel
6 Replies

7. UNIX for Dummies Questions & Answers

Need help combining txt files w/ multiple lines into csv single cell - also need data merge

:confused:Hello -- i just joined the forums. I am a complete noob -- only about 1 week into learning how to program anything... and starting with linux. I am working in Linux terminal. I have a folder with a bunch of txt files. Each file has several lines of html code. I want to combine... (2 Replies)
Discussion started by: jetsetter
2 Replies

8. Shell Programming and Scripting

Read multiple files, parse data and append to a file

Hi..Can anyone suggest a simple way of achieving this. I have several files which ends with extension .vcf . I will give example with two files In the below files, we are interested in File 1: 38 107 C 3 T 6 C/T 38 241 C 4 T 5 C/T 38 247 T 4 C 5 T/C 38 259 T 3 C 6 T/C... (8 Replies)
Discussion started by: empyrean
8 Replies

9. Shell Programming and Scripting

Combine Multiple text or csv files column-wise

Hi All I am trying to combine columns from multiple text files into a single file using paste command but the record length being unequal in the different files the data is running over to the closest empty cell on the left. Please see below. What can i do to resolve this ? File 1 File... (15 Replies)
Discussion started by: venky_ibm
15 Replies

10. Shell Programming and Scripting

Read the data from multiple files and sum the value

Hi all, I have a requirement where i have to read multiple files using Shell Script in Korn Shell. each file will have the 3rd line as the amount field, i have to read this amount field and sum it for all the files. any idea on how to achieve this?? (i think i can achieve it using a loop,... (9 Replies)
Discussion started by: nvuradi
9 Replies
Login or Register to Ask a Question