Help with File processing


 
# 1  
Old 10-10-2011

I have a file with 3 columns, ID1, ID2, Text:
Code:
ID1; ID2; Text
1; X; aa
1; X; bb
1; Y; cc
1; Y; dd
2; X; ee
2; X; ff
2; Y; gg
2; Y; hh
2; Z; ii
2; Z; jj



I need the output as:
Code:
ID; X; Y; Z
1; aa, bb; cc, dd; 
2; ee, ff; gg, hh; ii, jj



That is, group the rows by ID1, make one column per ID2 value, and concatenate the Text values that fall into each cell.
# 2  
Old 10-10-2011
Code:
awk -F\; 'END {
  print idc, id2c
  for (i = 0; ++i <= i1;) {
    printf "%s", id1[i] OFS    
    for (j = 0; ++j <= i2;)
      printf "%s", (key[id1[i], id2[j]] (j < i2 ? OFS : RS))
      }
  }
NR == 1 { idc = $1; next }  
{
  key[$1, $2] = key[$1, $2] ? key[$1, $2] sep $3 : $3
  _id1[$1]++ || id1[++i1] = $1 
  if (!_id2[$2]++) {
    id2[++i2] = $2
    id2c = id2c ? id2c OFS $2 : $2
    }    
  }' sep=, OFS=\; infile

If you have GNU awk, the code could be shorter.
# 3  
Old 10-10-2011

Error:

Code:
$  awk -F\; 'END {
>   print idc, id2c
>   for (i = 0; ++i <= i1;) {
>     printf "%s", id1[i] OFS
>     for (j = 0; ++j <= i2;)
>       printf "%s", (key[id1[i], id2[j]] (j < i2 ? OFS : RS))
>       }
>   }
> NR == 1 { idc = $1; next }
> {
>   key[$1, $2] = key[$1, $2] ? key[$1, $2] sep $3 : $3
>   _id1[$1]++ || id1[++i1] = $1
>   if (!_id2[$2]++) {
>     id2[++i2] = $2
>     id2c = id2c ? id2c OFS $2 : $2
>     }
>   }' sep=, OFS=\; test.file

awk: syntax error near line 6
awk: illegal statement near line 6
awk: syntax error near line 9
awk: bailing out near line 9

 $

# 4  
Old 10-10-2011
Try nawk, instead of awk:

Code:
nawk -F\; 'END { ...
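
The error messages in the previous post look like the ones printed by the legacy `awk` that some systems still ship as the default, next to a newer `nawk`. The thread does not say which system is in use, so treat that as an assumption. A minimal sketch of choosing an implementation at run time; the function name `pick_awk` and the preference order are illustrative, not from the thread:

```shell
# Print the name of the first awk implementation found on PATH.
# The order (nawk, gawk, awk) is an assumption: prefer the newer
# variants and fall back to plain awk only if nothing else exists.
pick_awk() {
  for candidate in nawk gawk awk; do
    if command -v "$candidate" >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1   # no awk implementation found
}
```

It could be used as `AWK=$(pick_awk)` followed by `"$AWK" -F\; '...' infile`.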

# 5  
Old 10-10-2011
Quote:
Originally Posted by radoulov
Try nawk, instead of awk:

Code:
nawk -F\; 'END { ...

It worked, but the output is not exactly what I need.
Output:
Code:
$  nawk -F\; 'END {
>   print idc, id2c
>   for (i = 0; ++i <= i1;) {
>     printf "%s", id1[i] OFS
>     for (j = 0; ++j <= i2;)
>       printf "%s", (key[id1[i], id2[j]] (j < i2 ? OFS : RS))
>       }
>   }
> NR == 1 { idc = $1; next }
> {
>   key[$1, $2] = key[$1, $2] ? key[$1, $2] sep $3 : $3
>   _id1[$1]++ || id1[++i1] = $1
>   if (!_id2[$2]++) {
>     id2[++i2] = $2
>     id2c = id2c ? id2c OFS $2 : $2
>     }
>   }' sep=, OFS=\; test.file
ID1; X; Y; Z;
1; aa, bb; cc, dd;;
2; ee, ff; gg, hh; ii, jj;
;;;;
 
$

The last line contains only ;;;;
How do I get rid of it?
# 6  
Old 10-10-2011
Do you have an empty line at the end of the file? Try this:

Code:
awk -F\; 'END {
  print idc, id2c
  for (i = 0; ++i <= i1;) {
    printf "%s", id1[i] OFS    
    for (j = 0; ++j <= i2;)
      printf "%s", (key[id1[i], id2[j]] (j < i2 ? OFS : RS))
      }
  }
NR == 1 { idc = $1; next }  
NF {
  key[$1, $2] = key[$1, $2] ? key[$1, $2] sep $3 : $3
  _id1[$1]++ || id1[++i1] = $1 
  if (!_id2[$2]++) {
    id2[++i2] = $2
    id2c = id2c ? id2c OFS $2 : $2
    }    
  }' sep=, OFS=\; infile

# 7  
Old 10-10-2011

Many thanks.
Yes, the file has an empty row at the end.

If possible, could you explain the code?

---------- Post updated at 10:00 PM ---------- Previous update was at 09:43 PM ----------

But if I add more input rows, the same empty semicolons appear:

Code:
ID1; ID2; Text
1; X; aa
1; X; bb
1; Y; cc
1; Y; dd
2; X; ee
2; X; ff
2; Y; gg
2; Y; hh
2; Z; ii
2; Z; jj
3; w; ll
3; w; mm
3; v; nn
3; u; oo

Output:
Code:
ID1; X; Y; Z; w; v; u
1; aa, bb; cc, dd;;;;
2; ee, ff; gg, hh; ii, jj;;;
3;;;; ll, mm; nn; oo
