Sort a file & refine data column & row format


 
# 1  
Old 07-09-2011

cat file1.txt
Code:
field1 "user1":
field2:"data-cde"
field3:"data-pqr"
field4:"data-mno"
 
field1 "user1":
field2:"data-dcb"
field3:"data-mxz"
field4:"data-zul"
 
field1 "user2":
field2:"data-cqz"
field3:"data-xoq"
field4:"data-pos"

Now I need to have the data in the format below.
I have shown only 3 sets of data; my file may contain 1000 sets.
Your help is highly appreciated.
Code:
field1 field2 field3 field4
user1 data-cde data-pqr data-mno
user1 data-dcb data-mxz data-zul
user2 data-cqz data-xoq data-pos


Thanks & Regards
Chandrasekhar K

Last edited by Franklin52; 07-09-2011 at 03:17 PM.. Reason: Please use code tags
# 2  
Old 07-09-2011
Try:
Code:
awk -F"[ :]" -v RS= 'BEGIN{print "field1 field2 field3 field4"}{gsub("\"","");gsub("\n"," ");print $2,$5,$7,$9}' file
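
For anyone not used to awk's paragraph mode (`RS=` set to the empty string, so each blank-line-separated block is one record), the one-liner above can be unpacked roughly as in the sketch below. The function name `flatten_records` and the here-document are only for illustration, and the sketch assumes the separator lines are truly empty (no stray spaces).

```shell
#!/bin/sh
# flatten_records is a hypothetical helper name; it reads records on stdin.
flatten_records() {
    awk -F'[ :]' -v RS= '
        BEGIN { print "field1 field2 field3 field4" }
        {
            gsub(/"/, "")     # drop the double quotes
            gsub(/\n/, " ")   # join the four lines of the record
            # $0 now reads: field1 user1: field2:data-cde field3:... and
            # splitting on single spaces or colons puts the values
            # in fields 2, 5, 7 and 9
            print $2, $5, $7, $9
        }'
}

# Two of the sample records from the first post
flatten_records <<'EOF'
field1 "user1":
field2:"data-cde"
field3:"data-pqr"
field4:"data-mno"

field1 "user2":
field2:"data-cqz"
field3:"data-xoq"
field4:"data-pos"
EOF
```

The field numbers depend on the exact layout of the sample (one space after `field1`, a colon elsewhere), so they would need adjusting for any other format.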

# 3  
Old 07-09-2011
Hi,

Using 'perl':
Code:
$ cat script.pl
use warnings;
use strict;

@ARGV == 1 or die "Usage: perl $0 <input-file>\n";

my %field;
my $printed_header = 0;

while ( <> ) {
        if ( /^\s*$/ ) {
        ## When found a blank line print data saved previously.

                ## Print header once in the program.
                unless ( $printed_header ) {
                        print_header();
                        $printed_header = 1;
                }

                print_data();

        } else {
        ## Data found, save it in a hash.
                chomp;

                ## $f -> field name.
                ## $d -> data.
                my ($f,$d);

                if ( /^field\d+:/ ) {
                ## All fields but first one.
                        ($f,$d) = split /:/;
                } else {
                ## Field 1.
                        ($f,$d) = split;
                        $d =~ s/:\s*$//;
                }
                $d =~ tr/"//d;
                $field{ $f } = $d;
        }
}

print_data();

sub print_data {
        for my $key ( sort keys %field ) {
                printf "%s ", $field{ $key };
        }
        print "\n";

}

sub print_header {
        for my $key ( sort keys %field ) {
                printf "%s ", $key;
        }
        print "\n";
}
$ perl script.pl infile
field1 field2 field3 field4 
user1 data-cde data-pqr data-mno 
user1 data-dcb data-mxz data-zul 
user2 data-cqz data-xoq data-pos

Regards,
Birei
# 4  
Old 07-10-2011
Code:
awk -F'field[1-4]' -v RS= 'BEGIN{print "field1\tfield2\tfield3\tfield4"}{gsub("\"","");gsub(":","");gsub("\n"," ");sub(" ","",$2);print $2 $3 $4 $5}' file1.txt


Last edited by Franklin52; 07-10-2011 at 10:47 AM.. Reason: Code tags
# 5  
Old 07-10-2011
Hi.

Using the regularity of the posted data with common utilities:
Code:
#!/usr/bin/env bash

# @(#) s1	Demonstrate flattening of fields with sed.

# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C sed paste

FILE=${1-data1}

pl " Input data file $FILE:"
cat "$FILE"

pl " Results:"
echo "field1 field2 field3 field4"
sed -e '/^[[:space:]]*$/d' -e 's/field..//' -e 's/[":]//g' "$FILE" |
paste -d" " - - - -

exit 0

producing:
Code:
% ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian GNU/Linux 5.0.8 (lenny) 
GNU bash 3.2.39
GNU sed version 4.1.5
paste (GNU coreutils) 6.10

-----
 Input data file data1:
field1 "user1":
field2:"data-cde"
field3:"data-pqr"
field4:"data-mno"
 
field1 "user1":
field2:"data-dcb"
field3:"data-mxz"
field4:"data-zul"
 
field1 "user2":
field2:"data-cqz"
field3:"data-xoq"
field4:"data-pos"

-----
 Results:
field1 field2 field3 field4
user1 data-cde data-pqr data-mno
user1 data-dcb data-mxz data-zul
user2 data-cqz data-xoq data-pos

See man pages for details ... cheers, drl
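
Stripped of the helper functions, the core of the script above is a two-stage pipeline: `sed` deletes blank lines and strips the field name, quotes and colons, then `paste - - - -` reads four lines at a time from standard input and joins them into one row. A minimal sketch, assuming every record has exactly four non-blank lines and every field name is `field` plus one digit plus one separator character (the function name `rows_from_fields` is made up for this example):

```shell
#!/bin/sh
# rows_from_fields is a hypothetical name; input is read from stdin.
rows_from_fields() {
    echo "field1 field2 field3 field4"
    # 1) delete blank or whitespace-only lines
    # 2) strip "field" plus the two characters after it ("1 " or "2:")
    # 3) delete every remaining double quote and colon
    sed -e '/^[[:space:]]*$/d' -e 's/^field..//' -e 's/[":]//g' |
    # each "-" means "take the next line from stdin", so four dashes
    # glue four consecutive lines into one space-separated row
    paste -d' ' - - - -
}

rows_from_fields <<'EOF'
field1 "user1":
field2:"data-cde"
field3:"data-pqr"
field4:"data-mno"

field1 "user2":
field2:"data-cqz"
field3:"data-xoq"
field4:"data-pos"
EOF
```

Because the rows are built purely by counting lines, a record with a missing or extra line would shift every row after it.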

( edit 1: minor typo )

Last edited by drl; 07-10-2011 at 09:36 AM..
# 6  
Old 08-04-2011
Reply: it is still not working.

My actual data looks like the sample below.

I have given only the format; I can't share the exact data for some reasons. The real file has about 5000 lines in this layout.

I need to produce the information in the format shown further down.

Exact format of my data set:
Line<space>Number1<space>"somedata":
LineNumber2:"somedata"
LineNumber3:"somedata"
LineNumber4:"somedata"

------------------------------------


ab cd "somedata1":
efgh:"somedata2"
ijkl:"somedata3"
monp:"somedata4"

ab cd "somedata5":
efgh:"somedata6"
ijkl:"somedata7"
monp:"somedata8"

I need to get the output as:

abcd efgh ijkl monp
somedata1 somedata2 somedata3 somedata4
somedata5 somedata6 somedata7 somedata8

I would be happy to get a script that can produce this output.

From my file I will need to get the 4000 lines of data.

Thanks in advance.

Last edited by ckaramsetty; 08-04-2011 at 05:01 AM.. Reason: typo
# 7  
Old 08-05-2011
How about this awk script:

Code:
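awk '
NF==0 {                               # blank line: end of one record
  if(keystr) print substr(keystr,2);  # header, printed only once
  if(vals) print substr(vals,2);
  vals=keystr=""
  next
}
{
    gsub(/[\":]/," ",$0);             # quotes and colons become spaces
    key=$1;
    for(i=2;i<NF;i++) key=key$i;      # join a multi-word key: ab cd -> abcd
    if(!(key in have)) {
        keystr=keystr" "key
        have[key]
    }
    vals=vals" "$i }                  # after the loop $i is the last word, the value
END {                                 # last record has no blank line after it
  if(keystr) print substr(keystr,2);
  if(vals) print substr(vals,2) }' infile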
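
To try the idea against the sample from post #6 without creating a file, something like the sketch below can be used. It is a variant of the same approach rather than the exact script above: the name `flatten_pairs` and the awk function `emit` are invented for the example, and it assumes every value is non-empty and contains no spaces.

```shell
#!/bin/sh
# flatten_pairs is a hypothetical helper; it reads key/value lines on stdin.
flatten_pairs() {
    awk '
        # Blank (or whitespace-only) line: the current record is complete.
        NF == 0 { emit(); next }
        {
            gsub(/[":]/, " ")                       # quotes and colons become spaces
            key = $1
            for (i = 2; i < NF; i++) key = key $i   # "ab cd" becomes "abcd"
            if (!(key in seen)) { seen[key]; header = header " " key }
            row = row " " $NF                       # the last word is the value
        }
        END { emit() }                              # flush the final record

        # Print the header (first record only) and the pending row, if any.
        function emit() {
            if (header != "") { print substr(header, 2); header = "" }
            if (row != "")    { print substr(row, 2);    row = "" }
        }'
}

flatten_pairs <<'EOF'
ab cd "somedata1":
efgh:"somedata2"
ijkl:"somedata3"
monp:"somedata4"

ab cd "somedata5":
efgh:"somedata6"
ijkl:"somedata7"
monp:"somedata8"
EOF
```

Flushing in both the blank-line rule and `END` means the header and last row are printed whether or not the file ends with a blank line.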


Last edited by Chubler_XL; 08-05-2011 at 01:11 AM..