Merge two files with similar column entries


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Merge two files with similar column entries
# 1  
Old 04-08-2012
Error Merge two files with similar column entries

Hi ,

I have few files which contains user name and data transfer rate in MBs and this data is collected for year and for each month report is saved in 12 different files I have to merge all the files to prepare the final report
Files are as below


Quote:
Filename1 : January

#User Name #Data Transferred

A. Paul 300004
Jason 600000
Mayur Pandey 40000
Kelly H 459000
Ryan M 349000
Quote:
Filename2 : March

#User Name #Data Transferred

Senthl V R 600000
Mayur Pandey 40000
Kelly H 459000
Pratap S 349000
A. Paul 300004
Similarly I have 10 more files I have to merge all these files to one file and final output should be like below one:

Quote:
Final Report:

#Username #January .... #March ......... #December #Total

A. Paul 300004 300004 Not available 600008
Pratap S Not available 30000 32000 7899887
Kelly H 459000 459000 459000 3424448274
Mayur Pandey 40000 400000 400000 242424442
Senthl V R Not available 600000 34544 53546464
Jason 600000 Not available 3434343 43434355

I need a perl script to automate this rather using excel or doing manually .

These reports are generated monthly and names are not stored in sorted order. Names are stored in recent user who has used data transfer facility. And also there are occasions when a user have not used data transfer at all for particular month in this case name of the user will not appear in monthly report file. In this case I have to add not available in the month name column.




Thank you

Last edited by pratapsingh; 04-08-2012 at 06:37 PM..
# 2  
Old 04-08-2012
Imho. This is impossible because the "name" field in the sample provided consists of either one field (e.g. "Jason"), two fields ("Mayur Pandey") or three fields ("Senthl V R").

This is a system design issue.

Are you able to extract the base data in a consistent record format - complete with a field delimiter?
Can you do the processing in the original database without resorting to Shell?


Ps. The undelimited "Not availanble" and "Not available" field options change this program from "impossible" to "totally impossible".
# 3  
Old 04-08-2012
Sorry forgot to mention that .. all the fields are separated by \t TAB .
# 4  
Old 04-08-2012
Even if things are space separated, it's possible in awk. I didn't post anything when I first saw this because the request was for perl, and I don't do perl. Here's an awk just for yucks -- I would assume something similar is possible in perl.

Code:
awk '
    FILENAME != lfn { idx++; }
    {
        lfn = FILENAME;

        v = $NF;
        $NF = "";
        data[$0, idx] = v;
        seen[$0] = 1;
    }

    END {
        for( x in seen )
        {
            printf( "%s", x );
            for( i = 1; i <= idx; i++ )
                printf( "\t%s", data[x,i] ? data[x,i] : "Nothing" );
            printf( "\n" );
        }
    }
'  month1 month2 month3 month4



Output is separated by tabs, input doesn't matter. The assumption is that the last token on the record is the 'value' and that any token(s) before that are to be treated as a single unit.

EDIT: I also assumed that 'Not available' indicated missing data in the file and NOT that the string "not available" existed in the input file.

Last edited by agama; 04-08-2012 at 06:28 PM.. Reason: clarification
This User Gave Thanks to agama For This Post:
# 5  
Old 04-08-2012
"Not available" It was typo error .. I corrected the typo error in the post now .. I did not copy paste the file and selected few lines from the files and edited the post ..

Thanks Agama .. I will try your solution and will try to write the same thing in perl .. if this shell scripts work ..

---------- Post updated at 03:22 AM ---------- Previous update was at 03:05 AM ----------

@Agama

I tried running your script and it worked like a charm..

I am not sure about the logic you applied on it.
Could you let me know how this is being matched and printed?

Thank you
# 6  
Old 04-08-2012
Comments in the code, maybe more than you need, but I'd rather error on too much than too litte. Feel free to ask specific questions if something isn't clear.

Code:
awk '
    FILENAME != lfn { idx++; }      # if filename for this rec is different than last, bump index
    {
        lfn = FILENAME;             # save last filename seen for check above

        v = $NF;                    # save value from last token on record
        $NF = "";                   # delete the last token from the input record
        data[$0, idx] = v;          # use the current record (in this case the name) and the index to save value
                                    # this preserves order and allows us to detect missing values
        seen[$0] = 1;               # keep a list of user strings that we encountered
    }

    END {
        for( x in seen )            # for each user string we saw
        {
            printf( "%s", x );      # print the name
            for( i = 1; i <= idx; i++ )     # for each index (input file, in same order given on command
                printf( "\t%s", data[x,i] ? data[x,i] : "Nothing" );    # print the value saved, or "nothing" if no value in file
            printf( "\n" );         # final termination of output record for this id string.
        }
    }

'  month1 month2 month3 month4

Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

Merge two csv files using column name

Hi all, I have two separate csv files(comma delimited) file 1 and file 2. File 1 contains PAN,NAME,Salary AAAAA5467D,Raj,50000 AAFAC5467D,Ram,60000 BDCFA5677D,Kumar,90000 File 2 contains PAN,NAME,Dept,Salary ASDFG6756T,Karthik,ABC,450000 QWERT8765Y,JAX,CDR,780000... (5 Replies)
Discussion started by: Nivas
5 Replies

2. Shell Programming and Scripting

Add all 4 column entries for similar column ids

Hi, I want to write a script which will add the entries in all columns for the same column id. I can do it in excel, but I need to do this for 384 columns which will come down to 96 (384/4). How can I do this iteratively A A A A B B B B C C C C 1 0 1 0 2 1 4 5 3 4 5 6 2 0 0 2 3 5 70 100 1... (7 Replies)
Discussion started by: Diya123
7 Replies

3. Shell Programming and Scripting

How to merge two or more fields from two different files where there is non matching column?

Hi, Please excuse for often requesting queries and making R&D, I am trying to work out a possibility where i have two files field separated by pipe and another file containing only one field where there is no matching columns, Could you please advise how to merge two files. $more... (3 Replies)
Discussion started by: karthikram
3 Replies

4. Shell Programming and Scripting

Merge files based on the column value

Hi Friends, I have a file file1.txt 1|ABC|3|jul|dhj 2|NHU|4|kil|eu 3|hjd|34|hfd|43 file2.txt 1||3|KING|dhj 2|NHU||k| 3|hjd|34|hd|43 i want to merge file1.txt file2.txt based on the column null values in file2.txif there are any nulls in column values , (5 Replies)
Discussion started by: i150371485
5 Replies

5. UNIX for Advanced & Expert Users

merge two column multiple files into one

Hi I have multiple files each with two columns and I need to combine all those file into a tab delimited file. (multiple entry with same name separated by a comma) The content of the files are as follows: --- file1.txt: name var1 aaa xx aaa gg bbb yy ddd zz --- file2.txt ... (8 Replies)
Discussion started by: mary271
8 Replies

6. Shell Programming and Scripting

Help with merge two file based on similar column content

Input file 1: A1BG A1BG A1BG A1CF A1CF BCAS BCAS A2LD1 A2M A2M HAT . . Input file 2: A1BG All A1CF TEMP (5 Replies)
Discussion started by: perl_beginner
5 Replies

7. UNIX for Dummies Questions & Answers

Merge two files with two columns being similar

Hi everyone. How can I merge two files, where each file has 2 columns and the first columns in both files are similar? I want all in a file of 4 columns; join command removes the duplicate columns. 1 Dave 2 Mark 3 Paul 1 Apple 2 Orange 3 Grapes to get it like this in the 3rd file:... (9 Replies)
Discussion started by: Atrisa
9 Replies

8. Shell Programming and Scripting

Joining multiple files based on one column with different and similar values (shell or perl)

Hi, I have nine files looking similar to file1 & file2 below. File1: 1 ABCA1 1 ABCC8 1 ABR:N 1 ACACB 1 ACAP2 1 ACOT1 1 ACSBG 1 ACTR1 1 ACTRT 1 ADAMT 1 AEN:N 1 AKAP1File2: 1 A4GAL 1 ACTBL 1 ACTL7 (4 Replies)
Discussion started by: seqbiologist
4 Replies

9. Shell Programming and Scripting

Merge Two Files based on First column

Hi, I need to join two files based on first column of both files.If first column of first file matches with the first column of second file, then the lines should be merged together and go for next line to check. It is something like: File one: 110001 abc efd 110002 fgh dfg 110003 ... (10 Replies)
Discussion started by: apjneeraj
10 Replies

10. Shell Programming and Scripting

compare the column from 3 files and merge that line

I have 3 file, each of has got 80000 records. file1.txt ----------------------- ABC001;active;modify;accept; ABC002;notactive;modify;accept; ABC003;notactive;no-modify;accept; ABC004;active;modify;accept; ABC005;active;no-modify;accept; file2.txt ---------------------------... (8 Replies)
Discussion started by: ganesh_mak
8 Replies
Login or Register to Ask a Question