Data file manipulation


 
# 1  
Old 03-23-2011

Hi,

I have two two-column data files (file1 and file2). I want to add the second column of file2 to file1 as a third column, so that each value in the third column is the one that corresponds to the second-column value in the same row.
Example:
Code:
file1:
X    Y
=========
x1  y2
x3  y4
x2  y4
x5  y3
=========
file2:
Y   Z
=========
y1  z1
y2  z2
y3  z3
y4  z4
y5  z5


The final result should look like:
X    Y   Z
=========
x1  y2  z2
x3  y4  z4
x2  y4  z4
x5  y3  z3
=========


Thanks a lot,
-G
PS: I know it kinda sounds confusing, but that's the best I could come up with.
# 2  
Old 03-23-2011
Code:
awk 'NR==FNR {a[$1]=$2; next} $2 in a {print $0, a[$2]}' f2 f1

Please let me know if there is a cuter or more compact solution.
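For anyone trying to follow the one-liner, here is the same logic spelled out one step per line. It is only a sketch: the scratch directory and the file names f1 and f2 inside it are assumptions made so the example can run on its own, and the sample data is copied from the first post.

```shell
#!/bin/sh
# Hypothetical scratch files holding the sample data from the first post.
workdir=$(mktemp -d)
printf '%s\n' 'X    Y' '=========' 'x1  y2' 'x3  y4' 'x2  y4' 'x5  y3' '=========' > "$workdir/f1"
printf '%s\n' 'Y   Z' '=========' 'y1  z1' 'y2  z2' 'y3  z3' 'y4  z4' 'y5  z5' > "$workdir/f2"

result=$(awk '
    # NR counts lines across all input files, FNR restarts at 1 for each file,
    # so NR == FNR holds only while the first file named (f2) is being read.
    NR == FNR {
        a[$1] = $2      # remember the Z value, indexed by its Y value
        next            # do not fall through to the rule below for f2 lines
    }
    # Only f1 lines reach this rule: print the row if its Y value was seen in f2.
    $2 in a {
        print $0, a[$2]
    }
' "$workdir/f2" "$workdir/f1")

printf '%s\n' "$result"
rm -r "$workdir"
```

With this data the header row survives because "Y" is itself a key read from file2, while the "=========" rule lines are dropped because they have no second field to look up.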
# 3  
Old 03-23-2011
RE: Data file manipulation

Hope this helps.
Code:
#!/bin/sh

INPUT_PATH="/home/"
INPUT_FILE1="${INPUT_PATH}sample2.file1.txt"
INPUT_FILE2="${INPUT_PATH}sample2.file2.txt"
OUTPUT_FILE="${INPUT_PATH}sample2.file.out"
DUMMY_INPUT="${INPUT_FILE1}.dummy"

# Add the second column of file2 to file1 as the 3rd column
# Note: The 3rd column values correspond to the values of the 2nd column

# First line of file1 is the header; last line is the "=====" rule
HEADER=`sed q "${INPUT_FILE1}"`
GRID=`sed '$!d' "${INPUT_FILE1}"`
# Drop the header, the rule under it, and the closing rule; keep the data rows
sed -e '1,2d' -e '$d' "${INPUT_FILE1}" > "${DUMMY_INPUT}"

# Use > for the first write so that a rerun does not append to old output
echo "${HEADER}   Z" > "${OUTPUT_FILE}"
echo "${GRID}=" >> "${OUTPUT_FILE}"

while read LINE
do
        REF_FILE1=`echo "${LINE}" | awk '{ print $2 }'`
        # Compare against column 1 of file2 exactly; a plain grep would also
        # match substrings (y1 inside y10) and values in the other column
        REF_FILE2=`awk -v key="${REF_FILE1}" '$1 == key { print $2 }' "${INPUT_FILE2}"`
        echo "${LINE}  ${REF_FILE2}" >> "${OUTPUT_FILE}"
done < "${DUMMY_INPUT}"

echo "${GRID}=" >> "${OUTPUT_FILE}"

rm "${DUMMY_INPUT}"

exit


Last edited by Franklin52; 03-23-2011 at 06:24 AM.. Reason: Please use code tags, thank you
# 4  
Old 03-23-2011
Thanks, royalibrahim and acmvillareal, for your time and effort. Now I have to spend some time trying to understand the code.

@royalibrahim
This is already compact (I'll have to understand how it works before I can assign cuteness points to it).

@acmvillareal
Though your solution is a little long, I am glad that I can understand most of it. I know very little sed and awk.
# 5  
Old 03-23-2011
Hi.

Making use of the standard utilities join and sort (GNU versions here), with the header and rule lines excluded from the input files:
Code:
join -1 2 -2 1 -o 1.1 0 2.2 <( sort -k2,2 "$FILE1" ) <( sort -k1,1 "$FILE2" )

producing:
Code:
x1 y2 z2
x5 y3 z3
x2 y4 z4
x3 y4 z4

See man page for details.
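Note that join emits rows in key order, so the output above is not in file1's original order. If that matters, one possible approach is to tag each file1 row with its line number before sorting and sort on that number afterwards. This is a sketch, not part of the command above: the scratch directory and file names are assumptions, the data rows are the header-free sample from the first post, and LC_ALL=C is set so that sort and join agree on collation.

```shell
#!/bin/sh
export LC_ALL=C    # make sort and join use the same byte-wise collation

# Hypothetical scratch files: the sample data rows without header or rule lines.
workdir=$(mktemp -d)
printf '%s\n' 'x1  y2' 'x3  y4' 'x2  y4' 'x5  y3' > "$workdir/file1"
printf '%s\n' 'y1  z1' 'y2  z2' 'y3  z3' 'y4  z4' 'y5  z5' > "$workdir/file2"

# join needs both inputs sorted on the join field.
sort -k1,1 "$workdir/file2" > "$workdir/file2.sorted"

# 1. Prefix each file1 row with its line number (Y becomes field 3).
# 2. Sort on Y and join with file2, carrying the line number along.
# 3. Sort numerically on the line number to restore file1's order.
# 4. Cut the line number off again.
result=$(awk '{ print NR, $1, $2 }' "$workdir/file1" |
    sort -k3,3 |
    join -1 3 -2 1 -o 1.1,1.2,0,2.2 - "$workdir/file2.sorted" |
    sort -n -k1,1 |
    cut -d ' ' -f 2-)

printf '%s\n' "$result"
rm -r "$workdir"
```

The `-` operand tells join to read its first input from standard input, which avoids a second temporary file.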

Best wishes ... cheers, drl
# 6  
Old 03-24-2011
Thanks drl,

That's an awesome one-liner. I frequently use sort but have never used join before. From now on I am going to.
# 7  
Old 03-24-2011
Hi, gaurab.

You are welcome.

From my perspective, it's not the fact that the solution is a one-liner that is important. I often find one-liners to be too dense to understand, not to mention the problems with debugging.

The properties here that I think are important are that bash can set up temporary resources that act like files ( the " <( ... ) " syntax ), and that standard utilities are generally well-debugged, have been tuned for performance, and have fewer anomalies than custom solutions.

Best wishes ... cheers, drl
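To make the " <( ... ) " idea concrete, here is a rough sketch of doing the same thing by hand in plain sh with named pipes. Bash typically implements process substitution with /dev/fd entries or named pipes, so this is an approximation of the mechanism rather than a description of what bash does on any given system. The scratch directory, the pipe names p1 and p2, and the file names are assumptions; the data rows are the header-free sample from the first post.

```shell
#!/bin/sh
export LC_ALL=C    # make sort and join use the same byte-wise collation

# Hypothetical scratch files: the sample data rows without header or rule lines.
workdir=$(mktemp -d)
printf '%s\n' 'x1  y2' 'x3  y4' 'x2  y4' 'x5  y3' > "$workdir/file1"
printf '%s\n' 'y1  z1' 'y2  z2' 'y3  z3' 'y4  z4' 'y5  z5' > "$workdir/file2"

# Two named pipes stand in for the two <( ... ) arguments.
mkfifo "$workdir/p1" "$workdir/p2"

# Each sort runs in the background and blocks until join opens its pipe.
sort -k2,2 "$workdir/file1" > "$workdir/p1" &
sort -k1,1 "$workdir/file2" > "$workdir/p2" &

# join sees two ordinary-looking file names and reads the sorted streams.
result=$(join -1 2 -2 1 -o 1.1,0,2.2 "$workdir/p1" "$workdir/p2")
wait    # reap the two background sorts

printf '%s\n' "$result"
rm -r "$workdir"
```

No sorted copy of either file is written to disk; the pipes only pass data from each sort to join, which is the convenience the " <( ... ) " syntax gives you in one line.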
 