

merge 2 files (without repeating any lines)


 
# 1  
Old 09-28-2007

I need to append the contents of file1 to file2: every line except those that already exist in file2, so a plain "cat file1 >> file2" doesn't work.

For example,
file1:
100 xxxxxx str1
102 xxxxxx str2


File2:
50 xxxxxxx xxx
30 xxxxxxxxxxx
102 xxxxxx str2 xxxx
......

the result:
50 xxxxxxx xxx
30 xxxxxxxxxxx
102 xxxxxx str2 xxxx
.....
100 xxxxxx str1

Also, the second line in file1 and the third line in file2 can either be exactly the same or follow the same pattern: starting with the same string and containing another common string anywhere in the line.

Please help!
Thank you so much.
(It's a Bourne shell.)
# 2  
Old 09-28-2007
cat sort uniq

Hi,
I think you can use cat first to join the two files together, then sort the result, and then use uniq to delete the duplicate lines.

input:
Code:
a>
1
2
3
4
5
6
b>
3
1
2
3
5
342
45
234
2
3

output:
Code:
1
2
234
3
342
4
45
5
6

code:
Code:
# Concatenate both files, sort, then drop adjacent duplicates (same as: sort -u a b)
cat a b | sort | uniq

# 3  
Old 09-29-2007
Hi summer_cherry

Theoretically, this would work. However, the files involved are system config files in which certain lines are grouped together for a reason, so I'd hate to sort them. BTW, it also means that all the empty lines, and the otherwise-empty lines starting with # between sections, would be gone, as they would be meaningless anyway.

I was thinking of looping through file2 for each line from file1 that is to be added, but surely there's a better solution (and I don't know enough about the utilities to figure it out).

Thank you.
# 4  
Old 09-29-2007
Hey,

You can take each line from the first file and check whether it is already present in the second file. If it is, don't append it; otherwise, append the line to the second file.

You can use grep to search for the line from the first file, and awk to use that same line as the pattern to be searched for in the second file.
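
A minimal sketch of that check-then-append loop, assuming a POSIX-compatible sh and exact whole-line matching only (the looser "same pattern" match from the original question is not handled here). The function name merge_unique is made up for illustration:

```sh
#!/bin/sh
# merge_unique SRC DST  (hypothetical helper name)
# Append each line of SRC to DST unless DST already contains an identical line.
# The existing order of DST is left untouched; new lines go to the end.
merge_unique() {
    src=$1
    dst=$2
    # "|| [ -n "$line" ]" also handles a last line without a trailing newline
    while IFS= read -r line || [ -n "$line" ]; do
        # -F: fixed string, -x: whole-line match, -q: no output, -e: pattern follows
        if ! grep -Fxq -e "$line" "$dst"; then
            printf '%s\n' "$line" >> "$dst"
        fi
    done < "$src"
}
```

It would be called as "merge_unique file1 file2". grep runs once per line of file1, which is slow for large inputs but is probably acceptable for small config files. Running it a second time should add nothing.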
# 5  
Old 10-02-2007
varungupta:
This is actually what I'm doing now: I add a comment line with some special keywords marking the start of the file1 content, and before adding anything I check whether that line already exists in file2. However, I don't think it is safe, as nothing prevents that line from being deleted from file2 over time...

summer_cherry's solution would work perfectly, but re-ordering the file is not acceptable.

Perhaps I need a merge function, to do a diff and add lines from file1 that are not in file2?
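
One possible shape for such a merge step, as a sketch only: awk reads file2 first to remember its lines, then prints the lines of file1 that it has not seen, and the result is appended to file2 without any sorting. This assumes a POSIX awk and sh, and exact whole-line comparison; the function name append_missing and the temp file name are made up for illustration:

```sh
#!/bin/sh
# append_missing SRC DST  (hypothetical helper name)
# Append to DST the lines of SRC that DST does not already contain,
# keeping DST's existing lines and their order exactly as they are.
append_missing() {
    src=$1
    dst=$2
    tmp="${TMPDIR:-/tmp}/append_missing.$$"
    # While reading DST (the first file argument): remember every line.
    # While reading SRC: print a line only the first time it is seen.
    awk 'FILENAME == ARGV[1] { seen[$0] = 1; next }
         !seen[$0]++' "$dst" "$src" > "$tmp" &&
        cat "$tmp" >> "$dst"
    rm -f "$tmp"
}
```

Compared with a grep-per-line loop this makes a single pass over each file, and blank lines or comment lines already in file2 stay where they are.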

It'd be tricky, though, to remove the added lines from file2 again later.

Any thoughts would be appreciated!
# 6  
Old 10-02-2007
Quote:
Originally Posted by bluemoon1
Any thoughts would be appreciated!
It is good practice to store the configuration files that you change in RCS somewhere, so that if you need to backtrack or do problem determination you can refer to previous revisions.
# 7  
Old 10-05-2007
Hi porter,
I do back up all the config files during the installation. The thing is, I can't simply restore those files during the uninstall, because some of the files may have been changed over time. The customers would lose their data if we restored the original files. If at uninstall time we remove exactly what we added during the install, then it'd be safe.

Bluemoon
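
One way to "remove exactly what we added" might be to record, at install time, the lines that were actually appended, and to filter them back out at uninstall. A rough sketch, assuming a POSIX-compatible sh and grep; the function name remove_added and the manifest file are hypothetical, and the manifest is assumed to hold one added line per row with no blank lines (a blank row would also strip every blank line from the target):

```sh
#!/bin/sh
# remove_added MANIFEST TARGET  (hypothetical helper name)
# Delete from TARGET every line that appears verbatim in MANIFEST.
# All other lines, including later customer edits, keep their order.
remove_added() {
    manifest=$1
    target=$2
    tmp="${TMPDIR:-/tmp}/remove_added.$$"
    status=0
    # -F: fixed strings, -x: whole-line match, -v: keep non-matching lines,
    # -f: read the lines to drop from MANIFEST
    grep -Fxv -f "$manifest" "$target" > "$tmp" || status=$?
    # grep exits 1 when every line was filtered out; only 2 and above is an error
    if [ "$status" -le 1 ]; then
        cat "$tmp" > "$target"   # rewrite in place, keeping owner and mode
    fi
    rm -f "$tmp"
}
```

Note that this drops every occurrence of a recorded line, so a line that the customer later duplicated by hand would be removed as well.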
