I have a requirement to compare two files with awk.
file1.txt
a
b
c
d
e
file2.txt
a
b
d
e
Your sample data files have an unequal number of lines but you did not state how to handle that. Is that a typo? If not, should comparison stop when the shorter file's end is reached? Or should the shorter file be treated as if it had enough empty lines to match the excess lines of the longer file? Or perhaps that should just be considered a match?
The following code uses paste, which treats a shorter file as if it had empty lines after end-of-file; it assumes that there are no tab characters in your lines (if there are, a different delimiter is required); and it prints out a line number when there's a difference.
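The code from that reply is not reproduced here, so what follows is only a minimal sketch of the approach it describes, not the original poster's script. The function name `compare_lines` and the temporary directory are assumptions made for illustration; the sample data is recreated from the question. Both fields are concatenated with an empty string so that awk compares them as strings rather than as numbers.

```shell
# Sketch: print the line number (and both versions) wherever two files differ.
# Assumes neither file contains tab characters, since tab is paste's default delimiter.
compare_lines() {
    paste "$1" "$2" |
        awk -F '\t' '($1 "") != ($2 "") { print NR ": [" $1 "] vs [" $2 "]" }'
}

# Recreate the sample data from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' a b c d e > "$tmpdir/file1.txt"
printf '%s\n' a b d e   > "$tmpdir/file2.txt"

compare_lines "$tmpdir/file1.txt" "$tmpdir/file2.txt"
```

With the sample files this reports lines 3, 4 and 5: after the missing `c`, every remaining line of file2.txt is shifted up by one, and line 5 of file1.txt is compared against the empty line that paste supplies for the shorter file. If a shifted line should not count as a difference, a line-by-line comparison is the wrong tool and something like diff is a better fit.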
paste
PASTE(1) BSD General Commands Manual PASTE(1)
NAME
paste -- merge corresponding or subsequent lines of files
SYNOPSIS
paste [-s] [-d list] file ...
DESCRIPTION
The paste utility concatenates the corresponding lines of the given input files, replacing all but the last file's newline characters with a
single tab character, and writes the resulting lines to standard output. If end-of-file is reached on an input file while other input files
still contain data, the file is treated as if it were an endless source of empty lines.
The options are as follows:
-d list Use one or more of the provided characters to replace the newline characters instead of the default tab. The characters in list
are used circularly, i.e., when list is exhausted the first character from list is reused. This continues until a line from the
last input file (in default operation) or the last line in each file (using the -s option) is displayed, at which time paste
begins selecting characters from the beginning of list again.
The following special characters can also be used in list:
\n newline character
\t tab character
\\ backslash character
\0 Empty string (not a null character).
Any other character preceded by a backslash is equivalent to the character itself.
-s Concatenate all of the lines of each separate input file in command line order. The newline character of every line except the
last line in each input file is replaced with the tab character, unless otherwise specified by the -d option.
If '-' is specified for one or more of the input files, the standard input is used; standard input is read one line at a time, circularly,
for each instance of '-'.
EXIT STATUS
The paste utility exits 0 on success, and >0 if an error occurs.
EXAMPLES
List the files in the current directory in three columns:
ls | paste - - -
Combine pairs of lines from a file into single lines:
paste -s -d '\t\n' myfile
Number the lines in a file, similar to nl(1):
sed = myfile | paste -s -d '\t\n' - -
Create a colon-separated list of directories named bin, suitable for use in the PATH environment variable:
find / -name bin -type d | paste -s -d : -
SEE ALSO
cut(1), lam(1)
STANDARDS
The paste utility is expected to be IEEE Std 1003.2 (``POSIX.2'') compatible.
HISTORY
A paste command appeared in Version 32V AT&T UNIX.
BSD June 25, 2004 BSD
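
Two behaviours from the manual page matter for the comparison discussed in this thread: a shorter input is padded with empty lines, and the characters given to -d are used in turn. The sketch below only illustrates them; the file names `long.txt` and `short.txt` and the temporary directory are made up for the example.

```shell
# Scratch files of unequal length.
tmpdir=$(mktemp -d)
printf '%s\n' a b c > "$tmpdir/long.txt"
printf '%s\n' 1 2   > "$tmpdir/short.txt"

# Default delimiter: the third output line is "c", a tab, and then nothing,
# because short.txt is treated as a source of empty lines after its end.
paste "$tmpdir/long.txt" "$tmpdir/short.txt"

# Delimiter list ',;': ',' joins columns 1 and 2, ';' joins columns 2 and 3,
# and the list restarts from ',' on every output line.
paste -d ',;' "$tmpdir/long.txt" "$tmpdir/short.txt" "$tmpdir/long.txt"
```

The second command prints `a,1;a`, `b,2;b` and `c,;c`, which shows the padding and the delimiter cycling together.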