Some helpful individuals from the forums helped with this code below but it seems to be generating the same output for any string pattern I give it when I run it with a large file (200GB). However, when I run it with a small test file, that output is different for each string. Why do you think this could be happening?
Also, am I recording the length appropriately for all of the column 2 lines that have the same ID from column 1? Here's the code:
I'm only new to shell programming and have been given a task to do a program in .sh, however I've come to a point where I'm not sure what to do. This is my code so far:
# process all arguments (i.e. loop while $1 is present)
while ; do
# echo "Arg is $1"
case $1 in
-h*|-H*) echo "help... (4 Replies)
Hi,
I have a file with 3 columns in it that are comma separated and it has about 5000 lines. What I want to do is find the most common value in column 3 using awk or a shell script or whatever works! I'm totally stuck on how to do this.
e.g.
value1,value2,bob
value1,value2,bob... (12 Replies)
Hi guys,
I am stuck in this problem. Please help.
I have two files.
FILE1 (with records starting from '>' )
>TC1723_3 similar to Scific_A7Q9Q3
EMSPSQDYCDDYFKLTYPCTAGAQYYGRGALPVYWNYNYGAIGEALKLDLLNHPEYIEQN
ATMAFQAAIWRWMNPMKKGQPSAHDAFVGNWKP
>TC214_2 similar to Quiet_Ref100_Q8W2B2 Cluster;... (1 Reply)
I will be performing a task on several directories, each containing a large number of files (2500+) that follow a regular naming convention:
YYYY_MM_DD_XX.foo_bar.A.B.some_different_stuff.EXT
What I would like to do is automatically discover the part of the filenames that are common to all... (1 Reply)
I currently have publication lists for ~3 dozen faculty members. I need to find out how many publications are in common across all faculty members - person 1 with person 2, person 1 with person 3, person 2 with person 3, person 1 with both person 2 and person 3, etc.
One person may have
Last1,... (5 Replies)
Hello Everyone,
I am looking for a way to extract substrings to local variables. Here is the format of the string variable i am using :
/var/x/www && /usr/x/share/doc && /etc/x/logs
where the substrings i must extract are the "/var/x/www" and such.
I was originally thinking of using... (15 Replies)
Dear All,
I have 2 files. If field 1, 2, 4 and 5 matches in both file1 and file2, I want to print the whole line of file1 and file2 one after another in my output file.
File1:
sc2/80 20 . A T 86 F=5;U=4
sc2/60 55 . G T ... (1 Reply)
Hello gurus,
I have a lookup table
cat tmp1
\\\erw``~ 1
^774574574565665f\] 2
()42543^
and I`m trying to compare a bunch of strings such that, either the lookup table column 1, or the string to be looked up are substrings of each other (and return the second lookup column if yes).
... (2 Replies)
Hello, I need to find the intersection across 10 columns. Kindly help.
my file (INPUT.csv) looks like this
4_R 4_S 8_R 8_S 12_R 12_S 24_R 24_S
LOC_Os01g01010 LOC_Os01g01010 LOC_Os01g01010 LOC_Os04g48290 LOC_Os01g01010 LOC_Os01g01010... (1 Reply)
Discussion started by: Sanchari
1 Replies
LEARN ABOUT DEBIAN
plan9-join
JOIN(1) General Commands Manual JOIN(1)NAME
join - relational database operator
SYNOPSIS
join [ options ] file1 file2
DESCRIPTION
Join forms, on the standard output, a join of the two relations specified by the lines of file1 and file2. If one of the file names is the
standard input is used.
File1 and file2 must be sorted in increasing ASCII collating sequence on the fields on which they are to be joined, normally the first in
each line.
There is one line in the output for each pair of lines in file1 and file2 that have identical join fields. The output line normally con-
sists of the common field, then the rest of the line from file1, then the rest of the line from file2.
Input fields are normally separated spaces or tabs; output fields by space. In this case, multiple separators count as one, and leading
separators are discarded.
The following options are recognized, with POSIX syntax.
-a n In addition to the normal output, produce a line for each unpairable line in file n, where n is 1 or 2.
-v n Like -a, omitting output for paired lines.
-e s Replace empty output fields by string s.
-1 m
-2 m Join on the mth field of file1 or file2.
-jn m Archaic equivalent for -n m.
-ofields
Each output line comprises the designated fields. The comma-separated field designators are either 0, meaning the join field, or
have the form n.m, where n is a file number and m is a field number. Archaic usage allows separate arguments for field designators.
-tc Use character c as the only separator (tab character) on input and output. Every appearance of c in a line is significant.
EXAMPLES
sort /etc/passwd | join -t: -1 1 -a 1 -e "" - bdays
Add birthdays to the /etc/passwd file, leaving unknown birthdays empty. The layout of /adm/users is given in passwd(5); bdays con-
tains sorted lines like
tr : ' ' </etc/passwd | sort -k 3 3 >temp
join -1 3 -2 3 -o 1.1,2.1 temp temp | awk '$1 < $2'
Print all pairs of users with identical userids.
SOURCE
/src/cmd/join.c
SEE ALSO sort(1), comm(1), awk(1)BUGS
With default field separation, the collating sequence is that of sort -b -ky,y; with -t, the sequence is that of sort -tx -ky,y.
One of the files must be randomly accessible.
JOIN(1)