01-19-2019
Quote:
Are the ranges given in your first input file always in increasing numerical order for each $1,$4 set of values (as in your sample file f1)? If they are we can use that information to make your code run faster.
Yes, these should always be sorted like in f1.
Quote:
Is the fifth subfield of $4 in your second input file always identical to the $1 value on the same input line (as in your sample files)? If they are, we can use that information to make your code run faster.
Yes, this will always be the case if $4 is found as in f1.
Quote:
You note that your input files fields are separated by tabs. Do you want the output file to be tab delimited too; or do you want the output to be delimited by spaces as shown in your sample output?
f1 will always be tab-delimited except for a whitespace after $3 and $4, but the output should be tab-delimited. I did add OFS="\t" but I think the whitespaces are making that not work.
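On the OFS point, a minimal sketch that may help, assuming the stray blanks sit at the end of otherwise tab-separated fields. The sample line is made up for illustration and is not taken from the real f1. awk only applies OFS when it rebuilds the record (after a field assignment) or between comma-separated print arguments, so a plain print of an untouched line comes out exactly as it was read. Trimming each field removes the blanks and also forces the rebuild:

```shell
# Hypothetical f1-style line: tab-delimited, with a stray space after $3 and $4.
out=$(printf 'chr1\t100\t200 \tGENE1 \n' |
awk 'BEGIN { FS = OFS = "\t" }
{
    # Strip leading/trailing blanks from every field; modifying a field
    # makes awk rebuild $0 with OFS between the fields.
    for (i = 1; i <= NF; i++)
        gsub(/^[ \t]+|[ \t]+$/, "", $i)
    print
}')
printf '%s\n' "$out"
```

With FS set to a tab, the trailing space stays inside the field ("200 " rather than "200"), which is why it survives into the output unless it is removed explicitly.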
You are correct in that I meant to be looking for inclusive endpoints, so >=/<= is what I should have used.
Quote:
Is it your intent to print the line containing exon if either endpoint is in an entry in the first input file for that $1,$4 pair, or should it only print the exon line if both endpoints are in range?
I used the || statement to make sure the script works as expected, but it could be && as both coordinates should lie within the endpoints (trying to think of a situation where it's not the case and not coming up with anything).
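To make the difference concrete, here is a rough sketch of the both-endpoints test with inclusive comparisons. The file names (ranges.txt, exons.txt), the four-column layout (chr, start, end, name) and the sample values are all hypothetical and much simpler than the real f1 and f2, so treat it as an illustration of the comparison only, not as the script from this thread:

```shell
# Hypothetical, simplified layouts (NOT the real f1/f2 from the thread):
#   ranges.txt: chr <TAB> start <TAB> end <TAB> name
#   exons.txt:  chr <TAB> start <TAB> end <TAB> name
tmp=$(mktemp -d)
printf 'chr1\t100\t200\tGENE1\n' > "$tmp/ranges.txt"
printf 'chr1\t100\t200\tGENE1\nchr1\t150\t250\tGENE1\nchr1\t300\t400\tGENE1\n' > "$tmp/exons.txt"

both=$(awk -F'\t' '
    # First file: remember every range for each chr,name pair.
    NR == FNR {
        k = $1 SUBSEP $4
        n[k]++
        lo[k, n[k]] = $2 + 0
        hi[k, n[k]] = $3 + 0
        next
    }
    # Second file: print the line if BOTH endpoints fall inside one range.
    {
        k = $1 SUBSEP $4
        for (i = 1; i <= n[k]; i++)
            # >= and <= make the range endpoints inclusive.
            if ($2 >= lo[k, i] && $2 <= hi[k, i] &&
                $3 >= lo[k, i] && $3 <= hi[k, i]) { print; break }
    }' "$tmp/ranges.txt" "$tmp/exons.txt")
printf '%s\n' "$both"
rm -rf "$tmp"
```

Only the 100-200 line is printed here, and it is printed only because the comparisons are inclusive. An either-endpoint test, (start in range) || (end in range), would also print the 150-250 line, since its start lies inside 100-200 even though its end does not.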
Thank you very much.