Reading and appending a row from file1 to file2 using awk or sed
Hi, I want to add each row of file2.txt along the entire length of file1.txt, given the sample data below, and save the result as a new file. Any idea how to do this efficiently? Thank you for any help.
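One possible reading of this request, sketched with awk: every row of file1.txt is repeated once for each row of file2.txt, with that file2.txt row appended, and the result is written to a new file. The sample rows and the output name file3.txt are hypothetical, because the original sample data is not shown here.

```shell
# Hypothetical sample data; the thread's real sample rows are not shown.
tmp=$(mktemp -d)
printf '%s\n' 'A 1' 'B 2' > "$tmp/file1.txt"
printf '%s\n' 'x' 'y'     > "$tmp/file2.txt"

# Pass 1 (NR == FNR): remember every row of file1.txt.
# Pass 2: for each row of file2.txt, print all file1.txt rows
# with that file2.txt row appended.
awk 'NR == FNR { row[++n] = $0; next }
     { for (i = 1; i <= n; i++) print row[i], $0 }' \
    "$tmp/file1.txt" "$tmp/file2.txt" > "$tmp/file3.txt"

cat "$tmp/file3.txt"
```

This keeps all of file1.txt in memory and reads each file once, which is usually fine unless file1.txt is very large. If the intent is instead to pair row N of one file with row N of the other, `paste file1.txt file2.txt` may be the simpler tool.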
I have an awk string-comparison problem. I have searched the internet and forums but cannot find a solution that I understand well enough to adapt to my particular problem:
I need to compare (field1 field2 field3 of File1) against (field1 of File2) and if they match print out... (6 Replies)
I'm trying to compare two files using AWK: if field2 matches in both files, replace field1 of file1 with field1 of file2; if there is no match, just print the line from file1 unchanged.
file1.txt (has empty first field)
:ABBATOM:B:H:1992
:ABBA TROJAN:B:H:1993
:ABBES FIRST HOPE:B:M:1997
:ABBEYS... (4 Replies)
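A minimal sketch of the usual two-file awk idiom for this kind of replacement. The file1.txt rows are the ones quoted above; the file2.txt rows (an ID in field1, the name in field2) are hypothetical, since file2 is not shown in the snippet.

```shell
tmp=$(mktemp -d)
# file1.txt rows are taken from the post; file2.txt rows are hypothetical.
printf '%s\n' ':ABBATOM:B:H:1992' ':ABBA TROJAN:B:H:1993' \
              ':ABBES FIRST HOPE:B:M:1997' > "$tmp/file1.txt"
printf '%s\n' '1001:ABBATOM' '1003:ABBES FIRST HOPE' > "$tmp/file2.txt"

# Pass 1 (file2.txt): map field2 -> field1.
# Pass 2 (file1.txt): on a field2 match, overwrite field1; print every line.
merged=$(awk -F: -v OFS=: '
    NR == FNR  { id[$2] = $1; next }
    ($2 in id) { $1 = id[$2] }
               { print }
' "$tmp/file2.txt" "$tmp/file1.txt")

printf '%s\n' "$merged"
```

Setting OFS to a colon matters here: assigning to $1 makes awk rebuild the line, and without it the fields would be rejoined with spaces. Lines with no match are printed untouched.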
Hi All,
As you can see, I'm pretty new to this board. :D
I'm struggling with a small script that searches for a few fields in another file.
Basically I have file1 looking like this:
15:38:28 sz:10001 pr:14.16
15:38:28 sz:10002 pr:18.41
15:38:29 sz:10003 pr:19.28
15:38:30 sz:10004... (1 Reply)
Hi All,
I have to clean up a text file by deleting specific lines.
cat file1.txt
3595 3595 -0.00842773 -0.0085077 0.00368851
12815 12815 -0.00929239 0.00439785 0.0291697
3747 3747 -0.00974353 0.00228922 0.0225058
3574 3574 -0.00711399 -0.00315748 0.0141206
....
12734... (7 Replies)
I have very limited coding skills but I'm wondering if someone could help me with this. There are many threads about matching strings in two files, but I have no idea how to add a column from one file to another based on a matching string.
I'm looking to match column1 in file1 to the number... (3 Replies)
Hi.
How can we print those rows of file2 that are listed in file1? The first character of each line in file1 is a row number, e.g.:
file1
1:abc
3:ghi
6:pqr
file2
a abc
b def
c ghi
d jkl
e mno
f pqr
... (6 Replies)
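A sketch of one way to do this with awk, using the sample rows shown above. It assumes the row number is everything before the first colon in file1 (so multi-digit row numbers would also work) and that file2 has only the six rows listed.

```shell
tmp=$(mktemp -d)
printf '%s\n' '1:abc' '3:ghi' '6:pqr' > "$tmp/file1"
printf '%s\n' 'a abc' 'b def' 'c ghi' 'd jkl' 'e mno' 'f pqr' > "$tmp/file2"

# Pass 1 (file1): record the wanted row numbers as array keys.
# Pass 2 (file2): print a line when its line number (FNR) is a recorded key.
picked=$(awk -F: '
    NR == FNR   { want[$1]; next }
    FNR in want
' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$picked"
```

With this sample data the command prints rows 1, 3 and 6 of file2, that is `a abc`, `c ghi` and `f pqr`.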
My original files are like this below and I distinguish them from the AP_ID (file1 has 572 and file2 has 544). Also, the header on file1 has “G_” pre-pended. NOTE: these are only snippets of very large files and much of the data is not present here.
Original File 1:
... (36 Replies)
I am trying to use awk to find all the $2 values in file2 (~30 MB, tab-delimited) that fall between $2 and $3 in file1 (~2 GB, tab-delimited).
I have just found out that I need to use $1, $2, and $3 from file1, and that $1 and $2 of file2 must match $1 of file1 and be in the range... (6 Replies)
Trying to use awk to:
update $2 in file2 with the $2 value in file1, if $1 in file1 matches $13 in file2, which is tab-delimited. The $2 values may already be the same, in which case nothing happens and the next line is processed.
There are exactly 4,605 unique $13 values. Thank you :).
... (4 Replies)
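The update described above could be sketched as follows. All sample rows, keys and values here are hypothetical; only the column positions ($1 and $2 of file1, $2 and $13 of file2) and the tab delimiter come from the post.

```shell
tmp=$(mktemp -d)
# Hypothetical tab-delimited data: file1 is key<TAB>value, file2 has 13 columns.
printf 'k1\tNEW1\nk2\tNEW2\n' > "$tmp/file1"
printf 'row1\tOLD1\tc3\tc4\tc5\tc6\tc7\tc8\tc9\tc10\tc11\tc12\tk1\n'  > "$tmp/file2"
printf 'row2\tOLD2\tc3\tc4\tc5\tc6\tc7\tc8\tc9\tc10\tc11\tc12\tk9\n' >> "$tmp/file2"
printf 'row3\tNEW2\tc3\tc4\tc5\tc6\tc7\tc8\tc9\tc10\tc11\tc12\tk2\n' >> "$tmp/file2"

# Pass 1 (file1): map $1 -> $2.
# Pass 2 (file2): if $13 is a known key and $2 differs, overwrite $2.
# Every file2 line is printed, changed or not.
updated=$(awk -F'\t' -v OFS='\t' '
    NR == FNR { new[$1] = $2; next }
    ($13 in new) && $2 != new[$13] { $2 = new[$13] }
    { print }
' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$updated"
```

Only the lookup table built from file1 is held in memory, so the size of file2 should not matter much. The `$2 != new[$13]` check is not strictly required, since reassigning an identical value is harmless, but it mirrors the "nothing happens" case in the description.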
This question is related to one I asked last August, when I was trying to sort/merge two files by a millisecond time column (in this case column 6).
The script (below) that RudiC provided last August solved the puzzle of sorting/merging two files by time, except it gets lost when the... (0 Replies)
Discussion started by: aachave1
seqfmt
SEQFMT(5) User Manuals SEQFMT(5)
NAME
seqfmt - Sequences formats
DESCRIPTION
This document illustrates some common formats used for sequences representation.
EMBL
ID MMVASPHOS standard; RNA; EST; 140 BP.
AC X97897;
DE M.musculus mRNA for protein homologous to
DE vasodilator-stimulated phosphoprotein
SQ Sequence 140 BP; 25 A; 58 C; 39 G; 17 T; 1 other;
ttctcccaga agctgactct atggngaccc cgagagagac tgagcagaac ctggagccag 60
ccccgcaccc ctgcacttcc aatcaggggc gccccgggag cactccccgt ggcgcgccgc 120
ccgccctccg cgcagccatg 140
//
FASTA
>MMVASPHOS
ttctcccagaagctgactctatggngaccccgagagagactgagcagaacctggagccag
ccccgcacccctgcacttccaatcaggggcgccccgggagcactccccgtggcgcgccgc
ccgccctccgcgcagccatg
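As an illustration of how the FASTA layout relates to a bare sequence, here is a small awk sketch that drops the `>` header line and joins the remaining lines. It assumes a single sequence per file; the file name seq.fasta is hypothetical.

```shell
tmp=$(mktemp -d)
cat > "$tmp/seq.fasta" <<'EOF'
>MMVASPHOS
ttctcccagaagctgactctatggngaccccgagagagactgagcagaacctggagccag
ccccgcacccctgcacttccaatcaggggcgccccgggagcactccccgtggcgcgccgc
ccgccctccgcgcagccatg
EOF

# Skip header lines (starting with ">") and concatenate the sequence lines.
raw=$(awk '!/^>/ { printf "%s", $0 } END { print "" }' "$tmp/seq.fasta")

printf '%s\n' "$raw"
printf 'length: %s\n' "${#raw}"
```

For this entry the joined sequence is 140 characters long, which agrees with the 140 BP stated in the EMBL entry for the same sequence.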
GCG
!!NA_SEQUENCE 1.0
(No documentation)
dna1.txt Length: 88 Nov 22, 2001 14:38 Type: N Check: 3818 ..
1 TAGTCGTAGT CGGAGCGATG CTGACGATGA CGATGACGAT CGTAGCTGAT
51 CGATCGAGCT GATGCTGATC GAGCTAGCTG ATCGATCG
GDE
#sample1
TTCAAGAGAAACAGCGGCCAAGGAAAAGACTCGGCATGATTGTCCATAGCTTACAAAGCG
#sample2
TTCAAGAGAAACAGCGGCTGGGGGAAAGACTCGTCCTGATTGCCTGTAGATGGTAAAGCG
GENBANK
LOCUS HUMHBV1 130 bp DNA PRI 17-JUN-1993
DEFINITION Human DNA/endogenous Hepatitis B virus (HBV) DNA, left
host viral junction.
ACCESSION M15770
BASE COUNT 32 a 43 c 29 g 26 t
ORIGIN
1 agcgggcagt gcagctgctt ggacagcagg ggtgtttctt caacccaggc
61 ctcctgtcac aacaggccca ttcaattctg aacctgcaag ccaactccaa
121 cctcttttcc cagggggaac caaaaaccct
//
IG
; comment
U03518
AACCTGCGGAAGGATCATTACCGAGTGCGGGTCCTTTGGGCCCAACCTCCCATCCGTGTC
TATTGTACCCTGTTGCTTCGGCGGGCCCGCCGCTTGTCGGCCGCCGGGGGGGCGCCTCTG
TGAGTTGATTGAATGCAATCAGTTAAAACTTTCAACAATGGATCTCTTGGTTCCGGC1
NBRF
>P1;CCHU
cytochrome c [validated] - human
MGDVEKGKKIFIMKCSQCHTVEKGGKHKTGPNLHGLFGRKTGQAPGYSYTAANKNKGIIW
GEDTLMEYLENPKKYIPGTKMIFVGIKKKEERADLIAYLKKATNE*
PIR
ENTRY CCHU #type complete
TITLE cytochrome c [validated] - human
ACCESSIONS A31764; A05676; I55192; A00001
SUMMARY #length 105 #molecular-weight 11749 #checksum 3247
SEQUENCE
5 10 15 20 25 30
1 M G D V E K G K K I F I M K C S Q C H T V E K G G K H K T G
31 P N L H G L F G R K T G Q A P G Y S Y T A A N K N K G I I W
61 G E D T L M E Y L E N P K K Y I P G T K M I F V G I K K K E
91 E R A D L I A Y L K K A T N E
///
RAW
ttctcccagaagctgactctatggngaccccgagagagactgagcagaacctggagccag
ccccgcacccctgcacttccaatcaggggcgccccgggagcactccccgtggcgcgccgc
ccgccctccgcgcagccatg
Warning: This format cannot handle more than one sequence per file.
SWISSPROT
ID 100K_RAT STANDARD; PRT; 149 AA.
AC Q62671;
DE 100 kDa protein (EC 6.3.2.-).
SQ SEQUENCE 149 AA; 17004 MW; D06484B8BC29112E CRC64;
MMSARGDFLN YALSLMRSHN DEHSDVLPVL DVCSLKHVAY VFQALIYWIK
PQLERKRTRE LLELGIDNED SEHENDDDTS QSATLNDKDD ESLPAETGQN
SITIRPPDDQ HLPTANTCIS RLYVPLYSSK QILKQKLLLA IKTKNFGFV
//
SEE ALSO
squizz(1), alifmt(5)
AUTHOR
Nicolas Joly (njoly@pasteur.fr), Institut Pasteur.
Unix 2009-05-19 SEQFMT(5)