Awk match multiple columns in multiple lines in single file Post: 302662493

Sponsored Content

Top Forums Shell Programming and Scripting Awk match multiple columns in multiple lines in single file Post 302662493 by jacobs.smith on Tuesday 26th of June 2012 04:00:05 PM

06-26-2012

Banned

Awk match multiple columns in multiple lines in single file

Hi,

Input

Code:

7488	7389	chr1.fa	chr1.fa
3546	9887	chr5.fa	chr9.fa
7387	7898	chrX.fa	chr3.fa
7488	7389	chr21.fa	chr3.fa
7488	7389	chr1.fa	chr1.fa
3546	9887	chr9.fa	chr5.fa
7898	7387	chrX.fa	chr3.fa

Desired Output

Code:

7488	7389	chr1.fa	chr1.fa	2
3546	9887	chr5.fa	chr9.fa	2
7387	7898	chrX.fa	chr3.fa	2
7488	7389	chr21.fa	chr3.fa	1
7488	7389	chr1.fa	chr1.fa	2
3546	9887	chr9.fa	chr5.fa	2
7898	7387	chrX.fa	chr3.fa	2

I want to count each line's occurrence and print its occurrence in the fifth column.

Even though the first and second columns (second and sixth records) are interchanged and fourth and fifth columns (first and fifth records) are changed, it still needs to be counted.

So, far I tried this and got the undesired output below

Code:

awk -F, 'NR==FNR{a[$0]++;next}{print $0 "\t" a[$0]}' input input

Code:

7488	7389	chr1.fa	chr1.fa	2
3546	9887	chr5.fa	chr9.fa	1
7387	7898	chrX.fa	chr3.fa	1
7488	7389	chr21.fa	chr3.fa	1
7488	7389	chr1.fa	chr1.fa	2
3546	9887	chr9.fa	chr5.fa	1
7898	7387	chrX.fa	chr3.fa	1

---------- Post updated at 04:00 PM ---------- Previous update was at 03:34 PM ----------

Hi Corona,

Each line's occurence

For ex:

Code:

hello world
world hello

should be considered the same while reading the input. Then the output will be

Code:

hello world 2
world hello 2

because we are considering hello world is present two times in the file.

jacobs.smith

View Public Profile for jacobs.smith

Find all posts by jacobs.smith

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Single column to multiple columns in awk

Hi - I'm new to the awk programming language. I'm trying to print a single column of data to several columns, and I found an article on iTWorld.com (ITworld.com - Printing in columns). It looks like the mkCols2 script is very close to what I need to do, but it looks like the end of the code...

2. Shell Programming and Scripting

Awk multiple lines with 3rd column onto a single line?

3. Shell Programming and Scripting

Filtering issues with multiple columns in a single file

Hi, I am new to unix and would greatly appreciate some help. I have a file containing multiple colums containing different sets of data e.g. File 1: John Ireland 27_December_69 Mary England 13_March_55 Mike France 02_June_80 I am currently using the awk...

4. Shell Programming and Scripting

Awk multiple lines with 4th column on to a single line

This is related to one of my previous post.. I have huge file currently I am using loop to read file and checking each line to build this single record, its taking much much time to parse those records.. I thought there should be a way to do this in awk or sed. I found this code in this forum...

5. Shell Programming and Scripting

Simple awk match for multiple lines

Is there a simple way to use awk to match multiple lines?? Somehow using \n isn't working for me. Ultimately I'm trying to insert "WWW" 3 lines above "eee". input aaa bbb ccc ddd eee fff output aaa bbb WWW ccc ddd eee

6. Shell Programming and Scripting

Combining columns from multiple files into one single output file

Hi, I have 3 files with one column value as shown File: a.txt ------------ Data_a1 Data_a2 File2: b.txt ------------ Data_b1 Data_b2 Data_b3 Data_b4 File3: c.txt ------------ Data_c1 Data_c2 Data_c3 Data_c4 Data_c5

7. Shell Programming and Scripting

Reading multiple values from multiple lines and columns and setting them to unique variables.

Hello, I would like to ask for help with csh script. An example of an input in .txt file is below, the number of lines varies from file to file and I have 2 or 3 columns with values. I would like to read all the values (probably one by one) and set them to independent unique variables that...

8. Shell Programming and Scripting

Merging multiple lines to columns with awk, while inserting commas for missing lines

Hello all, I have a large csv file where there are four types of rows I need to merge into one row per person, where there is a column for each possible code / type of row, even if that code/row isn't there for that person. In the csv, a person may be listed from one to four times...

9. Shell Programming and Scripting

Removing multiple lines from input file, if multiple lines match a pattern.

GM, I have an issue at work, which requires a simple solution. But, after multiple attempts, I have not been able to hit on the code needed. I am assuming that sed, awk or even perl could do what I need. I have an application that adds extra blank page feeds, for multiple reports, when...

10. Shell Programming and Scripting

Removing carriage returns from multiple lines in multiple files of different number of columns

Hello Gurus, I have a multiple pipe separated files which have records going over multiple Lines. End of line separator is \n and records going over multiple lines have <CR> as separator. below is example from one file. 1|ABC DEF|100|10 2|PQ RS T|200|20 3| UVWXYZ|300|30 4| GHIJKL|400|40...

LEARN ABOUT DEBIAN

bup-margin

bup-margin(1)						      General Commands Manual						     bup-margin(1)

NAME

       bup-margin - figure out your deduplication safety margin

SYNOPSIS

       bup margin [options...]

DESCRIPTION

       bup margin  iterates  through  all  objects  in	your  bup repository, calculating the largest number of prefix bits shared between any two
       entries.  This number, n, identifies the longest subset of SHA-1 you could use and still encounter a collision between your object ids.

       For example, one system that was tested had a collection of 11 million objects (70 GB), and bup margin returned 45.  That  means  a  46-bit
       hash  would be sufficient to avoid all collisions among that set of objects; each object in that repository could be uniquely identified by
       its first 46 bits.

       The number of bits needed seems to increase by about 1 or 2 for every doubling of the number of objects.  Since SHA-1 hashes have 160 bits,
       that  leaves 115 bits of margin.  Of course, because SHA-1 hashes are essentially random, it's theoretically possible to use many more bits
       with far fewer objects.

       If you're paranoid about the possibility of SHA-1 collisions, you can monitor your repository by running bup margin occasionally to see	if
       you're getting dangerously close to 160 bits.

OPTIONS

       --predict
	      Guess  the offset into each index file where a particular object will appear, and report the maximum deviation of the correct answer
	      from the guess.  This is potentially useful for tuning an interpolation search algorithm.

       --ignore-midx
	      don't use .midx files, use only .idx files.  This is only really useful when used with --predict.

EXAMPLE

	      $ bup margin
	      Reading indexes: 100.00% (1612581/1612581), done.
	      40
	      40 matching prefix bits
	      1.94 bits per doubling
	      120 bits (61.86 doublings) remaining
	      4.19338e+18 times larger is possible

	      Everyone on earth could have 625878182 data sets
	      like yours, all in one repository, and we would
	      expect 1 object collision.

	      $ bup margin --predict
	      PackIdxList: using 1 index.
	      Reading indexes: 100.00% (1612581/1612581), done.
	      915 of 1612581 (0.057%)

SEE ALSO

       bup-midx(1), bup-save(1)

BUP

       Part of the bup(1) suite.

AUTHORS

       Avery Pennarun <apenwarr@gmail.com>.

Bup unknown-															     bup-margin(1)

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Single column to multiple columns in awk

Discussion started by: astroDave

2. Shell Programming and Scripting

Awk multiple lines with 3rd column onto a single line?

Discussion started by: SoMoney

3. Shell Programming and Scripting

Filtering issues with multiple columns in a single file

Discussion started by: crunchie

4. Shell Programming and Scripting

Awk multiple lines with 4th column on to a single line

Discussion started by: Vasan

5. Shell Programming and Scripting

Simple awk match for multiple lines

Discussion started by: pxalpine

6. Shell Programming and Scripting

Combining columns from multiple files into one single output file

Discussion started by: vfrg

7. Shell Programming and Scripting

Reading multiple values from multiple lines and columns and setting them to unique variables.

Discussion started by: FMMOLA

8. Shell Programming and Scripting

Merging multiple lines to columns with awk, while inserting commas for missing lines

Discussion started by: RalphNY

9. Shell Programming and Scripting

Removing multiple lines from input file, if multiple lines match a pattern.

Discussion started by: jxfish2

10. Shell Programming and Scripting

Removing carriage returns from multiple lines in multiple files of different number of columns

Discussion started by: dJHa

LEARN ABOUT DEBIAN

bup-margin