Appending lines with word frequencies, ordering and indexing a column
Shell Programming and Scripting, Post 302484593 by Ghetz on Saturday 1st of January 2011 08:58:02 PM

Dear All,

I have the following input data:
Code:
w1	20	g1
w1	10	g1
w2	12	g1
w2	23	g1
w3	10	g1
w3	17	g1
w3	12.5	g1
w3	21	g1
w4	11	g1
w4	13.2	g1
w4	23	g1
w4	18	g1

First, I want to count how often each word occurs in col1 and, within each group of identical col1 words, sort the lines by col2 in ascending order. Second, I want to append the frequency and the rank within the group to each line, like this:

Code:
W	Z	U	freq(W)	Z-order

w1	10	g1	2	1
w1	20	g1	2	2
w2	12	g1	2	1
w2	23	g1	2	2
w3	10	g1	4	1
w3	12.5	g1	4	2
w3	17	g1	4	3
w3	21	g1	4	4
w4	11	g1	4	1
w4	13.2	g1	4	2
w4	18	g1	4	3
w4	23	g1	4	4

I am trying to complete the following code but am not making any headway:

Code:
awk 'NR==FNR{words[++nwords]=$1;next}
{for(i=1;i<=NF;i++)freq[$i]++}
END{for(w=1;w<=nwords;w++)
print words[w], freq[words[w]]+0}' infile

I therefore need your help.

Many thanks,

Ghetz
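
One possible approach, offered as a sketch rather than a definitive answer. In the attempt above, `NR==FNR` is true for every record when only one file is given, so the first block runs `next` each time, the counting block never executes, and every frequency prints as 0. The sketch below instead sorts the file by col1 and then numerically by col2, and lets awk buffer the sorted lines so the total count per word is known before anything is printed. The function name `freq_order` is made up for illustration. It assumes tab-separated input and plain decimal numbers in col2, and it holds the whole file in memory.

```shell
#!/bin/sh
# freq_order FILE
# Hypothetical helper: appends freq(col1) and the rank of col2 within
# each col1 group to every line of a tab-separated file.
freq_order() {
    # Sort by word (col1), then numerically by value (col2).
    # LC_ALL=C keeps "." as the decimal point for the numeric sort.
    LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k2,2n "$1" |
    awk -F '\t' -v OFS='\t' '
        # Buffer each sorted line and count occurrences of its word.
        { line[NR] = $0; word[NR] = $1; freq[$1]++ }
        END {
            for (i = 1; i <= NR; i++) {
                # Restart the rank whenever the word changes.
                rank = (i > 1 && word[i] == word[i - 1]) ? rank + 1 : 1
                print line[i], freq[word[i]], rank
            }
        }'
}
```

Called as `freq_order infile`, this should print each input line followed by the two extra columns, in the order shown in the desired output (without the header line).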
 
