Sponsored Content
Top Forums Shell Programming and Scripting Appending lines with word frequencies, ordering and indexing a column Post 302484704 by pravin27 on Monday 3rd of January 2011 12:03:05 AM
Old 01-03-2011
Hi Ghetz,

Sorry ....

Try this,

Code:
sort -nk2,1 inputfile -o inputfile; awk 'NR==FNR{if(s!=$1){j=0;s=$1}a[$1]=++j;next} {if(p!=$1){i=0;p=$1}if(a[p]){$4=a[p]}$5=++i;print }' OFS="\t" inputfile inputfile


Last edited by pravin27; 01-03-2011 at 01:35 AM..
This User Gave Thanks to pravin27 For This Post:
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

need help appending lines/combining lines within a file...

Is there a way to combine two lines onto a single line...append the following line onto the previous line? I have the following file that contains some blank lines and some lines I would like to append to the previous line... current file: checking dsk c19t2d6 checking dsk c19t2d7 ... (2 Replies)
Discussion started by: mr_manny
2 Replies

2. Shell Programming and Scripting

appending column file

Hi all, I have two files with the same number of lines the first file is a.dat and looks like 0.000 1.000 1.000 2.000 ... the fields are tab separated the second file is b.dat and looks like 1.2347 0.546 2.3564 0.321 ... the fields are tab separated I would like to have a file c.dat... (4 Replies)
Discussion started by: f_o_555
4 Replies

3. Shell Programming and Scripting

trying to make an AWK code for ordering numbers in a column from least to highest

Hi all, I have a large column of numbers like 5.6789 2.4578 9.4678 13.5673 1.6589 ..... I am trying to make an awk code so that awk can easily go through the column and arrange the numbers from least to highest like 1.6589 2.4578 5.6789 ....... can anybody suggest, how can I do... (5 Replies)
Discussion started by: ananyob
5 Replies

4. Homework & Coursework Questions

word ordering problem HELP please (linux)

Hi guys I need you ,please help me i have to do this for tomorow and i don't understand how to do Q1 : Order the words of RADIO.txt by frequency Q2 : Order the words of RADIO.txt in alphabétique order Q3 : Order the words of RADIO.txt par ordre "rhymique" (exemple, put togeder words which are... (1 Reply)
Discussion started by: Lili
1 Replies

5. Shell Programming and Scripting

Search the word to be deleted and delete lines above this word starting from P1 to P3

Hi, I have to search a word in a text file and then I have to delete lines above from the word searched . For eg suppose the file is like this: Records P1 10,23423432 ,77:1 ,234:2 P2 10,9089004 ,77:1 ,234:2 ,87:123 ,9898:2 P3 456456 P1 :123,456456546 P2 abc:324234 (2 Replies)
Discussion started by: vsachan
2 Replies

6. Shell Programming and Scripting

Re ordering lines - Awk

Is it possible to re-order certain rows as columns (of large files). Few lines from the file for reference. input Splicing Factor: Tra2beta, Motif: aaguguu, Cutoff: 0.5000 Sequence Position Genomic Coordinate K-mer Score 97 chr1:67052604 uacuguu 0.571 147... (3 Replies)
Discussion started by: quincyjones
3 Replies

7. Shell Programming and Scripting

Appending a word to the last line

Hi, I would like to append input given id at last line of file. For ex: In the following sample.txt file i would like to append the input given user id (after id6,id7) but it is adding on the next line instead same line. Sample.txt read=id1,id2,id3 write=id4,id5,id6 Thanks Raveendran (8 Replies)
Discussion started by: raveendran.l
8 Replies

8. UNIX for Dummies Questions & Answers

Search word in 3rd column and move it to next column (4th)

Hi, I have a file with +/- 13000 lines and 4 column. I need to search the 3rd column for a word that begins with "SAP-" and move/skip it to the next column (4th). Because the 3rd column need to stay empty. Thanks in advance.:) 89653 36891 OTR-60 SAP-2 89653 36892 OTR-10 SAP-2... (2 Replies)
Discussion started by: AK47
2 Replies

9. Shell Programming and Scripting

Indexing each repeating pattern of rows in a column using awk/sed

Hello All, I have data like this in a column. 0 1 2 3 0 3 4 5 6 0 1 2 3 etc. where 0 identifies the start of a pattern in my data. So I need the output like below using either awk/sed. 0 1 (2 Replies)
Discussion started by: ks_reddy
2 Replies

10. UNIX for Beginners Questions & Answers

How to search for a word in column header that fully matches the word not partially in awk?

I have a multicolumn text file with header in the first row like this The headers are stored in an array called . which contains I want to search for each elements of this array from that multicolumn text file. And I am using this awk approach for ii in ${hdr} do gawk -vcol="$ii" -F... (1 Reply)
Discussion started by: Atta
1 Replies
bup-margin(1)						      General Commands Manual						     bup-margin(1)

NAME
bup-margin - figure out your deduplication safety margin SYNOPSIS
bup margin [options...] DESCRIPTION
bup margin iterates through all objects in your bup repository, calculating the largest number of prefix bits shared between any two entries. This number, n, identifies the longest subset of SHA-1 you could use and still encounter a collision between your object ids. For example, one system that was tested had a collection of 11 million objects (70 GB), and bup margin returned 45. That means a 46-bit hash would be sufficient to avoid all collisions among that set of objects; each object in that repository could be uniquely identified by its first 46 bits. The number of bits needed seems to increase by about 1 or 2 for every doubling of the number of objects. Since SHA-1 hashes have 160 bits, that leaves 115 bits of margin. Of course, because SHA-1 hashes are essentially random, it's theoretically possible to use many more bits with far fewer objects. If you're paranoid about the possibility of SHA-1 collisions, you can monitor your repository by running bup margin occasionally to see if you're getting dangerously close to 160 bits. OPTIONS
--predict Guess the offset into each index file where a particular object will appear, and report the maximum deviation of the correct answer from the guess. This is potentially useful for tuning an interpolation search algorithm. --ignore-midx don't use .midx files, use only .idx files. This is only really useful when used with --predict. EXAMPLE
$ bup margin Reading indexes: 100.00% (1612581/1612581), done. 40 40 matching prefix bits 1.94 bits per doubling 120 bits (61.86 doublings) remaining 4.19338e+18 times larger is possible Everyone on earth could have 625878182 data sets like yours, all in one repository, and we would expect 1 object collision. $ bup margin --predict PackIdxList: using 1 index. Reading indexes: 100.00% (1612581/1612581), done. 915 of 1612581 (0.057%) SEE ALSO
bup-midx(1), bup-save(1) BUP
Part of the bup(1) suite. AUTHORS
Avery Pennarun <apenwarr@gmail.com>. Bup unknown- bup-margin(1)
All times are GMT -4. The time now is 03:27 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy