need to remove invariant characters


 
# 8  
08-22-2012
Try this updated solution:

Code:
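gawk '
# Per input line: $1 is the row key, $2 is a string of two-character sites.
{
    key[NR] = $1
    # Split $2 into two-character sites, indexed by start position.
    # A trailing unpaired character is ignored.
    for (i = 1; i < length($2); i += 2) {
        site[i, NR] = substr($2, i, 2)
        if (i > maxi) maxi = i
    }
}
END {
    c = 0
    for (i = 1; i <= maxi; i += 2) {
        v = ""          # reference value: first site in this column with no "-"
        mismatch = 0    # rows seen after v was set whose site differs from v
        for (r = 1; r <= NR; r++) {
            if (v == "" && !(site[i, r] ~ "-"))
                v = site[i, r]
            else if (length(v) && site[i, r] != v) {
                # Keep the column once a second row differs from v.
                if (++mismatch > 1) {
                    keep[++c] = i
                    break
                }
            }
        }
    }
    # Print each key followed by only the kept sites.
    for (r = 1; r <= NR; r++) {
        printf "%s ", key[r]
        for (i = 1; i <= c; i++) printf "%s", site[keep[i], r]
        printf "\n"
    }
}' infile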
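
If you want to sanity-check it before running it on your real file, here is a small sketch. The sample data is made up (it is not from the original poster's file), the function name filter_sites is hypothetical, and I am assuming plain awk behaves the same as gawk here since the script does not appear to use any gawk-only features. Swap in gawk if you prefer.

```shell
# Made-up sample data: a key, then a string of two-character sites.
# "--" stands for a missing site.
sample=$(mktemp)
cat > "$sample" <<'EOF'
s1 --CCGGTTAA
s2 AACTGGTAAA
s3 AACTGATCAA
s4 AACCGGTTAA
EOF

# Same logic as the script above, wrapped in a function (hypothetical name)
# so it can be run against any file.
filter_sites() {
    awk '
    {
        key[NR] = $1
        for (i = 1; i < length($2); i += 2) {
            site[i, NR] = substr($2, i, 2)
            if (i > maxi) maxi = i
        }
    }
    END {
        c = 0
        for (i = 1; i <= maxi; i += 2) {
            v = ""
            mismatch = 0
            for (r = 1; r <= NR; r++) {
                if (v == "" && !(site[i, r] ~ "-"))
                    v = site[i, r]
                else if (length(v) && site[i, r] != v) {
                    # A column is kept once two rows differ from v.
                    if (++mismatch > 1) {
                        keep[++c] = i
                        break
                    }
                }
            }
        }
        for (r = 1; r <= NR; r++) {
            printf "%s ", key[r]
            for (i = 1; i <= c; i++) printf "%s", site[keep[i], r]
            printf "\n"
        }
    }' "$1"
}

filter_sites "$sample"
```

With this sample only the sites starting at positions 3 and 7 are kept, so the rows come out as s1 CCTT, s2 CTTA, s3 CTTC and s4 CCTT. The site at position 5 is dropped even though s3 has GA there, because only one row differs from the reference value. If you want a single differing row to count as variant, the `++mismatch > 1` test is the place to change.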
