Uppercase/lowercase comparison of one character per line with awk?
Shell Programming and Scripting. Post 302385023 by ivpz on Thursday 7th of January 2010 01:32:08 AM

Another frustrating scripting problem from a biologist trying to manipulate a file with several million lines. For each line I need to compare the count of an uppercase A, C, G or T with the count of the corresponding lowercase a, c, g or t. If there are more uppercase than lowercase characters, a + should be added in a new column; otherwise a - is added. Many of the lines are duplicated, triplicated, and so on. This allows only one character to be compared per line, in the order A, C, G, T. To make it even more complicated, the comparison on the last line of each set of repeated lines should be between . and , instead: if there are more . than , a + should be added, otherwise a -.

Below are some examples of my data. The numeric columns are the counts of uppercase A, C, G, T followed by the counts of lowercase a, c, g, t.
Code:
.......GGGG,.G,,G...G.,.T...G.,..,.,,^F, 0 0 8 1 0 0 0 0
.......GGGG,.G,,G...G.,.T...G.,..,.,,^F, 0 0 8 1 0 0 0 0
.......GGGG,.G,,G...G.,.T...G.,..,.,,^F, 0 0 8 1 0 0 0 0
,gc,,cga,g,c,,,,,,, 0 0 0 0 1 3 3 0
,gc,,cga,g,c,,,,,,, 0 0 0 0 1 3 3 0
,gc,,cga,g,c,,,,,,, 0 0 0 0 1 3 3 0
,gc,,cga,g,c,,,,,,, 0 0 0 0 1 3 3 0
.....,,..,,...,,......,...cA.c,cC. 1 1 0 0 0 3 0 0
.....,,..,,...,,......,...cA.c,cC. 1 1 0 0 0 3 0 0
.....,,..,,...,,......,...cA.c,cC. 1 1 0 0 0 3 0 0

And this is what I'd like to get:
Code:
.......GGGG,.G,,G...G.,.T...G.,..,.,,^F, 0 0 8 1 0 0 0 0  +
.......GGGG,.G,,G...G.,.T...G.,..,.,,^F, 0 0 8 1 0 0 0 0  +
.......GGGG,.G,,G...G.,.T...G.,..,.,,^F, 0 0 8 1 0 0 0 0  +
,gc,,cga,g,c,,,,,,, 0 0 0 0 1 3 3 0  -
,gc,,cga,g,c,,,,,,, 0 0 0 0 1 3 3 0  -
,gc,,cga,g,c,,,,,,, 0 0 0 0 1 3 3 0  -
,gc,,cga,g,c,,,,,,, 0 0 0 0 1 3 3 0  -
.....,,..,,...,,......,...cA.c,cC. 1 1 0 0 0 3 0 0  +
.....,,..,,...,,......,...cA.c,cC. 1 1 0 0 0 3 0 0  -
.....,,..,,...,,......,...cA.c,cC. 1 1 0 0 0 3 0 0  +

I've tried awk with if conditions, but I guess my attempt is too simple. Any suggestions or help would be very much appreciated!

Last edited by Scott; 01-07-2010 at 02:38 AM.. Reason: Added code tags
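
One possible approach, offered as a sketch rather than a solution tested on the full data set: treat each run of identical lines as a group, step through the bases that actually occur (uppercase or lowercase count above zero) in the order A, C, G, T, and use the final line of the group for the . versus , comparison. The function name annotate is made up for illustration. The column order A C G T a c g t and the rule that each group has (number of bases present) + 1 lines are assumptions inferred from the sample data. Ties are reported as -, following the "otherwise" wording in the question.

```shell
#!/bin/sh
# Sketch: append "+" or "-" to every line, one comparison per repeated line.
# Assumptions (inferred from the sample data, not confirmed by the poster):
#   - field 1 is the sequence string and contains no whitespace
#   - fields 2-9 are the counts in the order A C G T a c g t
#   - each record is repeated (number of bases present) + 1 times
annotate() {
    awk '
    {
        # Start a new group on the first line, when the line changes,
        # or when the previous group has used up all of its lines.
        if (NR == 1 || $0 != prev || pos > n + 1) {
            prev = $0
            pos = 1
            n = 0
            # Column i holds an uppercase count, column i + 4 the lowercase one.
            for (i = 2; i <= 5; i++)
                if ($i + $(i + 4) > 0)
                    col[++n] = i
        }
        if (pos <= n) {
            # Compare uppercase and lowercase counts of the pos-th base present.
            sign = ($(col[pos]) > $(col[pos] + 4)) ? "+" : "-"
        } else {
            # Last line of the group: compare the number of "." and ",".
            s = $1
            dots = gsub(/\./, "", s)    # gsub returns the number of matches
            commas = gsub(/,/, "", s)
            sign = (dots > commas) ? "+" : "-"
        }
        pos++
        print $0 "  " sign
    }' "$@"
}
```

With the function defined, something like annotate input.txt > output.txt (file names are placeholders) would process the file in a single pass. Only the current line and a few counters are kept in memory, so the size of the file should not be a problem. If real groups do not always follow the assumed length, the position logic would need to be adapted.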
 

Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.