Post 302385023 by ivpz on Thursday 7th of January 2010 01:32:08 AM
Uppercase/lowercase comparison of one character per line with awk??

Another frustrating scripting problem from a biologist trying to manipulate a file with several million lines. For each line I need to compare the count of an uppercase base (A, C, G or T) with the count of the matching lowercase base (a, c, g or t). If there are more uppercase than lowercase, a + should be added in a new column; otherwise a - is added. Many of the lines are duplicated, triplicated, etc. This is to allow the comparison of only one character per line, in the order A, C, G, T. To make it even more complicated, on the last line of each set of repeated lines the comparison should be between . and , instead: if there are more . than , a + should be added, otherwise a -.

Below are some examples of my data. The eight numeric columns are the counts of uppercase A, C, G, T followed by the counts of lowercase a, c, g, t.
Code:
.......GGGG,.G,,G...G.,.T...G.,..,.,,^F, 0 0 8 1 0 0 0 0
.......GGGG,.G,,G...G.,.T...G.,..,.,,^F, 0 0 8 1 0 0 0 0
.......GGGG,.G,,G...G.,.T...G.,..,.,,^F, 0 0 8 1 0 0 0 0
,gc,,cga,g,c,,,,,,, 0 0 0 0 1 3 3 0
,gc,,cga,g,c,,,,,,, 0 0 0 0 1 3 3 0
,gc,,cga,g,c,,,,,,, 0 0 0 0 1 3 3 0
,gc,,cga,g,c,,,,,,, 0 0 0 0 1 3 3 0
.....,,..,,...,,......,...cA.c,cC. 1 1 0 0 0 3 0 0
.....,,..,,...,,......,...cA.c,cC. 1 1 0 0 0 3 0 0
.....,,..,,...,,......,...cA.c,cC. 1 1 0 0 0 3 0 0
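
As a sanity check on the column layout, the eight counts can be recomputed from the first field. This is only a sketch in Python rather than awk, and `count_bases` is a hypothetical helper name. It counts characters naively and makes no attempt to interpret markers such as `^F`.

```python
def count_bases(seq):
    """Return counts of A, C, G, T followed by a, c, g, t in seq."""
    return [seq.count(ch) for ch in "ACGTacgt"]


# The three distinct sequences from the sample data above.
seqs = [
    ".......GGGG,.G,,G...G.,.T...G.,..,.,,^F,",
    ",gc,,cga,g,c,,,,,,,",
    ".....,,..,,...,,......,...cA.c,cC.",
]
for s in seqs:
    print(s, *count_bases(s))
```

For these three sequences the recomputed counts agree with the numeric columns shown in the data.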

And this is what I'd like to get:
Code:
.......GGGG,.G,,G...G.,.T...G.,..,.,,^F, 0 0 8 1 0 0 0 0  +
.......GGGG,.G,,G...G.,.T...G.,..,.,,^F, 0 0 8 1 0 0 0 0  +
.......GGGG,.G,,G...G.,.T...G.,..,.,,^F, 0 0 8 1 0 0 0 0  +
,gc,,cga,g,c,,,,,,, 0 0 0 0 1 3 3 0  -
,gc,,cga,g,c,,,,,,, 0 0 0 0 1 3 3 0  -
,gc,,cga,g,c,,,,,,, 0 0 0 0 1 3 3 0  -
,gc,,cga,g,c,,,,,,, 0 0 0 0 1 3 3 0  -
.....,,..,,...,,......,...cA.c,cC. 1 1 0 0 0 3 0 0  +
.....,,..,,...,,......,...cA.c,cC. 1 1 0 0 0 3 0 0  -
.....,,..,,...,,......,...cA.c,cC. 1 1 0 0 0 3 0 0  +

I've tried awk with if conditions but I guess it is too simple. Any suggestions or help will be very much appreciated!

Last edited by Scott; 01-07-2010 at 02:38 AM.. Reason: Added code tags
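
One possible reading of the rule, sketched in Python rather than awk (the same logic should carry over to an awk script that tracks the previous line). It rests on assumptions inferred from the sample output, not stated in the post: identical lines are adjacent; the k-th copy is compared on the k-th base, in A, C, G, T order, that has a non-zero count in either case; the last copy of each run is compared on . versus , instead. The names `label_group` and `annotate` are hypothetical, and the `?` marker for a run with more copies than bases is an invented fallback.

```python
from itertools import groupby


def label_group(line, copies):
    """Return one sign per copy of `line` in a run of identical lines."""
    fields = line.split()
    seq = fields[0]
    counts = [int(x) for x in fields[1:9]]
    upper, lower = counts[:4], counts[4:]
    # Indices of bases seen in either case, in A, C, G, T order (assumption).
    present = [i for i in range(4) if upper[i] + lower[i] > 0]
    signs = []
    for k in range(copies):
        if k == copies - 1:
            # Last copy of the run: compare dots against commas.
            signs.append("+" if seq.count(".") > seq.count(",") else "-")
        elif k < len(present):
            i = present[k]
            signs.append("+" if upper[i] > lower[i] else "-")
        else:
            # More copies than bases: not covered by the post (assumption).
            signs.append("?")
    return signs


def annotate(lines):
    """Yield each input line with its sign appended; streams run by run."""
    for line, run in groupby(lines):
        copies = sum(1 for _ in run)
        for sign in label_group(line, copies):
            yield f"{line}  {sign}"


sample = (
    [".......GGGG,.G,,G...G.,.T...G.,..,.,,^F, 0 0 8 1 0 0 0 0"] * 3
    + [",gc,,cga,g,c,,,,,,, 0 0 0 0 1 3 3 0"] * 4
    + [".....,,..,,...,,......,...cA.c,cC. 1 1 0 0 0 3 0 0"] * 3
)
result = list(annotate(sample))
print("\n".join(result))
```

On the ten sample lines this reproduces the + and - column from the desired output. Because `annotate` is a generator and `groupby` only holds one run at a time, it can be fed a file object (with newlines stripped) without loading several million lines into memory. It assumes every line has nine whitespace-separated fields.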
 
