Post 302817471 by Lucky Ali on Thursday 6th of June 2013 12:38:47 AM
Getting the non-homogeneous letter rows from a text file

I have a large tab-delimited file in the following format:

Code:
CCCCCGCCCCCCCCCCcCCCCCCCCCCCCCCCC 23 65 3 4
AAAAAAAAAAAAAAAAaAAAAAAAAAAAAAAAA 24 6 89 90
TGTTTTTTTTTTTTGGtTTTTTTTTTTTTTTTT 2 4 8 90
TTTT-TTTTTTTTTTTtTTTTTTTTTTTTTTTT 1 34 89 50
GGGGGGGGGGGGGGGGTGGGGGGGGGGGGGGGG 87 6 78 66
TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT 8 78 45 61
AAAAATAAAAAAGGGAAAAAAAAAAAAAAAAAA 78 8 9 23

Each row/line has 33 letters, and each line contains multiple occurrences of letters from the pool ATGC (lowercase atgc also appears). Some lines may also contain '-'. I would like to extract the lines (rows) whose letters are non-homogeneous: if one or more letters differ from the rest, grab that entire row.

This is the desired output:

Code:
CCCCCGCCCCCCCCCCcCCCCCCCCCCCCCCCC 23 65 3 4
TGTTTTTTTTTTTTGGtTTTTTTTTTTTTTTTT 2 4 8 90
GGGGGGGGGGGGGGGGTGGGGGGGGGGGGGGGG 87 6 78 66
AAAAATAAAAAAGGGAAAAAAAAAAAAAAAAAA 78 8 9 23

Please let me know the best way to do this in awk.
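
One possible approach is sketched below. It rests on assumptions read off the desired output rather than stated in the post: upper- and lowercase forms of a letter count as the same letter, '-' is ignored, and only the first field (the 33-letter string) is examined. The function name `non_homogeneous` is made up for illustration.

```shell
# Hypothetical helper: print rows whose first field holds more than one
# distinct letter from A, C, G, T (case-insensitive; "-" is ignored).
non_homogeneous() {
    awk '{
        s = toupper($1)   # fold a/A, c/C, g/G, t/T together
        # each match expression yields 1 or 0, so n counts the distinct letters present
        n = (s ~ /A/) + (s ~ /C/) + (s ~ /G/) + (s ~ /T/)
        if (n > 1) print  # two or more distinct letters: keep the whole row
    }' "$@"
}
```

It could be run as `non_homogeneous input.txt > output.txt`, where both file names are placeholders. The default awk field separator splits on tabs as well as spaces, so no `-F` option should be needed for a tab-delimited file.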
 
