You can't expect characters that are used to split a string to be part of the result. If you split "1,2,3,4" on the comma, by definition the comma is not an allowed member of a field. Same goes with a bracket expression such as "[ACGT]"; splitting on such an expression forbids A, C, G, and T from occurring in a field.
Assuming I understood what you were trying to do, the semicolons in your bracket expressions are incorrect. Characters in a bracket expression should not be delimited. To split on the four letters "A", "C", "G", and "T", "[ACGT]" is all that's needed. Adding those semicolons will cause splitting on semicolons as well.
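To make the difference concrete, here is a minimal sketch that splits the same string with both bracket expressions. The helper name split_fields and the sample string are made up for illustration; they are not taken from the original post.

```shell
# split_fields is a hypothetical helper: it splits a string on a regular
# expression with awk's split() and prints the pieces joined by " | ".
split_fields() {
    # $1 = separator regular expression, $2 = string to split
    awk -v re="$1" -v s="$2" 'BEGIN {
        n = split(s, part, re)
        for (i = 1; i <= n; i++)
            printf "%s%s", (i > 1 ? " | " : ""), part[i]
        printf "\n"
    }'
}

# Only the four letters act as separators; the semicolon stays in a field.
split_fields '[ACGT]' '12A34;56T78'      # prints: 12 | 34;56 | 78

# With semicolons inside the brackets, ";" becomes a separator as well.
split_fields '[A;C;G;T]' '12A34;56T78'   # prints: 12 | 34 | 56 | 78
```

In both cases the separator characters themselves never show up in a field, which is the point made above.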
Looking at your data:
If you just want to print the highlighted base sequence, and if it's always preceded by the final number in the line, the following will do:
Or if the base sequence always begins at the 4th character past the final underscore:
Regards,
Alister
I have gone through all the threads in the forum and tested out different things. I am trying to split a 3GB file into multiple files. Some files are even larger than this.
For example:
split -l 3000000 filename.txt
This is very slow and it splits the file with 3 million records in each... (10 Replies)
Hi,
I have some output in the form of:
#output:
abc123
def567
hij890
ghi324
The above is in one column, stored in the variable x (and if you want to know about x: x=sprintf(tolower(substr(someArray,1,1)substr(userArray,3,1)substr(userArray,2,1))))
When I simply print x (print x) I get... (7 Replies)
I searched this forum a lot on splitting files and found plenty, but my requirement is a bit different; please guide.
Master file:
x:start:5
line1:23
line2:12
2:90
x:end:5
x:start:2
45:56
22:90
x:end:2
x:start:3
line1:23
line2:12
x:end:3
x:start:2
line5:23 (1 Reply)
Hello Friends,
I'm trying to split a string. When I use the first awk method, as below, I get an error:
method1 (I specified the FS as ":" so is this wrong?)
servert1{root}>awk -f split.txt
awk: syntax error near line 2
awk: bailing out near line 2
split.txt:... (5 Replies)
Dear colleagues! I want to create a script which will take each file from the list and then parse its filename with awk/split. I do it this way:
for file in `cat /$FileListFN`; do
echo `awk '
{N=split(FILENAME,FNParts,"_")}
{for (i=1; i<=N; i++)
... (10 Replies)
Hello;
I have a file consisting of 4 columns separated by tabs. The problem is the third field. Some of them are very long but can be split on the vertical bar "|". Also, some of them do not contain the string "UniProt", but I can ignore that for the moment and sort the file afterwards. Here is... (5 Replies)
I would like to split a string of numbers "1-2,4-13,16,19-20,21-25,31-32" and output these with awk into
-dFirstPage=1 -dLastPage=2 file.pdf -dFirstPage=4 -dLastPage=13 file.pdf -dFirstPage=16 -dLastPage=16 file.pdf file.pdf -dFirstPage=19 -dLastPage=20 file.pdf -dFirstPage=21 -dLastPage=25... (3 Replies)
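One possible approach to the page-range question is two nested split() calls: first on the comma, then on the hyphen. This is only a sketch; the helper name page_args is hypothetical, and it assumes each range should produce one -dFirstPage/-dLastPage pair followed by the file name, with a single page such as "16" used as both first and last page.

```shell
# page_args is a hypothetical helper name for this sketch.
page_args() {
    # $1 = comma-separated page ranges, $2 = file name
    awk -v ranges="$1" -v file="$2" 'BEGIN {
        n = split(ranges, r, ",")             # r[1]="1-2", r[2]="4-13", ...
        out = ""
        for (i = 1; i <= n; i++) {
            m = split(r[i], p, "-")           # p[1]=first page, p[2]=last page
            lo = p[1]
            hi = (m > 1 ? p[2] : p[1])        # single page: first == last
            out = out (i > 1 ? " " : "") "-dFirstPage=" lo " -dLastPage=" hi " " file
        }
        print out
    }'
}

page_args '1-2,16' file.pdf
# prints: -dFirstPage=1 -dLastPage=2 file.pdf -dFirstPage=16 -dLastPage=16 file.pdf
```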
Hello,
I have the following input file:
A=1;B=2;C=3;D=4
A=4;B=6;C=7;D=9
I wish to have the following output
1 2 3 4
4 6 7 9
Can awk split be used to do this?
I have done this without using split, but the process is quite tedious.
Any help is appreciated! (4 Replies)
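Yes, split can do this. A minimal sketch, assuming every line has the NAME=VALUE;NAME=VALUE layout shown in the sample input (the helper name kv_values is made up for illustration):

```shell
# kv_values is a hypothetical helper name for this sketch.
kv_values() {
    awk '{
        n = split($0, pair, ";")          # pair[1]="A=1", pair[2]="B=2", ...
        line = ""
        for (i = 1; i <= n; i++) {
            split(pair[i], kv, "=")       # kv[1]=name, kv[2]=value
            line = line (i > 1 ? " " : "") kv[2]
        }
        print line
    }'
}

printf 'A=1;B=2;C=3;D=4\nA=4;B=6;C=7;D=9\n' | kv_values
# prints:
# 1 2 3 4
# 4 6 7 9
```

A single-character separator string such as ";" or "=" is taken literally by split, so no bracket expression is needed here.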
I am trying to run the awk below. When I split the input and then run another awk that performs a calculation using that split output as its input, there are no issues. When I try to combine them, the output is not correct. Is the split not working, or did I do it wrong? Thank you :).
input
... (8 Replies)
Discussion started by: cmccabe
plan9-split
SPLIT(1) General Commands Manual SPLIT(1)
NAME
split - split a file into pieces
SYNOPSIS
split [ option ... ] [ file ]
DESCRIPTION
Split reads file (standard input by default) and writes it in pieces of 1000 lines per output file. The names of the output files are xaa,
xab, and so on to xzz. The options are
-n n Split into n-line pieces.
-l n Synonym for -n n, a nod to Unix's syntax.
-e expression
File divisions occur at each line that matches a regular expression; see regexp(7). Multiple -e options may appear. If a
subexpression of expression is contained in parentheses (...), the output file name is the portion of the line which matches the
subexpression.
-f stem
Use stem instead of x in output file names.
-s suffix
Append suffix to names identified under -e.
-x Exclude the matched input line from the output file.
-i Ignore case in option -e; force output file names (excluding the suffix) to lower case.
SOURCE
/src/cmd/split.c
SEE ALSO
sed(1), awk(1), grep(1), regexp(7)