Not sure I understand what Scrutinizer is aiming at, but as a slight modification of his proposal try
In addition to what Scrutinizer and RudiC have already suggested, you could also try:
which, with your input file, produces the output:
If you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk.
The first example is simple but outputs an unwanted leading space. The second and third produce the desired output: one uses the field separator to split fields and the other uses split(). Both of those then use a for loop to print the desired fields (note that with the ERE used for FS and for split(), field 1 is always an empty string).
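To make the difference between the two working approaches concrete, here is a minimal sketch. The input line and the function names by_fs and by_split are made up for illustration (the original input file is not reproduced in this post), and a POSIX awk that supports [:digit:] is assumed:

```shell
# Hypothetical input line; not the thread's real data.
line='abc12def345gh6'

# Variant 1: let awk split each record on the ERE given to -F,
# then print fields 2..NF (field 1 is the empty string before "abc").
by_fs() {
    printf '%s\n' "$line" |
    awk -F '[^[:digit:]]*' '{
        for (i = 2; i <= NF; i++)
            printf("%s%s", $i, (i == NF) ? "\n" : " ")
    }'
}

# Variant 2: keep the default FS and call split() with the same ERE.
# split() returns the number of elements it stored in part[].
by_split() {
    printf '%s\n' "$line" |
    awk '{
        n = split($0, part, /[^[:digit:]]*/)
        for (i = 2; i <= n; i++)
            printf("%s%s", part[i], (i == n) ? "\n" : " ")
    }'
}

by_fs
by_split
```

With gawk, both calls should print the digit groups separated by single spaces with no leading space. Starting the loop at 2 is what skips the empty first field.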
I hadn't noticed that this thread had been closed when I posted my last suggestion. And, since I received a private message asking how my code worked, I'm going to reopen this thread. I agree that these two threads are related, but I feel that this thread is mostly about using field delimiters other than the default sequences of one or more blank (space and tab) characters, while the other thread is mostly about deleting selected fields from input lines.
From the private message:
Quote:
Hi Don,
The code you provided:
worked great, but I was wondering if you could help me understand what this code is saying...
What does the '[^[:digit:]]*' represent?
It is an extended regular expression (aka ERE) that matches any string of zero or more characters (specified by the asterisk at the end) that are not decimal digits (specified by the bracket expression [^[:digit:]], where [:digit:] inside square brackets matches a single digit in the current locale and the circumflex as the first character in the bracket expression inverts the set of matched characters). Since this is the option-argument to the awk -F option, that ERE is used as the input field separator for lines being read by awk.
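If it helps to see what that field separator does, the following sketch prints every field in brackets, so the empty first field is visible. The function name show_fields and the sample string are assumptions for illustration only, and a POSIX awk is assumed:

```shell
# Print each field of one line, using runs of non-digits as the separator.
show_fields() {
    printf '%s\n' "$1" |
    awk -F '[^[:digit:]]*' '{
        for (i = 1; i <= NF; i++)
            printf("field %d = [%s]\n", i, $i)
    }'
}

# Hypothetical sample: letters and digits alternating, ending in a digit.
show_fields 'abc12def345gh6'
```

Because the sample starts with non-digit characters, the separator matches at the very start of the line, so field 1 is empty and the digit groups land in fields 2 through NF.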
Quote:
and why are you adding two strings "%s%s"
The 1st string printed is the data in the field. The 2nd string printed is the field separator or the line terminator.
Quote:
lastly, what purpose does the semicolon ":" serve at the end of the code?
That is a colon, not a semicolon. In awk (as in C and C++) the expression:
    logical_expression ? true_result : false_result
evaluates to true_result if the logical_expression evaluates to true and evaluates to false_result otherwise. In this case:
returns a <newline> character to be printed as the line terminator if i is the number of the last field on the input line; otherwise it returns a <space> character to be printed as a field separator in the current output line.
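As a small standalone illustration of the "%s%s" format combined with the conditional expression, here is a sketch that uses the default field separator. The function name join_fields and the sample input are made up for this example:

```shell
# Re-print the fields of one line, separated by single spaces.
join_fields() {
    printf '%s\n' "$1" |
    awk '{
        for (i = 1; i <= NF; i++)
            # 1st %s: the field itself.
            # 2nd %s: a newline after the last field, a space otherwise.
            printf("%s%s", $i, (i == NF) ? "\n" : " ")
    }'
}

# Hypothetical input with irregular spacing between fields.
join_fields 'a   b    c'
```

The test i == NF is true only on the last pass through the loop, so each output line gets exactly one terminating newline and no trailing space.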
Quote:
Sorry for all these questions, I just want to know exactly how it works instead of just copying and pasting into my script.
Never apologize for asking questions. We want you to learn how this stuff works.
Quote:
Thanks again!
Rabu
I hope this helps. Let us know if it is still not clear.