03-15-2013
Complex text parsing with speed/performance problem (awk solution?)
I have 1.6 GB (and growing) of comma-delimited files. The data I need is in the second column, lines 11 through 34 inclusive. The files also contain a lot of stray whitespace that needs to be trimmed, and they have DOS-style (CRLF) line endings.
I need to transpose lines 11 through 34 of column 2 from each data file into a single row and append it to an existing file. I also need to add several variables to the front and back of each output line, which will be parsed or calculated from the data file names and file metadata.
Input:
...,...
xxx, 9
xxx, 10
xxx, 11 <--need 11th through 34th row in col2.
...,...
xxx, 34
xxx, 35
xxx, 36
...,...
Output:
var1,var2,var3,var4,var5,var6,11,12,13,...,32,33,34,/original/directory/path/of/data/file/,original_data_file_name
Then the entire file, including the rows previously in it, needs to be sorted by several of the columns, with duplicate lines removed (some columns are excluded when deciding what counts as a duplicate).
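A minimal sketch of the sort-and-deduplicate step, under stated assumptions: the post does not say which columns are sort keys or which are ignored for duplicate detection, so the choices here (sort on columns 1 and 2, ignore the last two columns, i.e. the path and file name) are purely illustrative. The function name `sort_dedupe` is hypothetical.

```shell
#!/bin/sh
# Sketch: sort a comma-separated file, then keep only the first row for each
# distinct key. Column choices are assumptions, not taken from the post:
#   - sort keys: columns 1 and 2
#   - duplicate key: every column except the last two (path and file name)
sort_dedupe() {
    # LC_ALL=C gives plain byte-order comparison, which is usually faster
    # and behaves the same on every machine.
    LC_ALL=C sort -t, -k1,1 -k2,2 "$1" |
    awk -F, '
        {
            key = ""
            for (i = 1; i <= NF - 2; i++)
                key = key FS $i       # join the columns that define a duplicate
        }
        !seen[key]++                  # print a row only the first time its key appears
    '
}
```

The result goes to standard output, so it would need to be written to a temporary file and then moved over the original, for example `sort_dedupe combined.csv > combined.tmp && mv combined.tmp combined.csv`.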
My current pipeline, dos2unix | head | tail | cut | tr (remove whitespace) | tr (change EOL to comma) | echo (vars, stdin, vars), works but is way too slow!
I think the selection, whitespace removal, and transposition, with variables padded onto both ends of the output line, could all be done in a single awk command, which should speed things up considerably, but I am not that good at awk.
Mike
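One way the single-awk idea might look, as a sketch rather than a tested solution for the real data. It assumes the leading variables are already computed and passed in as one comma-joined string; the function name `extract_row` and the placeholder values are hypothetical. The awk program skips lines before 11, stops reading after line 34, strips spaces, tabs, and the DOS carriage return from column 2, and prints one row per file.

```shell
#!/bin/sh
# Sketch: emit "prefix,<col2 of lines 11-34>,<directory>/,<file name>" for one file.
# $1 = data file, $2 = leading variables already joined with commas (placeholder).
extract_row() {
    file=$1
    prefix=$2
    # Split the path with parameter expansion to avoid extra processes.
    case $file in
        */*) dir=${file%/*}/ ;;
        *)   dir=./ ;;
    esac
    base=${file##*/}
    awk -F, -v prefix="$prefix" -v dir="$dir" -v base="$base" '
        FNR < 11 { next }                # ignore everything before line 11
        FNR > 34 { exit }                # stop reading once past line 34
        {
            val = $2
            gsub(/[ \t\r]/, "", val)     # drop stray whitespace and the DOS CR
            row = row "," val            # transpose: append to the output row
        }
        END { if (row != "") print prefix row "," dir "," base }
    ' "$file"
}
```

A loop such as `for f in /some/dir/*.csv; do extract_row "$f" "v1,v2,v3,v4,v5,v6"; done >> combined.csv` would then append one row per data file. This replaces the six-stage pipeline with a single awk process per file, and awk stops reading each file at line 35 instead of scanning it to the end.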