I have a text file that contains 4 million lines. Each line contains 2 fields, with a colon as the field separator, as shown:
Here I have to split the second field (which can hold up to 40,000 comma-separated values) into an array for analysis, but I find the "split" function too slow.
I tried to find an alternative to the split function. I think I found one, but there is still something I cannot achieve without your help. Now I have this code:
This code splits the second field fast enough, but I don't know how to store the split data in an array in the *first* AWK program shown above.
How can I solve this problem? Or do you have other alternatives to the split function?
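One possible alternative, sketched under assumptions because the sample data is not shown above: if every line looks like key:v1,v2,...,vN, awk can be told to treat both the colon and the comma as field separators. The values then arrive as $2 through $NF without any call to split. The name sum_values and the numeric sample values are made up for illustration. Whether this is actually faster than split depends on the awk implementation, so it is worth timing on the real file.

```shell
# Hypothetical helper: sum the comma-separated values on each line.
# Assumes lines of the form key:v1,v2,...,vN with no ':' or ',' inside the key.
sum_values() {
    awk -F'[:,]' '{
        total = 0
        # $1 is the key; $2..$NF are the values, already split by awk itself
        for (i = 2; i <= NF; i++) total += $i
        print $1, total
    }' "$@"
}

# Example with made-up data
printf 'a:1,2,3\nb:10,20\n' | sum_values
```

The trade-off is that the second field no longer exists as a single string, so this only fits when the analysis needs the individual values and not the original field.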
Hi all!
I am relatively new to UNIX stuff, and I have come across a problem:
I have a big directory, which contains 100 smaller ones. Each of the 100 contains a file ending in .txt, so there are 100 files ending in .txt.
I want to split each of the 100 files into smaller ones, which will contain... (4 Replies)
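A rough sketch of how such a loop could look. The post is cut off before it says what each piece should contain, so splitting by a fixed number of lines is purely an assumption here, and the function name split_all_txt and the .part_ suffix are made up.

```shell
# Hypothetical helper: split every */*.txt under a top directory into pieces.
# $1 = top directory, $2 = lines per piece (an assumed criterion).
split_all_txt() {
    for f in "$1"/*/*.txt; do
        # Skip the unexpanded pattern when nothing matches
        [ -f "$f" ] || continue
        # Pieces are written next to the original: file.txt.part_aa, file.txt.part_ab, ...
        split -l "$2" "$f" "$f.part_"
    done
}

# Usage (paths are placeholders):
# split_all_txt /path/to/bigdir 1000
```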
$mystring = "name:blk:house::";
print "$mystring\n";
@s_format = split(/:/, $mystring);
for ($i=0; $i <= $#s_format; $i++) {
print "index is $i,field is $s_format[$i]";
print "\n";
}
$size = $#s_format + 1;
print "total size of array is $size\n";
I am expecting my size to be 5, why is it... (5 Replies)
I have gone through all the threads in the forum and tested out different things. I am trying to split a 3GB file into multiple files. Some files are even larger than this.
For example:
split -l 3000000 filename.txt
This is very slow and it splits the file with 3 million records in each... (10 Replies)
Hi,
I have some output in the form of:
#output:
abc123
def567
hij890
ghi324
The above is in one column, stored in the variable x (in case you want to know about x: x=sprintf(tolower(substr(someArray,1,1)substr(userArray,3,1)substr(userArray,2,1))))
When I simply print x (print x) I get... (7 Replies)
Hi... I have a question regarding the split function in Perl.
I have a huge CSV file (more than 80 million records). I need to extract a particular field (e.g. the 50th) from each line of the CSV file. I tried using the split function, but I realized split takes a very long time.
Also... (1 Reply)
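When only one field is needed, a tool that does not build an array for every line may be enough. Here is a minimal sketch using cut, assuming a plain CSV with no quoted or embedded commas. nth_field is a made-up name and the sample data is invented.

```shell
# Hypothetical helper: print field number $1 of each comma-separated line on stdin.
# Assumes no quoted fields containing commas.
nth_field() {
    cut -d, -f"$1"
}

# Example with made-up data; for the case in the post the call would be: nth_field 50 < file.csv
printf 'a,b,c\nd,e,f\n' | nth_field 2
```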
my @d =split('\|', $_);
west|ACH|3|Y|LuV|N||N||
Qt|UWST|57|Y|LSV|Y|Bng|N|KT|
It returns 8 elements in @d for the first line and 9 for the second line. I want to process both lines; how do I handle this? (3 Replies)
Hello;
I have a file consisting of 4 columns separated by tabs. The problem is the third field. Some of them are very long but can be split by the vertical bar "|". Also, some of them do not contain the string "UniProt", but I could ignore that for the moment and sort the file afterwards. Here is... (5 Replies)
Discussion started by: yifangt
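The post above is cut off before it shows the data or the desired output, so the following is only one possible reading: print one output row per "|"-separated piece of the third column and keep the other three columns unchanged. The name explode_third and the sample row are invented.

```shell
# Hypothetical helper: expand the third tab-separated column on "|".
# Assumes exactly 4 tab-separated columns per line.
explode_third() {
    awk -F'\t' 'BEGIN { OFS = "\t" } {
        # split returns the number of pieces and fills parts[1..n]
        n = split($3, parts, "|")
        for (i = 1; i <= n; i++) print $1, $2, parts[i], $4
    }' "$@"
}

# Example with made-up data
printf 'g1\tchr1\tA|B\tx\n' | explode_third
```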
trace-cmd-split
TRACE-CMD-SPLIT(1)
NAME
trace-cmd-split - split a trace.dat file into smaller files
SYNOPSIS
trace-cmd split [OPTIONS] [start-time [end-time]]
DESCRIPTION
The trace-cmd(1) split command is used to break up a trace.dat file into smaller files. The start-time specifies where the new file will start.
A time stamp copied from a particular event in the output of trace-cmd-report(1) can be used as input for either start-time or end-time. The
split will stop creating files when it reaches an event after end-time. If only the end-time is needed, use 0.0 as the start-time.
If start-time is left out, then the split will start at the beginning of the file. If end-time is left out, then split will continue to the
end unless it meets one of the requirements specified by the options.
OPTIONS
-i file
If this option is not specified, then the split command will look for the file named trace.dat. This option allows reading a
file other than trace.dat.
-o file
By default, the split command will use the input file name as a basis of where to write the split files. The output file will be the
input file with an attached '.#' to the end: trace.dat.1, trace.dat.2, etc.
This option will change the name of the base file used.
-o file will create file.1, file.2, etc.
-s seconds
This specifies how many seconds should be recorded before the new file should stop.
-m milliseconds
This specifies how many milliseconds should be recorded before the new file should stop.
-u microseconds
This specifies how many microseconds should be recorded before the new file should stop.
-e events
This specifies how many events should be recorded before the new file should stop.
-p pages
This specifies the number of pages that should be recorded before the new file should stop.
Note: only one of -p, -e, -u, -m, -s may be specified at a time.
If -p is specified, then -c is automatically set.
-r
This option causes the break up to repeat until end-time is reached (or end of the input if end-time is not specified).
trace-cmd split -r -e 10000
This will break up trace.dat into several smaller files, each with at most
10,000 events in it.
-c
This option causes the above break up to be per CPU.
trace-cmd split -c -p 10
This will create a file that has 10 pages for each CPU from the input.
SEE ALSO
trace-cmd(1), trace-cmd-record(1), trace-cmd-report(1), trace-cmd-start(1), trace-cmd-stop(1), trace-cmd-extract(1), trace-cmd-reset(1),
trace-cmd-list(1), trace-cmd-listen(1)
AUTHOR
Written by Steven Rostedt, <rostedt@goodmis.org[1]>
RESOURCES
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/trace-cmd.git
COPYING
Copyright (C) 2010 Red Hat, Inc. Free use of this software is granted under the terms of the GNU Public License (GPL).
NOTES
1. rostedt@goodmis.org
mailto:rostedt@goodmis.org
06/11/2014 TRACE-CMD-SPLIT(1)