06-14-2013
If you can post some sample data, it would help us work out the actual code.
10 More Discussions You Might Find Interesting
1. Shell Programming and Scripting
How to cut data from big file
My file is around 30 GB.
I tried "head -50022172 filename > newfile.txt" and then "tail -5454283 newfile.txt". It's slow.
After that I tried sed -n '46467831,50022172p' filename > newfile.txt, which was also slow.
Please recommend a faster command to cut some data from... (4 Replies)
Discussion started by: almanto
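One likely reason the sed attempt above is slow is that sed keeps reading to the end of the 30 GB file after it has printed the last wanted line. A minimal sketch, assuming a standard sed; the function name `extract_range` is made up for illustration and is not from the thread:

```shell
# Print lines START..END and quit at END, so the rest of the input is never read.
# extract_range is a hypothetical helper name; any extra arguments are input files.
extract_range() {
    start=$1
    end=$2
    shift 2
    # "p" prints the lines in the range; "q" on the END line stops sed right there.
    sed -n "${start},${end}p;${end}q" "$@"
}

# Small demo on generated input: prints lines 10, 11 and 12.
seq 1 100 | extract_range 10 12
```

For the file in the post this would be called as `extract_range 46467831 50022172 filename > newfile.txt`. It still has to read the first 50 million lines, but nothing after them.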
2. Shell Programming and Scripting
My input file:
data_5 Ali 422 2.00E-45 102/253 140/253 24
data_3 Abu 202 60.00E-45 12/23 140/23 28
data_1 Ahmad 256 7.00E-45 120/235 140/235 22
data_4 Aman 365 8.00E-45 15/65 140/65 20
data_10 Jones 869 9.00E-45 65/253 140/253 18... (12 Replies)
Discussion started by: patrick87
3. Shell Programming and Scripting
Hi,
I read a few posts on the subject and tried out a few solutions, but none of them solved my problem.
https://www.unix.com/302121568-post11.html
https://www.unix.com/shell-programming-scripting/137953-large-file-columns-into-rows-etc-4.html
Please help. Problem very similar to the second link... (15 Replies)
Discussion started by: genehunter
4. Shell Programming and Scripting
Hello,
I have a big data file (160 MB) full of records with pipe (|) delimited fields. I'm sorting the file on the first field.
I'm trying to sort it with the "sort" command and it takes 6 minutes.
I have tried some transformation methods in Perl, but they result in "Out of memory". I was... (2 Replies)
Discussion started by: rubber08
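For a pipe-delimited file sorted on its first field, one thing worth trying is restricting the sort key to that field and forcing the C locale, since plain byte-order comparison is often much cheaper than locale-aware collation. A sketch under those assumptions; `sort_by_first_field` is a made-up name, and the resulting order is byte order, which may differ from what your locale would give:

```shell
# Sort pipe-delimited records on field 1 only, comparing raw bytes.
# sort_by_first_field is a hypothetical helper; arguments are input files (stdin if none).
sort_by_first_field() {
    # -t '|' sets the field separator, -k1,1 limits the key to the first field.
    LC_ALL=C sort -t '|' -k1,1 "$@"
}

# Small demo on inline data.
printf '%s\n' 'b|2|x' 'a|9|y' 'c|1|z' | sort_by_first_field
```

Whether this brings the 6 minutes down noticeably depends on the data and the machine, so it is worth timing both variants on a sample of the file first.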
5. Red Hat
Hey guys, we would be interested in learning from your experience in using Linux in Big Data projects. Has anyone used Hadoop, MapR or Hortonworks on Linux, and what experiences have you had with these? I am more interested in knowing if a certain distribution of Linux is better supported for... (1 Reply)
Discussion started by: johnsmith111
6. Shell Programming and Scripting
Hi all
I have a big file which I have attached here.
I have to fetch certain entries and arrange them in 5 columns:
Name Drug DAP ID disease approved or not
In the attached file, data is arranged in tab-separated columns in this way:
and other data is... (2 Replies)
Discussion started by: manigrover
7. What is on Your Mind?
Hello,
I have been working as a Solaris/Linux admin for the past 8 years. I am looking at options for a profile change, but there are some limitations. I have worked in 24x7 support for administration, server support, high availability, etc. But been worked on developing side and scripting part.
When I search for Big... (2 Replies)
Discussion started by: nightup2222
8. Shell Programming and Scripting
Hi all, I'm pretty much a newbie to UNIX. I would appreciate any help with UNIX code for comparing two large CSV files (greater than 10 GB in size) and outputting a file with matching columns.
I want to compare file1 and file2 by 'id' and 'chain' columns, then extract exact matching rows'... (5 Replies)
Discussion started by: bkane3
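The post is cut off, so the exact layout of the two files is unknown. The following is only a sketch under loud assumptions: both files are plain comma-separated with no quoted commas, each has a header row, and the key columns are literally named `id` and `chain`. The name `match_rows` is made up. It remembers the keys of the first file and prints the rows of the second file whose keys were seen:

```shell
# Print the header of FILE2 plus every FILE2 row whose (id, chain) pair occurs in FILE1.
# match_rows is a hypothetical helper; the column names "id" and "chain" are assumed.
match_rows() {
    awk -F, '
        FNR == 1 {                       # header row of either file
            for (i = 1; i <= NF; i++) {  # locate the key columns by name
                if ($i == "id")    idc = i
                if ($i == "chain") chc = i
            }
            if (NR != FNR) print         # keep only the second header
            next
        }
        NR == FNR { seen[$idc FS $chc] = 1; next }   # first file: remember keys
        ($idc FS $chc) in seen                       # second file: print matches
    ' "$1" "$2"
}
```

Usage would be `match_rows file1.csv file2.csv > matches.csv`. Only the keys of the first file are held in memory, so it helps to pass the smaller file first; with 10 GB inputs and mostly unique keys that can still be a lot of memory.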
9. Shell Programming and Scripting
Hi All,
I am trying to get some lines from a file. I did it with a while-do loop, but since the files are huge it takes a long time, and now I want to make it faster.
The file will have 1 million lines.
The format is like below.
##transaction, , , ,blah, blah... (38 Replies)
Discussion started by: mad man
10. Shell Programming and Scripting
Hi all,
I have a file like this. I want to extract only those regions which are big and continuous:
chr1 3280000 3440000
chr1 3440000 3920000
chr1 3600000 3920000 # this region falls within 3440000 3920000, so I don't want it printed in the output
chr1 3920000 4800000
chr1 ... (2 Replies)
Discussion started by: amrutha_sastry
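The expected output is not shown in the excerpt, so this is one possible reading of the request: drop any region that lies entirely inside a region already printed for the same chromosome. The sketch assumes whitespace-separated columns (name, start, end) and input sorted by name and then by start; `drop_contained` is a made-up name:

```shell
# Skip regions whose end does not extend past the furthest end seen so far
# on the same chromosome. drop_contained is a hypothetical helper name.
drop_contained() {
    awk '
        $1 != chr || $3 + 0 > end {   # new chromosome, or region reaches further right
            print $1, $2, $3          # trailing comments on the line are dropped
            chr = $1
            end = $3 + 0
        }
    ' "$@"
}

# Demo with the four complete lines from the post; the 3600000-3920000 line is skipped.
printf '%s\n' \
    'chr1 3280000 3440000' \
    'chr1 3440000 3920000' \
    'chr1 3600000 3920000' \
    'chr1 3920000 4800000' | drop_contained
```

If "big" is meant as a minimum length, an extra condition such as `$3 - $2 >= min` could be added, but the post does not say what threshold applies.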
sample(1) BSD General Commands Manual sample(1)
NAME
sample -- Profile a process during a time interval
SYNOPSIS
sample pid | partial-executable-name [duration [samplingInterval]] [-wait] [-mayDie] [-fullPaths] [-e] [-file filename]
DESCRIPTION
sample is a command-line tool for gathering data about the running behavior of a process. It suspends the process at specified intervals (by
default, every 1 millisecond), records the call stacks of all threads in the process at that time, then resumes the process. The analysis
done by sample is called "sampling" because it only checks the state of the program at the sampling points. The analysis may miss execution
of some functions that are not executing during one of the samples, but sample still provides useful data about commonly executing
functions.
At the end of the sampling duration, sample produces a report showing which functions were executing during the sampling. The data is
condensed into a call tree, showing the functions seen on the stack and how they were called. (This tree is a subset of the actual call tree
for the execution, since some functions may not have been executing during any of the sampling events.) The tree is displayed textually,
with called functions indented one level to the right of their caller.
In the call tree, if a function calls more than one function then a vertical line is printed to visually connect those separate children
functions, making it easier to see which functions are at the same level. The characters used to draw those lines, such as + | : ! are
arbitrary and have no specific meaning.
ARGUMENTS
The user of sample specifies a target process (either by process id, or by name), the duration of the sampling run (in seconds), and a
sampling interval (in milliseconds).
If the sampling duration is not specified, a default of 10 seconds is used. Longer sampling durations provide better data by collecting more
samples, but could also be confusing if the target process performs many different types of operations during that period.
The default sampling interval is 1 millisecond. Shorter intervals provide more samples and a better chance to capture all the functions that
are executing.
-wait tells sample to wait for the process specified (usually as a partial name or hint) to exist, then start sampling that process. This
option allows you to sample from an application's launch.
-mayDie tells sample to immediately grab the location of symbols from the application, on the assumption that the application may exit or
crash at any point during the sampling. This ensures that sample can give information about the call stacks even if the process no longer
exists.
-fullPaths tells sample to show the full path to the source code (rather than just the file name) for any symbol in a binary image for which
debug information is available. The full path was the path to the source code when the binary image was built.
-e tells sample to open the output in TextEdit automatically when sampling completes.
-file filename tells sample the full path to where the output should be written. If this flag is not specified, results are written to a
file in /tmp called <application name>_<date>_<time>.<XXXX>.sample.txt, where each 'X' is replaced by a random alphanumeric character.
If neither the -e nor -file flags are given, the output gets written to stdout as well as saved to the file in /tmp.
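As a rough illustration of how the arguments above fit together, here is a small wrapper that follows the SYNOPSIS order (target, duration, samplingInterval, then -file). The wrapper name `profile_process`, the `DRY_RUN` switch, the target name Safari and the report paths are all made up for this sketch; only the sample command line itself comes from this page:

```shell
# Build and run: sample <target> <duration> <samplingInterval> -file <report>
# profile_process and DRY_RUN are hypothetical; defaults mirror the page (10 s, 1 ms).
profile_process() {
    target=$1
    duration=${2:-10}
    interval=${3:-1}
    report=${4:-/tmp/sample_report.txt}
    set -- sample "$target" "$duration" "$interval" -file "$report"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"      # only show the command that would be run
    else
        "$@"           # run sample for real (requires a system that ships it)
    fi
}

# Show the command for sampling a process named Safari for 5 seconds at 1 ms intervals.
DRY_RUN=1 profile_process Safari 5 1 /tmp/safari.sample.txt
```

To catch an application from launch, the same command line can be given the -wait and -mayDie options described above, for example `sample MyApp 10 -wait -mayDie`, where MyApp is a placeholder name.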
SEE ALSO
filtercalltree(1), spindump(8)
The Xcode developer tools also include Instruments, a graphical application that can give information similar to that provided by sample. The
Time Profiler instrument graphically displays dynamic, real-time CPU sampling information.
BSD
Mar. 16, 2013 BSD