Which cut command is more efficient?
Post 302507397 by methyl, Wednesday 23rd March 2011, 06:01 PM
@Corona688
Yes, 36 cores (9x4). CPU power is not an issue; we regularly run over 30,000 concurrent processes.

The bottleneck when reading large files is invariably the disc subsystem, closely followed by the software. This is where reading moderate-sized files with "cat" scores over the built-in read routines of some unix utilities. I recognise that "cut" is actually one of the better ones.
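
A quick way to check this on any given system (a sketch only; "bigfile.txt" is a hypothetical sample file, and the shell's time keyword must cover the whole pipeline, as it does in bash and ksh93):

Code:
  # let cut read the file directly
  time cut -d: -f1 bigfile.txt > /dev/null
  # feed the same data through a cat pipeline
  time cat bigfile.txt | cut -d: -f1 > /dev/null

Run each a couple of times and ignore the first pass, otherwise the filesystem cache skews the comparison.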

For the advanced user with large data files I am not averse to using "dd" or "cpio" (or both) to read from the disc in an optimal manner.
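
For example (a sketch, assuming a hypothetical pipe-delimited flat file "bigfile.dat"), dd can be told to read the disc in large blocks and stream the data to cut:

Code:
  # bs=1M is GNU dd notation; older dd implementations want bs=1048576
  dd if=bigfile.dat bs=1M 2>/dev/null | cut -d'|' -f2 > fields.out

The block size is worth tuning to the disc subsystem rather than accepting a utility's default buffer size.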

On a single-core system running ancient unix it was very important to minimise the number of concurrent processes. That is really not the case nowadays unless you happen to be running unix on a home system.

Back to the O/P.
The conventional answer is that running more processes is less efficient. On a modern large system with multiple processors (i.e. the norm) it can be more efficient to run a pipeline of several efficient processes than to run a single inefficient process, because the stages of a pipeline run concurrently on separate cores (see the sketch below).
The "Useless use of cat" brigade have clearly never used a modern computer, where apparent inefficiencies are in fact absorbed by proper utilisation of the software and hardware as a team.
By applying lateral thought we can deduce that hardware design evolution is actually targeted towards making inefficient processes efficient. We can take advantage of that by tactical use of these previously-inefficient processes.
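
As an illustration (a sketch only; "access.log.gz" and the field position are hypothetical), the pipeline below runs five separate processes, and on a multi-core box the decompression, extraction, sorting and counting stages all execute concurrently, overlapping CPU work with disc I/O:

Code:
  # each stage is a separate process, scheduled on its own core
  gzip -dc access.log.gz | cut -d' ' -f1 | sort | uniq -c | sort -rn > top_clients.txt

Whether this beats a single awk process doing the same job depends on the workload; the point is only that the extra processes are not automatically a cost on a multi-core system.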

Nuff said.

Last edited by methyl; 03-23-2011 at 07:07 PM.. Reason: spellin, verbosity
 
