Moving the final 5 lines of a file to the top, in place, without using any variables or temp files:
Note: The process substitution implementations I've seen require either /dev/fd or named pipes. If /dev/fd isn't available and the shell cannot create the FIFO in its usual temp dir, the shell may need to be told of a suitable alternative location.
Regards,
Alister
Last edited by alister; 10-26-2013 at 06:26 PM..
Reason: Corrected text to match tail's five line count
Hi Alister, that looks ingenious. I presume you mean tail -n 50000 and wc -m? Could you elaborate on why the file needs to be redirected read/write on stdout?
I tried it on Linux with tail -n 5 and it works fine.
But on OS X 10.9 (bash 3 and bash 4) I got:
and the file ended up consisting of the last 5 lines, an empty line, and the last 5 lines again.
On Solaris (bash 3, using XPG4 utilities):
On HP-UX (bash 4):
and the file was truncated to zero length.
On AIX:
Last edited by Scrutinizer; 10-26-2013 at 06:43 PM..
Location: Saint Paul, MN USA / BSD, CentOS, Debian, OS X, Solaris
Posts: 2,288
Thanks Given: 430
Thanked 480 Times in 395 Posts
Hi.
Both solutions, the shell variable and the "dd move", use about the same system resources. I used alister's amended solution; his first one (the version that went out with the email notification) failed.
For a file of 14,754,910 lines, about 1 GB, the times for the shell solution keeping the last 50,000 lines were:
and for the clever "dd move" solution:
There were 2 wc executions for verification in both runs. Leaving those out:
and
The results actually surprised me -- I thought the shell would be slower. I saw paging being used a few times, slightly more often with the dd. The shell was run first, so cache advantage, if any, went to dd ... cheers, drl
This system:
Last edited by drl; 10-26-2013 at 06:39 PM..
Reason: Typo.
Thank you for catching the line count mismatch. My testing involved moving just the last 5 lines of a 15 line file. I changed the text to match -n5.
No, I definitely did not intend wc -m. dd will seek bs*seek bytes, not characters.
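To illustrate the byte-versus-character point, here is a small sketch (the sample text is my own, not from the thread). The two bytes written by the octal escapes are the UTF-8 encoding of "é". wc -c counts bytes, which is the unit dd seeks in; in a UTF-8 locale wc -m would be expected to report one fewer, and that smaller number would make dd cut the file short.

```shell
# "é" in UTF-8 is two bytes (octal 303 251); the newline adds a third.
# wc -c counts bytes, matching dd's bs*seek arithmetic.
# tr strips the leading blanks that some wc implementations emit.
bytes=$(printf '\303\251\n' | wc -c | tr -d ' ')
echo "$bytes"
```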
Thank you for testing this construct on so many platforms. It appears that all of the failures are the result of leading blank(s) emitted by most wc implementations but not by GNU wc (with which I tested). This hypothesis is consistent with your error messages and supported by a quick peek at code. From a BSD wc implementation used by OS X:
Perhaps you could confirm by using tr to delete any spaces?
Alternatively, depending on how dd converts the text to an int, leading blanks might not be a problem if protected from shell parsing. Perhaps simply double quoting the command substitution will do (although this feels fragile):
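A minimal sketch of two ways to neutralise those leading blanks before the count reaches dd. The sample input is hypothetical, and the arithmetic-expansion variant is my own addition, not something proposed in the thread.

```shell
# Depending on the wc implementation, this is "4" or something like "       4".
raw=$(printf 'abc\n' | wc -c)

# Option 1: delete the spaces with tr, as suggested above.
n=$(printf 'abc\n' | wc -c | tr -d ' ')

# Option 2 (an assumption of mine): let the shell's arithmetic expansion
# parse the number, which ignores the surrounding blanks.
m=$(( $(printf 'abc\n' | wc -c) ))

echo "$n $m"
```

Either value can then be passed as a single word, for example seek="$n".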
The read/write nature of tee's stdout is not relevant. The utility of <> in this case is that it leaves the file descriptor's offset at 0 and allows tail's output (via tee) to write to the beginning of the file without truncation (which dd will perform afterwards). >> and > are both unsuitable since the former appends all writes and the latter truncates before the first write.
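The two building blocks described here, writing through <> without truncation and then letting dd cut off the remainder, can be sketched on a throwaway file. This is a deliberately simplified version: it uses variables and runs tail twice, unlike the one-pass command under discussion, and the file name and line counts are made up for the demonstration.

```shell
# Scratch file with eight one-digit lines (hypothetical demo data).
tmp=$(mktemp -d)
f=$tmp/demo.txt
printf '%s\n' 1 2 3 4 5 6 7 8 > "$f"

# Size in bytes of the part to keep (the last 3 lines).
n=$(tail -n 3 "$f" | wc -c | tr -d ' ')

# 1<> opens the file read/write at offset 0 WITHOUT truncating it,
# so tail's output overwrites the beginning of the file.
tail -n 3 "$f" 1<> "$f"
# The file now holds: 6 7 8 4 5 6 7 8

# dd copies nothing from /dev/null but truncates the output file
# at bs*seek bytes, discarding everything after the moved lines.
dd if=/dev/null of="$f" bs=1 seek="$n" 2>/dev/null

cat "$f"
```

With > in place of 1<>, the file would be emptied before tail ever read it; with >>, the three lines would be appended to the end instead.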
Regards,
Alister
---------- Post updated at 06:20 PM ---------- Previous update was at 06:02 PM ----------
Quote:
Originally Posted by drl
The results actually surprised me -- I thought the shell would be slower.
I'm not surprised. My command substitution, process substitution, and the pipeline within it require more work to establish, and, once running, require more context switching to move data around.
One advantage of using all those pipes is that memory consumption is not a function of the amount of data to be moved.
That said, neither performance nor resource consumption motivated me. I was only trying to see if I could accomplish it without reading the file twice and without explicit memory storage.
Quote:
Originally Posted by drl
Using echo with arbitrary text can produce unexpected results. It's best to use printf '%s\n' "$v1".
More importantly, that solution cannot handle trailing blank lines properly, since command substitution always strips them. This shortcoming may be perfectly acceptable in some situations and an utter dealbreaker in others.
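Both points are easy to check with a small sketch; the variable contents are made-up test data, and v1 is simply the name used in the printf advice above.

```shell
# Command substitution strips ALL trailing newlines, so the two
# blank lines after "last line" are lost when the text is captured.
v1=$(printf 'last line\n\n\n')
lines=$(printf '%s\n' "$v1" | wc -l | tr -d ' ')
echo "$lines"

# printf '%s\n' prints its argument literally, even when the text
# looks like an option that some echo implementations would swallow.
opt=$(printf '%s\n' '-n')
echo "[$opt]"
```

The original text had three lines (one with content, two empty); only one survives the round trip through the variable.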
Regards,
Alister
Last edited by alister; 10-26-2013 at 07:40 PM..
Hi, alister.
Quote:
Originally Posted by alister
Using echo with arbitrary text can produce unexpected results. It's best to use printf '%s\n' "$v1" .
Thanks for the reminder. I usually use my function: print like echo, because the name is shrtr, and uses printf -- but sometimes I forget ... cheers, drl