I wrote an awk script to filter "uninteresting" commands from my ~/.bash_history. (I know about HISTIGNORE, but I don't want to exclude these commands from my current session's history; I just want to avoid persisting them across sessions.)
The history file can contain multi-line entries with embedded newlines, and entries are separated by timestamps. Given an input file like:
the script filters out single-line ls, man, and cat commands, producing:
Notice that multi-line entries are unfiltered -- I figure if they're interesting enough to warrant multiple lines, they're worth remembering.
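As a rough illustration of the behavior described above (not the script from this post), here is a minimal awk sketch. It assumes timestamp lines have the form "#" followed by digits, which is what bash writes when HISTTIMEFORMAT is set; the function name filter_history_awk and the sample history lines are made up for the example.

```shell
#!/bin/sh
# Hypothetical helper: drop single-line ls/man/cat entries from a
# timestamped history stream read on stdin.
filter_history_awk() {
  awk '
    # Print the buffered entry unless it is a single-line ls/man/cat command.
    function emit() {
      if (ts == "") return
      if (n == 1 && (cmd ~ /^(ls|man|cat)$/ || cmd ~ /^(ls|man|cat) /)) return
      printf "%s", entry
    }
    # A timestamp line (assumed format: "#" plus digits) starts a new entry.
    /^#[0-9]+$/ { emit(); ts = $0; entry = $0 "\n"; n = 0; next }
    # Any other line belongs to the current entry; remember the first one.
    { entry = entry $0 "\n"; n++; if (n == 1) cmd = $0 }
    END { emit() }
  '
}

# Made-up sample input: two single-line entries to drop, one to keep,
# and one multi-line entry that passes through even though it contains cat.
printf '%s\n' '#1501632000' 'ls -la' '#1501632010' 'echo hello' \
  '#1501632020' 'for f in *; do' '  cat "$f"' 'done' \
  '#1501632030' 'man sed' |
  filter_history_awk
```

The regex avoids interval expressions such as {10} so that it should also run under older mawk builds.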
I've been reading about sed's multiline capabilities and I'm curious how its hold space and pattern space might be manipulated to achieve the same filtering as my awk script. Rather than use GNU sed's -z flag to treat the whole file as a single massive pattern space, I'm looking for a solution that uses commands such as h, H, x, G, N, etc. to accumulate lines in the hold space and swap/delete lines as necessary.
Here's the awk script:
Last edited by ivanbrennan; 08-02-2017 at 12:31 AM..
Reason: adjust line spacing
Since you are only excluding single-line commands, you could just peek ahead one line using the N command and only leave out those entries:
or
with GNU sed or BSD sed:
Hm... peeking ahead one line won't let me distinguish a single-line command (which should be excluded if it contains ls|cat|man) from the beginning of a multiline command (which should be kept even if it contains ls|cat|man).
For example, if the exclusion pattern was "xxx", the following input,
would result in this output:
The second record should have passed through unmodified since it has multiple lines, but instead its head was removed and the rest got tacked onto the previous record.
I was thinking something like this: when you reach a timestamp, exchange pattern space with hold space (x). Now the hold space is ready to start accumulating the oncoming entry, and the pattern space holds whichever entry was previously accumulated. Since I have the full entry, I should be able to perform whatever substitution is necessary on the pattern space to filter out commands I'm not interested in. It gets a bit complicated when trying to correctly handle the first and last lines of the file.
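The exchange idea above can be sketched roughly as follows. This is only one possible arrangement, not a tested drop-in solution: it assumes timestamp lines look like "#" followed by digits, that the file starts with a timestamp, and that sed accepts -E (GNU and BSD sed both do). The function name filter_history_sed and the sample lines are invented for the example.

```shell
#!/bin/sh
# Hypothetical helper: accumulate each history entry in the hold space and
# examine the complete entry once the next timestamp (or end of file) arrives.
filter_history_sed() {
  sed -E '
    # Timestamp line: swap, so the pattern space holds the previous entry
    # and the hold space starts the new entry with this timestamp.
    /^#[0-9]+$/{
      x
      # First line of the file: nothing was accumulated yet.
      /^$/d
      b check
    }
    # Any other line is appended to the entry being accumulated.
    H
    # Not at end of file yet: print nothing and read the next line.
    $!d
    # Last line: fetch the final accumulated entry.
    x
    :check
    # Two or more newlines means a multi-line command: keep it as is.
    /\n.*\n/b
    # Single-line ls/man/cat command: drop the whole entry.
    /\n(ls|man|cat)( |$)/d
  '
}

printf '%s\n' '#1501632000' 'ls -la' '#1501632010' 'echo hello' \
  '#1501632020' 'for f in *; do' '  cat "$f"' 'done' \
  '#1501632030' 'man sed' |
  filter_history_sed
```

The first line is handled by the empty-pattern-space test after x, and the last line by the $!d / x pair. One known gap in this sketch: a trailing timestamp with no command after it is silently dropped, and a "#123" line inside a multi-line command would be mistaken for a timestamp.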
Hi,
Your awk script doesn't work here.
I had to change the first line.
You can try this with sed; I think it's OK.
It handles cat and man followed by a space (i.e. cat lefile or man tr).
It's harder with ls.
Apparently mawk doesn't support regex repetitions, and maybe not POSIX character classes either.
I couldn't get the desired results from your sed snippet. Not sure why though.
I finally came up with something that works. It's nasty, and I don't doubt there's a better way, but it was satisfying to at least get something working.
I benchmarked it against my original awk script, as well as against the following gsed script:
Run on a ~50,000 line file, I get the following results:
sed: 80 milliseconds
awk: 70 milliseconds
gsed: 60 milliseconds
Apparently mawk doesn't support regex repetitions, and maybe not POSIX character classes either.
[..]
Indeed, the mawk version that gets installed by distributions supports neither. I think the latest version does, but you would need to get the source and compile it yourself.
--
Your approach also seems to leave out one-line commands that do not contain ls, man, or cat.
Last edited by Scrutinizer; 08-03-2017 at 03:56 AM..
Because d directly jumps to the next cycle, and the input line is not modified in the condition branch, the following code does not need a negated condition.
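To illustrate the general point about d (this is a toy example, not the script under discussion): once d has removed the matching lines, every later command only ever sees lines that did not match, so wrapping those commands in a negated block is redundant. The function names and the "cmd: " prefix are made up for the demonstration.

```shell
#!/bin/sh
# Variant 1: later commands guarded by an explicit negated condition.
with_negation() {
  sed '
    /^#/!{
      s/^/cmd: /
    }
    /^#/d
  '
}

# Variant 2: d comes first and starts the next cycle immediately,
# so the substitution is only reached for lines that did not match.
without_negation() {
  sed '
    /^#/d
    s/^/cmd: /
  '
}

# Both variants should print the same two prefixed command lines.
printf '%s\n' '#1501632000' 'ls -la' 'echo hello' | with_negation
printf '%s\n' '#1501632000' 'ls -la' 'echo hello' | without_negation
```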