How to remove everything after a word containing string?


# 8  
Old 01-25-2019
@nezabudka: The shell loop is there to create the respective output files. If you do it all in awk, redirect the output directly within awk, like this:
Code:
awk '
FNR == NR       {T[$1]
                 next
                }
FNR == 1        {if (FN) close (FN)
                 FN = "report_" FILENAME
                }
                {for (t in T) sub (t".*", t)
                 print > FN
                }
'   readfile source*
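
To see what the script above produces, here is a minimal sketch run against invented sample data. The directory, the keys in readfile, and the contents of source1.txt and source2.txt are assumptions made up for illustration; they are not the files from this thread.

```shell
# Hypothetical sample data, invented for illustration only.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' 'bb' 'rr' > readfile
printf '%s\n' 'http://www.aa.bb.cc' 'http://www.11.rr.cd' > source1.txt
printf '%s\n' 'http://www.22.qq.fc' 'http://www.33.bb.zz' > source2.txt

awk '
FNR == NR { T[$1]; next }                 # first file: collect the keys
FNR == 1  { if (FN) close(FN)             # new input file: close the previous report
            FN = "report_" FILENAME }
          { for (t in T) sub(t ".*", t)   # cut everything after the key
            print > FN }
' readfile source*.txt

report1=$(cat report_source1.txt)
report2=$(cat report_source2.txt)
printf '%s\n' "$report1" "$report2"
```

Each input file gets its own report_ file from a single awk invocation, so no shell loop is needed. Lines without any key are copied unchanged.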

# 9  
Old 01-25-2019
Quote:
Originally Posted by nezabudka
Boris
Why do you strip the file extension and then immediately add it back?
Code:
> report_$(basename "${file/.txt}").txt

And why run it in a loop? You only lose time. Try this first:
Code:
awk 'NR==FNR {T[$0]; next} {for(i=1;i<=NF;i++){for(t in T) if ($i ~ t) $0=$i}} 1' readfile source*.txt

To narrow down the error, try adding a filter, for example:
Code:
awk 'NR==FNR {T[$0]; next} NF > 1 && NF < 4 {for(i=1;i<=NF;i++){for(t in T) if ($i ~ t) $0=$i}} 1' readfile source*.txt

PS: readfile and source are the same as I posted under this thread.

Hello,
I am sorry for the headache.
I mentioned in the next script that I have more files to process, so I edited the main post.
What I typed in the first post gives the expected result.
I need time to check what went wrong at my end.

Thank you
Boris

--- Post updated at 05:09 PM ---

Hello Again,
Here is the output:

Code:
root@house:~/test# awk 'NR==FNR {T[]; next} {for (t in T) sub (t.*, t)} 1' readfile source
awk: cmd. line:1: NR==FNR {T[]; next} {for (t in T) sub (t.*, t)} 1
awk: cmd. line:1:            ^ syntax error
awk: cmd. line:1: error: invalid subscript expression
awk: cmd. line:1: NR==FNR {T[]; next} {for (t in T) sub (t.*, t)} 1
awk: cmd. line:1:                                         ^ syntax error
awk: cmd. line:1: NR==FNR {T[]; next} {for (t in T) sub (t.*, t)} 1
awk: cmd. line:1:                                              ^ 1 is invalid as number of arguments for sub

Thank you
Boris

--- Post updated at 11:20 PM ---

Hello,
Since I ran into problems with awk, I worked around it with the following steps:

- Remove the first line of sourcefile
- grep all lines containing COL1 in sourcefile > output1
- grep all lines not containing COL1 in sourcefile > output2
- paste -d '\n' both output files

I am sorry for the trouble I caused.

Thank you
Boris

Last edited by baris35; 01-25-2019 at 11:21 PM.. Reason: extra info
# 10  
Old 01-26-2019
Quote:
Originally Posted by baris35
...

Hello Again,
Here is the output:

Code:
root@house:~/test# awk 'NR==FNR {T[]; next} {for (t in T) sub (t.*, t)} 1' readfile source
awk: cmd. line:1: NR==FNR {T[]; next} {for (t in T) sub (t.*, t)} 1
awk: cmd. line:1:            ^ syntax error
awk: cmd. line:1: error: invalid subscript expression
awk: cmd. line:1: NR==FNR {T[]; next} {for (t in T) sub (t.*, t)} 1
awk: cmd. line:1:                                         ^ syntax error
awk: cmd. line:1: NR==FNR {T[]; next} {for (t in T) sub (t.*, t)} 1
awk: cmd. line:1:                                              ^ 1 is invalid as number of arguments for sub

Thank you
Boris
...
Wouldn't it make serious sense to read and try to understand the error message(s)? And, compare your code to the proposal given in post #6 ... see the difference?
# 11  
Old 01-26-2019
Quote:
Originally Posted by RudiC
Wouldn't it make serious sense to read and try to understand the error message(s)? And, compare your code to the proposal given in post #6 ... see the difference?
No, I do not see a difference when I run both separately.
As I do not understand awk, I wrapped the code given in #6 in echo to see what it prints:
s2.sh
Code:
while read COL1 COL2
do
echo " awk 'NR==FNR {T[$1]; next} {for (t in T) if ($1 ~ t) $0 = $1} 1' readfile source "
done<readfile

output
Code:
awk 'NR==FNR {T[]; next} {for (t in T) if ( ~ t) ./s2.sh = } 1' readfile source
awk 'NR==FNR {T[]; next} {for (t in T) if ( ~ t) ./s2.sh = } 1' readfile source
awk 'NR==FNR {T[]; next} {for (t in T) if ( ~ t) ./s2.sh = } 1' readfile source

When I run without echo:

Code:
#!bin/bash
#test1
http://www.aa.bb.cc
#test2
http://www.11.rr.cd
#test3
http://www.22.qq.fc
#!bin/bash
#test1
http://www.aa.bb.cc
#test2
http://www.11.rr.cd
#test3
http://www.22.qq.fc
#!bin/bash
#test1
http://www.aa.bb.cc
#test2
http://www.11.rr.cd
#test3
http://www.22.qq.fc

When I use the second code from #6:

s2.sh
Code:
while read COL1 COL2
do
echo " awk 'NR==FNR {T[$1]; next} {for (t in T) sub (t".*", t)} 1' readfile source "
done<readfile

output:
Code:
 awk 'NR==FNR {T[]; next} {for (t in T) sub (t.*, t)} 1' readfile source
 awk 'NR==FNR {T[]; next} {for (t in T) sub (t.*, t)} 1' readfile source
 awk 'NR==FNR {T[]; next} {for (t in T) sub (t.*, t)} 1' readfile source

When I run s2.sh without echo:

Code:
#!bin/bash
#test1
http://www.aa.bb
#test2
http://www.11.rr.cd
#test3
http://www.22.qq.fc
#!bin/bash
#test1
http://www.aa.bb
#test2
http://www.11.rr.cd
#test3
http://www.22.qq.fc
#!bin/bash
#test1
http://www.aa.bb
#test2
http://www.11.rr.cd
#test3
http://www.22.qq.fc

Sorted out with grep + sed + paste commands. Awk is more complicated for me.

Thank you
Boris
# 12  
Old 01-26-2019
What in the shown result of s2.sh does not satisfy your needs? Looks perfect to me, considering the code you presented.

Quote:
Originally Posted by baris35
No, I do not see a difference when I run both separately.
The difference is that the T array index is $1 in the working code and empty in your error case. I am desperately trying to understand why "Awk is more complicated for" you, and why you forgo the efficient, complete solutions presented to you in posts #7 and #8, falling back on highly inefficient band-aid pseudo-solutions.
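
Why the index ends up empty can be sketched as follows: inside double quotes the shell expands `$1` (and `$0`) before awk or echo ever sees them, while single quotes pass them through untouched. The variable names `double` and `single` are made up for this demonstration.

```shell
# Clear the positional parameters so $1 is empty, as in a script run without arguments.
set --

double=$(echo "NR==FNR {T[$1]; next}")   # double quotes: the shell replaces $1 with nothing
single=$(echo 'NR==FNR {T[$1]; next}')   # single quotes: $1 reaches the command literally

printf '%s\n' "$double" "$single"
```

This matches the s2.sh output shown earlier, where `T[$1]` was printed as `T[]` and `$0` was printed as `./s2.sh`, the name of the running script.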

Consider the case where the bumper has fallen off your car. The professional repair shop grabs its MIG welder, welds the nuts back onto the carrier beam, and screws the bumper back on with a ratchet. Amateurs use chewing gum to fill the gaps and scotch tape to stick the bumper back on.


awk (or perl, sed, etc.) is the MIG welder and the ratchet at your fingertips.

Last edited by RudiC; 01-26-2019 at 07:06 AM..
# 13  
Old 01-26-2019
So, should I tell awk which column to look up? Sed and grep are like medium-frequency welding technology for me. So far, awk seems incomprehensible to me, even after your detailed explanation. I need to read more and more.

Thank you for your time.
Boris

--- Post updated at 07:16 AM ---

Quote:
Originally Posted by RudiC
What in the shown result of s2.sh does not satisfy your needs? Looks perfect to me, considering the code you presented.
Hello Rudic,
It requires:
Code:
perl -i -ne 'print unless $seen{$_}++' output
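
The perl one-liner above drops repeated lines from the file named output. A roughly equivalent awk sketch is shown below, assuming invented sample input. Unlike `perl -i`, it writes to standard output instead of editing the file in place.

```shell
# Hypothetical input with repeated lines, invented for illustration.
deduped=$(printf '%s\n' '#test1' 'http://www.aa.bb' '#test1' 'http://www.aa.bb' '#test2' |
          awk '!seen[$0]++')   # print a line only the first time it is seen
printf '%s\n' "$deduped"
```

The array name `seen` is arbitrary. Running awk once over all files, rather than once per line of readfile, would avoid producing the duplicates in the first place.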

Don't worry, the problem is solved, just in a slightly longer way.

Kind regards
Boris
# 14  
Old 01-27-2019
I apologize to the author of the topic for the digression.
I thought it was too beautiful to work correctly.
Quote:
Originally Posted by RudiC
Code:
awk '
FNR == NR       {T[$1]
                 next
                }
FNR == 1        {if (FN) close (FN)
                 FN = "report_" FILENAME
                }
                {for (t in T) sub (t".*", t)
                 print > FN
                }
'   readfile source*

I have simplified:
Code:
awk 'NR==FNR {T[$1]; next} {for (t in T) sub (t".*", t)} 1' readfile source

The task as stated is to remove the rest of the line after the match.
Logically, though, it still seems necessary to keep the complete address in which the match is found.
Example
result: http://www._aa.bb
result as I imagine it: http://www.aa.bb.cc
Code:
awk 'NR==FNR {T[$1]; next} {for (t in T) sub(t, "\r"t)
$0 = gensub(/\r([^ ]*).*/, "\\1", 1)} 1' readfile source
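
The gensub() call above is a GNU awk extension, and the keys are used as regular expressions, so a key such as `a.b` would also match `axb`. As a hedged alternative (not the method of either poster), the sketch below keeps the whole word containing the key by combining the field loop from earlier in the thread with a literal `index()` test. The sample data is invented.

```shell
# Hypothetical sample data; the key "a.b" contains a regex metacharacter on purpose.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' 'a.b' > readfile
printf '%s\n' 'http://www.axb.cc trailing words' 'see http://www.a.b.cc for details' > source

result=$(awk '
NR == FNR { T[$1]; next }                      # first file: collect the keys
{
  for (i = 1; i <= NF; i++) {                  # scan every whitespace-separated word
    for (t in T) {
      if (index($i, t)) { $0 = $i; break }     # literal substring test, no regex
    }
  }
}
1' readfile source)
printf '%s\n' "$result"
```

The first line is left alone because `axb` does not literally contain `a.b`. The second line is reduced to the word that contains the key.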

Regards to RudiC; the solution is nonetheless very beautiful and concise.
There was something to learn from it.