Script to process a list of items and uncomment lines with that item in a second file
Hello,
I have a src code file where I need to uncomment many lines.
The lines I need to uncomment look like,
The comment is the "C" in the first column. This needs to be deleted so that there are 6 spaces preceding "CALL". The key on this line is 'Gmax'. This lets me know that the line needs to be uncommented.
I have a list of such keys
Each key (including the single quotes) will occur only once in the src file being processed. I need to process the list file to look in the src file and uncomment the proper lines. There are about 300 keys in the list file.
This is what I tried,
This simply reads the list file and, one key at a time, looks for that key in the file to modify. The awk code checks each line for the key (including the single quotes) and, if it is found, prints the substring that skips the first character. When the key is not found on the line, the line is printed unmodified. After the key is processed, the awk-modified file is renamed so that it becomes the file awk works on in the next loop.
This works as far as I can tell. I am writing an entire copy of the modified file for each key in the list file, so this is not very efficient. The file renaming at the end of the loop is a bit kludgy as well. This only takes about 7 seconds to run, so maybe I am being picky and should just let it be but I thought I would ask if there were other suggestions.
How about using a single line sed like this:-
It's a little messy to read, so:-
The s command performs a substitution.
There is the start-of-line marker ^ and then the literal character C, which we want to remove if the line matches the condition between the first pair of / characters.
The escaped brackets \( and \) wrap a section of the matched line so we can use it later. The outer pair captures everything after the leading C.
Inside the group come the six spaces and the literal word CALL, so we can be sure we are matching a commented CALL line.
We then don't care much about what the next part of the line looks like, so we use the single wildcard character . followed by *, which repeats it zero or more times, so any number of characters.
We then have the literal text 'Gmax' to look for. The ' characters are literal because the whole expression is wrapped in ". The alternate strings you have need to be grouped and alternated. The group is wrapped (again) with escaped brackets, so \( and \), with the strings listed inside. The alternation separator | also has to be escaped, so it is written \|; an unescaped | would be taken as a literal character.
We then have the same .* as above to match the rest of the line and end the outer group with \)
After the separating / that shows the end of the expression we have the start of what to substitute it with. We replace the matched line with the first group we captured, i.e. the bit between the outer \( and \) above. Here we use \1 to represent that first group, which is everything excluding the leading C as required. For completeness, you also have the following available to you:-
\0 - the entire text matched (this is GNU sed; the portable spelling is &)
\1 - the first group matched (in this case the entire line excluding the leading C)
\2 - the second group matched, in this case one of 'Gmax', 'Gmin' etc. as matched, if that's useful in any way.
Unmatched strings (not a leading C or not containing Gmax or whatever) are just printed as they are.
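To make the pieces described above concrete, here is a minimal sketch of such a command run against a small made-up file. The sample lines, the SETVAL routine name and the second key 'Gmin' are assumptions for illustration only, and the \| alternation relies on GNU sed.

```shell
#!/bin/sh
# Hypothetical sample input; the real source file will look different.
workdir=$(mktemp -d)
cat > "$workdir/sample.f" <<'EOF'
C      CALL SETVAL('Gmax', 1.0)
C      CALL SETVAL('Other', 2.0)
      CALL SETVAL('Done', 3.0)
C      CALL SETVAL('Gmin', 4.0)
EOF

# Outer \( \) captures everything after the leading C; the inner group lists
# the keys. \| is alternation in GNU sed basic regular expressions.
result=$(sed "s/^C\(      CALL.*\('Gmax'\|'Gmin'\).*\)/\1/" "$workdir/sample.f")
printf '%s\n' "$result"
rm -r "$workdir"
```

Only the 'Gmax' and 'Gmin' lines lose their leading C; the 'Other' line stays commented and the already active line is passed through untouched.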
Does this meet your need? Does the explanation make sense?
You could be brave and use the -i flag and no target file to just update the source file, but I'd recommend testing it first to make sure you are happy.
If the list of alternates is getting overly complex, you could put them in a reference file, one per line, and build the list for your command from that, something like:-
Perhaps run this with bash -xv your_script_name to check what it's doing.
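As a rough sketch of the reference-file idea (the file names keys.txt and sample.f and their contents are made up here), the keys can be joined with | and dropped into the expression. This version uses sed -E so that the brackets and | need no escaping, and it assumes the keys contain no regular expression metacharacters.

```shell
#!/bin/sh
workdir=$(mktemp -d)
# Hypothetical key list, one quoted key per line.
cat > "$workdir/keys.txt" <<'EOF'
'Gmax'
'Gmin'
EOF
# Hypothetical source file.
cat > "$workdir/sample.f" <<'EOF'
C      CALL SETVAL('Gmax', 1.0)
C      CALL SETVAL('Other', 2.0)
      CALL SETVAL('Done', 3.0)
C      CALL SETVAL('Gmin', 4.0)
EOF

# Join all lines of the key file with | to form the alternation list.
alternation=$(paste -sd'|' "$workdir/keys.txt")

# One pass over the source file, whatever the number of keys.
result=$(sed -E "s/^C( {6}CALL.*(${alternation}).*)/\1/" "$workdir/sample.f")
printf '%s\n' "$result"
rm -r "$workdir"
```

With about 300 keys the expression gets long, but the source file is still read and written only once.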
I hope that this helps,
Robin
Last edited by rbatte1; 01-09-2020 at 11:47 AM..
Nice approach indeed!
Could be curtailed to
, including the item list file as well.
I went with this method inserted into a script. It worked well (and very quickly) the first time I tried it, but there was no output the second time. I will have to investigate what I did there.
I also made a second try before there were any responses here. This ended up looking more like the code posted by MadeInGermany where I read in the file to be modified and stored it in an array. I then did a double loop with the outside loop being my list file and the inside loop being the array with the file to be modified. Each item in the list was searched against the lines in the array. If a match was found, the array element was modified to remove the comment and then there was a break in the inner loop. The modified array was printed at the end. This approach means that each file is read in once and the output was written once, instead of once for each list item.
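For comparison, here is a sketch of a single-pass variant of that idea in awk. It is not the script described above: instead of holding the source file in an array, it holds the key list in an array and streams the source file through once. File names and contents are hypothetical.

```shell
#!/bin/sh
workdir=$(mktemp -d)
# Hypothetical key list and source file.
cat > "$workdir/keys.txt" <<'EOF'
'Gmax'
'Gmin'
EOF
cat > "$workdir/sample.f" <<'EOF'
C      CALL SETVAL('Gmax', 1.0)
C      CALL SETVAL('Other', 2.0)
      CALL SETVAL('Done', 3.0)
C      CALL SETVAL('Gmin', 4.0)
EOF

result=$(awk '
    # First file: remember each non-empty key.
    NR == FNR { if ($0 != "") keys[$0]; next }
    # Second file: only commented lines can change.
    /^C/ {
        for (k in keys)
            if (index($0, k)) {        # literal substring test, no regex
                $0 = substr($0, 2)     # drop the leading C
                delete keys[k]         # each key occurs once, stop searching for it
                break
            }
    }
    { print }
' "$workdir/keys.txt" "$workdir/sample.f")
printf '%s\n' "$result"
rm -r "$workdir"
```

Each file is read once and the output is written once. Because index() does a literal comparison, keys containing characters such as . or ( need no escaping.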
It seems to me that sed must be doing more or less the same thing under the hood. Every list item must be checked against every line in the file to be modified, at least until a match is found. I wasn't able to work out whether it was more efficient to have one or the other file be the inner loop. The only approach I could think of that would be faster would be to identify the 'Gmax' value on each line of the file to be modified and then look up that value in a map holding the list. That would, however, involve much more significant parsing of the lines to extract the 'Gmax' value. It is very nice to have a glob match, especially when there isn't a clear and consistent delimiter. If the list was the inner loop, you could delete each array element when a match was found and thus shorten the search as the process continues, but deleting and shifting around array elements also takes resources.
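The map lookup idea could be sketched like this in awk. It rests on a strong assumption that may not hold for the real file: that the key is the first single-quoted token on the line. File names and contents are made up.

```shell
#!/bin/sh
workdir=$(mktemp -d)
# Hypothetical key list and source file.
cat > "$workdir/keys.txt" <<'EOF'
'Gmax'
'Gmin'
EOF
cat > "$workdir/sample.f" <<'EOF'
C      CALL SETVAL('Gmax', 1.0)
C      CALL SETVAL('Other', 2.0)
      CALL SETVAL('Done', 3.0)
C      CALL SETVAL('Gmin', 4.0)
EOF

# q holds a single quote so the awk program itself needs none.
result=$(awk -v q="'" '
    # First file: store every key as an index of the wanted map.
    NR == FNR { wanted[$0]; next }
    # Second file: pull out the first quoted token on commented lines
    # (assumed to be the key) and do one hash lookup.
    /^C/ && match($0, q "[^" q "]+" q) {
        key = substr($0, RSTART, RLENGTH)
        if (key in wanted)
            $0 = substr($0, 2)         # drop the leading C
    }
    { print }
' "$workdir/keys.txt" "$workdir/sample.f")
printf '%s\n' "$result"
rm -r "$workdir"
```

This replaces the inner loop with a single lookup per line, at the cost of depending on where the key sits in the line.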
Does anyone know what sed is doing to achieve the result so quickly? Is it mainly that it is using compiled code?