#1
Difficulty cleaning references to duplicated images in HTML code
Hi, I need to search for and replace references to duplicated images in HTML code. There are several groups of duplicated images that are visually identical but have different filenames. I managed to find the duplicated files themselves, but now I need to clean up the code too.

I have a CSV file that lists each group of duplicated images:
Code:
Group ID,Duplicated image filename, Number of duplicates
0,13429.png,3
0,18064.png,3
0,25025.png,3
1,14136.png,4
1,17382.png,4
1,19243.png,4
1,25389.png,4
2,21560.png,2
2,5529.png,2
3,3523.png,2
3,4811.png,2

and so on...

The references to duplicated images are scattered throughout hundreds of HTML files. The task is to make every <img> tag that references a duplicate point to just one unique image in each group. I'm wondering if some script magic could get it done easily.

HTML (before): different files, same visual appearance
Code:
<!-- group 0 -->
<img src="13429.png" />...text...<img src="18064.png" />...text...<img src="18064.png" />
<!-- group 1 -->
<img src="14136.png" />...text...<img src="17382.png" />...text...<img src="19243.png" />...text...<img src="25389.png" />
<!-- group 2 -->
<img src="21560.png" />...text...<img src="5529.png" />

HTML (after): unique file in each group
Code:
<!-- group 0 -->
<img src="13429.png" />...text...<img src="13429.png" />...text...<img src="13429.png" />
<!-- group 1 -->
<img src="14136.png" />...text...<img src="14136.png" />...text...<img src="14136.png" />...text...<img src="14136.png" />
<!-- group 2 -->
<img src="21560.png" />...text...<img src="21560.png" />

I searched for solutions here in the forum, with no success. Any help you can give would be greatly appreciated.

Last edited by mdart; 01-30-2013 at 01:32 PM.
#2
Not sure I understand what you want to accomplish. Can I paraphrase it like so: in all selected files, replace every occurrence of the second and following members of each group with the first member, i.e. 18064.png and 25025.png with 13429.png; 17382.png, 19243.png and 25389.png with 14136.png; and so on?
#3
@RudiC: Yes, that's correct. Sorry if I wasn't very clear.
#4
OK, try this very crude approach, which may need serious polishing:
Code:
awk -F, 'NR==FNR {Ar[$1]=Ar[$1](Ar[$1]?"|":"")$2;   # CSV: join the filenames of each group with "|"
                  if (!Rr[$1])Rr[$1]=$2; next}      # CSV: remember the first filename of each group
         {for (i in Ar) gsub (Ar[i], Rr[i])}         # HTML: replace any group member with that first filename
1                                                    # print every (possibly modified) line
' file file1
<!-- group 0 -->
<img src="13429.png" />...text...<img src="13429.png" />...text...<img src="13429.png" />
<!-- group 1 -->
<img src="14136.png" />...text...<img src="14136.png" />...text...<img src="14136.png" />...text...<img src="14136.png" />
<!-- group 2 -->
<img src="21560.png" />...text...<img src="21560.png" />
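One caveat with the approach above: the filenames are used as regular expressions, so the dot matches any character and a short name such as 5529.png also matches inside a longer one such as 15529.png. Below is a minimal sketch of a stricter variant, not a polished solution. It assumes every reference appears as a double-quoted src="..." attribute holding the bare filename, and that the CSV starts with the header line shown in post #1. The file names groups.csv, page.html and page.fixed.html and the temporary directory are made up for illustration.

```shell
#!/bin/sh
# Sketch only: groups.csv and page.html are made-up sample files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > groups.csv <<'EOF'
Group ID,Duplicated image filename, Number of duplicates
0,13429.png,3
0,18064.png,3
0,25025.png,3
2,21560.png,2
2,5529.png,2
EOF

cat > page.html <<'EOF'
<img src="18064.png" />...text...<img src="25025.png" />
<img src="5529.png" />...text...<img src="15529.png" />
EOF

awk -F, '
NR == FNR {                            # first file: the CSV
    if (FNR == 1) next                 # skip the header line
    if (!($1 in keep)) keep[$1] = $2   # first filename of a group is kept
    map[$2] = keep[$1]                 # every group member -> kept filename
    next
}
{                                      # remaining files: HTML
    out = ""; rest = $0
    # Walk through every src="..." attribute and look the name up literally.
    while (match(rest, /src="[^"]*"/)) {
        name = substr(rest, RSTART + 5, RLENGTH - 6)
        if (name in map) name = map[name]
        out = out substr(rest, 1, RSTART - 1) "src=\"" name "\""
        rest = substr(rest, RSTART + RLENGTH)
    }
    print out rest
}
' groups.csv page.html > page.fixed.html

cat page.fixed.html
```

Because the name is looked up in an array instead of being matched as a pattern, 15529.png in the sample is left alone while 5529.png is replaced by 21560.png.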
The Following User Says Thank You to RudiC For This Useful Post: mdart (01-30-2013)
#5
Thanks, that worked!
Sorry for the newbie question, but how can I run it on more than one file at once?
#6
You can, but how you do it depends on a few other factors, such as how the input files are collected or found, and whether the output should be concatenated or written to separate files. If all the files are in the same directory and that is your working directory, this will do:
Code:
awk '...' file.csv *.html
If their names are listed in a file.txt, try
Code:
awk '...' file.csv $(cat file.txt)
(not sure whether this is a UUOC and whether there's a better way)
If you need the output separated, try replacing the lone 1 in line 4 with
Code:
{print > (FILENAME "new")}   # parentheses keep the concatenation unambiguous in every awk
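With hundreds of input files, an awk that keeps one output file open per input may run out of file descriptors. The sketch below shows one possible way around that, closing each output file when the next input file starts. It reuses the program from post #4; the sample files a.html and b.html, the ".new" suffix and the temporary directory are assumptions made for the demonstration.

```shell
#!/bin/sh
# Sketch only: file.csv, a.html and b.html are made-up sample files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' 'Group ID,Duplicated image filename, Number of duplicates' \
    '0,13429.png,3' '0,18064.png,3' > file.csv
printf '%s\n' '<img src="18064.png" />' > a.html
printf '%s\n' '<img src="13429.png" />...text...<img src="18064.png" />' > b.html

awk -F, 'NR==FNR {Ar[$1]=Ar[$1](Ar[$1]?"|":"")$2;
                  if (!Rr[$1])Rr[$1]=$2; next}
         FNR==1  {if (out) close(out)        # done with the previous output file
                  out = FILENAME ".new"}     # e.g. a.html -> a.html.new
         {for (i in Ar) gsub (Ar[i], Rr[i])
          print > out}
' file.csv *.html

cat a.html.new b.html.new
```

The shell expands *.html before awk starts, so the freshly written .new files are never picked up as input.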
#7
Brilliant, RudiC, this is going to be extremely useful!
---------- Post updated 01-31-13 at 12:15 AM ---------- Previous update was 01-30-13 at 06:46 PM ----------
I managed to output the results to a new file with
Code:
{print >> "new"}
Is there a way to just overwrite the original files? They need to be replaced with the results anyway.
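The thread ends without an answer to this, so here is one common pattern, offered as a sketch and not as the definitive way: process each file into a temporary file and move it over the original only if awk succeeded. The sample files, the temp-file naming and the temporary directory are assumptions for the demonstration. Back up the real HTML files before trying anything like this on them.

```shell
#!/bin/sh
# Sketch only: file.csv, a.html and b.html are made-up sample files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' 'Group ID,Duplicated image filename, Number of duplicates' \
    '0,13429.png,3' '0,18064.png,3' > file.csv
printf '%s\n' '<img src="18064.png" />' > a.html
printf '%s\n' '<img src="13429.png" />...text...<img src="18064.png" />' > b.html

for f in *.html; do
    tmp="$f.tmp.$$"                      # hypothetical temp-file naming
    if awk -F, 'NR==FNR {Ar[$1]=Ar[$1](Ar[$1]?"|":"")$2;
                         if (!Rr[$1])Rr[$1]=$2; next}
                {for (i in Ar) gsub (Ar[i], Rr[i])}
                1
               ' file.csv "$f" > "$tmp"
    then
        mv "$tmp" "$f"                   # replace the original only if awk succeeded
    else
        rm -f "$tmp"                     # leave the original untouched on failure
    fi
done

cat a.html b.html
```

This rereads the CSV once per HTML file, which is wasteful but keeps each replacement independent. Note that mv gives the file a new inode, so hard links to the original are not updated.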