Replacing dates]] with (dates)]]


 
# 15  
Old 01-16-2011
The utility certainly will not work with a CSV file. It was designed specifically to work with the example data you provided.

Just send me a private message with the location of the file and I will look at it.
# 16  
Old 01-16-2011
Or back to the original approach. MediaWiki exports data in an XML format, MySQL in an SQL format. Let's assume the SQL format, with one insert record per line (which may require a special mysqldump option):
Code:
cp dump.sql dump-working.sql
perl -pi -e 's/\[\[(\S+) v(\S+) (\d{4})\]\]/[[$1 v $2 ($3)]]/g' dump-working.sql

You can then run a diff to see whether it's close to what you want:
Code:
diff dump.sql dump-working.sql | less

# 17  
Old 01-16-2011
Hmm, I believe the output from otheus's Perl script fails to fix the problem. For example, consider the following test file:
Code:
false imprisonment (see: [[Bird v Jones (1845)]]).
false imprisonment (see: [[Bird v Jones (1845)]] and [[Smith v Jones (1865)]]).
false imprisonment (see: [[Bird et al v Jones et al 1845]] and arrest (see: [[Smith v Jones (1865)]]).
false imprisonment (see: [[Bird vjones 1845]]) and also arrest (see: [[Smith vjones 1865]]).
false imprisonment (see: [[Bird vjones (1845)]]) and arrest (see: [[Smith vjones (1865)]]).
false imprisonment (see: [[Bird Murphy vjones (1845)]]).
false imprisonment (see: [[Bird Murphy vjones smith (1845)]]).
false imprisonment (see: [[Bird et al vjones et al (1845)]]).

Here is the output generated by the Perl script:
Code:
false imprisonment (see: [[Bird v Jones (1845)]]).
false imprisonment (see: [[Bird v Jones (1845)]] and [[Smith v Jones (1865)]]).
false imprisonment (see: [[Bird et al v Jones et al 1845]] and arrest (see: [[Smith v Jones (1865)]]).
false imprisonment (see: [[Bird v jones (1845)]]) and also arrest (see: [[Smith v jones (1865)]]).
false imprisonment (see: [[Bird vjones (1845)]]) and arrest (see: [[Smith vjones (1865)]]).
false imprisonment (see: [[Bird Murphy vjones (1845)]]).
false imprisonment (see: [[Bird Murphy vjones smith (1845)]]).
false imprisonment (see: [[Bird et al vjones et al (1845)]]).

The problem with the " v " remains!

The problem domain is complicated by the fact that the number of words within the citation is variable. The next issue is to uppercase the first letter after " v " if it is not already uppercase. Solve these and the rest is trivial.
# 18  
Old 01-16-2011
I didn't pay close attention to the follow-up posts of this thread. Make it two-pass. Let's try this:
Code:
perl -pe 's/\[\[(\w.*?) v(\w)(.*?) (\d{4}|\(\d{4}\))\]\]/[[$1 v \U$2\E$3 $4]]/g;s/(\[\[.*? )(\d{4})(\]\])/$1($2)$3/g' dump-working.sql

The first substitution handles the "v" problem, while the second handles the dates. (It could be done all in one, assuming that ALL such references were wrong, but if some are partially correct, the one-pass version fails on those.)

Last edited by otheus; 01-16-2011 at 09:43 PM.. Reason: two-pass version, corrects vJones and v Jones 1932
# 19  
Old 01-17-2011
Hi,

Thank you so much!
About 70% of the links are displayed correctly. Will the above break the correct ones?
# 20  
Old 01-17-2011
That's rather difficult to say, since I've seen only about 20 examples. But the two substitutions fix those 20 and don't break the ones that already work.
# 21  
Old 01-19-2011
I am closing this topic. Because of the size of the database and the complexity of the required fixes, I worked offline with the OP to solve his problem. I wrote a custom C utility to parse the database, locate the defects and fix them. There were a number of different types of defects, some layered on top of each other.

Attached is a copy of the final C source code file.

Last edited by fpmurphy; 01-19-2011 at 04:20 PM..