Update:
Change in direction (repeating the full migration over and over was too slow).
Create / write a Ruby script (done):
- Retrieve the mappings from the vB posts to the Discourse posts stored in the Discourse postgres DB.
- Use these postid-to-postid mappings to grab the original vB post text from each vB post in the original mysql DB.
- Preprocess the vB post text
- Postprocess the vB post text
- Update the raw post in the Discourse DB
- Test and redo.
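For anyone curious, the flow of the steps above can be sketched roughly as below. This is not the actual script: plain Ruby hashes stand in for the mysql and postgres databases, and the names `reprocess_posts`, `preprocess` and `postprocess` (and what they do inside) are placeholders made up for illustration.

```ruby
# Hypothetical sketch only; hashes stand in for the two databases.
#   mappings:      array of [vb_post_id, discourse_post_id] pairs
#   vb_posts:      vB postid => original vB post text (mysql in the real setup)
#   discourse_raw: Discourse post id => raw text (postgres in the real setup)

def preprocess(text)
  # Placeholder step: normalize Windows line endings before any conversion.
  text.gsub("\r\n", "\n")
end

def postprocess(text)
  # Placeholder step: strip trailing whitespace from every line.
  text.lines.map(&:rstrip).join("\n")
end

def reprocess_posts(mappings, vb_posts, discourse_raw)
  updated = 0
  mappings.each do |vb_id, discourse_id|
    original = vb_posts[vb_id]
    next if original.nil? # no source text for this mapping, leave the post alone

    discourse_raw[discourse_id] = postprocess(preprocess(original))
    updated += 1
  end
  updated # number of raw posts rewritten
end
```

Because only the raw text is rewritten, a run like this can be repeated as often as needed before committing to a rebake.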
The script above processes about a million posts in 45 minutes (much faster). Once I am happy with the results, the raw posts can be rebaked into the cooked posts. Rebaking
1M posts takes 16+ hours, so I am avoiding this when possible.
Ran this yesterday and found that all the bugs posted by @MadeInGermany before (mangled code, missing left square brackets) and the hard line break error reported by @Scrutinzer (where
\n in code fragments was converted to hard line breaks) were fixed.
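To illustrate the kind of fix involved in the hard line break error, here is a minimal sketch. It assumes the problem came from a global substitution that turned literal \n sequences into real newlines, and that code fragments are still wrapped in vB [code]...[/code] tags at that point; both are assumptions, and `unescape_outside_code` is a made-up name, not the function in the real script.

```ruby
# Hypothetical sketch: convert literal backslash-n sequences to newlines,
# but leave anything inside [code]...[/code] untouched.
CODE_BLOCK = %r{(\[code\].*?\[/code\])}mi

def unescape_outside_code(text)
  # split with a capture group keeps the code blocks in the result array
  text.split(CODE_BLOCK).map do |segment|
    if segment.match?(/\A\[code\]/i)
      segment                    # code fragment: keep verbatim
    else
      segment.gsub('\n', "\n")   # prose: literal \n becomes a real newline
    end
  end.join
end
```

With this approach a `printf "x\n"` inside a code fragment keeps its backslash-n, while the same sequence in ordinary post text still becomes a line break.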
However, still more gremlins to slay, working on:
- Fixing missing emoji in the preprocessing. In particular, the thumbs up emoji that Ravinder loves to use, :b:, now converts to :+1:. DONE
- Fixing a bug in attachments and other images. DONE
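The emoji fix can be pictured as a simple lookup table applied during preprocessing. A minimal sketch follows; only the :b: to :+1: pair comes from the actual work, while `EMOJI_MAP` and `convert_smilies` are hypothetical names and the real table would hold more entries.

```ruby
# Hypothetical sketch: map vB smilie codes to Discourse emoji codes.
EMOJI_MAP = {
  ":b:" => ":+1:" # thumbs up
}.freeze

def convert_smilies(text)
  # Regexp.union escapes each key, so the codes are matched literally.
  pattern = Regexp.union(EMOJI_MAP.keys)
  # With a hash as the replacement, gsub looks up each matched code.
  text.gsub(pattern, EMOJI_MAP)
end
```

A sketch like this does not skip code fragments, so in practice it would need to run only on the text outside them.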
The main reported gremlins in code fragments appear to be fixed. Now working on other missing transformations (missing emoji, images, etc.).
Making progress... slowly but surely.
All work currently done on test / staging server only.