Update on vB3 Migration to Discourse - Issues and Status of BBCode Transformations

# 8  
Old 04-05-2020
For those who have never seen the inside of a Discourse app, here are the canned rake tasks:

# rake --tasks
rake about                                                             # List versions of all Rails frameworks and the environment
rake add_topic_to_quotes                                               # Add the topic to quotes
rake admin:create                                                      # Creates a forum administrator
rake admin:invite[email]                                               # invite an admin to this discourse instance
rake api_key:create_master[description]                                # generate a master api key with given description
rake app:template                                                      # Applies the template supplied by LOCATION=(/path/to/template) or URL
rake app:update                                                        # Update configs and some other initially generated files (or use just update:configs or update:bin)
rake assets:clean[keep]                                                # Remove old compiled assets
rake assets:clobber                                                    # Remove compiled assets
rake assets:environment                                                # Load asset compile environment
rake assets:precompile                                                 # Compile all the assets named in config.assets.precompile
rake assets:prestage                                                   # pre-stage assets on cdn
rake autospec                                                          # Run all specs automatically as needed
rake avatars:clean                                                     # Clean up all avatar thumbnails (use this when the thumbnail algorithm changes)
rake avatars:refresh                                                   # Refresh all avatars (download missing gravatars, refresh system)
rake bookmarks:sync_to_table[sync_limit]                               # migrates old PostAction bookmarks to the new Bookmark model & table
rake build:stamp                                                       # stamp the current build with the git hash placed in version.rb
rake build_test_topic                                                  # create pushstate/replacestate test topic
rake cache_digests:dependencies                                        # Lookup first-level dependencies for TEMPLATE (like messages/show or comments/_comment.html)
rake cache_digests:nested_dependencies                                 # Lookup nested dependencies for TEMPLATE (like messages/show or comments/_comment.html)
rake categories:list                                                   # Output a list of categories
rake db:create                                                         # Creates the database from DATABASE_URL or config/database.yml for the current RAILS_ENV (use db:create:al...
rake db:drop                                                           # Drops the database from DATABASE_URL or config/database.yml for the current RAILS_ENV (use db:drop:all to...
rake db:environment:set                                                # Set the environment value for the database
rake db:fixtures:load                                                  # Loads fixtures into the current environment's database
rake db:migrate:status                                                 # Display status of migrations
rake db:prepare                                                        # Runs setup if database does not exist, or runs migrations if it does
rake db:rebuild_indexes                                                # Rebuild indexes
rake db:schema:cache:clear                                             # Clears a db/schema_cache.yml file
rake db:schema:cache:dump                                              # Creates a db/schema_cache.yml file
rake db:schema:dump                                                    # Creates a db/schema.rb file that is portable against any DB supported by Active Record
rake db:schema:load                                                    # Loads a schema.rb file into the database
rake db:seed                                                           # Loads the seed data from db/seeds.rb
rake db:seed:replant                                                   # Truncates tables of each database for current environment and loads the seeds
rake db:seed_fu                                                        # Loads seed data for the current environment
rake db:setup                                                          # Creates the database, loads the schema, and initializes with the seed data (use db:reset to also drop the...
rake db:stats                                                          # Statistics about database
rake db:structure:load                                                 # Recreates the databases from the structure.sql file
rake db:version                                                        # Retrieves the current schema version number
rake destroy:categories                                                # Destroy a comma separated list of category ids
rake destroy:groups                                                    # Destroy all groups
rake destroy:private_messages                                          # Remove all private messages
rake destroy:stats                                                     # Destroy site stats
rake destroy:topics[category,parent_category]                          # Remove all topics in a category
rake destroy:topics_all_categories                                     # Remove all topics in all categories
rake destroy:users                                                     # Destroy all non-admin users
rake docker:test                                                       # Run all tests (JS and code in a standalone environment)
rake emails:import                                                     # use this task to import a mailbox into Disourse
rake emails:test[email]                                                # Check if SMTP connection is successful and send test message
rake emoji:test                                                        # test the emoji generation script
rake emoji:update                                                      # update emoji images
rake enqueue_digest_emails                                             # This task is called by the Heroku scheduler add-on
rake export:categories[category_ids]                                   # Export all the categories
rake export:category_structure[include_group_users,file_name]          # Export only the structure of all categories
rake i18n:check[locale]                                                # Checks locale files for errors
rake i18n:reseed[locale]                                               # Update seeded topics and categories with latest translations
rake import:file[file_name]                                            # Import existing exported file
rake incoming_emails:truncate_long                                     # removes attachments and truncates long raw message
rake integration:create_fixtures                                       # Creates the integration fixtures
rake log:clear                                                         # Truncates all/specified *.log files in log/ to zero bytes (specify which logs with LOGS=test,development)
rake maxminddb:get                                                     # downloads MaxMind's GeoLite2-City database
rake middleware                                                        # Prints out your Rack middleware stack
rake multisite:generate:config                                         # generate multisite config file (if missing)
rake multisite:migrate                                                 # migrate all sites in tier
rake multisite:rollback                                                # rollback migrations for all sites in tier
rake plugin:install[repo]                                              # install plugin
rake plugin:install_all_gems                                           # install all plugin gems
rake plugin:install_all_official                                       # install all official plugins (use GIT_WRITE=1 to pull with write access)
rake plugin:install_gems[plugin]                                       # install plugin gems
rake plugin:migrate:down[plugin]                                       # run all migrations of a plugin
rake plugin:qunit[plugin,timeout]                                      # run plugin qunit tests
rake plugin:spec[plugin]                                               # run plugin specs
rake plugin:update[plugin]                                             # update a plugin
rake plugin:update_all                                                 # update all plugins
rake poll:migrate_old_polls                                            # Migrate old polls to new syntax
rake posts:delete_all_likes                                            # Delete all likes
rake posts:delete_word[find,type,ignore_case]                          # Delete occurrence of a word/string
rake posts:fix_letter_avatars                                          # Rebake all posts with a quote using a letter_avatar
rake posts:inline_uploads                                              # Coverts full upload URLs in `Post#raw` to short upload url
rake posts:invalidate_broken_images                                    # invalidate broken images
rake posts:missing_uploads                                             # Finds missing post upload records from cooked HTML content
rake posts:normalize_code                                              # normalize all markdown so <pre><code> is not used and instead backticks
rake posts:rebake                                                      # Update each post with latest markdown
rake posts:rebake_match[pattern,type,delay]                            # Rebake all posts matching string/regex and optionally delay the loop
rake posts:recover_uploads_from_index                                  # Attempts to recover missing uploads from an index file
rake posts:refresh_emails[topic_id]                                    # Refreshes each post that was received via email
rake posts:refresh_oneboxes                                            # Update each post with latest markdown and refresh oneboxes
rake posts:remap[find,replace,type,ignore_case]                        # Remap all posts matching specific string
rake posts:reorder_posts[topic_id]                                     # Reorders all posts based on their creation_date
rake qunit:test[timeout,qunit_path]                                    # Runs the qunit test suite
rake release_note:generate[from,to]                                    # generate a release note from the important commits
rake restart                                                           # Restart app by touching tmp/restart.txt
rake scheduler:run_all                                                 # run every task the scheduler knows about in that order, use only for debugging
rake secret                                                            # Generate a cryptographically secure secret key (this is typically used to generate a secret for cookie se...
rake site_settings:export                                              # Exports site settings
rake site_settings:import                                              # Imports site settings
rake smoke:test                                                        # run chrome headless smoke tests on current build
rake stats                                                             # Report code statistics (KLOCs, etc) from the application or engine
rake themes:install                                                    # Install themes & theme components
rake time:zones[country_or_offset]                                     # List all time zones, list by two-letter country code (`rails time:zones[US]`), or list by UTC offset (`ra...
rake tmp:clear                                                         # Clear cache, socket and screenshot files from tmp/ (narrow w/ tmp:cache:clear, tmp:sockets:clear, tmp:scr...
rake tmp:create                                                        # Creates tmp directories for cache, sockets, and pids
rake user_actions:rebuild                                              # rebuild the user_actions table
rake users:anonymize_all                                               # Anonymize all users except staff
rake users:change_post_ownership[old_username,new_username,archetype]  # Change topic/post ownership of all the topics/posts by a specific user (without creating new revision)
rake users:disable_2fa[username]                                       # Disable 2FA for user with the given username
rake users:list_recent_staff                                           # List all users which have been staff in the last month
rake users:merge[source_username,target_username]                      # Merge the source user into the target user
rake users:recalculate_post_counts                                     # Recalculate post and topic counts in user stats
rake users:rename[old_username,new_username]                           # Rename a user
rake users:update_posts[old_username,current_username]                 # Update username in quotes and mentions
rake yarn:install                                                      # Install all JavaScript dependencies as specified via Yarn
rake zeitwerk:check                                                    # Checks project structure for Zeitwerk compatibility

# 9  
Old 04-09-2020
It has been a few days since my last status update, so here is a new, quick one:

@Scrutinizer and @Neo continue to work on the migration and are getting closer. The migration script provided OOTB by Discourse (the various BBCODE converters) mangled a lot of text, and @Scrutinizer has been leading the effort to get the various bbcode conversions as error-free as practical. So far, so good.

Today, after more discussions with @Scrutinizer, I modified a Discourse theme component and added four new editor / composer buttons:

[Screenshot: new composer buttons]

This was my first Discourse theme component modification, and frankly speaking, it was really easy (orders of magnitude easier, and infinitely faster debugging, than Discourse plugin development and testing).

That modified theme component is available on GitHub as md-composer-extras-neo


Frankly, we don't want to get too much into editor / composer button modifications until people use it more, so these will probably be the only changes to the composer before going live (sooner rather than later). We can look for better icons on FontAwesome and get input from the people who matter the most, all of you!

@Scrutinizer tells me that he is getting closer to having his new Ruby preprocessing script ready for testing against the database, and he says he has been having a lot of fun with Ruby as well!

More later.....
# 10  
Old 04-13-2020
Another update:

We continue to make progress in the complex mess of migrating the bbcode from the old forum to the new one:
  • @Scrutinizer wrote some nice code which strips all bbcode (like color, fonts, etc) from our old code tags because these will not work in markdown in the new forums.
  • After more testing, it also became clear that old bbcode tags like color, which look good in one theme, look terrible in other themes. So, I have decided to strip out the vast majority of these legacy bbcode tags everywhere in the migration (COLOR, SIZE, FONT, etc). This ensures that themes in the new forums are not constrained by low-value color, font and size bbcode.
  • In addition, inline code tags need an extra space before and after when they migrate to markdown (MD), because users often do not put spaces around them and this causes MD errors.
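
The two preprocessing steps described above could look roughly like the following. This is only a minimal sketch, not the actual migration script: the method names, the tag list and the `[ICODE]` tag name for inline code are assumptions made for illustration.

```ruby
# Hypothetical helpers for illustration; not the real preprocessing code.
FORMAT_TAGS = %w[color size font].freeze

# Remove [COLOR=...], [/COLOR], [SIZE=...], [FONT=...] etc. and keep the text inside.
def strip_format_tags(raw)
  pattern = /\[\/?(?:#{FORMAT_TAGS.join('|')})(?:=[^\]]*)?\]/i
  raw.gsub(pattern, '')
end

# Turn [ICODE]...[/ICODE] (assumed tag name) into backticks, adding a space
# on either side when the neighbouring character is not already whitespace.
def pad_inline_code(raw)
  raw.gsub(/\[icode\](.*?)\[\/icode\]/im) do
    m = Regexp.last_match
    lead  = m.pre_match.empty?  || m.pre_match.match?(/\s\z/)  ? '' : ' '
    trail = m.post_match.empty? || m.post_match.match?(/\A\s/) ? '' : ' '
    "#{lead}`#{m[1]}`#{trail}"
  end
end

puts strip_format_tags('[COLOR=red]warn[/COLOR] [SIZE="3"]big[/SIZE]')
puts pad_inline_code('run[icode]ls -l[/icode]now')
```

Stripping the tags but keeping the enclosed text means no content is lost, only the styling.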

On the system admin side:
  • I found out that Docker on macOS does not support sharing unix sockets outside the docker container. I found this out after spending two days trying to set up a new test configuration which decouples the Discourse app in the container from external ports. However, it works fine on Linux. This means that I will migrate all Discourse apps (staging and production) from a single standalone docker container to a two-container solution, where the postgres DB data is in one container and the web app is in another container.
  • In addition, the web app container will no longer expose TCP/IP sockets directly but will only expose a unix socket. This means we can decouple the web app from the web server. For example, we can completely rebuild the app in a new docker web container and expose a different unix socket.
  • On nginx, this means we can just switch between docker instances with a simple symbolic link change outside the container. This is very powerful. I have tested it and it works flawlessly.
  • On apache2, reverse proxying to a symbolic link that points at the unix socket in the docker container does not work (it will not connect). Only a direct path to the shared docker unix socket works in the proxy pass configuration, so switching containers requires an apache server restart.
  • nginx does not require a restart because the symlink works.
  • Yesterday, I got both apache2 and nginx working in reverse proxy mode to a unix socket; but ran into some minor issues with SSL; which is the next layer of testing I need to do. It worked flawlessly on http but I ran into some small issues on https (SSL).
  • Also, in quick testing, I found that nginx was about five percent faster than apache2, but that was on two different servers with very different configurations and traffic, so this comparison is not (yet) relevant or valid.
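
The symlink switch mentioned above can be sketched as follows. The post does not say how the link is changed, so the create-then-rename approach here is my own assumption, and the file names are stand-ins created in a temporary directory rather than real Discourse socket paths.

```ruby
require 'fileutils'
require 'tmpdir'

# Point link_path at target. A temporary symlink is created first and then
# renamed over the old one, so the link never disappears in between.
def switch_symlink(target, link_path)
  tmp = "#{link_path}.tmp.#{Process.pid}"
  File.symlink(target, tmp)
  File.rename(tmp, link_path)
end

# Demo with plain files standing in for the sockets of two containers.
Dir.mktmpdir do |dir|
  old_sock = File.join(dir, 'web_a.sock')   # hypothetical name
  new_sock = File.join(dir, 'web_b.sock')   # hypothetical name
  FileUtils.touch([old_sock, new_sock])

  current = File.join(dir, 'current.sock')  # the path the web server would use
  switch_symlink(old_sock, current)
  switch_symlink(new_sock, current)
  puts File.readlink(current)
end
```

The web server configuration only ever names the `current.sock` link, which is why nothing needs to be restarted when the link is repointed.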

So where are we now?
  • Still testing various preprocessing code routines against the DB, looking for anomalies. We are not hoping for 100% perfect, but we do want to keep all the code and solutions intact for sure (sans color, font size, fonts, etc).
  • Soon, I will work on getting SSL to work on the "two container with web server reversed proxy to an exposed docker unix socket" (TTWSRPEDUS, LOL) solution on a staging server.

That's it for now.
# 11  
Old 04-14-2020

This migration is moving along. @Scrutinizer has been a great help with debugging the bbcode migration, writing Ruby methods to preprocess various bbcode situations which arise in the conversion to markdown.

For example:

There were some newlines being added to code blocks, so we added some post-processing REGEX search and replace to tidy these up. This was a purely cosmetic change, but @Scrutinizer is more annoyed by these cosmetic details than I am, and so he wrote some Ruby code to fix it. We are very fortunate to have @Scrutinizer working with me on this.
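
A post-processing pass of this kind might look like the sketch below. It is not @Scrutinizer's code: it assumes the converted posts use three-backtick markdown fences, and the method name is made up for illustration.

```ruby
# A markdown code fence, built from single backticks so the marker itself
# never appears at the start of a line in this example.
FENCE = '`' * 3

# Drop blank lines directly after an opening fence and directly before the
# closing fence; the code in between is left untouched.
def tidy_code_fences(markdown)
  markdown.gsub(/^#{FENCE}([^\n]*)\n(.*?)^#{FENCE}[ \t]*$/m) do
    info, body = $1, $2
    body = body.sub(/\A(?:[ \t]*\n)+/, '').sub(/(?:\n[ \t]*)+\z/, "\n")
    "#{FENCE}#{info}\n#{body}#{FENCE}"
  end
end

messy = "text\n#{FENCE}\n\n\nls -l\n\n\n#{FENCE}\nmore"
puts tidy_code_fences(messy)
```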

The same is true for some "bbcode abuse": over the years, some people copied and pasted bbcode into the forum, and others loved bbcode so much that they embedded it everywhere, sometimes nesting bbcode in strange ways. We have also slayed most of those dragons.

We are getting very close. We cannot promise that 100% of every possible combination of bbcode mangling in the vB3 forum will be perfect, but it will be very good, a few orders of magnitude better than the initial release, hands down.

Currently I am rebaking all the posts again on the staging server. That is a process which takes 12 to 14 hours. For those who may not be familiar with this, here is a short summary:

The vB3 forum (like most, if not all, LAMP-based forums) processes the pagetext in the database on the fly (when the page is requested by the client, e.g. the web browser).

However, Discourse stores the pagetext as "raw" and then it cooks the raw into HTML to be rendered. This of course makes the site faster since the code is already rendered and stored in the DB "cooked".

The downside to this, of course, is that it takes longer to "cook all this" during migration testing (reprocessing the raw for bbcode mangling); but lucky for us, after migration is done, it's done.
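
The raw / cooked idea can be shown with a toy model. This is only a sketch of the general technique and does not reflect Discourse's real code: the `Post` struct, the stand-in renderer and the method names are all hypothetical, and the renderer does no HTML escaping.

```ruby
# Toy model only; Discourse's real pipeline is far more involved.
Post = Struct.new(:raw, :cooked)

# Stand-in renderer: wraps the text in <p> and turns **bold** into <strong>.
def cook(raw)
  html = raw.gsub(/\*\*(.+?)\*\*/, '<strong>\1</strong>')
  "<p>#{html}</p>"
end

# Write path: render once and store the HTML next to the raw text.
def save_post(raw)
  Post.new(raw, cook(raw))
end

# Read path: serve the stored HTML, no rendering at request time.
def render(post)
  post.cooked
end

# "Rebake": re-cook every stored post, e.g. after the raw text was reprocessed.
def rebake(posts)
  posts.each { |post| post.cooked = cook(post.raw) }
end

post = save_post('hello **world**')
puts render(post)
```

The cost moves from every page view to every change of the raw text or the renderer, which is why a full rebake of a large forum takes hours.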

BTW, this is the same way I serve our forum man pages. Man pages are also cooked, and the cooked pages are stored in the DB to make them render faster, so this technique is nothing new.

OBTW, those man pages will stay here in the legacy vB3 forums; until we decide if and/or when to write a plugin to port these to discourse.

That's it for now.
# 12  
Old 04-15-2020

Long baking (staging server, discourse1) done after close to 15 hours:

[Screenshot taken 2020-04-15, 7:35:04 AM]

It is much improved, but @Scrutinizer, with his eagle eye for details, has plans for more refinements to make it even better, as there are still some chars "lost in migration" outside of the code tag fences.
# 13  
Old 04-15-2020

Have made some core changes and am currently running both our staging server and our production server in "two container" mode; where the database and the web app are in two separate docker containers:

[Screenshot taken 2020-04-16, 9:18:45 AM]

My next plan is to set up nginx as a reverse proxy on the staging server to a unix domain socket in the docker container, and decouple the network (TCP/IP) from the web app in the container.

This will permit us to completely rebuild the app, add new plugins, add custom code, etc. to the web app with almost zero downtime, because each container will have its own unique shared (persistent) directory and we can symlink from outside the container to inside.

This means, for example, on nginx we can just change symlinks to move from one live container to the other.

OBTW, apache (which we are not using in this setup) will not let us set up the proxy configuration with a symlink, so a restart of the web server is required. It's not a big deal, so if you are running apache and want to integrate these container-based web apps into your virtual host configurations, it is not a problem at all.

I have been making these changes in a controlled, step-by-step manner, since these are "breaking changes". However, the site will be much more robust when we are done.

Honestly, I think the "two container" solution should be standard OOTB and not "an advanced configuration" like Discourse meta says. The two-container setup is actually straightforward and the best way to go. The reverse proxy to a unix domain socket is a bit more "advanced", so I do understand why folks call this "advanced".
# 14  
Old 04-16-2020

After many days on this one issue, I have got the "two container" solution with nginx as a front-end reverse proxy to a unix domain socket working on the staging server.

There is a setting which is NOT in the discourse admin UI, which I had to set from the rails console:

cd /var/discourse
./launcher enter socket-only
rails c
SiteSetting.force_https = true

This site setting does not exist in the site_settings DB table until you run the command above.

Only then does it appear in the DB:

postgres=# \c discourse
You are now connected to database "discourse" as user "postgres".
discourse=# select * from site_settings where name like '%http%';
 id |    name     | data_type | value |         created_at         |         updated_at         
 79 | force_https |         5 | t     | 2020-04-16 05:51:13.165124 | 2020-04-16 05:51:13.165124
(1 row)

discourse=# \q

Even then, this setting does not appear in the admin UI.

After restarting the app with:

./launcher restart socket-only

it worked..... finally !!