February 19, 2015
The Geek Speaks: How to Correct Back-links in Hundreds of Posts
Guest post by Silver Bullet.
Ever since my wife decided to shift from Blogger to WordPress 1.5 years ago, the one major task left to complete the migration was updating all of the back-links that would still take readers to her old blog rather than the same posts on her new site.
A task which she has been procrastinating for ages, and the one task she has been fretting about all this while.
With the recent addition of her “Best of” page as part of her Blogging 101 assignment, there was no longer avoiding updating the back-links and I saw her reluctantly dragging her feet to set herself up for this task.
That night, I watched for some time as she laboriously updated each link in her posts manually. She had to:
- check each existing back-link
- if it still directing to her old blog, she’d have to find the relevant post on her new blog domain
- Look for the linked-post within the new domain
- and then manually correct the back-link…one link at a time (!!!)
Now. That bothers me. Not only do I have to hear her muttering cusses under her breath while at it, but because I am an IT guy and having worked on data management and databases for the better part of a decade, I felt there had to be a more efficient way to handle this.
I suspected that WordPress must be using some kind of database in the background to store all the data, and considering how customizable a WordPress blog is, I also assumed that it would be accessible by the blog’s owner.
All it took was a little bit of Googling to confirm my suspicions:
WordPress uses a MySQL database to house all the blog’s data. Most site hosts provide a control panel (cPanel) which, among other programs, contains an application called “phpMyAdmin”. This application can be used to manage the database and the data inside it.
Knowing what I knew, I then logged into the control panel and quickly found what I was looking for.
On the left hand side of the screen, there is an overview of the databases, and a bunch of tabs on the right providing a host of options to play with.
There were also not too many tables in the database and it wasn’t difficult to identify the table containing the actual posts of the blog. I then ran a short query on the column with the blog text, searching for any posts containing the old blog’s URL: oldblogURL.blogspot.com.
This returned all the posts that still had outdated back-links. With some basic SQL query language, I managed to extract the full URL strings of all the back-links in each of the posts.
The same table containing the posts data also contains the grubbsncritters.com URL of each post. As most back-links pointing to the old blog had the same format on the new blog (http://blogUrl.com/postname.html), I took the following steps:
- Replaced the old domain name with the new domain (grubbsncritters.com)
- Ran a query which replaced the old domain name in the back-link with the new one, and compared this with the existing URLs on the Grubbncritters.com site.
For all records that matched, it meant that the existing back-link in the post could be directly updated by replacing the old domain name with the new one. It turned out that about 60% of all back-links matched.
After which, I:
- made a full backup copy of the table containing all the posts – This way, if something would go wrong, I could simply replace the updated posts with the original again. Luckily this was not required.
- to be double safe I then added a separate new table in which I inserted the original post texts before running the update.
Changing the query to an update statement then allowed me to update all those back-links from the old domain to the new one with a single click of a button. That updated about 200 posts automatically.
Unfortunately I was not able to match the remaining 131 posts containing “oldblogURL.blogspot.com” back-links character-for-character with an existing URL on the new blog . Those, to my chagrin, have to be checked by hand – a task I will leave to my darling wife.
The upside: I was able to generate a list of all the posts affected, including the link to access it, which will allow her to quickly go to these posts and update the back-links manually. It is still a bit of work, but considerably faster than having to check each of her 4000+ posts one-by-one.
The more technical bit actually came with writing the query itself and for some, the steps I described above may already seem quite technical and an exceedingly boring exercise.
My wife would have balked at this with more cusses had she attempted to do it herself, but because I love puzzles like this, I really enjoyed figuring it out and helping my wife with something she’s passionate about (she thinks I’m mad, and she’s probably right).
As I work mainly with MSSQL systems with limited hands-on experience with MySQL databases, the practice came in really handy for me. In fact, I was quite surprised how customizable and accessible the back-end of WordPress really is!
I am certainly looking forward to the puzzling on more technical aspects of what goes on behind the blog and help my wife (or share thoughts with other fellow geeks) when other challenges present itself in the future.
Side note from Ann:
The article above would be particularly useful for the technically challenged who have been blogging for a while and/or are thinking about migrating their blog to a different platform. I suppose this would come in useful in case you run into the same issue with back-links.
Yes. I think my husband is crazy for finding such stuff to be a very interesting puzzle. But I have to admit that his enthusiasm behind all the technical geek-speak comes in handy for someone who cannot for the life of me get interested to puzzle over the codes. Feel free to shoot questions or adding your builds on my contact form addressing your queries to [Silver Bullet] on the technical aspects of blogging.
If you are celebrating Chinese New Year, here’s us wishing you Gong Xi Fa Cai. May this year brings you an abundance of blessings, health, prosperity and wealth of wisdom.