Developer’s guide to effective and maintainable internationalization – multiple language support in web applications

item_3 Intl flags

An interesting problem I’ve encountered many times over the years is the multi-language website feature. Given a primary language of the website with its respective database, it is pretty tricky to come up with a good way of having multiple languages for the same website and making it easy to maintain.

First, let’s consider we have the following prerequisites:
– an SQL based relational database (e.g. MySQL)
– support for creating Views

And also, optional but useful:
– able to create triggers on the DB for insert/update/delete methods

I’ll be using a simple SQL relational database with table structure as shown below:

Article : { id, title, content, date, id_image, id_category }
Category : { id, title, key, id_parent }
Image : { id, filename, width, height }

Of course, id is the primary key in each table, and fields prefixed “id_” are foreign keys in corresponding tables.

Let’s also say that our primary language is English and we also need to support French and German.

There are several approaches for achieving the multi-language feature,
so let’s start with those that first come to mind – which are, by the way, a terrible idea.

—————
1) Additional fields for language support

The idea behind additional fields is that we have a special field for each language (English, French or German) in the same table which contains human-readable text.

This approach is easy to implement on a database, and the above Article table structure would be transformed to:

Article : { id, title, title_ger, title_fra, content, content_ger, content_fra, date, id_image, id_category }

Then, for example, once on the German domain we use fields title_ger and content_ger in our SQL queries.

Select id, title_ger, content_ger, date from Article

Seems really simple, right? Well it is!

However, you can already notice why this is a bad approach. Your application will need to be restructured to keep up with the multiple language feature – and you will most likely have to change a lot of your queries and also the application code (e.g. PHP).

The second problem is that you will have to restructure (alter) a big portion of your database each time you introduce a new language (e.g. when you decide to support Croatian).

The third and the biggest problem from my opinion is how this approach will affect your Content Management System ( Admin area ) for your website. You will either have to create some panels for each language-specific field (e.g. when entering content of an Article, separate fields for eng/ger/fra) which leads to a lot of restructuring (and ugly user interface) – or you will have three separate modes of CMS – one for each language and then switch the mode context back and forth when editing – which is a slightly better way of handling it.

Overall, taking the additional field road will force you to seriously restructure your application in order to accommodate the multiple language feature and in the same time make it very hard to maintain.

—————
2) One database per language

Whoa man, is this a bad idea…

So you would have something like this:

DB_EN
------
Article : { id, title, content, date, id_image, id_category }
Category : { id, title, key, id_parent }
Image : { id, filename, width, height }

DB_GE
------
Article : { id, title, content, date, id_image, id_category }
Category : { id, title, key, id_parent }
Image : { id, filename, width, height }

DB_FR
------
Article : { id, title, content, date, id_image, id_category }
Category : { id, title, key, id_parent }
Image : { id, filename, width, height }

Although this might be a good solution if each language offers different content, in most cases this is the right way to go if you wish to create maintenance hell for yourself. The reason is obvious, each time you have to alter the DB structure, you’ll have to do it for all the databases. Also if the databases need to be synced, you will have to execute the same insert/update/delete query for each of the databases. Imagine what kind of modifications you would be making to your application…

—————
3) One database and a table per language

This basically has almost the same nigthmarish effects as the One DB per language approach, with a special touch of you needing to rewrite all of your SQL queries in order to add the language suffix _EN, _GE, _FR . How fun does that sound?

DB
------
Article_EN : { id, title, content, date, id_image, id_category }
Category_EN : { id, title, key, id_parent }
Image_EN : { id, filename, width, height }

Article_GE : { id, title, content, date, id_image, id_category }
Category_GE : { id, title, key, id_parent }
Image_GE : { id, filename, width, height }

Article_FR : { id, title, content, date, id_image, id_category }
Category_FR : { id, title, key, id_parent }
Image_FR : { id, filename, width, height }

——————————

Ok, so now we have discussed some approaches to support multiple languages. Although it’s still hard to determine which one of the above causes most developer deaths per year, I think we’re ready to learn from our mistakes and move on.

Assuming that we still want to make an effective solution which is easily maintained, let’s create a scenario:

– our application is already working for the primary language (e.g. English)
– our SQL queries are not explicitly naming the database in which the tables reside, we mention the database only once – when connecting

Given that above is true, let’s compile a list of features for our multi-language system to support:

– our website delivers the same content in multiple languages

– we do not want to modify our existing database which works just fine with the primary language

– we do not want to modify existing application code because it’s working great, only adjust the application router to support multiple domains or URLs for our languages

– if we need to alter the database in the future, we will do it only on the existing primary DB since we’re too lazy to maintain three databases

– we do not expect our web master (CMS user) to know all three languages

– we expect from a third person we hire for translating our content from English to German and French not to be a skillful web master, just a good translator

At this point, I think some of you will see where I’m going with this approach.
Before I reveal the details of the solution I’ll give you the opportunity to answer a few questions down in the Facebook comments. I’m quite sure the questions themselves will steer you in the right direction.

1) How do you prepare your application for the translator? What kind of UI does he need to know how to use in order to translate from English to other languages?

2) Would you be using more than one (existing) database to accommodate the features above? If so, how many?

3) If your application code (the business logic and model parts) and the SQL queries are not modified, how can you achieve that once on a French domain the application delivers French content?

4) Do all string typed fields in your database necessarily need to be translated (e.g. password hashes)? How would you instruct the system to omit those and include others in the translation UI?

5) If you had no French translation for an Article, how would you achieve that the user gets the default English version of it (without changing your application code or existing English DB)?

I will post the whole solution here soon :-) I have a PHP implementation for this too :-)

Can’t wait to read your answers in the FB comments!

Don’t forget to share! :-)

Coding Examples, Tutorials, You're doing it wrong | Posted on April 8, 2014 by .

About Kristijan Burnik

Kristijan Burnik is a Programmer and Web Developer specializing in Server-side and Client-side Technologies and Application Development on Linux Servers and Windows Desktop . Also has experience in Networking, Desktop application development as well as Android mobile application development . Experienced in Programming Languages like Java, C, C++, C#, PHP, Javascript, MySQL and somewhat in other languages like Bash, Perl & Python. Sometimes he works as a Graphical Designer for digital production as well as for printing and advertising. He dedicates his spare time writing Tech Articles on his blog in order to share his work with others, as well as to document his projects for his own use. He's also an Educator & Mentor in field of Algorithms and Programming to young programmers in Zagreb, Croatia.