How I implemented multilingual support on the website and in the project

While creating and maintaining an open-source project, I wanted to solve every possible problem with multilingual support for both the project and its site. I have been dealing with multilingual support in various projects for a very long time, starting with desktop programs. So, having an idea of what might be needed, I began looking at the existing solutions. Yes, almost all SaaS services offer free plans for open-source projects, but they are mostly geared toward translating string resources. What about the website and the documentation? Unfortunately, I did not find anything suitable and set about implementing my own. I must say that the result satisfied me, and I have been using the system for almost six months. I should warn you that it is not a complete mass-market solution but a specific implementation for my own needs; still, I hope some of the ideas will be useful to other developers.

To begin, here are the requirements I set for the future system.

  1. What needs to be localized: the project's resources, which are stored as JSON in .js files, plus the entire website and its documentation.
  2. A resource may have no translation into other languages. For example, I can save texts in Russian and hand them to a translator later, while the Russian version of the website already shows them.
  3. There must be a convenient system on the website that lets a user translate resources not yet available in his language, create a new resource (text), or review and edit existing texts in his native language. It should look like this: the user selects the action (translate, review), the native language (and, for translation, the source language) and the desired volume. A resource matching these parameters is found and offered for translation or editing. Of course, there should be a log of user actions and accumulated statistics on the work performed.
  4. There should be a language selector, but each page should offer only those languages for which a translation of that page already exists.
  5. The same string can be used in several places, for example both in .js and in the documentation. The resource must therefore exist in a single instance, and when it changes, it must change in both the JSON and the documentation.
  6. Ideally there should be some kind of auto-moderation system, but for now publication can be decided personally.

Displaying changes in real time was not important to me, so I decided to keep all the inner workings in intermediate tables and then build the JSON and generate the site pages from them. In fact, four tables are enough.
Table structure
CREATE TABLE IF NOT EXISTS `languages` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`_uptime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`_owner` smallint(5) unsigned NOT NULL,
`name` varchar(32) NOT NULL,
`native` varchar(32) NOT NULL,
`iso639` varchar(2) NOT NULL,
PRIMARY KEY (`id`),
KEY `_uptime` (`_uptime`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 ;

CREATE TABLE IF NOT EXISTS `langid` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`_uptime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`_owner` smallint(5) unsigned NOT NULL,
`name` varchar(96) NOT NULL,
`comment` text NOT NULL,
`restype` tinyint(3) unsigned NOT NULL,
`attrib` tinyint(3) unsigned NOT NULL,
PRIMARY KEY (`id`),
KEY `_uptime` (`_uptime`),
KEY `name` (`name`),
KEY `restype` (`restype`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 ;

CREATE TABLE IF NOT EXISTS `langlog` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`_uptime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`_owner` smallint(5) unsigned NOT NULL,
`iduser` int(10) unsigned NOT NULL,
`idlangres` int(10) unsigned NOT NULL,
`action` tinyint(3) unsigned NOT NULL,
PRIMARY KEY (`id`),
KEY `_uptime` (`_uptime`),
KEY `iduser` (`iduser`,`idlangres`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 ;

CREATE TABLE IF NOT EXISTS `langres` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`_uptime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`_owner` smallint(5) unsigned NOT NULL,
`langid` smallint(5) unsigned NOT NULL,
`lang` tinyint(3) unsigned NOT NULL,
`text` text NOT NULL,
`prev` mediumint(9) unsigned NOT NULL,
`verified` tinyint(3) NOT NULL,
`size` mediumint(9) unsigned NOT NULL,
PRIMARY KEY (`id`),
KEY `_uptime` (`_uptime`),
KEY `langid` (`langid`,`lang`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 ;

The languages table has three fields: name, native, iso639. An example entry: Russian, Русский, ru

The langid table holds the text resource identifiers, where you can specify a comment and a type. I divided all resources into several types: JSON strings, pages, plain text, and text in Markdown format. You can, of course, use your own types.
Example: cancelbtn, Text for the Cancel button, JSON

The langres table of text resources (langid, lang, text, prev) stores a reference to the identifier, the language, and the text itself.
The last field, prev, provides versioning of text edits and points to the previous version of the resource.
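
To illustrate how a prev chain can be read, here is a minimal sketch in Python (used only for illustration; the site itself is written in PHP). The rows are a hypothetical in-memory stand-in for langres, and the convention that prev = 0 means "no earlier version" is an assumption, not something the article states.

```python
# Hypothetical in-memory stand-in for a few langres rows (id -> row).
# Assumption: prev = 0 means "no earlier version".
langres = {
    10: {"langid": 7, "lang": 1, "text": "Cancle", "prev": 0},
    14: {"langid": 7, "lang": 1, "text": "Cancel", "prev": 10},
    21: {"langid": 7, "lang": 1, "text": "Cancel button", "prev": 14},
}


def history(rows, res_id):
    """Return the texts of a resource from the given version back to the first."""
    texts = []
    seen = set()  # guards against a corrupted chain that loops
    while res_id and res_id in rows and res_id not in seen:
        seen.add(res_id)
        texts.append(rows[res_id]["text"])
        res_id = rows[res_id]["prev"]
    return texts


print(history(langres, 21))
```

Walking the chain from the newest row yields the edit history newest-first, which is enough to show a diff or roll back an edit.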

All changes are recorded in the log table langlog (iduser, idlangres, action). The action field indicates the completed action: create, edit, or verify.
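
As a rough idea of how per-user statistics could be accumulated from such a log, here is a small Python sketch (illustration only, not the site's PHP code). The numeric action codes and the sample rows are hypothetical; the article names the actions but not their values.

```python
from collections import Counter

# Hypothetical action codes; the article only names the actions.
ACTIONS = {1: "create", 2: "edit", 3: "verify"}

# Hypothetical langlog rows: (iduser, idlangres, action)
langlog = [
    (5, 10, 1),
    (5, 14, 2),
    (8, 14, 3),
    (5, 21, 2),
]


def user_stats(rows):
    """Count how many times each user performed each action."""
    stats = {}
    for iduser, _idlangres, action in rows:
        stats.setdefault(iduser, Counter())[ACTIONS[action]] += 1
    return stats


stats = user_stats(langlog)
print(dict(stats[5]))
```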

I will not dwell on working with users; let me just say that a user is registered automatically on the first translation or correction. Since an email address is not required, the user is immediately shown a login and password. All changes made are tied to that account. Later the user can add an email address and other data, or simply forget about the account.

I drew a diagram to help you picture the relationships between the tables.
image

Since I need the ability to insert resources into other resources, I added macros of the form #ID#. In the simplest case, if we have the resource name = "Name", we can use it in another resource, entername = "Enter your #name#", which on generation becomes "Enter your Name".
Now, to generate the pages it is enough to go through all languages and all resources of the appropriate type, process every text with a special replacement function, and write the result to a separate table of finished pages. Moreover, the processing works in such a way that if #ID# is not found in the current language, it is looked up in the other languages. Here is a sketch of the recursive function (with protection against infinite loops) that does this processing.
Example of a PHP substitution function
public function proceed( $input, $recurse = false )
{
    global $db, $syslang;

    // A top-level call starts with an empty chain of resources being expanded.
    if ( !$recurse )
        $this->chain = array();
    $result = '';
    $off = 0;   // current search position
    $start = 0; // beginning of the text not yet copied into $result
    $len = strlen( $input );
    while ( ($off = strpos( $input, '#', $off )) !== false && $off < $len - 2 )
    {
        $end = strpos( $input, '#', $off + 2 );
        if ( $end === false )
            break;
        // Too long for an identifier: resume just before the closing '#',
        // which may itself open the next macro.
        if ( $end - $off > $this->lenlimit )
        {
            $off = $end - 1;
            continue;
        }
        $name = substr( $input, $off + 1, $end - $off - 1 );
        $langid = $db->getone("select id from langid where name=?s", $name );
        // Ignore unknown names and resources already being expanded (loop protection).
        if ( $langid && !in_array( $langid, $this->chain ))
        {
            // Prefer the current language; otherwise take another verified translation.
            $langres = $db->getrow("select _uptime, id, text from langres where langid=?s && verified>0
                order by if( lang=?s, 0, 1 ), lang", $langid, $this->lang );
            if ( $langres )
            {
                // Remember the latest modification time among the resources used.
                if ( $langres['_uptime'] > $this->time )
                    $this->time = $langres['_uptime'];
                $result .= substr( $input, $start, $off - $start );
                $off = $end + 1;
                $start = $off;
                array_push( $this->chain, $langid );
                $result .= $this->proceed( $langres['text'], true );
                array_pop( $this->chain );
                if ( $off >= $len - 2 )
                    break;
                continue;
            }
        }
        $off = $end - 1;
    }
    if ( $start < $len )
        $result .= substr( $input, $start );

    return $result;
}
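
The same idea (recursive expansion, fallback to another language, and a chain that prevents infinite loops) can be tried without a database. The following is a simplified Python sketch for illustration only, not a port of the PHP above: it uses a regular expression instead of manual scanning, so edge cases such as adjacent '#' characters behave differently, and the RESOURCES dictionary, lookup and expand are hypothetical names.

```python
import re

# Hypothetical in-memory resources: name -> {language code -> text}.
RESOURCES = {
    "name": {"en": "Name", "ru": "Имя"},
    "entername": {"en": "Enter your #name#"},
    "loop_a": {"en": "A then #loop_b#"},
    "loop_b": {"en": "B then #loop_a#"},
}

MACRO = re.compile(r"#(\w+)#")


def lookup(name, lang):
    """Return the text for name, preferring lang and falling back to another language."""
    texts = RESOURCES.get(name)
    if not texts:
        return None
    if lang in texts:
        return texts[lang]
    return texts[sorted(texts)[0]]


def expand(text, lang, chain=()):
    """Replace #name# macros recursively; names already in chain are left as they are."""
    def replace(match):
        name = match.group(1)
        if name in chain:
            return match.group(0)  # loop protection: do not expand again
        found = lookup(name, lang)
        if found is None:
            return match.group(0)  # unknown name: keep the text as written
        return expand(found, lang, chain + (name,))
    return MACRO.sub(replace, text)


print(expand("#entername#", "ru"))
print(expand("#loop_a#", "en"))
```

In the first call entername exists only in English, so the English template is used, while the nested #name# is still resolved in Russian. In the second call the mutual reference stops at the first repeated name instead of recursing forever.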


In addition to replacing macros like #name#, I also convert Markdown markup to HTML right away and process my own directives. For example, I have a table of images in which a single record can hold screenshots for different languages, and if I put the tag [img "/file/#*indexes#"] in the text, the image named indexes for the required language is inserted. But most importantly, I can generate exports for different purposes in any format. As an example, here is the code that generates the JSON files; it is simpler, since macro identifiers are not used there.
Generating JSON files for EN and RU
function jsonerror( $message )
{
    print $message;
    exit();
}

function save_json( $filename )
{
    global $db, $original;

    // The locale code is the part of the file name after the underscore, e.g. locale_en.js.
    preg_match("/^\w*_(?<lang>\w*)\.js$/", $filename, $matches );
    if ( empty( $matches['lang'] ))
        jsonerror( 'No locale' );
    $lang = $db->getrow("select * from languages where iso639=?s", $matches['lang'] );
    if ( !$lang )
        jsonerror( 'Unknown locale '.$matches['lang'] );

    $list = $db->getall("select lng.name, r.text from langid as lng
        left join langres as r on r.langid = lng.id
        where lng.restype=5 && verified>0 && r.lang=?s
        order by lng.name", $lang['id'] );
    $out = array();
    foreach ( $list as $il )
        $out[ $il['name']] = $il['text'];
    // Language 1 is the original; other languages borrow its strings
    // where a translation is missing.
    if ( $lang['id'] == 1 )
        $original = $out;
    else
        foreach ( $original as $ik => $io )
            if ( !isset( $out[ $ik ] ))
                $out[ $ik ] = $io;
    $output = "/* This file is automatically generated on eonza.org.
Use http://www.eonza.org/translate.html to edit or translate these text resources.
*/

var lng = {
\tcode: '$lang[iso639]',
\tnative: '$lang[native]',
";
    foreach ( $out as $ok => $ov )
    {
        // Pick a quote style that does not clash with the text; give up if it contains both.
        if ( strpos( $ov, "'" ) === false )
            $text = "'$ov'";
        elseif ( strpos( $ov, '"' ) === false )
            $text = "\"$ov\"";
        else
            jsonerror( 'Wrong text:'.$ov );
        $output .= "\t$ok: $text,\r\n";
    }
    $output .= "\r\n};\r\n";
    // Append the contents of i18n/<code>.js if such a file exists.
    $jsfile = dirname(__FILE__)."/i18n/$lang[iso639].js";
    if ( file_exists( $jsfile ))
        $output .= file_get_contents( $jsfile );
    if ( file_put_contents( HOME."tmp/$filename", $output ))
        print "Save: ".HOME."tmp/$filename<br>";
    else
        jsonerror( 'Save error:'.HOME."tmp/$filename" );
}

$original = array();
// The original language (id 1 in languages) has to be processed first,
// so that $original is filled before the other languages need it.
$files = array( 'en', 'ru' );

foreach ( $files as $if )
    save_json( "locale_$if.js" );

// Pack the generated files into a single archive.
$zip = new ZipArchive();
print $zip->open( HOME."tmp/locale.zip", ZipArchive::CREATE );
foreach ( $files as $f )
    print $zip->addFile( HOME."tmp/locale_$f.js", "locale_$f.js" );
print $zip->close();
print "Finish<br><a href='/tmp/locale.zip'>ZIP file</a>";
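
The export code above refuses strings that contain both kinds of quotes. One possible alternative, sketched here in Python purely for illustration (it is not what the site does), is to let a JSON encoder handle the escaping while keeping the same fallback to the original language. The function name build_locale and the sample strings are hypothetical.

```python
import json


def build_locale(code, native, strings, original):
    """Render a locale file, filling missing translations from the original language."""
    lng = {"code": code, "native": native}
    # Note: a resource named "code" or "native" would overwrite the header fields.
    for key in sorted(original.keys() | strings.keys()):
        lng[key] = strings.get(key, original.get(key))
    # A JSON object is also a valid JavaScript object literal, so quoting is handled for us.
    return "var lng = " + json.dumps(lng, ensure_ascii=False, indent="\t") + ";\n"


original = {"cancelbtn": "Cancel", "okbtn": "OK", "hint": "Don't press \"Cancel\""}
russian = {"cancelbtn": "Отмена"}
js = build_locale("ru", "Русский", russian, original)
print(js)
```

Here okbtn and hint have no Russian text, so they are taken from the original language, and the string with both quote characters is escaped instead of aborting the export.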


Thus, without spending much effort, I implemented almost everything I wanted. The only things left unimplemented are those that are not relevant at the moment because of low activity on the site. On the other hand, additional features were added as the need for them arose, for example exporting a text file with the resources that need translation and importing the translated text back.
You can take a look at the working page, where users can translate, edit and create new resources for my project.

image
Article based on information from habrahabr.ru
