17 June 2011

Game localization tips: checking inline tags with Word and Excel

Working in games translation, sooner or later you are bound to find a file packed with inline tags. You know, those annoying non translatable things between <>, [], {} and similar. You can come up with many ways to work around them during the translation, but at the end you will probably want 1) a version of the text without the damn things for proofreading and spell checking 2) a quick way to check that they all are in the right place.

I'm pretty sure the net has a specific tool somewhere, but let's say that you are on a client's PC with no internet connection. You want to do a quick check without too much fuss. Office can help.

Finding tags in Word

I'm assuming that your text is in an Excel file, with the source text on one column and the target text in the next one.

While Excel has some basic search and replace functionality, we really need Word for this, so copy and paste into Word and let's start.

You want to find all the tags, so you open find and replace, and type <*> hoping that it will work... and it doesn't

It doesn't because < and >, like many other similar characters, are one of the wildcards used by Word itself. The workaround is pretty easy: you just  put"\" a before it and Word will know that your <* *really means <

So, Edit Find, search for *\<\>** , highlight all items found in Current Selection, Use wildcards and here we go.

All the tags selected! You want to proofread without having to twist your eyes around tags? You want to use the spellchecker or get a clean word count? Press Delete and you are done!

Checking tags back into Excel

What if you want to check for tags deleted or broken during the translation? Easy. Do the procedure above and stop just before deleting all the tags. Just keep them all selected, go to the format bar and highlight them in yellow.

Good. Now we want to delete everything else. Edit / Find / Leave both the search and replace field empy and deselect "Use wildcards" if it's still there. Now, go into Format and choose Format/Highlight twice.

Yep, twice, because like this we will be searching for Format: Not highlight.

Yes, all the text is gone, leaving out only the tags. Now, you could check each line yourself, if you are into this sort of things, but I would recommend a quick jump back into Excel.

Copy everything, paste it back in a new Excel file so that you have source tags in column A e and target tags in column B. Now double-click on cell C1 and type =A1=B1 . Very clear formula isn't it? Now click once on cell C1 select the little square in its bottom right corner and drag it down until you reach the last line on you file.

Done! Every time you see Excel stating "Wrong" it means that source and target in that line are not the same. As we only have tags (and hopefully the source is right) it means that the tag is broken.

For simplicity, you can even use a filter to pull out just the wrong instances. And with this we're done, happy fixing and good work!

PS If you are using Apsic Xbench you can use this technique in order to create a "tag glossary" for your automated checks. You just need to extract the tags as shown above, put them side by side in Excel, fire up TMBuilder and you're done.

(cover credits)