Monday, August 18, 2008

Working With Sources (4) Microsoft Office - PowerPoint

This is the fourth in a series of posts on how to access various source formats. This time we'll look at the Microsoft PowerPoint .ppt and .pptx formats.

Unfortunately, PowerPoint has no older XML format and does not lend itself well to copy and paste. Probably the most effective way to get at the translatable text is to save .ppt files as .pptx (if necessary) and then follow a method similar to the one used for .docx and .xlsx: rename the file with a .zip extension and decompress it. After decompression, the translatable text is spread across a number of smaller files inside the ppt folder: slides/slide*.xml and notesSlides/notesSlide*.xml will probably contain most of it. Note that textboxes are in drawings/drawing1.xml and similar files. To segment properly, see my post on segmenting Office XML.
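As a rough illustration of the unzip-and-read approach, here is a minimal Python sketch. It is not the method used elsewhere in this series; the function name extract_pptx_text and the file name deck.pptx are made up for the example, and it assumes that the visible text sits in DrawingML a:t elements inside the slide and notes parts. Python's zipfile module reads a .pptx directly, so the rename to .zip is not needed here.

```python
import re
import sys
import zipfile
import xml.etree.ElementTree as ET

# DrawingML namespace; text runs are stored in <a:t> elements.
A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"

# Only slide and notes parts are scanned in this sketch.
PART_PATTERN = re.compile(r"^ppt/(slides/slide|notesSlides/notesSlide)\d+\.xml$")


def extract_pptx_text(pptx_file):
    """Return {part name: [text runs]} for the slide and notes parts.

    pptx_file may be a path or a binary file-like object.
    """
    texts = {}
    with zipfile.ZipFile(pptx_file) as archive:
        # Plain string sort: slide10.xml comes before slide2.xml.
        for name in sorted(archive.namelist()):
            if not PART_PATTERN.match(name):
                continue
            root = ET.fromstring(archive.read(name))
            # Collect every non-empty <a:t> run in document order.
            runs = [t.text for t in root.iter("{%s}t" % A_NS) if t.text]
            texts[name] = runs
    return texts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python extract_text.py deck.pptx
    for part, runs in extract_pptx_text(sys.argv[1]).items():
        print(part)
        for run in runs:
            print("   ", run)
```

This only lists the text for inspection. Writing translations back would mean editing the same a:t elements and re-zipping the package, which the sketch does not attempt.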

For PowerPoint, WildCAT might be the best option. Look for an upcoming post in the sidebar where I'll look at WildCAT and write a script for connecting AppleTrans to PowerPoint.
