I have to translate some details using a paid Google API. The details contain HTML, and Google charges per character, so I don’t want to send the complete content, only the English text with the HTML removed. I can strip HTML tags and entities using PHP functions, but after translation I have to put the translated text back inside the original HTML tags so it displays properly. The content will also include CSS.
Example:
<strong>This is a test</strong><br /> <custom tag>This is a test</custom tag><br />
After translation to Spanish I need:
<strong>Translated content </strong><br /> <p>Translated content </p><br />
How can I preserve the HTML formatting without sending the HTML to the API?
Haha, I also had that problem. But it was a while ago…
I think the problem was that, due to the nature of translation, some parts of a sentence were swapped, so at first I wasn’t able to fit the tags back in at the same positions. But I think there was a way to get some metadata from the translation process that shows which parts of the sentence moved to a new position and what their content was… I know I solved it in the end, but I can’t recall how 🙁
If every word ends up in the same place after translation, you could first split the content on whitespace or HTML tags into an array, remember where each HTML tag was, and reapply the tags after translation…
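A rough sketch of that split-and-reapply idea, written in Python only to keep it short (the same steps should carry over to PHP with preg_split). Note that it swaps in a slightly different variant: instead of tracking individual words, it treats each run of text between two tags as one segment, so word order inside a segment may change freely. The `translate` argument is a hypothetical stand-in for whatever wraps the real API call, and the regex tokenizer is an assumption that only holds for reasonably simple markup.

```python
import re

# Rough tokenizer: whole <style>/<script> blocks or single tags such as
# <strong>, </p>, <br />. Real-world HTML is safer with a proper parser.
TAG_RE = re.compile(
    r"(<style\b.*?</style>|<script\b.*?</script>|<[^>]+>)",
    re.DOTALL | re.IGNORECASE,
)


def split_html(html):
    """Split html into a flat list of parts: tags and text, in order."""
    return [part for part in TAG_RE.split(html) if part != ""]


def extract_text(parts):
    """Return (index, text) pairs for the parts worth translating."""
    return [
        (i, part)
        for i, part in enumerate(parts)
        if not TAG_RE.fullmatch(part) and part.strip()
    ]


def translate_html(html, translate):
    """Translate only the text segments and put them back between the tags.

    `translate` is a hypothetical callable: it takes a list of strings and
    returns the translated strings in the same order.
    """
    parts = split_html(html)
    segments = extract_text(parts)
    translated = translate([text for _, text in segments])
    for (i, _), new_text in zip(segments, translated):
        parts[i] = new_text  # tags and whitespace-only parts stay untouched
    return "".join(parts)
```

Only the text segments would be sent (and billed), while tags, inline style attributes and `<style>` blocks never leave the server. The catch is the one described above: an inline tag in the middle of a sentence cuts it into fragments that get translated separately, so quality can suffer there. Entities are also left as they are in this sketch and would need decoding before the call and re-encoding afterwards.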