If I have a string of HTML, maybe like this… <h2>Header</h2><p>all the <span class=bright>content</span>

Asked: June 9, 2026 (2026-06-09T10:04:25+00:00)

If I have a string of HTML, maybe like this…

<h2>Header</h2><p>all the <span class="bright">content</span> here</p>

And I want to manipulate the string so that, for example, all words are reversed…

<h2>redaeH</h2><p>lla eht <span class="bright">tnetnoc</span> ereh</p>

I know how to extract the string from the HTML and manipulate it by passing to a function and getting a modified result, but how would I do so whilst retaining the HTML?

I would prefer a language-agnostic solution, but a PHP or JavaScript answer would be useful if it must be language specific.

Edit

I also want to be able to manipulate text that spans several DOM elements…

Quick<em>Draw</em>McGraw

warGcM<em>warD</em>kciuQ

Another Edit

Currently, I am thinking to somehow replace all HTML nodes with a unique token, whilst storing the originals in an array, then doing a manipulation which ignores the token, and then replacing the tokens with the values from the array.

This approach seems overly complicated, and I am not sure how to replace all the HTML without using regex, which I have learned you can be sent to the Stack Overflow prison island for.

Yet Another Edit

I want to clarify an issue here. I want the text manipulation to happen over x number of DOM elements – so for example, if my formula randomly moves letters in the middle of a word, leaving the start and end the same, I want to be able to do this…

<em>going</em><i>home</i>

Converts to

<em>goonh</em><i>gmie</i>

So the HTML elements remain untouched, but the string content inside is manipulated (as a whole – so goinghome is passed to the manipulation formula in this example) in any way chosen by the manipulation formula.

1 Answer

Editorial Team · Answer 1 · 2026-06-09T10:04:27+00:00

I implemented a version that seems to work quite well, although I still use a (rather general and shoddy) regex to extract the HTML tags from the text. Here it is in commented JavaScript:

Method

/**
* Manipulate text inside HTML according to passed function
* @param html the html string to manipulate
* @param manipulator the function to manipulate with (will be passed a single word)
* @returns manipulated string including unmodified HTML
*
* Currently limited in that manipulator operates on words determined by regex
* word boundaries, and must return same length manipulated word
*
*/

var manipulate = function(html, manipulator) {

  var block, tag, words, i,
    final = '', // used to prepare return value
    tags = [], // used to store tags as they are stripped from the html string
    x = 0; // used to track the number of characters the html string is reduced by during stripping

  // remove tags from html string, and use callback to store them with their index
  // then split by word boundaries to get plain words from original html
  words = html.replace(/<.+?>/g, function(match, index) {
    tags.unshift({
      match: match,
      index: index - x
    });
    x += match.length;
    return '';
  }).split(/\b/);

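  // loop through each token and build the final string,
  // manipulating word tokens and appending boundary tokens unchanged
  // (test the token itself: index parity breaks if the text starts with a non-word character)
  for (i = 0; i < words.length; i++) {
    final += /^\w/.test(words[i]) ? manipulator(words[i]) : words[i];
  }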

  // loop through each stored tag, and insert into final string
  for (i = 0; i < tags.length; i++) {
    final = final.slice(0, tags[i].index) + tags[i].match + final.slice(tags[i].index);
  }

  // ready to go!
  return final;

};

The function defined above accepts a string of HTML and a manipulation function that acts on words within the string, regardless of whether they are split by HTML elements.

It works by first removing all HTML tags, and storing the tag along with the index it was taken from, then manipulating the text, then adding the tags into their original position in reverse order.
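
To make the first step concrete, here is a minimal sketch of the stripping stage on its own. `stripTags` is a hypothetical helper name, not part of the gist. It uses the same regex as the function above and records each tag with the index it occupies in the tag-free text (in document order here, whereas `manipulate` stores them in reverse).

```javascript
// Hypothetical helper (not in the gist): isolates the stripping step of manipulate()
function stripTags(html) {
  var tags = [], removed = 0;
  var text = html.replace(/<.+?>/g, function (match, index) {
    // index is the position in the original string; subtracting the characters
    // already removed gives the position in the tag-free text
    tags.push({ match: match, index: index - removed });
    removed += match.length;
    return '';
  });
  return { text: text, tags: tags };
}

var result = stripTags('Quick<em>Draw</em>McGraw');
console.log(result.text); // QuickDrawMcGraw
console.log(result.tags); // <em> recorded at index 5, </em> at index 9
```

Because every index refers to the tag-free text, the tags can be spliced back in from last to first without any index arithmetic, as long as the manipulated text keeps its length.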

Test

/**
 * Test our function with various input
 */

var reverse, rutherford, shuffle, text, titleCase;

// set our test html string
text = "<h2>Header</h2><p>all the <span class=\"bright\">content</span> here</p>\nQuick<em>Draw</em>McGraw\n<em>going</em><i>home</i>";

// function used to reverse words
reverse = function(s) {
  return s.split('').reverse().join('');
};

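// function used by rutherford to return a shuffled array
// (Fisher-Yates shuffle; sorting with a random comparator gives biased results)
shuffle = function(a) {
  var i, j, tmp;
  for (i = a.length - 1; i > 0; i--) {
    j = Math.floor(Math.random() * (i + 1));
    tmp = a[i];
    a[i] = a[j];
    a[j] = tmp;
  }
  return a;
};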

// function used to shuffle the middle of words, leaving each end undisturbed
rutherford = function(inc) {
  var m = inc.match(/^(.?)(.*?)(.)$/);
  if (!m) return inc; // nothing to shuffle (e.g. an empty string): return unchanged
  return m[1] + shuffle(m[2].split('')).join('') + m[3];
};

// function to make word Title Cased
titleCase = function(s) {
  return s.replace(/./, function(w) {
    return w.toUpperCase();
  });
};

console.log(manipulate(text, reverse));
console.log(manipulate(text, rutherford));
console.log(manipulate(text, titleCase));

There are still a few quirks. For example, the heading and paragraph text are not recognized as separate words, because the function cannot tell block-level tags (which separate words) from inline tags (which do not). Still, this is basically a proof of concept for what I was trying to do.
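
One possible way around the block-level quirk, sketched under assumptions: cut the tag-free text wherever a block-level tag was removed, then manipulate each piece separately. `BLOCK_TAGS`, `tagName` and `segmentsAtBlocks` are hypothetical names, the tag list is deliberately incomplete, and the tags are assumed to be stored as `{ match, index }` objects like those in the function above.

```javascript
// Sketch only: these names are illustrative and not part of the gist
var BLOCK_TAGS = ['h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'p', 'div', 'li', 'blockquote'];

// Extract the lowercased element name from a tag string such as "</H2>"
function tagName(tag) {
  var m = /^<\/?([a-zA-Z][a-zA-Z0-9]*)/.exec(tag);
  return m ? m[1].toLowerCase() : '';
}

// Cut the tag-free text at every index where a block-level tag was removed,
// so that words on either side of such a tag are never joined
function segmentsAtBlocks(text, tags) {
  var cuts = [], segments = [], last = 0;
  tags.forEach(function (t) {
    if (BLOCK_TAGS.indexOf(tagName(t.match)) !== -1) cuts.push(t.index);
  });
  cuts.sort(function (a, b) { return a - b; });
  cuts.forEach(function (cut) {
    if (cut > last) {
      segments.push(text.slice(last, cut));
      last = cut;
    }
  });
  if (last < text.length) segments.push(text.slice(last));
  return segments;
}

// Tags with the indices they have in the tag-free text of the first test string
var segments = segmentsAtBlocks('Headerall the content here', [
  { match: '<h2>', index: 0 },
  { match: '</h2>', index: 6 },
  { match: '<p>', index: 6 },
  { match: '<span class="bright">', index: 14 },
  { match: '</span>', index: 21 },
  { match: '</p>', index: 26 }
]);
console.log(segments); // [ 'Header', 'all the content here' ]
```

Each segment can then be split on word boundaries and manipulated on its own. As long as the manipulator preserves word length, joining the segments back together keeps the stored tag indices valid.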

I would also like it to handle a manipulation formula that actually adds or removes text, rather than just replacing or moving it (so the string length varies after manipulation), but that opens up a whole new can of worms I am not yet ready for.
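
For what it is worth, variable-length output could be handled by remembering where each word sat before and after manipulation, then translating every tag index through that mapping. The following is only a sketch: `manipulateResizable` is a hypothetical name, and clamping a tag to the end of a word that became shorter is an assumption about the desired behaviour, not something the gist does.

```javascript
// Sketch only: the function name and the clamping rule are assumptions
function manipulateResizable(html, manipulator) {
  var tags = [], removed = 0, spans = [], out = '', pos = 0, i, at;

  // Same stripping step as manipulate(): remember each tag and its index in the plain text
  var text = html.replace(/<.+?>/g, function (match, index) {
    tags.push({ match: match, index: index - removed });
    removed += match.length;
    return '';
  });

  // Manipulate word tokens, recording where each token sat before and after
  text.split(/\b/).forEach(function (token) {
    var replaced = /^\w/.test(token) ? String(manipulator(token)) : token;
    spans.push({
      oldStart: pos,
      oldEnd: pos + token.length,
      newStart: out.length,
      newLength: replaced.length
    });
    pos += token.length;
    out += replaced;
  });

  // Translate an index in the original plain text to an index in the new text
  function mapIndex(index) {
    for (var k = 0; k < spans.length; k++) {
      if (index < spans[k].oldEnd) {
        // keep the offset inside the word, clamped to the word's new length
        return spans[k].newStart + Math.min(index - spans[k].oldStart, spans[k].newLength);
      }
    }
    return out.length; // the tag sat at the very end of the text
  }

  // Reinsert tags from last to first so that insertions never shift pending positions
  for (i = tags.length - 1; i >= 0; i--) {
    at = mapIndex(tags[i].index);
    out = out.slice(0, at) + tags[i].match + out.slice(at);
  }
  return out;
}

var doubled = manipulateResizable('<b>go</b> home', function (w) { return w + w; });
console.log(doubled); // <b>gogo</b> homehome
```

Tags that fall between words follow the boundary exactly. Tags inside a word keep their character offset where possible, which is a crude choice when the word changes a lot; scaling the offset proportionally would be another option.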

Now that I have added some comments to the code and put it up as a gist in JavaScript, I hope that someone will improve it, especially if someone could replace the regex part with something better!
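
As one possible direction for replacing the regex, here is a sketch of a small character scanner. `scanTags` and `isTagStart` are hypothetical names. Unlike `/<.+?>/`, it ignores a `>` inside a quoted attribute value and leaves a bare `<` in the text alone, but it is still not a real HTML parser: comments, scripts and CDATA sections are not handled.

```javascript
// Sketch only: a hand-written scanner standing in for the regex step

// A "<" opens a tag only when followed by a letter, "/" or "!"
function isTagStart(c) {
  return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c === '/' || c === '!';
}

function scanTags(html) {
  var text = '', tags = [], i = 0, start, quote, c;
  while (i < html.length) {
    if (html.charAt(i) === '<' && isTagStart(html.charAt(i + 1))) {
      start = i;
      quote = '';
      i++;
      // Walk to the closing ">", skipping any ">" inside quoted attribute values
      while (i < html.length) {
        c = html.charAt(i);
        if (quote) {
          if (c === quote) quote = '';
        } else if (c === '"' || c === "'") {
          quote = c;
        } else if (c === '>') {
          break;
        }
        i++;
      }
      i++; // step past ">" (or past the end if the tag was never closed)
      // text.length is the tag's index in the tag-free text collected so far
      tags.push({ match: html.slice(start, i), index: text.length });
    } else {
      text += html.charAt(i);
      i++;
    }
  }
  return { text: text, tags: tags };
}

var scanned = scanTags('<a title="x > y">link</a> 1 < 2');
console.log(scanned.text); // link 1 < 2
console.log(scanned.tags); // the full <a ...> tag at index 0, </a> at index 4
```

The result has the same `{ match, index }` shape used for the stored tags above, so it could stand in for the `replace` call.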

Gist: https://gist.github.com/3309906

Demo: http://jsfiddle.net/gh/gist/underscore/1/3309906/

(outputs to console)

And now finally using an HTML parser

(http://ejohn.org/files/htmlparser.js)

Demo: http://jsfiddle.net/EDJyU/
