I’ve made this to try to extract text. <script type = text/javascript> function extractText(node){

Question


Asked: June 16, 2026 (2026-06-16T15:40:01+00:00)


I’ve made this to try to extract text.

<script type = "text/javascript">
function extractText(node){
    var all = "";
    for (node=node.firstChild;node;node=node.nextSibling){
        alert(node.nodeValue + " = " + node.nodeType);
        if (node.nodeType == 3){
            all += node.nodeValue;
        }
    }
    alert(all);
}
</script>

That script is located in the head of an HTML document.
The body looks like this:

<body onload = "extractText(document.body)">
Stuff
<b>text</b>
<script>
var x = 1;
</script>
</body>

The problem is that alert(all); only shows “Stuff”, and the alert(node.nodeValue + " = " + node.nodeType); call reports a number of null values that I don’t really understand. It says null = 3 a few times. Could anyone tell me why this isn’t working properly? Thanks in advance.


1 Answer

Editorial Team · Answer 1 · 2026-06-16T15:40:02+00:00

If you want all the text from the document, you may want to look into a recursive call. However, if you don’t care about children, remove the first if (node.hasChildNodes()){} condition in the following:

function extractText(node){
    var txt = '';
    // Recursive exploration, with an optional (commented-out) check for <script>.
    // A <script> has children because the code it contains
    // is treated as a text node (nodeType === 3).
    if (node.hasChildNodes()/* && node.nodeName !== 'SCRIPT'*/){
        for (var c = 0; c < node.childNodes.length; c++){
            txt += extractText(node.childNodes[c]);
        }
    }else if(node.nodeType===3){
        txt += node.textContent;
    }
    return txt;
}
alert(extractText(document.body));

Also, you probably want to grab textContent over nodeValue, but that’s your call. You can also get more granular and test whether the nodeName is SCRIPT and ignore it (if you so choose), but I’ll let you make that determination.
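To make the SCRIPT check concrete, here is a minimal sketch of the same recursive idea with the check turned into a switch. It is an illustration rather than the definitive version: the skipScripts parameter and the textNode / elementNode helpers are assumptions added here. The helpers only build plain objects shaped like DOM nodes, so the example runs without a browser, and the mock tree is an approximation of the body from the question.

```javascript
// Hypothetical helpers (not part of the DOM API): plain objects that
// carry the same properties the function reads from real DOM nodes.
function textNode(value) {
    return { nodeType: 3, nodeName: '#text', nodeValue: value, childNodes: [] };
}
function elementNode(name, children) {
    return { nodeType: 1, nodeName: name, nodeValue: null, childNodes: children };
}

// Recursive extraction; skipScripts is an assumed extra parameter that
// turns the commented-out SCRIPT test into an option.
function extractText(node, skipScripts) {
    if (node.nodeType === 3) {
        return node.nodeValue; // text node: return its text
    }
    if (skipScripts && node.nodeName === 'SCRIPT') {
        return ''; // ignore the code inside <script>
    }
    var txt = '';
    for (var c = 0; c < node.childNodes.length; c++) {
        txt += extractText(node.childNodes[c], skipScripts);
    }
    return txt;
}

// Mock tree approximating the <body> from the question.
var body = elementNode('BODY', [
    textNode('\nStuff\n'),
    elementNode('B', [textNode('text')]),
    textNode('\n'),
    elementNode('SCRIPT', [textNode('\nvar x = 1;\n')]),
    textNode('\n')
]);

var withScripts = extractText(body, false);
var withoutScripts = extractText(body, true);
console.log(JSON.stringify(withScripts));
console.log(JSON.stringify(withoutScripts));
```

In this mock tree, “text” sits one level down, inside the B element, so a loop over only the direct children of body never reaches it. Element nodes also have a nodeValue of null (with nodeType 1), which is most likely where the null alerts in the question come from.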

Follow-Up: here’s a fiddle you can play with, with the <script> test commented and optional whitespace removal: http://jsfiddle.net/KZuk5/2/
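For the optional whitespace removal, one possible approach is to collapse runs of whitespace into a single space and trim the ends. This is a sketch under that assumption, not necessarily what the fiddle does. The collapseWhitespace name is hypothetical, and the sample string is an approximation of the text that would be extracted from the question's body.

```javascript
// Hypothetical helper: collapse every run of whitespace (spaces, tabs,
// newlines) into one space, then trim leading and trailing whitespace.
function collapseWhitespace(text) {
    return text.replace(/\s+/g, ' ').trim();
}

var raw = '\nStuff\ntext\n\n';
var cleaned = collapseWhitespace(raw);
console.log(JSON.stringify(cleaned)); // "Stuff text"
```

Applying this once to the final string, rather than to each text node, preserves a separator between words that come from adjacent nodes.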
