The Archive Base

Editorial Team
Asked: May 29, 2026 (2026-05-29T03:40:15+00:00)

I saw this question PHP – Get number of pages in a Word document

I saw the question "PHP – Get number of pages in a Word document". I also need to determine the page count of a given Word file (doc/docx). I tried to investigate phplivedocx/ZF (@hobodave linked to them in the answers to the original post), but I got completely lost there. I can't use any external web service either (such as DOC2PDF sites, to then count the pages in the PDF version, or similar).

In short: is there any PHP code (using ZF or anything else in PHP, but excluding COM objects or external executables such as AbiWord; I'm on a shared Linux server without exec or similar functions) to find the page count of a Word file?

EDIT: The Word versions that need to be supported are Microsoft Word 2003 and 2007.

1 Answer

  1. Editorial Team
    Added an answer on May 29, 2026 at 3:40 am

    Getting the number of pages for docx files is very easy:

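    function get_num_pages_docx($filename)
    {
        $zip = new ZipArchive();
    
        // A .docx file is a ZIP archive; open() returns true on success.
        if($zip->open($filename) === true)
        {  
            // The extended properties, including the page count, live in docProps/app.xml.
            if(($index = $zip->locateName('docProps/app.xml')) !== false)
            {
                $data = $zip->getFromIndex($index);
                $zip->close();
    
                $xml = new SimpleXMLElement($data);
    
                // Cast to int so callers get a number rather than a SimpleXMLElement.
                // <Pages> is the count stored when the file was last saved and may be missing.
                return isset($xml->Pages) ? (int) $xml->Pages : false;
            }
    
            $zip->close();
        }
    
        return false;
    }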
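
    To make the parsing step concrete, here is a minimal sketch of what the function above reads once docProps/app.xml has been extracted. The XML is a trimmed, hand-written sample (the values 3 and 512 are made up for illustration), not the contents of a real document. It also shows why a cast is useful: `$xml->Pages` is a SimpleXMLElement object, not a number.

```php
<?php
// Hand-written sample of docProps/app.xml (illustrative values only).
$appXml = <<<'XML'
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Properties xmlns="http://schemas.openxmlformats.org/officeDocument/2006/extended-properties">
  <Template>Normal.dotm</Template>
  <Pages>3</Pages>
  <Words>512</Words>
</Properties>
XML;

$xml = new SimpleXMLElement($appXml);

// The property access yields an object; cast it to get a plain integer.
$isObject = $xml->Pages instanceof SimpleXMLElement;
$pages    = isset($xml->Pages) ? (int) $xml->Pages : null;

echo 'Pages is an object: ', ($isObject ? 'yes' : 'no'), "\n";
echo 'Page count: ', $pages, "\n";
```

    Keep in mind that this number is metadata written by whichever application last saved the file, so it is only as accurate as that application made it.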
    

    For the 97-2003 format it's certainly more challenging, but by no means impossible. The number of pages is stored in the SummaryInformation stream of the document, but the OLE compound file format makes it a pain to find. The structure is defined extremely thoroughly (though badly, imo) here, and more simply here. I looked at this for an hour today but didn't get very far (it's not a level of abstraction I'm used to), so I output the hex to better understand the structure:

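    // Despite its name, this only dumps the file as hex so the OLE structure can be inspected.
    function get_num_pages_doc($filename) 
    {
        // Read in binary mode so the bytes are not altered on any platform.
        $handle = fopen($filename, 'rb');
        if($handle === false)
        {
            return false;
        }
        $data = @fread($handle, filesize($filename));
        fclose($handle);
    
        echo '<div style="font-family: courier new;">';
    
            $hex = bin2hex($data);
            $hex_array = str_split($hex, 4); // groups of two bytes
            $i = 0;
            $line = 0;
            $collection = '';
            foreach($hex_array as $key => $string)
            {
                $collection .= hex_ascii($string);
                $i++;
    
                if($i == 1)
                {
                    // Each row holds 16 bytes, so the row number followed by 0 is the hex offset.
                    echo '<b>'.sprintf('%05X', $line).'0:</b> ';
                }
    
                echo strtoupper($string).' ';
    
                if($i == 8)
                {
                    echo ' '.$collection.' <br />'."\n";
                    $collection = '';
                    $i = 0;
    
                    $line += 1;
                }
            }
    
            // Print the last row if the file length is not a multiple of 16 bytes.
            if($i > 0)
            {
                echo ' '.$collection.' <br />'."\n";
            }
    
        echo '</div>';
    
        exit();
    }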
    
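    // Convert a run of hex digits into the characters shown in the right-hand column of the dump.
    function hex_ascii($string, $html_safe = true)
    {
        $return = '';
    
        $conv = array($string);
        if(strlen($string) > 2)
        {
            $conv = str_split($string, 2);
        }
    
        foreach($conv as $string)
        {
            $num = hexdec($string);
    
            // Control characters and spaces are shown as a dot.
            $ascii = '.';
            if($num > 32)
            {   
                $ascii = unichr($num);
            }
    
            // Escape <, > and & so the dump is not interpreted as HTML.
            if($html_safe AND ($num == 62 OR $num == 60 OR $num == 38))
            {
                $return .= htmlentities($ascii);
            }
            else
            {
                $return .= $ascii;
            }
        }
    
        return $return;
    }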
    
    function unichr($intval)
    {
        return mb_convert_encoding(pack('n', $intval), 'UTF-8', 'UTF-16BE');
    }
    

    This outputs a hex dump in which you can find sections such as:

    007000: 0500 5300 7500 6D00 6D00 6100 7200 7900 ..S.u.m.m.a.r.y.
    007010: 4900 6E00 6600 6F00 7200 6D00 6100 7400 I.n.f.o.r.m.a.t.
    007020: 6900 6F00 6E00 0000 0000 0000 0000 0000 i.o.n...........
    007030: 0000 0000 0000 0000 0000 0000 0000 0000 ................ 
    

    Right after that name you can see the rest of the directory entry, such as:

    007040: 2800 0201 FFFF FFFF FFFF FFFF FFFF FFFF (...ÿÿÿÿÿÿÿÿÿÿÿÿ
    007050: 0000 0000 0000 0000 0000 0000 0000 0000 ................
    007060: 0000 0000 0000 0000 0000 0000 0000 0000 ................
    007070: 0000 0000 2500 0000 0010 0000 0000 0000 ....%...........
    

    From that you can determine the properties described in the spec:

    _ab = ("SummaryInformation") 
    _cb = 0028
    _mse = 02 (STGTY_STREAM) 
    _bflags = 01 (DE_BLACK) 
    _sidLeftSib = FFFF FFFF 
    _sidRightSib = FFFF FFFF (none) 
    _sidChild = FFFF FFFF (n/a for STGTY_STREAM) 
    _clsid = 0000 0000 0000 0000 0000 0000 0000 0000 (n/a) 
    _dwUserFlags = 0000 0000 (n/a) 
    _time[0] = CreateTime = 0000 0000 0000 0000 (n/a) 
    _time[1] = ModifyTime = 0000 0000 0000 0000 (n/a)
    _startSect = 0000 0025 (bytes 2500 0000 in the dump)
    _ulSize = 0000 1000 
    _dptPropType = 0000 (n/a)
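
    The same fields can be pulled out with unpack() instead of being read off by eye. This is only a sketch: `parse_dir_entry` is a hypothetical helper name, the byte offsets are my reading of the 128-byte directory entry layout (64 bytes of name, then _cb, _mse, _bflags, the three sibling/child IDs, and _startSect and _ulSize near the end), and the input is simply the eight dump rows shown above joined into one hex string.

```php
<?php
// Sketch: decode one 128-byte OLE directory entry (hypothetical helper, offsets assumed).
function parse_dir_entry(string $entry): array
{
    if (strlen($entry) < 128) {
        throw new InvalidArgumentException('A directory entry is 128 bytes long');
    }

    // All integers are little-endian: v = uint16, V = uint32, C = uint8.
    $f = unpack('vcb/Cmse/Cbflags/VsidLeftSib/VsidRightSib/VsidChild', substr($entry, 64, 16));
    $t = unpack('VstartSect/VulSize', substr($entry, 116, 8));

    // _cb is the byte length of the UTF-16LE name including its 2-byte terminator.
    $name = iconv('UTF-16LE', 'UTF-8', substr($entry, 0, max(0, $f['cb'] - 2)));

    return array(
        'name'      => $name,
        'mse'       => $f['mse'],
        'bflags'    => $f['bflags'],
        'startSect' => $t['startSect'],
        'ulSize'    => $t['ulSize'],
    );
}

// The eight rows of the dump above (offsets 007000 to 007070).
$hex = '0500530075006D006D00610072007900'
     . '49006E0066006F0072006D0061007400'
     . '69006F006E0000000000000000000000'
     . '00000000000000000000000000000000'
     . '28000201FFFFFFFFFFFFFFFFFFFFFFFF'
     . '00000000000000000000000000000000'
     . '00000000000000000000000000000000'
     . '00000000250000000010000000000000';

$entry = parse_dir_entry(hex2bin($hex));

// The name starts with the control character 0x05, so skip it when printing.
printf("%s: type %d, start sector %d, size %d bytes\n",
    substr($entry['name'], 1), $entry['mse'], $entry['startSect'], $entry['ulSize']);
```

    For the rows above this reports the SummaryInformation stream as type 2 (a stream) of 4096 bytes whose data begins in sector 37.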
    

    That lets you locate the stream's data, unpack it and read the page count. Of course this is the hard bit that I just don't have time for, but it should set you in the right direction.
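
    As a rough sketch of that last step, assuming you already have the raw bytes of the SummaryInformation stream in a string (following the sector chain from _startSect to collect them is not shown here): the stream is a property set, and as far as I understand the format the page count is property ID 14 stored as a 32-bit integer (type 3). `summary_info_page_count` is a hypothetical name, and the stream built at the bottom is synthetic test data, not taken from a real document.

```php
<?php
// Sketch: read the page count from the raw bytes of a SummaryInformation stream.
// Assumed layout: 28-byte header, 16-byte FMTID, 4-byte offset of the property set.
function summary_info_page_count(string $stream)
{
    $len = strlen($stream);

    // The stream starts with the byte-order mark 0xFFFE (little-endian).
    if ($len < 48 || unpack('v', $stream)[1] !== 0xFFFE) {
        return false;
    }

    $setOffset = unpack('V', $stream, 44)[1];
    if ($setOffset + 8 > $len) {
        return false;
    }

    // Property set: total size, property count, then (id, offset) pairs.
    $set = unpack('Vsize/Vcount', $stream, $setOffset);

    for ($i = 0; $i < $set['count']; $i++) {
        $pairPos = $setOffset + 8 + $i * 8;
        if ($pairPos + 8 > $len) {
            return false;
        }
        $pair = unpack('Vid/Voffset', $stream, $pairPos);

        if ($pair['id'] === 14) { // assumed: 14 is the page count property
            // Offsets are relative to the start of the property set.
            $valuePos = $setOffset + $pair['offset'];
            if ($valuePos + 8 > $len) {
                return false;
            }
            $value = unpack('Vtype/Vvalue', $stream, $valuePos);

            // The type is a 16-bit code followed by padding; 3 means 32-bit integer.
            return ($value['type'] & 0xFFFF) === 3 ? $value['value'] : false;
        }
    }

    return false;
}

// Synthetic stream with a single property: page count = 7.
$header = pack('vvV', 0xFFFE, 0, 0)   // byte order, version, system identifier
        . str_repeat("\0", 16)        // CLSID
        . pack('V', 1)                // number of property sets
        . str_repeat("\0", 16)        // FMTID (zeroed in this test data)
        . pack('V', 48);              // offset of the first property set
$set    = pack('VV', 24, 1)           // set size in bytes, one property
        . pack('VV', 14, 16)          // property ID 14 at offset 16 within the set
        . pack('VV', 3, 7);           // type 3 (32-bit integer), value 7
$stream = $header . $set;

$pages = summary_info_page_count($stream);
echo 'Page count: ', $pages, "\n";
```

    The offset argument to unpack() needs PHP 7.1 or later; on older versions substr() the string first.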

    M$ don’t make it easy!

© 2021 The Archive Base. All Rights Reserved