I want to implement a program that can do Vision-based Page Segmentation

Question


Asked: May 29, 2026 (2026-05-29T22:33:12+00:00)


I want to implement a program that can do “Vision-based Page Segmentation”. I need some guidance and pointers. (I need practical information, not just academic information.)

My preferred languages are JS (jQuery) and PHP.

I read the following article (VIPS: a Vision-based Page Segmentation Algorithm) and I think it can be a good framework for this purpose:

ftp://ftp.research.microsoft.com/pub/tr/tr-2003-79.pdf

Is there any open source implementation of “Vision-based Page Segmentation”?


1 Answer

Editorial Team · Answer 1 · 2026-05-29T22:33:13+00:00

No. Microsoft was granted a patent on Vision-Based Document Segmentation (VIPS). Try again in 2023. I am truly sorry.

I am not a patent lawyer, but the claims of US patent 7,428,700 are quite straightforward:

A method implemented at least in part by a computing device of identifying one or more portions of a document described by a tree structure having a plurality of nodes, the method comprising: identifying a plurality of visual blocks in the document based on, at least, a document model of the document; detecting, distinct from the plurality of visual blocks, one or more separators of the document based on, at least, one or more characteristics of at least one of the plurality of visual blocks; assigning, to each of the one or more separators, a weight based on characteristics of visual blocks on either side of the separator; and constructing, based at least in part on the plurality of visual blocks and the one or more separators, a content structure for the document, wherein the content structure identifies the different visual blocks as different portions of semantic content of the document.

Now, “a document described by a tree structure having a plurality of nodes” is our old friend, the DOM of a Web page.

Also note that the four inventors are the same four co-authors of the paper cited. I'll be damned if that’s a sheer coincidence.
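To make the four steps in the quoted claim concrete, here is a minimal sketch in Python. It is a toy under loud assumptions, not the VIPS algorithm from the paper and not anything taken from the patent text: the `Node` class, its fields, the weighting formula, and the threshold of 20 are all hypothetical, and it only handles horizontal separators between vertically stacked blocks. In a browser you would read the geometry from the rendered DOM instead of hard-coding it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Toy stand-in for a rendered DOM node (all fields are hypothetical)."""
    name: str
    top: int = 0
    bottom: int = 0
    font_size: int = 12
    background: str = "white"
    children: List["Node"] = field(default_factory=list)


@dataclass
class Separator:
    above: Node
    below: Node
    weight: float


def visual_blocks(node: Node) -> List[Node]:
    """Step 1: treat leaf nodes as visual blocks, ordered top to bottom."""
    if not node.children:
        return [node]
    blocks = [b for child in node.children for b in visual_blocks(child)]
    return sorted(blocks, key=lambda b: b.top)


def separators(blocks: List[Node]) -> List[Separator]:
    """Steps 2 and 3: one separator per gap, weighted by the blocks around it."""
    result = []
    for above, below in zip(blocks, blocks[1:]):
        gap = max(0, below.top - above.bottom)
        weight = float(gap)                                   # wider gap, stronger cut
        weight += 2 * abs(above.font_size - below.font_size)  # font size change
        if above.background != below.background:              # background change
            weight += 10
        result.append(Separator(above, below, weight))
    return result


def segment(root: Node, threshold: float = 20.0) -> List[List[str]]:
    """Step 4: cut the block sequence at every separator at or above the threshold."""
    blocks = visual_blocks(root)
    groups = [[blocks[0].name]]
    for sep in separators(blocks):
        if sep.weight >= threshold:
            groups.append([])
        groups[-1].append(sep.below.name)
    return groups


page = Node("body", children=[
    Node("header", children=[
        Node("logo", top=0, bottom=40, font_size=24, background="navy"),
        Node("nav", top=42, bottom=60, font_size=24, background="navy"),
    ]),
    Node("article", children=[
        Node("title", top=80, bottom=100, font_size=18),
        Node("text", top=104, bottom=400, font_size=12),
    ]),
    Node("footer", top=440, bottom=460, font_size=10, background="gray"),
])

print(segment(page))
# [['logo', 'nav'], ['title', 'text'], ['footer']]
```

The separator weights here come out as 2, 42, 16, and 54, so only the header/article and article/footer boundaries exceed the threshold. The same logic should port to JS fairly directly, with `getBoundingClientRect()` and `getComputedStyle()` supplying the positions and styles.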
