I’m looking for an algorithm for detecting lines (e.g. from tables) and word bounding boxes in document images.
Currently I am segmenting the image by alternating horizontal and vertical projections and checking each resulting histogram for gaps. This works for some documents, but it fails for those containing tables with an outer border: the border lines put ink into every row and column, so the histogram has no gaps left to split on. Therefore I am looking for a more sophisticated algorithm.
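For reference, the projection approach and its failure on a framed table can be reproduced with a small sketch. It assumes the page is already binarized into a list of rows with 1 for ink and 0 for background; the name `projection_gaps` and the toy page are made up for illustration and are not the asker's actual code.

```python
def projection_gaps(image, axis):
    """Return (start, end) ranges (end exclusive) of empty rows or columns.

    axis=0 projects onto the vertical axis (one count per row),
    axis=1 projects onto the horizontal axis (one count per column).
    """
    if axis == 0:
        profile = [sum(row) for row in image]
    else:
        profile = [sum(col) for col in zip(*image)]

    gaps, start = [], None
    for i, count in enumerate(profile):
        if count == 0 and start is None:
            start = i                      # a gap begins
        elif count != 0 and start is not None:
            gaps.append((start, i))        # the gap ended before index i
            start = None
    if start is not None:
        gaps.append((start, len(profile)))
    return gaps


# Toy 9x12 page: two text blocks separated by three blank rows.
page = [[0] * 12 for _ in range(9)]
for y in (1, 2, 6, 7):
    for x in range(2, 10):
        page[y][x] = 1

# Same content with a one-pixel frame drawn around it, like a table border.
framed = [row[:] for row in page]
for x in range(12):
    framed[0][x] = framed[8][x] = 1
for y in range(9):
    framed[y][0] = framed[y][11] = 1

row_gaps_plain = projection_gaps(page, axis=0)
row_gaps_framed = projection_gaps(framed, axis=0)
```

On the plain page the row profile has gaps that separate the two blocks; on the framed page both profiles are non-zero everywhere, so no gap is reported in either direction.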
I'm not sure I understood your question completely. It would help if you added an example of the images you are talking about.
Anyway, use the cvHoughLines2 function (cv::HoughLines or cv::HoughLinesP in the C++ API) to detect lines in the image.
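To show what the Hough transform does with table rulings, here is a minimal pure-Python sketch of the voting step. It is not OpenCV's implementation: the function name `hough_lines`, the list-of-rows image format (1 = ink) and the one-degree angle step are assumptions chosen for illustration, and in practice you would call the library function instead.

```python
import math


def hough_lines(image, threshold, angle_step_deg=1):
    """Return (rho, theta_deg, votes) for accumulator cells with >= threshold votes.

    Each ink pixel (x, y) votes for every line x*cos(theta) + y*sin(theta) = rho
    that passes through it; long straight lines collect many votes in one cell.
    """
    angles = list(range(0, 180, angle_step_deg))
    trig = [(math.cos(math.radians(a)), math.sin(math.radians(a))) for a in angles]

    votes = {}
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            if not pixel:
                continue
            for a, (c, s) in zip(angles, trig):
                rho = round(x * c + y * s)   # quantize rho to whole pixels
                votes[(rho, a)] = votes.get((rho, a), 0) + 1

    peaks = [(rho, a, n) for (rho, a), n in votes.items() if n >= threshold]
    return sorted(peaks, key=lambda item: -item[2])


# Synthetic 30x30 page with one horizontal and one vertical ruling line.
size = 30
page = [[0] * size for _ in range(size)]
for i in range(size):
    page[10][i] = 1   # horizontal line at y = 10
    page[i][5] = 1    # vertical line at x = 5

strongest = hough_lines(page, threshold=size)
```

With the threshold set to the full line length, the surviving cells are theta = 90 with rho = 10 (the horizontal line) and theta = 0 with rho = 5 (the vertical line). A lower threshold also returns near-duplicate cells at neighbouring angles, so some peak suppression is needed on real scans. Once the rulings are located they can be erased from the image, after which projection-based segmentation has gaps to work with again.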
Also, OpenCV ships with a sample that detects squares. With small modifications it can be used to detect word bounding boxes.
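The squares sample itself is not reproduced here. As a rough stand-in for the same idea (find connected shapes, then take their bounding rectangles), the sketch below groups ink pixels that lie close together and returns one box per group. The name `word_boxes`, the `max_gap` parameter and the list-of-rows image format are assumptions for illustration; `max_gap` would have to be tuned to the font size so that letters merge into words but words stay apart.

```python
from collections import deque


def word_boxes(image, max_gap=2):
    """Return inclusive (left, top, right, bottom) boxes, one per pixel group.

    Two ink pixels join the same group when their columns differ by at most
    `max_gap` and their rows differ by at most 1.
    """
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y0 in range(height):
        for x0 in range(width):
            if not image[y0][x0] or seen[y0][x0]:
                continue
            # Breadth-first flood fill from an unvisited ink pixel.
            seen[y0][x0] = True
            queue = deque([(x0, y0)])
            left, top, right, bottom = x0, y0, x0, y0
            while queue:
                x, y = queue.popleft()
                left, right = min(left, x), max(right, x)
                top, bottom = min(top, y), max(bottom, y)
                for ny in range(max(0, y - 1), min(height, y + 2)):
                    for nx in range(max(0, x - max_gap), min(width, x + max_gap + 1)):
                        if image[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((left, top, right, bottom))
    return boxes


# Toy text line: two "words", the first made of two letters one column apart.
rows = [
    "##.##....#.#",
    "#..#.....###",
    "##.##....#.#",
]
line = [[1 if ch == "#" else 0 for ch in row] for row in rows]

words = word_boxes(line, max_gap=2)     # letters merged into words
letters = word_boxes(line, max_gap=1)   # gap too small, letters stay separate
```

With `max_gap=2` the one-column space between letters is bridged while the four-column space between words is not, so two boxes come back; with `max_gap=1` the first word splits into its two letters.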