I’m working on a browser automation framework that, among other browsers, drives IE. As such, I need to get the screen coordinates of elements the user wants to interact with. Our primary method for getting these coordinates is getBoundingClientRect(). The problem is that in IE, these coordinates are always given relative to the top-left corner of the enclosing document. If the element is in a frame/iframe, they may not be the coordinates of the element relative to the top-left corner of the browser window.
I use COM via C++ to drive the browser, though I am able to execute JavaScript via that C++ driver layer. I am also driving it out-of-process, which means I am not executing within a Browser Helper Object (BHO, or IE add-in). Moving to a BHO is not an option for this project. Outside of working with IE 6, 7, 8, and 9, I don’t have to worry about cross-browser compatibility for whatever method I can find that works. Using jQuery or similar really isn’t an option for me, but if there is a solution that uses a JavaScript library, I may be able to reverse-engineer a solution that is acceptable for use within the project.
Here are some of the things I’ve tried so far:
- Use element.ownerDocument.parentWindow.frameElement within JavaScript. This will fail if the containing document and the document within the frame are in different domains.
- Use IHTMLWindow4::get_frameElement() on the window containing the element within a frame. This should be the C++ equivalent of the JavaScript method above; however, it returns E_NOINTERFACE when I attempt to use it.
- Use IDisplayServices::TransformPoint() to convert the coordinates of the element’s top-left corner to a different frame of reference. This does not appear to work cross-process.
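For reference, the first (JavaScript) approach above can be sketched roughly as follows. This is a minimal illustration, not the framework's actual code: the function name `getOffsetViaFrameElement` and the `complete` flag are made up for this example, and it assumes a cross-domain failure surfaces either as an exception or as a null `frameElement`. Frame borders and scrolling are ignored.

```javascript
// Sketch: walk from an element up through its enclosing frames, summing the
// offsets reported by getBoundingClientRect() at each level.
// Returns { x, y, complete }; complete is false when a frame boundary
// could not be crossed (for example, a cross-domain frame).
function getOffsetViaFrameElement(element) {
  var rect = element.getBoundingClientRect();
  var x = rect.left;
  var y = rect.top;
  var complete = true;
  var win = element.ownerDocument.parentWindow; // IE-specific property

  // The top-level window is its own parent, so stop there.
  while (win && win != win.parent) {
    var frame = null;
    try {
      frame = win.frameElement; // access may be denied across domains
    } catch (e) {
      frame = null;
    }
    if (!frame) {
      complete = false;
      break;
    }
    var frameRect = frame.getBoundingClientRect();
    x += frameRect.left; // frame borders are not accounted for here
    y += frameRect.top;
    win = win.parent;
  }
  return { x: x, y: y, complete: complete };
}
```

When the frame is in a different domain, the result comes back with `complete` set to false and only the frame-local coordinates, which is exactly the case I need to solve.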
How can I find or calculate the screen coordinates of an HTML element relative to the upper-left corner of the browser window, when the target element may exist within a frame or iframe?
The correct approach for this project is to get the document containing the <frame> or <iframe> element and loop through the set of frames in that parent document until you find the proper frame. Once there, you can get the <frame> or <iframe> element using document.parentWindow.frameElement, and can calculate the location of the containing element from there. The actual code solving the problem can be found in Element::AppendFrameDetails() at this link.
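The project code itself is C++ over COM, but the idea of searching from the parent side can be illustrated with a small JavaScript sketch. This is not the code from Element::AppendFrameDetails(); the helper names `findFrameElement` and `toParentCoordinates` are hypothetical, and the sketch assumes the caller already holds a reference to both the parent document and the child window. It also ignores scrolling and handles only one level of nesting per call.

```javascript
// Sketch: locate the <frame>/<iframe> element in the parent document whose
// content window is the given child window.
function findFrameElement(parentDocument, childWindow) {
  var tags = ["iframe", "frame"];
  for (var t = 0; t < tags.length; t++) {
    var candidates = parentDocument.getElementsByTagName(tags[t]);
    for (var i = 0; i < candidates.length; i++) {
      // Loose equality is a deliberate assumption: strict comparison of
      // window objects has been reported as unreliable in older IE.
      if (candidates[i].contentWindow == childWindow) {
        return candidates[i];
      }
    }
  }
  return null; // no matching frame in this document
}

// Sketch: translate a rectangle measured inside the frame into the
// coordinate space of the parent document.
function toParentCoordinates(elementRect, frameElement) {
  var frameRect = frameElement.getBoundingClientRect();
  return {
    left: elementRect.left + frameRect.left + (frameElement.clientLeft || 0),
    top: elementRect.top + frameRect.top + (frameElement.clientTop || 0)
  };
}
```

For nested frames, the same two steps would be repeated for each enclosing document until the top-level window is reached.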