The Penn Treebank format does not annotate the internal structure of a noun phrase, e.g.
(NP (JJ crude) (NN oil) (NNS prices))
or
(NP
  (NP (DT the) (JJ big) (JJ blue) (NN house))
  (SBAR
    (WHNP (WDT that))
    (S
      (VP (VBD was)
        (VP (VBN built)
          (PP (IN near)
            (NP (DT the) (NN river))))))))
I would like to extract the head nouns ("prices" and "house" in the examples above). Do you know of any tool that can do this?
Michael Collins's dissertation (Appendix A) includes head-finding rules for the Penn Treebank that work reasonably well and are not difficult to implement. They’re far from perfect, though, since head finding is not an easy task.
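To give an idea of how little code this takes, here is a minimal Python sketch of the NP rule only. The priority list is my reading of the NP rule in that appendix, so check it against the original before relying on it. The names `parse`, `np_head_child` and `head_word` are made up for this example, and the fallback for non-NP phrases is a crude placeholder, not one of Collins's rules.

```python
import re

def parse(s):
    """Parse a bracketed tree into (label, children); a leaf is (tag, [word])."""
    tokens = re.findall(r"\(|\)|[^\s()]+", s)
    pos = 0

    def node():
        nonlocal pos
        pos += 1                      # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1                      # skip ")"
        return (label, children)

    return node()

def base(label):
    """Strip functional tags such as -SBJ (NP-SBJ -> NP)."""
    return label.split("-")[0] or label

# (search direction, candidate labels), tried in order. Assumed from Collins's
# Appendix A; "right" means scanning the children from right to left.
NP_RULES = [
    ("right", {"NN", "NNP", "NNPS", "NNS", "NX", "POS", "JJR"}),
    ("left", {"NP"}),
    ("right", {"$", "ADJP", "PRN"}),
    ("right", {"CD"}),
    ("right", {"JJ", "JJS", "RB", "QP"}),
]

def np_head_child(children):
    """Pick the head child of an NP from its list of child nodes."""
    if base(children[-1][0]) == "POS":
        return children[-1]
    for direction, labels in NP_RULES:
        seq = reversed(children) if direction == "right" else children
        for child in seq:
            if base(child[0]) in labels:
                return child
    return children[-1]               # default: last child

def is_preterminal(node):
    return len(node[1]) == 1 and isinstance(node[1][0], str)

def head_word(node):
    """Follow head children down to a word."""
    while not is_preterminal(node):
        label, children = node
        if base(label) in ("NP", "NX"):
            node = np_head_child(children)
        else:
            node = children[-1]       # placeholder for non-NP categories
    return node[1][0]

print(head_word(parse("(NP (JJ crude) (NN oil) (NNS prices))")))  # prints "prices"
```

On the second tree in the question, the outer NP has no noun child, so the rule falls through to "first NP from the left", and the inner NP then yields "house". A full implementation would add a rule table for every other category (VP, PP, ADJP, ...) in the same style.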
The work by David Vadas and James Curran on NP structure in the Penn Treebank could also be relevant: