I have a text document from which I want to extract URLs and write them to a new text file. How can I do that in Perl? A sample of the text file is here:
{“type”:”TabGroupsManager:GroupData”,”id”:65,”name”:”XML
Schema
Editor”,”image”:”http://www.altova.com/favicon.ico”,”disableAutoRename”:false,”titleList”:”XML
Schema Editor\u000aAltova XMLSpy Code
Generation\u000aOnline Video
Demos\u000aScheduled Data Exchange
Case Study\u000aXML Editor\u000aAltova
XMLSpy 2011\u000aXML Schema Management
Tool\u000a”,”tabs”:[“{\”entries\”:[{\”url\”:\”http://www.altova.com/xmlspy/xml-schema-editor.html\”,\”title\”:\”XML
Schema
Editor\”,\”ID\”:1442422751,\”referrer\”:\”http://www.altova.com/xmlspy/xml-editing.html\”,\”scroll\”:\”0,0\”,\”formdata\”:{\”#q\”:\”\”}}],\”index\”:1,\”attributes\”:{\”image\”:\”http://www.altova.com/favicon.ico\”},\”extData\”:{\”TabGroupsManagerGroupId\”:\”65\”,\”TabGroupsManagerGroupName\”:\”XML
Schema
Editor\”},\”_formDataSaved\”:true}”,”{\”entries\”:[{\”url\”:\”http://www.altova.com/xmlspy/xml-code-generation.html\”,\”title\”:\”Altova
XMLSpy Code
Generation\”,\”ID\”:1442423118,\”referrer\”:\”http://www.google.com/search?hl=en&client=firefox-a&hs=GR1&rls=org.mozilla%3Aen-GB%3Aofficial&q=altova+derive+schema+from+xml&aq=f&aqi=m1&aql=&oq=&gs_rfai=\”,\”scroll\”:\”0,0\”,\”formdata\”:{\”#q\”:\”\”}}],\”index\”:1,\”attributes\”:{\”image\”:\”http://www.altova.com/favicon.ico\”},\”extData\”:{\”TabGroupsManagerGroupId\”:\”65\”,\”TabGroupsManagerGroupName\”:\”XML
Schema
Editor\”},\”_formDataSaved\”:true}”,”{\”entries\”:[{\”url\”:\”http://www.altova.com/videos.asp?type=0&video=xmlspy\”,\”title\”:\”Online
Video
Demos\”,\”ID\”:1442423184,\”referrer\”:\”http://www.altova.com/xmlspy/xml-code-generation.html\”,\”scroll\”:\”0,0\”,\”formdata\”:{\”#q\”:\”\”}}],\”index\”:1,\”attributes\”:{\”image\”:\”http://www.altova.com/favicon.ico\”},\”extData\”:{\”TabGroupsManagerGroupId\”:\”65\”,\”TabGroupsManagerGroupName\”:\”XML
Schema
Editor\”},\”_formDataSaved\”:true}”,”{\”entries\”:[{\”url\”:\”http://www.altova.com/solutions/exchange_ratecasestudy.html\”,\”title\”:\”Scheduled
Data Exchange Case
Study\”,\”ID\”:2618,\”formdata\”:{\”#q\”:\”\”},\”scroll\”:\”0,1369\”}],\”index\”:1,\”attributes\”:{\”image\”:\”http://www.altova.com/favicon.ico\”},\”extData\”:{\”TabGroupsManagerGroupId\”:\”65\”,\”TabGroupsManagerGroupName\”:\”XML
Schema
Editor\”},\”_formDataSaved\”:true}”,”{\”entries\”:[{\”url\”:\”http://www.altova.com/xml-editor/\”,\”title\”:\”XML
Editor\”,\”ID\”:2620,\”formdata\”:{\”#q\”:\”\”},\”scroll\”:\”0,0\”}],\”index\”:1,\”attributes\”:{\”image\”:\”http://www.altova.com/favicon.ico\”},\”extData\”:{\”TabGroupsManagerGroupId\”:\”65\”,\”TabGroupsManagerGroupName\”:\”XML
Schema
Editor\”},\”_formDataSaved\”:true}”,”{\”entries\”:[{\”url\”:\”http://manual.altova.com/XMLSpy/spystandard/index.html?xmlschemasstd.htm\”,\”title\”:\”Altova
XMLSpy
2011\”,\”ID\”:2622,\”children\”:[{\”url\”:\”http://manual.altova.com/XMLSpy/spystandard/xmlspy_content_dyn.html\”,\”title\”:\”Altova XMLSpy
2011\”,\”ID\”:2623,\”referrer\”:\”http://manual.altova.com/XMLSpy/spystandard/index.html?xmlschemasstd.htm\”,\”scroll\”:\”0,0\”},{\”url\”:\”http://manual.altova.com/XMLSpy/spystandard/xmlschemasstd.htm\”,\”title\”:\”XML
Schemas\”,\”ID\”:2624,\”referrer\”:\”http://manual.altova.com/XMLSpy/spystandard/index.html?xmlschemasstd.htm\”,\”scroll\”:\”0,260\”}],\”scroll\”:\”0,0\”}],\”index\”:1,\”attributes\”:{},\”extData\”:{\”TabGroupsManagerGroupId\”:\”65\”,\”TabGroupsManagerGroupName\”:\”XML
Schema
Editor\”},\”_formDataSaved\”:true}”,”{\”entries\”:[{\”url\”:\”http://www.altova.com/schemaagent.html\”,\”title\”:\”XML
Schema Management
Tool\”,\”ID\”:2626,\”formdata\”:{\”#q\”:\”\”},\”scroll\”:\”0,171\”}],\”index\”:1,\”attributes\”:{\”image\”:\”http://www.altova.com/favicon.ico\”},\”extData\”:{\”TabGroupsManagerGroupId\”:\”65\”,\”TabGroupsManagerGroupName\”:\”XML
Schema
Editor\”},\”_formDataSaved\”:true}”]}
From that I want to create a text file like:
http://www.altova.com/xmlspy/xml-schema-editor.html
http://www.altova.com/xmlspy/xml-code-generation.html
Since that appears to be a JSON file rather than a plain text file, use one of the JSON modules on CPAN. This is slightly complicated by the fact that the data is nested: each element of the "tabs" array is itself JSON that has been stored as a string inside a larger object, which was then converted to JSON again. So you will have to parse the file, extract those strings, parse each of them as JSON in turn, and then extract the URIs from the result.
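The two-stage parse described above could be sketched roughly as follows. This is only a sketch under a few assumptions: it uses JSON::PP (any of the CPAN JSON modules should work along the same lines), it takes the wanted URIs to be the "url" field of each entry and of any nested "children" (not "referrer" or "image"), and it assumes the real file uses straight double quotes with no raw line breaks inside strings, unlike the sample as pasted here. The sub names `extract_urls` and `collect_urls` and the inline `$sample` are made up for illustration.

```perl
use strict;
use warnings;
use JSON::PP;

# Collect the "url" field of every entry, descending into nested "children".
sub collect_urls {
    my ($entries) = @_;
    my @urls;
    for my $entry (@{ $entries || [] }) {
        push @urls, $entry->{url} if defined $entry->{url};
        push @urls, collect_urls($entry->{children});
    }
    return @urls;
}

# Stage 1: decode the outer document (UTF-8 bytes).
# Stage 2: decode each string in "tabs" as JSON in its own right.
sub extract_urls {
    my ($json_text) = @_;
    my $outer = JSON::PP->new->utf8;   # input is raw bytes
    my $inner = JSON::PP->new;         # input is an already-decoded Perl string
    my $group = $outer->decode($json_text);
    my @urls;
    for my $tab_json (@{ $group->{tabs} || [] }) {
        my $tab = $inner->decode($tab_json);
        push @urls, collect_urls($tab->{entries});
    }
    return @urls;
}

# Hypothetical inline sample with the same shape as the data in the question.
my $sample = '{"name":"XML Schema Editor","tabs":['
  . '"{\"entries\":[{\"url\":\"http://www.altova.com/xmlspy/xml-schema-editor.html\",\"title\":\"XML Schema Editor\"}]}",'
  . '"{\"entries\":[{\"url\":\"http://www.altova.com/xmlspy/xml-code-generation.html\",\"referrer\":\"http://www.google.com/search\"}]}"'
  . ']}';

if (@ARGV == 2) {
    # Usage (file names are up to you): perl script.pl input.json urls.txt
    my ($in_path, $out_path) = @ARGV;
    open my $in, '<:raw', $in_path or die "Cannot read $in_path: $!";
    my $text = do { local $/; <$in> };   # slurp the whole file
    close $in;
    open my $out, '>', $out_path or die "Cannot write $out_path: $!";
    print {$out} "$_\n" for extract_urls($text);
    close $out or die "Cannot close $out_path: $!";
}
else {
    # No arguments: run against the inline sample and print to STDOUT.
    print "$_\n" for extract_urls($sample);
}
```

Run without arguments, this prints the two URLs shown in the desired output above. The inner strings are decoded without the `utf8` option because, after the first decode, they are Perl character strings rather than UTF-8 bytes.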