The Archive Base

Editorial Team
Asked: June 2, 2026 (2026-06-02T19:33:15+00:00)

I have an extremely large XML file derived from the field of geoinformatics

I have an extremely large XML file from the field of geoinformatics. I got it from a German subsite of the OpenStreetMap project: a geographical engineering site that delivers a weekly snapshot of OpenStreetMap for a given area. I took germany.osm.bz2 from here: http://ftp5.gwdg.de/pub/misc/openstreetmap/download.geofabrik.de/

To run some tests with XSLT, I want to query for a certain kind of entity. Take restaurants as an example: we want to find all the restaurants in the area.

We can run that directly on the bz2-compressed file we downloaded, for example with the following command:

bzcat germany.osm.bz2 | xsltproc restaurants.xslt - > restaurants.csv

I split the file with xml_split, which is a great Perl module from CPAN.

The problem: with the following XSLT stylesheet I get only poor results. The output is incomplete; I get only a small subset of the information when I run the code on an XML file. See the stylesheet and, below it, a small data chunk from the file I parse. If you want to check it, just get the small dataset; note that it is a split file.

Here you can get it: https://rapidshare.com/#!download|643p12|2523227518|germany-001.xml|100000

Note the important lines: xmlns:xml_split="http://xmltwig.com/xml_split"
and this one:

 <xsl:for-each select="xml_split:root/node/tag[@k='amenity' and @v='restaurant']">

Note: you can run a little test and see how long it takes to parse:
time xsltproc restaurants.xslt germany-001.xml > restaurants-001.csv

real    0m0.308s
user    0m0.283s
sys     0m0.022s

Here is the XSLT stylesheet that contains the code for parsing (called atest3.xslt):

<xsl:stylesheet version = '1.0'
        xmlns="http://www.w3.org/1999/xhtml"
        xmlns:xml_split="http://xmltwig.com/xml_split"
        xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>

    <xsl:output method="text" encoding="UTF-8"/>
    <xsl:template match="/">

            <xsl:for-each select="xml_split:root/node/tag[@k='amenity' and @v='restaurant']">
            <xsl:value-of select="../@id"/>
            <xsl:text>&#x09;</xsl:text>
            <xsl:value-of select="../@lat"/>
            <xsl:text>&#x09;</xsl:text>
            <xsl:value-of select="../@lon"/>
            <xsl:text>&#x09;</xsl:text>
            <xsl:for-each select="../tag[@k='name']">
                <xsl:value-of select="@v"/>
            </xsl:for-each>
            <xsl:text>&#x0A;</xsl:text>
        <xsl:value-of select="./tag[@k = 'cuisine']/@v"/>
        <xsl:text>&#x09;</xsl:text>
        <xsl:value-of select="./tag[@k = 'wheelchair']/@v"/>
        <xsl:text>&#x09;</xsl:text>
        <xsl:value-of select="./tag[@k = 'website']/@v"/>
        <xsl:text>&#x09;</xsl:text>
        <xsl:value-of select="./tag[@k = 'addr:country']/@v"/>
        <xsl:text>&#x09;</xsl:text>
        <xsl:value-of select="./tag[@k = 'addr:city']/@v"/>
        <xsl:text>&#x09;</xsl:text>        
        <xsl:value-of select="./tag[@k = 'addr:street']/@v"/>
        <xsl:text>&#x09;</xsl:text>
        <xsl:value-of select="./tag[@k = 'addr:housenumber']/@v"/>
        <xsl:text>&#x0A;</xsl:text>
    </xsl:for-each>
    </xsl:template>

</xsl:stylesheet>

And here is a data chunk from the XML file that we parsed:

<node id="52768810" lat="48.2044749" lon="11.3249434" version="7" changeset="9490517" user="wheelmap_visitor" uid="290680" timestamp="2011-10-07T20:24:46Z">
    <tag k="addr:city" v="Olching" />
    <tag k="addr:country" v="DE" />
    <tag k="addr:housenumber" v="72" />
    <tag k="addr:postcode" v="82140" />
    <tag k="addr:street" v="Hauptstraße" />
    <tag k="amenity" v="restaurant" />
    <tag k="cuisine" v="mexican" />
    <tag k="email" v="info@cantina-olching.de" />
    <tag k="name" v="La Cantina" />
    <tag k="opening_hours" v="Mo-Su 17:00-01:00" />
    <tag k="phone" v="+49 (8142) 444393" />
    <tag k="website" v="http://www.cantina-olching.com/" />
    <tag k="wheelchair" v="no" />
</node>

Here are the results. Unfortunately, some parts are missing:

51923772    49.0812534  8.5637183   Zur Talschänke

52040576    49.4635433  12.4287292  Emil-Kemmer-Haus

52141326    49.4144243  12.4143153  Gasthaus Plecher

52623232    48.9293634  8.2722549   Korfu

52664989    49.0435133  8.3919370   Restaurant Zentrum

52754898    49.3243828  12.3618662  Gasthaus Irlbacher

52762875    49.0099641  8.2528132   Langasthof Stober

52765672    50.0082768  9.2139632   Wirtshaus im Frohnrad

52768810    48.2044749  11.3249434  La Cantina

52768816    48.2051698  11.3257964  Indian Palace

52768826    48.2073264  11.3276147  Dorfstub'n

52768830    48.2075968  11.3281055  Le Candele

52774284    49.0319471  8.2888353   Zum Anker

The problem is that this is all I get. I have tried a lot, but at the moment I am clueless about why I get so little output; it is totally contrary to the tags I have in the XSLT stylesheet. Any idea or hint will be greatly appreciated.

By the way: in the end I want to process approx. 5000 files that are the result of the split, and subsequently I want to collect all the results in a MySQL database.

Here you can get the original file:
http://ftp5.gwdg.de/pub/misc/openstreetmap/download.geofabrik.de ( germany.osm.bz2 01-Apr-2012 14:51 1.7G )

And here is a split one:
https://rapidshare.com/#!download|643p12|2523227518|germany-001.xml|100000

I have to refactor the code, so the question is: how can I get the results into MySQL in an efficient way?

*Update:* Thanks to the first answer in this thread I started to refactor the code, but I still lack better results and have to try again. Lots of changes were suggested. I did a quick walkthrough of the XSLT stylesheet, and with the first attempt at refactoring I got some funny results. I will try again: I will go through all the stylesheet code, look more closely for the errors, and finally try to refactor the whole XSLT file. Any pointers, suggestions, or code snippets are very welcome. Greetings, your zero

1 Answer

  1. Editorial Team
    Added an answer on June 2, 2026 at 7:33 pm (2026-06-02T19:33:16+00:00)

    It looks like your ./tag[@k = '???']/@v XPath should be ../tag[@k='???']/@v, because your context node is the originally matched tag element, not the node element.

    You should consider changing your context node to make this code clearer and avoid errors like this:

    <xsl:for-each select="xml_split:root/node[tag[@k='amenity' and @v='restaurant']]">
    

    Then you can use XPaths like select="@id" and tag[@k='addr:country']/@v, because every path is now relative to the node element.
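
    As a minimal sketch of that changed context node, assuming the input has the shape of the data chunk in the question (node elements with tag children directly under xml_split:root), the loop could look roughly like this. The choice of fields and the one-row-per-restaurant layout are assumptions for illustration, not the asker's final format.

    ```xml
    <xsl:stylesheet version="1.0"
            xmlns:xml_split="http://xmltwig.com/xml_split"
            xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

        <xsl:output method="text" encoding="UTF-8"/>

        <xsl:template match="/">
            <!-- The predicate filters on the tag child, but the context
                 node inside the loop is the node element itself. -->
            <xsl:for-each select="xml_split:root/node[tag[@k='amenity' and @v='restaurant']]">
                <!-- Attributes of the node element -->
                <xsl:value-of select="@id"/>
                <xsl:text>&#x09;</xsl:text>
                <xsl:value-of select="@lat"/>
                <xsl:text>&#x09;</xsl:text>
                <xsl:value-of select="@lon"/>
                <xsl:text>&#x09;</xsl:text>
                <!-- Tag values, addressed as children of the node element -->
                <xsl:value-of select="tag[@k='name']/@v"/>
                <xsl:text>&#x09;</xsl:text>
                <xsl:value-of select="tag[@k='cuisine']/@v"/>
                <xsl:text>&#x0A;</xsl:text>
            </xsl:for-each>
        </xsl:template>

    </xsl:stylesheet>
    ```

    A tag that is missing on a given node simply yields an empty field, so the column count stays the same for every row.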

    You should also consider refactoring this code to make better use of templates instead of for-each.
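
    One possible shape for such a template-based version is sketched below. It assumes the same input structure as the data chunk in the question and reuses the tag keys from the question's stylesheet. The named template "field" and its "key" parameter are hypothetical helpers introduced here for illustration, and the single tab-separated row per restaurant is an assumed output layout.

    ```xml
    <xsl:stylesheet version="1.0"
            xmlns:xml_split="http://xmltwig.com/xml_split"
            xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

        <xsl:output method="text" encoding="UTF-8"/>

        <!-- Entry point: hand only restaurant nodes to the template below -->
        <xsl:template match="/">
            <xsl:apply-templates
                select="xml_split:root/node[tag[@k='amenity' and @v='restaurant']]"/>
        </xsl:template>

        <!-- One output row per node; all paths are relative to the node element -->
        <xsl:template match="node">
            <xsl:value-of select="@id"/>
            <xsl:text>&#x09;</xsl:text>
            <xsl:value-of select="@lat"/>
            <xsl:text>&#x09;</xsl:text>
            <xsl:value-of select="@lon"/>
            <xsl:call-template name="field"><xsl:with-param name="key" select="'name'"/></xsl:call-template>
            <xsl:call-template name="field"><xsl:with-param name="key" select="'cuisine'"/></xsl:call-template>
            <xsl:call-template name="field"><xsl:with-param name="key" select="'wheelchair'"/></xsl:call-template>
            <xsl:call-template name="field"><xsl:with-param name="key" select="'website'"/></xsl:call-template>
            <xsl:call-template name="field"><xsl:with-param name="key" select="'addr:country'"/></xsl:call-template>
            <xsl:call-template name="field"><xsl:with-param name="key" select="'addr:city'"/></xsl:call-template>
            <xsl:call-template name="field"><xsl:with-param name="key" select="'addr:street'"/></xsl:call-template>
            <xsl:call-template name="field"><xsl:with-param name="key" select="'addr:housenumber'"/></xsl:call-template>
            <xsl:text>&#x0A;</xsl:text>
        </xsl:template>

        <!-- Hypothetical helper: writes a tab, then the value of the tag
             with the given key (an empty string if the tag is absent).
             call-template keeps the current node as the context node. -->
        <xsl:template name="field">
            <xsl:param name="key"/>
            <xsl:text>&#x09;</xsl:text>
            <xsl:value-of select="tag[@k=$key]/@v"/>
        </xsl:template>

    </xsl:stylesheet>
    ```

    It can be run the same way as in the question, for example: xsltproc restaurants.xslt germany-001.xml > restaurants-001.csv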
