I have files like this to parse (from web scraping) with Python:
some HTML and JS here...
SomeValue =
{
'calendar': [
{ 's0Date': new Date(2010, 9, 12),
'values': [
{ 's1Date': new Date(2010, 9, 17), 'price': 9900 },
{ 's1Date': new Date(2010, 9, 18), 'price': 9900 },
{ 's1Date': new Date(2010, 9, 19), 'price': 9900 },
{ 's1Date': new Date(2010, 9, 20), 'price': 9900 },
{ 's1Date': new Date(2010, 9, 21), 'price': 9900 },
{ 's1Date': new Date(2010, 9, 22), 'price': 9900 },
{ 's1Date': new Date(2010, 9, 23), 'price': 9900 }]
},
'data': [{
index: 0,
serviceClass: 'Economy',
prices: [9900, 320.43, 253.27],
eTicketing: true,
segments: [{
indexSegment: 0,
stopsCount: 1,
flights: [{
index: 0,
... and a lot of nested data and again HTML and JS...
I need to parse it and extract all the JSON-like data. Currently I strip out all the '\n' and '\t' characters with a regex and convert the result to a Python dictionary with eval(). I really don't like this solution, eval() especially. I have looked at BeautifulSoup and lxml, but didn't find anything there that would help parse this.
Can you suggest something better than regex and eval() for this task?
Page example: http://codepaste.ru/3830/
Use the json module to handle the JSON data, and use BeautifulSoup or lxml to handle parsing the HTML page.

If you want specific help, you'll need to provide specific data, e.g. the class of the tags in which this data is enclosed. You could soup.findAll the script tags, for instance, then strip some lines to get to the JSON, then feed that into json.loads.