I wrote a function in Python 2.7 (on Window OS 64bit) in order to

Question

Asked: June 16, 2026

I wrote a function in Python 2.7 (on 64-bit Windows) to calculate the mean value of the intersection area between a reference polygon (Ref) and one or more segmented (Seg) polygons in ESRI shapefile format. The code is quite slow because I have more than 2000 reference polygons, and for each reference polygon the function loops over all of the segmented polygons (more than 7000). I apologize for the rough code; the function is a prototype.

I would like to know whether multiprocessing can help speed up my loop, or whether there are better-performing solutions. If multiprocessing is a possible solution, I would like to know the best way to optimize the following function:

import os
import numpy
import osgeo.ogr
from shapely.geometry import Polygon

def AreaInter(reference,segmented,outFile):
     # open shapefile
     ref = osgeo.ogr.Open(reference)
     if ref is None:
          raise SystemExit('Unable to open %s' % reference)
     seg = osgeo.ogr.Open(segmented)
     if seg is None:
          raise SystemExit('Unable to open %s' % segmented)
     ref_layer = ref.GetLayer()
     seg_layer = seg.GetLayer()
     # create outfile
     if not os.path.split(outFile)[0]:
          file_path, file_name_ext = os.path.split(os.path.abspath(reference))
          outFile_filename = os.path.splitext(os.path.basename(outFile))[0]
          file_out = open(os.path.abspath("{0}\\{1}.txt".format(file_path, outFile_filename)), "w")
     else:
          file_path_name, file_ext = os.path.splitext(outFile)
          file_out = open(os.path.abspath("{0}.txt".format(file_path_name)), "w")
     # For each reference objects-i
     for index in xrange(ref_layer.GetFeatureCount()):
          ref_feature = ref_layer.GetFeature(index)
          # get FID (=Feature ID)
          FID = str(ref_feature.GetFID())
          ref_geometry = ref_feature.GetGeometryRef()
          pts = ref_geometry.GetGeometryRef(0)
          points = []
          for p in xrange(pts.GetPointCount()):
               points.append((pts.GetX(p), pts.GetY(p)))
          # convert in a shapely polygon
          ref_polygon = Polygon(points)
          # get the area
          ref_Area = ref_polygon.area
          # create two empty lists: segment areas and intersection areas
          seg_Area, intersect_Area = ([] for _ in range(2))
          # For each segmented objects-j
          for segment in xrange(seg_layer.GetFeatureCount()):
               seg_feature = seg_layer.GetFeature(segment)
               seg_geometry = seg_feature.GetGeometryRef()
               pts = seg_geometry.GetGeometryRef(0)
               points = []
               for p in xrange(pts.GetPointCount()):
                    points.append((pts.GetX(p), pts.GetY(p)))
               seg_polygon = Polygon(points)
               seg_Area.append(seg_polygon.area)
               # intersection (overlap) of reference object with the segmented object
               intersect_polygon = ref_polygon.intersection(seg_polygon)
               # area of intersection (= 0, No intersection)
               intersect_Area.append(intersect_polygon.area)
          # Average for all segmented objects (because 1 or more segmented polygons can intersect with reference polygon)
          seg_Area_average = numpy.average(seg_Area)
          intersect_Area_average = numpy.average(intersect_Area)
          file_out.write(" ".join(["%s" %i for i in [FID, ref_Area,seg_Area_average,intersect_Area_average]])+ "\n")
     file_out.close()

1 Answer

Editorial Team · Answer 1 · 2026-06-16T23:17:57+00:00

You can use the multiprocessing package, and especially the Pool class. First create a function that does all the stuff you want to do within the for loop, and that takes as an argument only the index:

def process_reference_object(index):
      ref_feature = ref_layer.GetFeature(index)
      # all your code goes here
      return (" ".join(["%s" %i for i in [FID, ref_Area,seg_Area_average,intersect_Area_average]])+ "\n")

Note that this function doesn’t write to the file itself; that would be messy, because you’d have multiple processes writing to the same file at the same time. Instead, it returns the string that needs to be written. Also note that the function relies on objects like ref_layer or seg_layer that will need to reach it somehow. How you do that is up to you: you could make process_reference_object a method of a class initialized with them, or it could be as ugly as just defining them globally.

Then, you create a pool of process resources, and run all of your indices using Pool.imap_unordered (which will itself allocate each index to a different process as necessary):

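from multiprocessing import Pool

if __name__ == '__main__':  # guard needed on Windows, where workers re-import this module
    p = Pool()  # by default, one worker process per CPU core
    for l in p.imap_unordered(process_reference_object, range(ref_layer.GetFeatureCount())):
        file_out.write(l)
    p.close()
    p.join()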

This will parallelize the independent processing of your reference objects across multiple processes, and write them to the file (in an arbitrary order, note).
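To make the "objects need to reach the function somehow" part concrete, here is a minimal sketch of one way to do it with a Pool initializer. It is not the asker's actual code: the names init_worker, process_reference and run are hypothetical, and plain rectangles (xmin, ymin, xmax, ymax) stand in for the shapefile polygons so the example runs without GDAL or Shapely. In a real version, init_worker would most likely open the segmented shapefile itself, since on Windows each worker starts as a fresh process and OGR objects typically cannot be pickled and passed across.

```python
from multiprocessing import Pool

# Hypothetical stand-in data: rectangles (xmin, ymin, xmax, ymax) replace
# the shapefile polygons so the sketch runs without GDAL or Shapely.
_segments = None  # per-process global, filled in by init_worker


def init_worker(segments):
    """Run once in each worker: store the data that every task needs."""
    global _segments
    _segments = segments


def rect_area(r):
    """Area of a rectangle; 0.0 if it is empty or inverted."""
    width = max(0.0, float(r[2] - r[0]))
    height = max(0.0, float(r[3] - r[1]))
    return width * height


def overlap_area(a, b):
    """Area of the intersection of two rectangles (0.0 if none)."""
    return rect_area((max(a[0], b[0]), max(a[1], b[1]),
                      min(a[2], b[2]), min(a[3], b[3])))


def process_reference(task):
    """Build the output line for one reference object."""
    fid, ref = task
    seg_areas = [rect_area(s) for s in _segments]
    inter_areas = [overlap_area(ref, s) for s in _segments]
    seg_mean = sum(seg_areas) / len(seg_areas)
    inter_mean = sum(inter_areas) / len(inter_areas)
    return "%s %s %s %s\n" % (fid, rect_area(ref), seg_mean, inter_mean)


def run(references, segments, out_path, processes=None):
    """Fan the reference objects out to workers; only the parent writes."""
    pool = Pool(processes, initializer=init_worker, initargs=(segments,))
    try:
        with open(out_path, "w") as file_out:
            tasks = enumerate(references)  # (FID, geometry) pairs
            for line in pool.imap_unordered(process_reference, tasks):
                file_out.write(line)
    finally:
        pool.close()
        pool.join()
```

On Windows, call run(references, segments, "areas.txt") from under an if __name__ == '__main__': guard. Because each returned line carries its own FID, the arbitrary output order can be undone afterwards by sorting the file on the first column.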
