I need my Perl code to read a serialized Python object for later processing. I came up with a parser based on Parser::MGC, but it is slow. Maybe I did it wrong, or maybe someone knows a better way to convert a serialized Python object into some sort of Perl structure?
Here is my parser code:
package Room::HandParser;

use strict;
use warnings;
use base qw( Parser::MGC );

# Card names, indexed by the low six bits of a PokerCards value.
my @poker_cards_string = (
    '2h', '3h', '4h', '5h', '6h', '7h', '8h', '9h', 'Th', 'Jh', 'Qh', 'Kh', 'Ah',
    '2d', '3d', '4d', '5d', '6d', '7d', '8d', '9d', 'Td', 'Jd', 'Qd', 'Kd', 'Ad',
    '2c', '3c', '4c', '5c', '6c', '7c', '8c', '9c', 'Tc', 'Jc', 'Qc', 'Kc', 'Ac',
    '2s', '3s', '4s', '5s', '6s', '7s', '8s', '9s', 'Ts', 'Js', 'Qs', 'Ks', 'As',
);

# Parses one "key: value" pair of a dict; returns [ key, ":", value ].
sub parse_declaration {
    my $self = shift;
    [
        $self->any_of(
            sub { $self->token_int },
            sub { $self->token_string },
        ),
        $self->expect(":"),
        $self->parse,
    ]
}

# Parses the comma-separated pairs of a dict into a hash reference.
sub parse_hash {
    my $self = shift;
    my %ret;
    $self->list_of(",", sub {
        my $res = $self->parse_declaration;
        $ret{$res->[0]} = $res->[2];
    });
    return \%ret;
}

# Parses one card number and maps it to its name, e.g. 226 -> 'Tc'.
sub parse_cards {
    my $self = shift;
    my $card = $self->token_int;
    return $poker_cards_string[$card & 0x3F];
}

# Parses any value: list, tuple, dict, PokerCards([...]), number, string or keyword.
sub parse {
    my $self = shift;
    $self->any_of(
        sub { $self->scope_of( "[", sub { $self->list_of(",", \&parse) }, "]" ) },
        sub { $self->scope_of( "(", sub { $self->list_of(",", \&parse) }, ")" ) },
        sub { $self->scope_of( "{", sub { $self->parse_hash }, "}" ) },
        sub { $self->scope_of( "PokerCards([", sub { $self->list_of(",", \&parse_cards) }, "])" ) },
        sub { $self->token_float },
        sub { $self->token_int },
        sub { $self->token_string },
        sub { $self->token_kw( qw(None True False) ) },
    );
}

1;
Here is an example of a serialized Python object I need to parse:
[('game', 0, 195, 0, 0.0, 'holdem', '100-200-no-limit', [50312, 50313, 50314, 50315, 50316, 50317, 2], 0, {2: 1000000, 50312: 200000, 50313: 200000, 50314: 200000, 50315: 200000, 50316: 200000, 50317: 200000}), ('position', 1), ('blind', 50313, 10000, 0), ('position', 2), ('blind', 50314, 20000, 0), ('position', -1), ('round', 'pre-flop', PokerCards([]), {2: PokerCards([226, 208]), 50312: PokerCards([223, 206]), 50313: PokerCards([221, 233]), 50314: PokerCards([222, 211]), 50315: PokerCards([235, 216]), 50316: PokerCards([209, 236]), 50317: PokerCards([237, 243])}), ('position', 3), ('call', 50315, 20000), ('position', 4), ('call', 50316, 20000), ('position', 5), ('call', 50317, 20000), ('position', 6), ('call', 2, 20000), ('position', 0), ('fold', 50312), ('position', 1), ('call', 50313, 10000), ('position', 2), ('check', 50314), ('position', -1), ('round', 'flop', PokerCards([7, 21, 46]), {2: PokerCards([226, 208]), 50313: PokerCards([221, 233]), 50314: PokerCards([222, 211]), 50315: PokerCards([235, 216]), 50316: PokerCards([209, 236]), 50317: PokerCards([237, 243])}), ('position', 1), ('check', 50313), ('position', 2), ('check', 50314), ('position', 3), ('check', 50315), ('position', 4), ('check', 50316), ('position', 5), ('check', 50317), ('position', 6), ('check', 2), ('position', -1), ('round', 'turn', PokerCards([7, 21, 46, 38]), None), ('position', 1), ('check', 50313), ('position', 2), ('check', 50314), ('position', 3), ('check', 50315), ('position', 4), ('check', 50316), ('position', 5), ('check', 50317), ('position', 6), ('check', 2), ('position', -1), ('round', 'river', PokerCards([7, 21, 46, 38, 20]), None), ('position', 1), ('check', 50313), ('position', 2), ('check', 50314), ('position', 3), ('check', 50315), ('position', 4), ('check', 50316), ('position', 5), ('check', 50317), ('position', 6), ('check', 2), ('position', -1), ('showdown', None, {2: PokerCards([226, 208]), 50313: PokerCards([29, 41]), 50314: PokerCards([222, 211]), 50315: 
PokerCards([43, 24]), 50316: PokerCards([209, 236]), 50317: PokerCards([45, 51])}), ('end', [50317], [{'serial2delta': {2: -20000, 50313: -20000, 50314: -20000, 50315: -20000, 50316: -20000, 50317: 100000}, 'player_list': [50312, 50313, 50314, 50315, 50316, 50317, 2], 'serial2rake': {50317: 0}, 'serial2share': {50317: 120000}, 'pot': 120000, 'serial2best': {2: {'hi': [101154816, ['FlHouse', 46, 20, 7, 34, 21]]}, 50313: {'hi': [50841600, ['Trips', 46, 20, 7, 38, 21]]}, 50314: {'hi': [50841600, ['Trips', 46, 20, 7, 38, 21]]}, 50315: {'hi': [50842368, ['Trips', 46, 20, 7, 38, 24]]}, 50316: {'hi': [50841600, ['Trips', 46, 20, 7, 38, 21]]}, 50317: {'hi': [101171200, ['FlHouse', 46, 20, 7, 51, 38]]}}, 'type': 'game_state', 'side_pots': {'building': 0, 'pots': [[120000, 120000]], 'last_round': 3, 'contributions': {0: {0: {2: 20000, 50313: 20000, 50314: 20000, 50315: 20000, 50316: 20000, 50317: 20000}}, 1: {}, 2: {}, 'total': {2: 20000, 50313: 20000, 50314: 20000, 50315: 20000, 50316: 20000, 50317: 20000}, 3: {}}}}, {'serials': [50313, 50314, 50315, 50316, 50317, 2], 'pot': 120000, 'hi': [50317], 'chips_left': 0, 'type': 'resolve', 'serial2share': {50317: 120000}}])]
For a structure like this, parsing takes several seconds at 100% CPU, which is not acceptable in my case.
EDIT: I am NOT looking for a workaround such as writing a Python script that evals this structure and outputs JSON, or rewriting the original Python app to store its data as JSON. I want to parse this data in Perl with reasonable performance. The format is very close to JSON, so it should be possible to parse it in a similar amount of time.
If anyone is interested: I ended up with a few regexps that transform this string into JSON (the two formats look very similar) and then parse the result with JSON::XS. See the __parse_hands() subroutine in https://github.com/hippich/Bitcoin-Poker-Room/commit/2f0e089908d3fa71dc16021ac6a24807c46529ad#diff-1
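A minimal sketch of that regexp approach might look like the code below. It is not the code from the linked commit. The helper names python_repr_to_json and card_list are made up for illustration, and JSON::PP is used only because it ships with Perl; JSON::XS exports the same decode_json and should be a drop-in replacement. The sketch assumes that strings contain no quotes, backslashes, parentheses or colons, and that None, True and False never appear inside a string.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);   # core module; JSON::XS provides the same function

# Card names in the same order as @poker_cards_string in the question.
my @cards = map { my $suit = $_; map { "$_$suit" } 2 .. 9, qw(T J Q K A) } qw(h d c s);

# Hypothetical helper: "226, 208" -> ["Tc","5d"] (as JSON text).
sub card_list {
    my ($numbers) = @_;
    my @names = map { sprintf '"%s"', $cards[ $_ & 0x3F ] } split /\s*,\s*/, $numbers;
    return '[' . join( ',', @names ) . ']';
}

# Hypothetical helper: rewrite the Python text as JSON text.
# Assumes simple strings only (see the assumptions listed above).
sub python_repr_to_json {
    my ($text) = @_;

    # PokerCards([...]) becomes a JSON array of card names.
    $text =~ s/PokerCards\(\[([^\]]*)\]\)/card_list($1)/ge;

    # Tuples become arrays.
    $text =~ tr/()/[]/;

    # 'text' becomes "text".
    $text =~ s/'([^']*)'/"$1"/g;

    # Python constants become JSON literals.
    my %literal = ( None => 'null', True => 'true', False => 'false' );
    $text =~ s/\b(None|True|False)\b/$literal{$1}/g;

    # JSON object keys must be strings: {2: ...} becomes {"2": ...}.
    $text =~ s/([{,]\s*)(-?\d+)(\s*:)/$1"$2"$3/g;

    return $text;
}

my $python = q{[('round', 'flop', PokerCards([7, 21, 46]), {2: PokerCards([226, 208])}), ('position', -1), ('showdown', None)]};
my $data   = decode_json( python_repr_to_json($python) );

print join( ' ', @{ $data->[0][2] } ), "\n";   # board cards of the flop round
print $data->[0][3]{2}[0], "\n";               # first hole card of player 2
```

Because the heavy lifting is done by a handful of regexp passes plus one call into the JSON decoder, there is no per-token backtracking as in the any_of-based parser. The price is that the substitutions are blind to context, so any input that breaks the assumptions above would need a real tokenizer instead.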