“Out of memory” while parsing a large (100 MB) XML file using Perl?


Handling large XML files that don't fit in memory is something that XML::Twig advertises:

"One of the strengths of XML::Twig is that it let you work with files that do not fit in memory (BTW storing an XML document in memory as a tree is quite memory-expensive, the expansion factor being often around 10). To do this you can define handlers, that will be called once a specific element has been completely parsed. In these handlers you can access the element and process it as you see fit (...)"

The code posted in the question isn't making use of the strength of XML::Twig at all (using the simplify method doesn't make it much better than XML::Simple).

What's missing from the code are the 'twig_handlers' or 'twig_roots', which essentially cause the parser to focus on relevant portions of the XML document memory-efficiently.

It's difficult to say without seeing the XML whether processing the document chunk-by-chunk or just selected parts is the way to go, but either one should solve this issue.

So the code should look something like the following (chunk-by-chunk demo):

    use strict;
    use warnings;
    use XML::Twig;
    use List::Util 'sum';   # To make life easier
    use Data::Dump 'dump';  # To see what's going on

    my %bedrooms;           # Data structure to store the wanted info

    my $xml = XML::Twig->new (
        twig_roots => {
            DivisionHouseRoom => \&count_bedrooms,
        }
    );

    $xml->parsefile( 'divisionhouserooms-v3.xml' );

    sub count_bedrooms {
        my ( $twig, $element ) = @_;

        my @divParents = $element->children( 'Divisions' );
        my $id = $element->first_child_text( 'HouseCode' );

        for my $divParent ( @divParents ) {
            my @divisions = $divParent->children( 'Division' );
            my $total = sum map { $_->text } @divisions;
            $bedrooms{$id} = $total;
        }

        $element->purge;    # Free up memory
    }

    dump \%bedrooms;

See the "Processing an XML document chunk by chunk" section of the XML::Twig documentation; it specifically discusses how to process a document part by part, which is what makes it possible to process large XML files.
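For comparison, here is a minimal sketch of the same idea using 'twig_handlers' instead of 'twig_roots'. It reuses the element names from the code above (DivisionHouseRoom, HouseCode, Divisions, Division), but the sample XML, including the Houses root element, is an assumption made up for illustration, since the real file's layout isn't shown in the question. The use of sum0 (so that an empty list gives 0 rather than undef) is also my choice, not something from the original code.

```perl
use strict;
use warnings;
use XML::Twig;
use List::Util 'sum0';

# Hypothetical sample input; the structure of the real file is an assumption.
my $sample = <<'XML';
<Houses>
  <DivisionHouseRoom>
    <HouseCode>H1</HouseCode>
    <Divisions>
      <Division>2</Division>
      <Division>1</Division>
    </Divisions>
  </DivisionHouseRoom>
  <DivisionHouseRoom>
    <HouseCode>H2</HouseCode>
    <Divisions>
      <Division>4</Division>
    </Divisions>
  </DivisionHouseRoom>
</Houses>
XML

my %bedrooms;    # HouseCode => sum of its Division values

my $twig = XML::Twig->new(
    twig_handlers => {
        # Called each time a DivisionHouseRoom element has been fully parsed.
        DivisionHouseRoom => sub {
            my ( $t, $house ) = @_;

            my $id        = $house->first_child_text('HouseCode');
            my @divisions = map { $_->children('Division') }
                            $house->children('Divisions');

            $bedrooms{$id} = sum0 map { $_->text } @divisions;

            $t->purge;    # Discard what has been parsed so far to free memory
        },
    },
);

# A string is parsed here to keep the sketch self-contained;
# for a file on disk, parsefile() would be used as in the code above.
$twig->parse($sample);

printf "%s => %d\n", $_, $bedrooms{$_} for sort keys %bedrooms;
```

With the sample input above this prints "H1 => 3" and "H2 => 4". As far as I understand the documentation, the practical difference is that 'twig_roots' only builds the listed elements, while 'twig_handlers' builds the whole tree as it goes, so with handlers the purge call is what keeps memory use bounded.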
