Scandata parsing
- internetarchivepdf.scandata.scandata_xml_get_toc(xml_file)
Returns a table of contents given a parsed scandata.xml
Args:
scandata: Parsed scandata as returned by scandata_parse
Returns:
List of dict describing the table of contents, for example: [{'title': 'The beginning', 'level': 1, 'label': None, 'leaf': 2}, …] (list of dict)
May raise KeyError if the scandata is invalid.
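The returned structure can be consumed like any list of dicts. The following is a minimal sketch that renders such a list as an indented outline. It does not call the library; `format_toc` is a hypothetical helper, the second sample entry is made up for illustration, and treating `label` as a printed page label is an assumption based only on the field name.

```python
def format_toc(toc):
    """Render TOC entries as indented text lines (hypothetical helper)."""
    lines = []
    for entry in toc:
        # Indent two spaces per nesting level below the top level.
        indent = "  " * (entry["level"] - 1)
        # Assumption: 'label' is a printed page label; fall back to the leaf index.
        if entry["label"] is not None:
            page = entry["label"]
        else:
            page = f"leaf {entry['leaf']}"
        lines.append(f"{indent}{entry['title']} ({page})")
    return lines


# Sample data in the documented shape; the second entry is invented.
toc = [
    {"title": "The beginning", "level": 1, "label": None, "leaf": 2},
    {"title": "A subsection", "level": 2, "label": "iv", "leaf": 5},
]

for line in format_toc(toc):
    print(line)
```

Because the helper indexes each key directly, an entry missing `title`, `level`, `label`, or `leaf` raises KeyError rather than being silently skipped.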