Summary
Graphon currently recognizes common Office and text formats, but OpenDocument text files (.odt, application/vnd.oasis.opendocument.text) are not routed as supported document inputs. This means integrations that rely on Graphon file type standardization or document extraction cannot handle ODT files consistently.
Some WPS Office document MIME types should also be classified as document files for file-type validation.
Expected behavior
- .odt files are classified as document files.
- application/vnd.oasis.opendocument.text is classified as a document MIME type.
- Document extractor dispatch routes .odt and application/vnd.oasis.opendocument.text to an ODT extractor.
- WPS Office document MIME types are classified as document files.
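The expected behavior could look roughly like the sketch below. It is an illustration only: DOCUMENT_EXTENSIONS, DOCUMENT_MIME_TYPES, classify, and select_extractor are hypothetical names, not Graphon's actual constants or functions, and the real lookup tables may be shaped differently. The WPS Office MIME types would go into the same set; this issue does not list them, so none are shown.

```python
from typing import Optional

ODT_EXTENSION = ".odt"
ODT_MIME_TYPE = "application/vnd.oasis.opendocument.text"

# Hypothetical lookup tables; Graphon's real ones may differ in name and content.
DOCUMENT_EXTENSIONS = {".txt", ".pdf", ".docx", ODT_EXTENSION}
DOCUMENT_MIME_TYPES = {
    "text/plain",
    "application/pdf",
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ODT_MIME_TYPE,
    # WPS Office document MIME types would be added here.
}


def classify(extension: Optional[str] = None, mime_type: Optional[str] = None) -> Optional[str]:
    """Return "document" when either the extension or the MIME type is known."""
    if extension and extension.lower() in DOCUMENT_EXTENSIONS:
        return "document"
    if mime_type and mime_type.lower() in DOCUMENT_MIME_TYPES:
        return "document"
    return None


def select_extractor(extension: Optional[str] = None, mime_type: Optional[str] = None) -> Optional[str]:
    """Return the (hypothetical) extractor key for an input, or None if unrouted."""
    if (extension or "").lower() == ODT_EXTENSION:
        return "odt"
    if (mime_type or "").lower() == ODT_MIME_TYPE:
        return "odt"
    return None
```

Matching on either the extension or the MIME type lets an upload that carries only one of the two still reach the ODT extractor.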
Notes
unstructured.partition.odt.partition_odt is available through the existing unstructured dependency, so ODT extraction can reuse the existing unstructured extractor path.
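A minimal sketch of what the ODT extractor might look like, assuming the file content arrives as bytes. extract_odt_text and its partition parameter are hypothetical and exist only so the function can be exercised without a real .odt file; depending on the installed unstructured version, partition_odt may also need extra system packages such as pandoc.

```python
import io
from typing import Callable, Optional


def extract_odt_text(content: bytes, partition: Optional[Callable] = None) -> str:
    """Partition ODT bytes into elements and join their text with newlines."""
    if partition is None:
        # Imported lazily so the module loads even where unstructured is absent.
        from unstructured.partition.odt import partition_odt

        partition = partition_odt

    elements = partition(file=io.BytesIO(content))
    # Each unstructured element exposes its content as .text; skip empty ones.
    return "\n".join(el.text for el in elements if getattr(el, "text", ""))
```

Called as extract_odt_text(odt_bytes), this would follow the same path as the other unstructured-backed extractors, with only the partition function changing.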