Show HN: Pdf2csv – Convert PDF Tables to CSV with CLI and Python API
Hi HN,
I’m thrilled to share pdf2csv, a lightweight tool for converting tables in PDF files to CSV or XLSX. It’s particularly handy for right-to-left (RTL) languages like Farsi, Hebrew, and Arabic: text is extracted correctly and can optionally be reversed when needed.
Features:
• RTL Language Support: Handles Farsi, Hebrew, and Arabic, with optional text reversal.
• Flexible Output: Save tables as CSV or XLSX.
• Dual Interface: Use as a Python library or from the CLI.
• Powered by Docling: Uses the Docling library for table extraction.
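To give a feel for what the optional text reversal does, here is a minimal standard-library sketch. It is not pdf2csv's actual code or API: the helper names (contains_rtl, reverse_rtl_cells, rows_to_csv) are hypothetical, and it assumes the table has already been extracted as rows of strings. The idea is that some PDFs yield RTL text in reversed character order, so flipping only the cells that contain RTL characters restores it.

```python
import csv
import io
import unicodedata


def contains_rtl(text: str) -> bool:
    """True if any character has a right-to-left bidi class (Hebrew: R, Arabic/Farsi: AL)."""
    return any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in text)


def reverse_rtl_cells(rows, reverse=True):
    """Return a copy of rows with RTL cells reversed character by character.

    Hypothetical helper for illustration. A plain reversal like this does not
    handle combining marks or embedded numbers specially.
    """
    if not reverse:
        return [list(row) for row in rows]
    return [
        [cell[::-1] if contains_rtl(cell) else cell for cell in row]
        for row in rows
    ]


def rows_to_csv(rows) -> str:
    """Serialize rows to CSV text with the standard csv module."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    # The second cell is the Hebrew word "shalom" with its characters in reversed order.
    extracted = [["name", "\u05dd\u05d5\u05dc\u05e9"], ["id", "42"]]
    print(rows_to_csv(reverse_rtl_cells(extracted)))
```

Cells without RTL characters are left alone, so Latin headers and numeric columns pass through unchanged, and passing reverse=False skips the step entirely.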
Comments URL: https://news.ycombinator.com/item?id=42614330