PDFMiner is a text extraction tool for PDF documents. Its pdf2txt.py command extracts all text that is rendered programmatically, that is, text stored as character strings rather than drawn as images. It also extracts the corresponding locations, font names, font sizes, ...
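
The per-character details described above (location, font name, font size) can also be read from Python. The following is a minimal sketch, assuming the pdfminer.six fork is installed and that its `extract_pages` layout API is available. The helper names `iter_chars` and `font_usage` are hypothetical and not part of PDFMiner.

```python
from collections import Counter


def iter_chars(pdf_path):
    """Yield (text, fontname, size, bbox) for each character in the layout.

    Assumes the pdfminer.six fork; imported lazily so the rest of this
    sketch can be loaded without it.
    """
    from pdfminer.high_level import extract_pages
    from pdfminer.layout import LTChar, LTTextContainer, LTTextLine

    for page_layout in extract_pages(pdf_path):
        for element in page_layout:
            # Text boxes contain text lines, which contain characters.
            if not isinstance(element, LTTextContainer):
                continue
            for line in element:
                if not isinstance(line, LTTextLine):
                    continue
                for obj in line:
                    if isinstance(obj, LTChar):
                        # bbox is (x0, y0, x1, y1) in page coordinates.
                        yield obj.get_text(), obj.fontname, obj.size, obj.bbox


def font_usage(chars):
    """Count characters per (fontname, size rounded to 0.1) pair."""
    return Counter((font, round(size, 1)) for _text, font, size, _bbox in chars)
```

For a quick look at which fonts a document uses, something like `font_usage(iter_chars("example.pdf"))` would work, where `example.pdf` is a placeholder path. Keeping the counting step separate from the PDF parsing makes it easy to try on hand-written tuples first.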