by Mike Driscoll
I would like to see an in-depth discussion on using regular expressions to extract data from PDFs. Anyone who has done that realizes the subtleties and complexity involved, even if the documents look the same.
I will try to address each of the comments currently posted.
@Kevin - I will contact you for more information as your idea may be quite similar to something I am already planning.
@Francisco - Could you PM me more about what you mean by a Tkinter app? Also, what do you consider to be a "deep dive" into bar codes? What wasn't covered in the blog article, for example?
@Osrosk - I actually already planned to cover headers and footers. I can just enhance it to include some of the other items you mentioned.
@Steve - I actually already received some requests about extracting data, which is why I am reserving a section of the book for talking about tools outside of ReportLab that do this sort of thing, such as pdfminer. I do plan to have a section on ReportLab's capabilities for creating fillable PDF forms as well.
@Harsh - I will contact you for more information before I decide if I can include your idea or not.
I would really like to be able to read academic papers into Python, or LaTeX documents in general. There are a bunch of standard formats in use, and it would be great to be able to read them (accurately) programmatically.
There are a huge number of questions on Stack Overflow on extracting text, tables, data, etc., from PDFs, so that might be an issue to address. There are also a few on compressing PDFs to reduce size. Personally, I think there is a shortage of information & awareness on making fillable PDF forms.
I'd love to learn how to create and use header and footer templates, especially technical document headers and footers that contain, on each page, a logo and areas for date, author, page number, total number of pages, document number and so on.
A Tkinter sample project that uses ReportLab would be nice. Or an introduction to, or deep dive into, the barcode feature in ReportLab. It was your blog post about this feature that introduced me to ReportLab.
How about a well-developed sample object class that you call from your application so that the default options are consistent? I almost always do this sort of thing, but I would like to see how you would tackle it, and folks who are earlier in their programming careers would benefit. It would also be nice to include a clever way to drive some of the settings via configuration files.