Python Reportlab Template
Last updated on March 5, 2020
Posted by Felix Zumstein - Comments
Python is a popular tool for all kind of automation needs and therefore a great candidate for your reporting tasks.There is a wealth of techniques and libraries available and we’re going to introduce four popular options here. After reading this blog post, you should be able to pick the right library for your next reporting project according to your needs and skill set.
Table of Contents
Overview
Before we begin, here is a high level comparison of the libraries presented in this post:
Xhtml2pdf is a CSS/HTML to PDF generator/converter and Python library that can be used in any Python framework such as Django. In fact, to create PDFs, the tool uses ReportLab, a common open source Python library for generating PDF files. ReportLab uses an XML-based markup language called Requirements Modelling Language (RML).
Library | Technology | Summary |
---|---|---|
Pandas + HTML | HTML | You can generate beautiful reports in the form of static web pages if you know your way around HTML + CSS. The HTML report can also be turned into a PDF for printing. |
Pandas + Excel | Excel | This is a great option if the report has to be in Excel. It can be run on a server where Excel is not installed, i.e. it’s an ideal candidate for a “download to Excel” button in a web app. The Excel file can be exported to PDF. |
xlwings | Excel | xlwings allows the use of an Excel template so the formatting can be done by users without coding skills. It requires, however, an installation of Excel so it’s a good option when the report can be generated on a desktop, e.g. for ad-hoc reports. The Excel file can be exported to PDF. |
Plotly Dash | HTML | Dash allows you to easily spin up a great looking web dashboard that is interactive without having to write any JavaScript code. If formatted properly, it can be used as a source for PDFs, too. Like Pandas + HTML, it requires good HTML + CSS skills to make it look the way you want. |
ReportLab | ReportLab creates direct PDF files without going through HTML or Excel first. It’s very fast and powerful but comes with a steep learning curve. Used by Wikipedia for their PDF export. |
Pandas
I am probably not exaggerating when I claim that almost all reporting in Python starts with Pandas. It’s incredibly easy to create Pandas DataFrames with data from databases, Excel and csv files or json responses from a web API. Once you have the raw data in a DataFrame, it only requires a few lines of code to clean the data and slice & dice it into a digestible form for reporting. Accordingly, Pandas will be used in all sections of this blog post, but we’ll start by leveraging the built-in capabilities that Pandas offers for reports in Excel and HTML format.
Pandas + Excel
Required libraries: pandas
, xlsxwriter
If you want to do something slightly more sophisticated than just dumping a DataFrame into an Excel spreadsheet, I found that Pandas and XlsxWriter is the easiest combination, but others may prefer OpenPyXL. In that case you should be able to easily adopt this snippet by replacing engine='xlsxwriter'
with engine='openpyxl'
and changing the book/sheet syntax so it works with OpenPyXL:
Running this will produce the following report:
Of course, we could now go back to the script and add more code to style it a bit nicer, but I leave this as an exercise to the reader…
Pandas + HTML
Required libraries: pandas
, jinja2
Creating an HTML report with pandas works similar to what’ve just done with Excel: If you want a tiny bit more than just dumping a DataFrame as a raw HTML table, then you’re best off by combining Pandas with a templating engine like Jinja:
First, let’s create a file called template.html
:
Then, in the same directory, let’s run the following Python script that will create our HTML report:
The result is a nice looking HTML report that could also be printed as a PDF by using something like WeasyPrint:
Note that for such an easy example, you wouldn’t necessarily need to use a Jinja template. But when things start to become more complex, it’ll definitely come in very handy.
xlwings
xlwings allows you to program and automate Excel with Python instead of VBA. The difference to XlsxWriter or OpenPyXL (used above in the Pandas section) is the following: XlsxWriter and OpenPyXL write Excel files directly on disk. They work wherever Python works and don’t require an installation of Microsoft Excel.xlwings, on the other hand, can write, read and edit Excel files via the Excel application, i.e. a local installation of Microsoft Excel is required. xlwings also allows you to create macros and user defined functions in Python rather than in VBA, but for reporting purposes, we won’t really need that.
While XlsxWriter/OpenPyXL are the best choice if you need to produce reports in a scalable way on your Linux web server, xlwings does have the advantage that it can edit pre-formatted Excel files without losing or destroying anything. OpenPyXL on the other hand (the only writer library with xlsx editing capabilities) will drop some formatting and sometimes leads to Excel raising errors during further manual editing.
xlwings CE
Replicating the sample we had under Pandas is easy enough with the open-source version of xlwings:
Running this will produce the following report:
So where does all the formatting come from? The formatting is done directly in the Excel template before running the script. This means that instead of having to program tens of lines of code to format a single cell with the proper font, colors and borders, I can just make a few clicks in Excel. xlwings then merely opens the template file and inserts the values.
This allows us to create a good looking report in your corporate design very fast. The best part is that the Python developer doesn’t necessarily have to do the formatting but can leave it to the business user who owns the report.
Note that you could instruct xlwings to run the report in a separate and hidden instance of Excel so it doesn’t interfere with your other work.
xlwings PRO
The Pandas + Excel as well as the xlwings CE sample both have a few issues:
- If, for example, you insert a few rows below the title, you will have to adjust the cell references accordingly in the Python code. Using named ranges could help but they have other limitations (like the one mentioned at the end of this list).
- The number of rows in the table might be dynamic. This leads to two issues: (a) data rows might not be formatted consistently and (b) content below the table might get overwritten if the table is too long.
- Placing the same value in a lot of different cells (e.g. a date in the source note of every table or chart) will cause duplicated code or unnecessary loops.
To fix these issues, xlwings PRO comes with a dedicated reports package:
- Separation of code and design: Users without coding skills can change the template on their own without having to touch the Python code.
- Template variables: Python variables (between double curly braces) can be directly used in cells , e.g.
{{ title }}
. They act as placeholders that will be replaced by the values of the variables. - Frames for dynamic tables: Frames are vertical containers that dynamically align and style tables that have a variable number of rows. To see how Frames work, have a look at the documentation.
You can get a free trial for xlwings PRO here. When using the xlwings PRO reports package, your code simplifies to the following:
All that’s left is to create a template with the placeholders for title
and df
: Nvidia geforce gtx 660 ti driver for mac.
Running the script will produce the same report that we generated with xlwings CE above. The beauty of this approach is that there are no hard coded cell references anymore in your Python code. This means that the person who is responsible for the layout can move the placeholders around and change the fonts and colors without having to bug the Python developer anymore.
Plotly Dash
Required libraries: pandas
, dash
Plotly is best known for their beautiful and open-source JavaScript charting library which builds the core of Chart Studio, a platform for collaboratively designing charts (no coding required).
To create a report though, we’re using their latest product Plotly Dash, an open-source framework that allows the creation of interactive web dashboards with Python only (no need to write JavaScript code). Plotly Dash is also available as Enterprise plan.
How it works is best explained by looking at some code, adopted with minimal changes from the official getting started guide:
Running this script and navigating to http://localhost:8050 in your browser will give you this dashboard:
The charts look great by default and it’s very easy to make your dashboard interactive by writing simple callback functions in Python: You can choose the year by clicking on the slider below the chart. In the background, every change to our year-slider
will trigger the update_figure
callback function and hence update the chart.
By arranging your documents properly, you could create an interactive web dashboard that can also act as the source for your PDF factsheet, see for example their financial factsheet demo together with it’s source code.
Alternatives to Plotly Dash
If you are looking for an alternative to Plotly Dash, make sure to check out Panel. Panel was originally developed with the support of Anaconda Inc., and is now maintained by Anaconda developers and community contributors. Unlike Plotly Dash, Panel is very inclusive and supports a wide range of plotting libraries including: Bokeh, Altair, Matplotlib and others (including also Plotly).
ReportLab
Required libraries: pandas
, reportlab
ReportLab writes PDF files directly. Most prominently, Wikipedia uses ReportLab to generate their PDF exports. One of the key strength of ReportLab is that it builds PDF reports “at incredible speeds”, to cite their homepage. Let’s have a look at some sample code for both the open-source and the commercial version!
ReportLab OpenSource
In its most basic functionality, ReportLab uses a canvas where you can place objects using a coordinate system:
ReportLab also offers an advanced mode called PLATYPUS (Page Layout and Typography Using Scripts), which is able to define dynamic layouts based on templates at the document and page level. Within pages, Frames
would then arrange Flowables
(e.g. text and pictures) dynamically according to their height. Here is a very basic example of how you put PLATYPUS at work:
Running this script will produce the following PDF:
ReportLab PLUS
In comparison to the open-source version of ReportLab, the most prominent features of Reportlab PLUS are
- a templating language
- the ability to include vector graphics
The templating language is called RML (Report Markup Language), an XML dialect. Here is a sample of how it looks like, taken directly from the official documentation:
The idea here is that you can have any program produce such an RML document, not just Python, which can then be transformed into a PDF document by ReportLab PLUS.
Conclusion
Python offers various libraries to create professional reports and factsheets. If you are a good at HTML + CSS have a look at Plotly Dash or Panel or write your HTML documents directly with the help of the to_html
method in Pandas.
If you need your report as Excel file (or if you hate CSS), Pandas + XlsxWriter/OpenPyXL or xlwings might be the right choice - you can still export your Excel document as PDF file. xlwings is the better choice if you want to split the design and code work. XlsxWriter/OpenPyxl is the better choice if it needs to be scalable and run on a server.
If you need to generate PDF files at high speed, check out ReportLab. It has a steep learning curve and requires to write quite some code but once the code has been written, it works at high speed.
Please enable JavaScript to view the comments powered by Disqus.This is a wrapper for ReportLab which allows easy creation of PDF documents:
Letters and reports
PDFDocument comes with two different PDF templates, letters and reports. Theonly difference is the layout of the first page: The letter has an additionalframe for the address at the top and a smaller main content area.
Usage is as follows:
The letter generates default styles using 9 point fonts as base size, the reportuses 8 points. This can be changed by calling pdf.generate_style
again.
There exists also a special type of report, the confidential report, the onlydifferences being that the confidentiality is marked using a red cross at thetop of the first page and a watermark in the background.
Styles
The call to pdf.generate_style
generates a set of predefined styles. (Yesit does!) That includes the following styles; this list is neither exhaustivenor a promise:
pdf.style.normal
pdf.style.heading1
pdf.style.heading2
pdf.style.heading3
pdf.style.small
pdf.style.bold
pdf.style.right
pdf.style.indented
pdf.style.paragraph
pdf.style.table
Most of the time you will not use those attributes directly, except in the caseof tables. Convenience methods exist for almost all styles as described in thenext chapter.
Content
All content passed to the following methods is escaped by default. ReportLabsupports a HTML-like markup language, if you want to use it directly you'llhave to either use only pdf.p_markup
or resort to creatingpdfdocument.document.MarkupParagraph
instances by hand.
Headings
pdf.h1
, pdf.h2
, pdf.h3
Paragraphs
pdf.p
, pdf.p_markup
, pdf.small
, pdf.smaller
Unordered lists
pdf.ul
Mini-HTML
pdf.mini_html
Various elements
pdf.hr
, pdf.hr_mini
, pdf.spacer
, pdf.pagebreak
,pdf.start_keeptogether
, pdf.end_keeptogether
, pdf.next_frame
,
Tables
pdf.table
, pdf.bottom_table
Canvas methods
Canvas methods work with the canvas directly, and not with Platypus objects.They are mostly useful inside stationery functions. You'll mostly useReportLab's canvas methods directly, and only resort to the following methodsfor special cases.
pdf.confidential
, pdf.draw_watermark
, pdf.draw_svg
Additional methods
pdf.append
, pdf.restart
Django integration
PDFDocument has a few helpers for generating PDFs in Django views, most notablypdfdocument.utils.pdf_response
:
The SVG support uses svglib by Dinu Gherman. It can be found on PyPI:<http://pypi.python.org/pypi/svglib/>