Python bindings for gopdfsuit - a comprehensive PDF library for generation, merging, splitting, form filling, and HTML to PDF/Image conversion.
- PDF Generation: Create PDFs from structured templates with tables, images, and styled text
- PDF Merging: Combine multiple PDFs into a single document
- PDF Splitting: Split PDFs by pages, ranges, or maximum pages per file
- Form Filling: Fill PDF forms using XFDF data
- HTML to PDF: Convert HTML content or URLs to PDF documents
- HTML to Image: Convert HTML content or URLs to images (PNG, JPG, SVG)
- PDF Redaction: Securely redact sensitive information using coordinates or text search
- Build the shared library locally:
cd bindings/python
chmod +x build.sh
./build.sh
On Windows, use the batch file instead:
cd bindings\python
build.bat
- Install the Python package:
pip install .
There is currently an issue on Windows. Please build the application locally.
Sample data for the Python bindings is available here:
- Python 3.8+
- Go 1.22+ (for building the shared library)
- Chrome/Chromium (for HTML to PDF/Image conversion)
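The requirements above can be checked before building. The following is a minimal sketch, not part of pypdfsuit: `check_requirements` and the list of Chrome executable names are hypothetical and may need adjusting for your platform. It only checks that `go` is on the PATH, not that it is version 1.22 or later.

```python
import shutil
import sys

# Executable names are assumptions; installs differ between platforms.
CHROME_CANDIDATES = (
    "google-chrome",
    "google-chrome-stable",
    "chromium",
    "chromium-browser",
    "chrome",
)


def check_requirements(min_python=(3, 8)):
    """Return a dict mapping each requirement name to True or False."""
    return {
        # The running interpreter must be at least min_python.
        "python": sys.version_info >= min_python,
        # Go is only needed to build the shared library.
        "go": shutil.which("go") is not None,
        # Chrome/Chromium is only needed for HTML to PDF/Image conversion.
        "chrome": any(shutil.which(name) is not None for name in CHROME_CANDIDATES),
    }


for name, ok in check_requirements().items():
    print(f"{name}: {'found' if ok else 'missing'}")
```

A missing `chrome` entry would only matter if you plan to use the HTML conversion functions.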
from pypdfsuit import generate_pdf, PDFTemplate, Config, Title, Element, Table, Row, Cell

template = PDFTemplate(
    config=Config(page="A4", page_alignment=1),
    title=Title(
        props="Helvetica:24:100:center:0:0:0:0",
        text="My Document"
    ),
    elements=[
        Element(
            type="table",
            table=Table(
                max_columns=2,
                column_widths=[1.0, 1.0],
                rows=[
                    Row(row=[
                        Cell(props="Helvetica:12:100:left:1:1:1:1", text="Name"),
                        Cell(props="Helvetica:12:000:left:1:1:1:1", text="John Doe"),
                    ])
                ]
            )
        )
    ]
)

pdf_bytes = generate_pdf(template)
with open("output.pdf", "wb") as f:
    f.write(pdf_bytes)

from pypdfsuit import merge_pdfs
with open("doc1.pdf", "rb") as f1, open("doc2.pdf", "rb") as f2:
    merged = merge_pdfs([f1.read(), f2.read()])

with open("merged.pdf", "wb") as f:
    f.write(merged)

from pypdfsuit import split_pdf, SplitSpec
with open("document.pdf", "rb") as f:
    pdf_data = f.read()

# Split specific pages
spec = SplitSpec(pages=[1, 3, 5])
parts = split_pdf(pdf_data, spec)

# Or split every 5 pages
spec = SplitSpec(max_per_file=5)
parts = split_pdf(pdf_data, spec)

for i, part in enumerate(parts):
    with open(f"part_{i+1}.pdf", "wb") as f:
        f.write(part)

from pypdfsuit import convert_html_to_pdf, HtmlToPDFRequest
# Convert HTML string
request = HtmlToPDFRequest(
    html="<html><body><h1>Hello World</h1></body></html>",
    page_size="A4",
    orientation="Portrait",
)
pdf_bytes = convert_html_to_pdf(request)

# Or convert a URL
request = HtmlToPDFRequest(
    url="https://example.com",
    page_size="Letter",
)
pdf_bytes = convert_html_to_pdf(request)

from pypdfsuit import fill_pdf_with_xfdf
with open("form.pdf", "rb") as f:
    pdf_data = f.read()

with open("data.xfdf", "rb") as f:
    xfdf_data = f.read()

filled = fill_pdf_with_xfdf(pdf_data, xfdf_data)

with open("filled.pdf", "wb") as f:
    f.write(filled)

from pypdfsuit import apply_redactions_advanced
with open("document.pdf", "rb") as f:
    pdf_data = f.read()

redacted = apply_redactions_advanced(pdf_data, {
    "blocks": [
        {"pageNum": 1, "x": 120, "y": 620, "width": 180, "height": 24}
    ],
    "textSearch": [
        {"text": "Confidential"}
    ],
    "mode": "visual_allowed"
})

with open("redacted.pdf", "wb") as f:
    f.write(redacted)

- PDFTemplate - Main template structure for PDF generation
- Config - Page configuration (size, orientation, security, etc.)
- Title - Document title section
- Table, Row, Cell - Table structure
- Element - Generic element (table, spacer, image)
- Image, Spacer - Additional elements
- SecurityConfig - Encryption settings
- PDFAConfig - PDF/A compliance settings
- SignatureConfig - Digital signature settings
- HtmlToPDFRequest - HTML to PDF conversion options
- HtmlToImageRequest - HTML to image conversion options
- SplitSpec - PDF split specification
- FontInfo - Font information
- generate_pdf(template: PDFTemplate) -> bytes
- get_available_fonts() -> List[FontInfo]
- merge_pdfs(pdf_files: List[bytes]) -> bytes
- split_pdf(pdf_data: bytes, spec: SplitSpec) -> List[bytes]
- parse_page_spec(spec: str, total_pages: int = 0) -> List[int]
- fill_pdf_with_xfdf(pdf_data: bytes, xfdf_data: bytes) -> bytes
- convert_html_to_pdf(request: HtmlToPDFRequest) -> bytes
- convert_html_to_image(request: HtmlToImageRequest) -> bytes
- get_page_info(pdf_data: bytes) -> dict
- extract_text_positions(pdf_data: bytes, page_num: int) -> list[dict]
- find_text_occurrences(pdf_data: bytes, text: str) -> list[dict]
- apply_redactions(pdf_data: bytes, redactions: list[dict]) -> bytes
- apply_redactions_advanced(pdf_data: bytes, options: dict) -> bytes
The props string format for cells and titles is:
FontName:FontSize:StyleCode:Alignment:BorderLeft:BorderRight:BorderTop:BorderBottom
- FontName: Helvetica, Courier, Times-Roman, etc.
- FontSize: Integer size in points
- StyleCode: Three digits, each 1 (on) or 0 (off), for bold, italic, and underline in that order, e.g., "100" = bold only
- Alignment: left, center, right
- Borders: 1 = border, 0 = no border
Example: "Helvetica:12:100:center:1:1:1:1" = Helvetica 12pt, bold, centered, all borders
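Writing props strings by hand is easy to get wrong, so a small helper can build and parse them from named fields. The sketch below is one way to do this, based only on the format described above. The `Props` class and its field names are hypothetical and are not part of pypdfsuit.

```python
from dataclasses import dataclass

VALID_ALIGNMENTS = ("left", "center", "right")


@dataclass
class Props:
    """Hypothetical helper (not part of pypdfsuit) for one props string."""

    font: str = "Helvetica"
    size: int = 12
    bold: bool = False
    italic: bool = False
    underline: bool = False
    align: str = "left"
    # Border flags in the documented order: left, right, top, bottom.
    borders: tuple = (0, 0, 0, 0)

    def encode(self) -> str:
        """Build the colon-separated props string."""
        if self.align not in VALID_ALIGNMENTS:
            raise ValueError(f"unknown alignment: {self.align!r}")
        if len(self.borders) != 4:
            raise ValueError("borders needs four values: left, right, top, bottom")
        # StyleCode: one 0/1 digit each for bold, italic, underline.
        style = "".join(
            "1" if flag else "0" for flag in (self.bold, self.italic, self.underline)
        )
        parts = [self.font, str(self.size), style, self.align]
        parts += [str(int(b)) for b in self.borders]
        return ":".join(parts)

    @classmethod
    def decode(cls, props: str) -> "Props":
        """Parse a props string back into named fields."""
        fields = props.split(":")
        if len(fields) != 8:
            raise ValueError(f"expected 8 colon-separated fields, got {len(fields)}")
        font, size, style, align, *borders = fields
        if len(style) != 3 or set(style) - {"0", "1"}:
            raise ValueError(f"style code must be three 0/1 digits: {style!r}")
        return cls(
            font=font,
            size=int(size),
            bold=style[0] == "1",
            italic=style[1] == "1",
            underline=style[2] == "1",
            align=align,
            borders=tuple(int(b) for b in borders),
        )


header = Props(size=12, bold=True, align="center", borders=(1, 1, 1, 1)).encode()
print(header)  # Helvetica:12:100:center:1:1:1:1
```

The resulting string can be passed wherever a props value is expected, for example `Cell(props=header, text="Name")`.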
MIT License - see LICENSE for details.