
python pdf writer

2019.11.04 21:04

WHRIA Views: 348

https://github.com/pymupdf/PyMuPDF/wiki/How-to-Insert-new-PDF-Pages,-Images-and-Text

 

 

How to Insert new PDF Pages, Images and Text

Jorj X. McKie edited this page on 20 Sep · 5 revisions

Inserting new Pages

Beginning with v1.11.0, PyMuPDF can insert new pages into existing or new PDFs. This works like so:

    import fitz                            # PyMuPDF
    doc = fitz.open("some.pdf")            # or a new PDF by fitz.open()
    doc.insertPage(n, text="some text")  # insert a new page in front of page n
    doc.save(...)                          # save what we did

Insertion page number n is 0-based and means "insertion in front of this page". Insertion at end is achieved by n = -1. Several parameters and options are available: fontsize, color, standard Base14 fonts or other fonts from your system, page dimension, etc. If text is a string containing line breaks (\n) or a Python sequence, then several text lines are generated.
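
To make the numbering rule concrete, here is a small pure-Python model of it. It does not call PyMuPDF: `insert_before` and `to_lines` are hypothetical helpers, and a plain list stands in for the document. It only covers the cases described above (a 0-based n, or n = -1 for the end).

```python
def insert_before(pages, n, new_page):
    """Model of the rule above: insert in front of page n, or at the end if n == -1."""
    if n == -1:
        pages.append(new_page)        # n = -1 means "insert at end"
    elif n >= 0:
        pages.insert(n, new_page)     # the new page ends up in front of old page n
    else:
        # other negative values are not covered by the description above
        raise ValueError("n must be -1 or a 0-based page number")
    return pages


def to_lines(text):
    """A string is split at line breaks; any other sequence gives one line per item."""
    if isinstance(text, str):
        return text.split("\n")
    return list(text)


pages = ["page 0", "page 1", "page 2"]
print(insert_before(list(pages), 1, "NEW"))    # NEW lands in front of "page 1"
print(insert_before(list(pages), -1, "NEW"))   # NEW lands after "page 2"
print(to_lines("first line\nsecond line"))
```

After an insertion in front of page n, the new page itself has number n and all following pages move up by one.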

We have included a new demo program "text2pdf.py" that converts a text file to a new PDF using this feature. As usual with PyMuPDF, it is a very fast alternative to similar solutions.

Inserting new Images

New images can now be put on PDF pages. Use this new method like so:

    doc = fitz.open("some.pdf")            # some existing PDF
    page = doc[n]                          # load page (0-based)
    rect = fitz.Rect(0, 0, 100, 100)       # where we want to put the image
    pix = fitz.Pixmap("some.image")        # any supported image file
    page.insertImage(rect, pixmap=pix, overlay=True)   # insert image
    doc.save(...)                          # save our deeds

By default, the image overlays whatever is currently in the rectangle. Transparent images are supported and can thus be used to "watermark" your PDF. With overlay=False, the image becomes background.

In order to put an identical thumbnail on each page, do this:

    for page in doc:
        page.insertImage(rect, pixmap=pix)

Except possibly for the first page, this is a very fast process. On my machine it took 6 seconds to stamp all 1,310 pages of Adobe's PDF Reference Manual with a (relatively small) image.

The method is more flexible than shown here:

  • Instead of a fitz.Pixmap you can also directly use the filename of the image - replace parameter pixmap= with filename=. Or use an image in memory (as a bytes or io.BytesIO object) and use parameter stream= instead.
  • Rotate the image using rotate=deg specifying an integer multiple of 90 degrees.
  • The image will be inserted centered, normally using the full width or the full height of the rectangle (at least one of them) while keeping its aspect ratio. If you specify keep_proportion=False, the image will fully cover the rectangle area and may thus appear somewhat distorted.
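
The centering rule in the last bullet can be worked through by hand. The sketch below is a pure-Python model of the behaviour as described, not PyMuPDF's own code: `fit_image` is a hypothetical helper, rectangles are plain (x0, y0, x1, y1) tuples, and rotation is ignored.

```python
def fit_image(rect, img_w, img_h, keep_proportion=True):
    """Return the area an img_w x img_h image would occupy inside rect."""
    x0, y0, x1, y1 = rect
    if not keep_proportion:
        return (x0, y0, x1, y1)       # stretched to cover the whole rectangle
    box_w, box_h = x1 - x0, y1 - y0
    # largest scale at which both image sides still fit into the rectangle
    scale = min(box_w / img_w, box_h / img_h)
    w, h = img_w * scale, img_h * scale
    # center the scaled image; at least one side fills the rectangle fully
    left = x0 + (box_w - w) / 2
    top = y0 + (box_h - h) / 2
    return (left, top, left + w, top + h)


# A 200 x 100 image in a 100 x 100 rectangle uses the full width
# and is centered vertically:
print(fit_image((0, 0, 100, 100), 200, 100))   # (0.0, 25.0, 100.0, 75.0)
```

With keep_proportion=False the same call returns the rectangle itself, which is where the distortion comes from: the 2:1 image is squeezed into a square.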

How to "censor" a PDF

I am not (quite) serious here ... but using this technique you can overlay certain text pieces with images as well:

    doc = fitz.open(...)
    for page in doc:
        rl = page.searchFor("nasty word", hit_max=nnn)
        for r in rl:
            page.insertImage(r, filename="black.jpg")
    doc.save("censored.pdf")

File "censored.pdf" will now have every (up to nnn per page) occurrence of "nasty word" overlaid with picture "black.jpg".

But note: the overlaid text is physically still there, and can be accessed e.g. via page.getText().

You can also use this approach to emphasize text in a textmarker style:

Create a small image file that only contains pixels of one color to be used for textmarking, say "yellow.jpg". Then use it in a variation of the above:

    pix = fitz.Pixmap("yellow.jpg")          # arbitrary size
    for page in doc:
        rl = page.searchFor("interesting stuff", hit_max=...)
        for r in rl:                         # every rectangle containing this text
            page.insertImage(r, pixmap=pix, overlay=False)
    doc.save(...)

All "interesting stuff" will now be textmarked yellow, i.e. shown with a yellow background.

If you don't want to use an image file, but just a general color as background, use this script instead:

    from fitz.utils import getColor          # function delivers RGB triple for a color name
    pink = getColor("pink")                  # one of the 540+ pre-installed colors ...
    for page in doc:
        rl = page.searchFor("interesting stuff", hit_max=...)
        for r in rl:                         # every rectangle containing this text
            page.drawRect(r, color=pink, fill=pink, overlay=False)
    doc.save(...)
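
Colors in these calls are RGB triples of floats between 0 and 1. If you start from the more familiar 0..255 values instead of a color name, a small conversion is enough. This is a sketch only: `rgb255_to_unit` is a hypothetical helper and not part of PyMuPDF.

```python
def rgb255_to_unit(r, g, b):
    """Convert 8-bit RGB values (0..255) to a triple of floats in 0..1."""
    for value in (r, g, b):
        if not 0 <= value <= 255:
            raise ValueError("RGB components must be in 0..255")
    return (r / 255, g / 255, b / 255)


print(rgb255_to_unit(0, 0, 255))     # (0.0, 0.0, 1.0), i.e. blue
print(rgb255_to_unit(255, 255, 0))   # (1.0, 1.0, 0.0), i.e. yellow
```

The resulting triple can be passed wherever the examples here use color= or fill=.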

Please note that all these changes permanently modify the PDF. They cannot be reverted, in contrast to annotations, which can be deleted again.

Inserting new Text

You can insert new text on existing pages. This works similarly to creating a new page together with text, but offers more flexibility. You can freely position your text pieces, choose different fonts / text sizes / colors for each piece, rotate them (by multiples of 90 degrees), etc.

    page = doc[n]
    text = "some text containing line breaks and\na prettier mono-spaced font."
    fname = "F0"
    ffile = "c:/windows/fonts/dejavusansmono.ttf"
    where = fitz.Point(50, 100)    # text starts here
    # this inserts 2 lines of text using font `DejaVu Sans Mono`
    page.insertText(where, text,
                    fontname=fname,    # arbitrary if fontfile given
                    fontfile=ffile,    # any file containing a font
                    fontsize=11,       # default
                    rotate=0,          # rotate text
                    color=(0, 0, 1),   # some color (blue)
                    overlay=True)      # text in foreground

Inserting Textboxes

This method is similar to the previous one (and in fact ultimately uses it).

You provide a rectangle and text that should be put into it - and nowhere else. The text is broken down into single words, respecting tabs and multiple spaces. Alignment is possible: left, center, right or justified.

The method returns a non-negative float result if successful.

    rc = page.insertTextbox(rect, text,
                    align=fitz.TEXT_ALIGN_JUSTIFY,  # justify the text
                    # ... more parameters like in insertText()
                    )
    if rc < 0:
        pass  # not enough space: take action here

The value rc indicates how much unused space is left in the rectangle. If it is negative, nothing has been inserted, and you can enlarge the rectangle, reduce the font size, or shorten the text and try again.
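
Because a negative result means that nothing was written, it is safe to simply try again with other settings. The following is a minimal sketch of such a retry loop, assuming you want to step down through a list of font sizes. `insert_with_fallback` and its `try_insert` argument are hypothetical: `try_insert` is any callable you supply that takes a font size, calls page.insertTextbox() and returns its result. A stand-in function is used here so the sketch runs without a PDF.

```python
def insert_with_fallback(try_insert, fontsizes=(11, 10, 9, 8)):
    """Try each font size in turn until the insertion reports success.

    Returns (fontsize, rc) for the first size with rc >= 0, or None if
    the text did not fit at any of the given sizes.
    """
    for size in fontsizes:
        rc = try_insert(size)
        if rc >= 0:                   # non-negative: text was inserted
            return size, rc
    return None                       # still too large at the smallest size


# Stand-in for a real insertion call (an assumption for this demo only):
# it pretends the text just fits at font size 8.
def fake_insert(size):
    return 20.0 - 2.5 * size


print(insert_with_fallback(fake_insert))   # (8, 0.0)
```

With a real page, the callable could be something like lambda fs: page.insertTextbox(rect, text, fontsize=fs).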
