Wednesday 2 August 2017

Adding tiled watermark text to documents

Recently I wanted to start adding watermark text to documents that contain my personal information. These are usually related to the hiring process: documents we frequently send out whenever we are being vetted by one company or another that is trying to hire us.

A watermark helps ensure that the documents we send, should they ever turn up somewhere unexpected, can easily be traced back to the source. Obviously a watermark can be removed, especially from a poor-quality photocopy. But at least it adds one extra step that some people will not be willing to go through.

The watermark text itself could state to whom the document is being sent and who created it.

I wanted to be able to scan any document on my home scanner, have it in JPG format, and add a slanted, tiled watermark text of my choosing. I also wanted to be able to script this operation for multiple files.

Obstacles:
- all the printers I had access to scanned to PDF
- there's no free, simple way of converting PDF to JPG without sending your data to some sketchy website
- adding a watermark automatically to multiple documents (for a single file, the easiest way would be to add the watermark text manually in a free tool such as GIMP)

I did a bit of digging, and the simplest solution for me was the following, which requires installing two programs:

- ImageMagick (a console-based image processing app)
- Ghostscript (a PDF renderer)

ImageMagick has its own PDF to JPG conversion command, but it relies on having Ghostscript installed underneath. For batch jobs you can use Ghostscript alone for that step to get better performance. I've used the ImageMagick version for ease of use.
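For the Ghostscript-only route mentioned above, a minimal Python sketch could look like the following. It assumes the Ghostscript console executable is reachable as gswin64c (the usual name on 64-bit Windows; on Linux it is typically gs). The function names and file paths are made up for illustration.

```python
import subprocess


def ghostscript_pdf_to_jpg_cmd(pdf_path, out_pattern, dpi=150, gs_exe="gswin64c"):
    """Build a Ghostscript command that renders each PDF page to a JPG file.

    out_pattern should contain %03d (or similar) so every page gets its own file.
    gs_exe is an assumption: adjust it to match your installation.
    """
    return [
        gs_exe,
        "-dNOPAUSE",        # do not pause between pages
        "-dBATCH",          # exit once the input has been processed
        "-sDEVICE=jpeg",    # render with the JPEG output device
        f"-r{dpi}",         # resolution in dots per inch
        "-dJPEGQ=100",      # JPEG quality, 0-100
        f"-sOutputFile={out_pattern}",
        pdf_path,
    ]


def run(cmd):
    # check=True raises CalledProcessError if Ghostscript reports a failure
    subprocess.run(cmd, check=True)
```

Calling run(ghostscript_pdf_to_jpg_cmd(r"Watermarking\test.pdf", r"Watermarking\test-%03d.jpg")) would then write one JPG per page.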

Once everything is installed, you can use the following command to convert a PDF to JPG, assuming there is a Watermarking directory in your ImageMagick folder:

convert -density 150 -trim Watermarking\test.pdf -quality 100 -flatten -sharpen 0x1.0 Watermarking\test.jpg

You can customize the quality using different parameter values. For more info, see: https://www.imagemagick.org/script/convert.php
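To experiment with those parameter values, the conversion command above can be built from a small Python function. This is only a sketch: the function name and its defaults are illustrative, and it assumes the ImageMagick convert executable is the one found on the PATH.

```python
import subprocess


def pdf_to_jpg_cmd(pdf_path, jpg_path, density=150, quality=100, convert_exe="convert"):
    """Build the PDF to JPG conversion command with tunable density and quality."""
    return [
        convert_exe,
        "-density", str(density),  # render resolution in DPI; higher gives a larger image
        "-trim",                   # crop uniform borders
        pdf_path,
        "-quality", str(quality),  # JPEG quality, 1-100
        "-flatten",                # composite everything read so far onto one canvas
        "-sharpen", "0x1.0",
        jpg_path,
    ]


def convert_pdf(pdf_path, jpg_path, **options):
    # Passing a list (no shell) avoids quoting problems with paths that contain spaces
    subprocess.run(pdf_to_jpg_cmd(pdf_path, jpg_path, **options), check=True)
```

For example, convert_pdf(r"Watermarking\test.pdf", r"Watermarking\test.jpg", density=300, quality=85) trades a larger render for a smaller file.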

Next we can add a watermark to the resulting image:

convert -font Arial -pointsize 40 -size 430x270 xc:none -fill #80808080 -gravity NorthWest -draw "rotate 15 text 5,0 'Mr XYZ Company ABC'" miff:- | composite -tile - Watermarking\test.jpg Watermarking\watermarked_test.jpg

You can read more about the parameters used above at the following URL: http://www.imagemagick.org/Usage/annotating/#wmark_text
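The same two-stage pipeline (draw one transparent tile, then pipe it into composite) can be driven from Python. The sketch below mirrors the command above; the function names are hypothetical, and it assumes both convert and composite resolve to the ImageMagick executables.

```python
import subprocess


def tile_cmd(text, font="Arial", pointsize=40, tile_size="430x270",
             fill="#80808080", angle=15, convert_exe="convert"):
    """Command that draws one rotated text tile and writes it to stdout as MIFF."""
    # Note: a single quote inside text would break the -draw expression
    draw = f"rotate {angle} text 5,0 '{text}'"
    return [convert_exe, "-font", font, "-pointsize", str(pointsize),
            "-size", tile_size, "xc:none", "-fill", fill,
            "-gravity", "NorthWest", "-draw", draw, "miff:-"]


def composite_cmd(src, dst, composite_exe="composite"):
    """Command that tiles the image read from stdin over src and writes dst."""
    return [composite_exe, "-tile", "-", src, dst]


def watermark(src, dst, text):
    # Equivalent of "convert ... miff:- | composite -tile - src dst"
    tile = subprocess.Popen(tile_cmd(text), stdout=subprocess.PIPE)
    with tile:  # closes the pipe and waits for convert on exit
        subprocess.run(composite_cmd(src, dst), stdin=tile.stdout, check=True)
    if tile.returncode != 0:
        raise subprocess.CalledProcessError(tile.returncode, tile.args)
```

A call such as watermark(r"Watermarking\test.jpg", r"Watermarking\watermarked_test.jpg", "Mr XYZ Company ABC") would reproduce the command-line version.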

Example pre-watermark: [image]

Example post-watermark: [image]


With the parameters in the command line you can, among other things, set how densely the text is tiled, the size of the text, the size of the canvas the text is first drawn on, the rotation of the text, and the color of the font.
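How dense the watermark looks follows from simple arithmetic: the page's pixel size depends on the density used for conversion, and the tile canvas size decides how many repetitions fit. The sketch below works this out for an A4 page (210 x 297 mm), which is an assumption about the scanned document, and it ignores any cropping done by -trim.

```python
import math


def page_pixels(width_mm, height_mm, density):
    """Pixel size of a page rendered at the given density (DPI), before any trimming."""
    return (round(width_mm / 25.4 * density), round(height_mm / 25.4 * density))


def tile_grid(page_px, tile_px):
    """Columns and rows of tiles needed to cover the page (edge tiles get clipped)."""
    return (math.ceil(page_px[0] / tile_px[0]), math.ceil(page_px[1] / tile_px[1]))


a4 = page_pixels(210, 297, 150)
print(a4)                         # (1240, 1754)
print(tile_grid(a4, (430, 270)))  # (3, 7)
```

So with a 430x270 canvas the text repeats across roughly 3 columns and 7 rows; a smaller canvas gives a denser pattern, as long as the text still fits inside it.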

Once you have these commands, it's easy to parameterise them and use PowerShell, batch, or another scripting tool to run them for all PDFs/JPGs in a given folder.
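As one possible shape for such a script, here is a minimal Python sketch of the folder loop. The helper names and the watermarked_ prefix naming are illustrative choices, and convert_one and watermark_one stand for whatever functions actually run the two commands.

```python
from pathlib import Path


def plan_jobs(folder):
    """Return (pdf, jpg, watermarked_jpg) path triples for every PDF in folder."""
    jobs = []
    for pdf in sorted(Path(folder).glob("*.pdf")):
        jpg = pdf.with_suffix(".jpg")
        marked = pdf.with_name(f"watermarked_{pdf.stem}.jpg")
        jobs.append((pdf, jpg, marked))
    return jobs


def process_folder(folder, convert_one, watermark_one):
    """Convert and watermark every PDF in folder; return the number of files handled.

    convert_one(pdf, jpg) and watermark_one(jpg, marked) are supplied by the caller.
    """
    count = 0
    for pdf, jpg, marked in plan_jobs(folder):
        convert_one(pdf, jpg)       # PDF -> JPG
        watermark_one(jpg, marked)  # JPG -> watermarked JPG
        count += 1
    return count
```

Keeping the path planning separate from the processing makes it easy to print the planned jobs first and check the output names before anything is overwritten.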

Troubleshooting:
- On Windows you might want to pass the full path to ImageMagick's convert.exe; otherwise Windows may run a different convert.exe, a system utility that converts FAT volumes to NTFS.
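A script can guard against this mix-up by resolving the executable up front. The sketch below is one way to do it in Python; the function names are hypothetical, and it assumes the system tool lives somewhere under the directory named by the SystemRoot environment variable.

```python
import os
import shutil


def is_system_tool(exe_path, system_root):
    """True if exe_path lies inside system_root (e.g. C:\\Windows)."""
    exe = os.path.normcase(os.path.abspath(exe_path))
    root = os.path.normcase(os.path.abspath(system_root))
    return exe.startswith(root + os.sep)


def find_convert(install_dir=None):
    """Return the full path to ImageMagick's convert, or None if it cannot be trusted.

    install_dir, if given, is searched instead of the PATH.
    """
    found = shutil.which("convert", path=install_dir)
    if found is None:
        return None
    system_root = os.environ.get("SystemRoot")  # set on Windows
    if system_root and is_system_tool(found, system_root):
        return None  # this is the filesystem utility, not ImageMagick
    return found
```

If find_convert() returns None, the script can stop with a clear message instead of accidentally invoking the wrong program.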
