Paperless Office using the Raspberry Pi
This is a follow-up to an older blog post that used Ubuntu.

Creative Commons Attribution 2.0 Generic License (http://creativecommons.org/licenses/by/2.0/) by rosmary
Raspberry Pi Prerequisites
Since this will be a purely headless install designed to sit in a corner behind the scanner, I am using a base Raspbian (Debian Wheezy) install. I personally like the clean minimal install via https://github.com/debian-pi/raspbian-ua-netinst best.
apt-get install sudo vim wget wput libusb-dev build-essential git-core
Add non-privileged user account(s)
adduser USERNAME
adduser USERNAME sudo
groupadd scanner
usermod -a -G scanner USERNAME
Install Sane
The version of SANE in the Raspbian repos does not work with the Fujitsu ScanSnap range, so it needs to be built from source.
git clone git://git.debian.org/sane/sane-backends.git
cd sane-backends
BACKENDS=epjitsu ./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var
make
make install
Install S1300i Driver
You need to get the driver file (‘1300i_0D12.nal’) from the CD that came with the scanner. If you still have access to a CD-ROM drive, that is. :(
mkdir -p /usr/share/sane/epjitsu/
cp 1300i_0D12.nal /usr/share/sane/epjitsu/
Check /etc/sane.d/epjitsu.conf and see if the following lines are there (in my case they were already created by the SANE build).
# Fujitsu S1300i
firmware /usr/share/sane/epjitsu/1300i_0D12.nal
usb 0x04c5 0x128d
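If you would rather script that check than open the file by hand, something like the following should do. It is only a minimal sketch: `has_firmware_entry` is a helper name I made up for illustration (it is not part of SANE), and the paths are the ones used in this post.

```shell
#!/bin/bash
# Sketch: report whether a SANE epjitsu config references a firmware file.
# has_firmware_entry is a hypothetical helper, not a SANE tool.
has_firmware_entry() {
    local conf_file="$1" firmware_path="$2"
    # Accept only uncommented lines of the form: firmware <path>
    awk -v p="$firmware_path" \
        '$1 == "firmware" && $2 == p { found = 1 } END { exit !found }' \
        "$conf_file"
}

CONF=/etc/sane.d/epjitsu.conf                      # path used in this post
FIRMWARE=/usr/share/sane/epjitsu/1300i_0D12.nal    # path used in this post

if [ -r "$CONF" ] && has_firmware_entry "$CONF" "$FIRMWARE"; then
    echo "firmware entry present"
else
    echo "firmware entry missing, add it to $CONF"
fi
```

Commented-out entries (lines starting with #) are deliberately not counted, since the backend ignores them as well.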
Verify that the scanner is detected on the USB bus:
sane-find-scanner -q
found USB scanner (vendor=0x04c5 [FUJITSU], product=0x128d [ScanSnap S1300i]) at libusb:001:004
found USB scanner (vendor=0x0424, product=0xec00) at libusb:001:003
scanimage -L
device `epjitsu:libusb:001:004' is a FUJITSU ScanSnap S1300i scanner
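If more than one SANE device shows up, it can help to pull the device name out of that listing and pass it to scanimage explicitly with -d. The sketch below assumes the `scanimage -L` line format shown above; `device_names` is a hypothetical helper name of my own.

```shell
#!/bin/bash
# Sketch: extract device names from `scanimage -L` style output on stdin.
# device_names is a hypothetical helper; it assumes lines of the form
#   device `NAME' is a VENDOR MODEL scanner
device_names() {
    local line name
    while IFS= read -r line; do
        case $line in
            "device \`"*)
                name=${line#device \`}          # drop the leading "device `"
                printf '%s\n' "${name%%\'*}"    # keep text before the closing quote
                ;;
        esac
    done
}

# Only query the hardware if scanimage is installed.
if command -v scanimage >/dev/null 2>&1; then
    SCANNER=$(scanimage -L | device_names | head -n 1)
    echo "first device: ${SCANNER:-none found}"
fi
```

The resulting name can then be used as `scanimage -d "$SCANNER" ...`. Keep in mind that the libusb bus and device numbers in the name can change when the scanner is replugged, so it is safer to look the name up each time than to hard-code it.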
Copy the libsane rules from the SANE build directory to the udev rules directory.
sudo cp sane-backends/tools/udev/libsane.rules /etc/udev/rules.d/60-libsane.rules
Log out and log in as the non-privileged user account previously created.
If the scanimage -L command works as above you have fully configured the scanner to work under that user account.
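If scanimage -L does not list the scanner for that account, one thing worth ruling out is group membership, since group changes only take effect after a fresh login. A minimal sketch, assuming the scanner group created earlier; `user_in_group` is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: check whether a user belongs to a given group.
# user_in_group is a hypothetical helper, not a standard tool.
user_in_group() {
    local user="$1" group="$2"
    # id -nG prints the user's group names separated by spaces;
    # split them onto lines and look for an exact match.
    id -nG "$user" 2>/dev/null | tr ' ' '\n' | grep -Fxq "$group"
}

# Assumption: this runs as the account that should have scanner access.
TARGET_USER=$(id -un)
if user_in_group "$TARGET_USER" scanner; then
    echo "$TARGET_USER is in the scanner group"
else
    echo "$TARGET_USER is not in the scanner group yet"
fi
```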
Start saned on boot-up
Edit the /etc/rc.local file and add the following line before the ‘exit 0’ line so that saned is started as the non-privileged user after a reboot.
saned -a USERNAME
Installing Conversion Tools
sudo apt-get install imagemagick bc exactimage pdftk tesseract-ocr tesseract-ocr-eng unpaper
You can add other languages such as tesseract-ocr-deu if you require OCR support for those.
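To see whether a language pack actually got installed, you can check the output of `tesseract --list-langs`. The sketch below is hedged accordingly: `lang_installed` is a hypothetical helper, and the redirect of stderr is there because, as far as I know, older tesseract releases print the listing on stderr.

```shell
#!/bin/bash
# Sketch: check a `tesseract --list-langs` style listing (read from stdin)
# for an exact language code. lang_installed is a hypothetical helper.
lang_installed() {
    grep -Fxq "$1"
}

# Only query tesseract if it is installed.
if command -v tesseract >/dev/null 2>&1; then
    for lang in eng deu; do
        if tesseract --list-langs 2>&1 | lang_installed "$lang"; then
            echo "$lang: installed"
        else
            echo "$lang: missing (try the tesseract-ocr-$lang package)"
        fi
    done
fi
```

Once a second pack is installed, tesseract should accept several languages joined with a plus sign, for example -l eng+deu.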
Scan to Repository Script
The script is hosted on Github: https://github.com/leogaggl/misc-scripts/blob/master/scan2repo.sh
#!/bin/bash
# ------------------------------------------------------------------
# [Author] Leo Gaggl
# http://www.gaggl.com
# ©2014 - SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND.
# License GPL V2 - details see attached LICENSE file
# This script scans a document from the ADF, cleans up the pages,
# runs OCR on them and converts each page to a searchable PDF.
# https://gaggl.com
#
# Dependency:
# sane (scanimage) - configured for the scanner
# imagemagick, unpaper, tesseract-ocr, exactimage (hocr2pdf)
# ------------------------------------------------------------------
IMAGEDIR=/home/pi/wildlife-rpi-cam/
OUT_DIR=~/scan
TMP_DIR=$(mktemp -d)
FILE_NAME=scan_$(date +%Y%m%d-%H%M%S)
LANGUAGE="eng"
echo 'scanning...'
scanimage --resolution 300 \
--batch="$TMP_DIR/scan_%03d.pnm" \
--format=pnm \
--mode Gray \
--source 'ADF Duplex'
echo "Output saved in $TMP_DIR/scan*.pnm"
cd "$TMP_DIR"
# cut borders
echo 'cutting borders...'
for i in scan_*.pnm; do
mogrify -shave 50x5 "${i}"
done
# check if there are blank pages
echo 'checking for blank pages...'
for f in ./*.pnm; do
unpaper --size "a4" --overwrite "$f" "$(echo "$f" | sed 's/scan/scan_unpaper/g')"
# need to rename and delete the original, since newer versions of unpaper can't write to the same file name
rm -f "$f"
done
# apply text cleaning and convert to tif
echo 'cleaning pages...'
for i in scan_*.pnm; do
echo "${i}"
convert "${i}" -contrast-stretch 1% -level 29%,76% "${i}.tif"
done
# Starting OCR
echo 'doing OCR...'
for i in scan_*.pnm.tif; do
echo "${i}"
tesseract "$i" "$i" -l "$LANGUAGE" hocr
hocr2pdf -i "$i" -s -o "$i.pdf"