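#!/bin/bash
# Script to convert PDF bank statements to CSV files.
# These files may have white space in their file names.

# Replace white space in file names with underscores (-n: never overwrite).
while IFS='' read -r -d '' fname; do
    nname="${fname##*/}"
    mv -v -n "${fname}" "${fname%/*}/${nname//[[:space:]]/_}"
done < <(find "$(pwd)" -name "* *" -type f -print0)

# Extract the text of every PDF in the current directory into dd.txt.
rm -f dd.txt
for l in *.pdf; do
    [ -e "$l" ] || continue    # skip the literal pattern when no PDF matches
    echo "processing $l"
    ps2ascii "$l" >> dd.txt
done
view dd.txt

# Keep only lines containing a date such as "12 Jan", then drop commas
# between digits (thousands separators). The loop (:a ... ta) repeats the
# substitution so that numbers such as 1,234,567 lose every comma.
# Note: \+ is a GNU sed extension.
sed -n '/.*[0-9][0-9] \+[A-Z][a-z][a-z]/p' dd.txt \
    | sed -e ':a' -e 's/\([0-9]\),\([0-9]\)/\1\2/' -e 'ta' \
    > ddd.txt
view ddd.txt

# Manual step: edit m_spreadsheet.txt by hand. The script does not create
# this file; presumably the lines from ddd.txt are copied in here.
vi m_spreadsheet.txt

# Insert column separators and write the CSV file.
sed 's/\(Opening balance\)/,,,,,,,\1/' m_spreadsheet.txt \
    | sed 's/\(Statement period\)/,,,,,,,\1/' \
    | sed 's/\(DR INTEREST\)/,,,,,,,\1/' \
    | sed 's/\(MONTHLY ACCOUNT FEE\)/,,,,,,,\1/' \
    | sed 's/\([0-9]\+\.[0-9]\+\) OD/\1,OD/' \
    | sed 's/.*\([0-9][0-9] \+[A-Z][a-z][a-z]\) \+\([A-Z][A-Z]\)/\1,\2/' \
    | sed 's/\(Your accounts at a glance as at\)/,,,,,,,\1/' \
    > m_spreadsheet.csv
# view m_spreadsheet.csv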
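
# Sketch only, not part of the original workflow: the rename rule used by the
# first loop, wrapped in a function so it can be tried on a sample path
# without touching any files. The name underscore_path is hypothetical, and
# the path is assumed to contain at least one "/" (as find "$(pwd)" output does).
underscore_path() {
    local fname=$1
    local nname="${fname##*/}"    # file name without its directory part
    # Only the file name changes; white space in directory names is kept.
    printf '%s\n' "${fname%/*}/${nname//[[:space:]]/_}"
}
# Usage: underscore_path "/tmp/my docs/jan 2020.pdf"
# prints /tmp/my docs/jan_2020.pdf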