10.10 Desktop Applications

10.10.1 Spreadsheets

Spreadsheet applications present a two-dimensional view of structured data whose field values may be mutually dependent. On the Emacspeak desktop, a speech-enabled spreadsheet application can be used to manipulate such data-driven documents, ranging from simple cheque books and expense reports to complex investment portfolios. Whereas the traditional visual interface to spreadsheets is typically independent of the semantics of the data stored in the spreadsheet, the speech-enabled interface is derived from the meaning of the various fields making up the data. When such information is presented on a visual display, implicit visual layout can be used to cue the user to the meaning of different data fields.

On the other hand, in the case of an actively scrolling auditory display, the spoken output needs to explicitly convey both the value and interpretation of the different data items. In addition, the interface needs to enable an active dialog between user and application where the user is able to query the system about the possible meaning of a particular item of data.

Finally, the aural interface needs to enable multiple views of the display. In the visual interface, such multiple views are automatically enabled by the two-dimensional layout combined with the eye’s ability to move rapidly around the layout structure. Thus, while viewing any particular row of a portfolio, one can immediately see the current total value as well as the net gain or loss.

The Emacs spreadsheet package dismal can be retrieved from ftp://cs.nyu.edu/pub/local/fox/dismal.
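
The idea that spoken output must carry both the value and its interpretation, and that the listener needs more than one view of a row, can be illustrated with a short Python sketch. This is not how dismal or Emacspeak implements it; the Holding record, the SPOKEN_LABELS table, the function names and the sample figures are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass
class Holding:
    """One row of a hypothetical portfolio sheet."""
    symbol: str
    shares: int
    cost_per_share: float
    price: float


# Hypothetical spoken labels; a real interface would derive these
# from the meaning the sheet assigns to each column.
SPOKEN_LABELS = {
    "symbol": "ticker symbol",
    "shares": "number of shares",
    "cost_per_share": "cost per share",
    "price": "current price",
}


def speak_cell(holding: Holding, field: str) -> str:
    """Return text that states what the value means as well as the value."""
    return f"{SPOKEN_LABELS[field]}: {getattr(holding, field)}"


def speak_summary(holding: Holding) -> str:
    """Return a second 'view' of the row: total value and net gain or loss."""
    value = holding.shares * holding.price
    gain = value - holding.shares * holding.cost_per_share
    direction = "gain" if gain >= 0 else "loss"
    return (f"{holding.symbol}: current value {value:.2f}, "
            f"net {direction} {abs(gain):.2f}")


if __name__ == "__main__":
    row = Holding("ACME", 10, 5.0, 7.5)  # made-up sample data
    print(speak_cell(row, "shares"))
    print(speak_summary(row))
```

Here speak_cell answers the question "what does this item mean?", while speak_summary stands in for the glance a sighted user would take at the totals columns.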

10.10.2 Forms Mode

Forms mode, an Emacs mode designed to edit structured data records such as a line from file /etc/passwd, presents a user-friendly visual interface that displays the field name along with the field value. The user can edit the field value and save the file, at which point the data is written out using the underlying colon-delimited (:) representation. Forms mode provides a flexible interface for associating meaning with the fields of such structured data files. For details on its use, see the forms-mode section of the online Emacs info documentation.
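
The round trip described above (split a delimited record into named fields, present each field with its name, then write the record back in the delimited form) can be sketched in Python as follows. This only illustrates the idea; forms mode itself is configured in Emacs Lisp and is not implemented this way. The field names follow the usual seven-field passwd layout, and the sample record and function names are made up for the example.

```python
# Field names for the seven colon-separated fields of a passwd record.
PASSWD_FIELDS = ("login", "password", "uid", "gid", "gecos", "home", "shell")


def parse_record(line: str, sep: str = ":") -> dict:
    """Split one delimited line into a dict keyed by field name."""
    values = line.rstrip("\n").split(sep)
    if len(values) != len(PASSWD_FIELDS):
        raise ValueError(
            f"expected {len(PASSWD_FIELDS)} fields, got {len(values)}")
    return dict(zip(PASSWD_FIELDS, values))


def describe_field(record: dict, name: str) -> str:
    """Present a field the way a form would: name first, then value."""
    return f"{name}: {record[name]}"


def format_record(record: dict, sep: str = ":") -> str:
    """Write the record back in its underlying delimited representation."""
    return sep.join(record[name] for name in PASSWD_FIELDS)


if __name__ == "__main__":
    # Made-up sample record, not taken from any real system.
    rec = parse_record("jdoe:x:1000:1000:Jane Doe:/home/jdoe:/bin/bash")
    print(describe_field(rec, "shell"))
    rec["shell"] = "/bin/zsh"   # edit one field value
    print(format_record(rec))   # save in colon-delimited form
```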

10.10.3 OCR — Reading Print Documents

Module emacspeak-ocr implements an OCR (Optical Character Recognition) front-end for the Emacspeak desktop.

The page image is acquired using tools from the SANE (Scanner Access Now Easy) package. The acquired image is run through the OCR engine if one is available, and the results are placed in a buffer suitable for browsing them. This buffer is placed in emacspeak-ocr-mode, a specialized mode for reading and scanning documents.

10.10.3.1 Emacspeak OCR Mode

Emacspeak OCR mode is a special major mode for document scanning and OCR.

Pre-requisites:

  • A working scanner back-end like SANE on Linux.
  • An OCR engine.

Make sure your scanner back-end works, and that you have the utilities to scan a document and acquire an image as a tiff file. Then set variable emacspeak-ocr-scan-image-program to point at this utility. By default, this is set to ‘scanimage’, which is the image scanning utility provided by SANE.

By default, this front-end attempts to compress the acquired tiff image; make sure you have a utility like tiffcp. Variable emacspeak-ocr-compress-image is set to ‘tiffcp’ by default; if you use something else, you should customize this variable.

Next, make sure you have an OCR engine installed and working. By default this front-end assumes that OCR is available as /usr/bin/ocr.

Once you have ensured that acquiring an image and applying OCR to it work independently of Emacs, you can use this Emacspeak front-end to enable easy OCR access from within Emacspeak.
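
Before launching the front-end, it can help to confirm that the three external tools are actually installed. The following Python sketch does this outside Emacs. The tool names are the defaults stated above (‘scanimage’, ‘tiffcp’ and /usr/bin/ocr); the role names and function names are hypothetical, and the check only confirms that each program can be found and is executable, not that the scanner or OCR engine produces correct results.

```python
import shutil

# Defaults as described in the text; adjust to match your own setup.
DEFAULT_TOOLS = {
    "scan": "scanimage",     # image acquisition (SANE)
    "compress": "tiffcp",    # tiff compression
    "ocr": "/usr/bin/ocr",   # OCR engine
}


def find_tools(tools=None):
    """Map each role to the resolved executable path, or None if not found."""
    tools = DEFAULT_TOOLS if tools is None else tools
    return {role: shutil.which(cmd) for role, cmd in tools.items()}


def missing_tools(tools=None):
    """Return the sorted list of roles whose program could not be found."""
    return sorted(role for role, path in find_tools(tools).items()
                  if path is None)


if __name__ == "__main__":
    for role, path in find_tools().items():
        print(f"{role}: {path if path else 'NOT FOUND'}")
```

If missing_tools() returns an empty list, all three programs were located; any role it does list points at the variable that needs customizing or the package that still needs installing.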

The Emacspeak OCR front-end is launched by command emacspeak-ocr, bound to C-e C-o.

This command switches to a special buffer that has OCR commands bound to single keystrokes; see the key-binding list at the end of this description. Use the Emacs online help facility to look up help on these commands.

Mode emacspeak-ocr-mode provides the necessary functionality to scan, OCR, read and save documents. By default, scanned images and the resulting text are saved under directory ~/ocr; see variable emacspeak-ocr-working-directory. Invoking command emacspeak-ocr-open-working-directory, bound to d, opens this directory.

By default, the document being scanned is named ‘untitled’. You can name the document by using command emacspeak-ocr-name-document, bound to n. The document name is used in constructing the names of the image and text files.

Here is a list of all Emacspeak OCR commands along with their key-bindings and a brief description:

digit

emacspeak-ocr-page Jumps to specified page in the OCR output.

C

emacspeak-ocr-set-compress-image-options Interactively update image compression options. Prompts with current setting in the minibuffer. Setting persists for current Emacs session.

I

emacspeak-ocr-set-scan-image-options Interactively update scan image options. Prompts with current setting in the minibuffer. Setting persists for current Emacs session.

SPC

emacspeak-ocr-read-current-page Speaks current page.

s

emacspeak-ocr-save-current-page Saves current page as a text file.

p

emacspeak-ocr-page Prompts for a page number and moves to the specified page.

]

emacspeak-ocr-forward-page Move forward to the next page.

[

emacspeak-ocr-backward-page Move back to the previous page.

d

emacspeak-ocr-open-working-directory Open directory containing the results of OCR.

n

emacspeak-ocr-name-document Name current document.

o

emacspeak-ocr-recognize-image Launch OCR engine on a scanned image.

i

emacspeak-ocr-scan-image Acquire an image using scanimage.

RET

emacspeak-ocr-scan-and-recognize Scan and recognize a page.

w

emacspeak-ocr-write-document Write all pages of current document to a text file.

q

bury-buffer Bury the OCR buffer.

c

emacspeak-ocr-customize Customize Emacspeak OCR settings.

?

describe-mode Describe OCR mode.