Acrobat, PDF, and Excel Spreadsheets
Microsoft™ Excel™ is not a database. But it is used like one and Excel is probably the most common local data storage mechanism in use today. As such, it is very important to be able to move data between Excel spreadsheets and PDF forms. Fortunately, Acrobat provides several different mechanisms for performing this task for both LiveCycle (XFA) forms and traditional AcroForms. Each mechanism addresses different workflow needs. A general discussion is provided here and implementation specifics for each methodology are provided in the articles linked below.
Using Excel with Acrobat
Because Excel is widely available and easy to use, it is massively helpful to be able access an Excel spreadsheet from Acrobat. However, since Excel not a true database application there are some restrictions on how it can accessed by Acrobat.
These restrictions are not as harsh and onerous as they may seem at first glance. If you already have data in an Excel file and a Windows platform with Acrobat Pro, then this is a perfect solution for many different kinds of automation and workflow projects. For example, pre-populating customer forms or customizing documents with customer information and then emailing them out. All of this can be done in a single automation script.
- By definition, importing and exporting data to/from a PDF touches the user's local system, so all Excel access operations are restricted by Acrobat. All methods require the user's permission, and some methods require privileged access. Some require special enabling (not available in Acrobat Pro) to operate in Adobe Reader, and some will not operate in Reader at all. In general, database access, especially Excel file access, is an automation activity to be performed on a single machine running Acrobat Pro. This is not a solution that would ever be used on a generally distributed form.
- An Excel spreadsheet can only have a single connection, or single user at one time. This restriction includes the design-time applications such as Excel, LiveCycle Designer, and Acrobat. When developing with a spreadsheet you will have to close it in one application before opening it in another. Because of this restriction, the development process and the end user experience with Excel can be slightly awkward. Execution errors and failed connections may be the result of the spreadsheet being locked by another application rather than a real error. This is very different from a true database which allows multiple connections.
- A spreadsheet works best for a single user on a local system. It can be placed on a shared network drive and accessed by multiple users. But since it is not a true database and does not handle concurrent accesses, this usage model is very problematic.
- The most generic method of reading and writing Excel data with a PDF form is importing and exporting a 'Tab Delimited File'. This method does not require special drivers, does not use the Excel application as an intermediary, and works on both Windows and Mac platforms. However, the 'Tab Delimited File' format is not a standard Excel file format, so this is normally a two-step process. Files are moved from Excel to 'Tab Delimited', then into Acrobat, or visa-versa. Not only is this a bit awkward, but there are also a number of other restrictions. See the article on importing/exporting below for details.
- All but one of the access mechanisms rely on Windows based drivers (see previous point). This makes using Acrobat with Excel a mostly Windows solution.
- To be used with a database interface, i.e. ODBC, the data in a particular sheet has to be named. The database interface expects to access the data through a table. An Excel file is not database, so it does not have proper table definitions. Naming the data in a worksheet provides a pseudo-table that the database driver can handle. "Tables" in newer versions of Excel are not the same as Named data. See the section below for details on how to Name data in a worksheet.
- ODBC uses standard SQL syntax to interact with a database connection. Since Excel is not a database application, some but not all SQL commands will work. It's best to stick with the basics: "SELECT", "UPDATE", "DELETE", and "INSERT."
- There are many variations in the Excel workbook format and there was a major change after Excel 2003. This has an impact on the type of drivers and the types of interaction that can be used with a particular Excel file. For example, there is no OLE DB driver for Excel, but for early ".xls" files the Jet 4.0 OLE driver can be used to read and write to the file. Unfortunately, Acrobat 9 does not allow any connection to Jet 4.0 for security reasons. Acrobat X does not allow any ODBC connections in AcroForms. Later ODBC drivers can be used to read from an early ".xls" file, but they cannot write to it. The moral of the story is that an Excel Workbook file should be accessed with a matching version of the driver, and a compatible version of Acrobat. They all have to line up.
Ways to Connect a Form to Excel Data
There are 3 basic methods for connecting to an Excel File.
The specifics for implementing each method are given in the articles below. The first and last methods both require that the Excel file has some special setup. Please read the "Setting up an Excel File for DB Access" article first.
- Direct Import and Export from/to a Tab Delimited File - Tab delimited files are not exactly Excel, but they are Excel compatible. So this is an easy way to extract info from Excel data directly into fields on an AcroForm PDF. It is not so good for exporting data to Excel. The big advantage is that it works on Macintosh as well as Windows.
- LiveCycle Data Connection - LiveCycle has a built in mechanism for connecting to a database through Windows OLE drivers. Using this mechanism there are two ways to connect to an Excel file.
- Through ODBC - This is the most generic and reliable method
- Through Jet 4.0 OLE driver - This method only works for Pre Excel 2003 files and on Acrobat 8 and earlier. It's also an awkward and not always reliable methodology.
Acquiring Raw File Data|
External data, i.e., data outside of Acrobat or a PDF file, is often a very important part of a workflow process. For example, information on customers, products, employees, etc. are typically stored in Excel files, databases or on a server. One of the most common issues with automating such a workflow process is getting the data from the external file or data source into the automation script. This article provides techniques and script examples for acquiring external data. . . .
Importing and Exporting Excel Data|
This article explains exactly how to transfer data, in both directions, between an excel file and Acrobat. Scripts are provided for importing and exporting in a variety of scenarios, including a looping scenerio for performing variable data operations and mail merge. . . .
Registering an Excel File with ODBC|
ODBC is the standard Windows database access service. It is built-in to every modern version of the Windows operating system. When a database is registered with ODBC, it is accessible to . . .