PDF2Office Professional 5.0.4
Function: Convert PDF files to Microsoft Office.
Developer: Recosoft Corporation
Requirements: Mac OS X 10.4.11. Universal.
Trial: Feature-limited and time-limited (see below).
Since Mac OS X first came out, every Mac user has been able to create PDF documents from any printable file. PDF files are indeed portable: you can distribute them to your audience without worrying about whether they can read them, since Adobe Reader is freely available. On some occasions, however, you may need to convert PDF files back to some other format that you can edit.
Since Microsoft Office has such a stranglehold on the office suite market, the formats to convert back to would most likely be Word, Excel, or other Microsoft Office–supported formats. While the default PDF viewer for Mac OS X, Preview, allows you to copy text out of a PDF, the result is far from perfect even when you paste it as styled text. You may need dedicated software like PDF2Office. Or do you?
Installing PDF2Office takes a little more than the usual drag and drop of the application file to the hard drive. You first need to run the installer application, which then asks you to locate your copy of Microsoft Word. You are also prompted to select a folder in which to install PDF2Office, but you cannot choose one on an external drive. Lastly, when installation is complete, you are supposed to log out and then log in again for the changes to take effect.
PDF2Office has a trial version, but it is both feature-limited and time-constrained. Only five conversions are allowed, all output is watermarked, and only up to five pages are converted. And the trial version works for only seven days.
How I Worked PDF2Office
As you may have gathered, I do not create PDF files using Adobe Acrobat, the full software. Instead, I used Mac OS X’s “Save as PDF” option to generate the PDF files needed for this review. I would create a document in Word or Excel, print it, and choose “Save as PDF” to capture the output in a PDF file, then feed the PDF file to PDF2Office. I only have Microsoft Office v.X, so to test PDF2Office’s handling of the Office 2007 format I had to use a Dell Windows machine, which also has some sort of PDF printer driver.
How PDF2Office Works
PDF2Office converts a file from PDF to Word, Excel, or PowerPoint. It can also output Web and graphics files, but my interest in PDF2Office is all about being able to edit the text in the documents after the conversion, so I focus only on the Office formats. Even the lowly Preview can convert PDFs to graphic formats.
First you set the folder in which to store converted documents and choose whether to overwrite existing output files or append a unique number to the new ones. Then you load documents into PDF2Office in one of several ways, set an output format, highlight the documents, and click the Convert button. PDF2Office does a good job of reproducing the document after it has gone through a PDF conversion. Even with Word’s Paste Special ‣ Styled Text, there is still some work left to do. PDF2Office’s version is almost perfect, if you can overlook the “wrong” bullet style. (More on the bullets later.) The one other flaw is vertical spacing: the lines are bunched together.
If you can overlook the change in bullet style, the conversion of the PDF of the document (left) back to Word (right) is almost perfect.
You have options, such as how to maintain the arrangement of styles and tables, as well as the treatment of hyphens, font substitutions, and more. My favorite option involves converting the PDF into an editable static form. The PDF becomes a background image of the Word document, and you can edit the file as if on a typewriter. You use space and return characters to get to where you want, then check off boxes (perhaps with an X) or fill out forms.
You can load multiple documents and select them together for mass conversion. You can also use the Batch Convert option to convert all files located in a particular folder. With mass conversion, the output format must be uniform (e.g. everything converted to Word X, not some files converted to Excel and others to RTF). Some power users may bemoan the lack of a choice to convert different files to different formats. I think the way PDF2Office works is fine. Let’s keep it simple!
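For readers who like to see such rules spelled out, here is a minimal sketch of the behavior described above: every PDF in a folder maps to one uniform output format, and a name that would collide with an existing file either overwrites it or gets a number appended. This is not PDF2Office's code. The function name plan_batch_outputs, its parameters, and the "-1", "-2" numbering scheme are all assumptions made for illustration; the review does not say how PDF2Office actually forms its unique numbers.

```python
from pathlib import Path


def plan_batch_outputs(pdf_names, existing_names, extension=".doc", overwrite=False):
    """Map each PDF name to an output name, all sharing one extension.

    Hypothetical helper for illustration only; it plans names and
    converts nothing.

    pdf_names:      names of the PDF files found in the source folder
    existing_names: names already present in the output folder
    extension:      the single output format applied to the whole batch
    overwrite:      if False, append a number instead of reusing a name
    """
    taken = set(existing_names)
    plan = {}
    for name in sorted(pdf_names):
        stem = Path(name).stem
        candidate = stem + extension
        if not overwrite:
            # Assumed numbering scheme: stem-1, stem-2, ...
            counter = 1
            while candidate in taken:
                candidate = f"{stem}-{counter}{extension}"
                counter += 1
        taken.add(candidate)
        plan[name] = candidate
    return plan


if __name__ == "__main__":
    plan = plan_batch_outputs(["report.pdf", "budget.pdf"], ["report.doc"])
    for source, target in plan.items():
        print(source, "->", target)
```

Because the extension is a single argument for the whole call, the sketch mirrors the one-format-per-batch rule: converting some files to Word and others to Excel would simply take two separate runs.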
Even with my method of generating PDF files for PDF2Office to convert back, I unconsciously expected the conversion back to Word to be perfect, but it was not meant to be. Headers and footers become part of the page, not separate entities to be edited differently. Spreadsheets that span more than one page become a series of pages, not one continuous spread. PDF2Office can convert only what is in the PDF file. To PDF2Office’s credit, with Excel conversion there is an option to turn pages into Excel sheets.
I also unconsciously expected my old copy of Word X to be bug-free and thought I had found a bug with PDF2Office. While single-page documents got converted by PDF2Office fine, multi-page documents showed a black bar at the bottom of every page, exactly where a soft page break should be. I exchanged e-mail with Recosoft and learned that the documents open fine in Word 2008. I was also able to open the same documents in Word 2007, on a Windows machine, without the black bars. It is a bug in how Word X displays soft page breaks. The workaround is to uncheck Apply Page Breaks prior to converting the PDF back to Word X format. Text flow may not be the same as in the original document, but that’s the price you pay when you use outdated software.
The Curse of the Black Bar. With Apply Page Breaks enabled, documents longer than one page have black bars added to the end of the pages.
The PDF2Office documentation explicitly mentions that PDFs generated from page-layout programs will not translate well into Word, and this is true. I fed PDF2Office some old ATPM PDF files, and while the converted Word files are readable, they need much work to get things aligned, fonts replaced, and so on. Word is not meant to be used as a page-layout program, so anyone trying to go from, say, InDesign to Word is traveling the wrong path.
Room for Improvement
One annoyance in using PDF2Office is its handling of bulleted lists. As seen earlier, black circles become infinity symbols when PDF documents are converted back to Word. I notice that the infinity symbol is the first choice among the different types of bullets. Perhaps when PDF2Office converts bulleted lists in a PDF to Word, it just grabs the first type of symbol regardless of what is really in use.
To a certain degree, PDF2Office can regenerate headers and footers. As long as your header takes up just one line and does not vary from page to page (as a page number does), PDF2Office can handle it. A two-line header would have its second line become part of the page body. A header that shows page numbers such as “Page 1 of 10”, “Page 2 of 10”, etc. won’t be properly converted. PDF2Office would just turn such a header into “Page of 10” (i.e. it takes out the varying number). While the one-line limitation should work for most people, being able to correctly generate the page number in the header is necessary and should be supported.
If you have a business that involves editing PDF files in Word or Excel formats, you need PDF2Office. Copy and paste can only do so much. PDF2Office works hard to convert PDF files back to MS Office formats, with touches such as turning pages into Excel sheets and regenerating one-line headers. It’s too bad that varying headers are not fully supported, though. PDF2Office works best if you have Office 2004 or 2008. Otherwise, you would have to deal with the unsightly black bars that Word v.X incorrectly shows in place of soft page breaks.