Converting PDF to Word

Discussion in 'Tech Talk' started by eisman, Oct 13, 2004.


  1. eisman

    ARGH!
    CLM

    Joined:
    Jul 28, 2002
    2,579
    0
    Location:
    Moving Target
    As always I'm turning to the guys who know best.

    I had a program that would convert PDFs to Word files. It's gone. I need another. Anyone have a viable download?
     

  2. NetNinja

    Always Faithful

    Joined:
    Oct 23, 2001
    967
    0
    Location:
    HotLanta, GA
  3. NRA_guy

    Unreconstructed

    Joined:
    Jun 20, 2004
    1,704
    0
    Location:
    Mississippi, CSA
    I have ScanSoft. It will not convert a pdf file to Word if the pdf file was saved as an image.

    And if someone simply scanned a document, it's probably saved as an image. You never know.

    The person who scans it into Adobe can run "paper capture" in Adobe and save the captured document. Then the text is saved as a text layer and ScanSoft will convert it to a Word document. It does a pretty good job. But the paper capture sometimes confuses "1" with "l" or "0" with "O" and shifts fonts on you.

    But to me ScanSoft is no better than simply blocking text and copying it to the clipboard in Adobe and pasting it into Word. ScanSoft may retain some formatting. I can't remember.

    You cannot easily tell if the pdf file was stored as an image or as a text layer.

    Adobe says to open the document in Adobe and go File--->Document Properties--->Font

    If you see fonts listed, it is captured as text and can be blocked and copied. This also means that it can be converted using ScanSoft.
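
    A rough way to run a similar check without opening Adobe is to look for font references in the raw file. This is only a sketch using the Python standard library. The name `mentions_fonts` is made up for illustration, and treating a `/Font` entry as a sign of selectable text is a heuristic, not a guarantee (encrypted files or unusual stream filters can fool it).

```python
import re
import zlib

# Matches the body of each PDF stream object.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def mentions_fonts(pdf_bytes: bytes) -> bool:
    """Heuristic: True if the file appears to reference any font."""
    # Uncompressed dictionaries show the /Font key in plain sight.
    if b"/Font" in pdf_bytes:
        return True
    # Newer files may hide dictionaries inside deflate-compressed streams,
    # so try to inflate each stream and look again.
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            inflated = zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # not deflate data (e.g. a JPEG image); skip it
        if b"/Font" in inflated:
            return True
    return False
```

    To try it, pass in the file contents, for example `mentions_fonts(open("scan.pdf", "rb").read())`, where `scan.pdf` is a placeholder name. A `False` result suggests an image-only file that would need OCR first.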

    I run the full retail boxed version of Adobe, not the free reader. Not sure how the free reader works.

    PS: I hate Adobe. They give you Word--->Adobe conversion, but not Adobe--->Word conversion. I have heard that Word 2003 gives you Adobe--->Word conversion, and their Word 2003 implies this, but some who have it tell me that's not true.

    NRA_guy
     
  4. Warrior2k3

    Protector

    Joined:
    Oct 11, 2004
    30
    0
    Location:
    USA
    I just use the text selection tool on the PDF file, Select all text, Control C to copy then Control V on the blank word doc to paste. Some formatting will be lost but all the text will be there.
     
  5. Anon1


    Joined:
    Aug 17, 2000
    1,116
    169

    Your hatred is misdirected!! Hate Microsoft instead.

    The reason that Adobe cannot readily import a Word file is that *Microsoft* (MS) keeps their file formats a closed secret. Whereas other companies open up the internal workings of their file formats so that others can create compatible programs, Microsoft does not.

    There are open-source applications like OpenOffice.org that try to be compatible with MS file formats, but that part of the program code has had to be kind of hacked together because of the problems with reverse-engineering the closed MS formats.
     
  6. NRA_guy

    Unreconstructed

    Joined:
    Jun 20, 2004
    1,704
    0
    Location:
    Mississippi, CSA
    Oh, I hate Microsoft equally. Maybe more.

    I'm no computer expert but as I understand, he was asking about converting a pdf (Adobe) file to a doc (Word) file---not about importing a Word file into Adobe.

    Actually, Adobe installs icons on the Word toolbar that enable one to readily create a pdf file from a Word file. Eliminating the icons is most folks' concern because we hardly ever do that.

    But we frequently need to go the opposite way.

    I needed to convert pdf files to doc files often enough that I bought the $600 ScanSoft OmniPage 14 Office, which claims to "Turn PDF files into editable documents while retaining their layout".

    Well, yes and no.

    On some pdf files you get “ScanSoft PDF Converter cannot process this file because the first page does not contain a text layer”

    The web link for more information gives the following explanation:

    Problem:

    An error will occur when converting a PDF file that does not contain a text layer on the first page. The error message states “ScanSoft PDF Converter cannot process this file because the first page does not contain a text layer.”

    Cause:

    When you open a PDF, whose first page has no text layer, it is assumed the whole document is an image-only PDF file.


    Then it balks.
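
    The behavior in that explanation can be sketched roughly as below. This is not ScanSoft's code. The function names, and the idea of working from a list holding the extracted text of each page, are assumptions made for illustration. The second function shows the alternative of checking every page instead of judging the whole file by page one.

```python
from typing import List


def first_page_has_text(page_texts: List[str]) -> bool:
    """Mimic the described check: judge the whole file by its first page."""
    return bool(page_texts) and bool(page_texts[0].strip())


def pages_with_text(page_texts: List[str]) -> List[int]:
    """Return the 1-based numbers of all pages that carry some text."""
    return [number for number, text in enumerate(page_texts, start=1)
            if text.strip()]


# A scanned cover page followed by two pages that do have a text layer.
document = ["", "Chapter 1: Getting started", "More body text"]
print(first_page_has_text(document))  # the first-page rule rejects the file
print(pages_with_text(document))      # yet two of its pages are convertible
```

    With a scanned cover page in front, the first-page rule gives up on a file that is mostly convertible.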

    NRA_guy
     
  7. Toyman


    Joined:
    May 6, 2003
    2,597
    20
    Location:
    West Michigan
    Never used it, but here it is:

    http://www.verypdf.com/pdf2word/index.html

     
  8. NRA_guy

    Unreconstructed

    Joined:
    Jun 20, 2004
    1,704
    0
    Location:
    Mississippi, CSA
    Yeah . . . but. As I read it, it does pretty much the same thing that ScanSoft does, granted at a cheaper price. (ScanSoft balks if the first page contains no text.)

    Problem, as I said before, is that many pdf files are images---not text, even though they look like text and print like text---they are only graphic images of the page.

    When any of these pdf to doc conversion programs try to convert pdf to doc, they don't perform an OCR interpretation of text in a graphic image.

    Some pdf files contain the text as a layer, and that is the only kind of pdf file that these conversion programs can convert.

    It all depends upon how the pdf file was generated. You cannot tell simply by opening the file in Adobe.

    By the way, as I understand it, pdf (portable document format) is not an Adobe exclusive and there are a number of different pdf formats. Not all pdf files are the same.

    I sometimes resort to printing out pdf files and scanning them from the hardcopy with OCR (optical character recognition) software in order to avoid re-typing the text. A pain, but not as much as re-typing.

    Of course, OCR software tries to interpret each letter. Sometimes it gets them wrong.
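
    One cheap way to catch the usual mix-ups ("1" for "l", "0" for "O") after OCR is to flag words that mix letters and digits and contain one of the look-alike characters. This is a minimal sketch. The name `suspicious_tokens` and the choice of look-alike characters are assumptions, and it will also flag legitimate tokens such as model numbers, so treat the output as a list to proofread rather than a list of errors.

```python
import re
from typing import List

# Characters that OCR engines commonly swap for one another.
LOOKALIKES = set("1lI0O")


def suspicious_tokens(text: str) -> List[str]:
    """Return tokens that mix letters and digits and contain a look-alike."""
    flagged = []
    for token in re.findall(r"[A-Za-z0-9]+", text):
        has_digit = any(ch.isdigit() for ch in token)
        has_letter = any(ch.isalpha() for ch in token)
        has_lookalike = any(ch in LOOKALIKES for ch in token)
        if has_digit and has_letter and has_lookalike:
            flagged.append(token)
    return flagged


print(suspicious_tokens("The tota1 was 1O0 dollars in 2004"))
```

    Pure numbers such as "2004" and pure words are left alone; only the mixed tokens get flagged for a second look.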

    I would like to see somebody write a conversion program that can do OCR conversion from text graphic images without going through the "print hardcopy-->scan-->read with OCR" process.

    Again, I'm no expert by any means.

    NRA_guy