Thread: How to scan a file in docx format instead of pdf

  1. #1
    Join Date
    May 2012
    Posts
    96

    How to scan a file in docx format instead of pdf

    I need help scanning a document directly into docx format, the Microsoft Word 2007 format. My scanner only saves scans as pdf or jpeg and has no option for docx. I have been looking for more advanced scanning software but found nothing. I have a long list of project papers to scan, and I want to edit them afterwards, which pdf does not allow. I have an HP printer.

  2. #2
    Join Date
    Apr 2009
    Posts
    586

    Re: How to scan a file in docx format instead of pdf

    You need a scanner with OCR features for that. Many scanners come with it, and they can give you an editable text file at the end. But that will be a txt file. You will not get a Word file from it. The HP Officejet Pro 3620 is one of them. It is an all-in-one printer with OCR features.
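
    A txt result can still be wrapped into a docx afterwards, because a docx file is a zip archive of XML parts. Here is a minimal sketch using only the Python standard library. The function name text_to_docx and the one-paragraph-per-line layout are assumptions made for illustration, not something the scanner software does for you. OCR output that contains control characters would need cleaning first, or the XML will not be valid.

```python
import zipfile
from xml.sax.saxutils import escape

# The three parts a minimal WordprocessingML package needs.
CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" ContentType="application/'
    'vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    '</Types>'
)

RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships '
    'xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/'
    'officeDocument/2006/relationships/officeDocument" '
    'Target="word/document.xml"/>'
    '</Relationships>'
)

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def text_to_docx(text, path):
    """Write plain text to a bare-bones .docx, one paragraph per input line."""
    body = "".join(
        '<w:p><w:r><w:t xml:space="preserve">%s</w:t></w:r></w:p>' % escape(line)
        for line in text.splitlines()
    )
    document = (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        '<w:document xmlns:w="%s"><w:body>%s</w:body></w:document>' % (W_NS, body)
    )
    with zipfile.ZipFile(path, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr("_rels/.rels", RELS)
        package.writestr("word/document.xml", document)
```

    Calling text_to_docx(open("scan.txt").read(), "scan.docx") would produce a file with no styling at all, so fonts and headings still have to be fixed by hand in Word. The file name scan.txt is only a placeholder.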

  3. #3
    Join Date
    Nov 2009
    Posts
    351

    Re: How to scan a file in docx format instead of pdf

    For that you need OCR (Optical Character Recognition) software. All you have to do is scan the page and save it in an image format such as jpg or bmp. Then you load the file in the OCR software and it converts the image to text. The accuracy depends on the image quality. If the text is clear and sharp you will get proper output, but if the scan quality is poor and the text is blurry you will get lots of junk characters. I have used this on some Flash-based websites that do not have real text content. They mostly use images, so you cannot copy the text. I take a screenshot and open it in the OCR software, which gives me a text file, and it works well.

  4. #4
    Join Date
    Apr 2009
    Posts
    393

    Re: How to scan a file in docx format instead of pdf

    I have a scanner and the software that came with it. It gives me the scanned file as editable text. It has its own word processor in which I can edit the content and then move it to Word. It is not the most efficient way, but it works well. It is a Canon scanner.

  5. #5
    Join Date
    Apr 2009
    Posts
    515

    Re: How to scan a file in docx format instead of pdf

    There are some free online services that can convert an image to editable text, so you can try those. You can also use OCR software, and plenty of it is free. The best one is Microsoft OneNote, which comes with the Office suite and has an OCR feature. It is quite easy to extract the text from an image with it, and it works really well. Scan your files first. Then open OneNote and click on Insert > Picture. Once the picture is imported, the tool will help you get editable text from it. You can also clip just the part of the image that has text. The Screen Clipping icon is right next to Picture. Click on it and select the text area. Then click on File Printout, add the file you clipped, and insert it. The next screen will show the editable text of the image file. The text recognition is impressive and I have had no performance issues with it. You can also use third-party software here, which is easier. You just have to load the file and you are done.

  6. #6
    Join Date
    Apr 2009
    Posts
    745

    Re: How to scan a file in docx format instead of pdf

    Use SimpleOCR. It is a nice tool that can extract the content from an image file, and it is very easy to use. You must scan all your pages properly at high resolution. The better the scan quality, the better the output you will get.

  7. #7
    Join Date
    Apr 2009
    Posts
    487

    Re: How to scan a file in docx format instead of pdf

    There is a big problem with all OCR tools, and that is junk characters. I have a book as a pdf file, which I converted to jpeg files, 40 images in total. I am using SimpleOCR on them. After converting them to editable text there are lots of junk characters. It is a time-consuming process to read all 40 pages one by one and fix them. I agree that the software is very easy to use and has a simple way of converting files, but I cannot understand why it produces junk characters. Before this I was using an online service to convert the images to readable text. It was more accurate than the software I am using now, but it has a limitation: I cannot convert more than 5 files at a time. I hope there is a more effective tool than this that can really work and give me proper output.
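
    One way to cut down the proofreading is to let a small script point at the lines that are most likely junk, so you only read those closely. This is a rough sketch in Python, not a feature of SimpleOCR. The function names, the list of allowed punctuation, and the 0.3 threshold are all assumptions that would need tuning on real output.

```python
# Punctuation that is normal in running text; anything else counts as junk.
ALLOWED_PUNCT = set(".,;:!?'\"()-")


def junk_ratio(line):
    """Return the share of visible characters that look like OCR noise."""
    visible = [ch for ch in line if not ch.isspace()]
    if not visible:
        return 0.0
    junk = sum(1 for ch in visible
               if not (ch.isalnum() or ch in ALLOWED_PUNCT))
    return junk / len(visible)


def flag_suspect_lines(text, threshold=0.3):
    """Return (line_number, line) pairs whose junk ratio exceeds threshold."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if junk_ratio(line) > threshold
    ]
```

    A check like this only catches symbol noise. It will not notice a wrong but plausible word, so the pages still need a read-through.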

  8. #8
    Join Date
    Jul 2011
    Posts
    440

    Re: How to scan a file in docx format instead of pdf

    Try different tools one by one until you get the kind of output you are looking for. That is the simplest approach to manage. I did the same until I got an accurate result. Keep in mind that OCR software generally cannot give you proper text from handwriting.

  9. #9
    Join Date
    Jul 2011
    Posts
    434

    Re: How to scan a file in docx format instead of pdf

    I am using TopOCR. I have been using it for a long time and it works really well. Its benefit is that it lets you convert just a part of the image. It has a two-window layout: on one side you see the loaded image and on the other side the editable text. You can convert everything into readable text and get proper output. It does not generate junk characters for me. OCR software tends to be accurate when there is only a little text. If you try to convert a huge number of pages at once, that is when you run into problems. So extract one paragraph at a time, which is easier to handle. It supports conversion from jpeg, tiff, gif and bmp images. It is a fairly advanced tool with which you can do more: you can get pdf, html, rtf and txt output. It can also process text from an image that contains both text and graphics.
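
    The same "small pieces" idea applies if you script the job or use a service with a per-upload limit. Below is a rough Python sketch. The convert_batch argument is a hypothetical stand-in for whatever tool or service does the recognition; it is not a TopOCR function. The default size of 5 is just an example.

```python
def batches(items, size):
    """Yield consecutive slices of items, each at most size long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def convert_in_batches(image_paths, convert_batch, size=5):
    """Run convert_batch on small groups and collect the texts in order.

    convert_batch is a hypothetical callable: it takes a list of image
    paths and returns one text string per path.
    """
    texts = []
    for group in batches(image_paths, size):
        texts.extend(convert_batch(group))
    return texts
```

    With 40 page images and a size of 5 this makes 8 separate conversions, and a bad result then only affects one small group instead of the whole book.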
