How to extract data from word document?

  #1  
Old 13-07-2011
Member
 
Join Date: Jun 2011
Posts: 84
How to extract data from word document?
  

For the past few days I have been trying to modify a macro that extracts data from Microsoft Word documents into Excel. When I run the macro below, I get an error right at the beginning, on the lines listed after the code:

Code:
Sub GetDataFromWord()

'Set a reference (Tools - References) to the
'Microsoft Word x.0 Object Library

Dim wdApp As Word.Application
Dim wdDoc As Word.Document
Dim sFile As String
Dim rInput As Range

'Define row and column of data in table
Const lROW As Long = 2
Const lCOL As Long = 2

'Specify file that contains table
sFile = "C:\Documents and Settings\Hashemi\Desktop\Macro test\test.doc"

'instantiate Word and open document
Set wdApp = New Word.Application
Set wdDoc = wdApp.Documents.Open(sFile)

'define range where data goes
Set rInput = Sheet1.Range("a1")

'Copy value from table and paste to cell
With wdDoc.Tables(1)
.Cell(lROW, lCOL).Range.Copy
rInput.PasteSpecial xlPasteValues
End With

wdDoc.Close False
wdApp.Quit

Set wdDoc = Nothing
Set wdApp = Nothing
End Sub

The error points at the start of the macro, at:
  • 'Set a reference (Tools - References) to the
    'Microsoft Word x.0 Object Library

and also at:
  • Dim wdApp As Word.Application

How should I solve this?

  #2  
Old 13-07-2011
Member
 
Join Date: May 2009
Posts: 523
Re: How to extract data from word document?

You can extract data from Microsoft Word from many languages through COM automation. Perl and Python, for example, have COM modules that can drive MS Word. The Python script below saves each .doc file in a folder as plain text.
Code:
import glob, os, win32com.client

# Value of wdFormatText in Word's WdSaveFormat enumeration
wdFormatText = 2

wordapp = win32com.client.Dispatch("Word.Application")
path = "D:\\mydir"
os.chdir(path)
for files in glob.glob("*.doc"):
    doc = os.path.abspath(os.path.join(path, files))
    print("processing", doc)
    wordapp.Documents.Open(doc)
    # Swap the "doc" extension for "txt"
    txt = doc[:-3] + 'txt'
    wordapp.ActiveDocument.SaveAs(txt, FileFormat=wdFormatText)
    wordapp.ActiveDocument.Close()
wordapp.Quit()
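
The slice doc[:-3] + 'txt' in the script above only works when the extension is exactly three letters long. A minimal sketch of a more general way to build the output name is shown here; txt_path_for is a hypothetical helper name, not part of the original script, and it only handles the file name, not Word itself.

```python
import os


def txt_path_for(doc_path):
    """Return doc_path with its extension replaced by .txt.

    Hypothetical helper: works for .doc, .docx, or any other extension.
    """
    root, _ext = os.path.splitext(doc_path)
    return root + ".txt"
```

In the loop you could then write txt = txt_path_for(doc) instead of slicing the string.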
  #3  
Old 13-07-2011
Member
 
Join Date: Apr 2009
Posts: 483
Re: How to extract data from word document?

You can use the VBScript below to extract the text of a Microsoft Word document. It starts with Option Explicit, so every variable has to be declared, and it then echoes the word count followed by each word in the document.
Code:
Option Explicit
REM We use "Option Explicit" to help us check for coding mistakes


REM the Word Application
Dim objWord

REM the path to the Word file
Dim wordPath

REM the document we are currently reading data from
Dim currentDocument
REM the number of Words in the current document
Dim numberOfWords
Dim i


REM where is the Word file located?
wordPath = "C:\Data\Doc1.doc"

WScript.Echo "Extract Data from " & wordPath

REM Create an invisible version of Microsoft Word
Set objWord = CreateObject("Word.Application") 

REM don't display any messages about documents needing to be converted
REM from  old Word file formats
objWord.DisplayAlerts = 0


REM open the Word document as read-only
REM Open(path, confirmconversions, readonly)
objWord.Documents.Open wordPath, false, true

REM Access the document
Set currentDocument = objWord.Documents(1)

REM How many words are in the document
NumberOfWords = currentDocument.Words.count
WScript.Echo "There are " & NumberOfWords & " words " & vbCRLF

For i = 1 to NumberOfWords
	WScript.Echo currentDocument.Words(i)
Next

REM Close the document
currentDocument.Close
REM Free memory used to store the document object
Set currentDocument = Nothing

REM exit Microsoft Word
objWord.Quit
Set objWord = Nothing
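
If the document has already been saved as plain text, the same word listing can be roughly approximated without Word. This is only a sketch under that assumption; words_in is a hypothetical name, and the count may not match Word's, because as far as I know Word's Words collection treats punctuation marks as separate items while this splits on whitespace only.

```python
def words_in(text):
    """Split plain text into whitespace-separated words.

    Hypothetical helper: punctuation stays attached to the word it follows.
    """
    return text.split()


# Made-up sample text, standing in for the contents of a saved .txt file
sample = "Hello, world.\nSecond line"
word_list = words_in(sample)
print("There are", len(word_list), "words")
for word in word_list:
    print(word)
```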
  #4  
Old 13-07-2011
Member
 
Join Date: May 2009
Posts: 531
Re: How to extract data from word document?

If the code above does not help you extract data from Microsoft Word, then try this one. I am sure it will help. I have edited it for you, so you just need to copy and paste the VBA code below.
Code:
Sub ExtractData()
Dim sDTE As String
Dim sSubject As String
Dim strFileName As String
Dim strPath As String
Dim oDoc As Document
Dim dataDoc As Document
Dim fDialog As FileDialog
Set fDialog = Application.FileDialog(msoFileDialogFolderPicker)
 
'Pick the folder with the letters
With fDialog
    .Title = "Select Folder containing the documents to be modified and click OK"
    .AllowMultiSelect = False
    .InitialView = msoFileDialogViewList
    If .Show <> -1 Then
         MsgBox "Cancelled By User"
         Exit Sub
    End If
    strPath = fDialog.SelectedItems.Item(1)
    If Right(strPath, 1) <> "\" Then strPath = strPath + "\"
End With

'Close any open documents
If Documents.Count > 0 Then
    Documents.Close SaveChanges:=wdPromptToSaveChanges
End If
strFileName = Dir$(strPath & "*.do?")
 
'Open the document that collects the data
Set dataDoc = Documents.Open("D:\My Documents\Test\DTE data.doc")
 
'Open the letters in turn
While strFileName <> ""
    Set oDoc = Documents.Open(strPath & strFileName)
    Selection.HomeKey wdStory 'Start from the top of the letter
    With Selection.Find 'find the first string
        .ClearFormatting
        Do While .Execute(findText:="DTE/*^13", _
             MatchWildcards:=True, _
             Wrap:=wdFindStop, Forward:=True) = True
            'Assign the found text to a variable and chop off
            'the last character (the paragraph mark matched by ^13)
            sDTE = Left(Selection.Range, Len(Selection.Range) - 1)
        Loop
    End With
    Selection.HomeKey wdStory 'Start from the top of the letter
    With Selection.Find 'find the second string
        .ClearFormatting
        Do While .Execute(findText:="Subject :*^13", _
                  MatchWildcards:=True, _
                  Wrap:=wdFindStop, Forward:=True) = True
           'Assign the second string to a variable and chop off
           'the last character and the leading text
           sSubject = Mid(Selection.Range, 10, Len(Selection.Range) - 10)
        Loop
    End With
    'Switch to the data document and add the content of
    'the variables to the blank row of the table
    dataDoc.Activate
    With Selection
        .EndKey wdStory
        .MoveUp Unit:=wdLine, Count:=1
        .MoveRight Unit:=wdCell, Count:=2 'Add a new blank row
        .TypeText Text:=sDTE
        .MoveRight Unit:=wdCell
        .TypeText Text:=sSubject
    End With

    'Close the letter without saving
    oDoc.Close SaveChanges:=wdDoNotSaveChanges
    Set oDoc = Nothing
    strFileName = Dir$()
Wend
'Save the data document
dataDoc.Save
End Sub
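
For comparison, the two wildcard searches in the macro ("DTE/*^13" and "Subject :*^13") can be sketched as regular expressions over plain text. This assumes the letter has already been saved as text; extract_fields is a hypothetical name and the sample text in the usage note is made up. Like the macro's Do While loops, it keeps the last match of each pattern, keeps the "DTE/" prefix, and drops the "Subject :" prefix. Unlike the Word search, it does not require a line ending after the match.

```python
import re


def extract_fields(text):
    """Return (dte, subject) from plain text, keeping the last match of each.

    Hypothetical helper mirroring the macro: the DTE value keeps its "DTE/"
    prefix, the subject is whatever follows "Subject :" up to the line end.
    """
    dte_matches = re.findall(r"DTE/[^\r\n]*", text)
    subject_matches = re.findall(r"Subject :([^\r\n]*)", text)
    dte = dte_matches[-1] if dte_matches else ""
    subject = subject_matches[-1] if subject_matches else ""
    return dte, subject
```

For example, extract_fields("Ref: DTE/123/2011\nSubject : Annual report\n") gives the pair "DTE/123/2011" and " Annual report" (the leading space is kept, as in the macro).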
  #5  
Old 13-07-2011
Member
 
Join Date: Apr 2009
Posts: 567
Re: How to extract data from word document?

If you haven't found the help you need in the posts above, try this. Copy and paste the VBA code below into a module in Excel, then select the Microsoft Word Object Library under Tools - References. The macro asks for a folder path and writes the form field results of each document in that folder into one row.

Code:
Sub CollateForms()
Dim myPath As String
Dim myWord As New Word.Application
Dim myDoc As Word.Document
Dim myField As Word.FormField
Dim n As Long, m As Long
Dim fs, f, f1, fc
Range("A2").Select
myPath = InputBox("Path?")
Set fs = CreateObject("Scripting.FileSystemObject")
Set f = fs.GetFolder(myPath)
Set fc = f.Files
m = 0
For Each f1 In fc
n = 0
Set myDoc = myWord.Documents.Open(myPath & "\" & f1.Name)
For Each myField In myDoc.FormFields
ActiveCell.Offset(m, n).Value = myField.Result
n = n + 1
Next
myDoc.Close wdDoNotSaveChanges
m = m + 1
Next
Set myField = Nothing
Set myDoc = Nothing
'Quit the hidden Word instance so it does not stay running in the background
myWord.Quit
Set myWord = Nothing
End Sub
  #6  
Old 15-07-2011
Member
 
Join Date: Nov 2008
Posts: 1,259
Re: How to extract data from word document?

I wrote and edited the following code for you. It lists the paths of the .doc files found in a directory in column G of the Excel sheet, so that you can work with them from Excel.

Code:
Sub LoadWordDoc()
Dim F
Dim x As Double
Dim FolderYear As String
Dim DocMonth As String
Dim DocPath As String
Dim FName()
Dim Ext As String

'///Load variables values
    DocMonth = Sheet1.Range("b1")
    FolderYear = Sheet1.Range("b2")
    
'/// To create the path to search for files
    DocPath = "C:\Documents and Settings\montefem\My Documents\Excel Test\word Document\" + FolderYear + "\"
    
'///Here I think I should use the variable DocMonth to list only the docs with Feb in the name, but I do not know how.
    Ext = "*.doc"
    
'///To load all files from that directory and place in F as an array
    ChDir (DocPath)
    F = Dir(DocPath & Ext)
    Application.DisplayAlerts = False
    x = 2

'///Clear previous values
    Sheet1.Range("g2:g100").ClearContents
    
'///to place the files name in the settings sheet in the xls application to manipulate them
    Sheet1.Activate
    With ActiveSheet
        Do While Len(F) > 0
            ReDim Preserve FName(2, x)
            FName(2, x) = DocPath & F
            Cells(x, "G") = FName(2, x)
            x = x + 1
            F = Dir()
        Loop
    'x is still 2 here when Dir found no files
    If x = 2 Then MsgBox "No Files": GoTo 20
    End With
20
End Sub
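
The comment in the macro asks how to list only the documents with the month (for example Feb) in the name. One way is to put the month into the wildcard pattern. A minimal sketch of that idea follows; docs_for_month is a hypothetical name and the file names in the usage note are made up. The match ignores case, which is an assumption about how you want the names compared.

```python
import fnmatch


def docs_for_month(file_names, month):
    """Return the .doc file names that contain the month string.

    Hypothetical helper: builds a pattern such as "*feb*.doc" and compares
    it against lower-cased names so the match is case-insensitive.
    """
    pattern = ("*" + month + "*.doc").lower()
    return [name for name in file_names
            if fnmatch.fnmatchcase(name.lower(), pattern)]
```

For example, docs_for_month(["Report Feb 2011.doc", "Report Mar 2011.doc", "feb notes.doc", "Feb.txt"], "Feb") returns ["Report Feb 2011.doc", "feb notes.doc"]. In the VBA macro the same idea would probably be to set Ext = "*" & DocMonth & "*.doc" before calling Dir.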