Thread: How To Convert PDF Documents to XML ?

  1. #1
    Join Date
    Feb 2009
    Posts
    24

    How To Convert PDF Documents to XML ?

    Hello, I have a lot of PDF documents that I want to convert to XML. I need to convert incoming PDF files to XML automatically on a server (in effect, automating Acrobat Standard's "Save As XML" function). Please suggest what to do. Thanks in advance for your replies.

  2. #2
    Join Date
    May 2008
    Posts
    4,831

    Re: How To Convert PDF Documents to XML ?

    Hello, you can try PDF XML Converter (P2X), which extracts the text information from a PDF file and writes it to an XML file. All of its functions are encapsulated in a COM component, and the exposed methods and interface are the same as those of PDF Plain Text Extractor (P2T), except that the output file is in XML format. You can integrate it into your own application and redistribute it royalty-free. The output XML format is defined in PDFDocument.xsd.

    Output XML sample

    <?xml version="1.0" encoding="UTF-8"?>
    <PDFDocument>
    <PDFInfo>
    <Title><![CDATA[ PDF Reference ]]></Title>
    <Subject><![CDATA[PDF Reference 1.4]]></Subject>
    <Author><![CDATA[Smith.H]]></Author>
    <Creator><![CDATA[PDF Writer]]></Creator>
    <Producer><![CDATA[Adobe Acrobat]]></Producer>
    <CreateDate><![CDATA[2002/06/15]]></CreateDate>
    <KeyWords><![CDATA[PDF Reference]]></KeyWords>
    </PDFInfo>
    <Pages>
    <Page>
    <PageNumber>1</PageNumber>
    <PDFElement>
    <Coordinate_X>12</Coordinate_X>
    <Coordinate_Y>34</Coordinate_Y>
    <DataString>
    <![CDATA[
    Hello, this is a data chunk with
    special chars "~@@^%^$(^#\''"'and
    line break.CDATA will deal with
    this kind of data perfectly.
    ]]>
    </DataString>
    </PDFElement>
    .
    .
    .
    </Page>
    .
    .
    .
    </Pages>
    </PDFDocument>
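
    Once a file in this layout exists, reading it back is straightforward. The following is a minimal sketch, not part of P2X itself: it assumes the element names shown in the sample above, and the function name extract_chunks and the embedded SAMPLE document are made up for illustration. It uses only Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed document in the layout of the sample above.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<PDFDocument>
  <PDFInfo>
    <Title><![CDATA[ PDF Reference ]]></Title>
  </PDFInfo>
  <Pages>
    <Page>
      <PageNumber>1</PageNumber>
      <PDFElement>
        <Coordinate_X>12</Coordinate_X>
        <Coordinate_Y>34</Coordinate_Y>
        <DataString><![CDATA[Hello, world]]></DataString>
      </PDFElement>
    </Page>
  </Pages>
</PDFDocument>
"""


def extract_chunks(xml_data):
    """Return a list of (page_number, x, y, text) tuples from the XML."""
    root = ET.fromstring(xml_data)
    chunks = []
    for page in root.iterfind("./Pages/Page"):
        page_no = int(page.findtext("PageNumber", default="0"))
        for element in page.iterfind("PDFElement"):
            x = float(element.findtext("Coordinate_X", default="0"))
            y = float(element.findtext("Coordinate_Y", default="0"))
            # CDATA sections are returned as ordinary text by the parser.
            text = (element.findtext("DataString") or "").strip()
            chunks.append((page_no, x, y, text))
    return chunks


if __name__ == "__main__":
    for chunk in extract_chunks(SAMPLE):
        print(chunk)
```

    The parser treats CDATA content as plain text, so no special handling is needed for the special characters in the DataString elements.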

  3. #3
    Join Date
    Oct 2008
    Posts
    55

    Re: How To Convert PDF Documents to XML ?

    Hello, after fooling around with several shareware programs that either convert only the first few pages of an Acrobat file or work for only a few days, I found an open-source utility on SourceForge that worked so well that I wanted to give it wider publicity among people who might find it handy. See http://pdftohtml.sourceforge.net and http://sourceforge.net/projects/pdftohtml. A Windows binary is available.
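
    For the server-side automation asked about in the first post, a small wrapper script around the command-line tool may be enough. This is only a sketch under a few assumptions: the pdftohtml executable is on the PATH, the build in use accepts the -xml option (writing <output-base>.xml), and the folder names "incoming" and "converted" are placeholders. The helper names build_command and convert_folder are invented for this example.

```python
import subprocess
from pathlib import Path


def build_command(pdf_path, out_dir):
    """Build the pdftohtml command line for one PDF file.

    With -xml, pdftohtml writes its result to <output-base>.xml.
    """
    pdf_path = Path(pdf_path)
    out_base = Path(out_dir) / pdf_path.stem
    return ["pdftohtml", "-xml", str(pdf_path), str(out_base)]


def convert_folder(in_dir, out_dir):
    """Convert every *.pdf in in_dir; return the expected XML paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    results = []
    for pdf_path in sorted(Path(in_dir).glob("*.pdf")):
        # check=True raises CalledProcessError if the tool reports failure.
        subprocess.run(build_command(pdf_path, out_dir), check=True)
        results.append(out_dir / (pdf_path.stem + ".xml"))
    return results


if __name__ == "__main__":
    # "incoming" and "converted" are placeholder folder names.
    for xml_path in convert_folder("incoming", "converted"):
        print("wrote", xml_path)
```

    Running such a script from a scheduled task or cron job would pick up new files periodically; a real deployment would also need to move or mark files that have already been converted.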

  4. #4
    Join Date
    May 2008
    Posts
    4,345

    Re: How To Convert PDF Documents to XML ?

    The Investintech PDF-to-XML Conversion Software Development Kit (SDK) is a collection of methods compiled, linked, and stored in a dynamic-link library (DLL) file that you reference when developing your application. The purpose of these methods is to convert files from the Portable Document Format (PDF) to XML.

    The PDF-to-XML Conversion SDK can be used through its COM API from VB, .NET, Delphi, and C/C++ applications.
