Tuesday 5 June 2012

Re: [dcphp-dev] counting words in .docx and pdf files

For the PDFs, you might look into using Xpdf (http://www.foolabs.com/xpdf/). You'd have to compile Xpdf for the environment it will be running on, but then you could use system() or a similar function such as shell_exec() to run its pdftotext tool and extract the text from the PDF. At that point you can run the extracted text through the word-counting code below. For a similar example of how to do this, take a look at the "PDF Indexer" extension for Joomla!.
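
Here is a rough sketch of what that could look like. It is untested against your setup and assumes the pdftotext binary that ships with Xpdf is installed and on the PATH; the function names countWordsInText() and countPdfWords() are made up for illustration. It uses shell_exec() rather than system() because shell_exec() returns the command output as a string instead of printing it.

```php
<?php
// Sketch only: assumes Xpdf's pdftotext binary is installed and on the PATH.
// countWordsInText() and countPdfWords() are hypothetical helper names.

// Count the words in a plain-text string.
function countWordsInText($text)
{
    return str_word_count($text);
}

// Extract the text of a PDF with pdftotext and count its words.
// Returns false if the command produced no output or could not be run.
function countPdfWords($pdfPath)
{
    // Passing "-" as the output file makes pdftotext write to stdout.
    $cmd = 'pdftotext ' . escapeshellarg($pdfPath) . ' -';
    $text = shell_exec($cmd);

    if ($text === null || $text === false) {
        return false;
    }

    return countWordsInText($text);
}
```

You would then call something like countPdfWords("document.pdf") and echo the result. Keep in mind the count is only as good as the text pdftotext manages to extract, so scanned or image-only PDFs will likely come back empty.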

I'm not really sure about docx; I've never had to work with that format. Hopefully this helps get you halfway there, though.

-John

On Jun 5, 2012 8:38 AM, "vit srikanth" <vit.srikanth490@gmail.com> wrote:
counting words in .docx and pdf files?

This is the code for counting the words in .txt and .doc files.
But I want to do the same for .docx and PDF files too.



<?php
       $f = "document.txt";

       // read the whole file into a string
       $str = file_get_contents($f);
       if ($str === false) {
              die("Could not read " . $f);
       }

       // count words
       $numWords = str_word_count($str);
       echo "This file has " . $numWords . " words";
?>


Thank you
Srikanth

--
You received this message because you are subscribed to the Google
Group: "Washington, DC PHP Developers Group" - http://www.dcphp.net
To post, send email to washington-dcphp-group@googlegroups.com
To unsubscribe, send email to washington-dcphp-group+unsubscribe@googlegroups.com
For more options, visit this group at http://groups.google.com/group/washington-dcphp-group?hl=en

