Is there a function that can return all the text in a document? I would like to be able to get the text so it can be added into our full-text index. Currently we’re using FreeSpire.Doc, FreeSpire.PDF, & FreeSpire.XLS to do this.
It seems like this should be built into your product, since you already have to extract the information to rebuild it onto a web page!
From what I can see, this is only available on the client-side (JavaScript) object? I'm looking to do this on the server side. The short example below shows what I would like to do, but I can't find anything in the ctldoc object that resembles the pages or the extracted text. Do you have a sample of how it works in code?
' Read the file on the server
Dim ctldoc As New DocViewer
Dim config As New DotnetDaddy.DocumentConfig.PdfConfig
config.AllowCopy = True
ctldoc.OpenDocument("c:\temp\test.pdf", config)
' This is fake code showing how I would expect it to work
Dim sb As New StringBuilder ' collects the extracted text
For Each Page In ctldoc.Pages ' loop through each page
    sb.Append(CopyPage(Page)) ' put the page text into the StringBuilder
Next
Unfortunately, Doconut has nothing like that. Our focus is on being the best and most solid document viewer.
For anything related to reading or manipulating documents in the code-behind, please refer to our parent company, Aspose (https://products.aspose.com/pdf/).
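As a rough sketch of what server-side extraction could look like with Aspose.PDF for .NET (a separate product with its own license, not part of Doconut): the example below assumes the Aspose.PDF NuGet package is referenced and reuses the file path from the question. The module and function names (ExtractPdfText, GetPdfText) are made up for illustration.

```vb
Imports System
Imports Aspose.Pdf.Text

Module ExtractPdfText
    ' Returns the text of every page in the PDF at the given path.
    ' Assumption: Aspose.PDF for .NET is installed; this is not a Doconut API.
    Function GetPdfText(path As String) As String
        Using pdf As New Aspose.Pdf.Document(path)
            ' TextAbsorber collects text from the pages it visits
            Dim absorber As New TextAbsorber()
            pdf.Pages.Accept(absorber)
            Return absorber.Text
        End Using
    End Function

    Sub Main()
        ' Same sample path as in the question above
        Dim text As String = GetPdfText("c:\temp\test.pdf")
        Console.WriteLine(text)
    End Sub
End Module
```

The returned string could then be passed to the full-text index in place of the FreeSpire.PDF output. Word and Excel files would need the corresponding Aspose products, which are not shown here.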