
How to convert an HTML document file to text (TXT) in Java

When working with HTML files, if you do not need to preserve formatting or graphics, it can be easier to convert the file to TXT. The conversion extracts the content as plain text and produces a smaller, more manageable file that is easy to copy from and share. The Cloudmersive API client used below lets you convert HTML to plain text from Java.


First, we need to install the library. To do this, add this repository reference to your Maven POM file:

<repositories>
    <repository>
        <id>jitpack.io</id>
        <url>https://jitpack.io</url>
    </repository>
</repositories>

Then you can add the dependency reference:

<dependencies>
    <dependency>
        <groupId>com.github.Cloudmersive</groupId>
        <artifactId>Cloudmersive.APIClient.Java</artifactId>
        <version>v3.54</version>
    </dependency>
</dependencies>

After this, we can call the conversion function, which the code below invokes as convertWebHtmlToTxt_0 on ConvertWebApi:

// Import classes:
//import com.cloudmersive.client.invoker.ApiClient;
//import com.cloudmersive.client.invoker.ApiException;
//import com.cloudmersive.client.invoker.Configuration;
//import com.cloudmersive.client.invoker.auth.*;
//import com.cloudmersive.client.ConvertWebApi;
//import com.cloudmersive.client.model.*; // request/response models (package name assumed)

ApiClient defaultClient = Configuration.getDefaultApiClient();

// Configure API key authorization: Apikey
ApiKeyAuth Apikey = (ApiKeyAuth) defaultClient.getAuthentication("Apikey");
Apikey.setApiKey("YOUR API KEY");
// Uncomment the following line to set a prefix for the API key, e.g. "Token" (defaults to null)
//Apikey.setApiKeyPrefix("Token");

ConvertWebApi apiInstance = new ConvertWebApi();
// HtmlToTextRequest | HTML to Text request parameters; fill in the HTML to convert before calling
HtmlToTextRequest input = new HtmlToTextRequest();
try {
    HtmlToTextResponse result = apiInstance.convertWebHtmlToTxt_0(input);
    System.out.println(result);
} catch (ApiException e) {
    System.err.println("Exception when calling ConvertWebApi#convertWebHtmlToTxt_0");
    e.printStackTrace();
}
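
The call above prints the response rather than producing a TXT file. Below is a minimal sketch of the file handling that could surround it: read the HTML file into a string, hand it to a converter, and write the result next to the input with a .txt extension. The class name HtmlFileToTxt, its helper methods, and the converter parameter are hypothetical names introduced for illustration. The tag-stripping lambda in main is only a stand-in so the example runs without an API key, and it is not how the API performs the conversion. In real use the converter would wrap the API call shown above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.UnaryOperator;

// Hypothetical helper class; not part of the Cloudmersive client.
class HtmlFileToTxt {

    // Reads the HTML file as UTF-8, applies the supplied converter, and writes
    // the converter's output to a sibling file with a .txt extension.
    static Path convertFile(Path htmlFile, UnaryOperator<String> converter) throws IOException {
        String html = new String(Files.readAllBytes(htmlFile), StandardCharsets.UTF_8);
        String text = converter.apply(html);
        String txtName = replaceExtension(htmlFile.getFileName().toString(), "txt");
        Path txtFile = htmlFile.resolveSibling(txtName);
        Files.write(txtFile, text.getBytes(StandardCharsets.UTF_8));
        return txtFile;
    }

    // Swaps the file extension, or appends one if the name has none.
    static String replaceExtension(String fileName, String newExtension) {
        int dot = fileName.lastIndexOf('.');
        String base = (dot > 0) ? fileName.substring(0, dot) : fileName;
        return base + "." + newExtension;
    }

    public static void main(String[] args) throws IOException {
        // Create a small sample HTML file so the example is self-contained.
        Path html = Files.createTempFile("sample", ".html");
        Files.write(html, "<html><body><p>Hello</p></body></html>".getBytes(StandardCharsets.UTF_8));

        // Stand-in converter (assumption): naive tag stripping, for demonstration only.
        // Replace with a function that sends the HTML to the API and returns its text result.
        UnaryOperator<String> standIn = s -> s.replaceAll("<[^>]*>", "").trim();

        Path txt = convertFile(html, standIn);
        System.out.println("Wrote " + txt);
        System.out.println(new String(Files.readAllBytes(txt), StandardCharsets.UTF_8));
    }
}
```

Passing the converter in as a function keeps the file handling separate from the network call, so the same helper can be exercised locally and then pointed at the API.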

Now, you can easily convert any HTML web page to plain text, enhancing your systems’ versatility and improving your workflow.


Source: Medium

The Tech Platform
