The Tech Platform

Mar 29, 2021 (1 min read)

How to convert an HTML document file to text (TXT) in Java

When working with HTML files, if you do not need to preserve formatting or graphics, it can be easier to convert the file to TXT. The conversion extracts the content as plain text and produces a smaller, more manageable file that is easy to copy from and share. The Cloudmersive API used below converts HTML to plain text.


 

First, we need to install the client library. To do this, add this repository reference to your Maven POM:

<repositories>
    <repository>
        <id>jitpack.io</id>
        <url>https://jitpack.io</url>
    </repository>
</repositories>

Then you can add the dependency reference:

<dependencies>
    <dependency>
        <groupId>com.github.Cloudmersive</groupId>
        <artifactId>Cloudmersive.APIClient.Java</artifactId>
        <version>v3.54</version>
    </dependency>
</dependencies>

After this, we can call convertWebHtmlToTxt_0 on ConvertWebApi:

// Import classes:
//import com.cloudmersive.client.invoker.ApiClient;
//import com.cloudmersive.client.invoker.ApiException;
//import com.cloudmersive.client.invoker.Configuration;
//import com.cloudmersive.client.invoker.auth.*;
//import com.cloudmersive.client.ConvertWebApi;
//import com.cloudmersive.client.model.*; // request/response models (package name assumed)

ApiClient defaultClient = Configuration.getDefaultApiClient();

// Configure API key authorization: Apikey
ApiKeyAuth Apikey = (ApiKeyAuth) defaultClient.getAuthentication("Apikey");
Apikey.setApiKey("YOUR API KEY");
// Uncomment the following line to set a prefix for the API key, e.g. "Token" (defaults to null)
//Apikey.setApiKeyPrefix("Token");

ConvertWebApi apiInstance = new ConvertWebApi();
// HtmlToTextRequest | HTML to Text request parameters
// (populate the request with the HTML to convert before sending it)
HtmlToTextRequest input = new HtmlToTextRequest();

try {
    HtmlToTextResponse result = apiInstance.convertWebHtmlToTxt_0(input);
    System.out.println(result);
} catch (ApiException e) {
    System.err.println("Exception when calling ConvertWebApi#convertWebHtmlToTxt_0");
    e.printStackTrace();
}
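The snippet above places the API key directly in the source. As a minimal sketch of an alternative, the key could be read from an environment variable and validated before it is passed to setApiKey. The class name ApiKeyLoader and the variable name CLOUDMERSIVE_API_KEY are assumptions made for this illustration; they are not part of the client library.

```java
final class ApiKeyLoader {

    // Hypothetical environment variable name, chosen for this example only.
    static final String ENV_VAR = "CLOUDMERSIVE_API_KEY";

    // Returns the trimmed key, or fails with a clear message if it is missing or blank.
    static String resolveApiKey(String candidate) {
        if (candidate == null || candidate.trim().isEmpty()) {
            throw new IllegalStateException(
                    "Set the " + ENV_VAR + " environment variable to your API key");
        }
        return candidate.trim();
    }

    public static void main(String[] args) {
        String key = resolveApiKey(System.getenv(ENV_VAR));
        // Print only the length so the key itself never lands in logs.
        System.out.println("Loaded an API key of length " + key.length());
    }
}
```

With this in place, the hard-coded string could be replaced by ApiKeyLoader.resolveApiKey(System.getenv(ApiKeyLoader.ENV_VAR)).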

Now, you can easily convert any HTML web page to plain text, enhancing your systems’ versatility and improving your workflow.
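If the HTML lives in a file on disk and the result should end up in a .txt file, the conversion call can be wrapped in a small helper. The following is only a sketch under stated assumptions: HtmlFileToTxt is a hypothetical class, and the converter is passed in as a function so that the API call can be plugged in. The placeholder converter in main is a crude tag stripper used purely for demonstration; it is not the Cloudmersive API and does not handle entities, scripts, or styles.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.UnaryOperator;

final class HtmlFileToTxt {

    // Reads htmlFile as UTF-8, passes its content through htmlToText,
    // and writes the returned text to txtFile as UTF-8.
    static void convert(Path htmlFile, Path txtFile, UnaryOperator<String> htmlToText)
            throws IOException {
        String html = new String(Files.readAllBytes(htmlFile), StandardCharsets.UTF_8);
        String text = htmlToText.apply(html);
        Files.write(txtFile, text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java HtmlFileToTxt <input.html> <output.txt>");
            return;
        }
        // Placeholder converter (assumption): swap in a function that calls the API.
        UnaryOperator<String> placeholder = html -> html.replaceAll("<[^>]*>", "").trim();
        convert(Paths.get(args[0]), Paths.get(args[1]), placeholder);
    }
}
```

Passing the converter as a parameter keeps the file handling separate from the network call, so the helper can be exercised locally without an API key.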


Source: Medium
