top of page

How to remove duplicate characters from String in Java?

There are three main ways to remove duplicates characters from String in Java; First to sort the character array of string and then remove duplicate characters in linear time. Second, use an auxiliary data structure like Set to keep track of characters already seen and then recreate String from Set. This would tradeoff space for time, as the space complexity of this solution would be O(n). The third approach would be the brute force way of taking one string at a time and then removing all other occurrences of that string, rearranging the array, and then starting with the next element.

This would be a terrible solution because of the multiple loops required. Btw, in order to even think about these solutions, you need to have a good knowledge of various data structures and algorithms, those are fundamental parts of problem-solving in the programming world.

And, if you feel you lack Data Structure and algorithm skills you can revise them by joining the Data Structures and Algorithms: Deep Dive Using Java course, one of the best to learn DS and Algorithms in Java.

How to remove repeated characters from a given String in Java

Now that, you are familiar with both problems and some approaches to remove duplicate characters from given String in Java let's deep dive into solutions to this classic coding problem and analyze their time and space complexity. Solution 1 - Sorting and Removing Duplicates If you pay a little bit of attention, then you can easily find that removing duplicate characters from String is nothing but removing duplicates from an array. This means you can use all the ways we have used previously. If we get the character array from String and then sort it using MergeSort or QuickSort in O(Nlog N) time, we can easily remove duplicates in linear time, because they will be clubbed together. All you need to do is iterate over sorted character array, compare the current element with the previous element, and discard it if they are the same. At the end of the iteration, your array only contains unique characters. Though this solution has a drawback, it doesn't preserve the original order of the element. So if the interviewer asks you to keep elements in their original order, this solution will fall short, but for the practical purpose, this would work because you are dealing with duplicate characters. If you want to learn more about stable sorting algorithms, I suggest you take a look at the Algorithms and Data Structures - Part 1 and 2 courses on Pluralsight.


Solution 2 - Using a Data Structure like Set

The next solution is an exquisite one, and it demonstrates how you can simplify your algorithm by choosing an appropriate data structure. To remove duplicate characters from String, we need a data structure where insertion and search would be super fast. If you take a hash table, we will get O(1) for insert and find operation. By the way, if you are using Java, then you have a better choice, HashSet, which is a combination of set and hash table data structure. The Set data structure doesn't allow duplicate characters, so if you convert your String to a character array, loop over it, add each element into HashSet, you will end up with a Set without duplicate characters. You can then convert that Set to String. Our solution is based upon this knowledge, but it becomes more elegant by using StringBuffer for creating output String.add() method of Set return false if the element already exists in Set and by using that we can create a StringBuilder with only unique characters. Converting a StringBuilder to String is a trivial task in Java but if you are not familiar with essential Java API then I suggest you join The Complete Java Masterclass by Tim Buchalaka on Udemy, one of the most comprehensive and up-to-date course to learn Java online.

Java Program to remove duplicate characters from String

Now that you have understood both solutions of removing duplicate characters from String let's write code to implement them. In this program, I have two methods for removing duplicates, one uses HashSet, an additional data structure while others remove duplicate characters in place without using any other data structure or additional memory. When you run this program in your Eclipse IDE or from a command-line window, it will ask you to enter a string and then output the string without duplicate characters in the console. This way you can test the solution and code with different inputs like null string, empty string, string without duplicate, string with duplicate, the string just containing duplicates, and with very short or long string inputs.


By the way, if you struggle to convert solutions to code or just struggle to find solutions to coding problems then I highly recommend you to join Grokking the Coding Interview: Patterns for Coding Questions, an interactive course from Eductive, a text-based interactive learning platform. This is one of a kind course that will teach you 15 essential coding patterns like sliding window, fast and slow pointers, Merge intervals, etc which can be used to solve 100+ leetcode problems. This can be really good for anyone preparing for coding interviews or who just wants to learn coding better.



Anyway, now, let's look at the code to remove duplicate characters from Java String:

import java.util.HashSet; 
import java.util.Scanner; 
import java.util.Set;

/*  * Java Program to remove duplicate characters from String.  
* We'll see two solutions for this problem, first one will   
* show you how use of a suitable data structure like HashSet or HashMap  
* simplifies the solution and second one will remove the   
* duplicate characters from String in place to boost performance.   
*/

public class Main 
    {public static void main(String[] args) 
    {
           
    System.out         
           .println("Welcome to Java program to remove duplicate 
           characters from String");     
     Scanner scnr = new Scanner(System.in);       
     System.out.println("Please enter a String with duplicate 
           characters");       
     String input = scnr.nextLine();       
     String output = removeDups(input);     
     System.out.println("String without duplicate characters is " 
            + output);       
     String withoutDups = removeDupsInPlace(input);     
     System.out.println("String without duplicate characters in place 
            is "+ withoutDups);     
     scnr.close();
}
     
/**    
* Java method to remove duplicate characters from String This method uses a    
* HashSet collection to get rid of duplicate characters.    
*     
* @param word    
* @return String without duplicate characters    
*/

public static String removeDups(String word) {     
       Set<Character> chars = new HashSet<>();     
       StringBuilder output = new StringBuilder(word.length());
       
       for (int i = 0; i < word.length(); i++) {
              char ch = word.charAt(i);
              if (chars.add(ch)) {         
                     output.append(ch);
              }
        }
        return output.toString();
 }

/**    
* A java method to remove duplicate characters from String in place. It    
* doesn't use additional buffer like HashSet we have used previously.    
*     
* @param word    
* @return String without duplicates    
*/

public static String removeDupsInPlace(String word) {
       final StringBuilder output = new StringBuilder();
       
       for (int i = 0; i < word.length(); i++) {       
              String character = word.substring(i, i + 1);
              if (output.indexOf(character) < 0) 
              // if not contained         
                     output.append(character);
              }
        return output.toString();
     }
}   

Output:

Welcome to Java program to remove duplicate characters from String 
Please enter a String with duplicate characters 
Java
String without duplicate characters is Jav 
String without duplicate characters in place is Jav

That's all about how to remove duplicate characters from String in Java. Apart from learning to solve these frequently asked coding problems, there are a couple of more things to learn. First, by using a data structure you can mostly simplify your logic and code.

Second, by using additional memory you can bring down the time complexity of your algorithm or make your solution fast. You have also learned how to remove duplicates in place from String, which is very important from the interview point of view.



Source: Java67


The Tech Platform

0 comments

Comments


bottom of page