String compression algorithm

heartin
Write an algorithm for string compression using counts of repeated characters.

E.g. "aaaaarrrrbbb" will become "a5r4b3".

Additional note: if the compressed string is bigger than the original string, return the original string.
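One possible reading of the problem, as a minimal sketch rather than a reference answer: each run of equal characters is written as the character followed by its count, and the original is returned only when the compressed form is strictly longer. The class and method names are made up for illustration, and the sketch assumes the input contains no digits.

```java
class RunLengthCompressor {

    // Hypothetical helper: encodes each run as <char><count>.
    // Assumes the input itself contains no digits.
    static String compress(String input) {
        if (input == null || input.isEmpty()) {
            return input;
        }
        StringBuilder out = new StringBuilder();
        int count = 1;
        for (int i = 1; i <= input.length(); i++) {
            // A run ends at the end of the string or when the character changes.
            if (i == input.length() || input.charAt(i) != input.charAt(i - 1)) {
                out.append(input.charAt(i - 1)).append(count);
                count = 1;
            } else {
                count++;
            }
        }
        // Per the additional note: keep the original if compression made it longer.
        return out.length() > input.length() ? input : out.toString();
    }

    public static void main(String[] args) {
        System.out.println(compress("aaaaarrrrbbb")); // a5r4b3
        System.out.println(compress("abc"));          // abc ("a1b1c1" would be longer)
    }
}
```

An input such as "aabb" compresses to "a2b2", which is the same length as the original; this sketch returns the compressed form in that case because the note only asks for the original when the result is bigger.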

clalam
What should the output be if the string is "aaabbaazz"?

Option 1: a3b2a2z2

Option 2: a5b2z2

heartin
Good question, but I will let you answer it yourself.

An important property of any real-world compression algorithm is that the original string can be reconstructed from its compressed form.

From which of the above compressed forms (a3b2a2z2 and a5b2z2) do you think you can reconstruct the original string (aaabbaazz) exactly as it is?

Note: Please answer even if you have already understood.

clalam
Thanks for clearing my doubt. We can reconstruct the string only when each run is encoded separately, as in a3b2a2z2.
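The round trip can be checked with a small decoder. This is only a sketch under the assumption that the original text contains no digits, so every digit in the compressed form belongs to a count; the class and method names are hypothetical.

```java
class RunLengthDecoder {

    // Hypothetical helper: expands <char><count> pairs back into runs.
    // Assumes the original text contained no digits.
    static String decompress(String compressed) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < compressed.length()) {
            char c = compressed.charAt(i++);
            int count = 0;
            // Read a possibly multi-digit count.
            while (i < compressed.length()
                    && compressed.charAt(i) >= '0' && compressed.charAt(i) <= '9') {
                count = count * 10 + (compressed.charAt(i++) - '0');
            }
            for (int k = 0; k < count; k++) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decompress("a3b2a2z2")); // aaabbaazz
        System.out.println(decompress("a5b2z2"));   // aaaaabbzz
    }
}
```

Only the first form decodes back to "aaabbaazz"; the second loses the position of the later "aa" run.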

clalam
Here is my solution:

The time complexity is O(n), where n is the length of the input.

Suggestions to improve performance are highly appreciated.

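public class StringCompression {

    public static void main(String[] args) {
        String input = "aaaaarraarrbbb";

        // SOLUTION 1
        char[] inputChArr = input.toCharArray();
        StringBuffer sb = new StringBuffer();
        // One flag per character code 0..255; only the character of the current run is true.
        // Characters above code 255 would need a larger array.
        boolean[] boolArr = new boolean[256];
        int count = 0;

        for (int i = 0; i < inputChArr.length; i++) {
            if (boolArr[inputChArr[i]]) {
                // Same character as the current run: extend the run.
                count++;
            } else {
                // A new run starts here.
                boolArr[inputChArr[i]] = true;
                if (i != 0 && inputChArr[i] != inputChArr[i - 1]) {
                    // Clear the previous character's flag (it may appear again later),
                    // then write the finished run's count followed by the new character.
                    boolArr[inputChArr[i - 1]] = false;
                    sb.append(count).append(inputChArr[i]);
                } else {
                    // For the first character alone
                    sb.append(inputChArr[i]);
                }
                count = 1;
            }
        }
        // The last run's count is still pending, so append it after the loop
        // (skipped for an empty input, which has no runs).
        if (count > 0) {
            sb.append(count);
        }
        System.out.println("Input String: " + input + "\nCompressed String: " + sb);
    }

}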

huzefa
explain

clalam: Can you please explain your first solution?

clalam
Please see the explanation below, and reply if you have any doubts.

1) I mark the first occurrence of each character in the current run as "true" in the boolean array, which is indexed by character code. This assumes an 8-bit character set; ASCII itself defines only 128 characters.

2) From the second iteration onwards, the loop checks the boolean array to see whether the current character is already marked.

    a) If it is already marked, the character is repeating, so just increment the count.

    b) Otherwise, mark the current character as "true" and the previous character as "false" (as it may come again later), and append the previous character's count followed by the current character to the StringBuffer.

    c) Repeat until the end of the array.

3) After the loop, append the count of the last run and print the StringBuffer value.

For this string: "aaaaarraarrbbb"

In the first iteration, boolArr[a] (index 97) becomes true, "a" is appended and count becomes 1.

From the second through the fifth iteration, count is incremented up to 5, as boolArr[a] is true during these iterations.

In the 6th iteration, it goes to the else part and marks boolArr[r] as true and boolArr[a] as false (since a may come after r, as in the above input). At this point, the output string contains "a5r".

These steps repeat until the end of the array. Once the final count is appended, the output string holds the required result, "a5r2a2r2b3".
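Since only the character of the current run is ever marked "true", the array lookup answers the same question as comparing with the previous character. The following is a sketch of that variant, not the posted solution; the class and method names are invented, and it drops the array (and with it any limit on the character range).

```java
class PreviousCharCompression {

    // Hypothetical variant of the loop described above: the boolean array
    // is replaced by a comparison with the previous character.
    static String compress(String input) {
        if (input.isEmpty()) {
            return "";
        }
        StringBuilder sb = new StringBuilder();
        sb.append(input.charAt(0)); // the first character opens the first run
        int count = 1;
        for (int i = 1; i < input.length(); i++) {
            if (input.charAt(i) == input.charAt(i - 1)) {
                count++; // still inside the current run
            } else {
                // Close the previous run and open the next one.
                sb.append(count).append(input.charAt(i));
                count = 1;
            }
        }
        sb.append(count); // count of the last run
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(compress("aaaaarraarrbbb")); // a5r2a2r2b3
    }
}
```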

 

  

 

huzefa
count

How does this statement after the for loop work?

sb.append(count);

I know it is correct, but why do you append only the count?

clalam
If you look at the if condition inside the else branch, I append the count of the previous character together with the current character. That means when "r" comes for the first time, the StringBuffer becomes "a5r". We must also append the count of the last character of the string, but that condition never fires for it, because the loop ends before another character change occurs. The count of the last character is still in my count variable, so I simply append it to the output string after the loop.

If you want to experiment, try removing just that statement and you won't see the count of the last character.
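The experiment can also be run side by side. In this sketch the final append is controlled by a flag; the class name, method name and the appendLastCount parameter are invented for the demonstration and are not part of the posted solution.

```java
class TrailingCountDemo {

    // Hypothetical helper: run-counting loop with the final append made optional.
    static String encode(String input, boolean appendLastCount) {
        StringBuilder sb = new StringBuilder();
        int count = 0;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (i > 0 && c == input.charAt(i - 1)) {
                count++;              // still inside the current run
            } else {
                if (i > 0) {
                    sb.append(count); // close the previous run
                }
                sb.append(c);         // open a new run
                count = 1;
            }
        }
        if (appendLastCount && count > 0) {
            sb.append(count);         // the last run is closed only here
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("aaaaarraarrbbb", true));  // a5r2a2r2b3
        System.out.println(encode("aaaaarraarrbbb", false)); // a5r2a2r2b
    }
}
```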
